JP2004164670A

JP2004164670A - Semiconductor integrated circuit device, functional circuit block, memory, and information processor

Info

Publication number: JP2004164670A
Application number: JP2004009976A
Authority: JP
Inventors: Masahiro Iwamura; 将弘岩村; Shigeya Tanaka; 成弥田中; Hideo Maejima; 英雄前島; Tetsuo Nakano; 哲夫中野
Original assignee: Renesas Technology Corp
Current assignee: Renesas Technology Corp
Priority date: 1989-12-15
Filing date: 2004-01-19
Publication date: 2004-06-10
Anticipated expiration: 2020-11-24
Also published as: JP3718513B2

Abstract

PROBLEM TO BE SOLVED: To provide a microprocessor with reduced power consumption and higher processing speed.

SOLUTION: The microprocessor has detection means that detects, in advance, the instructions that operate each arithmetic circuit, and means that activates one or more arithmetic circuits corresponding to the detected instructions before the operation is executed and deactivates them once the operation has finished. n (n ≥ 2) instructions are read out and decoded simultaneously, and n arithmetic circuits execute the operations simultaneously. This yields a semiconductor integrated circuit device, in particular a microprocessor with an on-chip memory such as a cache memory, whose functional circuit blocks consume less power and operate faster.

COPYRIGHT: (C)2004, JPO

Description

The present invention relates to a semiconductor integrated circuit, such as a microprocessor, that has a functional circuit block for which low power consumption is desired, for example a built-in cache memory that must provide high-speed access and multi-bit output.

The conventional technology is described below, taking a microprocessor as an example.

Recent high-performance microprocessors (hereinafter MPUs) commonly incorporate a cache memory to resolve the mismatch between the internal instruction execution speed and the rate at which instructions and operands are transferred in from external main memory. They also commonly incorporate multiple arithmetic units to raise parallelism and thereby processing performance. As a result, increased power consumption has become a serious problem.

The main purpose of the built-in cache memory is to fetch instructions and data at a speed commensurate with the execution speed of the MPU.

Today the fastest CISC MPUs run at clock frequencies of 25 to 40 MHz, and RISC MPUs exceeding 100 MHz are expected to appear in the near future.

Such an ultra-high-speed MPU requires its built-in cache memory to provide ultra-high-speed access of a few nanoseconds or less.

A built-in cache memory also has a relatively small number of words but an extremely large number of bits read per word (a general-purpose SRAM reads at most 8 bits). For example, parallel reads of several hundred bits are already common in today's 32-bit MPUs, and the number of bits read in parallel is expected to grow further as MPUs move to 64 bits.

In general, a differential high-sensitivity sense amplifier built from bipolar transistors is well suited as the sense amplifier of an ultra-high-speed memory. However, this circuit steadily consumes relatively large power. The other parts of the memory also consume power even when no memory access occurs, unless special power-saving means are provided.

In other words, in a single-chip MPU with a built-in cache memory offering ultra-high-speed access and multi-bit parallel output, the power consumed by the memory circuit becomes extremely large. Without appropriate power-reduction means, putting the cache memory on chip is itself expected to become impossible before long.

A first known conventional power-reduction technique uses a chip select signal CS, equivalent to a memory address signal, to switch the memory circuit between standby-mode and normal-operation-mode power consumption, reducing the effective power consumption.

A second known conventional technique detects a change in the address signal with, for example, an ATD (Address Transition Detector) circuit, uses that detection to generate the clock pulses needed for internal operation, and operates the memory's sense amplifiers and similar circuits only for the required period, reducing power consumption.

As described in Patent Document 1 and elsewhere, the following methods are known for logic LSIs such as MPUs:
a) providing power control instructions corresponding to a plurality of functional blocks, and switching the relevant functional block between the active and inactive states under program control;
b) providing a clock control circuit for each functional block, and controlling whether the clock is supplied;
c) providing a power control circuit for each functional block, and cutting off the power supply to functional blocks that are not used while an instruction executes.
However, these conventional techniques give no consideration to the noise induced on the power supply and ground lines by the abrupt change in supply current when switching between the normal and low power consumption states, which leads to the following problems.
1) Because the circuit current changes greatly within a short time between the low power consumption state and the normal operation state, the inductance and resistance of the power supply and GND lines produce a large noise voltage.
2) This noise voltage causes the functional circuit itself, or other internal circuits, to malfunction. Even if no malfunction occurs, a certain time is needed for the noise voltage to die out, so the effective memory access speed is reduced.

FIG. 24A is a diagram explaining how noise voltage is generated in the power supply system. In the figure, 1300 is a power supply, 1310 is a functional circuit block such as a memory circuit, 1321 and 1322 are the inductances of the power supply and GND systems respectively, and 1331 and 1332 are the resistances of the power supply and GND systems respectively.

FIG. 24B shows how the supply current i, the supply potential v₁, and the GND potential v₂ change when the switch SW is turned on at time t₁ and turned off at time t₂.

As shown, when the switch SW is turned on at time t₁, the circuit current i rises from 0 to the steady-state current over a time Δt₁. During this interval the supply potential v₁ of the circuit swings sharply with a negative-going peak, and the GND potential v₂ swings sharply with a positive-going peak. Conversely, when the switch SW is turned off at time t₂, the circuit current i falls from the steady-state current to 0 over a time Δt₂. During this interval the supply potential v₁ swings sharply with a positive-going peak, and the GND potential v₂ swings sharply with a negative-going peak.

For example, suppose that the circuit 1310 in FIG. 24 consists of 500 sense amplifiers, each consuming 2 mA, and that this current is switched from 0 to the steady-state current in Δt = 1 ns. Ignoring the resistances 1331 and 1332 and assuming that the inductances 1321 and 1322 are L = 5 nH, the power supply noise v_n is

v_n = L · ΔI / Δt = 5 nH × (500 × 2 mA) / 1 ns = 5 V.
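
The estimate for the FIG. 24 example can be checked numerically. The following is a minimal sketch, assuming the supply current ramps linearly and that only the series inductance contributes (v = L · ΔI / Δt, with the resistances ignored as in the text). The function and variable names are illustrative and do not come from the patent.

```python
def inductive_noise(l_henry, delta_i_amp, delta_t_sec):
    """Noise voltage across an inductance for a linear current ramp: v = L * dI/dt."""
    return l_henry * delta_i_amp / delta_t_sec


# Figures quoted in the text for circuit block 1310.
n_sense_amps = 500        # number of sense amplifiers
i_per_amp = 2e-3          # 2 mA per sense amplifier
inductance = 5e-9         # 5 nH for inductance 1321 or 1322
switch_time = 1e-9        # current switched in 1 ns

delta_i = n_sense_amps * i_per_amp            # total current step, 1 A
v_n = inductive_noise(inductance, delta_i, switch_time)
print(f"current step = {delta_i:.1f} A, noise = {v_n:.1f} V")
```

With these figures the transient on a 5 nH line comes out at about 5 V, which is as large as the supply voltage itself.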

In other words, for today's semiconductor integrated circuits, which operate from a supply voltage of 5 V or less, power supply noise of this magnitude is unacceptable.

Even if the noise could be reduced to an acceptable level, the times Δt₁ and Δt₂ are still needed for the power supply and GND noise to die out, as shown in FIG. 24B. This time depends on the current switching time, but is typically 1 to 3 ns. For an ultra-high-speed memory that needs an access time of a few ns or less, such a delay is unacceptable and is a major obstacle to high-speed operation.

As described above, the noise problem caused by supply current changes applies equally to the multiple arithmetic units and other functional circuit blocks in a semiconductor chip.

Incidentally, various techniques have begun to be introduced into recent high-performance MPUs to raise their processing capability. The processing performance of a computer is evaluated by the following equation.

CPI is the number of cycles required per instruction.

The technology that has drawn attention in the last few years is the RISC processor. The main aim of RISC is to improve performance by bringing the CPI in the above equation close to 1.

Recently, superscalar and VLIW architectures have begun to attract attention as the technologies that follow RISC. These architectures read up to n instructions simultaneously, decode n instructions simultaneously, and execute n instructions simultaneously. By increasing hardware parallelism they lower the CPI in the above equation toward 1/n and so improve computer performance. Differential logic circuits using bipolar transistors and low-swing BiCMOS circuits have come into use as the high-speed arithmetic circuits of such superscalar and VLIW machines, but circuits that carry a DC current steadily consume relatively large power.

A superscalar or VLIW MPU needs n high-speed arithmetic circuits with the same function, and the power consumed by the arithmetic circuits accordingly grows n-fold.
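
The trade-off between CPI and power can be illustrated with a small calculation. This sketch assumes the common textbook relation, execution time = instruction count × CPI × cycle time, which may differ in form from the equation the text refers to. It also assumes an ideal CPI of 1/n and that all n arithmetic circuits draw their steady power continuously. The 0.5 W per-unit figure, the instruction count, and the function names are hypothetical.

```python
def execution_time(instruction_count, cpi, cycle_time_s):
    """Textbook model: time = instructions * cycles per instruction * cycle time."""
    return instruction_count * cpi * cycle_time_s


def always_on_power(n_units, unit_power_w):
    """Steady power when all n arithmetic circuits stay active all the time."""
    return n_units * unit_power_w


for n in (1, 2, 4):
    t = execution_time(1_000_000, 1.0 / n, 10e-9)   # ideal CPI of 1/n, 10 ns cycle
    p = always_on_power(n, 0.5)                      # hypothetical 0.5 W per unit
    print(f"n={n}: time={t * 1e3:.2f} ms, power={p:.1f} W")
```

Under these assumptions the run time falls as 1/n while the steady power rises as n, which is the motivation for activating an arithmetic circuit only when an instruction actually needs it.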

Related technology of this kind is discussed in, for example, Non-Patent Document 1.

Patent Document 1: JP-A-61-45354

Non-Patent Document 1: Nikkei Electronics No. 487, November 27, 1989 issue, pp. 191-200

As is clear from the above, conventional power-reduction techniques for semiconductor integrated circuits such as microprocessors, and for electronic circuits, do not consider the noise generated on the power supply and ground lines when the power level is switched. They therefore cause circuit malfunctions or, because a certain time must pass before the noise dies out, prevent a quick start-up.

In particular, in a conventional MPU with on-chip memory there is a trade-off between lowering the noise at power switching and speeding up memory access, which makes ultra-high-speed operation difficult.

The above discussion concerned a microprocessor with a cache memory, but the same problem arises in any semiconductor integrated circuit or electronic circuit that has functional blocks requiring high speed.

An object of the present invention is therefore to provide a semiconductor integrated circuit device whose functional circuit blocks can achieve both lower power consumption and higher speed, and in particular a microprocessor having an on-chip memory such as a cache memory.

Here, activation means supplying the predetermined power required for the circuit to operate, and deactivation means supplying less power than that predetermined power.

The semiconductor integrated circuit device of the present invention has a functional circuit block with power-supply-system inductance L, allowable power supply noise V_n, and circuit current switching width ΔI, and means for generating an operation start notice signal that activates the functional circuit block a time T ahead of the start of its operation, where T, L, V_n, and ΔI satisfy a prescribed relationship.
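
The relationship itself is not reproduced in this text. A plausible reading, offered here only as an assumption, is that the lead time T must be long enough to keep the inductive noise L · ΔI / T within V_n, that is, T ≥ L · ΔI / V_n. The sketch below computes that bound. The 0.5 V noise budget is a hypothetical value chosen for illustration, and the function name is not from the patent.

```python
def min_lead_time(l_henry, delta_i_amp, v_n_allowed):
    """Smallest ramp time T that keeps L * dI / T within the allowed noise.

    Assumes a linear current ramp and purely inductive noise.
    """
    return l_henry * delta_i_amp / v_n_allowed


# L and delta-I from the FIG. 24 example; the noise budget is hypothetical.
t_min = min_lead_time(5e-9, 1.0, 0.5)
print(f"minimum lead time = {t_min * 1e9:.1f} ns")
```

Under these assumptions a 1 A current step on a 5 nH line needs a lead time of about 10 ns, or two stages of a 5 ns clock, to stay within a 0.5 V noise budget.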

The memory of the present invention provides a functional circuit block that receives a notice signal announcing the start of an operation, moves from a low power consumption mode to a normal power consumption mode by raising its circuit current to a predetermined value over a predetermined time from receipt of the notice signal, and, after the operation has been executed, returns to the low power consumption mode by lowering the circuit current to the low-power-mode current over a predetermined time. The memory is activated by an access notice signal announcing an access, and executes a predetermined memory operation based on an address signal, a read/write control signal, and a data input/output signal.

The microprocessor of the present invention reads out and decodes n (n ≥ 2) instructions simultaneously and executes them simultaneously using n arithmetic circuits. It has detection means that detects in advance the instructions that operate each arithmetic circuit, and means that activates one or more arithmetic circuits corresponding to the detected instructions before the operation is executed and deactivates the activated arithmetic circuits after the operation finishes.

The microprocessor of the present invention further has means for detecting that any of various conflicts arises between instructions to be executed simultaneously. When a conflict exists, it sends the advance detection means a signal that causes one of the arithmetic circuits corresponding to the conflicting instructions to be activated before execution and the other arithmetic circuits to be deactivated.

According to the present invention, it is possible to provide a semiconductor integrated circuit device whose functional circuit blocks achieve both lower power consumption and higher speed, and in particular a microprocessor having an on-chip memory such as a cache memory.

Embodiments of the semiconductor integrated circuit according to the present invention are described below, taking a microprocessor as an example.

FIG. 1 shows the configuration of a microprocessor (MPU) according to a first embodiment of the present invention.

In the figure, 100 is a single-chip MPU. For convenience, only the internal components needed to understand this embodiment are described below, and the rest are omitted.

In the figure, 101 is a program counter, which generates the read address for instruction data in synchronization with the clock signal CLK. 102 is a memory address register, which holds the read address for the instruction cache memory 103. 104 is an instruction data register, which holds the instruction data read from the instruction cache memory 103.

111 is another memory address register, which holds the read or write address for the data cache memory 112. 113 is a memory data register, which holds data read from the data cache memory 112 or data to be written to the data cache 112.

The instruction data register 104 and the data register 113 are coupled to the internal data bus 172 and exchange data with the external data bus 161 via the input/output control circuit 160.

120 is a first instruction decoder, which decodes the output 105 of the instruction register 104 and outputs predetermined instruction control signals 121 and 122. 140 is an arithmetic unit, which receives the data needed for an operation from the register file 150 via the internal bus 173, executes arithmetic, logical, shift, and other operations, and writes the result to the register file 150 via the internal bus 174. In other cases it writes the result to the memory address register 111 via the internal bus 175.

The output 121 of the instruction decoder 120 tells the arithmetic unit 140 which operation to perform. The output 122 of the instruction decoder 120 tells the register file 150 whether to read or write.

130 is a second instruction decoder, which interprets the output 105 of the instruction register 104, predicts, for example, a memory access to the data cache 112, applies predetermined processing, and then outputs a memory access notice signal 131 to the data cache 112.

The data cache 112 executes the required memory access using this signal, the address signal from the memory address register 111, and a read/write control signal (omitted from the figure).

The second instruction decoder 130 can also be given the function of generating operation start notice signals 132 and 133 for the arithmetic unit 140, the register file 150, and other blocks as needed.

FIG. 2 shows typical instruction execution stages of the MPU according to this embodiment.

In the figure, instruction 1 and instruction 2 show the execution stages of an R-R operation (register-to-register operation).

The IF stage fetches instruction data from the instruction cache 103, the D stage decodes it with the instruction decoder 120, and the EX stage executes the specified operation in the arithmetic unit 140. Finally, the W stage writes the result to the register file 150.

Next, for the LOAD and STORE instructions shown in the middle of the figure, which access the data cache 112, the IF and D stages are the same as for the R-R operation above. The following AC stage calculates the effective address for accessing the data cache 112, and the next CA stage accesses the data cache 112. Finally, the W stage writes the data that was read to the register file 150.

As described above, for a LOAD/STORE instruction the effective address calculation stage AC always lies between the decode stage D and the memory access stage CA. This embodiment is characterized in that the memory access is predicted in the D stage, two stages before the CA stage, and an access notice signal is output to the cache memory 112.
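
The stage timing described above can be sketched as follows. This is an illustrative model only: it assumes one stage per clock cycle with no stalls, and the function names and the cycle numbering are hypothetical rather than taken from the patent.

```python
RR_STAGES = ["IF", "D", "EX", "W"]          # register-to-register operation
MEM_STAGES = ["IF", "D", "AC", "CA", "W"]   # LOAD / STORE


def schedule(kind, issue_cycle):
    """Map each stage of an instruction to the cycle in which it runs."""
    stages = MEM_STAGES if kind in ("LOAD", "STORE") else RR_STAGES
    return {stage: issue_cycle + k for k, stage in enumerate(stages)}


def notice_lead(kind, issue_cycle=0):
    """Stages between decode (where the access is predicted) and cache access.

    Returns None for instructions that never reach a CA stage.
    """
    s = schedule(kind, issue_cycle)
    if "CA" not in s:
        return None
    return s["CA"] - s["D"]


print(schedule("LOAD", 0))     # stage -> cycle for a LOAD issued in cycle 0
print(notice_lead("LOAD"))     # how far ahead of CA the access is known
print(notice_lead("ADD"))      # R-R operations never access the data cache
```

In this model the decode stage sits two stages ahead of the cache access, which is the window in which the cache can be brought up to operating current.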

Next, FIG. 3 shows in more detail the operation timing from instruction fetch through generation of the access notice signal to execution of the memory access.

In the figure, 3a is the system clock CLK, whose period equals the length of one instruction execution stage in FIG. 3, for example 5 ns. 3b is the IF stage; the figure shows LOAD and STORE instructions M₁ to M₅ being fetched.

3c is the D stage, in which the LOAD or STORE instructions M₁ to M₅ are decoded in the stage following the IF stage.

3d is the AC stage, in which the effective addresses A₁ to A₅ are calculated for the LOAD/STORE instructions M₁ to M₅ decoded in the D stage 3c.

3e shows the memory addresses A₁ to A₅ produced by the address calculation. These addresses are used for the actual memory access in the CA stage 3f.

3g shows the memory access prediction signals M₁ to M₅ produced by the second instruction decoder 130 of FIG. 1 as the result of decoding M₁ to M₅ in the D stage 3c. 3h is the memory access notice signal 131, obtained by applying predetermined processing to the memory access prediction signals M₁ to M₅ of 3g, and is output to the data cache 112.

Here, the access notice signal 3h is generated one stage ahead of the E₁ stage of 3f, in which the actual memory access takes place, and likewise one stage ahead of the Ea stage.

FIG. 4A shows the internal configuration of the second instruction decoder 130 (see FIG. 1), which generates the memory access notice signal 131, and FIG. 4B shows its operation timing.

In the figure, 410 is a memory access prediction circuit, which detects whether the instruction data output by the instruction register 104 is an instruction that involves a memory access. Specifically, it detects LOAD and STORE instructions and generates the detection signal DET shown as 3g in FIG. 4B. 420 is a flip-flop that latches the detection signal DET 3g on the clock signal CLK 3a; its output /Q 4a is the signal shown in FIG. 4B (where /Q denotes the inverted output of Q). 430 is an inverter, which inverts the flip-flop output /Q 4a to generate the access notice signal PR 3h shown in FIG. 4B.
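
The behavior of this decoder can be sketched cycle by cycle. The model below is a simplified illustration, assuming an edge-triggered flip-flop that starts in the reset state and one decoded instruction per cycle. The opcode strings and the function name are hypothetical.

```python
MEMORY_OPS = {"LOAD", "STORE"}


def access_notice(decode_stream):
    """Model prediction circuit 410, flip-flop 420, and inverter 430.

    decode_stream[k] is the opcode in the D stage during cycle k.
    Returns (det, pr): the detection signal and the access notice per cycle.
    """
    det = [1 if op in MEMORY_OPS else 0 for op in decode_stream]  # circuit 410
    q_bar = 1                     # flip-flop assumed reset, so /Q starts high
    pr = []
    for d in det:
        pr.append(1 - q_bar)      # inverter 430 drives PR from /Q
        q_bar = 1 - d             # DET is latched at the clock edge ending the cycle
    return det, pr


det, pr = access_notice(["ADD", "LOAD", "STORE", "ADD", "ADD"])
print("DET:", det)
print("PR: ", pr)
```

In this model PR is DET delayed by one clock, so the notice for an instruction decoded in cycle k is asserted during its address calculation stage, one stage before the cache access.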

The polarity of the PR signal 131 is not essential, but in this embodiment it is an active-high signal.

Next, FIG. 5 shows the internal configuration of the data cache memory 112 (see FIG. 1).

In the figure, 510 is an address buffer, which receives the address signal Ai and outputs the true and complement address signals needed by the address decoder/driver 520. The output of the address decoder/driver 520 goes to the memory array 530 and selects the part of the memory array to be read or written.

540 is a sense amplifier, which amplifies the small signal read from the memory array 530 to a predetermined signal level and outputs it. 550 is an output driver, provided to drive the output D_o, which carries a relatively heavy load.

560 is a write control circuit, which writes the write data Di to a specified address in the memory array 530 using the write control signal WE.

570 is a current control signal generating circuit, which receives the access notice signal PR and generates one or more current control signals 575. To also cover cases such as a shared data cache memory 112 or accesses triggered by something other than instruction execution, this example shows the circuit receiving a plurality of notice signals PR₁ ... PR_n and generating one or more current control signals 575.

Control of circuit current by the current control signal 575 can be applied to every circuit element in the cache memory 112 except the current control signal generating circuit 570 itself. Which circuits are chosen for control depends on the actual hardware configuration and application.

FIG. 6A shows a configuration example of the current control signal generating circuit 570 (see FIG. 5), and FIG. 6B shows its operation timing.

In the figure, 610 is an OR gate, which ORs the access notice signals PR₁ to PR_n and supplies its output to the inverter 620 and the flip-flop 660. 630 is a NOR gate, which NORs the output of the inverter 620 with the /Q output of the flip-flop 660 and outputs the signal PUP shown as 6c in FIG. 6B.

640 is an AND gate, which ANDs the Q output 6b of the flip-flop 660 with the clock signal CLK 3a and outputs the MCLK signal shown as 6d in FIG. 6B. 650 and 670 are an OR gate and a delay circuit, respectively. The OR gate 650 ORs the MCLK signal 6d with a copy of the MCLK signal delayed by a predetermined time in the delay circuit 670, and outputs the ΦSA signal shown as 6f in FIG. 6B.

MA 6e in FIG. 6B indicates the memory address in each memory access execution cycle.

As shown in FIG. 6B, the memory accesses to addresses A₁ and A₂ take place in stages t₂ and t₃ (6g). The PUP signal 6c, by contrast, rises in stage t₁, one stage before t₂, and falls at the end of stage t₃.
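
The timing relationship between the notice signals and PUP can be illustrated with a behavioral sketch. This is not a gate-level transcription of FIG. 6A. It assumes that PUP must be high in every stage in which a notice is pending or an access is in progress, which is what the timing described for FIG. 6B requires, and it models the flip-flop as a one-stage delay. All names are hypothetical.

```python
def current_control(pr_inputs):
    """Behavioral model of the power-up window, one entry per stage.

    pr_inputs[k] is a tuple of the notice signals PR_1..PR_n in stage k.
    Returns (pup, access): PUP per stage and the stages in which the
    memory access itself takes place.
    """
    pr = [1 if any(stage) else 0 for stage in pr_inputs]   # OR of all notices
    access = ([0] + pr)[:len(pr)]                           # notice delayed by one stage
    # Assumed: PUP spans the notice stage and the access stage that follows it.
    pup = [a | b for a, b in zip(pr, access)]
    return pup, access


# Stages t0..t4 with two notice sources; accesses are due in t2 and t3.
pup, access = current_control([(0, 0), (1, 0), (0, 1), (0, 0), (0, 0)])
print("PUP:   ", pup)
print("access:", access)
```

In this model PUP is high from t₁ through t₃ while the accesses occupy t₂ and t₃, matching the one-stage lead described above.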

The circuit current is controlled on the basis of this PUP signal 6c, as shown in FIG. 7. As 7a in FIG. 7 shows, based on the PUP signal 6c the current of the target circuit is raised from i₁ to the predetermined value i₂ during stage t₁, held at that value through the memory access stages t₂ and t₃, and lowered to the predetermined low value i₁ from the start of stage t₄, once the memory access has completed.

次に、ＭＣＬＫ信号６ｄ（図６Ｂ）はメモリアクセスステージｔ₂ ，ｔ₃ のそれぞれに対応して発生されるパルス信号であり、クロック同期式のメモリで実現する場合のメモリクロックとして有用である。なお、クロック同期型メモリについては文献１）〜３）を参照されたい。 Next, the MCLK signal 6d (FIG. 6B) is a pulse signal generated for each of the memory access stages t₂ and t₃, and is useful as a memory clock when the memory is implemented as a clock-synchronous memory. For clock-synchronous memories, see References 1) to 3).

文献 References
1) Kevin J. O'Connor: Modular Embedded Cache Memory for a 32b Pipelined RISC Microprocessor. 1987 ISSCC, pp. 256-257
2) Masanori Odaka et al.: A 512kb/5ns BiCMOS RAM with 1KG/150ps Logic Gate Array. 1989 ISSCC, pp. 28-29
3) Masayoshi Kimoto et al.: A 1.4ns/64kb RAM with 85ps/3688 Logic Gate Array. 1989 CICC, pp. 15.8.1-15.8.4

また、ΦＳＡ信号６ｆはメモリアクセスステージｔ₂ ，ｔ₃ のそれぞれに対応して発生されるパルス信号であり、例えばセンスアンプを所定期間だけパルス動作させる信号として有用である。 The ΦSA signal 6f is likewise a pulse signal generated for each of the memory access stages t₂ and t₃, and is useful, for example, as a signal for pulsing the sense amplifier for a predetermined period.

すなわち、センスアンプのみを独立に活性化制御することにより、電流切り替えにより生じる電源ノイズを許容範囲に納め、かつ、高電力消費源であるセンスアンプの活性化時間を極力短くする信号等として用いることができる。 In other words, by controlling the activation of the sense amplifier independently, this signal can be used to keep the power supply noise caused by current switching within an allowable range while making the activation time of the sense amplifier, a major power consumer, as short as possible.

以上のＰＵＰ信号，ΦＳＡ信号を用いて実際に回路電流を制御する例を以下に示す。 An example in which the circuit current is actually controlled using the above PUP signal and ΦSA signal will be described below.

まず、図８（Ａ）にＰＵＰ信号を使って、回路電流を制御する回路の第１の例を、同図（Ｂ）にその動作波形を示す。 First, FIG. 8A shows a first example of a circuit for controlling a circuit current using a PUP signal, and FIG. 8B shows an operation waveform thereof.

図中、８１１，８１２はＰＭＯＳであり、それぞれのソースは電源Ｖ₁ に接続され、それぞれのゲートは共通接続されてPMOS811 のドレインにも接続されている。また、８２１，８２２，８２３はＮＭＯＳであり、８２１のドレインはPMOS811 のドレインに、ゲートはＰＵＰ信号に、ソースは基準電位に接続されている。 In the figure, reference numerals 811 and 812 denote PMOSs, each of which has a source connected to the power supply V ₁ , a gate which is commonly connected and also connected to a drain of the PMOS 811. Reference numerals 821, 822, and 823 denote NMOSs. The drain of the transistor 821 is connected to the drain of the PMOS 811, the gate is connected to the PUP signal, and the source is connected to the reference potential.

NMOS822のドレインはPMOS812のドレインに、ゲートはインバータ８３０の出力に、ソースは基準電位に接続され、インバータ８３０の入力はＰＵＰ信号に接続されている。 The drain of the NMOS 822 is connected to the drain of the PMOS 812, the gate is connected to the output of the inverter 830, the source is connected to the reference potential, and the input of the inverter 830 is connected to the PUP signal.

また、８４０は例えば差動アンプのような能動回路でありデータキャッシュメモリ112や演算器１４０やレジスタファイル１５０(図１参照)等の機能回路ブロックに備えられているものであり、NMOS823 を介して定電流源８５０で定めた所定の動作電流を流すようになっている。さらにNMOS823 のゲートと、ＧＮＤ間には積分用のコンデンサＣが接続されている。 Reference numeral 840 denotes an active circuit such as a differential amplifier, which is provided in a functional circuit block such as the data cache memory 112, the arithmetic unit 140, and the register file 150 (see FIG. 1). A predetermined operating current determined by the constant current source 850 flows. Further, an integrating capacitor C is connected between the gate of the NMOS 823 and GND.

PMOS811，812とNMOS821，823はカレントミラー回路を構成しており、同図(Ｂ)に示すように、ＰＵＰ信号が“０”レベルから“１”レベルに立上がるとPMOS812 からコンデンサＣに、所定の充電電流が流れ、NMOS823 のゲート電圧Ｖｇおよび回路８４０の電流ｉは同図（Ｂ）中段および下段に示すように所定のslewrateでなだらかに立上がる。この立上り時間ｔ₁ は、前述した図７に示したステージｔ₁ に相当する時間である。 The PMOSs 811 and 812 and the NMOSs 821 and 823 constitute a current mirror circuit. As shown in FIG. 8B, when the PUP signal rises from the “0” level to the “1” level, a predetermined charging current flows from the PMOS 812 into the capacitor C, and the gate voltage Vg of the NMOS 823 and the current i of the circuit 840 rise gradually at a predetermined slew rate, as shown in the middle and lower traces of FIG. 8B. This rise time t₁ corresponds to stage t₁ shown in FIG. 7.

同様にＰＵＰが“１”レベルから“０”レベルに変化すると電圧Ｖｇおよび電流ｉは所定のslewrateでなだらかに立下り、この立下り時間ｔ₄ は同様に図７に示したｔ₄ に相当する時間になる。 Similarly, when PUP changes from the “1” level to the “0” level, the voltage Vg and the current i fall gradually at a predetermined slew rate, and this fall time t₄ likewise corresponds to stage t₄ shown in FIG. 7.

なお、電流ｉの立上げ時間ｔ₁ と立下げ時間ｔ₄ は必ずしも同じである必要はなく、回路の動作が終了した後なので立下げるときは、特別な不都合が生じない範囲でｔ₄ を短くすることもできる。 Note that the rise time t₁ and the fall time t₄ of the current i need not be the same. Because the fall occurs after the circuit has finished operating, t₄ can be shortened as long as no particular problem arises.
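
The resulting current profile (ramp up during t₁, hold during the access stages, ramp down during t₄, with independent rise and fall times) can be illustrated with a piecewise-linear model. This is only a sketch under the assumption of ideal linear slewing; the function `circuit_current` and its parameter names are hypothetical and the numeric values in the usage line are illustrative, not taken from the source.

```python
def circuit_current(t, t_start, t_rise, t_stop, t_fall, i_low, i_high):
    """Trapezoidal current profile (all times in the same unit).

    t_start: instant at which PUP rises (start of stage t1)
    t_rise:  rise time (stage t1)
    t_stop:  instant at which PUP falls (start of stage t4)
    t_fall:  fall time (stage t4), may differ from t_rise
    """
    if t < t_start:
        return i_low
    if t < t_start + t_rise:   # linear rise during stage t1
        return i_low + (i_high - i_low) * (t - t_start) / t_rise
    if t < t_stop:             # held constant during the access stages
        return i_high
    if t < t_stop + t_fall:    # linear fall during stage t4
        return i_high - (i_high - i_low) * (t - t_stop) / t_fall
    return i_low


# Illustrative values only: rise over 1 stage, fall over half a stage.
midway_up = circuit_current(0.5, 0.0, 1.0, 3.0, 0.5, 0.0, 10.0)
```

Choosing `t_fall` smaller than `t_rise` corresponds to the remark that t₄ may be shortened because the circuit has already finished operating.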

図９（Ａ）にＰＵＰ信号を使って回路電流を制御する回路の第２の例を、同図（Ｂ）にその動作波形を示す。 FIG. 9A shows a second example of the circuit for controlling the circuit current using the PUP signal, and FIG. 9B shows the operation waveform thereof.

図中、９１１〜９１４はインバータ、９２１〜９２３はＮＭＯＳ、９３１〜９３３は定電流源、９４０は例えば差動アンプのような能動回路であり、データキャッシュメモリ１１２や演算器１４０やレジスタファイル１５０（図１参照）等の機能回路ブロックに備えられているものである。 In the figure, reference numerals 911 to 914 denote inverters, 921 to 923 denote NMOSs, 931 to 933 denote constant current sources, and 940 denotes an active circuit such as a differential amplifier, provided in a functional circuit block such as the data cache memory 112, the arithmetic unit 140, or the register file 150 (see FIG. 1).

ここで、インバータ９１２〜９１４の遅延時間を９１４，９１３，９１２の順に大きくなるように設計するとＰＵＰ信号が同図(Ｂ)のように“０”から“１”レベルに変化したとき、NMOS921〜923を流れる電流ｉ₁〜ｉ₃も所定の時間差をもって立上り、能動回路940の動作電流は時刻ｔ₁ 後にｉ₁＋ｉ₂＋ｉ₃ の定常電流まで階段状に立上がる。 Here, if the delay times of the inverters 912 to 914 are designed to increase in the order 914, 913, 912, then when the PUP signal changes from the “0” level to the “1” level as shown in FIG. 9B, the currents i₁ to i₃ flowing through the NMOSs 921 to 923 rise with predetermined time offsets, and the operating current of the active circuit 940 rises stepwise, reaching the steady-state current i₁ + i₂ + i₃ after time t₁.

同様に、ＰＵＰ信号が“１”から“０”レベルに変化すると能動回路９４０の回路電流はｔ₄ の時間内に段階状に立下がり、実効的に前述した図８の実施例と同様になだらかな電流変化を得ることができる。 Similarly, when the PUP signal changes from the “1” level to the “0” level, the circuit current of the active circuit 940 falls stepwise within time t₄, so that a gradual current change is effectively obtained, as in the embodiment of FIG. 8 described above.

この立上り時間ｔ₁ および立下り時間ｔ₂ は第１の例と同様に図７のステージｔ₁ およびステージｔ₄ の時間に相当する。 The rise time t ₁ and the fall time t ₂ correspond to the times of the stages t ₁ and t ₄ in FIG. 7, as in the first example.
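
The staircase rise of the second example can be sketched as a sum of branch currents that switch on at staggered delays. This is a minimal sketch assuming ideal, instantaneous switching of each branch; the function `staircase_current`, the delay values, and the branch currents are hypothetical illustrations, not figures from the source.

```python
def staircase_current(t, delays, currents):
    """Total operating current at time t after PUP rises at t = 0.

    delays[k]   hypothetical turn-on delay of branch k (inverter chain)
    currents[k] constant-current value of branch k (i1, i2, i3)
    """
    # A branch contributes its current once its delay has elapsed.
    return sum(i for d, i in zip(delays, currents) if t >= d)


# Three branches switching on one after another (illustrative values).
delays = (1.0, 2.0, 3.0)
currents = (1.0, 1.0, 1.0)
steady_state = staircase_current(3.0, delays, currents)
```

Making the steps smaller and more numerous brings the staircase closer to the smooth ramp of the first example, which is the sense in which the text calls the change effectively gradual.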

なお、以上、ＰＵＰ信号，ΦＳＡ信号を用いて実際に回路電流を制御する例を示したが、本実施例はこれに限定されるものではなく、他の一般の、回路電流を制御する方法によっても本実施例は実現できる。 Although an example in which the circuit current is actually controlled using the PUP signal and the ΦSA signal has been described above, the present embodiment is not limited to this, and may be implemented by another general method for controlling the circuit current. This embodiment can also be realized.

以下、前記第１の回路電流制御回路を用いた場合を例にとり、データキャッシュメモリ１１２（図１参照）の各部における回路電流制御の例を示す。 Hereinafter, an example of the circuit current control in each unit of the data cache memory 112 (see FIG. 1) will be described taking the case where the first circuit current control circuit is used as an example.

図１０はデータキャッシュメモリ１１２の図５の５１０で示したアドレスバッファの電流制御の実施例である。 FIG. 10 shows an embodiment of current control for the address buffer of the data cache memory 112, indicated by 510 in FIG. 5.

図中、１０１１〜１０１４はＮＰＮトランジスタ、１０２１，１０２２は抵抗、1031〜１０３３はＮＭＯＳ、１０４１〜１０４３は定電流源である。 In the figure, 1011 to 1014 are NPN transistors, 1021 and 1022 are resistors, 1031 to 1033 are NMOSs, and 1041 to 1043 are constant current sources.

NPN1011 と１０１２のエミッタは共通接続され、NMOS1031を介して定電流源１０４１に接続されている。NPN1011 と１０１２のベースはそれぞれアドレス信号Ａｉと基準電源Ｖ_R に接続され、それぞれのコレクタは抵抗１０２１，１０２２を介して電源Ｖ₁ に接続されている。NPN1013，1014 のコレクタは電源Ｖ₁ に接続され、それぞれのベースはNPN1011 のコレクタとNPN1012のコレクタに接続されている。また、NPN1013，1014のそれぞれのエミッタはNMOS1032，1033を介してそれぞれ定電流源１０４２，１０４３に接続されている。 The emitters of NPNs 1011 and 1012 are connected together and, via the NMOS 1031, to the constant current source 1041. The bases of NPNs 1011 and 1012 are connected to the address signal Ai and the reference supply V_R, respectively, and their collectors are connected to the power supply V₁ via the resistors 1021 and 1022. The collectors of NPNs 1013 and 1014 are connected to the power supply V₁, and their bases are connected to the collectors of NPN 1011 and NPN 1012, respectively. The emitters of NPNs 1013 and 1014 are connected to the constant current sources 1042 and 1043 via the NMOSs 1032 and 1033, respectively.

出力／ａｉは入力Ａｉの非反転出力としてNPN1014 のエミッタから取り出され、出力／ａｉは入力Ａｉの反転出力としてNPN1013 のエミッタから取り出されている。NMOS1031〜1033のゲートは制御信号Ｖｇに共通に接続されている。なお、制御信号Ｖｇは前述した図８にて示した信号Ｖｇに相当するものである。 The output ai is taken from the emitter of NPN 1014 as the non-inverted output of the input Ai, and the output /ai is taken from the emitter of NPN 1013 as the inverted output of the input Ai. The gates of the NMOSs 1031 to 1033 are connected in common to the control signal Vg. The control signal Vg corresponds to the signal Vg shown in FIG. 8.

ここで、NPN1011，1012 、抵抗１０２１，１０２２と定電流源１０４１は差動アンプを構成しており、いま、電流制御信号Ｖｇが“１”レベルで、アドレス信号ＡｉがＶｇより高いとき、NPN1011がオン，NPN1012がオフになり、NPN1011のコレクタが“０”レベル、NPN1012 のコレクタが“１”レベルになる。 Here, the NPNs 1011 and 1012, the resistors 1021 and 1022, and the constant current source 1041 constitute a differential amplifier. When the current control signal Vg is at the “1” level and the address signal Ai is higher than the reference V_R, NPN 1011 turns on and NPN 1012 turns off, so that the collector of NPN 1011 goes to the “0” level and the collector of NPN 1012 goes to the “1” level.

NPN1011 のコレクタはエミッタフォロワトランジスタ１０１３のベースに接続されており、そのエミッタから“０”レベルの出力／ａｉが得られる。同様にNPN1012 のコレクタはエミッタフォロワトランジスタ１０１４のベースに接続されており、そのエミッタから“１”レベルの出力／ａｉが得られる。 The collector of NPN 1011 is connected to the base of the emitter follower transistor 1013, whose emitter provides the output /ai at the “0” level. Similarly, the collector of NPN 1012 is connected to the base of the emitter follower transistor 1014, whose emitter provides the output ai at the “1” level.

アドレス信号ＡｉがＶ_R より低いとき、NPN1011とNPN1012は逆の動作をし、／ａｉ出力は“１”レベル、ａｉ出力は“０”レベルになる。 When the address signal Ai is lower than V_R, NPN 1011 and NPN 1012 operate in the opposite way, so that the /ai output goes to the “1” level and the ai output goes to the “0” level.

次に、電流制御信号Ｖｇが“０”レベルの場合、NMOS1031〜1033はすべてオフになり、このとき、電源Ｖ₁ からＧＮＤへの電流パスがなくなるため、この回路は電力を消費しなくなる。 Next, when the current control signal Vg is at the “0” level, the NMOSs 1031 to 1033 are all turned off. Since there is then no current path from the power supply V₁ to GND, this circuit consumes no power.
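
The logical behaviour of the address buffer can be summarised in a few lines. This is a sketch at the logic level only, assuming the input is compared against the reference V_R; the function `address_buffer` is hypothetical, and returning `None` for the outputs while the buffer is powered down is an illustrative choice, since the text does not state the output levels in that state.

```python
def address_buffer(a_i, v_ref, vg_on):
    """Logic-level model of the buffer of FIG. 10.

    Returns (ai, not_ai, draws_current).
    """
    if not vg_on:
        # NMOS 1031-1033 are off: no current path from V1 to GND.
        # Output levels in this state are not specified in the text.
        return None, None, False
    high = a_i > v_ref            # differential pair compares Ai with V_R
    return int(high), int(not high), True


# Illustrative voltages only.
powered = address_buffer(a_i=1.2, v_ref=1.0, vg_on=True)
idle = address_buffer(a_i=1.2, v_ref=1.0, vg_on=False)
```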

ここで、電流制御信号Ｖｇは前述した図８（Ｂ）で示したように立上り，立下り時間が所定の時間になるように設定されるので電流の変化も図７の７ａで示したようになだらかなものとすることができる。 Here, since the current control signal Vg is set so that the rise and fall times become a predetermined time as shown in FIG. 8B, the change in current is also as shown by 7a in FIG. It can be gentle.

したがって、前述した図２４（Ｂ）に示したような電流切換え時の電源、ＧＮＤノイズ（図２４Ｂ参照）を所望の大きさに抑制することができる。 Therefore, the power source and the GND noise (see FIG. 24B) at the time of the current switching as shown in FIG. 24B can be suppressed to a desired magnitude.

次に、図１１にデータキャッシュメモリ内のデコーダ・ドライバ５２０，メモリアレイ５３０，センスアンプ５４０（図５参照）の部分の回路電流制御の例を示す。 Next, FIG. 11 shows an example of circuit current control of the decoder / driver 520, memory array 530, and sense amplifier 540 (see FIG. 5) in the data cache memory.

図中、１１６１，１１６２はＮＯＲゲートであり、アドレスデコーダの最終段に該当する。 In the figure, reference numerals 1161 and 1162 denote NOR gates, which correspond to the last stage of the address decoder.

１１７１，１１７２はアンドゲートからなるワードドライバであり、一方の入力にアドレスデコーダ１１６１，１１６２の出力が接続され、他方の入力に制御信号Ｖｇが接続されその出力によりワード線ＷＬ₁，ＷＬ₂をそれぞれ駆動する。 Reference numerals 1171 and 1172 denote word drivers composed of AND gates. The outputs of the address decoders 1161 and 1162 are connected to one input, the control signal Vg is connected to the other input, and the word lines WL ₁ and WL ₂ are respectively connected by the outputs. Drive.

１１００は、特に限定するものではないが４ＭＯＳ型のメモリセルであり、説明の便宜上、１セルだけを図示する。 Although not particularly limited, reference numeral 1100 denotes a 4MOS type memory cell, and only one cell is shown for convenience of explanation.

１１１１，１１１２はビット線プルアップ用の負荷ＭＯＳである。また、１１１３〜１１１６はビット線選択用のＭＯＳスイッチであり、カラム選択信号Ｃ₁，Ｃ₂により所望のビット線がコモンデータ線１１２０に結合される。 Reference numerals 1111 and 1112 denote load MOSs for pulling up the bit lines. Reference numerals 1113 to 1116 denote MOS switches for bit line selection; the column selection signals C₁ and C₂ couple the desired bit lines to the common data line 1120.

１１２１，１１２２はＮＰＮトランジスタによりエミッタフォロワ回路であり、コモンデータ線１１２０の信号をＶＢＥ（ベース・エミッタ間電圧）だけレベルシフトしてNPN1123，1124のそれぞれのベースに伝える。NPN1123，1124のエミッタは共通接続され、NMOS1141を介して電流源１１５１に接続されている。NPN1123，1124 のコレクタは抵抗１１３１，１１３２を介して電源Ｖ₁ に接続される。 Reference numerals 1121 and 1122 denote emitter follower circuits formed by NPN transistors, which level-shift the signal on the common data line 1120 by VBE (the base-emitter voltage) and pass it to the bases of NPNs 1123 and 1124, respectively. The emitters of NPNs 1123 and 1124 are connected together and, via the NMOS 1141, to the current source 1151. The collectors of NPNs 1123 and 1124 are connected to the power supply V₁ via the resistors 1131 and 1132.

NPN1123，1124 、抵抗１１３１，１１３２および電流源１１５１とは差動アンプを構成しており、メモリセル１００より読み出した微小信号を所定の振幅まで増幅する。同様に、１１５０は２ケの抵抗と２ケのＮＰＮからなる差動アンプを構成しておりNMOS1142を介して定電流源１１５２に接続されている。 The NPNs 1123 and 1124, the resistors 1131 and 1132, and the current source 1151 constitute a differential amplifier, which amplifies the small signal read from the memory cell 1100 to a predetermined amplitude. Similarly, 1150 is a differential amplifier composed of two resistors and two NPNs, and is connected to the constant current source 1152 via the NMOS 1142.

１１５０の２つの入力はNPN1123，1124 のコレクタに接続されており、それらの信号を更に増幅して端子１１５１に所定の振幅の出力信号を得るものである。 The two inputs of 1150 are connected to the collectors of NPNs 1123 and 1124, and further amplify those signals to obtain an output signal of a predetermined amplitude at terminal 1151.

ここで、アンドゲート１１７１，１１７２の一方の入力には前述した電流制御信号Ｖｇ（図８参照）が接続されているため、Ｖｇが“１”レベルのとき、アンドゲート１１７１，１１７２は選択的に駆動され、ワード線ＷＬ₁，ＷＬ₂を選択的に駆動する。一方、Ｖｇが“０”レベルのとき、アンドゲート１１７１，１１７２を始めとするワードドライバはすべてオフになる。したがって、このときメモリセル１１００を始めとするすべてのメモリセルに流入する電流が遮断される。したがって、メモリアクセスしない状態での無駄な電力消費がカットされる。 Here, since the above-described current control signal Vg (see FIG. 8) is connected to one input of the AND gates 1171 and 1172, when the Vg is at the “1” level, the AND gates 1171 and 1172 are selectively connected. Driven to selectively drive the word lines WL ₁ and WL ₂ . On the other hand, when Vg is at the “0” level, all word drivers including the AND gates 1171 and 1172 are turned off. Therefore, at this time, the current flowing into all the memory cells including the memory cell 1100 is cut off. Therefore, unnecessary power consumption in a state where no memory is accessed is cut.

同様に、NMOS1141，1142のゲートには電流制御信号Ｖｇが接続されている。Ｖｇが“１”レベルのときNMOS1141，1142はオン、Ｖｇが“０”レベルのときオフになる。 Similarly, the current control signal Vg is connected to the gates of the NMOSs 1141 and 1142. The NMOSs 1141 and 1142 are on when Vg is at the “1” level and off when Vg is at the “0” level.

したがって、メモリアクセスしない状態ではセンスアンプの電流は流れないため、無駄な電力消費がカットされる。 Therefore, the current of the sense amplifier does not flow when the memory is not accessed, so that unnecessary power consumption is cut.

ここで、電流制御信号Ｖｇによる回路電流の変化は図７の７ａに示すようになるため、電流切り換えによる電源、ＧＮＤのノイズを許容値に抑制できるばかりでなく、メモリアクセスの開始時点には上記ノイズは消滅しているため高速な動作が可能となる。 Here, the change in the circuit current due to the current control signal Vg is as shown by 7a in FIG. 7, so that not only the power supply and GND noise due to the current switching can be suppressed to an allowable value, but also the above-mentioned time at the start of memory access. Since the noise has disappeared, high-speed operation is possible.

なお、図１１でスイッチSW1180を信号ΦＳＡ側に切り換えるとNMOS1141，1142がパルス的に動作される。ΦＳＡ信号は前述したように（図６（Ｂ）参照）メモリアクセスステージｔ₂ ，ｔ₃ の所定時間だけ“１”レベルになるパルス信号であり、本例の場合、メモリアクセス中の一定時間だけセンスアンプに電力を供給することになり、低電力化を図ることができる。 When the switch SW1180 is switched to the signal φSA in FIG. 11, the NMOSs 1141 and 1142 are operated in a pulsed manner. As described above, the ΦSA signal is a pulse signal that goes to “1” level only for a predetermined time of the memory access stages t ₂ and t ₃ (see FIG. 6B). Since power is supplied to the sense amplifier, power consumption can be reduced.
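
The gating of the word drivers and sense amplifier described above reduces to simple Boolean conditions, sketched below. The function names `word_lines` and `sense_amp_on` and the `use_phi_sa` flag (standing in for the position of switch SW 1180) are illustrative assumptions, not identifiers from the source.

```python
def word_lines(decoder_outputs, vg_on):
    """Word drivers 1171, 1172: AND of each decoder output with Vg."""
    return [bool(d and vg_on) for d in decoder_outputs]


def sense_amp_on(vg_on, phi_sa, use_phi_sa):
    """Gate drive of NMOS 1141/1142.

    use_phi_sa models switch SW 1180: when True the sense amplifier
    follows the pulsed signal phi_SA instead of the level signal Vg.
    """
    return bool(phi_sa if use_phi_sa else vg_on)


# With Vg low, no word line is driven regardless of the decoder outputs.
idle_lines = word_lines([True, False], vg_on=False)
```

In the pulsed mode the sense amplifier draws current only while ΦSA is high, which is a shorter interval than the whole period for which Vg is asserted.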

次に、図１２にデータキャッシュメモリ１１２の出力ドライバ５５０（図５参照）の回路電流制御の例を示す。 Next, FIG. 12 shows an example of circuit current control of the output driver 550 (see FIG. 5) of the data cache memory 112.

図中、PMOS1211のドレイン，ゲート，ソースはそれぞれNPN1241 のベース，入力Ｖ_IN，電源Ｖ₁ に接続されている。NMOS1221のドレイン，ゲート，ソースはそれぞれNPN1241 のベース，入力Ｖ_IN，抵抗１２５１の一端に接続されている。PMOS1222のドレイン，ゲート，ソースはそれぞれNMOS1221のドレイン，電流制御信号Ｖｇ，NPN1241 のベースに接続されている。また、抵抗１２５１の両端にはコンデンサ１２６１が接続されている。ダイオード１２３１のアノードとカソードはそれぞれNPN1241 のコレクタとベースに接続されており、NPN1241 のコレクタには電源Ｖ₁ が接続されている。NPN1241 のエミッタは出力端子であり、出力端子と電源Ｖ₂ 間には終端抵抗１２５２が接続されている。 In the figure, the drain, gate, and source of the PMOS 1211 are connected to the base of NPN 1241, the input V_IN, and the power supply V₁, respectively. The drain, gate, and source of the NMOS 1221 are connected to the base of NPN 1241, the input V_IN, and one end of the resistor 1251, respectively. The drain, gate, and source of the PMOS 1222 are connected to the drain of the NMOS 1221, the current control signal Vg, and the base of NPN 1241, respectively. The capacitor 1261 is connected across the resistor 1251. The anode and cathode of the diode 1231 are connected to the collector and base of NPN 1241, respectively, and the power supply V₁ is connected to the collector of NPN 1241. The emitter of NPN 1241 is the output terminal, and the terminating resistor 1252 is connected between the output terminal and the power supply V₂.

いま、電流制御信号Ｖｇが“１”レベルのとき、PMOS1222はオフである。このとき、入力Ｖ_INが“０”レベルなら、PMOS1211がオン，NMOS1221がオフになる。 Now, when the current control signal Vg is at the “1” level, the PMOS 1222 is off. At this time, if the input V _IN is at the “0” level, the PMOS 1211 turns on and the NMOS 1221 turns off.

したがって、この時、PMOS1211を介してNPN1241 のベース電圧を立上げ、出力Ｖ_OUT は“１”レベルになる。逆に、Ｖ_INが“１”レベルのとき、PMOS1211がオフ，NMOS1221がオンになり、NPN1241のベース電圧を引下げ、出力Ｖ_OUTは“０”レベルになる。 At this time, therefore, the base voltage of NPN 1241 is raised via the PMOS 1211 and the output V_OUT goes to the “1” level. Conversely, when V_IN is at the “1” level, the PMOS 1211 turns off and the NMOS 1221 turns on, pulling down the base voltage of NPN 1241, and the output V_OUT goes to the “0” level.

なお、ダイオード１２３１はNPN1241 のベース電位の低下を所定値に抑えるためのクランパーである。 Note that the diode 1231 is a clamper for suppressing a decrease in the base potential of the NPN 1241 to a predetermined value.

また、抵抗１２５１は電流制限用、コンデンサ１２６１はスピードアップ用である。 Further, the resistor 1251 is for current limiting, and the capacitor 1261 is for speeding up.

次にＶｇが“０”レベルのとき、PMOS1222はオンになる。このとき、NPN1241のベース電位は入力Ｖ_INのレベルに関係なく引き下げられ、出力Ｖ_OUT は“０”レベルになる。したがって、NPN1241 のコレクタ電圧Ｖ_OUT は“１”レベルのときよりも小さくなり低消費電力化が図れる。 Next, when Vg is at the “0” level, the PMOS 1222 turns on. The base potential of NPN 1241 is then pulled down regardless of the level of the input V_IN, and the output V_OUT goes to the “0” level. The collector voltage V_OUT of NPN 1241 is therefore lower than at the “1” level, which reduces power consumption.

したがって、前述したアドレスバッファ５１０，デコーダ・ドライバ５２０，メモリアレイ５３０，センスアンプ５４０の回路電流制御と同様な効果が得られる。 Therefore, an effect similar to that of the above-described circuit current control of the address buffer 510, the decoder / driver 520, the memory array 530, and the sense amplifier 540 can be obtained.
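
The output driver's behaviour as described for FIG. 12 amounts to an inverter that is forced low when Vg is deasserted. The truth table can be sketched as follows; the function `output_driver` and the 0/1 encoding of logic levels are illustrative assumptions.

```python
def output_driver(v_in, vg_on):
    """Logic level of V_OUT for the driver of FIG. 12 (levels as 0/1)."""
    if not vg_on:
        # PMOS 1222 is on: the base of NPN 1241 is pulled down
        # regardless of V_IN, so the output stays at the "0" level.
        return 0
    # PMOS 1211 / NMOS 1221 act as an inverter driving NPN 1241.
    return 1 - v_in


truth_table = {(v_in, vg): output_driver(v_in, bool(vg))
               for v_in in (0, 1) for vg in (0, 1)}
```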

以上、前記第１の回路電流制御回路を用いた場合を例にとり、データキャッシュメモリ１１２（図１参照）の各部における回路電流制御の例を示したが、回路電流制御回路としては、前記第２の回路電流制御回路（図９参照）や他の回路電流制御回路を用いても良い。 As described above, an example of the circuit current control in each unit of the data cache memory 112 (see FIG. 1) has been described taking the case where the first circuit current control circuit is used as an example. Circuit current control circuit (see FIG. 9) or another circuit current control circuit may be used.

以上、本実施例においては、アクセス予告信号を使ったメモリのアクセス方法による低電力化の例を中心に説明したが、前述したように、例えば単一チップＭＰＵ内の演算器やレジスタファイルなど命令語の解釈によって動作を制御されるすべての機能回路において同様に適用することができる。また、本実施例においては、回路電流を動作実行ステージの前ステージに同期して立上げを開始する例について、説明したが、これは、必ずしも同期させる必要はなく、電流変化による電源や接地線のノイズを所定の値に抑制できる時間分、実行ステージの開始に先行して、立上げを開始すれば良い。この場合、前記ＰＵＰ信号を、実行ステージの前ステージに同期してではなく所望のタイミングで有意信号とすれば良い。 The description of this embodiment has centered on reducing power through a memory access method that uses the access notice signal. As noted above, however, the same approach can be applied to any functional circuit whose operation is controlled by instruction decoding, such as the arithmetic units and register file in a single-chip MPU. This embodiment also described starting the rise of the circuit current in synchronization with the stage preceding the execution stage, but such synchronization is not essential. It suffices to start the rise ahead of the execution stage by the time needed to hold the power supply and ground line noise caused by the current change to a predetermined value. In that case, the PUP signal is asserted at the desired timing rather than in synchronization with the stage preceding the execution stage.

以上、本実施例によれば、単一チップマイクロプロセッサに含まれるメモリ回路やその他の機能回路は実際の動作に先立つアクセス予告信号により回路電流を動作開始までに所定の割合で立上げた後、所定の動作を実行する。このため、これらの機能回路は実際に動作する時だけ回路性能上必要な電力を消費するため、単一チップマイクロプロセッサの低電力化に効果がある。 As described above, according to the present embodiment, after the memory circuit and other functional circuits included in the single-chip microprocessor start up the circuit current at a predetermined rate by the access notice signal prior to the actual operation before the operation starts, Perform a predetermined operation. Therefore, these functional circuits consume power required for circuit performance only when actually operating, which is effective in reducing the power of a single-chip microprocessor.

また低電力化した分だけ、新しい機能を付加することもできるため、高機能化，高集積化にも効果がある。 In addition, new functions can be added as much as the power consumption is reduced, which is also effective for higher functions and higher integration.

また、各機能回路は所定の割合で回路電流を変化させられるため、電流変化による電源や接地線のノイズを所定の値に抑制できる。このため、信頼性の高い回路動作を実現できる効果がある。 In addition, since each functional circuit can change the circuit current at a predetermined rate, noise of the power supply and the ground line due to the current change can be suppressed to a predetermined value. Therefore, there is an effect that a highly reliable circuit operation can be realized.

さらにまた、本実施例を適用した各機能回路では実際の動作を開始する時点で前記電源線や接地線のノイズが消滅しているため、最良の電源状態で動作することができ、回路の高速動作にも効果がある。 Furthermore, in the respective functional circuits to which the present embodiment is applied, since the noise of the power supply line and the ground line has disappeared at the time of starting the actual operation, the circuit can operate in the best power supply state, and the circuit can operate at high speed. There is also an effect on the operation.

次に、本発明をSuper Scalar型のＲＩＳＣプロセッサに適用した場合を説明する。 Next, a case where the present invention is applied to a Super Scalar type RISC processor will be described.

Super Scalar型のＲＩＳＣプロセッサとは、主にレジスタファイルを共用する複数の演算ユニットを設け、命令を簡単にしてパイプライン段数を少なくし、かつ、１マシンサイクルに複数の命令を読み出し、複数演算ユニットを制御するものである。つまり、１マシンサイクルで複数の命令が同時に読み出され、実行されるため、複数の演算ユニットが同時に動き、処理能力を高めることができる。 A Super Scalar type RISC processor provides a plurality of operation units that mainly share a register file, simplifies the instructions to reduce the number of pipeline stages, and reads out a plurality of instructions in one machine cycle to control the plurality of operation units. In other words, because a plurality of instructions are read and executed simultaneously in one machine cycle, a plurality of operation units operate at the same time and the processing capability is increased.

図１３は、第２の実施例で述べるプロセッサの命令一覧である。これらの命令を大きく類別すると、基本命令，分岐命令，ロード・ストア命令，システム制御命令に分けられる。なお、説明の都合上、簡単のために、上記の如く命令数を制限しているが、これは、本発明を制限するものではなく、さらに命令を増やしてもよい。 FIG. 13 is a list of instructions of the processor described in the second embodiment. These instructions can be roughly classified into basic instructions, branch instructions, load / store instructions, and system control instructions. The number of instructions is limited as described above for simplicity of explanation, but this does not limit the present invention, and the number of instructions may be further increased.

第２の実施例の構成を示したのが、図１４である。１４００はメモリインタフェース、１４０１はデータキャッシュ、１４０２はシーケンサ、１４０３は命令キャッシュ、1404は３２ビットの第１命令レジスタ、１４０５は３２ビットの第２命令レジスタ、１４０６は第１命令用第１のデコーダ、１４０８は、第１命令用第２のデコーダ、１４０９は、第２命令用第２のデコーダ、１４０７は、第２命令用第１のデコーダ、１４１３は第１，第２命令間の競合検出回路、１４１０は第１演算ユニット、１４１２は第２演算ユニット、１４１１はレジスタファイルである。本実施例では、１マシンサイクルの間に最大２つの命令が並列して読み出され実行される。本実施例でのパイプライン処理の最も基本的な動作を示したものが、図１５である。パイプラインはＩＦ(Instruction Fetch),Ｄ(Decode)，ＥＸ（Execution），Ｔ（Test），Ｗ（Write）の５段で構成される。 FIG. 14 shows the configuration of the second embodiment. 1400 is a memory interface, 1401 is a data cache, 1402 is a sequencer, 1403 is an instruction cache, 1404 is a 32-bit first instruction register, 1405 is a 32-bit second instruction register, 1406 is a first decoder for the first instruction, 1408 is a second decoder for the first instruction, 1409 is a second decoder for the second instruction, 1407 is a first decoder for the second instruction, 1413 is a conflict detection circuit between the first and second instructions, Reference numeral 1410 denotes a first operation unit, 1412 denotes a second operation unit, and 1411 denotes a register file. In this embodiment, up to two instructions are read and executed in parallel during one machine cycle. FIG. 15 shows the most basic operation of the pipeline processing in this embodiment. The pipeline is composed of five stages: IF (Instruction Fetch), D (Decode), EX (Execution), T (Test), and W (Write).

次いで、図１４を用いて、動作を説明する。ＩＦステージでは、シーケンサ１４０２内のプログラムカウンタによって指される２つの命令が命令キャッシュ１４０３より読み出され、バス１４１５，１４１７を通して、それぞれ第１命令レジスタ１４０４，第２命令レジスタ１０５にセットされる。 Next, the operation will be described with reference to FIG. 14. In the IF stage, the two instructions pointed to by the program counter in the sequencer 1402 are read from the instruction cache 1403 and set in the first instruction register 1404 and the second instruction register 1405 via the buses 1415 and 1417, respectively.

Ｄステージでは、第１命令レジスタ１４０４の内容が第１デコーダ１４０６でデコードされ、また、第２命令レジスタ１４０５の内容が第２デコーダ１４０７でデコードされる。その結果、第１命令レジスタ１４０４の第１ソースレジスタフィールドで指されるレジスタの内容がバス１４２５を通して、第２ソースレジスタフィールドで指されるレジスタの内容がバス１４２６を通して、第１演算ユニット１４１０へ送出される。また、第２命令レジスタの第１ソースレジスタで指されるレジスタの内容がバス１４２７を通して、第２ソースレジスタフィールドで指されるレジスタの内容がバス１４２８を通して、第２演算ユニット１４１２に送出される。 In the D stage, the contents of the first instruction register 1404 are decoded by the first decoder 1406, and the contents of the second instruction register 1405 are decoded by the second decoder 1407. As a result, the contents of the register designated by the first source register field of the first instruction register 1404 are sent over the bus 1425, and the contents of the register designated by its second source register field are sent over the bus 1426, to the first operation unit 1410. Likewise, the contents of the register designated by the first source register field of the second instruction register are sent over the bus 1427, and the contents of the register designated by its second source register field are sent over the bus 1428, to the second operation unit 1412.

次にＥＸステージの動作について説明する。ＥＸステージでは、第１命令レジスタのオペコードの内容に従って、第１演算ユニット１４１０において、バス１４２５，１４２６により送られてきたデータ間の演算を行う。並列して、第２命令レジスタ１４０５のオペコードの内容に従って、第２演算ユニット１４１２において、バス１４２７，１４２８により送られてきたデータ間の演算を行う。ロードストア命令はここでアドレス計算を行う。 Next, the operation of the EX stage will be described. In the EX stage, the first operation unit 1410 performs an operation between the data transmitted by the buses 1425 and 1426 in accordance with the contents of the operation code of the first instruction register. In parallel, according to the contents of the operation code of the second instruction register 1405, the second operation unit 1412 performs an operation between the data sent by the buses 1427 and 1428. The load store instruction calculates the address here.

次にＴステージの動作について説明する。Ｔステージでは、基本命令は、データを保持し続ける。ロードストア命令は、このステージで、前のＥＸステージで計算したアドレスをバス１４２９、又はバス１４３１を通して出力されたアドレスをもとにデータキャッシュ１４０１に対してメモリアクセスを実行する。なお、ストア命令の時は、同時に格納すべきデータがパス１４３７を通して出力される。 Next, the operation of the T stage will be described. In the T stage, a basic instruction simply continues to hold its data. A load/store instruction uses this stage to access the data cache 1401 with the address that was calculated in the preceding EX stage and output through the bus 1429 or the bus 1431. For a store instruction, the data to be stored is output at the same time through the path 1437.

最後にＷステージの動作を説明する。Ｗステージでは、第１演算ユニット１４１０の演算結果が、バス１４２９を通して、第１命令レジスタのディスティネーションフィールドで指されるレジスタに格納される。また、第２演算ユニット１４１２の演算結果が、バス１４３１を通して、第２命令レジスタのディスティネーションフィールドで指されるレジスタに格納される。さらに、ロード命令の時は、ロード命令内のディスティネーションフィールドで指されるレジスタへ、バス１４３０を通して、格納される。 Finally, the operation of the W stage will be described. In the W stage, the operation result of the first operation unit 1410 is stored in the register pointed to by the destination field of the first instruction register via the bus 1429. The operation result of the second operation unit 1412 is stored in the register pointed to by the destination field of the second instruction register via the bus 1431. Further, at the time of a load instruction, the data is stored through a bus 1430 to a register indicated by a destination field in the load instruction.

図１５は、基本命令を連続して処理するフローを示したものである。１マシンサイクルに２命令ずつ処理される。また、この例では、第１演算ユニットと第２演算ユニットは常に並列して動作している場合について描かれている。 FIG. 15 shows a flow of continuously processing the basic instructions. Two instructions are processed in one machine cycle. Also, in this example, the case where the first arithmetic unit and the second arithmetic unit always operate in parallel is illustrated.
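
The conflict-free flow of FIG. 15 can be sketched as a simple mapping from instruction pair and machine cycle to pipeline stage. This is a minimal sketch assuming one pair of instructions enters the pipeline per machine cycle with no stalls; the names `STAGES` and `stage_of` are illustrative.

```python
STAGES = ["IF", "D", "EX", "T", "W"]


def stage_of(pair_index, cycle):
    """Stage occupied in `cycle` by the instruction pair fetched in
    cycle `pair_index`, or None if the pair is not in the pipeline.

    Assumes one pair (two instructions) is fetched per machine cycle
    and that no conflict delays either instruction.
    """
    k = cycle - pair_index
    return STAGES[k] if 0 <= k < len(STAGES) else None


# Occupancy of the pipeline in machine cycle 2 for the first three pairs.
snapshot = [stage_of(p, 2) for p in range(3)]
```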

しかしながら、第１命令と第２命令との組み合せによっては、両命令を同時に実行できないことがある。これを競合と呼ぶ。 However, depending on the combination of the first instruction and the second instruction, both instructions may not be executed simultaneously. This is called a conflict.

例えば、第１命令のディスティネーションレジスタフィールドで指されるレジスタと、第２命令の第１ソースレジスタフィールドで指されるレジスタ、又は、第２命令の第２ソースレジスタフィールドＳ２で指されるレジスタが一致する時である。 One example is when the register designated by the destination register field of the first instruction matches the register designated by the first source register field of the second instruction or by the second source register field S2 of the second instruction.

このような競合が発生した時、ハードウエアは第１命令レジスタに入っている命令を１マシンサイクルかけて実行し、続いて次の１マシンサイクルで第２命令レジスタを実行するように制御される。つまり、第１命令，第２命令ともに、それぞれ１マシンサイクルかけて実行される。図１６に、競合が入った場合のパイプラインを示す。この例では、第１命令，第２命令共に加算命令であり、アドレス２の２命令について考えると第１命令はレジスタＲ（１），レジスタＲ（２）の内容を加算して、レジスタＲ（３）に格納するものであり、第２命令はレジスタＲ（４）とレジスタＲ（３）の内容を加算して、レジスタＲ（５）に格納するものである。ここで、第１命令のディスティネーションレジスタＲ(３)と、第２命令のソースレジスタＲ（３）競合している。このような場合、図１６に示す通り、１マシンサイクルごとに、各命令を実行する。 When such a conflict occurs, the hardware is controlled to execute the instruction contained in the first instruction register in one machine cycle, and then execute the second instruction register in the next one machine cycle. . That is, each of the first instruction and the second instruction is executed over one machine cycle. FIG. 16 shows a pipeline when a conflict occurs. In this example, both the first instruction and the second instruction are addition instructions. Considering the two instructions at address 2, the first instruction adds the contents of the registers R (1) and R (2) to form the register R ( 3), and the second instruction is to add the contents of the register R (4) and the register R (3) and store the result in the register R (5). Here, the destination register R (3) of the first instruction competes with the source register R (3) of the second instruction. In such a case, as shown in FIG. 16, each instruction is executed every one machine cycle.

つまり、ＰＣ２で第１命令を実行し、並行して行われる第２命令を無効化し、続いて、次のサイクルで第１命令を無効化し、並行して行われる第２命令を実行することで実現できる。なお、１サイクルずらした場合のディスティネーションとソースのぶつかりは従来からよく知られているショートパスを使えばよい。 That is, this can be realized by executing the first instruction at PC2 while invalidating the second instruction issued in parallel, and then, in the next cycle, invalidating the first instruction while executing the second instruction issued in parallel. The destination/source dependency that remains after the one-cycle shift can be handled with a conventional, well-known short path.
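
The register conflict check and the resulting serialised issue can be sketched as follows. The sketch covers only the destination/source conflict given as an example in the text, not the other kinds of conflict the detection circuit may handle; the `Instr` type and the functions `has_conflict` and `issue_slots` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: int   # destination register number
    src1: int   # first source register number
    src2: int   # second source register number


def has_conflict(first, second):
    """True when the second instruction reads the first one's destination."""
    return first.dest in (second.src1, second.src2)


def issue_slots(first, second):
    """Instructions executed in each machine cycle for one fetched pair."""
    if has_conflict(first, second):
        # Serialise: the first instruction runs while the second is
        # invalidated, then the second runs in the next cycle.
        return [["first"], ["second"]]
    return [["first", "second"]]


# The example of FIG. 16: R(3) = R(1) + R(2); R(5) = R(4) + R(3).
add1 = Instr(dest=3, src1=1, src2=2)
add2 = Instr(dest=5, src1=4, src2=3)
schedule = issue_slots(add1, add2)
```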

As shown in FIG. 14, the superscalar RISC processor has two arithmetic units, and when a conflict such as the above occurs only one of them is ever used. The other arithmetic unit performs meaningless processing.

In a superscalar RISC processor it is important, when a conflict is found, to identify the arithmetic unit that will be used and to activate it before it starts operating. This is described in detail with reference to FIG. 14. After the first and second instructions are read in the IF stage, the conflict detection circuit 1413 checks for the various conflicts between the first and second instructions in the D stage.

If the check finds a conflict, only one arithmetic unit executes, so it suffices to activate the arithmetic unit to be used through signals 1432 and 1433.

If there is no conflict, both arithmetic units are activated. An activated arithmetic unit remains active if, in the latter half of the machine cycle, the control signal for the next machine cycle again indicates activation. Otherwise the arithmetic unit is deactivated at the end of that machine cycle.

The case where a conflict occurs is now described in detail. When the conflict detection circuit for the first and second instructions detects a conflict, the first instruction is executed first, so the first arithmetic unit is told to activate by control signal 1435, via signal 1433, and is activated. At the same time the second arithmetic unit is told not to activate by control signal 1436, via control signal 1432. The second arithmetic unit therefore stays inactive, that is, in its low-power state.

At this time, signal 1434 informs the sequencer 1402 that a conflict has been detected.

In the next cycle, in order to execute the second instruction, the first arithmetic unit is told not to activate by control signal 1435, via signal 1433, and becomes inactive. At the same time the second arithmetic unit is told to activate by control signal 1436, via control signal 1432.

As described above, in two-instruction simultaneous processing as in this embodiment, when a conflict is found the arithmetic unit to be used is identified and activated before it starts operating. The arithmetic unit that is not activated consumes little power, which reduces overall power consumption.
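The cycle-by-cycle activation described above can be summarised in a small model. It is only a sketch: the function names are hypothetical, the booleans merely stand in for the activation state conveyed by control signals 1435 and 1436, and the "unit-cycles" count is an illustrative proxy for power, not a figure from the patent.

```python
from typing import List, Tuple


def activation_plan(conflict: bool) -> List[Tuple[bool, bool]]:
    """Per-cycle (unit1_active, unit2_active) for one pair of instructions."""
    if conflict:
        # Cycle 1: only the first unit is activated (first instruction).
        # Cycle 2: only the second unit is activated (second instruction).
        return [(True, False), (False, True)]
    # No conflict: both units are activated in the same cycle.
    return [(True, True)]


def active_unit_cycles(plan: List[Tuple[bool, bool]]) -> int:
    """Count how many unit-cycles are spent in the activated state."""
    return sum(int(u1) + int(u2) for u1, u2 in plan)


plan = activation_plan(conflict=True)
always_on = 2 * len(plan)  # both units powered in every cycle
print(plan)                      # [(True, False), (False, True)]
print(active_unit_cycles(plan))  # 2
print(always_on)                 # 4
```

Under these assumptions a conflicting pair keeps two unit-cycles active instead of four, which is the saving the text attributes to leaving the unused unit inactive.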

FIGS. 17 to 19 show the first arithmetic unit 1410, the second arithmetic unit 1412, and the register file 1411 extracted from FIG. 14; the connections between them are omitted.

FIG. 17 shows first and second arithmetic units built from circuits that use at least one differential input, for example ECL circuits. In a superscalar microprocessor built from such arithmetic units, the instructions are executed one machine cycle at a time when a conflict is detected. The first or second arithmetic unit that is actually used is activated by signal line 1435 or 1436 and operates with its current source supplying the current predetermined for the intended operation. The arithmetic unit that was not activated keeps its current source at a reduced current, or at no current, and therefore consumes little or no power.

FIGS. 18, 19, and 20 show first and second arithmetic units built from circuits in which logic is performed across the base-emitter junction of at least one bipolar transistor, for example ECL circuits or BiCMOS circuits. The circuit configuration itself is described in detail in JP-A-60-175167. This type of circuit has the drawback that a DC current flows whenever the bipolar transistor is ON, which increases power consumption. It is therefore effective to keep an arithmetic unit that is unused, for example because of a conflict, from consuming power. The control method is the same as that described for FIG. 17.

FIGS. 18 and 19 differ in how power is reduced. In FIG. 18, a P-channel MOS transistor is inserted between the collector of the bipolar transistor and Vcc; the circuit is in the operating state when this P-channel MOS transistor is ON and in the inactive state when it is OFF.

In FIG. 19, the circuit itself remains in the operating state, but when signal 1435 or 1436 turns ON the bipolar transistor is forcibly turned OFF so that no collector-emitter current flows. The DC current is thus forcibly cut off, which reduces power consumption.

FIG. 20 shows the first arithmetic unit 1410, the second arithmetic unit 1412, and the register file 1411 of FIG. 14, together with the clock distribution system that provides their timing. The point to note in the distribution system of FIG. 20 is the clock drivers A.

The clock drivers A supply clocks independently to the first arithmetic unit 1410, the register file 1411, and the second arithmetic unit 1412, respectively. In a superscalar microprocessor built from arithmetic units that include such a distribution system, the instructions are executed one per machine cycle when a conflict is detected, and signal line 1435 or 1436 stops the clock to the area of the clock distribution system that serves the first or second arithmetic unit not actually in use. The logic downstream of that point in the clock distribution system is thereby held fixed. In other words, one of the two arithmetic units receives the clock and operates, while the other receives no clock.

CMOS circuits and BiCMOS basic circuits are complementary, so their steady-state power consumption is very small, but they consume power during the transitions that occur when input data changes. Stopping the clock means that the logic is held fixed and does not change. Power consumption is thereby reduced, and the control method of FIG. 20 is effective for arithmetic units that include CMOS circuits or BiCMOS basic circuits.
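The effect of stopping the clock can be illustrated by counting bit transitions in a clocked register, on the assumption that transition count is a rough proxy for the power a complementary circuit consumes. The function name, the 8-bit width, and the input pattern are all invented for this sketch and do not come from the patent.

```python
from typing import Sequence


def count_toggles(inputs: Sequence[int], clock_enable: Sequence[bool], width: int = 8) -> int:
    """Count bit flips in a register that latches inputs[i] only when clocked."""
    mask = (1 << width) - 1
    state = 0
    toggles = 0
    for value, enabled in zip(inputs, clock_enable):
        if not enabled:
            # Clock stopped: the stored logic is held and nothing switches.
            continue
        nxt = value & mask
        toggles += bin(state ^ nxt).count("1")  # number of bits that change
        state = nxt
    return toggles


inputs = [0x0F, 0xF0, 0x0F, 0xF0]
always_clocked = count_toggles(inputs, [True, True, True, True])
gated = count_toggles(inputs, [True, False, True, False])
print(always_clocked)  # 28
print(gated)           # 4
```

With the clock withheld on alternate cycles the register sees far fewer transitions, which mirrors the argument that a unit whose clock is stopped holds its logic fixed and draws little power.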

As described with reference to FIGS. 17 to 20, power consumption in the inactive state can be reduced in a manner suited to the circuit type of the arithmetic unit. Clearly, power consumption can likewise be reduced in an arithmetic unit that combines the circuit types of FIGS. 17 and 18, by applying the corresponding method to each.

This embodiment has described conflicts between registers. Other conflicts include combinations of instructions that cannot be processed simultaneously (for example, a load instruction paired with another load instruction). FIG. 21 shows an example of such combinations. The combinations are, however, determined by the hardware implementation and are not directly related to the present invention. That is, when a pair of instructions falls under one or more of the restrictions in FIG. 21, a conflict due to the instruction combination exists.
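A combination rule of this kind amounts to a table lookup on the two instruction classes. The sketch below assumes a set of forbidden pairs; only the load/load pair is named in the text, the actual contents of FIG. 21 are not reproduced here, and the class names and function name are hypothetical.

```python
from typing import Set, Tuple

# Only the pair named in the text is listed; the real table is implementation-specific.
FORBIDDEN_PAIRS: Set[Tuple[str, str]] = {("LOAD", "LOAD")}


def combination_conflict(first_class: str, second_class: str) -> bool:
    """True when the two instruction classes cannot be issued in the same cycle."""
    return (first_class, second_class) in FORBIDDEN_PAIRS


print(combination_conflict("LOAD", "LOAD"))  # True
print(combination_conflict("ADD", "LOAD"))   # False
```

Because the table is data rather than logic, a different hardware implementation would change only the contents of `FORBIDDEN_PAIRS`, which matches the remark that the combinations are not part of the invention itself.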

Returning to FIG. 14, another operation of the conflict detection circuit 1413 and the decoders 1406, 1408, 1409, and 1407 is described as a third embodiment.

In the example described above, when a conflict was found, the arithmetic unit to be used was identified and activated before it started operating. In this third embodiment, when a conflict is found, the arithmetic unit that will not be used is identified and deactivated before operation starts. This is likewise described in detail with reference to FIG. 14. After the first and second instructions are read in the IF stage, the conflict detection circuit 1413 checks for the various conflicts between the first and second instructions in the D stage. If the check finds a conflict, only one arithmetic unit executes, so it suffices to deactivate the other arithmetic unit through signals 1432 and 1433.

That is, when the conflict detection circuit detects a conflict, the first instruction is executed first, while signal 1432 invalidates the first decoder for the second instruction and the second arithmetic unit is deactivated through control signal 1436. At this time, signal 1434 informs the sequencer 1402 that a conflict has been detected. In the next cycle, output 1433 of the conflict detection circuit invalidates the first decoder for the first instruction, and the first arithmetic unit is deactivated through control signal 1435. In parallel, the second instruction is executed. A deactivated arithmetic unit is controlled so that it is reactivated in the latter half of the machine cycle, which allows the following instruction to be executed.

As described above, in two-instruction simultaneous processing as in this embodiment, overall power consumption is reduced by detecting whether a conflict exists between two instructions that might be executed simultaneously and, if so, deactivating the arithmetic unit that is not used.

FIGS. 17 to 19 show the first arithmetic unit 1410, the second arithmetic unit 1412, and the register file 1411 extracted from FIG. 14; the connections between them are omitted. The way power is reduced in each arithmetic unit is the same as in the second embodiment.

In a superscalar microprocessor built from such arithmetic units, the instructions are executed one machine cycle at a time when a conflict is detected, and signal line 1435 or 1436 reduces the power consumption of the first or second arithmetic unit that is not actually used. Meanwhile, the arithmetic unit that is actually in use continues to draw from its current source the current set for the intended operation. In other words, control is such that the predetermined current keeps flowing in one unit while the other reduces its power consumption.

As in the second embodiment, power consumption can clearly be reduced in an arithmetic unit that combines the circuit types of FIGS. 17 and 18, by applying the corresponding method to each.

This embodiment has described conflicts between registers. As explained in the second embodiment, other conflicts include combinations of instructions that cannot be processed simultaneously (for example, a load instruction paired with another load instruction). FIG. 21 shows an example of such combinations. As also stated for the second embodiment, the combinations are determined by the hardware implementation and are not directly related to the present invention; when a pair of instructions falls under one or more of the restrictions in FIG. 21, a conflict due to the instruction combination exists.

Furthermore, although this embodiment has described combinations of basic instructions, an arithmetic unit may also perform meaningless processing on a branch instruction, or when the instruction immediately following a load instruction uses the loaded data (referred to as load use). The present invention is effective in these cases as well. FIG. 22 shows the case of a branch instruction, and FIG. 23 shows the case of load use. These operations can easily be inferred and are not described here.

Furthermore, when an instruction that does not actually operate an arithmetic unit, such as a NOP instruction or a system control instruction, is detected, the arithmetic unit on the side where it was detected can also be deactivated.

In FIG. 14, the second decoder 1408 for the first instruction and the second decoder 1409 for the second instruction are circuits that decode their respective instructions to detect whether the instruction actually operates an arithmetic unit.

When the second decoder 1408 for the first instruction detects such an instruction, it deactivates the first arithmetic unit 1410 through signal line 1435; likewise, when the second decoder 1409 for the second instruction detects one, it deactivates the second arithmetic unit 1412 through signal line 1436. This reduces the power consumed by the arithmetic units.
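The role of the two second decoders can be sketched as a per-slot check on the opcode. The mnemonics in `NON_ARITHMETIC` (in particular `"SYSCTL"`) and the function names are hypothetical; the returned booleans only stand in for what signal lines 1435 and 1436 convey to the arithmetic units.

```python
from typing import Tuple

# Instructions assumed not to use an arithmetic unit; "SYSCTL" is a made-up mnemonic.
NON_ARITHMETIC = {"NOP", "SYSCTL"}


def unit_enable(opcode: str) -> bool:
    """True when the instruction actually operates its arithmetic unit."""
    return opcode not in NON_ARITHMETIC


def decode_pair(first_op: str, second_op: str) -> Tuple[bool, bool]:
    """Return (unit1_enable, unit2_enable) for the two instruction slots."""
    return unit_enable(first_op), unit_enable(second_op)


print(decode_pair("ADD", "NOP"))  # (True, False)
print(decode_pair("NOP", "NOP"))  # (False, False)
```

Each slot is decided independently here, so this check can deactivate a unit even when no conflict exists between the two instructions.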

Although this embodiment has described a two-instruction superscalar microprocessor, the invention is also effective with other superscalar control schemes, and it is not limited to two instructions but is effective for any processor capable of processing multiple instructions simultaneously. Needless to say, it is not limited to RISC processors and can also be applied to CISC processors.

Although this embodiment has been described using a single-chip microprocessor as an example, the same effect can be obtained in other semiconductor integrated circuit devices, such as other single-chip LSIs, by predicting the start of operation of each functional circuit block and controlling that block's circuit current. In that case the method of predicting the start of operation and the timing of the circuit current control depend on the configuration and application of the device. The essence of this embodiment is nevertheless unchanged: the start of operation is predicted in advance, and the functional circuit block is activated a fixed time ahead of it so that no malfunction arises from current switching and the like, thereby securing both low power consumption and correct operation and, in turn, higher device speed.

Furthermore, this embodiment can be realized in the same way not only in semiconductor integrated circuits but also in general electronic circuits.

FIG. 1 is a block diagram showing the configuration of a microprocessor according to the first embodiment of the present invention.
FIG. 2 is an explanatory diagram showing the instruction execution stages of the microprocessor.
FIG. 3 is a timing chart showing the operation timing of the microprocessor.
FIG. 4 shows a block diagram of the access notice signal generation circuit and a timing chart of its operation.
FIG. 5 is a block diagram showing the configuration of the cache memory.
FIG. 6 shows a circuit diagram of a current control signal generation circuit and a timing chart of its operation.
FIG. 7 is a time chart showing the relationship between the access notice signal and the power supply current.
FIG. 8 shows a block diagram of the configuration of a current control signal generation circuit and a timing chart of its operation.
FIG. 9 shows a block diagram of a current control signal generation circuit and a timing chart of its operation.
FIG. 10 is a circuit diagram showing the configuration of the address buffer.
FIG. 11 is a block diagram showing the memory cell peripheral circuits.
FIG. 12 is a circuit diagram showing the output drive circuit.
FIG. 13 is a diagram showing the instruction list.
FIG. 14 is a block diagram showing the configuration of a microprocessor according to the second embodiment.
FIG. 15 is an explanatory diagram showing the instruction execution stages of the microprocessor of the second embodiment.
FIG. 16 is an explanatory diagram showing the instruction execution stages of the microprocessor when a conflict occurs.
FIG. 17 is a diagram showing an example of a circuit in an arithmetic unit.
FIG. 18 is a diagram showing another example of a circuit in an arithmetic unit.
FIG. 19 is a diagram showing another example of a circuit in an arithmetic unit.
FIG. 20 is a diagram showing the distribution system for the clock signals supplied to the arithmetic units.
FIG. 21 is an explanatory diagram showing the instruction combination rules in two-instruction simultaneous processing.
FIG. 22 is an explanatory diagram showing the instruction execution stages for a branch instruction.
FIG. 23 is an explanatory diagram showing the instruction execution stages at load use.
FIG. 24 is an explanatory diagram showing the relationship between a change in circuit current and noise voltage.

Explanation of reference numerals

101: program counter, 102: memory address register, 103: instruction cache memory, 104: instruction data register, 111: memory address register, 112: data cache memory, 113: memory data register, 120: first instruction decoder, 130: second instruction decoder, 140: arithmetic unit, 150: register file, 160: input/output control circuit, 410: memory access prediction circuit, 570: current control signal generation circuit, 940: active circuit, 1100: memory cell, 1150: sense amplifier.

Claims

A semiconductor integrated circuit device comprising: a functional circuit block having a power supply system inductance L, an allowable power supply noise V_n, and a circuit current switching width ΔI; and means for generating an operation start notice signal that activates the functional circuit block a time T in advance of the start of operation, wherein T, L, V_n, and ΔI satisfy the following relationship:

[relationship not reproduced in the source text]

A functional circuit block having a function of shifting from a low power consumption mode to an operation mode upon receiving a notice signal that predicts the start of an operation, by increasing its circuit current to a predetermined value over a predetermined time from reception of the notice signal, executing the operation, and, after the operation is completed, reducing the circuit current over a predetermined time to return to the low power consumption mode.

A memory that is activated by an access notice signal giving advance notice of an access, and that executes a predetermined memory operation based on an address signal, a read/write control signal, and a data input/output signal.

An information processing apparatus comprising at least one of the semiconductor integrated circuit device according to claim 1, the functional circuit block according to claim 2, and the memory according to claim 3.
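The relationship among T, L, V_n, and ΔI referred to in the first claim does not survive in this text. As a hedged illustration only, and not as the wording of the claim, the standard inductive-noise estimate suggests what such a condition could look like, assuming the circuit current changes by ΔI at a roughly constant rate over the lead time T:

```latex
% Voltage induced across the supply inductance L by a changing current:
%   v(t) = L \, \frac{dI}{dt}
% Assumption: the current ramps linearly by \Delta I over the time T, so
\[
  |v| \approx L \, \frac{\Delta I}{T}.
\]
% Requiring the induced noise to stay within the allowable noise V_n gives
\[
  L \, \frac{\Delta I}{T} \le V_n
  \quad\Longleftrightarrow\quad
  T \ge \frac{L \, \Delta I}{V_n}.
\]
% Illustrative numbers (not from the patent):
%   L = 10\,\mathrm{nH},\ \Delta I = 100\,\mathrm{mA},\ V_n = 0.1\,\mathrm{V}
%   \Rightarrow T \ge \frac{10\times10^{-9}\cdot 0.1}{0.1}\,\mathrm{s} = 10\,\mathrm{ns}.
```

Under this reading, a larger current step or a larger supply inductance calls for a longer lead time between the notice signal and the start of operation.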