JP2000029695A

JP2000029695A - Processor and arithmetic processing system

Info

Publication number: JP2000029695A
Application number: JP10193075A
Authority: JP
Inventors: Masaru Goto; 後藤　　勝; Masanori Osawa; 正紀大澤; Yukihiro Sakamoto; 幸弘阪本
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1998-07-08
Filing date: 1998-07-08
Publication date: 2000-01-28

Abstract

PROBLEM TO BE SOLVED: To provide a processor capable of achieving a high operating frequency. SOLUTION: In a processor that divides instruction execution into n (n ≥ 2) stages performed in sequence and carries out pipeline processing by executing different stages of a plurality of consecutive instructions in parallel based on a clock signal, an arithmetic operation module 12a that performs arithmetic operations and a logical operation module 12b that performs logical operations are designed independently of each other using a hardware description language and are arranged in different areas on a substrate, and the arithmetic operation and the logical operation are each executed within one clock cycle of the clock signal.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

Technical Field of the Invention: The present invention relates to a processor that performs instruction pipeline processing and to an arithmetic processing system that includes the processor.

[0002]

Description of the Related Art: RISC (Reduced Instruction Set Computer) architectures have become mainstream in recent processors. Whereas a CISC (Complex Instruction Set Computer) processor seeks speed by raising the functional level of its instructions and thereby reducing the number of instructions executed, a RISC processor seeks speed by making full use of an instruction pipeline to bring the average number of clock cycles required per instruction as close to 1 as possible. A RISC processor therefore simplifies the function of each instruction to suit instruction pipeline processing, and relies on static code scheduling by the compiler so that the instruction pipeline does not stall.

[0003] Instruction pipeline processing is a technique that raises overall throughput by dividing instruction execution into a plurality of stages and executing those stages in an overlapping manner.

[0004] FIG. 5 is a diagram for explaining five-stage instruction pipeline processing. As shown in FIG. 5, five-stage instruction pipeline processing divides the execution of one instruction into five stages, IF (Instruction Fetch), RF (Register Fetch), EX (EXecution), MEM (MEMory access), and WB (Write Back), and executes each stage in one clock cycle. Briefly, in the IF stage, the address on the external memory indicated by the program counter is updated, and the instruction is then read (fetched) from the updated address. In the RF stage, the fetched instruction is decoded and, if necessary, data is read from the general-purpose registers. In the EX stage, an operation is performed, if necessary, using the data read in the RF stage. In the MEM stage, the external memory is accessed if necessary. In the WB stage, when an operation has been performed in the RF/EX stages, the result of that operation is written to a register.

[0005] In the five-stage instruction pipeline processing described above, as shown in FIG. 5, the IF, RF, EX, MEM, and WB stages are executed in parallel in clock cycle "5", so the apparent processing speed can be increased fivefold compared with the case where instruction pipeline processing is not used.
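
The fivefold figure can be checked with a small cycle-count model. The sketch below is illustrative only: the function names are assumptions introduced here, and it assumes an ideal pipeline with no stalls, which the specification does not guarantee.

```python
def cycles_without_pipeline(num_instructions: int, num_stages: int = 5) -> int:
    """Each instruction runs all of its stages before the next one starts."""
    return num_instructions * num_stages


def cycles_with_pipeline(num_instructions: int, num_stages: int = 5) -> int:
    """Ideal pipeline: one fill of num_stages cycles, then one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def apparent_speedup(num_instructions: int, num_stages: int = 5) -> float:
    """Ratio of unpipelined to pipelined cycle counts (needs at least one instruction)."""
    if num_instructions < 1:
        raise ValueError("num_instructions must be at least 1")
    return cycles_without_pipeline(num_instructions, num_stages) / cycles_with_pipeline(
        num_instructions, num_stages
    )
```

Under this model, five instructions take 9 cycles instead of 25, and the ratio approaches 5 only as the instruction stream grows long, which is why the speedup is described as apparent.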

[0006] In instruction pipeline processing, the length of one clock cycle, which determines the operating frequency, is set by the execution time of the stage that takes longest among the plurality of stages (the stage that forms the critical path). To raise the operating frequency, therefore, the execution time of the critical-path stage must be shortened. In the five-stage instruction pipeline shown in FIG. 5, the EX stage, which performs the arithmetic processing, is the critical path in determining the length of one clock cycle and has been the bottleneck in improving operating speed.
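
The relationship between the slowest stage and the achievable clock can be sketched as follows. The stage delays in the usage note are hypothetical values chosen for illustration; the specification gives no per-stage timings, and the function names are assumptions.

```python
from typing import Mapping


def critical_stage(stage_delays_ns: Mapping[str, float]) -> str:
    """Name of the stage with the longest execution time (the critical path)."""
    return max(stage_delays_ns, key=lambda name: stage_delays_ns[name])


def max_clock_frequency_hz(stage_delays_ns: Mapping[str, float]) -> float:
    """The clock period cannot be shorter than the slowest stage."""
    return 1e9 / max(stage_delays_ns.values())
```

For example, with hypothetical delays of IF 15 ns, RF 12 ns, EX 20 ns, MEM 16 ns, and WB 10 ns, the EX stage limits the clock to 50 MHz. Shortening EX alone to 14 ns moves the limit to the MEM stage at 62.5 MHz.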

[0007] In the processor, arithmetic and logical operations are performed by an ALU (Arithmetic Logic Unit) in the EX stage. The ALU of a conventional processor has an arithmetic unit that performs arithmetic operations on numerical data, such as addition, subtraction, multiplication, division, arithmetic shifts, and comparisons, and a logic unit that performs logical operations on non-numerical data, such as logical operations and logical shifts. These units are designed automatically as a single block using a hardware description language such as Verilog-HDL (Hardware Description Language) and a compiler for that language. A conventional processor has, for example, an operating frequency of 45 MHz, which corresponds to a clock cycle time of about 22 ns.

[0008] The internal states of a processor include a normal operating state, in which instructions are executed in sequence, and a standby state, in which the processor waits for an interrupt signal to be input from outside. In a conventional processor, even in the standby state, the clock signal is supplied to every module in the processor that operates on the clock signal, and these modules operate just as they do during normal operation.

[0009]

Problems to Be Solved by the Invention: In the conventional processor described above, however, the entire ALU is designed automatically as a single block using a hardware description language and a compiler for that language. As a result, the components of each operation unit are scattered over a wide area, which makes it difficult to increase operation speed.

[0010] In addition, in the conventional processor described above, the clock signal is supplied in the standby state, just as in normal operation, to every module in the processor that operates on the clock signal, and these modules are kept running. Power consumption is therefore high.

[0011] The present invention has been made in view of the problems of the prior art described above, and its object is to provide a processor capable of achieving a high operating frequency. A further object of the present invention is to provide an arithmetic processing system capable of reducing power consumption.

[0012]

Means for Solving the Problems: To solve the problems of the prior art described above and achieve the objects described above, the processor of the present invention divides instruction execution into n (n ≥ 2) stages performed in sequence and performs pipeline processing by executing different stages of a plurality of consecutive instructions in parallel based on a clock signal, wherein an arithmetic operation module that performs arithmetic operations and a logical operation module that performs logical operations are arranged in different regions on a substrate, and the arithmetic operation and the logical operation are each executed within one clock cycle of the clock signal.

[0013] In the processor of the present invention, the arithmetic operation module and the logical operation module are arranged in different regions on the substrate. Compared with the case where the components of the arithmetic operation module and the logical operation module are mixed in the same region, the components within each module can therefore be placed closer to one another. As a result, both the arithmetic operation in the arithmetic operation module and the logical operation in the logical operation module can be executed at high speed, and the length of one clock cycle of the pipeline processing can be shortened.

[0014] More specifically, in the processor of the present invention, the arithmetic operation module and the logical operation module are laid out by carrying out their designs independently of each other, each design being performed by compiling, with a compiler, a circuit design program written in a hardware description language.

[0015] Further, the arithmetic processing system of the present invention comprises a clock signal supply device that supplies a clock signal, and a processor that divides instruction execution into n (n ≥ 2) stages performed in sequence and performs pipeline processing by executing different stages of a plurality of consecutive instructions in parallel based on the clock signal from the clock signal supply device. The processor has: a program counter that indicates the address in an external memory at which the instruction to be read is stored; decoding means that decodes the read instruction and, on decoding a low power consumption mode instruction, outputs a low power consumption mode instruction signal to the clock signal supply device; arithmetic means that performs an operation according to the result of the decoding, using an arithmetic operation module that performs arithmetic operations and a logical operation module that performs logical operations, the two modules being arranged in different regions on a substrate; and interrupt control means that, on receiving an interrupt signal, outputs a low power consumption mode release signal to the clock signal supply device. The clock signal supply device supplies a clock signal to the interrupt control means of the processor at all times. On receiving the low power consumption mode instruction signal, it stops supplying the clock signal to the program counter, the decoding means, and the arithmetic means. On receiving the low power consumption mode release signal, it starts supplying the clock signal to the program counter, the decoding means, and the arithmetic means.

[0016]

Embodiments of the Invention: A processor and an arithmetic processing system according to an embodiment of the present invention are described below. FIG. 1 is a diagram for explaining the connection between the processor 1 of this embodiment and an external memory 2. As shown in FIG. 1, the processor 1 is connected to the external memory 2 via an instruction bus 3 and a data bus 4. The external memory 2 stores the instructions and data processed by the processor 1 and supplies them to the processor 1 via the instruction bus 3 and the data bus 4, respectively.

[0017] FIG. 2 is a configuration diagram of the processor 1 and a clock signal supply circuit 40, serving as the clock signal supply device, which together constitute the arithmetic processing system of this embodiment. As shown in FIG. 2, the processor 1 has a general-purpose register group 10, a bypass logic module 11, an ALU module 12, a multiplication module 13, a division module 14, a program counter 15, an address operation module 16, an instruction decoder 17, a control register group 18, and an interrupt controller 19. Here, the interrupt controller 19 corresponds to the interrupt control means of the present invention.

[0018] As described later, the processor 1 operates on two clock signals: a clock signal S53, which runs at all times, and a clock signal S52, which runs or stops depending on whether a sleep instruction has been executed. The clock signals S53 and S52 are input from the clock signal supply circuit 40. When the processor 1 receives the clock signal S52, it supplies that clock signal to every clock-driven component other than the interrupt controller 19. When the processor 1 receives the clock signal S53, on the other hand, it supplies that clock signal only to the interrupt controller 19 and operates only the interrupt controller 19.

[0019] The processor 1 has an operating frequency of 54 MHz, so one clock cycle lasts 1/(54 × 10⁶) s (about 18.5 ns). The processor 1 is a RISC processor that executes 16-bit fixed-length instructions and adopts a stack machine architecture based on load (read) and store (write) instructions for the general-purpose registers of the general-purpose register group 10. It also has branch instructions with a variety of branch conditions comparable to those of a CISC processor.
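
The clock cycle time follows directly from the operating frequency as T = 1/f. A minimal sketch of that conversion is given below; the function name is an assumption, and the 45 MHz and 54 MHz inputs in the usage note are the two frequencies quoted in the description.

```python
def clock_period_ns(frequency_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in megahertz (T = 1/f)."""
    if frequency_mhz <= 0:
        raise ValueError("frequency_mhz must be positive")
    return 1e3 / frequency_mhz
```

Calling `clock_period_ns(45)` gives roughly 22.2 ns for the conventional processor, and `clock_period_ns(54)` gives roughly 18.5 ns for the processor 1, a reduction of about 3.7 ns per cycle.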

[0020] The general-purpose register group 10 has thirty-two 32-bit general-purpose registers. As shown in FIG. 3, when instructions are executed in an instruction pipeline in the order IF stage 30, RF stage 31, EX stage 32, MEM stage 33, and WB stage 34, the bypass logic module 11 provides a bypass that feeds the result of the EX stage 32 back to the EX stage 32 and the RF stage 31 without passing through the MEM stage 33 and the WB stage 34, and a bypass that feeds data that has completed the MEM stage 33 back to the RF stage 31 without passing through the WB stage 34. With the bypass logic module 11, when the result computed in the EX stage of an earlier instruction is used in the EX stage of a later instruction, the result does not have to be written to a register in the WB stage and then read back from that register in the RF stage of the later instruction. Instead, the result available once the earlier instruction has finished its EX stage and MEM stage is supplied through the bypass to the ALU module 12, the multiplication module 13, and the division module 14 shown in FIG. 1, so that the EX stage of the later instruction can be executed at an earlier timing.
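
The operand selection performed by such a bypass can be sketched in software. This is a simplified model under stated assumptions, not the circuit of the bypass logic module 11: the function name, the tuple representation of in-flight results, and the rule that the EX-stage result takes priority over the MEM-stage result are all assumptions introduced for illustration.

```python
from typing import Dict, Optional, Tuple

# An in-flight result: (destination register number, value). None means "no result".
InFlight = Optional[Tuple[int, int]]


def read_operand(
    reg: int,
    register_file: Dict[int, int],
    ex_result: InFlight,
    mem_result: InFlight,
) -> int:
    """Return the newest value of `reg`, preferring bypassed results over the register file."""
    # Assumed priority: the EX-stage result belongs to the most recent instruction.
    if ex_result is not None and ex_result[0] == reg:
        return ex_result[1]
    # Next, a result that has completed the MEM stage but not yet been written back.
    if mem_result is not None and mem_result[0] == reg:
        return mem_result[1]
    # No in-flight producer: the register file already holds the current value.
    return register_file[reg]
```

A later instruction that needs register 1 while an earlier instruction's result for register 1 is still in the pipeline receives the bypassed value rather than the stale register contents.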

[0021] The ALU module 12 has an arithmetic operation module 12a, a logical operation module 12b, and a shift operation module 12c. The arithmetic operation module 12a includes an adder that performs addition on numerical data, a subtractor that performs subtraction, a comparator that performs comparison operations, and the like. The logical operation module 12b includes a logic unit that performs logical operations on non-numerical data, a bit-field manipulation unit that performs bit-field operations, a data converter that performs data conversion, and the like. The shift operation module 12c includes an arithmetic shifter, a logical shifter, and the like. The multiplication module 13 includes a multiplier. The division module 14 includes a divider.
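
The division of the ALU into separate modules can be mirrored in a small software model in which each operation is routed to exactly one module. The sketch below is only an analogy for the module partitioning: the operation mnemonics, function names, and 32-bit wrap-around behaviour are assumptions, and it says nothing about physical placement on the substrate.

```python
MASK32 = 0xFFFFFFFF  # results are truncated to 32 bits (assumed data width)


def arithmetic_module(op: str, a: int, b: int) -> int:
    """Addition, subtraction, and comparison on numerical data."""
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "cmp_eq":
        return int(a == b)
    raise ValueError(f"unsupported arithmetic operation: {op}")


def logical_module(op: str, a: int, b: int) -> int:
    """Bitwise logical operations on non-numerical data."""
    if op == "and":
        return a & b & MASK32
    if op == "or":
        return (a | b) & MASK32
    if op == "xor":
        return (a ^ b) & MASK32
    raise ValueError(f"unsupported logical operation: {op}")


def shift_module(op: str, a: int, amount: int) -> int:
    """Logical and arithmetic shifts of a 32-bit value."""
    a &= MASK32
    if op == "lsl":
        return (a << amount) & MASK32
    if op == "lsr":
        return a >> amount
    if op == "asr":
        signed = a - (1 << 32) if a & 0x80000000 else a  # reinterpret as signed
        return (signed >> amount) & MASK32
    raise ValueError(f"unsupported shift operation: {op}")


# Each mnemonic is owned by exactly one module, mirroring the separate designs.
DISPATCH = {
    **{op: arithmetic_module for op in ("add", "sub", "cmp_eq")},
    **{op: logical_module for op in ("and", "or", "xor")},
    **{op: shift_module for op in ("lsl", "lsr", "asr")},
}


def alu_execute(op: str, a: int, b: int) -> int:
    """Route one operation to the module that owns it."""
    if op not in DISPATCH:
        raise ValueError(f"unknown operation: {op}")
    return DISPATCH[op](op, a, b)
```

Because no operation is shared between modules, each module can be developed and tested on its own, which is the software counterpart of designing each hardware module individually.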

[0022] In the processor 1, the arithmetic operation module 12a, the logical operation module 12b, the shift operation module 12c, the multiplication module 13, and the division module 14 are each designed automatically and individually: a design program is written for each module in a hardware description language such as Verilog-HDL (Hardware Description Language), and that design program is compiled with a compiler. Consequently, the arithmetic operation module 12a, the logical operation module 12b, the shift operation module 12c, the multiplication module 13, and the division module 14 are arranged independently of one another in different regions on the substrate. Designing each module individually with a hardware description language in this way allows the components within each module to be placed close together on the substrate, which speeds up signal processing. As a result, the time required for the EX stage shown in FIG. 3 is shortened, and the length of one clock cycle can be shortened as described above.

[0023] The program counter 15 indicates the address, on the external memory 2 shown in FIG. 1, of the instruction to be fetched next. As a rule, the address of the external memory 2 indicated by the program counter 15 is automatically incremented by a fixed step every clock cycle. The address indicated by the program counter 15 is updated before the instruction is read in the IF stage.

[0024] The address operation module 16 calculates the address of the data to be accessed on the external memory 2. The address is generated by an automatic increment function according to the byte boundary of the address.

[0025] The instruction decoder 17 decodes the instruction that has been read from the external memory 2 and is being transferred on the instruction bus, generates control signals, and has overall control of the instruction pipeline processing. As shown in FIG. 3, the processor 1 adopts a five-stage instruction pipeline consisting of an IF stage 30, an RF stage 31, an EX stage 32, a MEM stage 33, and a WB stage 34. Here, the IF stage 30, the RF stage 31, the EX stage 32, the MEM stage 33, and the WB stage 34 correspond respectively to the instruction fetch stage, the decode stage, the operation stage, the memory access stage, and the write-back stage of the present invention.

[0026] The processing of each stage is briefly as follows. In the IF stage 30, the address on the external memory 2 indicated by the program counter 15 is updated, and an instruction is read (fetched) from the updated address. In the RF stage 31, the instruction fetched in the IF stage 30 is decoded and, if necessary, data is read from the general-purpose (data) registers of the general-purpose register group 10. In the EX stage 32, an operation is performed, if necessary, in one of the arithmetic operation module 12a, the logical operation module 12b, the shift operation module 12c, the multiplication module 13, and the division module 14, using the data read from the general-purpose registers in the RF stage 31. In the MEM stage 33, the external memory 2 is accessed if necessary. In the WB stage 34, when an operation has been performed in the EX stage 32, the result of that operation is written to a general-purpose register.

[0027] When the instruction decoder 17 decodes a sleep instruction, it outputs a sleep signal S51 to the clock signal supply circuit 40. Here, the sleep instruction corresponds to the low power consumption mode instruction of the present invention, and the sleep signal S51 corresponds to the low power consumption mode instruction signal of the present invention.

[0028] The control register group 18 has ten 32-bit control registers used for interrupt control, debug processing, and the like. The interrupt controller 19 has overall control of interrupt handling, such as saving to the external memory 2 the address indicated by the program counter 15 at the time of an interrupt and manipulating the stack pointer. When an interrupt signal S50 is input from outside, the interrupt controller 19 performs the interrupt processing predetermined for the content of that interrupt signal S50. When the interrupt signal S50 is one that releases the standby state entered by executing a sleep instruction, the interrupt controller 19 outputs a wake-up signal S54 to the clock signal supply circuit 40. Here, the wake-up signal S54 corresponds to the low power consumption mode release signal of the present invention.

[0029] In the normal state, in which the sleep signal S51 has not been input from the instruction decoder 17 of the processor 1, the clock signal supply circuit 40 supplies both clock signals S52 and S53 to the processor 1. When the sleep signal S51 is input from the instruction decoder 17 of the processor 1, the clock signal supply circuit 40 stops supplying the clock signal S52 until the wake-up signal S54 is input from the interrupt controller 19. When the wake-up signal S54 is input from the interrupt controller 19, the clock signal supply circuit 40 resumes supplying the clock signal S52 to the processor 1.
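
The behaviour of the clock signal supply circuit 40 amounts to a two-state machine, which can be modelled as follows. This is a behavioural sketch only: the class and method names are assumptions introduced here, and real clock gating would be implemented in hardware rather than software.

```python
class ClockSupplyModel:
    """Behavioural model: S53 always runs, S52 is gated by sleep and wake-up signals."""

    def __init__(self) -> None:
        self.s52_enabled = True  # normal state: both clocks are supplied

    def on_sleep_signal(self) -> None:
        """Sleep signal S51 received from the instruction decoder: stop S52."""
        self.s52_enabled = False

    def on_wake_up_signal(self) -> None:
        """Wake-up signal S54 received from the interrupt controller: resume S52."""
        self.s52_enabled = True

    def supplied_clocks(self) -> dict:
        """Which clock signals are currently being supplied to the processor."""
        return {"S52": self.s52_enabled, "S53": True}
```

Keeping S53 unconditional is what allows the interrupt controller to stay responsive while every other clock-driven component is stopped.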

[0030] The operation of the processor 1 is described below.

Operation of instruction pipeline processing

FIG. 4 is a diagram for explaining the instruction pipeline processing of the processor 1. The processing performed in the IF, RF, EX, MEM, and WB stages is as described above.

Clock cycle "1": The IF stage of instruction 60 is performed. The address of the external memory 2 shown in FIG. 1 that is indicated by the program counter 15 shown in FIG. 2 is incremented by the fixed length "2" to address "PC", and instruction 60, read from address "PC", is transferred onto the instruction bus (fetched).

[0031] Clock cycle "2": The RF stage of instruction 60, which was fetched in clock cycle "1", is performed, and instruction 60 is decoded by the instruction decoder 17. The IF stage of instruction 61 is also performed: the address of the external memory 2 indicated by the program counter 15 is incremented to address "PC+2", and instruction 61, stored at address "PC+2", is fetched.

[0032] Clock cycle "3": The EX stage of instruction 60 and the RF stage of instruction 61 are performed. The IF stage of instruction 62 is also performed: the address of the external memory 2 indicated by the program counter 15 is incremented to address "PC+4", and instruction 62, stored at address "PC+4", is fetched.

[0033] Clock cycle "4": The MEM stage of instruction 60, the EX stage of instruction 61, and the RF stage of instruction 62 are performed. The IF stage of instruction 63 is also performed: the address of the external memory 2 indicated by the program counter 15 is incremented to address "PC+6", and instruction 63, stored at address "PC+6", is fetched.

[0034] Clock cycle "5": The WB stage of instruction 60, the MEM stage of instruction 61, the EX stage of instruction 62, and the RF stage of instruction 63 are performed. The IF stage of instruction 64 is also performed: the address of the external memory 2 indicated by the program counter 15 is incremented to address "PC+8", and instruction 64, stored at address "PC+8", is fetched.

[0035] Clock cycle "6": The WB stage of instruction 61, the MEM stage of instruction 62, the EX stage of instruction 63, and the RF stage of instruction 64 are performed.

[0036] Clock cycle "7": The WB stage of instruction 62, the MEM stage of instruction 63, and the EX stage of instruction 64 are performed.

[0037] Clock cycle "8": The WB stage of instruction 63 and the MEM stage of instruction 64 are performed.

[0038] Clock cycle "9": The WB stage of instruction 64 is performed.
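
The cycle-by-cycle walk-through of instructions 60 to 64 can be reproduced with a short schedule generator. The sketch assumes an ideal five-stage pipeline in which one instruction enters per clock cycle and nothing stalls; the function name and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List, Sequence

STAGES: List[str] = ["IF", "RF", "EX", "MEM", "WB"]


def pipeline_schedule(
    instructions: Sequence[int], stages: Sequence[str] = STAGES
) -> Dict[int, Dict[int, str]]:
    """Map each clock cycle (starting at 1) to {instruction: stage} for a stall-free pipeline."""
    schedule: Dict[int, Dict[int, str]] = {}
    for index, instruction in enumerate(instructions):
        for offset, stage in enumerate(stages):
            cycle = index + offset + 1  # instruction `index` enters IF in cycle index + 1
            schedule.setdefault(cycle, {})[instruction] = stage
    return schedule
```

For the five instructions 60 to 64, the generated schedule spans clock cycles 1 to 9, with all five stages busy in clock cycle 5 and one instruction completing its WB stage in each of cycles 5 to 9.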

[0039] As described above, in the five-stage instruction pipeline processing of processor 1, execution of instruction 60 ends in clock cycle "5", execution of instruction 61 ends in clock cycle "6", execution of instruction 62 ends in clock cycle "7", execution of instruction 63 ends in clock cycle "8", and execution of instruction 64 ends in clock cycle "9". Processor 1 therefore appears to complete one instruction per clock cycle. Furthermore, as described earlier, the arithmetic operation module 12a, the logical operation module 12b, the shift operation module 12c, the multiplication module 13, and the division module 14 are each designed individually using a hardware description language and a compiler for that language, so these modules can be integrated in different areas on the substrate, as shown in FIG. 2. As a result, the execution time of each module can be shortened. This shortens the time required for the EX stage shown in FIG. 3, which is the critical path that determines the clock cycle, and the duration of one clock cycle can be reduced to 1/(54×10⁶) sec.
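The cycle-by-cycle walkthrough above can be reproduced with a short simulation. The following is a minimal sketch, assuming an ideal pipeline with no stalls in which instruction 60 enters the IF stage in clock cycle "1"; the function names `pipeline_schedule` and `completion_cycles` are hypothetical and do not appear in the specification.

```python
# Stage names as used in the description (IF, RF, EX, MEM, WB).
STAGES = ["IF", "RF", "EX", "MEM", "WB"]


def pipeline_schedule(instructions, first_cycle=1):
    """Map each clock cycle to the (instruction, stage) pairs active in it.

    Assumption: an ideal pipeline with no stalls. Instruction k (0-based)
    enters IF at cycle first_cycle + k and advances one stage per cycle.
    """
    schedule = {}
    for k, instr in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            schedule.setdefault(first_cycle + k + s, []).append((instr, stage))
    return schedule


def completion_cycles(instructions, first_cycle=1):
    """Return the cycle in which each instruction performs its WB stage."""
    last = len(STAGES) - 1
    return {instr: first_cycle + k + last for k, instr in enumerate(instructions)}


instructions = [60, 61, 62, 63, 64]
schedule = pipeline_schedule(instructions)
done = completion_cycles(instructions)

for cycle in sorted(schedule):
    active = ", ".join(f"{instr}:{stage}" for instr, stage in schedule[cycle])
    print(f"cycle {cycle}: {active}")

# Clock period quoted in the text: 1 / (54 x 10^6) seconds.
CLOCK_HZ = 54e6
period_ns = 1e9 / CLOCK_HZ
print(f"clock period = {period_ns:.1f} ns")
```

Under these assumptions, five instructions finish in 5 + 5 - 1 = 9 cycles, one completing in each of cycles "5" through "9", which is why the average cost approaches one clock cycle per instruction as the instruction count grows.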

[0040] Operation When a Sleep Instruction Is Executed
As a premise for this explanation, the program executed by processor 1 contains a sleep instruction immediately after an instruction that waits for input of the interrupt signal S50. First, the instruction that waits for input of the interrupt signal S50 is fetched in the IF stage. At this time, the clock signal supply circuit 40 supplies the clock signals S52 and S53 to processor 1.

[0041] Next, the RF stage of the instruction that waits for input of the interrupt signal S50 is performed, and the instruction is decoded by the instruction decoder 17. Processor 1 thereby enters a state of waiting for input of the interrupt signal S50. In addition, the sleep instruction is fetched in the IF stage. Next, the RF stage of the sleep instruction is performed in the instruction decoder 17, and the instruction decoder 17 outputs the sleep signal S51 to the clock signal supply circuit 40. In response, the clock signal supply circuit 40 stops supplying the clock signal S52 to processor 1 and supplies the clock signal S53 only to the interrupt controller 19. Consequently, only the interrupt controller 19 remains in operation, and the components of processor 1 other than the interrupt controller 19 are stopped.

[0042] Next, when the interrupt signal S50 is input to the interrupt controller 19, the interrupt controller 19 outputs the wake-up signal S54 to the clock signal supply circuit 40. When the wake-up signal S54 is input to the clock signal supply circuit 40, the clock signal supply circuit 40 starts supplying the clock signal S52 to processor 1. All components in processor 1 then return to the operating state based on the clock signal S52.
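The sleep and wake-up sequence can be summarized as a small behavioral model. This is only an illustrative sketch: the class names, method names, and the instruction strings `"wait_for_interrupt"`, `"sleep"`, and `"add"` are hypothetical, and the model tracks only whether each clock signal is being supplied, not real hardware timing.

```python
class ClockSupply:
    """Toy model of the clock signal supply circuit 40."""

    def __init__(self):
        self.s52_enabled = True  # clock S52 to the processor core
        self.s53_enabled = True  # clock S53 to interrupt controller 19, never gated

    def on_sleep_signal(self):
        """Sleep signal S51 received from the instruction decoder 17."""
        self.s52_enabled = False

    def on_wakeup_signal(self):
        """Wake-up signal S54 received from the interrupt controller 19."""
        self.s52_enabled = True


class InterruptController:
    """Toy model of the interrupt controller 19, clocked by S53."""

    def __init__(self, clock):
        self.clock = clock

    def on_interrupt(self):
        """Interrupt signal S50 arrives; request that S52 be restarted."""
        if self.clock.s53_enabled:
            self.clock.on_wakeup_signal()


class Processor:
    """Toy model of processor 1; only decoding is modeled."""

    def __init__(self, clock):
        self.clock = clock
        self.interrupt_controller = InterruptController(clock)
        self.executed = []

    def core_running(self):
        return self.clock.s52_enabled

    def decode(self, instr):
        """Decode one instruction; returns False if the core clock is stopped."""
        if not self.core_running():
            return False
        self.executed.append(instr)
        if instr == "sleep":
            self.clock.on_sleep_signal()
        return True


clock = ClockSupply()
cpu = Processor(clock)
cpu.decode("wait_for_interrupt")
cpu.decode("sleep")
asleep = not cpu.core_running()      # S52 stopped, S53 still supplied
stalled = cpu.decode("add")          # ignored while the core clock is stopped
cpu.interrupt_controller.on_interrupt()
awake = cpu.core_running()           # S52 supplied again
```

The point the model makes is that S53 is never gated, so the interrupt controller can always respond to S50 and restart S52.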

[0043] As described above, by using the sleep instruction, processor 1 can stop the supply of the clock signal to the components other than the interrupt controller 19 when it enters the standby state of waiting for input of the interrupt signal S50, and can thereby stop the operation of those components. As a result, power consumption is reduced while processor 1 is waiting for input of the interrupt signal S50. Although the amount of the reduction depends on how long the program being executed spends in the standby state, processor 1 can reduce power consumption to about 1/4 of that in the conventional case where no sleep instruction is used. In addition, because the clock signal S53 is always supplied to the interrupt controller 19, processing in response to input of the interrupt signal S50 can be carried out properly.

[0044] The present invention is not limited to the embodiment described above. For example, the embodiment illustrates the case where "n" in the present invention is "5", but "n" may be any value of 2 or more other than "5". That is, the number of stages of the instruction pipeline processing is not limited to five.

[0045]

[Effects of the Invention] As described above, according to the processor of the present invention, the time required for the operation stage of the instruction pipeline processing can be shortened, so that the duration of one clock cycle is shortened and the operating frequency can be raised. Further, according to the arithmetic processing system of the present invention, power consumption can be reduced.

[Brief description of the drawings]

[FIG. 1] FIG. 1 is a diagram for explaining the connection between the processor of an embodiment of the present invention and an external memory.

[FIG. 2] FIG. 2 is a configuration diagram of the processor shown in FIG. 1.

[FIG. 3] FIG. 3 is a diagram for explaining the five-stage instruction pipeline processing and the bypass logic module of the processor shown in FIG. 2.

[FIG. 4] FIG. 4 is a diagram for explaining the execution state of the five-stage instruction pipeline processing shown in FIG. 2.

[FIG. 5] FIG. 5 is a diagram for explaining general five-stage instruction pipeline processing.

[Explanation of symbols]

1: processor, 2: external memory, 10: general-purpose register group, 11: bypass logic module, 12: ALU module, 12a: arithmetic operation module, 12b: logical operation module, 12c: shift operation module, 13: multiplication module, 14: division module, 15: program counter, 16: address calculation module, 17: instruction decoder, 18: control register group, 19: interrupt controller, 40: clock signal supply circuit

Continued from the front page: (72) Inventor: Yukihiro Sakamoto, c/o Sony Corporation, 6-7-35 Kita-Shinagawa, Shinagawa-ku, Tokyo. F-term (reference): 5B013 AA11

Claims


1. A processor that divides an instruction into n (n ≥ 2) stages and executes the stages sequentially, and that performs pipeline processing by executing different stages of a plurality of consecutive instructions in parallel based on a clock signal, wherein an arithmetic operation module for performing an arithmetic operation and a logical operation module for performing a logical operation are arranged in different areas on a substrate, and the arithmetic operation and the logical operation are each executed within one clock cycle of the clock signal.

2. The processor according to claim 1, wherein the arithmetic operation module and the logical operation module are arranged independently of each other through a design performed by compiling, with a compiler, a circuit design program described in a hardware description language.

3. The processor according to claim 1, further comprising a multiplication module for performing multiplication, a division module for performing division, and a shift operation module for performing a shift operation, wherein the arithmetic operation module performs arithmetic operations other than multiplication, division, and shift operations; the arithmetic operation module, the logical operation module, the multiplication module, the division module, and the shift operation module are arranged in different areas on the substrate; and the arithmetic operation, the logical operation, the multiplication, the division, and the shift operation are each executed within one clock cycle of the clock signal.

4. The processor according to claim 3, wherein the arithmetic operation module, the logical operation module, the multiplication module, the division module, and the shift operation module are arranged independently of one another through a design performed by compiling, with a compiler, a circuit design program described in a hardware description language.

5. A processor comprising: a program counter that indicates the address in an external memory at which an instruction to be read from the external memory is stored; a decoding unit that decodes the read instruction; an arithmetic unit that performs an operation by selecting and using one of the arithmetic operation module and the logical operation module; and a plurality of data registers that store data, wherein the n stages include at least: an instruction fetch stage in which an instruction is read from an address in the external memory; a decode stage in which the read instruction is decoded by the decoding unit; an operation stage in which, as necessary, an operation is performed by the arithmetic unit based on the decoding result of the decoding unit; a memory access stage in which, as necessary, the external memory is accessed; and a write-back stage in which, as necessary, the result of the operation is written to the data registers.

6. An arithmetic processing system comprising: a clock signal supply device that supplies a clock signal; and a processor that divides an instruction into n (n ≥ 2) stages and executes the stages sequentially, and that performs pipeline processing by executing different stages of a plurality of consecutive instructions in parallel based on the clock signal from the clock signal supply device, wherein the processor comprises: a program counter that indicates the address in an external memory at which an instruction to be read from the external memory is stored; decoding means for decoding the read instruction and, upon decoding a low power consumption mode instruction, outputting a low power consumption mode instruction signal to the clock signal supply device; arithmetic means in which an arithmetic operation module for performing an arithmetic operation and a logical operation module for performing a logical operation are arranged; and interrupt control means for outputting a low power consumption mode release signal to the clock signal supply device when an interrupt signal is input, and wherein the clock signal supply device constantly supplies a clock signal to the interrupt control means of the processor, stops supplying the clock signal to the program counter, the decoding means, and the arithmetic means when the low power consumption mode instruction signal is input, and starts supplying the clock signal to the program counter, the decoding means, and the arithmetic means when the low power consumption mode release signal is input.