JP2000357089A - Data processor

Info

Publication number: JP2000357089A
Application number: JP11167812A
Authority: JP
Inventors: Naomiki Mitsuishi; 直幹三ッ石
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1999-06-15
Filing date: 1999-06-15
Publication date: 2000-12-26
Anticipated expiration: 2019-06-15
Also published as: JP3740321B2

Abstract

PROBLEM TO BE SOLVED: To improve CPU throughput by widening the bus used for instruction reads while maintaining compatibility with an existing CPU. SOLUTION: Relative to the existing CPU, the internal data bus is made wider than the basic unit of an instruction, an instruction register 200 is provided to hold multiple units of read instructions, and a means is provided for monitoring the amount of instructions held in the instruction register. Following the basic unit time of execution (called a state), each existing instruction is divided into states that perform only instruction read control (first operation) and states that include effective address calculation or control of data operations (second operation). Depending on the status of the instructions already read, a state that controls only instruction read can be omitted (skipped). The wider bus thus allows more instruction code to be read at once than in the existing CPU, and the skipping further improves throughput.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to data processing devices such as microcomputers, microcontrollers, and central processing units (CPUs), and to a technique that is effective when applied, for example, to a device-embedded microcomputer implemented as a semiconductor integrated circuit.

[0002]

[Description of the Related Art] Microcomputers implemented as semiconductor integrated circuits have been given larger address spaces, larger instruction sets, and higher speeds. Because the function of a microcomputer's CPU is defined by software, it is desirable that the software assets of existing microcomputers remain usable even on microcomputers with an expanded address space, an enlarged instruction set, or higher speed.

[0003] Examples that expand the address space and the instruction set while maintaining compatibility at the object level are described in, for example, Japanese Patent Application Laid-Open No. 6-51981 and the "H8/300H Series Programming Manual" published by Hitachi, Ltd. in June 1993. These show that adopting a so-called load-store architecture is effective for extending an instruction set. Load-store architectures have been adopted increasingly often as so-called RISC (Reduced Instruction Set Computer) microcomputers have become widespread.

[0004] A further example is described in Japanese Patent Application Laid-Open No. 8-263290 and the "H8S/2600 Series, H8S/2000 Series Programming Manual" published by Hitachi, Ltd. in March 1995. While remaining compatible with a CPU such as the above, which executed basic instructions in two states, this CPU was sped up to execute basic instructions in one state and was sped up further by incorporating a multiplier independent of the CPU. In this CPU the basic unit of an instruction is 16 bits (a word), and the data bus is also 16 bits wide. That is, one word of instruction can be read per bus access. In particular, pp. 268 to 277 of that programming manual describe the bus states during instruction execution.

[0005] In the above CPU, a register-to-register operation instruction with an instruction code length of one word performs a one-word instruction read (because of prefetching, this is the read of the next instruction) and an increment of the PC (+2). At the same time, inside the CPU, the contents of the general-purpose registers are read, the operation is performed, and the result is written to a general-purpose register.

[0006] By contrast, an operation instruction between 16-bit immediate data and a general-purpose register contains the 16-bit immediate data in the instruction, so its instruction code is two words long. It performs two instruction reads (the second word of the instruction itself and the next instruction) and two PC increments, which requires two states. Meanwhile, inside the CPU, the 16-bit immediate data and the contents of the general-purpose register are read, the operation is performed, and the result is written to the general-purpose register. As with the register-to-register instruction above, this internal operation can be executed in one state. That is, one state performs only the instruction read and PC increment, and the other state performs the instruction read, the PC increment, and the internal operation.

[0007] Execution of other instructions with multi-word instruction codes likewise consists of states that perform only an instruction read and PC increment, and states that perform an instruction read, a PC increment, and an internal operation or data access.

[0008] Techniques such as the above increase the processing capability of the CPU or microcomputer per unit time. This speeds up control processing in systems that use the microcomputer, so that equipment controlled by the microcomputer can be made faster, more capable, and more precise, or made smaller by integrating what was previously built from several semiconductor integrated circuits (microcomputers). In particular, shortening the response time to interrupts improves timing accuracy when controlling various equipment, that is, real-time performance.

[0009]

[Problems to Be Solved by the Invention] The present inventor studied how to make data processing devices such as device-embedded microcomputers process even faster while maintaining compatibility so that software assets can be used effectively, and while keeping increases in logical and physical scale and in power consumption to a minimum.

[0010] Ways to speed up CPU operation fall broadly into two groups: reducing the number of states needed to execute one instruction, and raising the operating speed or operating frequency of a single-chip microcomputer implemented as a semiconductor integrated circuit. The latter can be achieved by miniaturizing the semiconductor integrated circuit, using faster transistors, and so on. However, raising the operating speed or frequency inevitably increases current consumption. To minimize the increase in current consumption, it is therefore better to reduce the number of states needed to execute each instruction.

[0011] Higher speed, higher functionality, and smaller size are also demanded of CPUs and microcomputers with a relatively small address space and a relatively small instruction set. Where both a CPU with a large address space and a CPU with a small address space exist, it is therefore desirable to speed up both.

[0012] Maintaining compatibility also lets cross-software tools such as assemblers, C compilers, and simulators/debuggers be shared. Because existing tools can be used, a development environment can be made available quickly.

[0013] An ordinary microcomputer operates by reading instructions, so instruction reads are inherently required. As noted above it also performs data accesses, but to access data it must first read the instruction that directs the access. Since some instructions perform no data access, instruction reads account for more bus cycles than data accesses do. In some cases, for example, 80% of bus cycles are instruction reads and 20% are data accesses.

[0014] To speed up instruction reads, the data bus can be made wider than the unit word length of an instruction. For example, if the instruction unit word is 16 bits, the bus can be widened to 32 bits, so that two words of instructions can be read in one bus access.
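As a rough illustration of the arithmetic above, the following Python sketch counts bus accesses for a run of consecutive instruction words. The function name `bus_reads` and the assumption that the first word is aligned to the bus width are illustrative and do not come from the specification.

```python
def bus_reads(words: int, bus_bits: int, word_bits: int = 16) -> int:
    """Bus accesses needed to read `words` consecutive instruction words.

    Assumes (for illustration only) that the first word is aligned to the
    bus width and that every access returns a full bus width of code.
    """
    words_per_access = bus_bits // word_bits
    # Ceiling division without floating point.
    return -(-words // words_per_access)


# Four instruction words over a 16-bit bus and over a 32-bit bus.
print(bus_reads(4, 16), bus_reads(4, 32))
```

Under these assumptions the 32-bit bus halves the number of accesses for an even number of words, which is the best case; the paragraphs that follow discuss why this gain is not obtained automatically.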

[0015] Simply reading instructions faster is pointless, however, unless the instructions that have been read are also executed (consumed).

[0016] In a microcomputer that adopts a 16-bit fixed-length instruction set and a 32-bit data bus and is built with a five-stage pipeline, two instructions can be fetched in one bus cycle, so every other instruction-fetch stage can be left free. Memory accesses can be performed in the freed stages.

[0017] The present inventor found, however, that speeding up by increasing the number of bits per instruction read has the following problems.

[0018] With a deep pipeline, whenever the program flow changes, as with a branch instruction or interrupt exception handling, the instructions already in the pipeline must be discarded and the pipeline refilled. Microcomputers for equipment control execute relatively many branch instructions and take many interrupts, so a larger number of pipeline stages that fails to improve the execution time of branch instructions and interrupt exception handling is undesirable.

[0019] A 32-bit data bus can read or write, in one access, the four bytes starting at an address that is a multiple of 4. To stay compatible with an existing CPU that reads instructions in 16-bit units, however, a 16-bit fixed-length instruction format cannot be adopted, and instructions one word long, three words long, and so on are mixed. In other words, instructions cannot be aligned to the 32-bit data bus unit. For a two-word instruction, for example, it would seem necessary to decide whether to perform one instruction read or two depending on whether the instruction itself starts at a multiple of 4. If two instruction reads are performed, the execution time may end up the same as on the existing CPU. Moreover, checking the instruction's own address and controlling both the one-read and two-read cases tends to increase the logic scale.
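The alignment issue described above can be illustrated with a small Python sketch that counts how many aligned 32-bit reads are needed to cover one instruction. The helper `aligned_reads` and the example addresses are hypothetical; the sketch only restates the address arithmetic and says nothing about the actual control logic.

```python
def aligned_reads(addr: int, length_words: int, bus_bytes: int = 4) -> int:
    """Aligned bus reads needed to cover one instruction.

    addr         byte address of the instruction's first word (must be even)
    length_words instruction length in 16-bit words
    bus_bytes    bus width in bytes (4 for a 32-bit bus)
    """
    if addr % 2 != 0:
        raise ValueError("instruction words start at even addresses")
    first_unit = addr // bus_bytes
    last_unit = (addr + 2 * length_words - 1) // bus_bytes
    return last_unit - first_unit + 1


# A two-word instruction at a multiple of 4 versus at a non-multiple of 4.
print(aligned_reads(0x100, 2), aligned_reads(0x102, 2))
```

In this sketch the same two-word instruction needs one read when it starts at a multiple of 4 and two reads otherwise, which is the case distinction the paragraph identifies as a source of extra control logic.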

[0020] So-called microprocessors, meanwhile, have been sped up by techniques such as superscalar and VLIW, as described in, for example, "Turning Point in 1998: Simplifying Hardware and Moving to VLIW," Nikkei Electronics, pp. 68 to 80, published by Nikkei BP in January 1995. Both improve overall processing performance by increasing the number of operations that can be executed simultaneously (for example, four in parallel). Because these techniques execute multiple instructions in parallel, however, they need multiple sets of CPU resources such as control circuits and arithmetic units, which increases the physical and logical scale.

[0021] A device-embedded microcomputer (or microcontroller) changes what it does while referring to the states of various equipment. It performs data accesses to refer to equipment state and executes branch instructions to change its processing. It therefore runs a local piece of program repeatedly less often than a microprocessor does. Depending on what is being controlled, there may also be constraints on the order in which instructions execute or on local instruction execution time. The absence of any software-level inconsistency does not necessarily mean that the order of instructions can be changed. Some controlled systems also require that power consumption be kept low.

[0022] Because branches are frequent, as noted above, even if a technique such as superscalar makes it possible to process multiple instructions in parallel, a branch instruction may force results that have already been computed to be discarded. In addition, a conditional branch cannot be decided until the branch condition it refers to has been determined, so the instruction that generates that condition and the conditional branch instruction cannot be executed at the same time, nor can the processing that follows. Every conditional branch instruction thus prevents instructions from being processed in parallel. If branch prediction or speculative execution is used, the possibility of performing operations other than those that should actually be performed cannot be ignored. In equipment control, conditional branch instructions are often combined (arranged as a tree) to select one branch target from many, which does not suit branch prediction, and the predicted target is itself likely to contain further conditional branches. As a result, the hit rate of branch prediction tends to be very low. Furthermore, although branch prediction and speculative execution may improve average processing time, they cannot be expected to speed up individual local pieces of processing.

[0023] That is, even if multiple instructions can be executed in parallel, that capability tends to be wasted. The logic for parallel execution goes unused, and logical and physical resources are not used effectively. Wasted logic and wasted operations also increase power consumption undesirably.

[0024] In short, for microcomputers used for embedded equipment control, parallel processing such as superscalar or VLIW makes it hard to reuse software assets effectively, and in practice it is also hard to achieve speedups comparable to those of microprocessors. At the least, the speedup is small relative to the logical and physical scale, and power consumption increases.

[0025] The trend in processing performance differs from one CPU or microcomputer to another depending on whether the workload is repeated execution of the same operation or equipment control such as motor control. This is described in, for example, "A Thorough Study of Embedded CPU Performance," Interface, pp. 134 to 145, published by CQ Publishing in February 1998. From this viewpoint as well, the present inventor found it necessary to maintain compatibility with an existing, reasonably priced CPU or microcomputer suited to embedded equipment control, inherit the existing architecture, keep the cost increase reasonable, and pursue higher speed.

[0026] An object of the present invention is to speed up data processing devices, such as microcomputers, that are suited to equipment control.

[0027] Another object of the present invention is to improve CPU processing performance while maintaining compatibility with an existing CPU so that existing software can be used, and while minimizing increases in logic scale and manufacturing cost.

[0028] The above and other objects and novel features of the present invention will become apparent from the description in this specification and the accompanying drawings.

[0029]

[Means for Solving the Problems] A brief outline of representative aspects of the invention disclosed in the present application is as follows.

[0030] Relative to an existing data processing device such as a CPU, the internal data bus is made wider than at least the basic unit of an instruction (for example, a word), an instruction register capable of holding multiple units of read instructions is provided, and a means is provided for monitoring the amount of instructions present in this instruction register. Following the basic unit time of execution (called a state), each (existing) instruction is divided into states that perform only instruction read control, including control of the program counter increment (first operation), and states that include effective address calculation or control of data operations (second operation). Depending on the status of the instructions already read, a state that controls only instruction read can be omitted. In other words, a state that controls only instruction read is omitted (skipped) as directed by the monitoring means according to the amount of already-read instructions present in the instruction register.

[0031] The amount of instruction read performed while each instruction executes is made larger or smaller than that instruction's own length. This is controlled according to the amount of instructions that have already been read or are currently being read.

[0032] With the above, making the internal data bus wider than the basic unit of an instruction (a word) allows more instruction code to be read at one time than in the existing CPU.

[0033] In the basic execution case, the instruction performs as many instruction reads as its own instruction code length requires, just as on the existing CPU, and no state is omitted (skipped). In that case more instruction code is read than the executed instruction itself contains, so already-read instruction code accumulates.

[0034] Conversely, by omitting (skipping) states so that no instruction read is performed, the amount of instruction code read can be made equal to the amount of instruction code of the executed instruction, which maintains the amount of already-read instruction code, or made smaller than it, which reduces the amount of already-read instruction code.

[0035] In this way, instruction reads are sped up and overall instruction execution time is shortened while the amount of already-read instructions is kept within a predetermined range (that is, while the amount of instructions read is balanced against the amount executed).
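The balance between instructions read and instructions executed can be made concrete with a deliberately simplified Python model. It is only a sketch under stated assumptions, not the control logic of the invention: every bus access is assumed to deliver two words, alignment, branches, and data accesses are ignored, and the names `states_existing`, `states_with_skip`, and the threshold `low_water` are hypothetical.

```python
def states_existing(lengths):
    """Existing CPU with a 16-bit bus: one state per instruction word."""
    return sum(lengths)


def states_with_skip(lengths, words_per_read=2, low_water=1):
    """Toy model of a 32-bit bus with skippable read-only states.

    lengths        instruction lengths in 16-bit words, in execution order
    words_per_read words delivered by one bus access (assumed always 2)
    low_water      hypothetical threshold used by the monitoring means
    """
    buffered = words_per_read  # assume one read finished before execution starts
    states = 0
    for n in lengths:
        # Read-only states are spent only while the instruction register
        # does not yet hold every word of the current instruction.
        while buffered < n:
            buffered += words_per_read
            states += 1
        # The state that performs the operation consumes the instruction.
        states += 1
        buffered -= n
        # The monitor requests a read during this state if few words remain.
        if buffered <= low_water:
            buffered += words_per_read
    return states


program = [1, 2, 2, 1, 3]  # hypothetical instruction lengths in words
print(states_existing(program), states_with_skip(program))
```

For this hypothetical five-instruction sequence the model gives 9 states for the existing CPU and 6 states with skipping. A sequence of one-word instructions gains nothing in this model, since each instruction still needs its one operation state.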

[0036] Because the states to be omitted (skipped) are changed automatically, changes in the placement of instructions can be accommodated.

[0037] The instruction register preferably stores, together with each instruction code, information on bit 1 of that instruction code's address (IAB), so that the instruction decoder can evaluate both at the same time. The value of bit 1 indicates whether the address of a basic unit such as a word is a multiple of 4 or an even number that is not a multiple of 4. This simplifies control and minimizes the increase in the logic scale of the instruction decoder.

[0038] Except for the read at the start address for a branch instruction or interrupt exception handling, instructions are read in 32-bit units and the program counter is incremented by +4. When the start address for a branch instruction or interrupt exception handling is a multiple of 4, the instruction is likewise read in a 32-bit unit and the program counter is incremented by +4. When that start address is not a multiple of 4, the first instruction is read in a 16-bit unit and the program counter is incremented by +2. To build an incrementer that selects the +2 or +4 increment (+2/+4) of the program counter automatically, it suffices, for example, taking bit 0 as the byte address bit, to make bit 1 of the incrementer the logical OR of its input and 1 and to supply a carry into bit 1.
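One way to read the incrementer described above is sketched below in Python: bit 1 of the input is forced to 1 and a carry is then added at bit 1, so the result is always the next multiple of 4. This is an interpretation offered for illustration; the function names `next_fetch_address` and `fetch_width_bits` are hypothetical and the actual circuit may differ.

```python
def next_fetch_address(pc: int) -> int:
    """Advance an even byte address to the next multiple of 4.

    Forcing bit 1 to 1 and then adding a carry at bit 1 yields +4 when
    pc is a multiple of 4 and +2 when it is not.
    """
    return (pc | 0b10) + 0b10


def fetch_width_bits(pc: int) -> int:
    """Read width implied by the text: 32 bits at a multiple of 4, else 16."""
    return 32 if pc % 4 == 0 else 16


# A branch target at 0x102 is read as 16 bits, then reads continue in
# 32-bit units from 0x104.
pc = 0x102
for _ in range(3):
    print(hex(pc), fetch_width_bits(pc))
    pc = next_fetch_address(pc)
```

The point of the construction is that no separate comparison of the address is needed to choose between +2 and +4, which fits the stated aim of keeping the added logic small.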

[0039] Reading instructions in 32-bit units except at the start address for a branch instruction or interrupt exception handling, and having the program counter increment by +2/+4 automatically, simplifies control and suppresses the increase in logic scale.

[0040] The contents of the program counter are preferably stored in the write data buffer, and the write data buffer is preferably given a FIFO (first-in, first-out) structure and a circuit that sets bit 1 in the same way as the program counter incrementer. Then, even if the difference between the address of the instruction actually read and the contents held in the program counter becomes large and is not uniquely determined, the program counter value to be saved on a subroutine branch instruction can easily be obtained from the write data buffer. This also simplifies control and suppresses the increase in logic scale.

[0041] Even when effective address calculation or data transfer spans multiple states, the control itself is performed once, and the actual operations are carried out over multiple states (for example, address calculation in the first state and storing of the read data in the next state) by, for example, delaying control signals.

[0042] For branch instructions, interrupt exception handling, and the like, decoding and execution of the first instruction at the branch target start as soon as at least one word has been prefetched. This minimizes the processing time lost to branch instructions and interrupt exception handling and improves responsiveness and hence real-time performance.

[0043] The execution means is configured, for example with independent internal buses, so that an operation whose control signal must be delayed (for example, storing read data) can proceed at the same time as the next control operation (the program counter increment) even when the two overlap.

[0044] For instructions that can be processed in the basic unit time (instructions with no omittable state), multiple operation means such as arithmetic logic units (ALUs) are provided so that they can operate with overlapping timing. The execution means is configured, for example with independent internal buses, so that the operation of one operation means and the operation for an instruction read can proceed at the same time even when they overlap. A control circuit is provided for each operation means. One control circuit controls one operation means and handles control of all instructions, while the other control circuit exclusively controls the other operation means.

[0045] Duplicating only the operation means and the control circuits, with one control circuit handling control of all instructions and the other exclusively controlling an operation means, suppresses the increase in logic scale.

[0046] For an instruction code that only generates control signals, such as a prefix code, a detection circuit and a control signal generation circuit are provided that, once the instruction has been read and is held in the instruction register, detect that it is a prefix code and generate the required control signals. This makes the code skippable, with only the control signals generated when it is skipped, and shortens instruction execution time. Because the circuits need only detect that a code is a prefix code and generate control signals from that result, the logic scale of the detection circuit and the control signal generation circuit can be kept to a minimum.

[0047] Where a CPU with a large address space (large instruction set) and a CPU with a small address space (small instruction set) exist and are compatible at the object level, implementing the above speedup on the CPU with the large address space makes the same speedup possible for the instructions that also exist on the backward-compatible CPU with the small address space. In other words, the same method can speed up both the CPU with a large address space and the CPU with a small address space while keeping them compatible at the object level, so the benefits of object-level compatibility and of higher speed can both be enjoyed.

【００４８】[0048] Because the existing instructions remain executable and the order of internal operations is kept equivalent, the margin for future expansion is not significantly reduced compared with the existing CPU. For example, if it becomes possible to add new instructions to the existing CPU, the same technique can be expected to apply to a CPU to which the present invention is applied. As long as instruction set compatibility is maintained, the same instructions as those of the existing CPU can be added at the machine language level. If an added instruction has a plurality of execution states, it can be divided into a part that performs its specific operation and states that can be omitted, and the latter can be omitted as necessary. At the least, instruction reads and program counter increments can be inhibited as necessary, so the instruction can be executed in a processing time equivalent to that of the existing CPU. If the added instruction can be executed in one state, a speed-up can be obtained by, for example, alternating operation of the plural operation means (e.g. ALU and ALUS).

【００４９】[0049] Adopting the same instruction set as the existing CPU allows development tools such as assemblers, C compilers and simulators/debuggers (so-called cross software) to be shared. Sharing the cross software makes it possible to set up a development environment quickly.

【００５０】[0050]

【発明の実施の形態】[Embodiments of the Invention] FIG. 2 shows an example of a single-chip microcomputer to which the present invention is applied.

【００５１】[0051] The single-chip microcomputer 1 consists of the following functional blocks or modules: a CPU 2 that governs overall control; an interrupt controller (INT) 3; a ROM 4, which is a memory storing the processing programs of the CPU 2 and the like; a RAM 5, which serves as the work area of the CPU 2 and as memory for temporary data storage; a timer 6; a timer 7; a serial communication interface (SCI) 8; an A/D converter 9; a system controller (SYSC) 10; a first input/output port (IOP[1]) 11 through a ninth input/output port (IOP[9]) 19; and a clock oscillator (CPG) 20. It is formed on a single semiconductor substrate (semiconductor chip) by known semiconductor manufacturing technology.

【００５２】[0052] The single-chip microcomputer 1 has, as power supply terminals, ground level (Vss), power supply voltage level (Vcc), analog ground level (AVss), analog power supply voltage level (AVcc) and analog reference voltage (Vref) terminals. It further has, as dedicated control terminals, reset (RES), standby (STBY), mode control (MD0, MD1) and clock input (EXTAL, XTAL) terminals.

【００５３】[0053] The single-chip microcomputer 1 operates in synchronization with a reference clock (system clock) generated from a crystal oscillator connected to the terminals EXTAL and XTAL of the CPG 20, or from an external clock input to the EXTAL terminal. One cycle of this reference clock is called a state.

【００５４】[0054] The functional blocks of the single-chip microcomputer 1 are interconnected by an internal bus 21. The internal bus 21 includes an internal address bus, an internal data bus and an internal control bus. The internal control bus carries a read signal, a write signal, a bus size signal, the system clock and so on. Of the internal data bus, the portion between the CPU 2 and the ROM 4 that stores the program of the CPU 2 is 32 bits wide. Although not particularly limited, in the example of FIG. 2 the RAM 5 is likewise interfaced through a 32-bit bus. The external bus may also be a 32-bit bus.

【００５５】[0055] The internal address bus comprises two buses distinguished by phase, IAB and PAB, and the internal data bus likewise comprises IDB and PDB. In a read, for example, PAB lags IAB by 0.5 state, PAB and PDB are synchronous, and IDB lags PDB by 0.5 state. IAB and PAB, and IDB and PDB, are buffered by a bus controller (not shown). The functional blocks and modules are read and written by the CPU 2 via the internal bus. The on-chip ROM 4 and RAM 5 are interfaced with the CPU 2 via IAB and IDB and can be read or written in one state. The control registers of the timer 6, the timer 7, the SCI 8, the A/D converter 9, IOP[1] 11 to IOP[9] 19 and the CPG 20 are collectively called internal I/O registers. These are connected to PAB and PDB. Although not particularly limited, the PDB is 16 bits wide. There are two reasons. First, the internal I/O registers are distributed among the functional blocks, so connecting them with a 32-bit bus would increase the total bus wiring length and tend to increase the physical scale. Second, the meaningful data in the internal I/O registers (in each functional block) is 8 to 16 bits wide, so there is little need for 32-bit access.

【００５６】[0056] The input/output ports 11 to 19 double as terminals for the address bus, the data bus and the bus control signals, or as input terminals and input/output terminals of the timers 6 and 7, the SCI 8 and the A/D converter 9. That is, the timers 6 and 7, the SCI 8 and the A/D converter 9 each have input/output signals, which are exchanged with the outside through terminals shared with the input/output ports. For example, IOP[5] 15, IOP[6] 16 and IOP[7] 17 are shared with the input/output terminals of the timers 6 and 7, and IOP[8] 18 is shared with the input/output terminals of the SCI 8. The analog data input terminals are shared with IOP[9] 19.

【００５７】[0057] When the reset signal RES is applied to the single-chip microcomputer 1, the microcomputer, including the CPU 2, enters the reset state. When the reset is released, the CPU 2 performs reset exception processing: it reads a start address from a predetermined address and begins reading instructions from that start address. Thereafter the CPU 2 successively reads instructions from the ROM 4 or the like, decodes them and, on the basis of the decoded content, processes data or transfers data to and from the RAM 5, the timers 6 and 7, the SCI 8 and so on. That is, the CPU 2 performs processing according to the instructions stored in the ROM 4 or the like while referring to data input from the input/output ports IOP[1] to IOP[9], the A/D converter 9 and so on, or to commands input from the SCI 8 or the like. On the basis of the results, it outputs signals to the outside using the input/output ports IOP[1] to IOP[9], the timers 6 and 7 and so on, thereby controlling various devices.

【００５８】[0058] The states of the timers 6 and 7, the SCI 8, external signals and so on can be conveyed to the CPU 2 as interrupt signals. Interrupt signals are output by predetermined ones of the A/D converter 9, the timer 6, the timer 7, the SCI 8 and IOP[1] 11 to IOP[9] 19. The interrupt controller 3 receives them and, according to the settings of predetermined registers and the like, supplies an interrupt request signal 22 to the CPU 2. When an interrupt source occurs, a CPU interrupt request is generated. The CPU 2 then suspends the processing in progress, passes through the exception processing state, branches to a predetermined processing routine, performs the desired processing and, for example, clears the interrupt source. A return instruction is normally placed at the end of the processing routine, and executing it resumes the suspended processing.

【００５９】[0059] FIG. 3 shows a configuration example (programming model) of the general-purpose registers and control registers built into the CPU 2.

【００６０】[0060] The CPU 2 has 32-bit general-purpose registers ER0 to ER7. The general-purpose registers ER0 to ER7 all have the same function and can be used either as address registers or as data registers.

【００６１】[0061] As data registers they can be used as 32-bit, 16-bit and 8-bit registers. As address registers and 32-bit registers, they are used whole as the general-purpose registers ER (ER0 to ER7). As 16-bit registers, the general-purpose registers ER are split into the general-purpose registers E (E0 to E7) and the general-purpose registers R (R0 to R7). These have equivalent functions, and up to sixteen 16-bit registers can be used. The general-purpose registers E (E0 to E7) are sometimes specifically called extended registers. As 8-bit registers, the general-purpose registers R are split into the general-purpose registers RH (R0H to R7H) and the general-purpose registers RL (R0L to R7L). These have equivalent functions, and up to sixteen 8-bit registers can be used. The usage can be selected independently for each register.
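The overlapping register views described in paragraph [0061] can be sketched in Python as follows. This is a minimal illustration, not part of the disclosure. It assumes, following the H8S programming model rather than anything stated in this passage, that En is the upper 16 bits of ERn, Rn the lower 16 bits, RnH the upper byte of Rn and RnL the lower byte. The class and method names are hypothetical.

```python
class RegisterFile:
    """Eight 32-bit registers ER0-ER7 with 16-bit and 8-bit views."""

    def __init__(self):
        self.er = [0] * 8  # ER0..ER7, each held as a 32-bit integer

    # 16-bit views (assumed layout: E = upper half, R = lower half)
    def read_e(self, n):
        return (self.er[n] >> 16) & 0xFFFF

    def read_r(self, n):
        return self.er[n] & 0xFFFF

    def write_e(self, n, value):
        self.er[n] = (self.er[n] & 0x0000FFFF) | ((value & 0xFFFF) << 16)

    def write_r(self, n, value):
        self.er[n] = (self.er[n] & 0xFFFF0000) | (value & 0xFFFF)

    # 8-bit views (assumed layout: RH = upper byte of R, RL = lower byte)
    def read_rh(self, n):
        return (self.er[n] >> 8) & 0xFF

    def read_rl(self, n):
        return self.er[n] & 0xFF

    def write_rh(self, n, value):
        self.er[n] = (self.er[n] & 0xFFFF00FF) | ((value & 0xFF) << 8)

    def write_rl(self, n, value):
        self.er[n] = (self.er[n] & 0xFFFFFF00) | (value & 0xFF)
```

Writing through one view leaves the other parts of the same ERn untouched, which is what allows each register to be used independently as one 32-bit register, two 16-bit registers, or a mix of 16-bit and 8-bit registers.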

【００６２】[0062] In addition to its function as a general-purpose register, the general-purpose register ER7 is assigned the function of stack pointer (SP) and is used implicitly in exception processing, subroutine branches and the like. Exception processing includes the interrupt processing described above.

【００６３】[0063] PC is a 24-bit counter that indicates the address of the instruction the CPU 2 will execute next. Although not particularly limited, all instructions of the CPU 2 are in units of 2 bytes (word: 16 bits), so bit 0 is invalid. Because instructions are read 4 bytes at a time (longword: 32 bits), bit 1 is not used either. When the PC is saved to the stack, for example, it is handled as a longword with the upper 8 bits set to 0.

【００６４】[0064] CCR is the 8-bit condition code register and indicates the internal state of the CPU 2. It consists of 8 bits including the interrupt mask bit (I) and the half-carry (H), negative (N), zero (Z), overflow (V) and carry (C) flags.
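As an illustration of how the CCR flags named in paragraph [0064] reflect an operation result, the following Python sketch computes them for an 8-bit addition. The passage does not give bit positions. The positions used here (I = bit 7, H = bit 5, N = bit 3, Z = bit 2, V = bit 1, C = bit 0) are an assumption taken from the H8S programming model, and the function name is hypothetical.

```python
# Assumed CCR bit positions (not stated in this passage).
CCR_BITS = {"I": 7, "H": 5, "N": 3, "Z": 2, "V": 1, "C": 0}


def add8_flags(a, b):
    """Return (result, ccr) for an 8-bit addition a + b."""
    result = (a + b) & 0xFF
    flags = {
        "H": ((a & 0x0F) + (b & 0x0F)) > 0x0F,            # carry out of bit 3
        "N": bool(result & 0x80),                          # sign bit of result
        "Z": result == 0,                                  # result is zero
        "V": bool(~(a ^ b) & (a ^ result) & 0x80),         # signed overflow
        "C": (a + b) > 0xFF,                               # carry out of bit 7
    }
    ccr = 0
    for name, is_set in flags.items():
        if is_set:
            ccr |= 1 << CCR_BITS[name]
    return result, ccr
```

For example, adding 0x01 to 0x7F sets H, N and V under these assumptions, because the sum crosses both the nibble boundary and the signed 8-bit range.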

【００６５】[0065] EXR is an 8-bit register in which control information for exception processing such as interrupts is set. It includes the interrupt mask bits (I2 to I0) and the trace bit (T).

【００６６】[0066] The data formats in the general-purpose registers, the data formats in the memory space, the addressing modes and the effective address calculation methods are the same as those of the CPU described in, for example, the "H8S/2600 Series, H8S/2000 Series Programming Manual" published by Hitachi, Ltd. in March 1995.

【００６７】[0067] FIG. 4 shows a configuration example of the general-purpose registers and control registers built into another CPU. This configuration is the same as that of the CPU described in the "H8/300 Series Programming Manual" published by Hitachi, Ltd. in July 1989, and has 16-bit general-purpose registers R0 to R7. The CPU 2 to which the present invention is applied, which has the programming model of FIG. 3, includes the general-purpose registers and the instruction set of the CPU of FIG. 4. In other words, the CPU 2 is upward compatible with the CPU having the registers and instruction set of FIG. 4.

【００６８】[0068] FIG. 5 shows an example of the machine language instruction format of the CPU 2. The instructions of the CPU 2 are in units of 2 bytes (words). Each instruction includes an operation field (op), a register field (r), an EA extension (EA) and a condition field (cc).

【００６９】[0069] Although not particularly limited, the CPU 2 uses the same instruction format as the CPU described in the above-mentioned "H8S/2600 Series, H8S/2000 Series Programming Manual" published by Hitachi, Ltd. in March 1995. In particular, the basic operation instructions and transfer instructions are 16 bits (one word) long.

【００７０】[0070] The operation field (op) represents the function of the instruction and specifies the addressing mode and the processing to be performed on the operand. It always includes the first 4 bits of the instruction. Some instructions have two operation fields.

【００７１】[0071] The register field (r) specifies a general-purpose register. It is 3 bits for an address register, and 3 bits (32-bit register) or 4 bits (8-bit or 16-bit register) for a data register. Some instructions have two register fields, and some have none.

【００７２】[0072] The EA extension (EA) specifies immediate data, an absolute address or a displacement. It is 8, 16 or 32 bits long. The condition field (cc) specifies the branch condition of a conditional branch instruction (Bcc instruction).

【００７３】[0073] FIG. 1 shows a block diagram of the CPU 2. The CPU 2 consists of a control unit CONT and an execution unit EXEC that includes the general-purpose registers ER0 to ER7, the program counter PC and the condition code register CCR.

【００７４】[0074] The control unit CONT includes an instruction register 200 consisting of, for example, a three-word FIFO, an instruction register detection circuit (MON) 201, an instruction register controller (FIFOCNT) 202, an instruction decoder (DEC) 203, a sub instruction decoder (DECS) 204 and a register selector (SEL) 205. The instruction decoder (DEC) 203 and the sub instruction decoder (DECS) 204 are implemented with, for example, a PLA (Programmable Logic Array) or hard-wired logic. The instruction decoder (DEC) 203 handles all instructions and controls the CPU 2 as a whole. The sub instruction decoder (DECS) 204 performs only the control needed to execute arithmetic operations, such as those of register-to-register operation instructions, in a period that overlaps the operation of the instruction decoder (DEC) 203. Part of the output of the instruction decoder 203 is fed back to the instruction decoder 203. This feedback includes a stage code (TMG) used for transitions within each instruction code and a control code (MOD) used between instruction codes. The control code is generated by the instruction decoder (DEC) 203 and the instruction register detection circuit (MON) 201 and is input to the instruction decoder 203 via a multiplexer (MPX) 206. The instruction register detection circuit (MON) 201 has detection circuits for register-to-register operation instructions and for prefix codes. It reports the detection result for a register-to-register operation instruction to the instruction decoder (DEC) 203 by the signal NXTMON1 and the detection result for a prefix code by the signal NXTMON2. The instruction register detection circuit 201 also indicates to the decoder 203, by the signal IFMON, that the instruction register 200 is being read. The instruction register controller (FIFOCNT) 202 detects the amount of valid instruction code held in the instruction register and reports the result to the instruction decoder (DEC) 203 by the detection signals FIFOCNT1 and FIFOCNT2. The control performed by the decoder 203 on the basis of the signals FIFOCNT1, FIFOCNT2, NXTMON1, NXTMON2 and IFMON is described in detail with reference to FIGS. 10 to 13, 22 and 23.
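The role of the three-word FIFO instruction register and of the monitoring of its valid contents can be sketched as follows. This is a simplified Python model under stated assumptions: a 32-bit instruction read delivers two 16-bit words, with the word from the lower address first, and a read is allowed only when two word slots are free. The actual encoding of FIFOCNT1 and FIFOCNT2 is not given in this passage, so the model exposes only a word count. All names are hypothetical.

```python
from collections import deque


class InstructionFifo:
    """Simplified model of a three-word instruction FIFO."""

    CAPACITY = 3  # words, as in the three-word FIFO described above

    def __init__(self):
        self.words = deque()

    def valid_words(self):
        # Amount of valid instruction code currently held (in 16-bit words).
        return len(self.words)

    def can_fetch_longword(self):
        # Assumed policy: a 32-bit read needs room for two more words.
        return self.CAPACITY - len(self.words) >= 2

    def fetch_longword(self, longword):
        if not self.can_fetch_longword():
            raise RuntimeError("not enough room for a 32-bit instruction read")
        self.words.append((longword >> 16) & 0xFFFF)  # word at the lower address
        self.words.append(longword & 0xFFFF)          # word at the higher address

    def pop_word(self):
        # The decoder consumes one 16-bit instruction word.
        return self.words.popleft()
```

In such a model, a state that would only read an instruction is unnecessary whenever `valid_words()` already covers what the decoder needs next, which is the situation in which the description allows the read-only state to be skipped.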

【００７５】[0075] The register selector (RSEL) 205 generates register selection signals for the general-purpose registers and the like on the basis of the outputs of the decoders 203 and 204. Although not shown, it also includes a circuit that detects a conflict between a read and a write of a general-purpose register.

【００７６】[0076] The execution unit EXEC is configured to transfer data in 32-bit units. It includes the general-purpose registers and control registers of FIG. 3 together with temporary registers TRA and TRD, an arithmetic logic unit ALU, a sub arithmetic logic unit ALUS, an arithmetic unit AU, an incrementer INC, a read data buffer RDB, a write data buffer WDB and an address buffer AB. These functional blocks are interconnected by the internal buses GB (first internal bus), PCGB (second internal bus), DB (fifth internal bus), WB (fourth internal bus) and PCWB (third internal bus).

【００７７】[0077] The internal bus GB is used, for example, to transfer data from predetermined registers among the general-purpose registers ER0 to ER7 to the arithmetic logic units ALU and ALUS, and to transfer addresses from a predetermined register among ER0 to ER7, from the read data buffer RDB or from the arithmetic logic unit ALU to the address buffer AB.

【００７８】[0078] The internal bus PCGB is used, for example, to transfer instruction addresses from the program counter PC to the address buffer AB, the incrementer INC and the write data buffer WDB.

【００７９】[0079] The internal bus DB is used, for example, to transfer data from a predetermined register among the general-purpose registers ER0 to ER7 to the arithmetic logic unit ALU or the write data buffer WDB.

【００８０】[0080] The internal bus WB is used, for example, to transfer data from the arithmetic logic units ALU and ALUS or from the read data buffer RDB to the general-purpose registers. The internal bus PCWB is used to transfer instruction addresses from the incrementer INC to the program counter PC.

【００８１】[0081] The read data buffer RDB temporarily stores instruction codes and data read from the ROM 4, the RAM 5, the internal I/O registers or an external memory (not shown). The write data buffer WDB temporarily stores write data destined for the ROM 4, the RAM 5, the internal I/O registers or the external memory, and also temporarily stores instruction read addresses. The read data buffer RDB and the write data buffer WDB reconcile the timing of the internal operations of the CPU 2 with the timing of read/write operations outside the CPU 2.

【００８２】[0082] The address buffer AB temporarily stores the address that the CPU 2 reads or writes and also has a function for incrementing its stored content. An address buffer with an increment function is described in, for example, Japanese Unexamined Patent Publication No. H4-333153.

【００８３】[0083] The incrementer INC is used mainly to increment the PC and performs +2/+4. The arithmetic unit AU is used to generate the branch addresses of PC-relative branch instructions and subroutine branch instructions. The arithmetic logic unit ALU is used for the various operations specified by instructions, for effective address calculation and so on. The sub arithmetic logic unit ALUS is used exclusively for the operations of register-to-register operation instructions. Whether the instruction to be executed is a register-to-register operation instruction is detected by the signal NXTMON1.

【００８４】[0084] The operations of the arithmetic logic unit ALU and the sub arithmetic logic unit ALUS are offset from each other by 0.5 state. The arithmetic logic unit ALU takes in data while the basic clock (φ) is high and outputs the result while φ is low. The sub arithmetic logic unit ALUS, by contrast, takes in data while φ is low and outputs the result while φ is high. The CPU 2 executes instructions in a three-stage pipeline of instruction fetch, decode and execute. Suppose, for example, that the add instruction "ADD.L ER0,ER1" is immediately followed by the shift instruction "SHLL ER1". In synchronization with the high level of φ, the contents of ER0 and ER1 are read onto the buses DB and GB and input to the arithmetic logic unit ALU. The ALU performs the addition, and the sum is output onto the bus WB in synchronization with the low level of φ. During this low level of φ, a read and a write of ER1 conflict. The contents of the bus WB are written to ER1. The contents of ER1 are not read; instead, the content of the arithmetic logic unit ALU is read onto the bus GB and input to the sub arithmetic logic unit ALUS. That is, the arithmetic logic unit ALU and the sub arithmetic logic unit ALUS operate in instruction order, and because the result of one (ALU) can be used as the input of the other (ALUS), register conflicts are essentially avoided.
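The bypass described in paragraph [0084] can be illustrated with a small Python model of the ADD-then-shift sequence. This is a behavioural sketch only: half-state timing is reduced to the order of statements, the shift is modelled as a one-bit logical left shift, and the function and variable names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_then_shift(regs, src, dst):
    """Model ADD.L ERsrc,ERdst immediately followed by a 1-bit left shift of ERdst."""
    trace = []

    # phi high: ALU takes both operands from the register file (buses DB and GB).
    alu_out = (regs[src] + regs[dst]) & MASK32
    trace.append(("ALU", alu_out))

    # phi low: the ALU result goes to ERdst over WB. The shift needs ERdst in
    # this same half state, so ALUS takes the ALU output instead of reading
    # the register that is being written.
    alus_in = alu_out
    regs[dst] = alu_out

    # next phi high: ALUS outputs the shifted value, which is written to ERdst.
    alus_out = (alus_in << 1) & MASK32
    regs[dst] = alus_out
    trace.append(("ALUS", alus_out))
    return trace
```

With ER0 = 3 and ER1 = 4, the ALU produces 7 and the ALUS produces 14. Had the ALUS read the stale ER1 instead of the forwarded sum, the shift would have operated on 4, which is the conflict the forwarding path avoids.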

【００８５】[0085] Instructions overlap only in the first or last state of each instruction (for an instruction executed in one state, its whole period), and only a specific kind of operation (an arithmetic operation) takes place in that period. It therefore suffices to perform part of the instruction decoding of the CPU 2 for both arithmetic logic units ALU and ALUS during the overlapping period. The relatively large instruction decoder (DEC) 203, which controls the remaining sequential operations, can stay equivalent to the conventional one, and by keeping the added sub instruction decoder (DECS) 204 relatively small, the increase in logic scale can be minimized.

【００８６】[0086] Although not particularly limited, the sub instruction decoder (DECS) 204 specifies the type of operation performed with the sub arithmetic logic unit ALUS (operation control), controls the input and output of the general-purpose registers used in the operation, and controls the setting of the condition code register CCR according to the operation result of the sub arithmetic logic unit ALUS.

【００８７】[0087] The instruction decoder (DEC) 203, in addition to the above, generates instruction operation timing and performs bus control, PC control, effective address calculation, input/output control of the general-purpose registers used in effective address calculation, input/output control of memory access data, instruction register control, interrupt control and so on.

【００８８】[0088] The functions of the decoder 203 and the sub decoder 204 are now explained further. When the signal NXTMON1 indicates that a register-to-register operation instruction has been detected, the sub decoder 204 decodes that instruction, and the operation control of the sub arithmetic logic unit ALUS based on the decode result starts 0.5 state later.

【００８９】[0089] FIG. 6 illustrates the configuration of the ROM 4. The ROM 4 has a parallel data input/output width of up to 32 bits, and its data input/output terminals are connected to the internal data bus IDB. For four consecutive bytes starting at an address that is a multiple of 4, the byte whose low-order address is "0" is placed in the most significant position and the byte whose low-order address is "3" in the least significant position.

【００９０】[0090] From this ROM 4, 32-bit data (longword data) starting at an address that is a multiple of 4 is read all at once in one state. 32-bit data starting at any other even address must be read in two accesses of one state each. Likewise, 16-bit data (word data) starting at an even address can be read all at once in one state. Reading 16-bit data that starts at an odd address is not permitted, which corresponds to instruction codes being in 16-bit units. The ROM 4 can also read 8-bit data (byte data) at any address in one state.

【００９１】 That is, when 16-bit instructions are consecutive, two instructions can be read with a single read of the ROM 4. Reads and writes of the RAM 5 are configured in the same way.

【００９２】 FIG. 7 illustrates the addressing modes of the CPU 2. Register indirect (@ERn) specifies an operand in memory using the contents of the address register (ERn) designated by the register field (r1) of the instruction code as the address.

【００９３】 Post-increment register indirect (@ERn+) specifies an operand in memory using the contents of the address register (ERn) designated by the register field (r1) of the instruction code as the address. Afterwards, 1, 2, or 4 is added to the contents of the address register and the sum is stored in the address register: 1 is added for byte size, 2 for word size, and 4 for longword size.

【００９４】 Pre-decrement register indirect (@-ERn) specifies an operand in memory using, as the address, the value obtained by subtracting 1, 2, or 4 from the contents of the address register (ERn) designated by the register field (r1) of the instruction code. The result of the subtraction is then stored in the address register. 1 is subtracted for byte size, 2 for word size, and 4 for longword size.

【００９５】 Register indirect with displacement (@(d:16,ERn)) specifies an operand in memory using, as the address, the sum of the contents of the address register (ERn) designated by the register field (r1) of the instruction code and the 16-bit displacement (d) contained in the instruction code. The 16-bit displacement is sign-extended for the addition.

【００９６】 Absolute address (@aa:16) specifies an operand in memory by the absolute address (aa) contained in the instruction code. Although not particularly limited, in the case of a 16-bit absolute address the upper 16 bits are filled by sign extension.
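The addressing modes of paragraphs 【００９２】 to 【００９６】 can be sketched as one effective-address function. This is an illustrative model, not the patent's circuitry: the names `effective_address`, `sign_extend16`, and `STEP`, the register-file dictionary, and the 32-bit wrap-around are all assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF                  # assumed 32-bit address registers
STEP = {"B": 1, "W": 2, "L": 4}      # byte / word / longword step sizes

def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of `value` as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def effective_address(mode: str, regs: dict, n: int = 0,
                      size: str = "W", ext: int = 0) -> int:
    """Return the operand address; updates regs[n] for @ERn+ and @-ERn."""
    if mode == "@ERn":                       # register indirect
        return regs[n] & MASK32
    if mode == "@ERn+":                      # use the address, then add 1/2/4
        ea = regs[n] & MASK32
        regs[n] = (ea + STEP[size]) & MASK32
        return ea
    if mode == "@-ERn":                      # subtract 1/2/4, then use it
        regs[n] = (regs[n] - STEP[size]) & MASK32
        return regs[n]
    if mode == "@(d:16,ERn)":                # sign-extended displacement
        return (regs[n] + sign_extend16(ext)) & MASK32
    if mode == "@aa:16":                     # sign-extended absolute address
        return sign_extend16(ext) & MASK32
    raise ValueError(f"unknown addressing mode: {mode}")
```

With `regs = {1: 0x1000}`, a word-size `@ERn+` access returns 0x1000 and leaves 0x1002 in the register, matching the post-increment rule above.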

【００９７】 FIG. 8 shows the operation timing of the transfer instruction "MOV.W @aa:16, Rd". Parts (1-1) and (1-2) of FIG. 8 show the operation of the control unit CONT (in particular, the instruction decoder 203), and part (2) of FIG. 8 shows the operation of the execution unit EXEC. In practice the execution unit EXEC operates on the basis of the control signals output by the control unit CONT, so there is a time difference between the operation of CONT and that of EXEC; for convenience, FIG. 8 represents this time difference as 0. Of the operations of the control unit CONT, (1-1) is equivalent to the prior art, and (1-2) corresponds to an example of an operation specific to the present invention.

【００９８】 In the first state ST1, the execution unit EXEC performs the instruction read (if) of the next instruction and the PC increment (+4; +2 in the prior art). In the second state ST2, it transfers the EA extension (aa) of this instruction from the read data buffer to the address buffer via the internal bus (GB) and issues a bus command for the data read. In the third state ST3, it performs the instruction read (if) of the instruction after next and the program counter PC increment (+4; +2 in the prior art); it also transfers the data read in the second state ST2 from the read data buffer to a general-purpose register via the internal bus (WB), inspects the data, and sets the result in the condition code register CCR.

【００９９】 The operation (1-1) of the control unit CONT follows the operation of the execution unit EXEC described above. That is, it generates the control signals for address output and bus command generation in the second state ST2, and the control signal for storing the read data in the third state ST3. Furthermore, in the aforementioned "H8S/2600 Series, H8S/2000 Series Programming Manual" published by Hitachi, Ltd. in March 1995, the PC increment is +2, and the read data is input from the read data buffer to the arithmetic unit (ALU) via the internal bus (GB); the ALU outputs the data unchanged to the internal bus (WB), and the data is stored in a general-purpose register. Passing the data through the ALU avoided adding internal buses (there are three: GB, DB, and WB) and allowed the ALU's data inspection circuit and flag setting circuit to be shared. A detailed account of these differences is omitted.

【０１００】 In operation (1-2) of the control unit CONT, the control for the data access is performed in the second state ST2. That is, in the second state ST2 it generates the control signals for address output, bus command generation, and storage of the read data. The execution unit EXEC is first given the control signals for address output and bus command generation, and is given the control signal for storing the read data (RDB-Rd) in the following state.

【０１０１】 The first state ST1 and the third state ST3 of the control unit CONT perform only the instruction read (if) and the PC increment (+4). These states ST1 and ST3 are omitted (skipped) according to the amount of instructions already read into the instruction register (FIFO) 200. If few instructions have been read, both ST1 and ST3 are executed, and more instructions are read than the length of this instruction (2 words). If the amount of instructions already read is adequate, either ST1 or ST3 is executed, and the same amount of instructions as the length of this instruction (2 words) is read. If many instructions have been read, neither ST1 nor ST3 is executed and no instruction is read. Which of these operations is performed is decided by the instruction decoder 203 using signals such as FIFOCNT1, FIFOCNT, and IFMON.

【０１０２】 For example, when this instruction is executed several times in succession, only the first state ST1 and the second state ST2 are executed (the third state ST3 is omitted). Under the conventional control shown in (1-1) of FIG. 8, the preceding instruction performs a word-size (16-bit) instruction read in its third state ST3 and the next instruction performs a word-size instruction read in its first state ST1. Under the control of the present invention shown in (1-2) of FIG. 8, these two reads can be understood as having been merged into a single longword-size (32-bit) instruction read in the first state ST1.

【０１０３】 When only one of the first state ST1 and the third state ST3 is executed, which one is executed and which is omitted (skipped) depends on the state of the preceding instruction reads. If the instruction is placed at the head of a branch destination and is not at a multiple-of-4 address, only its first word has been prefetched, so the first state ST1 is executed to wait for its second word. If the instruction is placed at the head of a branch destination at a multiple-of-4 address, its second word has already been read (prefetched) at the same time, so the first state ST1 is omitted (skipped) and the third state ST3 is executed.

【０１０４】 FIG. 9 shows the operation timing of the branch instruction (JMP @aa:24). Part (1) of FIG. 9 shows the operation of the control unit CONT (in particular, the instruction decoder 203), and part (2) of FIG. 9 shows the operation of the execution unit EXEC. As in FIG. 8, the time difference between the operation of the control unit CONT and that of the execution unit EXEC is represented as 0 for convenience.

【０１０５】 In FIG. 9, the first state ST1 serves to wait for completion of the read of the instruction's own second word, and can be omitted (skipped) if the second word has already been read. The instruction at the branch destination is read twice. On the first read, the instruction that was read is taken into the CPU 2. On the second, only the instruction read is issued, and taking the read instruction into the CPU 2 overlaps with execution of the next instruction.

【０１０６】 On the first read, if the address is a multiple of 4, two words are read and the PC increment is +4; if the address is not a multiple of 4, one word is read and the PC increment is +2.
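The rule for the first read at a branch destination can be written out as a short sketch. The function name `first_branch_target_read` is hypothetical, and the odd-address check is an assumption based on instruction codes being in 16-bit units.

```python
def first_branch_target_read(target: int) -> tuple[int, int]:
    """Return (words_read, next_pc) for the first read at a branch target."""
    if target % 2 != 0:
        # Assumption: instruction codes are 16-bit units at even addresses.
        raise ValueError("instruction addresses must be even")
    if target % 4 == 0:
        return 2, target + 4   # aligned: both bus halves used, PC + 4
    return 1, target + 2       # unaligned: lower half only, PC + 2
```

A target of 14 yields one word and a PC of 16; a target of 12 yields two words and a PC of 16.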

【０１０７】 Consequently, when the same branch instruction (JMP @aa:24) is located at the branch destination and the branch destination is a multiple-of-4 address, execution starts with the read (prefetch) of the instruction's second word already complete, so the first state ST1 can be omitted (skipped). If the branch destination is not a multiple-of-4 address, execution starts before the read of the instruction's second word is complete, so the first state ST1 cannot be omitted (skipped).

【０１０８】 Even when a branch instruction is executed, execution of the instruction at the branch destination can start at the same timing as in the prior art. When branching to a multiple-of-4 address, for example, execution of the branch-destination instruction can be shortened. Responsiveness for branch instructions, interrupt exception handling, and the like can thus be maintained or improved.

【０１０９】 As described above, a branch instruction can be executed regardless of the address at which it is placed. The number of execution states varies, but it is at worst equal to that of the prior art. There is no need to insert no-operation instructions or the like, so no burden is placed on software.

【０１１０】 FIGS. 10 to 13 illustrate timing charts for the execution of a program. The program executed is:
    L0  JMP     L1
    L1  MOV.W   @aa1, R0
        ADD.W   R0, R1
        EXTS.L  ER1
        MOV.L   ER1, @aa2

【０１１１】 Note that MOV.L ER1, @aa2 has an instruction code formed by adding a prefix code to the instruction code of MOV.W R1, @aa2. The prefix code generates a control signal that modifies the operation of the instruction code that follows (MOV.W R1, @aa2); this is described in, for example, Japanese Patent Laid-Open No. 6-51981.

【０１１２】 In FIGS. 10 to 13 the addresses at which the instructions are placed differ. The labels L0 and L1 are set as follows:
    FIG. 10: L0 = 2, L1 = 14 (L0 .EQU 2, L1 .EQU 14)
    FIG. 11: L0 = 2, L1 = 12 (L0 .EQU 2, L1 .EQU 12)
    FIG. 12: L0 = 0, L1 = 14 (L0 .EQU 0, L1 .EQU 14)
    FIG. 13: L0 = 0, L1 = 12 (L0 .EQU 0, L1 .EQU 12)
The data addresses are common to all four figures: aa1 = 102 and aa2 = 104 (aa1 .EQU 102, aa2 .EQU 104).

【０１１３】 The first state of the branch instruction can be omitted when one word of instruction code is present in the instruction register (FIFO) 200 (FIFOCNT1 = 1).

【０１１４】 The first state of the transfer instruction (MOV.W) can be omitted when one word of instruction code is present in the instruction register (FIFO) 200.

【０１１５】 The third state of the transfer instruction (MOV.W) can be omitted when two words of instruction code are present in the instruction register (FIFO) 200 (FIFOCNT1 = FIFOCNT2 = 1), or when one word of instruction code is present in the instruction register (FIFO) 200 and an instruction read is in progress (FIFOCNT1 = IFMON = 1).
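The skip conditions for MOV.W in paragraphs 【０１１４】 and 【０１１５】 reduce to two Boolean expressions over the monitor signals. The sketch below only restates those conditions; the function names are hypothetical, and it is an assumption that each signal is sampled at the moment the corresponding state would begin.

```python
def can_skip_first_state(fifocnt1: int) -> bool:
    """MOV.W ST1 may be skipped when one word is present in the FIFO."""
    return bool(fifocnt1)

def can_skip_third_state(fifocnt1: int, fifocnt2: int, ifmon: int) -> bool:
    """MOV.W ST3 may be skipped when two words are present in the FIFO,
    or when one word is present and an instruction read is in progress."""
    return bool(fifocnt1 and (fifocnt2 or ifmon))
```

For example, with one word buffered and a fetch in flight (FIFOCNT1 = 1, FIFOCNT2 = 0, IFMON = 1), the third state can be skipped; with an empty FIFO it cannot.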

【０１１６】 For an inter-register operation instruction, when two words of instruction code are present in the instruction register (FIFO) 200 (FIFOCNT1 = FIFOCNT2 = 1), or when one word of instruction code is present in the instruction register (FIFO) 200 and an instruction read is in progress (FIFOCNT1 = IFMON = 1), and the next instruction is an inter-register operation instruction or the like (NXTMON1 = 1), operation of the sub instruction decoder (DECS) 204 and the sub arithmetic logic unit ALUS is instructed.

【０１１７】 The first state of the transfer instruction (MOV.L) (the prefix code, NXTMON2 = 1) can be omitted when one word of instruction code is present in the instruction register (FIFO) 200. If it is not omitted, the prefix code is decoded by the instruction decoder, the instruction read and the PC increment are performed, and a control signal is generated. If it is omitted, the desired signal is generated from the instruction register and input to the instruction decoder.

【０１１８】 The second and fourth states of the transfer instruction (MOV.L) are the same as the first and third states of the transfer instruction (MOV.W).

【０１１９】 In the case of FIG. 10, the operation is as follows. In slot C2 of cycle T0 of the reference clock φ (the low-level period of the reference clock φ), the CPU 2, while executing a branch instruction that is not shown, outputs a bus command (BCMD) indicating an instruction fetch (if) and outputs an address from the address buffer AB to the address bus IAB. Similarly, in slot C2 of cycle T1, it outputs a bus command and the next address.

【０１２０】 Based on the contents of the address bus IAB and the bus command, the contents of the built-in ROM 4 are obtained on the internal data bus IDB in slot C2 of cycle T1, and are latched into the instruction register (FIFO) 200 and the read data buffer RDB in slot C1 of cycle T2 (the high-level period of the reference clock φ). Because the instruction address at this point is not a multiple of 4, only the lower side (bits 15 to 0) of the internal data bus IDB is used. Similarly, in slot C1 of cycle T2, the contents of the next address are latched into the instruction register (FIFO) 200 and the read data buffer RDB. This time the address is a multiple of 4, so both the upper side (bits 31 to 16) and the lower side (bits 15 to 0) of the internal data bus IDB are used.

【０１２１】 In slot C1 of cycle T2, the instruction code (jmp-1) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

【０１２２】 Because the read of the instruction's own second word (jmp-2) is not complete, the first state ST1 is executed.

【０１２３】 In slot C2 of cycle T2, the bus command is a no-operation, and no read or write is started.

【０１２４】 In slot C2 of cycle T3, the contents of the read data buffer RDB (absolute address = 14) are stored in the address buffer AB via the internal bus GB and output to the address bus IAB, and a bus command is issued to read an instruction. Similarly, in slot C2 of cycle T4, a bus command and the next address are output to read an instruction.

【０１２５】 The contents that were read are latched into the instruction register (FIFO) 200 and the read data buffer RDB in slot C1 of cycle T5 and slot C1 of cycle T6.

【０１２６】 The contents of the internal bus GB are also input to the write data buffer WDB and the incrementer INC, and the incrementer INC performs the increment (+2/+4).

【０１２７】 In slot C1 of cycle T4, the result (16) of the increment (+2) by the incrementer INC is written to the program counter PC via the internal bus WB. Similarly, in slot C1 of cycle T5, the result (20) of the increment (+4) is written to the program counter PC via the internal bus WB.

【０１２８】 In slot C1 of cycle T5, the instruction code (mov-1) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

【０１２９】 Because the read of the instruction's own second word (mov-2) is not complete, the first state ST1 is executed.

【０１３０】 In slot C2 of cycle T5, a bus command and the next address are output to read an instruction. The program counter PC is also incremented, among other operations.

【０１３１】 In slot C2 of cycle T6, the contents of the read data buffer RDB (absolute address = 102) are stored in the address buffer AB via the internal bus GB and output to the address bus IAB, and a bus command is issued to read the data. The third state ST3 is omitted (skipped).

【０１３２】 The data that was read is stored in the read data buffer RDB in slot C1 of cycle T8 and written to the general-purpose register ER0 (in effect R0) via the internal bus WB. The data in the read data buffer RDB is also inspected, and the result is reflected in predetermined bits of the condition code register CCR (for example, negative N, zero Z, and overflow V). This operation is performed on the basis of the result of decoding the instruction code (mov-1), but it is executed in a period that overlaps the next instruction.

【０１３３】 In slot C1 of cycle T7, the instruction code (add) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

【０１３４】 In slot C2 of cycle T7, a bus command and the next address are output to read an instruction. The program counter PC is also incremented, among other operations.

【０１３５】 In slot C1 of cycle T8, the contents of the general-purpose register ER1 (R1) are read onto the internal bus GB and input to the arithmetic logic unit ALU. The contents of the general-purpose register ER0 (R0) would normally be read onto the internal bus DB, but because this conflicts with the write by the preceding instruction, they are instead read from the read data buffer RDB (which keeps the delay to a minimum) and input to the arithmetic logic unit ALU. The arithmetic logic unit ALU is instructed to perform addition.

【０１３６】 In slot C2 of cycle T8, the operation result is stored in the general-purpose register ER1 (R1). The operation result is also inspected, and the result is reflected in predetermined bits of the condition code register CCR (for example, negative N, zero Z, overflow V, carry C, and half-carry H).

【０１３７】 Because the next instruction code is an inter-register operation instruction (NXTMON1 = 1), the instruction code (exts) is input to the sub instruction decoder (DECS) 204 in slot C2 of cycle T7.

【０１３８】 In slot C2 of cycle T8, the contents of the general-purpose register ER0 would normally be read onto the internal bus GB, but because this conflicts with the write by the preceding instruction, they are instead read from the arithmetic logic unit ALU (which keeps the delay to a minimum) and input to the arithmetic logic unit ALU. The arithmetic logic unit ALU is instructed to perform extension.

【０１３９】 In slot C1 of cycle T9, the operation result is stored in the general-purpose register ER0. The operation result is also inspected, and the result is reflected in predetermined bits of the condition code register CCR (for example, negative N, zero Z, and overflow V).

【０１４０】 In slot C1 of cycle T8, the instruction code (movl-1) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

【０１４１】 Because the read of the instruction's own second word (movl-2) is not complete, the first state (prefix code) ST1 is executed.

【０１４２】 In slot C2 of cycle T8, a bus command and the next address are output to read an instruction. The program counter PC is also incremented, among other operations. In addition, a control signal is generated to pass a directive (in this case, a longword-size designation) to the next instruction code.

【０１４３】 In slot C1 of cycle T9, the instruction code (movl-2) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

【０１４４】 The read of the instruction's own third word (movl-3) is complete, so the second state ST2 is omitted (skipped). Because the read of the next instruction has not been issued, the fourth state is executed.

【０１４５】 The operation of FIG. 11 is described below, focusing on the differences from FIG. 10. In slot C1 of cycle T5, the instruction code (mov-1) is input to the decoder (DEC) 203 and the content of the instruction is decoded. At this point the read of the instruction's own second word (mov-2) is complete, so the first state ST1 is omitted (skipped). Because the read of the next instruction has already been issued, the third state ST3 is also omitted (skipped).

【０１４６】 In slot C1 of cycle T6, the instruction code (add) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

【０１４７】 The next instruction code is an inter-register operation instruction, but because the read of the instruction after next is not complete, the instruction code (exts) is not input to the sub instruction decoder (DECS) 204.

【０１４８】 In slot C1 of cycle T7, the instruction code (exts) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

【０１４９】 The operation is performed using the arithmetic logic unit ALU. No register conflict occurs, so the general-purpose register ER0 is read.

【０１５０】 In slot C1 of cycle T8, the instruction code (movl-1) is determined to be a prefix code (NXTMON2 = 1), and the instruction code (movl-2) is input to the decoder (DEC) 203 and decoded. Because the read of the instruction's own third word (movl-3) is not complete, the second state ST2 is executed. The read of the next instruction is complete, so the fourth state is omitted (skipped).

【０１５１】 The operation of FIG. 12 is described below, mainly in terms of the differences from FIG. 10. In slot C1 of cycle T2, the instruction code (jmp-1) is input to the decoder (DEC) 203 and the content of the instruction is decoded. The read of the instruction's own second word (jmp-2) is complete, so the first state ST1 is omitted (skipped). The subsequent operation is the same as in FIG. 10.

【０１５２】 The operation of FIG. 13 is described below, mainly in terms of the differences from FIG. 10. As in FIG. 12, in slot C1 of cycle T2 the instruction code (jmp-1) is input to the decoder (DEC) 203 and the content of the instruction is decoded. The read of the instruction's own second word (jmp-2) is complete, so the first state is omitted (skipped). The subsequent operation is the same as in FIG. 11.

【０１５３】 In the prior art, the five instructions above can be executed in 12 states. By contrast, they are executed in 9 states in FIG. 10, in 8 states in FIGS. 11 and 12, and in 7 states in FIG. 13. The number of states required for the processing is thus reduced to 58 to 75% of the prior art. The variation depends on whether the branch destination is a multiple-of-4 address: when it is not, the upper side of the internal data bus IDB cannot be used and the throughput of the internal bus falls.
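The 58 to 75% range follows directly from the state counts quoted above. As a quick arithmetic check (the variable names are chosen only for this sketch):

```python
PRIOR_ART_STATES = 12  # states for the five instructions in the prior art

# State counts reported for each timing chart.
states = {"FIG. 10": 9, "FIG. 11": 8, "FIG. 12": 8, "FIG. 13": 7}

# Percentage of the prior-art state count, rounded to the nearest integer.
ratios = {fig: round(100 * n / PRIOR_ART_STATES) for fig, n in states.items()}
```

FIG. 13 gives the lower bound (7/12, about 58%) and FIG. 10 the upper bound (9/12 = 75%).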

【０１５４】 To improve the processing speed of a given program by placing it at a multiple-of-4 address, the assembler can be provided with a control directive that aligns code to a multiple-of-4 address, and this directive can be used. Such a control directive is described, for example, on p. 62 of "H8S, H8/300 Series Cross Assembler" published by Hitachi, Ltd. in May 1992. An alignment directive is not converted into a microcomputer instruction, so the change does not significantly impair program quality.

【０１５５】 Moreover, the operation timing examples above use a short program that begins immediately after a branch instruction and itself executes a branch instruction, so they do not necessarily show the improvement offered by the present invention to its full extent.

【０１５６】 For example, when only inter-register operation instructions are executed continuously, operating the ALU and the ALUS alternately effectively executes two instructions per state, reducing the execution time to 50% of that of the prior art.

[0157] Because the expanded bus width is converted into higher speed by omitting (skipping) the states that can be omitted, the instruction read time can typically be halved. If instruction reads account for 80% and data accesses for 20%, execution is shortened to 60%; if instruction reads account for 70% and data accesses for 30%, it is shortened to 65%.
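The two figures above follow from halving only the instruction-read share of the execution time. The following is a minimal Python sketch of that arithmetic, assuming, as the text does, that data-access time is unchanged; the function name and the `read_speedup` parameter are illustrative and do not appear in the patent.

```python
def relative_time(instr_read_share, read_speedup=2.0):
    """Execution time relative to the prior art when only instruction
    reads get faster.

    instr_read_share: fraction of time spent on instruction reads (0..1);
    the remainder is data access and is left unchanged.
    """
    data_share = 1.0 - instr_read_share
    return instr_read_share / read_speedup + data_share


print(round(relative_time(0.8), 2))  # 80% instruction reads, 20% data access
print(round(relative_time(0.7), 2))  # 70% instruction reads, 30% data access
```

The result is a fraction of the original time, so a smaller value means a larger improvement.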

[0158] Although it depends on the contents of the program, an improvement effect of about 50 to 75% is obtained for the program as a whole.

[0159] FIG. 14 illustrates a block diagram of the incrementer INC. The incrementer INC increments the program counter PC (+2/+4).

[0160] As described above, while there are two arithmetic logic units, ALU and ALUS, only one incrementer INC is provided rather than two. Except for branch instructions and the like, the program counter PC is incremented by +4. Therefore, the lower two bits of the program counter PC value that is normally input are 2'b00.

[0161] The incrementer INC is composed of a half adder 300 for each bit; it takes the internal bus GB as its input and outputs to the internal bus PCWB. For bit 1 only, the data input is fixed to logical value 1 (+2) by an OR circuit 301, and in addition a carry (logical value 1) is input (+2). That is, +4 is realized by applying +2 twice.

[0162] In the case of a branch instruction, if the branch destination address is a multiple of 4 (bit 1 is 0), +4 is performed just as for instructions other than branch instructions. If the branch destination address is not a multiple of 4 (bit 1 is 1), the +2 provided by the OR circuit 301 has no effect, so only the +2 from the carry input is performed.

[0163] As a result, +2 or +4 is selected automatically according to the branch destination address. In other words, the output is the smallest multiple of 4 that is greater than the input value.
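The behaviour of the incrementer can be checked with a short Python sketch. It models only the arithmetic described above (OR bit 1 to 1, then add a carry worth +2), not the half-adder circuit itself; the function name and the 32-bit default width are assumptions made for illustration, and the input address is assumed to be even.

```python
def inc_next_fetch(addr, width=32):
    """Sketch of incrementer INC: fix bit 1 to 1, then add a carry into bit 1.

    addr is assumed to be an even (word-aligned) address.
    """
    mask = (1 << width) - 1
    forced = addr | 0b10             # data input of bit 1 fixed to 1 (OR circuit 301)
    return (forced + 0b10) & mask    # carry input at bit 1, worth +2


for a in (12, 14, 16, 18):
    print(a, "->", inc_next_fetch(a))
```

For an input that is already a multiple of 4 the two steps add up to +4; for an input with bit 1 set, the OR step changes nothing and only the carry's +2 applies.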

[0164] FIG. 15 illustrates a block diagram of the write data buffer WDB. The write data buffer WDB consists of three parts, WDB-M, WDB-S, and WDB-OUT. WDB-M can receive input from the internal bus GB, and data can be transferred from WDB-M to WDB-S. Data can also be transferred from WDB-M and WDB-S to WDB-OUT, and WDB-OUT can additionally receive input from the internal buses GB and DB. WDB-M and WDB-S can output to the internal bus GB. Output to the data bus IDB is performed from WDB-OUT.

[0165] The value of the program counter PC to be saved is stored in the write data buffer WDB in advance. Storing the PC value to be saved in the write data buffer WDB in advance is described in Japanese Unexamined Patent Application Publication No. S62-293665. In the present invention, however, the method of outputting the PC value to be saved differs depending on the instruction code length of the instruction itself and on the state of the instruction read in progress at the start of instruction execution (IFMON).

[0166] Specifically, for a one-word instruction located at a multiple-of-4 address, if an instruction read is in progress (IFMON = 1), the PC value to be saved is obtained on the internal data bus IDB from WDB-S. At that time, bit 1 of the PC value is fixed to 1.

[0167] For a one-word instruction located at a multiple-of-4 address, if no instruction read is in progress (IFMON = 0), the PC value to be saved is obtained on the internal data bus IDB from WDB-M. At that time, bit 1 of the PC value is fixed to 1 (see the T8 portion of FIG. 21).

[0168] For a one-word instruction not located at a multiple-of-4 address, if an instruction read is in progress (IFMON = 1), the PC value to be saved is obtained on the internal data bus IDB from WDB-S. Bit 1 of the PC value is not fixed to 1 in this case.

[0169] For a one-word instruction not located at a multiple-of-4 address, if no instruction read is in progress (IFMON = 0), the PC value to be saved is obtained on the internal data bus IDB from WDB-M. Bit 1 of the PC value is not fixed to 1 in this case (see the T9 portion of FIG. 20).

[0170] For a two-word instruction located at a multiple-of-4 address, if an instruction read is in progress (IFMON = 1), the PC value to be saved is obtained on the internal data bus IDB from WDB-M. Bit 1 is not fixed.

[0171] For a two-word instruction located at a multiple-of-4 address, if no instruction read is in progress (IFMON = 0), the PC value to be saved is obtained on the internal data bus IDB from the program counter PC. Bit 1 is not fixed (see the T8 portion of FIG. 19).

[0172] For a two-word instruction not located at a multiple-of-4 address, if an instruction read is in progress (IFMON = 1), the PC value to be saved is obtained on the internal data bus IDB from WDB-S. At that time, bit 1 of the PC value is fixed to 1.

[0173] For a two-word instruction not located at a multiple-of-4 address, if no instruction read is in progress (IFMON = 0), the PC value to be saved is obtained on the internal data bus IDB from WDB-M. At that time, bit 1 of the PC value is fixed to 1 (see the T10 portion of FIG. 18).
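The eight cases above can be collected into a lookup table. The Python sketch below is an illustration only: the table transcribes the source and bit-1 rule stated for each case, while the names `SAVE_RULES` and `saved_pc` and the dictionary used for register contents are hypothetical. The example calls use the register contents the description gives for FIGS. 18, 19, and 21, with IFMON = 0 as in the cases that cite those figures.

```python
# (two_word, at_multiple_of_4, IFMON) -> (source of PC value, fix bit 1 to 1?)
SAVE_RULES = {
    (False, True,  1): ("WDB-S", True),
    (False, True,  0): ("WDB-M", True),
    (False, False, 1): ("WDB-S", False),
    (False, False, 0): ("WDB-M", False),
    (True,  True,  1): ("WDB-M", False),
    (True,  True,  0): ("PC",    False),
    (True,  False, 1): ("WDB-S", True),
    (True,  False, 0): ("WDB-M", True),
}


def saved_pc(instr_addr, words, ifmon, regs):
    """Return the PC value placed on IDB for saving.

    regs maps "PC", "WDB-M" and "WDB-S" to their current contents.
    """
    key = (words == 2, instr_addr % 4 == 0, ifmon)
    source, fix_bit1 = SAVE_RULES[key]
    value = regs[source]
    return value | 0b10 if fix_bit1 else value


# Two-word JSR at address 18 (FIG. 18) and at address 16 (FIG. 19),
# one-word BSR at address 16 (FIG. 21).
print(saved_pc(18, 2, 0, {"PC": 24, "WDB-M": 20, "WDB-S": 16}))
print(saved_pc(16, 2, 0, {"PC": 20, "WDB-M": 16, "WDB-S": 12}))
print(saved_pc(16, 1, 0, {"PC": 20, "WDB-M": 16, "WDB-S": 12}))
```

In every case the selected value, after the optional bit-1 fix, equals the address of the instruction following the subroutine branch.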

[0174] In the program-relative addressing mode, a displacement (relative value) is added to the address of the next instruction. Because this next-instruction address is the same value as the program counter PC value saved for a subroutine branch instruction, the contents of the write data buffer WDB described above can be used. That is, as above, the address of the next instruction is read onto the internal bus GB from WDB-M, WDB-S, or the program counter PC as appropriate, and is added to the displacement by the arithmetic logic unit ALU or the like. Setting bit 1 to 1 may be performed in the arithmetic logic unit ALU, on the internal bus GB, or in the write data buffer WDB or the program counter PC.

[0175] FIG. 16 shows an example of an arithmetic unit AU for calculating program-relative branch addresses. Prior to the start of execution of a one-word program-relative branch instruction, the arithmetic unit AU receives, via the multiplexer MPX, the program counter PC value (also simply called the PC value) stored in the write data buffer WDB. Because the instruction is one word long, the program counter PC itself is not used. The arithmetic unit AU also receives, via the internal bus DB, the 8-bit displacement contained in the instruction code held in the read data buffer RDB or the instruction register (FIFO) 200. The arithmetic unit AU adds the two inputs. When the branch instruction is located at a multiple-of-4 address, the control signal pls2 additionally fixes bit 1 of the PC value to 1, so that +2 is effectively performed at the same time.
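The addition performed by the arithmetic unit AU can be sketched in Python as follows. This is an illustration under stated assumptions: the displacement is treated as a two's-complement 8-bit value (the text says only "8-bit displacement"), the 32-bit result width is arbitrary, and the function names are hypothetical. The example uses the values given for FIG. 21 (PC value 16, displacement 22, bit 1 fixed to 1).

```python
def sign_extend8(value):
    """Interpret the low 8 bits of value as a two's-complement number."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def au_branch_target(pc_value, disp8, pls2, width=32):
    """Sketch of AU: PC value from WDB plus the 8-bit displacement.

    When pls2 is true, bit 1 of the PC value is fixed to 1 first, which is
    an effective +2 for a branch instruction at a multiple-of-4 address.
    """
    if pls2:
        pc_value |= 0b10
    return (pc_value + sign_extend8(disp8)) & ((1 << width) - 1)


print(au_branch_target(16, 22, pls2=True))
```

Folding the +2 into a single OR on bit 1 is what lets the adder produce the branch target in one pass instead of first computing the next-instruction address.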

[0176] Providing the arithmetic unit AU for program-relative branch address calculation allows the branch address to be calculated regardless of the operating state of the arithmetic logic unit ALU, which speeds up branches.

[0177] The program-relative addressing mode is not limited to branch instructions and can also be used for transfer instructions and the like; the arithmetic unit AU speeds up the effective address calculation and thus improves the instruction processing speed.

[0178] FIG. 17 shows the operation timing of a subroutine branch instruction (JSR @aa:24). The representation format of the figure is the same as in FIG. 9. FIG. 17 is the same as the branch instruction of FIG. 9 except that a state for stacking the program counter PC is inserted as the third state ST3. The first state ST1 waits for completion of the read of the second word of the instruction itself, and is omitted (skipped) if the second word has already been read. The instruction read at the branch destination address in the second state ST2, the stacking in the third state ST3, and the read of the next instruction in the fourth state ST4 are operations inherent to the subroutine branch instruction and are not omitted (skipped).

[0179] FIGS. 18 and 19 show examples of execution timing charts for a program including a subroutine branch instruction. Each figure shows the timing when the following program is executed at the destination of a branch instruction:

    L0  JMP    L1
    L1  MOV.W  @aa1,R0
        JSR    L2

The addresses at which the instructions are placed differ between FIG. 18 and FIG. 19. In FIG. 18 the labels are L0 = 2, L1 = 14, L2 = 40, that is,

    L0  .EQU  2
    L1  .EQU  14
    L2  .EQU  40

and in FIG. 19 they are L0 = 2, L1 = 12, L2 = 40, that is,

    L0  .EQU  2
    L1  .EQU  12
    L2  .EQU  40

[0180] The subroutine branch instruction has an instruction code two words long. In FIG. 18 the subroutine branch instruction is located at address 18 (to 21), so the PC value to be saved is 22. In FIG. 19 the subroutine branch instruction is located at address 16 (to 19), so the PC value to be saved is 20. In FIGS. 18 and 19, the operation timing up to execution of the subroutine branch instruction is the same as in FIGS. 10 and 11.

[0181] In FIG. 18, the instruction code (jsr-1) is input to the decoder (DEC) 203 in slot C1 of cycle T7, and the content of the instruction is decoded. At this point, the PC value is the address from which the next instruction is to be read (24), the write data buffer WDB-M holds the address of the previous instruction read (20), and WDB-S holds the address of the instruction read before that (16).

[0182] In slot C2 of cycle T7, the content of WDB-M (20) is transferred to WDB-OUT.

[0183] In slot C1 of cycle T9, the stack pointer SP is read onto the internal bus GB and input to the arithmetic logic unit ALU, which decrements it (-4).

[0184] In slot C2 of cycle T9, the decremented result is read onto the internal bus WB and written to the stack pointer SP; it is also read onto the internal bus GB, stored in the address buffer AB, and output to the internal address bus IAB. At the same time, a bus command for a longword data write is issued.

[0185] As for the write data, in slot C1 of cycle T10 the content of WDB-OUT is output with bit 1 fixed to 1, and this content (22) is stored on the stack.
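The stacking steps of FIG. 18 (decrement SP by 4, then write WDB-OUT with bit 1 fixed to 1) can be traced with a small Python sketch. The function name, the dictionary standing in for memory, and the initial stack pointer value 0x1000 are hypothetical; only the WDB-OUT content (20) and the resulting stacked value come from the description.

```python
def jsr_push(sp, wdb_out, fix_bit1, memory):
    """Sketch of the stacking steps of a subroutine branch.

    Decrements SP by 4 (the ALU step), then stores the WDB-OUT content,
    optionally with bit 1 fixed to 1, as one longword at the new SP.
    """
    sp -= 4                                          # ALU decrement (-4)
    data = wdb_out | 0b10 if fix_bit1 else wdb_out   # optional bit-1 fix
    memory[sp] = data                                # longword data write
    return sp


mem = {}
new_sp = jsr_push(sp=0x1000, wdb_out=20, fix_bit1=True, memory=mem)
print(hex(new_sp), mem[new_sp])
```

The stored value is the return address 22, the address immediately after the two-word JSR at addresses 18 to 21.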

[0186] In FIG. 19, the instruction code (jsr-1) is input to the decoder (DEC) 203 in slot C1 of cycle T6, and the content of the instruction is decoded.

[0187] At this point, the PC value is the address from which the next instruction is to be read (20), the write data buffer WDB-M holds the address of the previous instruction read (16), and WDB-S holds the address of the instruction read before that (12).

[0188] In slot C2 of cycle T7, the content of the PC value (20) is transferred to WDB-OUT via the internal bus DB.

[0189] In slot C1 of cycle T7, the stack pointer SP is read onto the internal bus GB and input to the arithmetic logic unit ALU, which decrements it (-4). In slot C2 of cycle T7, the decremented result is read onto the internal bus WB and written to the stack pointer SP; it is also read onto the internal bus GB, stored in the address buffer AB, and output to the internal address bus IAB. At the same time, a bus command for a longword data write is issued.

[0190] As for the write data, in slot C1 of cycle T8 the content of WDB-OUT is output (bit 1 is not fixed), and this content (20) is stored on the stack.

[0191] FIGS. 20 and 21 show examples of execution timing for programs including other subroutine branch instructions.

[0192] FIG. 20 shows the timing when the subroutine branch instruction uses memory indirect addressing instead of absolute addressing, and the following program is executed at the destination of a branch instruction:

    L0  JMP      L1
    L1  MOV.W    @aa1,R0
        JSR      @@L3
    L3  .DATA.L  L2

In this case the labels are L0 = 2, L1 = 14, L2 = 40, L3 = 160, that is,

    L0  .EQU  2
    L1  .EQU  14
    L2  .EQU  40
    L3  .EQU  160

In memory indirect addressing, memory is read at the address (L3) contained in the instruction code, and the content read (L2) becomes the branch address.

[0193] The timing of FIG. 20 is the same as that of FIG. 10 up to execution of the subroutine branch instruction. In slot C1 of cycle T7 in FIG. 20, the instruction code (jsr) is input to the decoder (DEC) 203 and the content of the instruction is decoded.

[0194] At this point, the PC value is the address from which the next instruction is to be read (24), the write data buffer WDB-M holds the address of the previous instruction read (20), and WDB-S holds the address of the instruction read before that (16).

[0195] In slot C2 of cycle T7, the content of WDB-M (20) is transferred to WDB-OUT. In slot C1 of cycle T9, the stack pointer SP is read onto the internal bus GB and input to the arithmetic logic unit ALU, which decrements it (-4). In slot C2 of cycle T9, the decremented result is read onto the internal bus WB and written to the stack pointer SP; it is also read onto the internal bus GB, stored in the address buffer AB, and output to the internal address bus IAB. At the same time, a bus command for a longword data write is issued.

[0196] As for the write data, in slot C1 of cycle T10 the content of WDB-OUT is output with bit 1 of the write address fixed to 1, and this content (22) is stored on the stack.

[0197] FIG. 21 shows the timing when the subroutine branch instruction uses program-counter-relative addressing, and the following program is executed at the destination of a branch instruction:

    L0  JMP    L1
    L1  MOV.W  @aa1,R0
        BSR    L2

In this case the labels are L0 = 2, L1 = 12, L2 = 40, that is,

    L0  .EQU  2
    L1  .EQU  12
    L2  .EQU  40

The displacement of the BSR is then 22 (decimal).

[0198] In FIG. 21, the timing up to execution of the subroutine branch instruction is the same as in FIG. 11. In slot C1 of cycle T6, the PC value (16) stored in the write data buffer WDB and the 8-bit displacement (22) held in the read data buffer RDB are input to the arithmetic unit AU, fixing of bit 1 of the PC value to 1 is instructed, and the addition is performed. In slot C2 of cycle T6, the addition result (40) is output to the internal bus GB, stored in the address buffer AB, and output to the internal address bus IAB.

[0199] Also in slot C2 of cycle T6, the content (16) of the write data buffer WDB-M is transferred to WDB-OUT; in slot C1 of cycle T9 it is output to the internal data bus IDB with bit 1 fixed to 1. This content (18) is written to the stack as the return address. Bit 1 is fixed to 1 because bit 1 of the address at which the BSR is located is 0.

[0200] FIGS. 22 and 23 show a logical description of part of the decode logic, included in the decoder (DEC) 203, for the transfer instruction of FIG. 8. The logical description shown in these figures is called an RTL (Register Transfer Level) or HDL (Hardware Description Language) description, and can be expanded into a logic circuit using a known logic synthesis tool. The HDL is standardized as IEEE 1364. The syntax of the description is based on the case statement: whenever a value or signal defined in the parentheses following always @ changes, the description lines below it are processed. Note that 5'b00001 denotes the 5-bit binary value 00001. IR[8] denotes the logical value of the ninth bit from the least significant bit of the instruction register IR (the input value of DEC). The symbol ~ denotes logical inversion.

[0201] The logical description of FIGS. 22 and 23 corresponds to the logic for decoding the code of the transfer instruction "MOV.W @aa:16,Rd". In this description, the pattern 16'b0110_101?_??00_???? written on the line following casex(IR) represents the code of that transfer instruction. IR[8] = 0 denotes byte size and IR[8] = 1 word size; IR[7] = 0 denotes a transfer from memory to a general-purpose register (read type) and IR[7] = 1 a transfer from a general-purpose register to memory (write type). Whether the first state ST1 and the third state ST3 of the instruction are omitted is determined according to the states of the signals FIFOCNT1, FIFOCNT2, and IFMON. That is, the logical description generates control signals according to the stage code TMG, and determines the value of the next stage code NEXTTMG from the current value of the stage code TMG and the values of FIFOCNT1, FIFOCNT2, IFMON, and so on at that time; this determines whether the first state ST1 and the third state ST3 are omitted. Referring to FIG. 22, the stage code of the first state ST1 is 1, the stage code of the second state ST2 is 17, and the stage code of the third state ST3 is 3.
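The casex pattern and the two field bits can be modelled in Python as a mask-and-compare. This is a sketch for illustration, not the HDL of FIGS. 22 and 23: the helper names are hypothetical, and the IR values in the example are arbitrary 16-bit words chosen to match or miss the pattern.

```python
def compile_casex(pattern):
    """Turn a pattern such as 16'b0110_101?_??00_???? into (mask, value).

    '?' marks a don't-care bit; '_' is a digit separator.
    """
    bits = pattern.split("'b", 1)[1].replace("_", "")
    mask = value = 0
    for ch in bits:
        mask = (mask << 1) | (1 if ch != "?" else 0)
        value = (value << 1) | (1 if ch == "1" else 0)
    return mask, value


MASK, VALUE = compile_casex("16'b0110_101?_??00_????")


def decode_mov(ir):
    """Return (size, direction) if ir matches the pattern, else None."""
    if (ir & MASK) != VALUE:
        return None
    size = "word" if (ir >> 8) & 1 else "byte"        # IR[8]
    direction = "write" if (ir >> 7) & 1 else "read"  # IR[7]
    return size, direction


print(hex(MASK), hex(VALUE))
print(decode_mov(0x6B00), decode_mov(0x6A80), decode_mov(0x5A00))
```

Only the bits written as 0 or 1 in the pattern take part in the comparison; IR[8] and IR[7] are among the don't-care positions, which is why one pattern covers both sizes and both transfer directions.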

[0202] In detail, the first part (1-1) of the logical description in FIG. 22 generates the stage code TMG. The stage code TMG advances 1 → 17 → 3, but stage code 17 and stage code 3 are omitted depending on the states of FIFOCNT1, FIFOCNT2, and IFMON. In stage code 1, if the second word of the instruction itself has already been read (FIFOCNT1 = 1), data read/write control is performed. In stage code 1, if the second word of the instruction itself has not yet been read (FIFOCNT1 = 0), the process advances to stage code 17, where data read/write control is performed.

[0203] The second part (1-2) of the logical description performs bus control. nop = 0 indicates the start of a bus access and nop = 1 indicates that bus access is inhibited. data = 0 indicates an instruction read and data = 1 a data access. byte = 0 indicates word size and byte = 1 byte size. write = 0 indicates a read and write = 1 a write.

[0204] For this transfer instruction, an instruction read is performed in stage code 1 when the second word of the instruction itself has not yet been read, and in stage code 3. A data access is performed in stage code 1 when the second word of the instruction itself has already been read, or in stage code 17. Whether the data access is a read or a write is specified by IR[7]. In the case of an instruction read, the content of the internal data bus IDB is stored in IR and the read data buffer RDB at a predetermined timing. In the case of a data read, the content of the internal data bus IDB is stored in the read data buffer RDB at a predetermined timing. In the case of a data write, the content of the write data buffer WDB is output to the internal data bus IDB at a predetermined timing.
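The bus-control rules for the transfer instruction can be summarized in a short Python sketch that returns the four signals nop, data, byte, and write for a given stage code. It is an illustration only: the function name is hypothetical, the size of a data access is taken from IR[8] as described for the instruction code, and the byte and write values returned for an instruction read are assumptions, since the description does not state them.

```python
def mov_bus_control(stage, fifocnt1, ir):
    """Sketch of bus control (part 1-2) for the transfer instruction.

    Encodings: nop 0 = start bus access; data 0 = instruction read,
    1 = data access; byte 0 = word, 1 = byte; write 0 = read, 1 = write.
    """
    second_word_read = fifocnt1 == 1
    if (stage == 1 and second_word_read) or stage == 17:
        # Data access: direction from IR[7], size from IR[8] (1 = word).
        return {"nop": 0, "data": 1,
                "byte": 0 if (ir >> 8) & 1 else 1,
                "write": (ir >> 7) & 1}
    if (stage == 1 and not second_word_read) or stage == 3:
        # Instruction read; byte and write values here are assumptions.
        return {"nop": 0, "data": 0, "byte": 0, "write": 0}
    raise ValueError("stage code not used by this instruction")


print(mov_bus_control(1, 1, 0x6B00))
print(mov_bus_control(1, 0, 0x6B00))
```

With the second word already in the FIFO, stage code 1 issues the data access directly, which is how the wait state is skipped.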

[0205] The third part (1-3) of the logical description in FIG. 23 calculates the effective address. For this transfer instruction, in stage code 1 when the second word of the instruction itself has already been read, or in stage code 17, the 16-bit EA extension part of the instruction code held in the read data buffer RDB is sign-extended to 32 bits by the rdbext signal and output to the internal bus GB. In stage code 1 when the second word of the instruction itself has not yet been read, and in stage code 3, the following are instructed: reading the PC value onto the internal bus PCGB, inputting it to the address buffer AB and the incrementer INC, and storing the increment result from the internal bus PCWB into the program counter PC. When input from the internal bus PCGB is not instructed, the address buffer AB takes its input from the internal bus GB. rdbgb is a signal that instructs output of the content of the read data buffer RDB to the bus GB, and rdbext is a signal that instructs sign extension of the read data buffer RDB.

[0206] The fourth part (1-4) of the logical description in FIG. 23 controls the transfer data and the registers. For the read type (IR[7] = 0), in stage code 1 when the second word of the instruction itself has already been read, or in stage code 17, the read data is output from the read data buffer RDB to the bus WB and stored in the general-purpose register (Rd). Updating of the N, Z, and V flags of the condition code register CCR is also instructed. As shown in FIG. 8 and elsewhere, control of these operations is delayed. The delay circuit itself is not shown.

[0207] For the write type (IR[7] = 1), in stage code 1 when the second word of the instruction itself has already been read, or in stage code 17, data is output from the general-purpose register (Rd) to the internal bus DB and, in either case, stored in the write data buffer WDB. Updating of the N, Z, and V flags of the condition code register CCR is also instructed.

[0208] FIGS. 24 to 26 show a logical description of part of the decode logic, included in the decoder (DEC) 203, for the branch instruction and subroutine branch instruction of FIGS. 9 and 17. The representation format is the same as in FIGS. 22 and 23. In the decode logic of FIGS. 24 to 26, IR[10] = 0 denotes a branch (JMP) and IR[10] = 1 a subroutine branch (JSR).

[0209] The first part (2-1) of the logical description shown in FIG. 24 generates the stage code TMG. The stage code TMG advances 1 → 17 → 2 → 3, but stage code 17 is omitted depending on the states of FIFOCNT1, FIFOCNT2, and IFMON. In stage code 1, if the second word of the instruction itself has already been read (FIFOCNT1 = 1), the effective address calculation and the instruction read at the branch destination are controlled. In stage code 1, if the second word of the instruction itself has not yet been read (FIFOCNT1 = 0), the process advances to stage code 17, where the effective address calculation and the instruction read at the branch destination are controlled.

[0210] The second part (2-2) of the logical description shown in FIG. 24 performs bus control. For this instruction, bus access is inhibited in stage code 1 when the second word of the instruction itself has not yet been read. In stage code 1 when the second word of the instruction itself has already been read, or in stage code 17, the instruction at the branch destination is read. In the case of a subroutine branch, the process advances to stage code 2, where a longword-size data write is performed to stack the PC value. In stage code 3, the instruction following the already-read branch destination instruction is read.

[0211] The third part (2-3) of the logical description shown in FIG. 25 calculates the effective address. For this instruction, in stage code 1 when the second word of the instruction itself has already been read, or in stage code 17, the EA extension part of the instruction code held in the read data buffer RDB is output to the internal bus GB and stored in the address buffer AB. This content is automatically input to the incrementer INC and incremented (+2/+4), and storing of the increment result in the program counter PC is instructed. In stage code 3, output of the PC value to the internal bus PCGB and storing of the increment result from the internal bus PCWB into the program counter PC are instructed.

【０２１２】The fourth part (2-4) of the logic description shown in FIG. 26 controls the transfer data (the PC to be stacked) and the registers. If the second word of the instruction itself has already been read at stage code 1, or at stage code 17, reading of the stack pointer contents onto the internal bus GB is instructed. The contents are input to the arithmetic logic unit ALU and, although not shown, the ALU is instructed to decrement (−4).

【０２１３】At stage code 2, an instruction is given to output the decrement result from the arithmetic logic unit ALU to the internal bus GB. The decrement result is thereby stored in the address buffer AB. Storing from the internal bus WB into the stack pointer SP is also instructed.
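
The stacking of the PC value described in the two paragraphs above amounts to a pre-decrement push. A minimal sketch, assuming a dictionary keyed by address as a stand-in for memory; the helper name `push_pc` is hypothetical.

```python
def push_pc(memory: dict[int, int], sp: int, pc: int) -> int:
    """Push a 32-bit PC value and return the updated stack pointer.

    Mirrors the described sequence: SP is read, decremented by 4 (ALU),
    the result is used as the write address (AB) and written back to SP.
    """
    new_sp = (sp - 4) & 0xFFFFFFFF    # decrement (-4), wrapped to 32 bits
    memory[new_sp] = pc & 0xFFFFFFFF  # longword-size data write of the PC
    return new_sp
```

For example, pushing a PC of `0x204` with SP at `0x1000` leaves the value at address `0xFFC` and returns `0xFFC` as the new SP.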

【０２１４】As described above, the address at which the subroutine branch instruction was located is held as A1 and decoded together with the instruction code. Using this address information and the signal IFMON, which indicates that an instruction read is in progress, the value transferred to the write data buffer WDB-OUT is selected from PC, WDB-M, and WDB-S. Based on the same address information, it is also selected whether bit 1 of the output data is set to 1 (+2).
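
Setting bit 1 of the output data is equivalent to adding 2 whenever the value is a multiple of 4, which is why this adjustment needs no adder. A small check, assuming the stored value is longword-aligned; the helper name `set_bit1` is illustrative only.

```python
def set_bit1(value: int) -> int:
    """Force bit 1 of a value to 1 (what the text calls '+2')."""
    return value | 0b10

# For any longword-aligned value (bits 1 and 0 clear), OR-ing in bit 1
# gives the same result as adding 2.
for aligned in (0x0000, 0x0104, 0xFFFC):
    assert set_bit1(aligned) == aligned + 2
```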

【０２１５】Such control allows instructions to be executed independently of their placement, with instruction states omitted as appropriate.

【０２１６】If there is an instruction that can neither have an omissible state nor be executed without a PC increment by using the alternating operation of the arithmetic logic unit ALU and the sub arithmetic logic unit ALUS, issuance of the bus command and the PC increment may be inhibited by referring to the state of FIFOCNT2 and the like. For example, in FIGS. 22 to 26, nop = 1 may be set in the second parts (1-2, 2-2) and incpc = 0 in the third parts (1-3, 2-3). This prevents the amount of instruction read from exceeding the amount of instruction executed (consumed), which would otherwise cause the instruction register (FIFO) 200 to overflow, or cause the contents of the program counter PC that must be saved by a subroutine branch instruction or the like to be lost.
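
The inhibit rule can be pictured as a guard on the two control signals named above. In this sketch `nop` and `incpc` follow the text, while the capacity and fetch-size parameters and the function name `fetch_controls` are assumptions made for illustration; the real condition is derived from FIFOCNT2 and other signals.

```python
def fetch_controls(held_words: int, capacity_words: int = 3,
                   fetch_words: int = 2) -> dict[str, int]:
    """Decide whether to issue an instruction read in this state.

    Returns nop=1 / incpc=0 when another read would overflow the
    instruction register, otherwise nop=0 / incpc=1.
    """
    would_overflow = held_words + fetch_words > capacity_words
    if would_overflow:
        return {"nop": 1, "incpc": 0}  # no bus command, PC unchanged
    return {"nop": 0, "incpc": 1}      # read proceeds, PC advances
```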

【０２１７】From the above, the following operational effects are obtained. [1] Relative to an existing CPU and without impairing compatibility, the data bus width is expanded to speed up instruction reads, and the control of instruction execution is divided into states that include instruction-specific operations and states that only read instructions, the latter being omissible (skippable). The gain from faster instruction reads is used to omit (skip) some states of instruction execution, which balances the amount of instruction read against the amount executed, shortens instruction execution time, and achieves higher speed. Because part of an instruction is omissible and is omitted (skipped) as appropriate according to the amount of instructions already read, instructions can be placed arbitrarily (relocatably). Arbitrary placement makes programs easier to write and removes constraints on the development of C compilers and the like.

【０２１８】[2] For instructions such as register-to-register operation instructions, which have a one-word (basic unit length) instruction code, execute in one state (unit time), and have no omissible (skippable) state, a plurality of arithmetic units are provided and operated with a time offset shorter than the execution time of the execution resources, so that a plurality of register-to-register operation instructions and the like can be executed effectively simultaneously. The instruction read of one of the instructions operating effectively simultaneously is omitted, which balances the amount of instruction read against the amount executed, shortens instruction execution time, and achieves higher speed. Providing one instruction decoder that handles all instructions (DEC203) and one that exclusively controls one of the effectively simultaneously operating arithmetic units (SDEC204) minimizes the increase in logic scale and hence the increase in manufacturing cost. Operating the arithmetic units with a time offset makes parallel processing unnecessary, simplifies the handling of general-purpose register conflicts, and suppresses the increase in logic scale. The instruction decoder that handles all instructions and the blocks of the execution unit EXEC can share most of their logic with the existing CPU, so design assets can be used effectively to improve design quality and shorten the development period.
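
The alternating use of the two arithmetic units can be illustrated with a simple schedule. This sketch assumes a half-state offset between ALU and ALUS and that every instruction in the list is a one-word, one-state register operation; both the offset value and the function name `alternate_schedule` are assumptions, not figures taken from the description.

```python
def alternate_schedule(instructions: list[str], offset: float = 0.5):
    """Assign one-state instructions alternately to ALU and ALUS.

    Returns (unit, instruction, start_time) tuples, with time measured in
    states. Each unit starts a new instruction once per state; ALUS runs
    `offset` states behind ALU.
    """
    schedule = []
    for i, insn in enumerate(instructions):
        unit = "ALU" if i % 2 == 0 else "ALUS"
        start = (i // 2) + (offset if unit == "ALUS" else 0.0)
        schedule.append((unit, insn, start))
    return schedule
```

Under these assumptions two consecutive register operations start within the same state, half a state apart, rather than in two successive states.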

【０２１９】[3] Instruction codes that generate only a control signal, such as prefix codes, are made skippable, and on a skip only the control signal is generated, without using the instruction decoder. This shortens instruction execution time and achieves higher speed.

【０２２０】[4] On a branch instruction or on interrupt or exception handling, the first instruction at the branch destination is read and its execution starts immediately, so responsiveness is maintained or improved.

【０２２１】[5] Bit 1 of the address bus IAB for each instruction code is stored in the instruction register together with the instruction code and evaluated at the same time by the instruction decoder 203. This simplifies control and the decode circuit and suppresses the increase in logic scale.

【０２２２】[6] The PC increment after the branch-destination instruction is read is switched automatically between +2 and +4 by the incrementer INC according to the branch destination address. Processing is therefore uniform regardless of whether the branch destination is at a multiple-of-4 address, and the increase in logic scale is suppressed.
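
A sketch of the automatic +2/+4 selection, assuming 16-bit instruction words on a 32-bit bus, so that bit 1 of the address tells whether the read started in the middle of a longword. The function name `increment_pc` is hypothetical.

```python
def increment_pc(address: int) -> int:
    """Advance an instruction address to the next longword boundary.

    If bit 1 is set (address is 4n+2) only one word of the longword was
    useful, so the step is +2; otherwise the step is +4.
    """
    step = 2 if address & 0b10 else 4
    return address + step
```

Either way the result is longword-aligned, so every later increment is a plain +4.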

【０２２３】[7] At each PC increment, the PC contents before the increment are stored in the write data buffer WDB; the write data buffer WDB has a FIFO structure; and bit 1 of the write data buffer WDB can be fixed to logical 1, so that +2 is easily realized. The PC value to be saved at a subroutine branch instruction can therefore be obtained easily. In addition, the arithmetic unit AU, which adds a displacement to the PC value to be saved that is held in the write data buffer WDB, has a path for direct input from the write data buffer WDB. This speeds up preparation for the program-relative addressing mode and thus improves instruction processing speed. Furthermore, there is no need for a separate prefetch counter and program counter, nor for a separate incrementer for each, so the increase in logic scale is suppressed.

【０２２４】[8] When mutually compatible CPUs with a wide address space and with a narrow address space exist, higher speed can be achieved in both while compatibility is maintained. Each need only be given the instructions and addressing modes it requires.

【０２２５】[0225]

[9] Because existing instructions remain executable and the order of internal operations is kept equivalent, the margin for future expansion is not greatly impaired compared with the existing CPU. For example, if it becomes possible to add a new instruction to the existing CPU, the same technique can presumably also be used in a CPU to which the present invention is applied. As long as instruction set compatibility is maintained, the same instruction as in the existing CPU can be added at the machine-language level. If an added instruction has a plurality of execution states, it can be divided into a part that performs its instruction-specific operation and omissible states, and the latter can be omitted as needed. At the least, instruction reads and PC increments can be inhibited as needed, so a processing time equivalent to that of the existing CPU is achievable. If the added instruction can be executed in one state, higher speed can be achieved by, for example, the alternating operation of ALU and ALUS.

【０２２６】[10] Using the same instruction set as the existing CPU allows development tools such as assemblers, C compilers, and simulator/debuggers, the so-called cross software, to be shared. Sharing the cross software allows a development environment to be set up quickly. It also limits the resources needed to develop the development environment, and users avoid unwanted cost by continuing to use their existing development environment.

【０２２７】The invention made by the present inventor has been described specifically on the basis of embodiments, but it goes without saying that the present invention is not limited to them and can be modified in various ways without departing from its gist.

【０２２８】For example, apart from maintaining compatibility, the present invention can also be applied to an entirely new microcomputer. The instruction set, that is, the kinds of instructions, the kinds of addressing modes, and their combinations, is arbitrary. The general-purpose registers need not be usable for both addresses and data; some or all of them may be dedicated to addresses or to data. The data size of the general-purpose registers is also arbitrary.

【０２２９】The kinds of prefix code are not particularly limited. A prefix code may contain other control information in addition to the information that specifies longword. The basic unit of the instruction code need not be limited to 16 bits and may be any bit width, such as 8 bits or 32 bits. The data bus width is likewise not limited to 32 bits and may be, for example, 64 bits. It may be four times the basic unit of an instruction rather than twice.

【０２３０】The capacity of the instruction register (FIFO) is not limited to three words. A minimum of two words suffices. With a larger capacity, even when instructions without omissible states occur, the accumulated instructions allow more states to be omitted in subsequent instruction execution, so the instruction amounts can still be balanced. However, even with a larger capacity, instructions already read are wasted when a branch instruction is executed, so the amount of instructions held in the instruction register in the normal or steady state is better kept from growing too large.

【０２３１】In the present invention parallel processing is not performed, but the invention can also be combined with parallel processing. Some instructions may be processed in parallel. The numbers of arithmetic units and instruction decoders are also arbitrary. The other functional blocks of the single-chip microcomputer are not restricted in any way.

【０２３２】The above description has mainly dealt with the case where the invention made by the present inventor is applied to a single-chip microcomputer, the field of use that formed its background. The invention is not limited to this, however, and is also applicable to other microcomputers and data processing devices; it can be applied at least to a data processing device that decodes and processes instructions and performs arithmetic processing.

【０２３３】[0233]

[Effects of the Invention] The effects obtained by representative ones of the inventions disclosed in this application are briefly described as follows.

【０２３４】That is, (relative to an existing CPU) the internal data bus width is made at least larger than the basic unit (word) of an instruction; an instruction register that holds (a plurality of units of) read instructions is provided; means for monitoring the amount of instructions present in this instruction register is provided; and (existing) instructions are divided, by the basic unit time of execution (state), into states that control only the instruction read (and PC increment) and states that include control of effective address calculation or data operation processing. For example, even when the effective address calculation or the data transfer processing extends over a plurality of states, the control itself is performed once, and the actual operations are carried out in a plurality of states (for example, address calculation in the first state and storing of the read data in the next state) by such means as delaying the control signal. The execution means is configured so that an operation whose control signal is to be delayed (for example, storing of the read data) can proceed at the same time as the next control operation (PC increment) even when the two overlap. The states that control only the instruction read are made omissible and are omitted (skipped) according to the amount present in the instruction register (as directed by the monitoring means).
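
The idea of issuing control once and letting part of it take effect one state later can be sketched as a tiny event schedule. The operation names, the one-state delay, and the function name `schedule_controls` are illustrative assumptions.

```python
from collections import defaultdict

def schedule_controls(
    issues: dict[int, list[tuple[str, int]]]
) -> dict[int, list[str]]:
    """Map control issued in a state to the state where it takes effect.

    `issues` maps an issue state to (operation, delay) pairs; an operation
    with delay d acts in state issue + d.
    """
    timeline = defaultdict(list)
    for state in sorted(issues):
        for operation, delay in issues[state]:
            timeline[state + delay].append(operation)
    return dict(timeline)

# Control for a transfer is issued once in state 0; the store of read data
# is delayed by one state and so overlaps the PC increment issued in state 1.
timeline = schedule_controls({
    0: [("address_calc", 0), ("store_read_data", 1)],
    1: [("pc_increment", 0)],
})
```

In this example state 1 carries both the delayed store and the newly issued PC increment, which is the overlap the execution means has to tolerate.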

【０２３５】Because the internal data bus is wider than the basic unit (word) of an instruction, the amount of instruction read at one time can be made larger (than in the existing CPU). In its basic execution states, an instruction performs instruction reads a number of times corresponding to its own instruction code length (as in the existing CPU). When no state is omitted (skipped), the amount of instruction read therefore exceeds the amount of instruction code of the executed instruction, and read-ahead instruction code accumulates. When a state is omitted (skipped) and its instruction read is not performed, the amount read either equals the amount of instruction code of the executed instruction, so that the amount of read-ahead instruction code is maintained, or is less than it, so that the amount of read-ahead instruction code decreases. The amount of read-ahead instructions is thus kept within a predetermined range (the amount of instruction read is balanced against the amount executed) while instruction reads are sped up and overall execution time is shortened. In addition, automatically changing which states are omitted (skipped) accommodates changes in instruction placement.
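
The balance between instruction read and instruction consumption can be explored with a small simulation. This is a simplified model rather than the circuit of the embodiment: the two-word fetch size assumes 16-bit words on a 32-bit bus, and the `capacity` and `margin` parameters, the skip rule, and the names `Insn` and `run` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

BUS_WORDS = 2  # words per instruction read (assumed: 32-bit bus, 16-bit words)

@dataclass
class Insn:
    code_words: int        # length of the instruction code in words
    read_only_states: int  # states that only read instructions (skippable)
    other_states: int      # states that also do instruction-specific work

def run(program, skip_enabled=True, capacity=4, margin=1):
    """Return (states_used, peak_held_words) for a list of Insn.

    Simplifications of this sketch: every executed state tries one
    instruction read; a read is inhibited if it would exceed `capacity`;
    a read-only state is skipped when the held words already cover this
    instruction plus `margin` words of the next one.
    """
    held = states = peak = 0
    for insn in program:
        for _ in range(insn.other_states):
            states += 1
            if held + BUS_WORDS <= capacity:
                held += BUS_WORDS
            peak = max(peak, held)
        for _ in range(insn.read_only_states):
            if skip_enabled and held >= insn.code_words + margin:
                continue  # enough read ahead: omit this state
            states += 1
            if held + BUS_WORDS <= capacity:
                held += BUS_WORDS
            peak = max(peak, held)
        held -= insn.code_words  # the executed instruction is consumed
    return states, peak

program = [Insn(code_words=2, read_only_states=1, other_states=1)] * 4
with_skip = run(program)
without_skip = run(program, skip_enabled=False)
```

For this four-instruction program the model uses 5 states with skipping and 8 without, while the held amount stays within the assumed capacity in both cases.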

【０２３６】When, with compatibility maintained at the object level, there exist a CPU with a wide address space (large instruction set) and a CPU with a small address space (small instruction set), realizing the above speedup in the CPU with the wide address space makes the same speedup possible for the instructions that also exist in the backward-compatible CPU with the small address space. In other words, the same method achieves the speedup in both the CPU with the wide address space and the CPU with the small address space while maintaining object-level compatibility. Both the advantage of object-level compatibility and the advantage of higher speed can be enjoyed.

【０２３７】Using the same instruction set as the existing CPU allows development tools such as assemblers, C compilers, and simulator/debuggers, the so-called cross software, to be shared. Sharing the cross software allows a development environment to be set up quickly.

[Brief description of the drawings]

FIG. 1 is a block diagram of a CPU to which the data processing device according to the present invention is applied.

FIG. 2 is a block diagram of a single-chip microcomputer to which the data processing device according to the present invention is applied.

FIG. 3 is an explanatory diagram of the programming model of the general-purpose registers and control registers built into the CPU of FIG. 1.

FIG. 4 is an explanatory diagram of the programming model of the general-purpose registers and control registers built into another CPU.

FIG. 5 is an explanatory diagram of an example of the machine-language instruction format of the CPU 2 of FIG. 1.

FIG. 6 is an explanatory diagram showing an example of a ROM.

FIG. 7 is an explanatory diagram of the addressing modes of the CPU.

FIG. 8 is an operation timing chart of the transfer instruction "MOV.W @aa:16, Rd".

FIG. 9 is an operation timing chart of the branch instruction "JMP @aa:24".

FIG. 10 is a timing chart showing an example of the execution timing of a program by the CPU.

FIG. 11 is a timing chart illustrating the execution timing of a program whose instruction placement addresses differ from those of FIG. 10.

FIG. 12 is a timing chart illustrating the execution timing of a program whose instruction placement addresses differ from those of FIGS. 10 and 11.

FIG. 13 is a timing chart illustrating the execution timing of a program whose instruction placement addresses differ from those of FIGS. 10 to 12.

FIG. 14 is a block diagram showing an example of the incrementer.

FIG. 15 is a block diagram showing an example of the write data buffer.

FIG. 16 is a block diagram showing an example of the arithmetic unit for program-relative branch address calculation.

FIG. 17 is a timing chart illustrating the operation timing of the subroutine branch instruction "JSR @aa:24" by the CPU.

FIG. 18 is a timing chart illustrating the execution timing of a program that includes a subroutine branch instruction.

FIG. 19 is a timing chart illustrating the execution timing of a program that includes a subroutine branch instruction.

FIG. 20 is a timing chart illustrating the execution timing of a program that includes another subroutine branch instruction.

FIG. 21 is a timing chart illustrating the execution timing of a program that includes another subroutine branch instruction.

FIG. 22 is an explanatory diagram illustrating, together with FIG. 23, part of the logic description of the decode logic of the decoder for the transfer instruction of FIG. 8.

FIG. 23 is an explanatory diagram illustrating, together with FIG. 22, part of the logic description of the decode logic of the decoder for the transfer instruction of FIG. 8.

FIG. 24 is an explanatory diagram illustrating, together with FIGS. 25 and 26, part of the logic description of the decode logic of the decoder for the branch instruction and subroutine branch instruction of FIGS. 9 and 17.

FIG. 25 is an explanatory diagram illustrating, together with FIGS. 24 and 26, part of the logic description of the decode logic of the decoder for the branch instruction and subroutine branch instruction of FIGS. 9 and 17.

FIG. 26 is an explanatory diagram illustrating, together with FIGS. 24 and 25, part of the logic description of the decode logic of the decoder for the branch instruction and subroutine branch instruction of FIGS. 9 and 17.

[Explanation of symbols]

1: single-chip microcomputer
2: CPU
4: ROM
200: instruction register
202: instruction register controller
203: instruction decoder
204: sub-instruction decoder
205: register selector
FIFOCNT1, FIFOCNT2: instruction code amount detection signals
ER0 to ER7: general-purpose registers
PC: program counter
ALU: arithmetic logic unit
ALUS: sub arithmetic logic unit
AU: arithmetic unit
INC: incrementer
WDB: write data buffer
RDB: read data buffer
AB: address buffer

Continued from the front page: (51) Int.Cl.⁷: G06F 15/78 (identification symbol 510); FI: G06F 15/78 510G, 510A

Claims

1. A data processing device that reads instruction codes having a basic unit bit number and operates according to a basic unit time, comprising: a data bus having a larger number of bits than the basic unit bit number; instruction code holding means, connected to the data bus, for receiving instruction codes from the data bus and capable of holding a plurality of instruction codes; monitoring means for monitoring a state of the instruction codes held in the instruction code holding means; and control means for decoding an instruction code obtained from the instruction code holding means and controlling an instruction execution operation, wherein the control means divides part or all of the instruction execution operation based on the instruction code, for each basic unit time, into a first operation that reads an instruction code and a second operation that performs other operations, and, when the instruction execution operation includes the first operation and the second operation, suppresses execution of the first operation in accordance with the state monitored by the monitoring means.

2. The data processing device according to claim 1, wherein the state of the instruction codes is the held amount of instruction codes, execution of the first operation is suppressed when the held amount is large, and the first operation is executed when the held amount is small.

3. The data processing device according to claim 1 or 2, having a data transfer instruction as an executable instruction, wherein the second operation controlled by the control means in response to the instruction code of the data transfer instruction includes a transfer source address generating operation for generating a transfer source address and a transfer data storing operation for storing the transferred data, and the control means generates the control information for the transfer source address generating operation and for the transfer data storing operation in the same basic unit time and causes the transfer source address generating operation and the transfer data storing operation to be executed in different basic unit times.

4. The data processing device according to claim 3, further comprising register means, flag means, read data buffer means, and an internal bus connected to the register means and the read data buffer means, wherein the read data buffer means temporarily stores the transferred data and outputs the stored data to the internal bus, and the register means is capable of storing the contents of the internal bus.

5. The data processing device according to claim 4, wherein the read data buffer means temporarily stores the transferred data, inspects the stored data, and reflects the inspection result in the flag means.

6. The data processing device according to claim 4, further comprising program counting means, wherein the control means performs, in parallel, reading or writing of the program counting means as the first operation and storing of the transferred data in the register means or reflection of the transferred data in the flag means.

7. The data processing device according to claim 1 or 2, further comprising program counting means and arithmetic operation means capable of incrementing the contents of the program counting means, wherein the value by which the arithmetic operation means increments is selected, according to the input value, from a first value corresponding to the size of the data bus and a second value smaller than the first value.

8. The data processing device according to claim 7, having a branch instruction as an executable instruction, wherein, during execution of the branch instruction, after the instruction code at the branch destination has been read, the arithmetic operation means selects the first value or the second value as the value to increment by according to the input value.

9. The data processing device according to claim 7 or 8, having an instruction that does not branch as an executable instruction, wherein, when the instruction that does not branch is executed, the arithmetic operation means sets the value to be incremented to the first value.
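
The increment selection recited in claims 7 to 9 can be illustrated with a minimal sketch. The claims state only that the increment is chosen between a first value (corresponding to the data bus size) and a smaller second value according to an input value. The concrete sizes (a 4-byte bus, a 2-byte basic unit), the use of the branch-target alignment as the input value, and all function names below are assumptions made for illustration and are not taken from the claims.

```python
BUS_BYTES = 4   # assumed first value: one fetch over a 32-bit data bus
UNIT_BYTES = 2  # assumed second value: one 16-bit basic instruction unit


def pc_increment(after_branch_fetch: bool, select_small: bool) -> int:
    """Select the program counter increment.

    Non-branching instructions always use the first value (claim 9).
    After a branch-destination fetch, the input value chooses between
    the first and the second value (claim 8).
    """
    if after_branch_fetch and select_small:
        return UNIT_BYTES
    return BUS_BYTES


def next_fetch_address(pc: int, after_branch_fetch: bool) -> int:
    """Advance the program counter after one instruction fetch.

    Assumption: the input value is whether the current address is
    misaligned with respect to the bus width.
    """
    select_small = pc % BUS_BYTES != 0
    return pc + pc_increment(after_branch_fetch, select_small)
```

Under these assumptions, a branch to a misaligned destination advances by the smaller value once, after which every fetch is bus-aligned and advances by the full bus width.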

10. A data processing device which reads instruction codes composed of a basic unit number of bits and operates according to a basic unit time, comprising: a data bus having a number of bits larger than the basic unit number of bits; instruction code storage means connected to the data bus, which receives instruction codes from the data bus and holds a plurality of instruction codes; first control means and second control means for decoding the instruction codes and controlling instruction execution operations; and a first arithmetic and logic unit and a second arithmetic and logic unit which operate in synchronization with the basic unit time with their operation phases shifted from each other, wherein the first control means controls the first arithmetic and logic unit, the second control means controls the second arithmetic and logic unit, the first control means includes the function of the second control means, the first control means and the second control means are operable in overlapping periods, and the first control means controls the second control means.

11. The data processing device according to claim 10, further comprising monitoring means for monitoring the state of the instruction codes stored in the instruction code storage means and generating a control signal according to the monitoring result, wherein the instruction code storage means inspects the contents of the stored instruction codes and generates a detection signal when a predetermined instruction is detected, the first control means supplies all or part of the predetermined instruction code to the second control means when the detection signal is in a predetermined state, and control by the second control means is permitted when the control signal generated by the monitoring means is in a predetermined state.

12. The data processing device according to claim 10, further comprising program counting means and arithmetic operation means capable of incrementing the contents of the program counting means, wherein the arithmetic operation means operates in synchronization with the basic unit time with a phase different from that of the first arithmetic and logic unit.

13. A data processing device which operates by reading instruction codes composed of a basic unit number of bits, comprising: a data bus having a number of bits larger than the basic unit number of bits; instruction code holding means connected to the data bus and capable of holding a plurality of instruction codes; monitoring means for monitoring the state of the instruction codes held in the instruction code holding means; and control means for decoding the instruction codes and controlling instruction execution operations, wherein the control means controls a read operation of an instruction code to be executed later and, when a predetermined instruction code is decoded, is capable of changing the read amount of instruction codes according to the monitoring result of the monitoring means.

14. The data processing device according to claim 13, having a branch instruction as an executable instruction, wherein the control means reads the instruction at the branch destination when decoding and executing the instruction code of the branch instruction, and decodes the content of the instruction read from the branch destination when that content is input.

15. A data processing device comprising: a first arithmetic and logic unit, a second arithmetic and logic unit, arithmetic operation means, register means, program counting means, read data buffer means, and address buffer means; a first internal bus connecting at least the address buffer means, the program counting means, the register means, the first arithmetic and logic unit, the second arithmetic and logic unit, and the arithmetic operation means; a second internal bus connecting at least the address buffer means, the program counting means, and the arithmetic operation means; a third internal bus connecting at least the read data buffer means and the register means; and a fourth internal bus connecting at least the program counting means and the arithmetic operation means, wherein a transfer from the register means to the second arithmetic and logic unit by the first bus and a transfer from the program counting means to the address buffer means by the second bus can be performed in parallel, and a transfer from the read data buffer means to the register means by the third bus and a transfer from the arithmetic operation means to the program counting means by the fourth bus can be performed in parallel.

16. The data processing device according to claim 15, further comprising a fifth internal bus coupling the register means, the first arithmetic and logic unit, and the second arithmetic and logic unit, wherein a transfer from the register means to the first arithmetic and logic unit by the first bus and another transfer from the register means to the first arithmetic and logic unit by the fifth bus can be performed in parallel, and a transfer from the register means to the second arithmetic and logic unit by the first bus and another transfer from the register means to the second arithmetic and logic unit by the fifth bus can be performed in parallel.

17. The data processing device according to claim 15, wherein the first internal bus is further coupled to the read data buffer means, the device further comprising control means which detects the transfer from the register means to the first arithmetic and logic unit by the first bus together with the transfer from the read data buffer means to the register means by the third bus, inhibits output from the register means to the first bus, and instructs the read data buffer means to output to the first bus.
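
The bus arbitration recited in claim 17 resembles an operand bypass and can be sketched as follows. The claim states only that both transfers are detected, that register output to the first bus is inhibited, and that the read data buffer drives the first bus instead. Treating "the same register is both source and destination" as the detection condition is an assumption made here, as are the function and parameter names.

```python
def drive_first_bus(registers, src_reg, rdb_value, rdb_dest_reg, rdb_transfer_active):
    """Return the value placed on the first internal bus for the ALU.

    registers           -- mapping of register name to current contents
    src_reg             -- register the ALU operand is read from
    rdb_value           -- data currently held in the read data buffer
    rdb_dest_reg        -- register the read data buffer is being written to
    rdb_transfer_active -- True while the buffer-to-register transfer is in progress
    """
    # Assumed detection condition: the register being read is the one
    # currently being loaded from the read data buffer.
    conflict = rdb_transfer_active and rdb_dest_reg == src_reg
    if conflict:
        # Register output is inhibited; the buffer drives the bus instead.
        return rdb_value
    return registers[src_reg]
```

In this sketch the arithmetic and logic unit receives the newly read data in the same period in which it is written to the register, rather than the stale register contents.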

18. The data processing device according to claim 15, further comprising write data buffer means, wherein the second internal bus is further coupled to the write data buffer means, and when data is transferred from the program counting means to the address buffer means by the second bus, the data is also transferred to the write data buffer means.

19. The data processing device according to claim 18, wherein the write data buffer means has lower bit correcting means, and the lower bits can be corrected by the correcting means when the contents transferred from the program counting means are output.

20. The data processing device according to claim 18 or 19, further comprising another arithmetic operation means, wherein the other arithmetic operation means is coupled to the first internal bus and the write data buffer means.

21. A data processing device comprising a plurality of register means, wherein each register means can hold data using either its entire area or an area divided into two, the device has addresses with a larger number of bits than one of the divided areas, is capable of executing the same instruction codes as another data processing device having registers corresponding to one of the divided areas, includes the instruction execution functions of the other data processing device, and is capable of executing instructions that use the entire register.
6. The data processing device according to 5.