JPH02300841A

JPH02300841A - Microprocessor and data processor using the same

Info

Publication number: JPH02300841A
Application number: JP12192589A
Authority: JP
Inventors: Toyohiko Yoshida; 豊彦吉田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1989-05-15
Filing date: 1989-05-15
Publication date: 1990-12-13
Anticipated expiration: 2011-08-07
Also published as: JP2522048B2

Abstract

PURPOSE: To eliminate the memory cycle otherwise needed to transfer the PC value of a coprocessor instruction, by providing a program counter value holding means (PC queue) that holds the PC value of the coprocessor instruction in case an exception occurs while that instruction is executed. CONSTITUTION: Coprocessors 11 and 12 execute operations according to commands transferred from a microprocessor 10. When an exception occurs during an operation, an exception handler is started. The handler reads exception information from the coprocessor 11 or 12 and examines it. As necessary, either the PC value of the coprocessor that raised the exception is read from the main processor 10, or the PC queue entry number corresponding to the command that raised the exception is read from the coprocessor and the PC value of the coprocessor instruction that raised the exception is read from the PC queue of the main processor 10 using that number. The coprocessor instruction that raised the exception is thereby identified. As a result, the operation for a coprocessor instruction can be issued to the coprocessor in a single memory cycle.

Description

[Detailed description of the invention]

[Industrial Field of Application]
The present invention relates to a microprocessor that has a high-speed coprocessor interface, transfers commands to a coprocessor in few bus cycles, and can therefore execute coprocessor instructions at high speed, and to a data processing apparatus using that microprocessor. It further relates to a microprocessor, and a data processing apparatus using it, that can identify the program counter (PC) value of the coprocessor instruction that raised an exception when an exception occurs during execution of a coprocessor instruction.

[Prior Art]
In a data processing apparatus that uses a conventional microprocessor as its main processor (Main Processing Unit, hereinafter MPU), the usual arrangement is that the MPU performs integer operations and a dedicated coprocessor performs floating-point operations. For example, Interface Special Issue, "Numerical Processors", CQ Publishing, 1987, describes data processing apparatuses that use a microprocessor as the main processor and various floating-point processors as coprocessors.

In such conventional apparatuses, various measures are taken so that the main processor and the coprocessor share the execution of instructions. FIG. 40 shows one configuration example of a data processing apparatus using a conventional microprocessor and coprocessor.

An MPU 201 that controls the whole apparatus, an FPU (Floating-point Processing Unit) 202 that is a coprocessor, a memory system 203 that stores instructions and data, and a clock generator 204 that supplies the clock CLK to the whole apparatus are connected by control signal lines 211, 212, 213, 214, 215, 216 and so on. Signal line 210 carries the clock CLK, signal line 211 carries the signals BS#, AS#, DS# and R/W#, signal line 212 carries the signal CPST0:2, signal line 213 carries the signal CPDC#, and signal line 214 carries the signal DC#. Signal line 215 is used as the address bus A0:31 and signal line 216 as the data bus D0:31.

FIG. 41 is a timing chart for the transfer of the command, the operand and the PC value of the coprocessor instruction from the MPU 201 to the FPU 202 when a coprocessor instruction is executed in the data processing apparatus of FIG. 40.

In the data processing apparatus using this conventional microprocessor as the MPU 201, the MPU 201 fetches and decodes the coprocessor instruction and, according to the decoded result, the command, the operand and the PC value of the coprocessor instruction are all transferred over the data bus 216.

In the example shown in the timing chart of FIG. 41, the command transfer takes one memory cycle, the operand transfer takes two memory cycles (one each for the upper and lower halves of the data) and the PC value transfer takes one memory cycle, for a total of four memory cycles. This method of transferring the command, the operand and the PC value of the coprocessor instruction is described in detail in "MC68881/MC68882 Floating-Point Coprocessor User's Manual", MOTOROLA, INC., PRENTICE HALL, 1987, Section 7, and in Interface Special Issue, "Numerical Processors", CQ Publishing, 1987, pp. 194-210.

Data processing apparatuses have also been proposed in which, instead of the MPU fetching and decoding the coprocessor instruction and transferring a coprocessor command, the coprocessor itself fetches, decodes and executes the coprocessor instruction. For example, in a data processing apparatus using the Intel i8086 microprocessor as the MPU, the i8087 coprocessor constantly monitors the bus and, when a coprocessor instruction is output onto the bus, the coprocessor fetches, decodes and executes it. This method is described in detail in, for example, Interface Special Issue, "Numerical Processors", CQ Publishing, 1987, pp. 60-74.
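[Problems to Be Solved by the Invention]
In a data processing apparatus using a conventional microprocessor as the MPU, as shown in the timing chart of FIG. 41, the PC value of the coprocessor instruction is transferred together with the command when the MPU instructs the coprocessor to perform the operation for a coprocessor instruction. Issuing the operation therefore requires many memory cycles, which lowers the performance of the whole apparatus. The PC value of a coprocessor instruction is not directly needed to execute that instruction. It is transferred only in case an exception occurs during execution, and in most cases it goes unused and is wasted.

In a data processing apparatus in which the coprocessor itself fetches and decodes coprocessor instructions, the coprocessor must have an instruction prefetch queue and an instruction decoder like those of the MPU, which increases the amount of coprocessor hardware. In addition, when several identical coprocessors are connected and operated in parallel, arbitration among the coprocessors is needed to control which unit executes which coprocessor instruction.

The problems of data processing apparatuses that adopt this approach are described in detail in 内訳美行, "With the dawn of the 32-bit era, the period of full-scale adoption of floating-point coprocessors has arrived", Nikkei Electronics, No. 425, July 13, 1987, pp. 123-138.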
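[Means for Solving the Problems]
The microprocessor used as the main processor of the data processing apparatus of the present invention has a PC queue (program counter value holding means) that holds the PC values of coprocessor instructions.

The PC queue can hold the PC values of a plurality of coprocessor instructions. When the microprocessor fetches and decodes a coprocessor instruction and instructs the coprocessor to perform the operation, it has means for transferring to the coprocessor a command in which the operation specifier is concatenated not with the PC value of the coprocessor instruction but with the entry number of the queue entry that stores that PC value.

Furthermore, for the case where a plurality of coprocessors are connected, the processor status word (PSW) holds a coprocessor identifier (CPID). The CPID identifies the coprocessor to which the command is to be transferred and also serves as the criterion for deciding whether the PC value of the coprocessor instruction is transferred to the coprocessor.

[Operation]
In the data processing apparatus of the present invention, the main processor fetches and decodes a coprocessor instruction and, according to the result, generates the operation specifier to be transferred to the coprocessor. When the value of the coprocessor identifier (CPID) in the processor status word (PSW) is "001", the PC value of the coprocessor instruction is held in one entry of the PC queue. The entry number of the PC queue entry holding the PC value, the operation specifier and the coprocessor identifier in the processor status word are concatenated and transferred to the coprocessor as a command in a single memory cycle. When the value of the CPID is other than "001", the PC value of the coprocessor instruction is also transferred to the coprocessor after the command transfer.

The coprocessor executes the operation according to the command transferred from the microprocessor of the present invention.

If an exception occurs during execution of the operation, an exception handler is started. The exception handler reads exception information from the coprocessor and examines it. If necessary, either the PC value of the coprocessor that raised the exception is read from the main processor, or the PC queue entry number corresponding to the command that raised the exception is read from the coprocessor and the PC value of the coprocessor instruction that raised the exception is read from the PC queue of the main processor using that number. The coprocessor instruction that raised the exception is thereby identified.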
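The PC queue behaviour summarised above can be pictured with a short sketch. This is an illustration only, not the circuit of the patent: the class name `PCQueue` and its methods are hypothetical, and the four-entry size with cyclic overwrite is taken from the later description of PC queue 21.

```python
class PCQueue:
    """Hypothetical software model of the PC queue (a small ring of PC values)."""

    def __init__(self, entries=4):
        self.slots = [None] * entries   # PC value held in each entry
        self.next_entry = 0             # entry the next write will use

    def push(self, pc):
        """Store pc and return the entry number sent along with the command."""
        pcid = self.next_entry
        self.slots[pcid] = pc           # overwrites the oldest entry once full
        self.next_entry = (pcid + 1) % len(self.slots)
        return pcid

    def lookup(self, pcid):
        """Return the PC value for an entry number read back from the coprocessor."""
        return self.slots[pcid]


queue = PCQueue()
ids = [queue.push(pc) for pc in (0x1000, 0x1004, 0x100A, 0x1010)]
faulting_pc = queue.lookup(ids[2])      # handler resolves entry 2 to its PC value
fifth_id = queue.push(0x1016)           # the fifth write reuses entry 0
```

In this model the exception handler needs only the small entry number from the coprocessor, which is why the full PC value never has to travel over the bus with the command.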
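[Embodiments of the Invention]
The present invention is described in detail below with reference to the drawings showing its embodiments.

(1) "Data processing apparatus using the microprocessor of the present invention"
FIG. 1 is a block diagram showing one configuration example of a data processing apparatus using the microprocessor of the present invention.

The apparatus comprises a main processor (MPU) 10 that controls the whole apparatus, two coprocessors that perform floating-point operations (a first FPU 11 and a second FPU 12), a memory system 13 that stores instructions and data, an interrupt control circuit 15 that controls interrupts to the MPU 10, a clock generator 14 that supplies the clock determining the timing of the whole apparatus, and so on.

The MPU 10, the first FPU 11, the second FPU 12 and the memory system 13 are connected by a 64-bit data bus D0:63 (signal line 1) and a 32-bit address bus A0:31 (signal line 2). The MPU 10 and the memory system 13 are also connected by a 32-bit instruction bus I0:31 (signal line 3). The whole apparatus is further coupled by control signal lines 4.

The MPU 10 operates as the bus master of the whole apparatus, and signals on the address bus A0:31 (2) are always driven by the MPU 10. The MPU 10 outputs an address to the memory system 13 over the address bus A0:31 (2), fetches an instruction from the memory system 13 over the instruction bus I0:31 (3), and executes it. When the MPU 10 executes an instruction, it exchanges operands with the memory system 13 over the data bus D0:63 (1) as necessary. When the MPU 10 decodes a coprocessor instruction, it sends a command to the first FPU 11 or the second FPU 12 over the address bus A0:31 (2), and the operation is performed in the FPU 11 or 12.

When an operation is performed in the FPU 11 or 12, the MPU 10 inputs and outputs operands over the data bus D0:63 (1) as necessary. Whether a coprocessor instruction is executed by the first FPU 11 or the second FPU 12 is determined by the coprocessor identifier (CPID) held in the MPU 10 at that time. The CPID can be rewritten by software, so by rewriting the CPID the MPU 10 can have a coprocessor instruction executed by either the first FPU 11 or the second FPU 12.

To identify the program counter (PC) value of a coprocessor instruction that raises an exception during execution in the FPU 11 or 12, registers that hold the PC values of coprocessor instructions are provided in the MPU 10 and in the FPU 12.

The MPU 10 has a PC queue (first FPC 21) that holds a plurality of PC values of coprocessor instructions executed by the first FPU 11. The second FPU 12 has a second FPC 22 that holds the PC value of the coprocessor instruction executed by the second FPU 12. The PC value held in the second FPC 22 is transferred from the MPU 10.

The first and second FPCs 21 and 22 are the registers mentioned above.

The PC value held in the first FPC 21 or the second FPC 22 is used when the corresponding coprocessor instruction raises an exception. When a coprocessor instruction raises an exception in the FPU 11 (12), the FPU 11 (12) first requests an interrupt from the interrupt control circuit 15. The interrupt control circuit 15 then requests interrupt processing from the MPU 10 in accordance with the request from the FPU 11 (12).

This starts the interrupt handler. The interrupt handler reads exception information from the FPU 11 (12) and, according to its content, reads the PC value held in the first FPC 21 or the second FPC 22 and performs exception processing.

The instruction set, processing mechanism and processing method of the microprocessor of the present invention are described in more detail below.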
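(2) "Instruction format of the microprocessor of the present invention"
Instructions of the microprocessor of the present invention are variable in length in units of 16 bits, and there are no instructions with an odd number of bytes.

The microprocessor of the present invention has an instruction format system specially designed so that frequently used instructions have short formats. For example, two-operand instructions have two formats: a general format, which basically has the structure "4 bytes + extension part" and in which all addressing modes can be used, and a short format, in which only frequently used instructions and addressing modes can be used.

FIGS. 11 to 15 are schematic diagrams of the instruction formats of the microprocessor of the present invention. The symbols appearing in these formats have the following meanings.

-: part containing the operation code
Ea: part specifying an operand in the 8-bit general addressing mode
Sh: part specifying an operand in the 6-bit short addressing mode
Rn: part specifying an operand in the register file by register number

In the formats, as shown in FIG. 11, the right side is the LSB side and the higher address. The instruction format cannot be determined without looking at the two bytes at address N and address N+1, but this is acceptable because instructions are always fetched and decoded in units of 16 bits (2 bytes).

In every format, the extension part of the Ea or Sh of each operand is always placed immediately after the halfword containing the basic part of that Ea or Sh. This takes precedence over immediate data implicitly specified by the instruction and over the extension part of the instruction itself. Consequently, in an instruction of 4 bytes or more, the operation code of the instruction may be split by the extension part of an Ea.

As described later, when a further extension is appended to the extension part of an Ea in the multi-level indirect mode, that extension also takes precedence over the next part of the instruction operation code. For example, consider a 6-byte instruction that contains Ea1 in its first halfword, Ea2 in its second halfword, and has a third halfword.

Assume that Ea1 uses the multi-level indirect mode and therefore has a multi-level indirect mode extension in addition to its ordinary extension. The actual instruction bit pattern is then ordered as follows: the first halfword of the instruction (containing the basic part of Ea1), the extension of Ea1, the multi-level indirect mode extension of Ea1, the second halfword of the instruction (containing the basic part of Ea2), the extension of Ea2, and the third halfword of the instruction.

(2.1) "Short-format two-operand instructions"
FIG. 12 is a schematic diagram of the short format of two-operand instructions.

This format comprises the L-format, in which the source operand is in memory, and the S-format, in which the destination operand is in memory.

In the L-format, Sh is the field specifying the source operand, Rn is the field specifying the register of the destination operand, and RR specifies the operand size of Sh. The size of the destination operand in the register is fixed at 32 bits. When the register-side and memory-side sizes differ and the source is smaller, sign extension is performed.

In the S-format, Sh is the field specifying the destination operand, Rn is the field specifying the register of the source operand, and RR specifies the operand size of Sh. The size of the source operand in the register is fixed at 32 bits. When the register-side and memory-side sizes differ and the source is larger, the overflowing part is truncated and an overflow check is performed.

(2.2) "General-format one-operand instructions"
FIG. 13 is a schematic diagram of the general format of one-operand instructions (G1-format).

MM is the field specifying the operand size.

Some G1-format instructions have an extension part in addition to the extension part of Ea. Some instructions do not use MM.

(2.3) "General-format two-operand instructions"
FIG. 14 is a schematic diagram of the general format of two-operand instructions.

This format covers instructions that have at most two operands in the general addressing mode specified with 8 bits. The total number of operands may itself be three or more. EaM is the field specifying the destination operand, MM is the field specifying the destination operand size, EaR is the field specifying the source operand, and RR is the field specifying the source operand size. Some G-format instructions have an extension part in addition to the extension parts of EaM and EaR.

FIG. 15 is a schematic diagram of the format of the short branch instruction.

cccc is the field specifying the branch condition, and disp:8 is the field specifying the displacement to the jump target. In the microprocessor of the present invention, when a displacement is specified with 8 bits, the value given by the bit pattern is doubled to obtain the displacement.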
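The doubling of the 8-bit displacement can be illustrated with a few lines of code. This is a sketch under assumptions that the text does not state: the function name is hypothetical, the disp:8 field is treated as a two's-complement signed value, and the displacement is taken relative to the address passed in as `pc`.

```python
def short_branch_target(pc, disp8):
    """Target of a short branch whose 8-bit field is doubled before use.

    Assumptions (not stated in the source): disp8 is two's-complement
    signed and the displacement is relative to pc.
    """
    if not 0 <= disp8 <= 0xFF:
        raise ValueError("disp:8 must fit in 8 bits")
    signed = disp8 - 0x100 if disp8 & 0x80 else disp8
    return (pc + 2 * signed) & 0xFFFFFFFF   # doubled, wrapped to 32 bits
```

Because instructions are always a multiple of 16 bits long, doubling loses nothing and lets the 8-bit field reach twice as far.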
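(2.4) "Addressing modes"
There are two ways of specifying the addressing mode of an instruction of the microprocessor of the present invention: a short form specified with 6 bits including the register, and a general form specified with 8 bits.

If an undefined addressing mode is specified, or if a combination of addressing modes that is clearly unreasonable in meaning is specified, a reserved instruction exception is raised and exception processing is started, just as when an undefined instruction is executed.

This applies, for example, when the destination is in immediate mode, or when immediate mode is used in an addressing mode specification field that should involve address calculation.

The symbols used in the formats shown in FIGS. 16 to 26 have the following meanings.

Rn: register specification
(Sh): specification in the 6-bit short addressing mode
(Ea): specification in the 8-bit general addressing mode

In the figures, the parts enclosed by dotted lines are extension parts.

(2.4.1) "Basic addressing modes"
The instructions of the microprocessor of the present invention support various addressing modes. Of these, the basic addressing modes are register direct mode, register indirect mode, register relative indirect mode, immediate mode, absolute mode, PC relative indirect mode, stack pop mode and stack push mode.

Register direct mode takes the content of a register directly as the operand. The format is shown in FIG. 16, where Rn is the number of a general-purpose register or an FPU register.

Register indirect mode takes as the operand the content of the memory location whose address is the content of a general-purpose register.

The format is shown in FIG. 17, where Rn is the number of a general-purpose register.

Register relative indirect mode has two variants, depending on whether the displacement is 16 bits or 32 bits. In each, the operand is the content of the memory location whose address is the content of a general-purpose register plus the 16-bit or 32-bit displacement.

The format is shown in FIG. 18, where Rn is the number of a general-purpose register.

disp:16 and disp:32 denote the 16-bit and 32-bit displacement values respectively. The displacement is treated as signed.

Immediate mode takes the bit pattern specified in the instruction code as a binary number and uses it directly as the operand. The format is shown in FIG. 19, where imm_data denotes the immediate value. The size of imm_data is specified in the instruction as the operand size.

Absolute mode has two variants, depending on whether the address value is given in 16 bits or 32 bits. In each, the operand is the content of the memory location whose address is the 16-bit or 32-bit bit pattern specified in the instruction code. The format is shown in FIG. 20, where abs:16 and abs:32 denote the 16-bit and 32-bit address values respectively. When the address is given by abs:16, the specified address value is sign-extended to 32 bits.

PC relative indirect mode has two variants, depending on whether the displacement is 16 bits or 32 bits. In each, the operand is the content of the memory location whose address is the content of the program counter plus the 16-bit or 32-bit displacement. The format is shown in FIG. 21, where disp:16 and disp:32 denote the 16-bit and 32-bit displacement values respectively. The displacement is treated as signed. The program counter value referenced in PC relative indirect mode is the start address of the instruction containing the operand. When the program counter (PC) value is referenced in the multi-level indirect addressing mode, the start address of the instruction is likewise used as the PC-relative reference value.

Stack pop mode takes as the operand the content of the memory location whose address is the content of the stack pointer (SP).

After the operand access, the SP is incremented by the operand size. For example, when 32-bit data is handled, the SP is updated by +4 after the operand access. Stack pop mode can also be specified for operands of size B, H and D, in which case the SP is updated by +1, +2 and +8 respectively.

The format is shown in FIG. 22. If stack pop mode is meaningless for an operand, a reserved instruction exception is raised. Specifically, specifying stack pop mode for a write operand or a read-modify-write operand raises a reserved instruction exception.

Stack push mode takes as the operand the content of the memory location whose address is the content of the SP decremented by the operand size. In stack push mode the SP is decremented before the operand access. For example, when 32-bit data is handled, the SP is updated by -4 before the operand access. Stack push mode can also be specified for operands of 8, 16 and 64 bits, in which case the SP is updated by -1, -2 and -8 respectively.

The format is shown in FIG. 23. If stack push mode is meaningless for an operand, a reserved instruction exception is raised. Specifically, specifying stack push mode for a read operand or a read-modify-write operand raises a reserved instruction exception.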
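The ordering of the stack pointer update relative to the operand access in the two stack modes can be sketched as follows. The function names and the size letter "W" for 32-bit operands are assumptions made for this illustration (the text names only B, H and D).

```python
# Operand sizes in bytes; "W" for 32 bits is an assumed label.
SIZES = {"B": 1, "H": 2, "W": 4, "D": 8}

def stack_pop_address(sp, size):
    """Return (operand_address, new_sp): access at SP first, then increment."""
    return sp, sp + SIZES[size]

def stack_push_address(sp, size):
    """Return (operand_address, new_sp): decrement SP first, then access."""
    new_sp = sp - SIZES[size]
    return new_sp, new_sp
```

For example, popping a 32-bit operand with SP at 0x2000 reads address 0x2000 and leaves SP at 0x2004, while pushing a 64-bit operand writes at 0x1FF8 and leaves SP there.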
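(2.4.2) "Multi-level indirect addressing mode"
Even complex addressing can basically be decomposed into combinations of addition and indirect reference. Therefore, if addition and indirect reference are provided as addressing primitives and can be combined arbitrarily, any addressing mode, however complex, can be realized. The multi-level indirect addressing mode of the instructions of the microprocessor of the present invention is based on this idea. Complex addressing modes are particularly useful for data references between modules and for implementations of AI (Artificial Intelligence) languages.

To specify the multi-level indirect addressing mode, one of three variants is specified in the basic addressing mode specification field: register-based multi-level indirect mode, PC-based multi-level indirect mode, or absolute-based multi-level indirect mode.

Register-based multi-level indirect mode uses the value of a general-purpose register as the base value of the multi-level indirect addressing that follows. The format is shown in FIG. 24, where Rn is the number of a general-purpose register.

PC-based multi-level indirect mode uses the value of the program counter as the base value of the multi-level indirect addressing that follows. The format is shown in FIG. 25.

Absolute-based multi-level indirect mode uses zero as the base value of the multi-level indirect addressing that follows. The format is shown in FIG. 26.

The multi-level indirect mode specification field that follows is 16 bits per unit and is repeated any number of times. One level of the multi-level indirect mode performs the addition of a displacement, the scaling (x1, x2, x4, x8) and addition of an index register, and a memory indirect reference. The format of the multi-level indirect mode is shown in FIG. 27. The fields have the following meanings.

E=0: multi-level indirect mode continues
E=1: address calculation ends; tmp ==> address of operand
I=0: no memory indirect reference; tmp + disp + Rx * Scale ==> tmp
I=1: memory indirect reference; mem[tmp + disp + Rx * Scale] ==> tmp
M=0: <Rx> is used as the index
M=1: special index
  <Rx>=0: no index value is added (Rx=0)
  <Rx>=1: the program counter is used as the index value (Rx=PC)
  <Rx>=2 or more: reserved
D=0: the value of the 4-bit field d4 in the multi-level indirect mode is multiplied by 4 to obtain the displacement, which is added. d4 is treated as signed and is always multiplied by 4 regardless of the operand size.
D=1: dispx (16/32 bits) specified in the extension part of the multi-level indirect mode is used as the displacement, which is added. The size of the extension part is specified in the d4 field.
  d4=0001: dispx is 16 bits
  d4=0010: dispx is 32 bits
XX: index scale (scale = 1/2/4/8)

If x2, x4 or x8 scaling is applied to the program counter, an undefined value is produced as the intermediate value (tmp) after processing of that level.

The effective address obtained by this multi-level indirect mode is then unpredictable, but no exception is raised.

Scaling must not be specified for the program counter.

Variations of the instruction format in the multi-level indirect mode are shown in FIGS. 28 and 29. FIG. 28 shows the variations in which the multi-level indirect mode continues or ends. FIG. 29 shows the variations in displacement size.

If a multi-level indirect mode with an arbitrary number of levels is available, the compiler no longer needs to distinguish cases by the number of levels, which reduces the burden on the compiler. Even if multi-level indirect references are very infrequent, the compiler must always be able to generate correct code. For this reason the format allows an arbitrary number of levels.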
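The level-by-level rule above (tmp + disp + Rx * Scale, with an optional memory reference per level) can be sketched as a small evaluator. This is a simplified illustration, not the hardware: the function name, the dictionary-based stage description and the use of a Python dict as memory are all assumptions, and displacements are taken as already decoded from d4 or dispx. Unlike the hardware, which yields an undefined value without raising an exception, the sketch simply rejects a scaled program counter.

```python
def multilevel_address(base, levels, memory, regs, pc):
    """Evaluate a chain of multi-level indirect stages.

    Each stage is a dict with keys:
      disp     - displacement, already decoded (d4 * 4 or dispx)
      index    - register number, None (no index), or "PC"
      scale    - 1, 2, 4 or 8
      indirect - True for the I=1 form (memory indirect reference)
    The last stage in `levels` plays the role of E=1.
    """
    tmp = base
    for stage in levels:
        if stage["index"] is None:
            rx = 0
        elif stage["index"] == "PC":
            if stage["scale"] != 1:
                raise ValueError("scaling the program counter is not allowed")
            rx = pc
        else:
            rx = regs[stage["index"]]
        addr = (tmp + stage["disp"] + rx * stage["scale"]) & 0xFFFFFFFF
        tmp = memory[addr] if stage["indirect"] else addr
    return tmp


memory = {0x1010: 0x2000}                 # pointer stored at 0x1010
regs = {3: 4}                             # R3 holds an element index
levels = [
    {"disp": 0x10, "index": None, "scale": 1, "indirect": True},
    {"disp": 8, "index": 3, "scale": 4, "indirect": False},
]
operand_address = multilevel_address(0x1000, levels, memory, regs, pc=0x400)
```

Here the first level follows a pointer found at base + 0x10, and the second level indexes into the structure it points to, which is the kind of access a compiler would otherwise have to emit as several instructions.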
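(3) "Coprocessor commands of the microprocessor of the present invention"
The instructions of the microprocessor of the present invention include coprocessor instructions that are executed by the FPU 11 or 12. A coprocessor instruction is decoded by the MPU 10, transferred to the FPU 11 (12) as a command, and executed by the FPU 11 (12). The command has the format shown in FIG. 30 and, as shown in FIG. 31, is transferred using the lower 20 bits of the address bus 2.

The command consists of an FPU instruction field, a PCID field and a CPID field. The FPU instruction field is the operation specifier that indicates the content of the instruction for the FPU 11 (12). The PCID field is a number that distinguishes the PC values of coprocessor instructions held in the MPU 10. When an exception occurs in the FPU 11 (12), it is used to select the PC value of the coprocessor instruction that raised the exception from among the plurality of PC values held in the MPU 10. The CPID field carries the value of the CPID field of the processor status word (PSW) of the MPU 10 and is a number that distinguishes the individual FPUs when a plurality of FPUs are connected to the MPU 10.

FPU instructions have the formats shown in FIGS. 32 to 34. Memory/FPU-register instructions (M-FR instructions), MPU-register/FPU-register instructions (GR-FR instructions) and one-operand instructions consist of an operation code field and an FPU register number field (FR) and have the format shown in FIG. 32. FPU-register/FPU-register instructions (FR-FR instructions) consist of an operation code field and two FPU register number fields (FR1 and FR2) and have the format shown in FIG. 33. Zero-operand instructions have the format shown in FIG. 34.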
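The concatenation of the three command fields can be illustrated by packing them into a 20-bit word. The layout used here is hypothetical: the text gives only the 20-bit total and, elsewhere, a 2-bit entry number, while the 3-bit CPID width is inferred from values such as "001" and the 15-bit FPU instruction field and the field order are assumptions made for the example.

```python
# Assumed field widths (20 bits in total); only the 2-bit PCID is stated in the text.
OP_BITS, PCID_BITS, CPID_BITS = 15, 2, 3

def pack_command(fpu_op, pcid, cpid):
    """Concatenate FPU instruction, PCID and CPID into one 20-bit command word."""
    for value, bits in ((fpu_op, OP_BITS), (pcid, PCID_BITS), (cpid, CPID_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range")
    return (fpu_op << (PCID_BITS + CPID_BITS)) | (pcid << CPID_BITS) | cpid

def unpack_command(word):
    """Split a command word back into (fpu_op, pcid, cpid)."""
    cpid = word & ((1 << CPID_BITS) - 1)
    pcid = (word >> CPID_BITS) & ((1 << PCID_BITS) - 1)
    fpu_op = word >> (PCID_BITS + CPID_BITS)
    return fpu_op, pcid, cpid


word = pack_command(0x1234, pcid=2, cpid=0b001)
```

Whatever the real layout, the point of the design is visible in the sketch: the whole command fits in the address bus width, so it can be sent in one bus cycle.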
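(4) "Configuration of the functional blocks"
FIG. 2 is a block diagram showing one configuration example of the microprocessor of the present invention (the MPU 10 of FIG. 1).

Functionally, the interior of the microprocessor of the present invention is broadly divided into an instruction fetch unit 51, an instruction decode unit 52, a PC calculation unit 53, an operand address calculation unit 54, a micro ROM unit 55, a data operation unit 56 and an external bus interface unit 57. FIG. 2 also shows, separately from these functional blocks, an address output circuit 58 that outputs addresses to the outside, a data input/output circuit 59 that exchanges data with the outside, a control signal input/output circuit 60, and an instruction input circuit 61 that inputs instruction codes.

(4.1) "Instruction fetch unit"
The instruction fetch unit 51 contains an instruction cache, an instruction queue and their control section. These determine the address of the next instruction to be fetched and fetch the instruction from the instruction cache or from the external memory system 13. They also register instructions in the instruction cache.

The address of the next instruction to be fetched is calculated by a dedicated counter as the address of the instruction to be input to the instruction queue. When a jump occurs, the address of the new instruction at the jump target is transferred from the PC calculation unit 53 or the data operation unit 56.

When an instruction is fetched from external memory, the address of the instruction to be fetched is output to the outside from the address output circuit 58 through the external bus interface unit 57, and the instruction code is fetched through the instruction input circuit 61.

Of the buffered instruction codes, the instruction code to be decoded next by the instruction decode unit 52 is output to the instruction decode unit 52.

(4.2) "Instruction decode unit"
The instruction decode unit 52 basically decodes instruction codes in units of 16 bits (halfwords). The instruction decode unit 52 includes an FHW decoder that decodes the operation code contained in the first halfword, an NFHW decoder that decodes the operation codes contained in the second and third halfwords, and an addressing mode decoder that decodes addressing modes.

It also includes a second decoder that further decodes the outputs of the FHW decoder and the NFHW decoder and calculates the entry address of the micro ROM, a branch prediction mechanism that predicts conditional branches, and an address calculation conflict check mechanism that checks for pipeline conflicts during operand address calculation.

Instruction code input from the instruction fetch unit 51 is decoded at 0 to 6 bytes per clock.

Of the decoded results, information on operations in the data operation unit 56 is output to the micro ROM unit 55, information on operand address calculation is output to the operand address calculation unit 54, and information on PC calculation is output to the PC calculation unit 53.

The instruction decode unit 52 decodes both the main processor instructions executed by the microprocessor of the present invention and the coprocessor instructions executed by the coprocessor.

(4.3) "Micro decoder"
The micro ROM unit 55 includes a micro ROM that stores the various microprogram routines for controlling the data operation unit 56, a micro sequencer, a microinstruction decoder and so on. A microinstruction is read from the micro ROM once per clock. In addition to the sequence processing specified by the microprogram, the micro sequencer accepts in hardware the processing of exceptions, interrupts and traps (collectively called EIT) and of test interrupts. The micro ROM unit 55 also manages the store buffer.

The micro ROM unit 55 receives interrupts that do not depend on the instruction code, flag information resulting from operations, and outputs of the instruction decode unit 52 such as the output of the second decoder.

The output of the micro decoder goes mainly to the data operation unit 56, but some information, such as the information for cancelling other preceding processing when a jump instruction is executed, is also output to other functional blocks.

(4.4) "Operand address calculation unit"
The operand address calculation unit 54 is hardwired-controlled by the information on operand address calculation output from the address decoder and other parts of the instruction decode unit 52. This unit performs most of the processing for operand address calculation. It also checks whether the address of a memory access for memory indirect addressing, or an operand address, falls within the memory-mapped I/O area. Operand address calculation for coprocessor instructions is also performed here.

The result of the address calculation is sent to the external bus interface unit 57. The values of the general-purpose registers and the program counter needed for address calculation are input from the data operation unit 56.

When memory indirect addressing is performed, the memory address to be referenced is output to the outside from the address output circuit 58 through the external bus interface unit 57, and the indirect address value input through the data input/output circuit 59 is fetched by passing straight through the instruction decode unit 52.

(4.5) "PC calculation unit"
The PC calculation unit 53 is hardwired-controlled by the information on PC calculation output from the instruction decode unit 52 and calculates the PC value of each instruction. The instructions of the microprocessor of the present invention are variable in length, and the length of an instruction cannot be determined until the instruction has been decoded. The PC calculation unit 53 generates the PC value of the next instruction by adding the instruction length output from the instruction decode unit 52 to the PC value of the instruction being decoded.

The result calculated by the PC calculation unit 53 is output as the PC value of each instruction together with the decoded result of that instruction.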
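The PC calculation just described amounts to a running sum of decoded instruction lengths. The following sketch illustrates it; the function name is hypothetical and the lengths are given in bytes.

```python
def instruction_pcs(start_pc, lengths):
    """PC value of each instruction, given its decoded length in bytes."""
    pcs = []
    pc = start_pc
    for length in lengths:
        if length <= 0 or length % 2:
            raise ValueError("instruction lengths are positive multiples of 2 bytes")
        pcs.append(pc)
        pc += length          # next PC = PC of this instruction + its length
    return pcs
```

For a 2-byte, a 6-byte and a 4-byte instruction starting at 0x100, the PC values are 0x100, 0x102 and 0x108. Each of these is the value that would be stored in the PC queue if the instruction were a coprocessor instruction.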
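(4.6) "Data operation unit"
The data operation unit 56 is controlled by the microprogram. In accordance with the microinstructions output from the micro ROM, it executes the operations needed to realize the function of each instruction using the register file and the arithmetic units.

When the operand of an instruction is an address or an immediate value, the address or immediate value calculated by the operand address calculation unit 54 passes through the external bus interface unit 57 and is input to the data operation unit 56. When the operand of an instruction is data in memory, the bus interface unit 57 outputs the address calculated by the address calculation unit 54 from the address output circuit 58, and the operand fetched from the memory system 13 is input to the data operation unit 56 through the data input/output circuit 59.

For coprocessor instructions, the data operation unit 56 transfers the necessary commands and operands under microprogram control, using the defined coprocessor protocol, through the external bus interface unit 57, the address output circuit 58, the data input/output circuit 59, the control signal input/output circuit 60 and so on.

When the memory system 13 must be accessed during a data operation, the address is output from the address output circuit 58 to the outside of the MPU 10 through the external bus interface unit 57 as directed by the microprogram, and the target data is fetched through the data input/output circuit 59.

When data is stored in the memory system 13, the address is output from the address output circuit 58 through the external bus interface unit 57 and, at the same time, the data is output from the data input/output circuit 59 to the outside of the MPU 10. To make operand stores efficient, the data operation unit 56 has a 4-byte store buffer.

When the data operation unit 56 obtains a new instruction address as a result of jump instruction processing, exception processing or the like, it outputs this address to the instruction fetch unit 51 and the PC calculation unit 53.

FIG. 5 is a block diagram showing the detailed configuration of the data operation unit 56, part of the micro ROM unit 55 and part of the bus interface unit 57 of the microprocessor of the present invention.

The data operation unit 56 contains a register file 103 consisting of general-purpose registers and working registers; a processor status word (PSW) 104 consisting of processor state bits and flags; a data operation circuit 105 consisting of an ALU, a barrel shifter and so on; an alignment circuit 106 that rotates data byte by byte into word alignment when data is exchanged with memory; a 4-byte/8-byte conversion circuit 107 that converts data between the 8-byte external bus and the 4-byte internal bus; the internal data buses S1 bus, S2 bus and DO bus; and a PC queue 21 that holds up to four PC values of coprocessor instructions when commands are transferred to the coprocessor in accordance with those instructions.

The main arithmetic units are interconnected by the S1 bus, S2 bus and DO bus, and one microinstruction specifying one register-to-register operation is processed in one clock cycle.

The PC value of a coprocessor instruction held in the PC queue 21 is read and used by the exception handler when that coprocessor instruction raises an exception such as overflow, underflow or division by zero. Values are written cyclically into the four entries of the PC queue 21, and the fifth write overwrites the entry that was written first. The PC queue 21 therefore always holds the four most recent PC values.

The structure of the PSW 104 is shown schematically in FIG. 35.

The PSW 104 consists of an operation mode field indicating the operating mode of the processor, an IMASK field indicating the mask level for external interrupts, a CPID field used to identify the destination coprocessor when a command is transferred to a coprocessor, and a flag field containing the carry flag, the zero flag and other flags that change according to the results of operations.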
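(4.7) "External bus interface unit"
The external bus interface unit 57 controls communication at the input/output pins of the microprocessor of the present invention.

All memory accesses are performed in synchronization with the clock and can be completed in a minimum of two clock cycles.

The external bus interface unit 57 has a data cache, and no memory access is performed when an operand read hits the data cache. Operand writes are performed to both the data cache and memory. When a coprocessor writes an operand to memory, the unit reads in that operand through the data input/output circuit 59 and updates the data cache content at the same address.

Memory access requests arise independently from the instruction fetch unit 51, the address calculation unit 54 and the data operation unit 56. The external bus interface unit 57 arbitrates among these requests. In addition, when data is accessed at a memory location that straddles an alignment boundary of 64 bits (a double word), which is the size of the data bus connecting memory and the microprocessor of the present invention, this block automatically detects the boundary crossing and splits the access into two memory accesses.

The processing that prevents conflicts when a prefetched operand overlaps an operand being stored, and the bypass from a store operand to a fetch operand, are also performed in the external bus interface unit 57.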
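The automatic split of an access that straddles a 64-bit boundary can be sketched as follows. The function name and the return shape (a list of address and byte-count pairs) are assumptions made for the illustration.

```python
BUS_BYTES = 8   # width of the 64-bit data bus in bytes

def split_access(address, size):
    """Split an access of `size` bytes (at most 8) into bus-aligned pieces."""
    if not 1 <= size <= BUS_BYTES:
        raise ValueError("size must be 1..8 bytes")
    first = min(size, BUS_BYTES - address % BUS_BYTES)
    pieces = [(address, first)]
    if first < size:                       # crosses a double-word boundary
        pieces.append((address + first, size - first))
    return pieces
```

A 4-byte access at 0x1006, for instance, becomes a 2-byte access at 0x1006 followed by a 2-byte access at 0x1008, whereas an aligned 8-byte access stays a single memory cycle.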
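(5) "Pipeline processing"
The microprocessor of the present invention achieves high performance by processing instructions in a pipeline. The pipeline processing method of the microprocessor is described first.

(5.1) "Pipeline processing mechanism"
The pipeline processing of the microprocessor of the present invention is shown schematically in FIG. 4.

The pipeline is based on five stages: an instruction fetch stage (IF stage) 31 that prefetches instructions, a decode stage (D stage) 32 that decodes instructions, an operand address calculation stage (A stage) 33 that calculates operand addresses, an operand fetch stage (F stage) 34 that performs micro ROM access (specifically called the R stage 36) and operand prefetch (specifically called the OF stage 37), and an execution stage (E stage) 35 that executes instructions.

The E stage 35 has a one-level store buffer, and for some high-function instructions the execution itself is pipelined, so in practice the pipeline is effectively more than five stages deep.

Each stage operates independently of the others, and in theory the five stages operate completely independently. Each stage can perform one unit of processing in a minimum of one clock. Ideally, therefore, pipeline processing advances one step every clock.

The microprocessor of the present invention has instructions, such as memory-to-memory operations and memory indirect addressing, that cannot be handled in a single pass through the basic pipeline, but it is designed so that pipeline processing remains as balanced as possible for these as well. An instruction with a plurality of memory operands is decomposed at the decode stage into a plurality of pipeline processing units (step codes) according to the number of memory operands, and these are processed in the pipeline.

The decomposition method is described in detail in Japanese Patent Application No. 61-236456.

The information passed from the IF stage 31 to the D stage 32 is the instruction code itself. Two kinds of information are passed from the D stage 32 to the A stage 33: information on the operation specified by the instruction (called the D code 41) and information on operand address calculation (called the A code 42). Two kinds of information are passed from the A stage 33 to the F stage 34: the R code 43, which contains the entry address of the microprogram routine, parameters for the microprogram and so on, and the F code 44, which contains the operand address, access method information and so on. Two kinds of information are passed from the F stage 34 to the E stage 35: the E code 45, which contains operation control information, literals and so on, and the S code 46, which contains the operand, the operand address and so on.

An EIT detected in a stage other than the E stage 35 does not start EIT processing until its code reaches the E stage 35. Only the instruction being processed in the E stage 35 is in the execution phase. Instructions being processed in the IF stage 31 through the F stage 34 have not yet reached execution.

Therefore, when an EIT is detected outside the E stage 35, the fact that it was detected is merely recorded in the step code and passed on to the next stage.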
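(5.2) "Processing in each pipeline stage"
For convenience, the input and output step codes of each pipeline stage are given the names shown in FIG. 4. There are two kinds of step codes: those that carry processing related to the operation code and become the entry address of the micro ROM, parameters for the E stage 35 and so on, and those that become operands for the microinstructions of the E stage 35.

(5.2.1) "Instruction fetch stage"
The instruction fetch stage (IF stage) 31 fetches instructions from the instruction cache or the memory system 13, inputs them to the instruction queue, and outputs instruction codes to the D stage 32. Input to the instruction queue is in units of aligned 4 bytes. Fetching an instruction from the memory system 13 takes a minimum of two clocks per aligned 4 bytes. When the instruction cache hits, an aligned 4 bytes can be fetched in one clock. The output unit of the instruction queue is variable in steps of 2 bytes, and up to 6 bytes can be output in one clock.

Immediately after a jump, the instruction queue can also be bypassed and the 2-byte basic part of the instruction transferred directly to the instruction decoder.

Control of the registration and clearing of instructions in the instruction cache, management of the prefetch destination instruction address and control of the instruction queue are also performed in the IF stage 31.

(5.2.2) "Instruction decode stage"
The instruction decode stage (D stage) 32 decodes the instruction codes input from the IF stage 31. Decoding is performed once per clock using the FHW decoder, the NFHW decoder and the addressing mode decoder of the instruction decode unit 52, and one decode operation consumes 0 to 6 bytes of instruction code (no instruction code is consumed in, for example, the output of the step code containing the return address of a return-from-subroutine instruction). One decode operation outputs to the A stage 33 the A code 42, which is address calculation information consisting of a control code of about 35 bits and address modifier information of up to 32 bits, and the D code 41, which is the intermediate decode result of the operation code consisting of a control code of about 50 bits and 8 bits of literal information.

The D stage 32 also controls the PC calculation unit 53 for each instruction and controls the output of instruction codes from the instruction queue.

(5.2.3) "Operand address calculation stage"
Processing in the operand address calculation stage (A stage) 33 falls broadly into two parts. One is the later-stage decoding of the operation code using the second decoder of the instruction decode unit 52. The other is the calculation of the operand address in the operand address calculation unit 54.

The later-stage decoding of the operation code takes the D code 41 as input and outputs the R code 43, which contains the write reservations for registers and memory, the entry address of the microprogram routine, parameters for the microprogram and so on. The write reservations for registers and memory prevent an incorrect address calculation that would occur if the content of a register or memory location referenced in the address calculation were rewritten by an instruction preceding it in the pipeline.

The operand address calculation takes the A code 42 as input, performs the address calculation in the operand address calculation unit 54 by combining addition and memory indirect reference accordingly, and outputs the result as the F code 44. During this processing, a conflict check is performed when registers and memory are read for the address calculation. If a conflict is indicated because a preceding instruction has not finished writing to the register or memory location, the stage waits until the preceding instruction completes its write in the E stage 35.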
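(5.2.4) "Micro ROM access stage"
Processing in the operand fetch stage (F stage) 34 also falls broadly into two parts. One is the micro ROM access, specifically called the R stage 36. The other is operand prefetch, specifically called the OF stage 37. The R stage 36 and the OF stage 37 do not necessarily operate at the same time. They operate independently, depending on factors such as whether the memory access right can be obtained.

The micro ROM access performed in the R stage 36 consists of accessing the micro ROM and decoding the microinstruction in order to generate, from the R code 43, the E code 45, which is the execution control code used for execution in the following E stage 35. When the processing for one R code 43 is decomposed into two or more microprogram steps, the micro ROM is in use by the E stage 35 and the next R code 43 waits for micro ROM access. The micro ROM access for an R code 43 is performed when the last microinstruction of the preceding E stage 35 processing is executed. In the microprocessor of the present invention, most basic instructions are completed in one microprogram step, so in practice micro ROM accesses for successive R codes 43 usually follow one after another.

(5.2.5) "Operand fetch stage"
The operand fetch stage (OF stage) 37 performs the operand prefetch part of the two kinds of processing carried out in the F stage 34. Operands are prefetched from the data cache or the memory system 13.

Operand prefetch takes the F code 44 as input and outputs the fetched operand and its address as the S code 46. One F code 44 may straddle a double-word boundary, but it specifies an operand fetch of 8 bytes or less. The F code 44 also specifies whether an operand access is to be performed at all. When the operand address itself calculated in the A stage 33, or an immediate value, is to be transferred to the E stage 35, no operand prefetch is performed and the content of the F code 44 is transferred as the S code 46. When the operand to be prefetched coincides with an operand that the E stage 35 is about to write, the operand is not prefetched from the data cache or memory but is bypassed.

(5.2.6) "Execution stage"
The execution stage (E stage) 35 operates with the E code 45 and the S code 46 as inputs. The E stage 35 is the stage that executes instructions, and all processing performed in the F stage 34 and earlier stages is preprocessing for the E stage 35. When a jump instruction is executed or EIT processing is started in the E stage 35, all processing in the IF stage 31 through the F stage 34 is invalidated. The E stage 35 is controlled by the microprogram and executes an instruction by executing a series of microinstructions starting at the entry address of the microprogram routine indicated by the R code 43.

Reading of the micro ROM and execution of microinstructions are pipelined. Consequently, when a branch occurs in the microprogram, a gap of one microstep arises. Using the store buffer in the data operation unit 56, the E stage 35 can also pipeline an operand store of 4 bytes or less with the execution of the next microinstruction.

In the E stage 35, the write reservations for registers and memory made in the A stage 33 are released after the operand has been written.

The various interrupts are accepted directly by the E stage 35 at instruction boundaries, and the necessary processing is executed by the microprogram. The processing of other EITs is also executed by the microprogram.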
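(5.3) "State control of each pipeline stage"
Each stage of the pipeline has an input latch and an output latch and basically operates independently of the other stages. A stage starts its next processing when its previous processing has finished, the result has been transferred from its output latch to the input latch of the next stage, and all the input signals needed for the next processing are present in its own input latch.

In other words, each stage starts its next processing when all the input signals for that processing, output from the immediately preceding stage, have become valid and its output latch has become empty because the current result has been transferred to the input latch of the following stage.

Put another way, all the input signals must be ready at the moment just before a stage starts operating. If the input signals are not ready, the stage enters a wait state (input wait). A transfer from the output latch to the input latch of the next stage requires that input latch to be empty. If the input latch of the next stage is not empty, the pipeline stage also enters a wait state (output wait). If the required memory access right cannot be obtained, if wait states are inserted into the memory access being processed, or if some other pipeline conflict occurs, the processing of the stage itself is delayed.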
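The input-wait and output-wait conditions can be modelled with a toy stage that owns one input latch and one output latch. This is a conceptual sketch only: the class `Stage`, its `step` method and the string results are hypothetical, and each call to `step` stands for one clock in which the stage first tries to hand off its previous result and then tries to start new work.

```python
class Stage:
    """Toy pipeline stage with one input latch and one output latch."""

    def __init__(self, name):
        self.name = name
        self.input_latch = None
        self.output_latch = None

    def step(self, next_stage=None):
        """Advance one clock; return 'output wait', 'input wait' or 'run'."""
        if self.output_latch is not None:
            if next_stage is not None and next_stage.input_latch is not None:
                return "output wait"          # downstream latch is still full
            if next_stage is not None:
                next_stage.input_latch = self.output_latch
            self.output_latch = None          # result handed off (or retired)
        if self.input_latch is None:
            return "input wait"               # nothing to process yet
        self.output_latch = f"{self.name}({self.input_latch})"
        self.input_latch = None
        return "run"


d_stage, a_stage = Stage("D"), Stage("A")
d_stage.input_latch = "insn"
first = d_stage.step(a_stage)     # processes the instruction
a_stage.input_latch = "busy"
second = d_stage.step(a_stage)    # cannot hand off: downstream latch is full
a_stage.input_latch = None
third = d_stage.step(a_stage)     # hands off, then has no new input
```

The model shows why a stall in a late stage propagates backwards: a full input latch downstream keeps the upstream output latch occupied, which in turn keeps that stage from starting new work.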
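(6) "Detailed circuit connection diagram of the data processing apparatus using the microprocessor of the present invention"
FIG. 3 shows the connections of a data processing apparatus that uses the microprocessor of the present invention as the MPU 10 together with the first FPU 11, the memory system 13, the clock generator 14 and the interrupt control circuit 15.

The first FPU 11 of the apparatus can accept and queue up to three coprocessor instructions. Because the MPU 10 has a four-entry PC queue, even when an exception occurs during execution of a coprocessor instruction, the PC value of that instruction is still held in the PC queue 21 and is not destroyed by the writing of PC values for subsequent coprocessor instructions. The first FPU 11 is assumed to have been assigned coprocessor number 001, corresponding to a CPID value of "001" in the PSW 104 of the MPU 10.

The system clock CLK is generated by the clock generator 14 and input to the MPU 10, the first FPU 11 and the memory system 13 through signal line 70. BAT0:1 on signal line 71 indicates the type of bus cycle. CDE# on signal line 72 is the data output control signal for the first FPU 11, which is a coprocessor. BS#, AS#, DS# and R/W# on signal line 73 are the other bus cycle control signals output by the MPU 10. CPST0:2 on signal line 74 is the control signal that conveys the internal state of the first FPU 11 to the MPU 10. CPDC# on signal line 75 is the bus cycle completion control signal output by the first FPU 11. DC# on signal line 76 is the bus cycle completion control signal output by the memory system. Signal line 1 is used as the data bus D0:63 and signal line 2 as the address bus A0:31.

For interrupts, there are signal line 79, on which the memory system 13 requests an interrupt from the interrupt control circuit 15; signal line 80, on which an interrupt is requested from the interrupt control circuit 15 when an exception occurs in the first FPU 11; and the signal IRL0:2 on signal line 78, which requests an interrupt from the MPU 10 after those interrupt requests have been sorted out by the interrupt control circuit 15.

I0:31 on signal line 3 is the instruction bus that transfers instruction codes from the memory system 13 to the MPU 10. All signals operate in synchronization with the system clock.

The meaning of the signal BAT0:1 on signal line 71 is given in the table of FIG. 9. The contents of the address bus A0:31 (2) and the data bus D0:63 (1) are defined for each of the values "00", "01", "10" and "11". The MPU 10, the first FPU 11 and the memory system 13 ignore the content of any undefined part.

The meaning of the signal CPST0:2 on signal line 74 is given in the table of FIG. 10. The value "001" indicates that the first FPU 11 is preparing to output data. The value "010" indicates that the command was accepted normally. The value "011" indicates that the first FPU 11 has completed preparation for data output. The value "101" indicates that the first FPU 11 is requesting retransfer of the command. The value "110" indicates that an error occurred during command transfer. The value "111" indicates that the first FPU 11 is not connected. The values "000" and "100" are undefined.

The signal IRL0:2 on signal line 78 requests interrupt processing from the MPU 10 according to its value. The MPU 10 decides whether to accept or mask the interrupt according to IRL0:2 and the value of the IMASK field in the PSW (104).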
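The CPST0:2 codes listed above lend themselves to a simple lookup table. The sketch below is illustrative: the names `CPST_MEANING` and `decode_cpst` are hypothetical, and the three-character strings in the text are read as ordinary binary numbers, which assumes CPST0 is the most significant bit.

```python
# CPST0:2 codes from the description; "000" and "100" are undefined.
CPST_MEANING = {
    0b001: "preparing data output",
    0b010: "command accepted",
    0b011: "data output ready",
    0b101: "retransfer of command requested",
    0b110: "error during command transfer",
    0b111: "coprocessor not connected",
}

def decode_cpst(value):
    """Return the meaning of a 3-bit CPST0:2 value, or 'undefined'."""
    if not 0 <= value <= 0b111:
        raise ValueError("CPST0:2 is a 3-bit value")
    return CPST_MEANING.get(value, "undefined")
```

A bus model or test bench could use such a table to check, for example, that a store from an FPU register passes through "command accepted" before reaching "data output ready".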
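(6.1) "Memory access operation"
FIG. 6 is a timing chart of the memory access operation of the data processing apparatus whose configuration is shown in FIG. 3.

For sufficiently fast memory, memory accesses are performed at a rate of one per two clocks of the clock BCLK. FIG. 6 shows, in order, a zero-wait data read cycle, a zero-wait data write cycle, an instruction read cycle with one clock of wait, and a data write cycle with one clock of wait. CLK and BCLK in the figure are the system clocks supplied through signal line 70.

BCLK is a bus clock with twice the period of CLK and defines the odd-numbered and even-numbered pulses of CLK. The microprocessor of the present invention, which operates in synchronization with CLK, is synchronized with BCLK at system reset.

In a data read cycle, the address is output with BAT0:1 = "00". The value on the data bus D0:63 (1) is taken in when DC# (76) is asserted at a falling edge of CLK while DS# and BCLK are low, and the bus cycle ends.

In a data write cycle, with BAT0:1 = "00", the address is output first and the data is output onto the data bus D0:63 (1) one clock later. The bus cycle ends if DC# (76) is asserted at a falling edge of CLK while DS# and BCLK are low.

In an instruction read cycle, the address is output with BAT0:1 = "01". The value on the instruction bus I0:31 (3) is taken in when DC# (76) is asserted at a falling edge of CLK while DS# and BCLK are low, and the bus cycle ends.

In this way, in processor mode, the microprocessor of the present invention performs input and output with the outside by starting bus cycles synchronized with the clock CLK.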
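(6.2) "FPU access operation"
FIG. 7 shows an example of a timing chart for the transfer of commands from the MPU 10 to the first FPU 11 in the data processing apparatus whose configuration is shown in FIG. 3. A command transfer is performed as one kind of bus cycle.

FIG. 7 shows, in order, a zero-wait transfer cycle of the command and operand for a memory/FPU-register (M-FR) FPU instruction, a zero-wait data read cycle from memory, a one-wait transfer cycle of the command for an FPU-register/FPU-register (FR-FR) FPU instruction, and a transfer cycle of the command, operand and PC value for an MPU-register/FPU-register (GR-FR) FPU instruction.

In the transfer cycle of the command and operand for a memory/FPU-register FPU instruction, the command is output on bits 16 to 31 of the address bus A0:31 (2) with BAT0:1 = "10", and the MPU 10 ends the bus cycle if CPDC# (75) is asserted at a falling edge of CLK while DS# and BCLK are low. The first FPU 11 receives the command from the address bus 2 and the operand from the data bus D0:63 (1). At this time the state of the first FPU 11 is reported to the MPU 10 by the value of CPST0:2 (74).

The transfer cycle of the command for an FPU-register/FPU-register FPU instruction is the same as the transfer cycle of the command and operand for a memory/FPU-register FPU instruction, except that no operand is transferred.

The transfer cycle of the command, operand and PC value for an MPU-register/FPU-register FPU instruction is also similar to the transfer cycle of the command and operand for a memory/FPU-register FPU instruction.

The difference is that the operand is transferred using only the lower 4 bytes of the data bus 1.

FIG. 8 is a timing chart for the transfer of the command and operand for an FPU-register/memory FPU instruction in the data processing apparatus whose configuration is shown in FIG. 3. In this case the command is first transferred from the MPU 10 to the first FPU 11, and the operand is then transferred from the first FPU 11 to the MPU 10 and the memory system 13.

In the command transfer cycle, the command is output on the address bus A0:31 (2) with BAT0:1 = "10", and the MPU 10 ends the bus cycle if CPDC# (75) is asserted at a falling edge of CLK while DS# and BCLK are low.

The first FPU 11 receives the command from the address bus 2.

At this time the state of the first FPU 11 is reported to the MPU 10 by the value of CPST0:2 (74). In this example the command is for an FPU-register/memory FPU instruction, the first FPU 11 receives the command normally, and CPST0:2 (74) changes in sequence through "010", "001" and "011". After the command transfer, the MPU 10 watches the value of CPST0:2, asserts CDE# once it has become "011", and starts the bus cycle for the operand transfer one clock later.

In this cycle, with BAT0:1 = "00", the MPU 10 outputs onto the address bus A0:31 (2) the memory address at which the operand is to be stored, and the first FPU 11 outputs the operand onto the data bus D0:63 (1). The operand is taken in by both the MPU 10 and the memory system 13. The MPU 10 takes in the operand in order to update its internal data cache.

The protocol and timing when the MPU 10 accesses the second FPU 12 are basically the same as when the MPU 10 accesses the first FPU 11. The only difference is that, immediately after the memory cycle that transfers the command, there is a memory cycle with BAT0:1 = "11" that transfers the PC value of the coprocessor instruction from the MPU 10 to the second FPU 12.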
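The difference between issuing a command to the first and the second FPU can be summarised as a list of bus cycle types. The sketch below is a simplification with a hypothetical function name: it represents each cycle by its BAT0:1 code and ignores wait states and any separate operand cycles.

```python
def command_bus_cycles(psw_cpid):
    """BAT0:1 codes of the bus cycles that issue one coprocessor command."""
    cycles = ["10"]               # command (and operand) transfer
    if psw_cpid != "001":
        cycles.append("11")       # extra cycle carrying the PC value
    return cycles
```

With CPID "001" the PC value stays in the PC queue and a single cycle suffices. With any other CPID a second cycle is spent on the PC value, which is still fewer than the four cycles of the conventional example described in the prior art section.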
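(7) "Examples of coprocessor instruction processing"
FIGS. 36 to 38 are flowcharts of example procedures for executing a coprocessor instruction in the first FPU 11.

The CPID of the first FPU 11 is "001", and in these examples the CPID value in the PSW 104 of the MPU 10 is also "001", so the PC value of the coprocessor instruction is not transferred to the first FPU 11.

FIG. 36 is a flowchart of the procedure from the fetching by the MPU 10 of a coprocessor instruction that is a memory/FPU-register FPU instruction to its execution by the first FPU 11.

FIG. 37 is a flowchart of the procedure from the fetching by the MPU 10 of a coprocessor instruction that is an FPU-register/memory FPU instruction to its execution by the first FPU 11.

FIG. 38 is a flowchart of the procedure in which a coprocessor instruction that is a memory/FPU-register FPU instruction is fetched by the MPU 10, is executed by the first FPU 11 and raises an exception, the first FPU 11 requests an interrupt from the MPU 10 through the interrupt control circuit 15, and the MPU 10 performs exception processing.

The exception is processed by the interrupt handler.

The microprocessor of the present invention has an instruction that reads the PCID from the first FPU 11 and a privileged instruction that reads the PC value of the coprocessor instruction from the corresponding entry of the PC queue 21 of the MPU 10 on the basis of the PCID. The interrupt handler uses these instructions to read the PCID and the PC value of the coprocessor instruction.

FIG. 39 is a flowchart of the procedure from the fetching by the MPU 10 of a coprocessor instruction that is a memory/FPU-register FPU instruction to its execution by the second FPU 12.

The CPID of the second FPU 12 is "010", and the figure shows the case in which the coprocessor instruction is executed while the CPID value in the PSW 104 of the MPU 10 is "010".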
コマンド社送とオペランド転送とを同一のメモリサイク
ルで行っているが、必ずしも１つのメモリサイクルで行
う必要はない。オペランドのビ。ト数がデータバスｌのビット数より大きい場合はオペラ
ンド転送を複数のメモリサイクルで行ってもよい。また本発明ではコプロセッサ命令の実行時の例外を割込
みにより処理し、第１ＦＦｕｌｌからＰＣＩＯを涜出す
動作とＰＣキュー２１のエントりに蓄えられたコプロセ
ッサ命令のＰＣ値をｈ１旧Ｏへ読出ず動作とをハンドラ
中の命令で行っている。しかし、コプロセッサ命令の実
行時に発生した例外を専用信号でＭＰＵｌ０へ通知し、
ＭＰＵ１０がハードウェア動作としてＰＣＩＤの読出し
とＰＣキュー２１のエントリの内容の読出しとを行い、
例外処理ハンドラへ渡す情報として例外を起こしたコプ
ロセッサ命令のＰＣ値をスタックに積んでもよい。この
場合、例外処理ハンドラは例外を起こしたコプロセッサ
命令のＰＣ値をスタック中から得るため、ハンドラでは
Ｐｃ１口を読出したり、ＰＣキュー２１の内容を読出し
たりする必要はない。し発明の効果］以上に詳述した如く、本発明のマイクロプロセッサでは
コプロセッサ命令の実行時の例外発生に備えてコプロセ
ッサ命令のｐｃ値を保持するプログラムカウンタ値保持
手段（ＰＣキュー）を備えているため、コプロセッサ命
令に対応するコマンドを転送する際、コプロセッサ命令
のｐｃ値を転送するためのメモリサイクルを必要としな
い。従って、１メモリザイクルでコプロセッサ命令に対
するコプロセッサへの動作指示が可能であり、高性能な
データ処理装置を得ることが可能になる。また、プログラムカウンタ値保持手段（１）Ｃキュー）
に複数のエントリを備え、そのエントリ番号をコマンド
と連結してコブロセソ’Ｊへ転送するため、コプロセッ
サに対して対応するプログラムカウンタ値保持機手段の
ＰＣ（ａを区別しつつ複数のコマンドをキヱーイングし
て転送することが可能である。エントリ番号は２ビツト
でコプロセッサ命令のＰＣ値３２ビットに比べ非常に小
さいビット数であり、コマンドとエントリ番号とを連結
しても１回のバスサイクルで３２ビツトのアドレスバス
２を通じて全体を転送可能である。このため、ｌメモリ
サイクルでコプロセッサ命令に対するコプロセッサへの
動作指示が可能になるので、本発明のマイクロプロセッ
サを用いて高性能なデータ処理装置を構成することが可
能である。また、コプロセッサ識別子（ＣＰＩＯ）の値によりコプ
ロセッサ命令のｐｃ値をコプロセッサへ転送するか否か
を判別しているため、複数のコブロセ、すを接続して、
コプロセッサ識別子を変更しつつ同時に複数のコプロセ
ッサを並列動作させる場合においても１つのプログラム
カウンタ値保持機構で特定の１つのコプロセッサについ
てはコプロセッサ命令のｐｃ値を保持することが可能で
ある。従って、プログラムカウンタ値保持手段の効果を
…なうことなく、複数のコプロセッサを接続したデータ
処理装置を構成することが可能である。[]
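The queueing scheme summarized above can be illustrated with a small Python sketch. It is only a model under stated assumptions, not the patented circuit: the description gives a 2-bit entry number, a 3-bit CPID such as "001", and a 20-bit command, but the field order inside the command, the resulting 15-bit width of the FPU instruction field, the PCID value sent when the PC is not queued, and every name below (`PCQueue`, `pack_command`, `issue`) are hypothetical.

```python
from dataclasses import dataclass, field

PC_QUEUE_ENTRIES = 4   # four entries, so the entry number (PCID) fits in 2 bits
PCID_BITS = 2
CPID_BITS = 3          # CPID values are written as 3-bit strings such as "001"
FPU_OP_BITS = 15       # assumption: 20-bit command minus the PCID and CPID fields
QUEUED_CPID = 0b001    # CPID for which the PC value stays in the MPU's PC queue


@dataclass
class PCQueue:
    """Cyclic store for PC values of issued coprocessor instructions."""
    entries: list = field(default_factory=lambda: [0] * PC_QUEUE_ENTRIES)
    next_entry: int = 0

    def push(self, pc: int) -> int:
        """Store a 32-bit PC value and return its entry number (PCID)."""
        pcid = self.next_entry
        self.entries[pcid] = pc & 0xFFFFFFFF
        self.next_entry = (pcid + 1) % PC_QUEUE_ENTRIES  # oldest entry is reused
        return pcid

    def read(self, pcid: int) -> int:
        """What an exception handler would do with the PCID read from the FPU."""
        return self.entries[pcid]


def pack_command(fpu_op: int, pcid: int, cpid: int) -> int:
    """Concatenate the three fields (assumed order: op | PCID | CPID)."""
    if not 0 <= fpu_op < (1 << FPU_OP_BITS):
        raise ValueError("FPU operation does not fit the assumed field width")
    if not 0 <= pcid < (1 << PCID_BITS) or not 0 <= cpid < (1 << CPID_BITS):
        raise ValueError("PCID or CPID out of range")
    return (fpu_op << (PCID_BITS + CPID_BITS)) | (pcid << CPID_BITS) | cpid


def unpack_command(cmd: int) -> tuple:
    """Inverse of pack_command: returns (fpu_op, pcid, cpid)."""
    cpid = cmd & ((1 << CPID_BITS) - 1)
    pcid = (cmd >> CPID_BITS) & ((1 << PCID_BITS) - 1)
    return cmd >> (PCID_BITS + CPID_BITS), pcid, cpid


def issue(queue: PCQueue, fpu_op: int, pc: int, cpid: int) -> list:
    """Return the memory cycles needed to start one coprocessor instruction."""
    if cpid == QUEUED_CPID:
        # PC stays in the MPU; only its entry number travels with the command.
        return [("command", pack_command(fpu_op, queue.push(pc), cpid))]
    # Otherwise the PC value follows the command in a second memory cycle.
    # Sending PCID 0 here is an assumption; the source does not say what is sent.
    return [("command", pack_command(fpu_op, 0, cpid)), ("pc", pc & 0xFFFFFFFF)]
```

In this model an instruction issued with CPID "001" costs one cycle and leaves its PC in the queue, while any other CPID costs two cycles, which mirrors the difference between first FPU 11 and second FPU 12 described above.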
The i8087 coprocessor constantly monitors the bus, and when a coprocessor instruction is output on the bus, the coprocessor fetches it, decodes it, and executes it. This method is described in, for example, Interface special issue, "Numerical Processors", CQ Publishing, 1987, pp. 60-74.

[Problem to be Solved by the Invention]

In a data processing device using a conventional microprocessor as the MPU, as shown in the timing chart, when the MPU instructs the coprocessor to perform the operation for a coprocessor instruction, the PC value of the coprocessor instruction is also transferred. Many memory cycles are therefore required to instruct the operation, which degrades the performance of the entire device. The PC value of the coprocessor instruction is not directly necessary for executing the coprocessor instruction. It is transferred in case an exception occurs during execution of the coprocessor instruction, and in most cases it is never used and the transfer is wasted.

In a data processing device in which the coprocessor itself decodes coprocessor instructions, the coprocessor must be provided with an instruction prefetch queue and an instruction decoder similar to those of the MPU, which increases the hardware complexity of the coprocessor. Also, when multiple coprocessors of the same kind are connected and operated in parallel, arbitration between the coprocessors is required to control which coprocessor executes which coprocessor instruction. Regarding the problems of data processing devices that employ this method, see Miyuki Hanakaku, "At the dawn of the 32-bit era, the full scale of floating-point arithmetic has arrived", Nikkei Electronics, No. 425, July 13, 1987, pp. 123-138.

[Means for Solving the Problems]

The microprocessor used as the main processor of the data processing device of the present invention has a PC queue (program counter value holding means) that holds the PC value of a coprocessor instruction. This PC queue can hold the PC values of multiple coprocessor instructions. When a coprocessor instruction is fetched and decoded and the coprocessor is instructed to perform an operation, the microprocessor has means for transferring to the coprocessor a command in which the entry number of the PC queue entry storing the PC value, rather than the PC value of the instruction itself, is concatenated with the operation indicator. Furthermore, when multiple coprocessors are connected, a coprocessor identifier (CPID), which identifies the coprocessor to which the command is transferred and which is the criterion for deciding whether or not to transfer the PC value of the coprocessor instruction to the coprocessor, is held in the processor status word (PSW).

[Operation]

In the data processing device of the present invention, the main processor fetches and decodes a coprocessor instruction and, according to the result, generates an operation indicator to be transferred to the coprocessor. If the value of the coprocessor identifier (CPID) in the processor status word (PSW) is "001", the PC value of the coprocessor instruction is held in the PC queue. The entry number of the PC queue entry holding the PC value, the operation indicator, and the coprocessor identifier in the processor status word are concatenated and transferred to the coprocessor as a command in a single memory cycle. If the value of the coprocessor identifier (CPID) is other than "001", the PC value of the coprocessor instruction is also transferred to the coprocessor after the command is transferred.

The coprocessor executes operations according to commands transferred from the microprocessor of the present invention. When an exception occurs during execution of an operation, the exception handler is activated. In the exception handler, exception information is read from the coprocessor and its contents are examined. As necessary, either the PC value held in the coprocessor that generated the exception is read by the main processor, or the PC queue entry number corresponding to the command that caused the exception is read from the coprocessor and the PC value of the coprocessor instruction that caused the exception is read from the PC queue of the main processor based on that number. The coprocessor instruction that caused the exception is thereby identified.

[Embodiments of the Invention]

The present invention will be described in detail below based on the drawings showing its embodiments.

(1) "Data processing device using the microprocessor of the present invention"

Fig. 1 is a block diagram showing an example of the configuration of a data processing device using the microprocessor of the present invention. The device consists of a main processor (MPU) 10, two coprocessors (first FPU 11 and second FPU 12), a memory system 13 that stores instructions and data, an interrupt control circuit 15 that controls interrupts to MPU 10, a clock generator 14 that supplies the clock that determines the timing of the entire device, and so on.

MPU 10, first FPU 11, second FPU 12, and memory system 13 are connected by a 64-bit data bus D0:63 (signal line 1) and a 32-bit address bus A0:31 (signal line 2). MPU 10 and memory system 13 are also connected by a 32-bit instruction bus I0:31 (signal line 3). In addition, the entire device is coupled by control signal lines 4. MPU 10 operates as the bus master for the entire device, and the signals on address bus A0:31 (2) are always output from MPU 10.

MPU 10 outputs an address to memory system 13 through address bus A0:31 (2), fetches the instruction from memory system 13 through instruction bus I0:31 (3), and executes it. When MPU 10 executes an instruction, it inputs and outputs operands to and from memory system 13 through data bus D0:63 (1) as necessary. When a coprocessor instruction is decoded, a command is sent from MPU 10 to first FPU 11 or second FPU 12 through address bus A0:31 (2), and the operation is performed by FPU 11 or 12. When an operation is performed by FPU 11 or 12, MPU 10 inputs and outputs operands through data bus D0:63 (1) as necessary.
Whether a coprocessor instruction is executed by first FPU 11 or second FPU 12 is determined by the coprocessor identifier (CPID) held in MPU 10 at that time. The CPID can be changed by software, and by rewriting the CPID, MPU 10 can have a coprocessor instruction executed by either first FPU 11 or second FPU 12.

So that the program counter (PC) value of the coprocessor instruction that generated an exception can be identified when an exception occurs during an operation in FPU 11 or 12, registers that hold the PC values of coprocessor instructions are provided in MPU 10 and in second FPU 12. MPU 10 has a PC queue (first FPC 21) that holds multiple PC values of coprocessor instructions to be executed by first FPU 11, and second FPU 12 has a second FPC 22 that holds the PC value of the coprocessor instruction to be executed by second FPU 12. The PC value held in second FPC 22 is transferred from MPU 10. The first and second FPCs 21 and 22 are the registers mentioned above.

The PC value held in first FPC 21 or second FPC 22 is used when the corresponding coprocessor instruction causes an exception. When a coprocessor instruction causes an exception in FPU 11 (12), FPU 11 (12) first requests an interrupt from interrupt control circuit 15. Interrupt control circuit 15 then requests interrupt processing from MPU 10 in accordance with the interrupt request from FPU 11 (12). This starts the interrupt handler. In the interrupt handler, exception information is read from FPU 11 (12) and, according to its contents, the PC value held in first FPC 21 or second FPC 22 is read out and exception handling is performed.

The instruction system, processing mechanism, and processing method of the microprocessor of the present invention are described in more detail below.

(2) "Instruction format of the microprocessor of the present invention"

The instructions of the microprocessor of the present invention are variable-length in units of 16 bits, and there are no instructions of odd byte length. The microprocessor of the present invention uses short formats for frequently used instructions. For example, for a 2-operand instruction there are two formats: a general format, which basically has the structure "4 bytes + extension part" and in which all addressing modes can be used, and a shortened format, in which only frequently used instructions and addressing modes can be used. The figures show schematic diagrams of the instruction formats of the microprocessor of the present invention.

The meanings of the symbols appearing in the formats are as follows.
-: the part where the operation code is entered
Ea: the part where the operand is specified in the 8-bit general addressing mode
Sh: the part where the operand is specified in the 6-bit shortened addressing mode
Rn: the part where an operand in the register file is specified by register number

As shown in Fig. 11, the right side of a format is the LSB side and is the higher address. The instruction format cannot be determined without looking at the two bytes at address N and address N+1; this is because instructions are assumed always to be fetched and decoded in units of 16 bits (2 bytes).

In the instructions of the microprocessor of the present invention, in any format, the extension part of the Ea or Sh of each operand is always placed immediately after the halfword containing the base part of that Ea or Sh. This takes priority over immediate data implicitly specified by the instruction and over the extension part of the instruction itself. Therefore, in an instruction of 4 bytes or more, the operation code of the instruction may be divided by the extension part of Ea.

Also, as described later, when a further extension part is added to the extension part of Ea by the multi-stage indirect mode, that extension part likewise takes priority over the next operation code of the instruction. For example, consider a 6-byte instruction whose first halfword includes Ea1, whose second halfword includes Ea2, and which extends to a third halfword. Since the multi-stage indirect mode is used for Ea1, a multi-stage indirect mode extension part is assumed to be added in addition to the ordinary extension part. In this case, the actual instruction bit pattern is, in order: the first halfword of the instruction (including the base part of Ea1), the extension part of Ea1, the multi-stage indirect mode extension part of Ea1, the second halfword of the instruction (including the base part of Ea2), the extension part of Ea2, and the third halfword of the instruction.

(2, 1) "Shortened 2-operand instruction"

Fig. 12 is a schematic diagram of the shortened format of a 2-operand instruction. This format includes an L-format, in which the source operand side is memory, and an S-format, in which the destination operand side is memory.

In the L-format, Sh is the source operand specification field, Rn is the destination operand register specification field, and RR specifies the operand size of Sh. The size of the destination operand placed in the register is fixed at 32 bits. If the sizes on the register side and the memory side differ and the source side is smaller, sign extension is performed.

In the S-format, Sh is the destination operand specification field, Rn is the source operand register specification field, and RR specifies the operand size of Sh. The size of the source operand placed in the register is fixed at 32 bits. If the sizes on the register side and the memory side differ and the source side is larger, the overflowing portion is truncated and an overflow check is performed.

(2, 2) "General form 1-operand instruction"

Fig. 13 is a schematic diagram of the general format of a 1-operand instruction (G1-format). MM is the operand size specification field. Some G1-format instructions have extension parts other than the extension part of Ea. There are also instructions that do not use MM.

(2, 3) "General form 2-operand instruction"

Fig. 14 is a schematic diagram of the general format of a 2-operand instruction. This format covers instructions that have at most two operands in the general addressing mode specified by 8 bits. The total number of operands itself may be three or more. EaM is the destination operand specification field, MM is the destination operand size specification field, EaR is the source operand specification field, and RR is the source operand size specification field. Some G-format instructions have extension parts in addition to the extension parts of EaM and EaR.

Fig. 15 is a schematic diagram of the format of a short branch instruction. cccc is the branch condition specification field, and disp:8 is the field specifying the displacement to the jump destination. In the microprocessor of the present invention, when a displacement is specified in 8 bits, the value specified in the bit pattern is doubled to obtain the displacement value.

(2, 4) "Addressing modes"

The addressing mode of an instruction of the microprocessor of the present invention can be specified in a shortened form, which uses 6 bits including the register, or in a general form, which uses 8 bits.

If an undefined addressing mode is specified, or if a semantically unreasonable combination of addressing modes is specified, a reserved instruction exception occurs, just as when an undefined instruction is executed, and exception handling is started. This applies, for example, when the destination is in immediate mode, or when immediate mode is used in an addressing mode specification field that should involve address calculation.

The meanings of the symbols used in the formats shown in Figs. 16 to 26 are as follows.
Rn: register specification
(Sh): specification method in the 6-bit shortened addressing mode
(Ea): specification method in the 8-bit general addressing mode

(2, 4, 1) "Basic addressing modes"

The instructions of the microprocessor of the present invention support various addressing modes. Among these, the basic addressing modes are register direct mode, register indirect mode, register relative indirect mode, immediate mode, absolute mode, PC relative indirect mode, stack pop mode, and stack push mode.

In register direct mode, the contents of the register are used as the operand as they are. The format is shown in Fig. 16.
In the figure, Rn indicates the number of a general-purpose register or FPU register.

In register indirect mode, the operand is the contents of the memory whose address is the contents of a general-purpose register. The format is shown in the figure. In the figure, Rn indicates the number of a general-purpose register.

Register relative indirect mode is divided into two types depending on whether the displacement value is 16 bits or 32 bits. In each, the operand is the contents of the memory whose address is the contents of a general-purpose register plus a 16-bit or 32-bit displacement value. The format is shown in the figure. In the figure, Rn indicates the number of a general-purpose register, and disp:16 and disp:32 indicate a 16-bit and a 32-bit displacement value, respectively. The displacement values are treated as signed.

In immediate mode, the bit pattern specified in the instruction code is regarded as a binary number as it is and used as the operand. The format is shown in Fig. 19, where imm_data indicates the immediate value. The size of imm_data is specified in the instruction as the operand size.

Absolute mode is divided into two types depending on whether the address value is given in 16 bits or 32 bits. In each, the operand is the contents of the memory whose address is the 16-bit or 32-bit bit pattern specified in the instruction code. The format is shown in the figure. In the figure, abs:16 and abs:32 indicate 16-bit and 32-bit address values, respectively. When the address is given as abs:16, the specified address value is sign-extended to 32 bits.

PC relative indirect mode is divided into two types depending on whether the displacement value is 16 bits or 32 bits. In each, the operand is the contents of the memory whose address is the contents of the program counter plus a 16-bit or 32-bit displacement value. The format is shown in the figure. In the figure, disp:16 and disp:32 indicate a 16-bit and a 32-bit displacement value, respectively. The displacement values are treated as signed. The program counter value referenced in PC relative indirect mode is the start address of the instruction that contains the operand. When the program counter (PC) value is referenced in the multi-stage indirect addressing mode, the start address of the instruction is likewise used as the PC-relative reference value.

In stack pop mode, the operand is the contents of the memory whose address is the contents of the stack pointer (SP). After the operand is accessed, SP is incremented by the operand size. For example, when 32-bit data is handled, SP is updated by +4 after the operand access. Stack pop mode can also be specified for operands of size B, H, and D, in which case SP is updated by +1, +2, and +8, respectively. The format is shown in Fig. 22. If stack pop mode is meaningless for the operand, a reserved instruction exception is generated. Specifically, specifying stack pop mode for a write operand or a read-modify-write operand causes a reserved instruction exception.

In stack push mode, the operand is the contents of the memory whose address is the contents of SP decremented by the operand size. In stack push mode, SP is decremented before the operand access. For example, when 32-bit data is handled, SP is updated by -4 before the operand access. Stack push mode can also be specified for operands of 8, 16, and 64 bits, in which case SP is updated by -1, -2, and -8, respectively. The format is shown in Fig. 23. If stack push mode is meaningless for the operand, a reserved instruction exception is generated. Specifically, specifying stack push mode for a read operand or a read-modify-write operand causes a reserved instruction exception.

(2, 4, 2) "Multi-stage indirect addressing mode"
Even complex addressing can basically be broken down into combinations of addition and indirect reference. Therefore, if the operations of addition and indirect reference are provided as addressing primitives and can be combined arbitrarily, any complex addressing mode can be realized. The multi-stage indirect addressing mode of the instructions of the microprocessor of the present invention is based on this idea. Complex addressing modes are useful for data references between modules and in artificial intelligence (AI) language processing.
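The idea of building an address from repeated addition and indirect reference can be sketched in Python as follows. This is a minimal model, assuming that each stage computes tmp + disp + Rx * scale and optionally replaces the intermediate value by the memory word at that address, as the field description of this mode states; the `Stage` record, the `effective_address` function, and the `read_word` callback are hypothetical names, and the 16-bit encoding of each stage is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

MASK32 = 0xFFFFFFFF  # addresses are 32 bits wide


@dataclass
class Stage:
    """One stage of multi-stage indirect addressing (illustrative fields)."""
    disp: int = 0           # signed displacement added in this stage
    index: int = 0          # index register value, 0 when no index is added
    scale: int = 1          # index scale: 1, 2, 4 or 8
    indirect: bool = False  # True: follow the pointer stored at the computed address


def effective_address(base: int, stages: Iterable[Stage],
                      read_word: Callable[[int], int]) -> int:
    """Fold the stages over the base value and return the operand address."""
    tmp = base & MASK32
    for stage in stages:
        if stage.scale not in (1, 2, 4, 8):
            raise ValueError("scale must be 1, 2, 4 or 8")
        addr = (tmp + stage.disp + stage.index * stage.scale) & MASK32
        # Indirect reference: the word read from memory becomes the new tmp.
        tmp = read_word(addr) & MASK32 if stage.indirect else addr
    return tmp


# Example: follow a pointer stored at base + 0x10, then index into the target.
memory = {0x1010: 0x2000}
stages = [Stage(disp=0x10, indirect=True), Stage(disp=8, index=3, scale=4)]
operand_address = effective_address(0x1000, stages, memory.__getitem__)
```

Because the number of stages is simply the length of the list, the same loop serves one level of indirection or many, which is the property the description credits with simplifying compilers.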
When the multi-stage indirect addressing mode is specified, one of three specification methods is given in the basic addressing mode specification field: register-based multi-stage indirect mode, PC-based multi-stage indirect mode, or absolute-based multi-stage indirect mode.

Register-based multi-stage indirect mode is an addressing mode in which the value of a general-purpose register is used as the base value of the multi-stage indirect addressing to be extended. The format is shown in the figure. In the figure, Rn indicates the number of a general-purpose register.

PC-based multi-stage indirect mode is an addressing mode in which the value of the program counter is used as the base value of the multi-stage indirect addressing to be extended. The format is shown in the figure.

Absolute-based multi-stage indirect mode is an addressing mode in which zero is used as the base value of the multi-stage indirect addressing to be extended. The format is shown in the figure.

The extended multi-stage indirect mode specification field is a 16-bit unit that is repeated an arbitrary number of times. One stage of the multi-stage indirect mode performs addition of a displacement, scaling (x1, x2, x4, x8) and addition of an index register, and a memory indirect reference. The format of the multi-stage indirect mode is shown in the figure. Each field has the meaning shown below.

E=0: multi-stage indirect mode continues
E=1: address calculation ends; tmp ==> address of operand
I=0: no memory indirect reference; tmp + disp + Rx * Scale ==> tmp
I=1: memory indirect reference; mem[tmp + disp + Rx * Scale] ==> tmp
M=0: use <Rx> as the index
M=1: special index
  <Rx>=0: do not add an index value (Rx=0)
  <Rx>=1: use the program counter as the index value (Rx=PC)
  <Rx>=2 or more: reserved
D=0: the value of the 4-bit field d4 in the multi-stage indirect mode is multiplied by 4 to obtain the displacement value, which is added. d4 is treated as signed and is always multiplied by 4 regardless of the operand size.
D=1: dispx (16/32 bits) specified in the extension part of the multi-stage indirect mode is used as the displacement value and is added. The size of the extension part is specified in the d4 field.
  d4=0001: dispx is 16 bits
  d4=0010: dispx is 32 bits
XX: index scale (scale = 1/2/4/8)

If x2, x4, or x8 scaling is applied to the program counter, an undefined value is entered as the intermediate value (tmp) after the processing of that stage is completed. The effective address obtained by this multi-stage indirect mode is then an unpredictable value, but no exception occurs. Scaling must not be specified for the program counter.

Figs. 28 and 29 show variations of the instruction format in multi-stage indirect mode. Fig. 28 shows the variations according to whether the multi-stage indirect mode continues or ends. Fig. 29 shows the variations in displacement size.

If a multi-stage indirect mode with an arbitrary number of stages can be used, the compiler does not need to distinguish cases according to the number of stages, which has the advantage of reducing the burden on the compiler. This is because, even if multi-stage indirect references are very infrequent, the compiler must always be able to generate correct code. For this reason, the format allows an arbitrary number of stages.

The instructions of the microprocessor of the present invention include coprocessor instructions that are executed by FPU 11 (12). After a coprocessor instruction is decoded by MPU 10, a command is transferred to FPU 11 (12) and executed by FPU 11 (12). The command has the format shown in Fig. 30 and is transferred using the lower 20 bits of address bus 2, as shown in the figure.

The command is composed of an FPU instruction field, a PCID field, and a CPID field. The FPU instruction field is an operation specifier indicating the content of the instruction for FPU 11 (12). The PCID field is a number used, when an exception occurs in FPU 11 (12), to select the PC value of the coprocessor instruction that caused the exception from among the multiple coprocessor instruction PC values held in MPU 10. CPID is a number used to distinguish the individual FPUs when multiple FPUs are connected to MPU 10.

The FPU instruction has the formats shown in Figs. 32 to 34. A memory-FPU register instruction (M-FR instruction) or an MPU register-FPU register instruction (GR-FR instruction) consists of an operation code field and an FPU register number field (FR), and has the format shown in the figure. An FPU register-FPU register instruction (FR-FR instruction) consists of an operation code field and two FPU register number fields (FR1 and FR2).
It has the format shown in the figure. A zero-operand instruction has the format shown in the figure.

(4) "Functional block configuration"

Fig. 2 is a block diagram showing an example of the configuration of the microprocessor of the present invention (MPU 10 in Fig. 1). Functionally, the inside of the microprocessor of the present invention can be roughly divided into an instruction fetch section 51, an instruction decoding section 52, a PC calculation section 53, an operand address calculation section 54, a micro ROM section 55, a data calculation section 56, and an external bus interface section 57. In Fig. 2, an address output circuit 58 that outputs addresses to the outside, a data input/output circuit 59 that inputs and outputs data to and from the outside, a control signal input/output circuit 60, and an instruction input circuit 61 that inputs instruction codes are shown separately from these functional blocks.

(4, 1) "Instruction fetch section"

The instruction fetch section 51 includes an instruction cache, an instruction queue, their control unit, and so on. It determines the address of the next instruction to be fetched and fetches the instruction from the instruction cache or from the external memory system 13. It also registers instructions in the instruction cache. The address of the next instruction to be fetched is calculated by a dedicated counter as the address of the instruction to be input to the instruction queue. When a jump occurs, the address of the new instruction is transferred from the PC calculation section 53 or the data calculation section 56.

When an instruction is fetched from external memory, the address of the instruction to be fetched is output to the outside from the address output circuit 58 through the external bus interface section 57, and the instruction code is taken in from the instruction input circuit 61. Among the buffered instruction codes, the instruction code to be decoded next by the instruction decoding section 52 is output to the instruction decoding section 52.

(4, 2) "Instruction decoding section"

The instruction decoding section 52 basically decodes instruction codes in units of 16 bits (halfwords). It includes an FHW decoder that decodes the operation code contained in the first halfword, an NFHW decoder that decodes the operation code contained in the second and third halfwords, and an addressing mode decoder that decodes the addressing mode. It also includes a second decoder that further decodes the outputs of the FHW decoder and the NFHW decoder to calculate the micro ROM entry address, a branch prediction mechanism that performs branch prediction for conditional branch instructions, and an address calculation conflict check mechanism that checks for pipeline conflicts in operand address calculation.

The instruction code input from the instruction fetch section 51 is decoded 0 to 6 bytes per clock. Of the decoding results, information related to the operation in the data calculation section 56 is output to the micro ROM section 55, information related to operand address calculation is output to the operand address calculation section 54, and information related to PC calculation is output to the PC calculation section 53. The instruction decoding section 52 decodes both main processor instructions, which are executed in the microprocessor of the present invention, and coprocessor instructions, which are executed in a coprocessor.

(4, 3) "Micro ROM section"

The micro ROM section 55 includes a micro ROM that stores the various microprogram routines that control the data calculation section 56, a microsequencer, a microinstruction decoder, and so on. A microinstruction is read from the micro ROM once per clock. In addition to the sequence processing indicated by the microprogram, the microsequencer accepts, by hardware, exceptions, interrupts, and traps (collectively referred to as EIT) and test interrupts. The micro ROM section 55 also manages the store buffer. Interrupts that do not depend on the instruction code, flag information based on operation results, and outputs of the instruction decoding section 52, such as the output of the second decoder, are input to the micro ROM section 55. The output of the microinstruction decoder goes mainly to the data calculation section 56, but some information, such as information for canceling other preceding processing because a jump instruction has been executed, is also output to other functional blocks.

(4, 4) "Operand address calculation section"

The operand address calculation section 54 is hardwired-controlled by the information related to operand address calculation that is output from the addressing mode decoder and other parts of the instruction decoding section 52. It performs most of the processing related to operand address calculation. It also checks whether the address of a memory access for memory indirect addressing, or the operand address, falls within the memory-mapped I/O area. The operand address calculation for coprocessor instructions is also performed here.

The address calculation result is sent to the external bus interface section 57. The values of the general-purpose registers and the program counter needed for address calculation are input from the data calculation section 56. When memory indirect addressing is performed, the memory address to be referenced is output to the outside from the address output circuit 58 through the external bus interface section 57, and the indirect address value input from the data input/output circuit 59 is fetched by passing it through the instruction decoding section 52 as it is.

(4, 5) "PC calculation section"

The PC calculation section 53 is hardwired-controlled by the information related to PC calculation that is output from the instruction decoding section 52, and calculates the PC value of each instruction. The instructions of the microprocessor of the present invention are variable-length, and the length of an instruction is not known until the instruction has been decoded. The PC calculation section 53 generates the PC value of the next instruction by adding the instruction length output from the instruction decoding section 52 to the PC value of the instruction being decoded. The calculation result of the PC calculation section 53 is output as the PC value of each instruction together with the decoding result of that instruction.

(4, 6) "Data calculation section"

The data calculation section 56 is controlled by microprogram. In accordance with the microinstructions output by the micro ROM section 55, it uses the register file and the arithmetic units to perform the operations needed to realize the function of each instruction. When the operand of an instruction is an address or an immediate value, the address or immediate value calculated by the operand address calculation section 54 is input to the data calculation section 56 through the external bus interface section 57. When the operand of an instruction is data in memory, the external bus interface section 57 outputs the address calculated by the operand address calculation section 54 from the address output circuit 58, and the operand fetched from memory system 13 is input to the data calculation section 56 from the data input/output circuit 59.

For a coprocessor instruction, the data calculation section 56 controls the external bus interface section 57, the address output circuit 58, the data input/output circuit 59, the control signal input/output circuit 60, and so on under microprogram control, and transfers the necessary commands and operands in accordance with the coprocessor protocol.

When memory system 13 must be accessed during an operation, the address is output to the outside from the address output circuit 58 through the external bus interface section 57 as directed by the microprogram, and the target data is fetched through the data input/output circuit 59. When data is stored in memory system 13, the address is output from the address output circuit 58 through the external bus interface section 57, and at the same time the data is output from the data input/output circuit 59 to the outside of MPU 10. To perform operand stores efficiently, the data calculation section 56 has a 4-byte store buffer.

When the data calculation section 56 obtains a new instruction address through jump instruction processing, exception handling, or the like, it outputs that address to the instruction fetch section 51 and the PC calculation section 53.

Fig. 5 is a block diagram of the detailed configuration of the data calculation section 56, part of the micro ROM section 55, and part of the external bus interface section 57 of the microprocessor of the present invention. The data calculation section 56 contains the following: a register file 103 consisting of general-purpose registers and working registers; a processor status word (PSW) 104 consisting of processor status bits and flags; a data operation circuit 105 consisting of an ALU, a barrel shifter, and so on; an alignment circuit 106 that rotates data in byte units and aligns it on word boundaries when data is exchanged with memory; a 4-byte/8-byte conversion circuit 107 that converts data between the 8-byte external bus and the 4-byte internal bus; the internal data buses, namely the S1 bus, the S2 bus, and the DO bus; and a PC queue 21 that holds up to four PC values of coprocessor instructions when commands are transferred to a coprocessor in accordance with coprocessor instructions. The main arithmetic units are connected to each other by the S1 bus, S2 bus, and DO bus.
They process one microinstruction in one clock cycle.

The PC value of a coprocessor instruction held in the PC queue 21 is read and used by the exception handler when the coprocessor instruction causes an exception such as overflow, underflow, or division by zero. PC values are written cyclically into the four entries of the PC queue 21, and the fifth write overwrites the PC value in the entry that was written first. As a result, the PC queue 21 always holds the four most recent PC values.

The configuration of the PSW 104 is shown in the schematic diagram of FIG. 35. The PSW 104 includes an operation mode field indicating the processor operation mode and an IMASK field indicating the external interrupt mask level.
It also includes a CPID field used to specify the destination coprocessor when a command is transferred to a coprocessor, and a flag field, holding flags such as the carry and zero flags, that changes according to the results of operations.

(4.7) "External bus interface unit"

The external bus interface unit 57 controls communication at the input/output pins of the microprocessor of the present invention. All memory accesses are clock-synchronized and can be completed in a minimum of two clock cycles. The external bus interface unit 57 has a data cache, and when an operand to be read hits the data cache, no memory access is performed. Operands are written to both the data cache and memory. When a coprocessor writes an operand to memory, the operand is taken in through the data input/output circuit 59 and the contents of the data cache at the same address are updated.

Memory access requests are generated independently by the instruction fetch unit 51, the address calculation unit 54, and the data calculation unit 56. The external bus interface unit 57 arbitrates among these requests. Furthermore, the data bus connecting the memory and the microprocessor of the present invention is 64
bits wide. When data located at memory addresses that straddle a 64-bit (double word) aligned boundary is accessed, the boundary crossing is detected automatically within this unit and the access is split into two memory accesses. The external bus interface unit 57 also performs conflict prevention when an operand being prefetched overlaps an operand being stored, and bypasses data from the store operand to the fetch operand.

(5) "Pipeline processing"

The microprocessor of the present invention achieves high performance by pipelining the processing of instructions. The pipeline processing method of the microprocessor of the present invention is described first.

(5.1) "Pipeline processing mechanism"

The pipeline processing of the microprocessor of the present invention is shown schematically in FIG. 4. The basic pipeline has five stages: an instruction fetch stage (IF stage) 31 that prefetches instructions; a decode stage (D stage) 32 that decodes instructions; an operand address calculation stage (A stage) 33 that calculates operand addresses; an operand fetch stage (F stage) 34 that performs micro ROM access (specifically called the R stage 36) and operand prefetch (specifically called the OF stage 37); and an execution stage (E stage) 35 that executes instructions. Because the E stage 35 also has a store buffer, and because execution of some high-performance instructions is itself pipelined, the pipeline in effect has five or more stages.

Each stage operates independently of the other stages, so that in theory the five stages operate completely independently. Each stage can complete one process in a minimum of one clock. Ideally, therefore, pipeline processing advances by one step every clock.

The microprocessor of the present invention has instructions that cannot be handled by a single pass through the basic pipeline, such as memory-to-memory operations and memory indirect addressing, but it is designed so that pipeline processing remains as balanced as possible for these instructions as well. An instruction with multiple memory operands is decomposed at the decode stage into multiple pipeline processing units (step codes) according to the number of memory operands, and each unit is pipelined. The decomposition method is described in detail in Japanese Patent Application No. 61-236456.

The information passed from the IF stage 31 to the D stage 32 is the instruction code itself. Two pieces of information are passed from the D stage 32 to the A stage 33: information on the operation specified by the instruction (called the D code 41) and information on the address calculation of the operand (called the A code 42). Two pieces of information are passed from the A stage 33 to the F stage 34: the R code 43, which contains the entry address of the microprogram routine and parameters for the microprogram, and the F code 44, which contains the operand address and access method information. The information passed from the F stage 34 to the E stage 35 is the E code 45, which contains operation control information and literals, and the S code 46, which contains operands, operand addresses, and the like.

For an EIT detected at a stage other than the E stage 35, EIT processing is not started until the corresponding code reaches the E stage 35. Only the instruction being processed in the E stage 35 is in the execution stage; instructions being processed in the IF stage 31 through the F stage 34 have not yet reached the execution stage. Therefore, when an EIT is detected at a stage other than the E stage 35, the detection is merely recorded in the step code and passed on to the next stage.

(5.2) "Processing of each pipeline stage"

For convenience, the input and output step codes of each pipeline stage are named as shown in FIG. 4. Some step codes carry processing related to the operation code and become the entry address of the micro ROM or parameters for the E stage 35, while others become operands for the microinstructions of the E stage 35.

(5.2.1) "Instruction fetch stage"

The instruction fetch stage (IF stage) 31 fetches instructions from the instruction cache or the memory system 13, inputs them into the instruction queue, and outputs instruction codes to the D stage 32. Input to the instruction queue is performed in aligned 4-byte units. Fetching instructions from the memory system 13 requires a minimum of two clocks per aligned 4 bytes. When the instruction cache hits, aligned 4
bytes can be fetched in one clock. The output unit of the instruction queue is variable in 2-byte steps, and up to 6 bytes can be output in one clock. Immediately after a jump, the instruction queue can also be bypassed and the 2 bytes of the instruction base part transferred directly to the instruction decoder. Registration and clearing of instructions in the instruction cache, management of the prefetch destination instruction address, and control of the instruction queue are also performed in the IF stage 31.

(5.2.2) "Instruction decode stage"

The instruction decode stage (D stage) 32 decodes the instruction code input from the IF stage 31. Decoding is performed once per clock using the FHW decoder, the NFHW decoder, and the addressing mode decoder of the instruction decoding unit 52, and one decoding process consumes 0 to 6 bytes of instruction code (no instruction code is consumed when a step code containing, for example, the return address of a subroutine return instruction is output). One decoding process outputs to the A stage 33 the A code 42, which is the address calculation information and consists of a control code of about 35 bits and up to 32 bits of address modification information, and the D code 41, which is the intermediate decoding result of the operation code and consists of a control code of about 50 bits and 8 bits of literal information. The D stage 32 also controls the PC calculation unit 53 for each instruction and controls the output of instruction codes from the instruction queue.

(5.2.3) "Operand address calculation stage"

The processing of the operand address calculation stage (A stage) 33 is broadly divided into two parts. One is the second-stage decoding of the operation code using the second decoder of the instruction decoding unit 52; the other is the calculation of the operand address by the operand address calculation unit 54.

The second-stage decoding of the operation code takes the D code 41 as input, makes write reservations for registers and memory, and outputs the R code 43, which contains the entry address of the microprogram routine and parameters for the microprogram. The write reservation for registers and memory prevents an incorrect address calculation that would result if a register or memory location referenced in the address calculation were rewritten by an instruction preceding it in the pipeline.

The operand address calculation takes the A code 42 as input. In accordance with the A code 42, the operand address calculation unit 54 performs addition and memory indirect references, and the result is output as the F code 44. A conflict check is made when registers or memory are read for the address calculation. If a conflict is indicated because a preceding instruction has not yet completed its write to the register or memory, the stage waits until the preceding instruction completes the write in the E stage 35.

(5.2.4) "Micro ROM access stage"

The processing of the operand fetch stage (F stage) 34 is also broadly divided into two parts. One is micro ROM access, which is specifically called the R stage 36. The other is operand prefetch, which is specifically called the OF stage 37. The R stage 36 and the OF stage 37 do not necessarily operate at the same time; they operate independently, depending on, for example, whether the memory access right can be acquired.

The micro ROM access performed in the R stage 36 consists of the micro ROM access and microinstruction decoding that produce, from the R code 43, the E code 45 used for execution control in the following E stage 35. When the processing for one R code 43 is decomposed into two or more microprogram steps, the micro ROM is used by the E stage 35 and the next R code 43 waits for micro ROM access. The micro ROM access for an R code 43 is performed when the last microinstruction of the preceding instruction is executed in the E stage 35. In the microprocessor of the present invention, most basic instructions are executed in one microprogram step, so in practice micro ROM accesses for successive R codes 43 are often performed one after another.

(5.2.5) "Operand fetch stage"

The operand fetch stage (OF stage) 37 performs the operand prefetch part of the two kinds of processing performed in the F stage 34. Operands are prefetched from the data cache or the memory system 13. Operand prefetch takes the F code 44 as input and outputs the fetched operand and its address as the S code 46. One F code 44 may straddle a double word boundary, but it specifies an operand fetch of 8 bytes or less. The F code 44 also specifies whether the operand is to be accessed at all. When the operand address itself or an immediate value calculated in the A stage 33 is to be transferred to the E stage 35, no operand prefetch is performed and the contents of the F code 44 are transferred as the S code 46. When the operand to be prefetched coincides with an operand that the E stage 35 is about to write, the operand is not prefetched from the data cache or memory but is bypassed.

(5.2.6) "Execution stage"

The execution stage (E stage) 35 operates with the E code 45 and the S code 46 as input. The E stage 35 is the stage that executes instructions, and all processing performed in the F stage 34 and earlier stages is preprocessing for the E stage 35. When a jump instruction is executed in the E stage 35 or EIT processing is started, all processing in the IF stage 31 through the F stage 34 is invalidated. The E stage 35 is controlled by the microprogram and executes an instruction by executing a series of microinstructions starting from the entry address of the microprogram routine indicated by the R code 43.

Reading of the micro ROM and execution of microinstructions are pipelined. Consequently, when a branch occurs in the microprogram, a gap of one microstep results. In addition, by using the store buffer in the data calculation unit 56, the E stage 35 can pipeline an operand store of 4 bytes or less with the execution of the next microinstruction.

In the E stage 35, the write reservations for registers and memory made in the A stage 33 are released after the operand has been written.

Interrupts are accepted directly by the E stage 35 at instruction boundaries, and the necessary processing is executed by the microprogram. The other kinds of EIT processing are also executed by microprograms.

(5.3) "State control of each pipeline stage"
Each stage of the pipeline has an input latch and an output latch, and basically operates independently of the other stages. Each stage starts its next process when it has finished the previous process, has transferred the result from its output latch to the input latch of the next stage, and has received in its own input latch all the input signals needed for the next process. In other words, each stage starts its next process when all the input signals for that process output by the preceding stage are valid, the result of the current process has been transferred to the input latch of the following stage, and its own output latch is empty.

All input signals must therefore be ready immediately before a stage starts operating. If the input signals are not ready, the stage enters a wait state (waiting for input). A transfer from an output latch to the input latch of the next stage requires that input latch to be empty; if it is not empty, the pipeline stage likewise enters a wait state (waiting for output). Processing of a stage is also delayed when the necessary memory access right cannot be obtained or when a wait is inserted into the memory access being processed.
If other pipeline conflicts occur, the processing of the stage itself is likewise delayed.

(6) "Detailed circuit connection diagram of a data processing device using the microprocessor of the present invention"

FIG. 3 shows the connections of a data processing device that uses the microprocessor of the present invention as the MPU 10, together with the first FPU 11, the memory system 13, the clock generator 14, and the interrupt control circuit 15.

The first FPU 11 can accept and accumulate up to three coprocessor instructions. Because the MPU 10 has a PC queue of four entries, even if an exception occurs during execution of a coprocessor instruction, the PC value of that instruction is still held in the PC queue 21 and is never destroyed by the writing of the PC value of a subsequent coprocessor instruction. It is also assumed that the CPID in the PSW 104 of the MPU 10 is "001" and that the first FPU 11 is assigned coprocessor number "001".

The system clock CLK is generated by the clock generator 14 and supplied through the signal line 70 to the MPU 10, the first FPU 11, and the memory system 13. BAT0:1 on the signal line 71 is a signal indicating the type of bus cycle. CPDE# on the signal line 72 is a data output control signal to the first FPU 11, which is a coprocessor. BS#, AS#, DS#, and R/W# on the signal line 73 are the other bus cycle control signals output by the MPU 10. CPST0:2 on the signal line 74 is a control signal that conveys the internal state of the first FPU 11 to the MPU 10. CPDC# on the signal line 75 is the bus cycle completion signal output by the first FPU 11, and DC# on the signal line 76 is the bus cycle completion control signal output by the memory system 13. The signal line 1 is the data bus D0:63, and the signal line 2 is the address bus A0:31.

As for interrupts, there are the signal line 79, by which the memory system 13 requests an interrupt from the interrupt control circuit 15; the signal line 80, by which the first FPU 11 requests an interrupt from the interrupt control circuit 15 when an exception occurs in the first FPU 11; and the signal IRL0:2 on the signal line 78, by which the interrupt control circuit 15, having arbitrated those requests, requests an interrupt from the MPU 10. I0:31 on the signal line 3 is an instruction bus that transfers instruction codes from the memory system 13 to the MPU 10. Each signal operates in synchronization with the system clock.

The meaning of the signal BAT0:1 on the signal line 71 is as shown in the table of FIG. 9. The contents of the address bus A0:31 (2) and the data bus D0:63 (1) are defined for each of the values "00", "01", "10", and "11". The MPU 10, the first FPU 11, and the memory system 13 ignore the contents of any undefined part.

The meaning of the signal CPST0:2 on the signal line 74 is as shown in the table of FIG. 10. The value "001" indicates that the first FPU 11 is preparing data for output. The value "010" indicates that the command was accepted normally. The value "011" indicates that the first FPU 11 has completed preparation for data output. The value "101" indicates that the first FPU 11 requests retransmission of the command. The value "110" indicates that an error occurred during command transfer. The value "111" indicates that the first FPU 11 is not connected. The values "000" and "100" are undefined.

The signal IRL0:2 on the signal line 78 requests the MPU 10 to perform interrupt processing according to its value.
The MPU 10 determines whether to accept or mask the interrupt according to the values of IRL0:2 and the IMASK field in the PSW 104.

(6.1) "Memory access operation"

FIG. 6 is a timing chart of the memory access operation of the data processing device whose configuration is shown in FIG. 3. For a sufficiently fast memory, a memory access is performed once every two clock cycles. FIG. 6 shows, in order, a zero-wait data read cycle, a zero-wait data write cycle, an instruction read cycle with one clock of wait, and a data write cycle with one clock of wait.

In the figure, CLK is the system clock supplied via the signal line 70. BCLK is a bus clock with twice the period of CLK, and it distinguishes the odd-numbered pulses of CLK from the even-numbered pulses. CLK and BCLK are synchronized at system reset.

In the data read cycle, BAT0:1 is set to "00" and the address is output; the value on the data bus D0:63 (1) is taken in and the bus cycle ends. In the data write cycle, BAT0:1 is set to "00" and the address is output first; the data is output onto the data bus D0:63 (1) one clock later, and the bus cycle ends when DC# (76) is asserted while DS# and BCLK are at low level. In the instruction read cycle, BAT0:1 is set to "01" and the address is output; when DC# (76) is asserted at the falling edge of CLK while DS# and BCLK are at low level, the value on the instruction bus I0:31 (3) is taken in and the bus cycle ends.

As described above, in processor mode the microprocessor of the present invention starts bus cycles synchronized with the clock CLK to perform input/output operations with the outside.

(6.2) "FPU access operation"
FIG. 7 shows an example of a timing chart when a command is transferred from the MPU 10 to the first FPU 11. Command transfer is performed as one type of bus cycle. FIG. 7 shows, in order, a zero-wait transfer cycle of the command and operand for an FPU instruction between memory and FPU registers, a zero-wait data read cycle from memory, a one-wait transfer cycle of the command for an FPU instruction between FPU registers, and a transfer cycle of the command and operand for an FPU instruction between MPU registers and FPU registers.

In the transfer cycle of the command and operand for an FPU instruction between memory and FPU registers, BAT0:1 is set to "10" and the command is output on bits 16 to 31 of the address bus A0:31 (2). When CPDC# (75) is asserted at the falling edge of CLK while DS# and BCLK are at low level, the MPU 10 ends the bus cycle. The first FPU 11 takes in the command from the address bus 2 and the operand from the data bus D0:63 (1). The state of the first FPU 11 is reported to the MPU 10 by the value of CPST0:2 (74).

The command transfer cycle for an FPU instruction between FPU registers is the same as the command and operand transfer cycle for an FPU instruction between memory and FPU registers, except that no operand is transferred. The command and operand transfer cycle for an FPU instruction between MPU registers and FPU registers is also similar to that for an FPU instruction between memory and FPU registers; the difference is that the operand is transferred only on the lower 4 bytes of the data bus 1.

FIG. 8 is a timing chart for the case where the command and operand for an FPU instruction between FPU registers and memory are transferred in the data processing device whose configuration is shown in FIG. 3. In this case, the command is first transferred from the MPU 10 to the first FPU 11, and the operand is then transferred from the first FPU 11 to the MPU 10 and the memory system 13.

In the command transfer cycle, BAT0:1 is set to "10" and the command is output on the address bus A0:31 (2). When CPDC# (75) is asserted at the falling edge of CLK while DS# and BCLK are at low level, the MPU 10 ends the bus cycle. The first FPU 11 takes in the command from the address bus 2. At this time, the state of the first FPU 11 is reported to the MPU 10. In this example the command is for an FPU instruction between FPU registers and memory, and after the first FPU 11 receives the command, CPST0:2 (74) changes through "010", "001", and "011". After transferring the command, the MPU 10 monitors the value of CPST0:2 (74), asserts CPDE# once the value becomes "011", and starts the operand transfer bus cycle one clock later. In this cycle, BAT0:1 is set to "00", the memory address at which the operand is to be stored is output from the MPU 10 onto the address bus A0:31 (2), and the operand is output from the first FPU 11 onto the data bus D0:63 (1). The operand is taken in by both the MPU 10 and the memory system 13. The MPU 10 takes in the operand in order to update its built-in data cache.

The protocol and timing with which the MPU 10 accesses the second FPU 12 are basically the same as when the MPU 10 accesses the first FPU 11. The only difference is that, immediately after the memory cycle that transfers the command, there is a memory cycle with BAT0:1 set to "11" that transfers the PC value of the coprocessor instruction from the MPU 10 to the second FPU 12.

(7) "Coprocessor instruction processing examples"

FIGS. 36 to
38 are flowcharts illustrating examples of the processing procedure when the first FPU 11 executes a command. The CPID of the first FPU 11 is "001". In these examples the CPID value in the PSW 104 of the MPU 10 is also "001", so the PC value of the coprocessor instruction is not transferred directly to the first FPU 11.

FIG. 36 is a flowchart of the processing procedure from when a coprocessor instruction that is an FPU instruction between memory and FPU registers is fetched by the MPU 10 until it is executed by the first FPU 11. FIG. 37 is a flowchart of the processing procedure from when a coprocessor instruction that is an FPU instruction between FPU registers and memory is fetched by the MPU 10 until it is executed by the first FPU 11. FIG. 38 is a flowchart of the processing procedure for the case where a coprocessor instruction that is an FPU instruction between memory and FPU registers is fetched by the MPU 10, is executed by the first FPU 11, and causes an exception, whereupon the first FPU 11 requests an interrupt from the MPU 10 through the interrupt control circuit 15 and the MPU 10 performs exception handling.

Exception handling is performed by an interrupt handler. The microprocessor of the present invention has an instruction that reads the PCID from the first FPU 11 and a privileged instruction that, based on the PCID, reads the PC value of the coprocessor instruction from the corresponding entry in the PC queue 21 of the MPU 10. The interrupt handler uses these instructions to read the PCID and then the PC value of the coprocessor instruction.

FIG. 39 is a flowchart of the processing procedure from when a coprocessor instruction that is an FPU instruction between registers is fetched by the MPU 10 until it is executed by the second FPU 12. The CPID of the second FPU 12 is "010", which differs from the CPID in the PSW 104 of the MPU 10.

In the embodiment described above, the command transfer from the MPU 10 to the coprocessor and the operand transfer are performed in the same memory cycle, but they need not be performed in a single memory cycle. If the operand has more bits than the data bus 1, the operand may be transferred over multiple memory cycles.

Furthermore, in the embodiment described above, an exception during execution of a coprocessor instruction is handled by an interrupt, and instructions in the handler read the PCID and the PC value of the coprocessor instruction stored in the corresponding entry of the PC queue 21. Alternatively, the exception may be reported to the MPU 10 by a dedicated signal, and the MPU 10 may, as a hardware operation, read the PCID and the contents of the corresponding entry in the PC queue 21 and place the PC value of the coprocessor instruction that caused the exception on the stack as information to be passed to the exception handler. In this case the exception handler obtains the PC value of the coprocessor instruction that caused the exception from the stack, so the handler need not read the PCID or the contents of the PC queue 21.

[Effects of the Invention]

As detailed above, the microprocessor of the present invention includes program counter value holding means (a PC queue) that holds the PC value of a coprocessor instruction in preparation for an exception occurring during execution of the coprocessor instruction. Therefore, when a command corresponding to a coprocessor instruction is transferred, no memory cycle is required to transfer the PC value of the coprocessor instruction. The coprocessor can thus be instructed to perform the operation for a coprocessor instruction in one memory cycle, and a high-performance data processing device can be obtained.

In addition, the program counter value holding means (PC queue) has multiple entries, and the entry number is concatenated with the command and transferred to the coprocessor, so the coprocessor can queue multiple commands while distinguishing the entry of the program counter value holding means that corresponds to each command. The entry number is 2 bits, which is very few compared with the 32-bit PC value of a coprocessor instruction, so even when the command and the entry number are concatenated, the whole can be transferred in one bus cycle over the 32-bit address bus 2. The coprocessor can therefore be instructed to perform the operation for a coprocessor instruction in one memory cycle, and a high-performance data processing device can be obtained by using the microprocessor of the present invention.

Moreover, whether or not the PC value of a coprocessor instruction is transferred to the coprocessor is determined by the value of the coprocessor identifier (CPID). Therefore, even when a plurality of coprocessors are connected and operated in parallel while the coprocessor identifier is changed, the PC values of coprocessor instructions for a specific coprocessor can be held by a single program counter value holding mechanism. It is thus possible to configure a data processing device in which a plurality of coprocessors are connected without sacrificing the effect of the program counter value holding means.
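
The PC queue mechanism described above can be illustrated with a short software model. The following Python sketch is an illustration only, not the circuit of the embodiment: the class and function names are hypothetical, the 16-bit command width and the placement of the 2-bit entry number directly above the command are assumptions made for the example (the actual command format is the one shown in FIGS. 30 and 31), and the CPID comparison is one reading of the behavior described for the first FPU 11 and the second FPU 12.

```python
class PCQueue:
    """Toy model of a 4-entry PC queue that is written cyclically (hypothetical names)."""

    ENTRIES = 4

    def __init__(self):
        self._pcs = [None] * self.ENTRIES
        self._next = 0  # index of the entry that the next write will use

    def push(self, pc):
        """Hold the PC value of a coprocessor instruction; return its 2-bit entry number."""
        entry = self._next
        self._pcs[entry] = pc & 0xFFFFFFFF  # PC values are 32 bits wide
        self._next = (self._next + 1) % self.ENTRIES  # the fifth write reuses entry 0
        return entry

    def read(self, entry):
        """Return the PC value held in an entry, as an exception handler would."""
        return self._pcs[entry & 0b11]


def pack_command(command, entry):
    """Concatenate a 2-bit entry number with a command.

    Assumed layout for illustration: command in bits 0-15, entry number in bits 16-17.
    """
    return ((entry & 0b11) << 16) | (command & 0xFFFF)


def unpack_command(word):
    """Inverse of pack_command: return (command, entry)."""
    return word & 0xFFFF, (word >> 16) & 0b11


def needs_pc_transfer(psw_cpid, target_cpid):
    """Assumed rule: an extra cycle carries the PC value only when the CPIDs differ."""
    return psw_cpid != target_cpid


if __name__ == "__main__":
    queue = PCQueue()
    # Issue five coprocessor instructions at made-up addresses.
    entries = [queue.push(0x1000 + 4 * i) for i in range(5)]
    print("entry numbers:", entries)

    # The coprocessor reports the entry number of the faulting command;
    # the handler uses it to recover the PC value from the queue.
    word = pack_command(0xABCD, entries[3])
    command, entry = unpack_command(word)
    print("faulting PC:", hex(queue.read(entry)))
```

Because the entry number occupies only 2 bits, the packed word in this sketch stays well within 32 bits, which mirrors the point made above that the command and the entry number fit in a single transfer on the address bus.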

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing an example of the configuration of a data processing device using the microprocessor of the present invention.
FIG. 2 is a block diagram showing an example of the configuration of the microprocessor of the present invention.
FIG. 3 is a block diagram showing the connection between the microprocessor of the present invention and a coprocessor.
FIG. 4 is a schematic diagram showing an overview of the pipeline processing of the microprocessor of the present invention.
FIG. 5 is a block diagram showing the detailed configuration of the data calculation unit of the microprocessor of the present invention and its related parts.
FIG. 6 is a timing chart of memory access by the microprocessor of the present invention.
FIG. 7 is a timing chart of coprocessor access by the microprocessor of the present invention.
FIG. 8 is a timing chart of the memory cycles when the microprocessor of the present invention executes an FPU instruction between FPU registers and memory.
FIG. 9 is a table showing the values of the signal BAT0:1, which indicates the type of memory cycle of the microprocessor of the present invention, and the contents of the data bus, address bus, and instruction bus for each value.
FIG. 10 is a table showing the values of the signal CPST0:2, which conveys the internal state of the coprocessor to the microprocessor of the present invention, and their meanings.
FIG. 11 is a schematic diagram showing how instructions of the microprocessor of the present invention are arranged in memory.
FIGS. 12 to 15 are schematic diagrams showing the instruction formats of the microprocessor of the present invention.
FIGS. 16 to 29 are schematic diagrams for explaining the instruction addressing modes of the microprocessor of the present invention.
FIG. 30 is a schematic diagram showing the format of a command transferred from the microprocessor of the present invention to a coprocessor.
FIG. 31 is a schematic diagram showing the bit positions used when a command is transferred from the microprocessor of the present invention to a coprocessor via the address bus.
FIGS. 32 to 34 are schematic diagrams showing the formats of FPU instructions in commands transferred from the microprocessor of the present invention to a coprocessor.
FIG. 35 is a schematic diagram showing the contents of the PSW of the microprocessor of the present invention.
FIGS. 36 to 39 are flowcharts for the execution of various coprocessor instructions in a data processing device using the microprocessor of the present invention.
FIG. 40 is a block diagram showing the configuration of a data processing device using a conventional microprocessor and a coprocessor.
FIG. 41 is a timing chart for the case where the MPU accesses a coprocessor in the data processing device using the conventional microprocessor whose configuration is shown in FIG. 40.

10: main processor (MPU); 11, 12: coprocessors (FPU); 21: PC queue (first FPC); 22: second FPC; 52: instruction decoding unit; 58: address output circuit.

In the figures, the same reference numerals indicate the same or equivalent parts.

Claims

[Claims]

(1) A microprocessor that causes a coprocessor to perform a coprocessor operation, comprising: decoding means for decoding a main processor instruction, which is an instruction for the microprocessor itself, and a coprocessor instruction, which causes the coprocessor to perform a coprocessor operation; program counter value holding means for holding a program counter value; means for transferring, in accordance with the decoding result of the decoding means, a command corresponding to the coprocessor instruction to the coprocessor so that the coprocessor executes the coprocessor operation; and means for causing the program counter value holding means to hold the program counter value of the coprocessor instruction until a predetermined point in time during execution.

(2) A microprocessor that causes a coprocessor to perform a coprocessor operation, comprising: decoding means for decoding a main processor instruction, which is an instruction for the microprocessor itself, and a coprocessor instruction, which causes the coprocessor to perform a coprocessor operation; first program counter value holding means for holding a program counter value of a first coprocessor instruction; second program counter value holding means for holding a program counter value of a second coprocessor instruction; and means for transferring to the coprocessor, in accordance with the decoding result of the decoding means, a command including a field in which operation specifiers corresponding to the first and second coprocessor instructions are concatenated with identifiers identifying the first and second program counter value holding means, so that the coprocessor executes the coprocessor operation.

(3) A microprocessor that causes first and second coprocessors to perform coprocessor operations, comprising: decoding means for decoding a main processor instruction, which is an instruction for the microprocessor itself, and a coprocessor instruction, which causes both coprocessors to perform coprocessor operations; a register for holding identification numbers of the first and second coprocessors; program counter value holding means for holding a program counter value of the coprocessor instruction; means for transferring to the coprocessor, in accordance with the decoding result of the decoding means, a command including a field in which an operation specifier corresponding to the coprocessor instruction is concatenated with the identification number; means for causing, when the identification number indicates the first coprocessor, the program counter value holding means to hold the program counter value of the coprocessor instruction without transferring it to the first coprocessor; and means for transferring, when the identification number indicates the second coprocessor, the program counter value of the coprocessor instruction to the second coprocessor.

(4) The microprocessor according to claim (3), wherein there are a plurality of the second coprocessors.

(5) A data processing device comprising a coprocessor and a microprocessor that causes the coprocessor to perform a coprocessor operation, wherein the microprocessor includes: decoding means for decoding a main processor instruction, which is an instruction for the microprocessor itself, and a coprocessor instruction, which causes the coprocessor to perform a coprocessor operation; program counter value holding means for holding a program counter value of the coprocessor instruction; means for transferring, in accordance with the decoding result of the decoding means, a command corresponding to the coprocessor instruction to the coprocessor so that the coprocessor executes the coprocessor operation; and means for causing the program counter value holding means to hold the program counter value of the coprocessor instruction until a predetermined point in time during execution of the operation by the coprocessor; and wherein the data processing device includes: means for notifying the microprocessor, from the coprocessor, of the occurrence of an exception when an exception is detected as a result of an operation in the coprocessor; and means for reading the program counter value of the coprocessor instruction from the program counter value holding means and executing exception handling.

(6) A data processing device comprising a coprocessor and a microprocessor that causes the coprocessor to perform a coprocessor operation, wherein the microprocessor includes: decoding means for decoding a main processor instruction, which is an instruction for the microprocessor itself, and a coprocessor instruction, which causes the coprocessor to perform a coprocessor operation; first program counter value holding means for holding a program counter value of a first coprocessor instruction; second program counter value holding means for holding a program counter value of a second coprocessor instruction; and means for transferring to the coprocessor, in accordance with the decoding result of the decoding means, a command including a field in which an operation specifier corresponding to the coprocessor instruction is concatenated with an identifier identifying the first and second program counter value holding means, so that the coprocessor executes the coprocessor operation.

(7) A data processing device comprising first and second coprocessors and a microprocessor that causes the coprocessors to perform coprocessor operations, wherein the microprocessor includes: decoding means for decoding a main processor instruction, which is an instruction for the microprocessor itself, and a coprocessor instruction, which causes a coprocessor to perform a coprocessor operation; a register for holding a coprocessor identification number; program counter value holding means for holding a program counter value of the coprocessor instruction; means for transferring to the coprocessor, in accordance with the decoding result of the decoding means, a command including a field in which an operation specifier corresponding to the coprocessor instruction is concatenated with the coprocessor identification number; means for causing, when the identification number indicates the first coprocessor, the program counter value holding means to hold the program counter value of the coprocessor instruction without transferring it to the first coprocessor; and means for transferring, when the identification number indicates the second coprocessor, the program counter value of the coprocessor instruction to the second coprocessor.

(8) The data processing device according to claim (7), wherein there are a plurality of the second coprocessors.
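
As an informal illustration only, and not part of the claims, the following minimal Python sketch shows one way the mechanism recited in claims (2) and (5) could behave: the main processor keeps the program counter value of each coprocessor instruction in a local holder, sends the coprocessor only a command whose field concatenates the operation specifier with the holder identifier, and on an exception uses the identifier reported by the coprocessor to read the program counter value back. All names (`MainProcessor`, `Coprocessor`, `pack_command`, `handle_exception`) and the field widths `OP_BITS` and `SLOT_BITS` are assumptions made for this sketch and do not come from the specification.

```python
from typing import List, Optional, Tuple

OP_BITS = 8    # assumed width of the operation specifier (hypothetical)
SLOT_BITS = 1  # assumed width of the holder identifier: two holders (hypothetical)


def pack_command(op: int, slot: int) -> int:
    """Concatenate the operation specifier and the holder identifier."""
    if not 0 <= op < (1 << OP_BITS):
        raise ValueError("operation specifier out of range")
    if not 0 <= slot < (1 << SLOT_BITS):
        raise ValueError("holder identifier out of range")
    return (op << SLOT_BITS) | slot


def unpack_command(cmd: int) -> Tuple[int, int]:
    """Split a command back into (operation specifier, holder identifier)."""
    return cmd >> SLOT_BITS, cmd & ((1 << SLOT_BITS) - 1)


class MainProcessor:
    """Keeps coprocessor-instruction PC values locally instead of sending them."""

    def __init__(self) -> None:
        self.pc_holders: List[Optional[int]] = [None] * (1 << SLOT_BITS)
        self.next_slot = 0

    def issue(self, pc: int, op: int) -> int:
        """Record the PC value in a holder and build the command to transfer."""
        slot = self.next_slot
        self.pc_holders[slot] = pc
        self.next_slot = (slot + 1) % len(self.pc_holders)
        return pack_command(op, slot)

    def pc_of(self, slot: int) -> Optional[int]:
        """Read back the PC value held for the given holder identifier."""
        return self.pc_holders[slot]


class Coprocessor:
    """Remembers which holder identifier came with a faulting command."""

    def __init__(self) -> None:
        self.faulting_slot: Optional[int] = None

    def execute(self, cmd: int, faults: bool = False) -> bool:
        """Run a command; return True if an exception must be reported."""
        _op, slot = unpack_command(cmd)
        if faults:
            self.faulting_slot = slot
        return faults


def handle_exception(mpu: MainProcessor, fpu: Coprocessor) -> Optional[int]:
    """Exception handler: map the coprocessor's identifier back to a PC value."""
    if fpu.faulting_slot is None:
        return None
    return mpu.pc_of(fpu.faulting_slot)


# Two coprocessor instructions are issued; the second one raises an exception.
mpu, fpu = MainProcessor(), Coprocessor()
fpu.execute(mpu.issue(pc=0x1000, op=0x21))
raised = fpu.execute(mpu.issue(pc=0x1004, op=0x22), faults=True)
faulting_pc = handle_exception(mpu, fpu)  # PC of the instruction that faulted
```

In this sketch the program counter value never travels to the coprocessor, so issuing an instruction costs a single command transfer. The simple round-robin reuse of holders is a simplification: it does not model the "predetermined point in time" until which a holder must keep its value.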