JP3120382B2

JP3120382B2 - Three-way transfer operation by computer

Info

Publication number: JP3120382B2
Application number: JP01510702A
Authority: JP
Inventors: ソロモン，ダグラス; ジェルモラク，トーマス・アラン
Original assignee: シリコン・グラフィクス・インコーポレーテッド
Priority date: 1988-10-03
Filing date: 1989-10-02
Publication date: 2000-12-25
Anticipated expiration: 2015-12-25
Also published as: DE68926168T2; EP0436641B1; EP0436641A4; EP0436641A1; KR900702448A; JPH04502827A; DE68926168D1; WO1990004226A1

Description

[Field of Industrial Application]

The present invention relates to the efficient transfer of digital information in a digital computer. In particular, it relates to the transfer of digital information from a central processing unit to a bus, from memory to the bus, and from the bus to a subsystem, including a graphics subsystem.

[Conventional technology]

Digital computers have long used methods and systems for transferring digital information from memory to the computer's various subsystems, including graphics subsystems. A graphics subsystem controls, in part, the display of a computer workstation or terminal.

In one prior-art architecture, graphics information was first downloaded into a portion of the computer's main memory (that is, host memory). The computer had a dedicated display list processor in addition to the host processor. The display list processor then transferred the graphics information from host memory to a display list memory. Next, the display list processor sent the graphics information from the display list memory to the graphics subsystem by continuously traversing the display list.

One drawback of this prior-art architecture was that it required two transfers rather than one: a first transfer from host memory to the display list memory, and a second transfer from the display list memory to the graphics subsystem.

Another drawback of this prior-art architecture was that it required special hardware in addition to the host processor and host memory, namely the display list processor and the display list memory.

Another prior-art architecture had no dedicated display list processor and no dedicated display list memory. Instead, graphics information was stored in main (that is, host) memory and transferred from main memory to the graphics subsystem by the host processor.

One drawback of this second prior-art architecture was that much of the processor's time was spent transferring graphics information from memory to the graphics subsystem, which delayed the processor's other operations. Another drawback was that the processor could not keep up with the rate at which the graphics subsystem could receive data. In other words, the transfer was slower than it could have been.

[Problems to be Solved by the Invention and Means for Solving the Problems]

In view of these limitations of known methods and systems for transferring digital information from a computer's memory to its subsystems, including graphics subsystems, one object of the present invention is to provide an efficient means of transferring data to a subsystem of a computer, including a graphics subsystem, in order to improve the efficiency of the computer.

Another object of the invention is to provide a method of transferring digital information to a subsystem of a computer under the control of a central processing unit ("CPU") that minimizes the time the CPU spends on the transfer, so that the CPU has more time for other computer operations. An object of the invention is therefore to keep the transfer from monopolizing the CPU's processing time.

Another object of the invention is to provide a method of transferring digital information to a subsystem of a computer that minimizes the amount of computer hardware required for the transfer.

Another object of the invention is to provide a method of transferring digital information to a subsystem of a computer at a rate high enough that the rate at which the computer sends the information approximately matches the maximum rate at which the subsystem can receive it.

Another object of the invention is to provide a method of transferring digital information to a subsystem of a computer in which two or more data words are transferred at the same time.

Another object of the invention is to provide a method of transferring digital information to a subsystem of a cache-based multiprocessor computer in which the transfer remains possible even for data that spans two or more cache lines.

Yet another object of the invention is to provide a method of transferring digital information to a subsystem of a computer such that, if an interrupt occurs during the transfer, the computer can be restored after the interrupt to the state it was in before the interrupt.

These objects are achieved by a method and apparatus for performing a three-way transfer of digital information within a digital computer. The digital computer comprises a CPU, a decode circuit coupled to the CPU, a bus coupled to the decode circuit, a memory coupled to the bus, and a subsystem coupled to the bus. The digital information includes a data block that has (1) a start address, which is the address of the first word of the data block, and (2) a word count, which gives the number of words in the data block.

A trigger address for a write operation is sent from the CPU to the decode circuit. The trigger address is decoded, and subsequent write operations are inhibited from modifying the memory. The start address for the write operation is sent from the CPU to the bus, and then from the bus to the memory and to the subsystem. Data including the word count is sent from the CPU to the bus. A sequence of two or more data words is sent from the memory to the bus. The number of words in the sequence is greater than or equal to the word count, and one data word of the sequence resides at the start address in the memory. The sequence of data words is sent from the bus to the subsystem. In this way the data block is transferred from the memory directly to the subsystem over the bus.
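The sequence of steps above can be pictured with a small software model. The following Python sketch is illustrative only and is not taken from the patent: the class name `Machine`, the constant `TRIGGER_ADDRESS`, the value N = 4, and the assumption that each sequence is an N-word-aligned group that fully contains the block are all hypothetical choices made for the example.

```python
TRIGGER_ADDRESS = 0xFFFF0000  # hypothetical address watched by the decode circuit


class Machine:
    """Toy model: CPU writes pass through a decode circuit onto a shared bus."""

    def __init__(self, memory, words_per_sequence=4):
        self.memory = memory            # list of data words, indexed by word address
        self.n = words_per_sequence     # N: words the memory supplies per transfer
        self.armed = False              # set once the trigger address is decoded
        self.subsystem_words = []       # words the subsystem has accepted

    def cpu_write(self, address, data):
        if address == TRIGGER_ADDRESS:
            # Decode circuit: the next write must not modify memory.
            self.armed = True
            return
        if not self.armed:
            self.memory[address] = data  # ordinary store
            return
        # Armed write: `address` is the start address, `data` carries the word count.
        self.armed = False
        start, word_count = address, data
        if word_count > self.n:
            raise ValueError("this single-sequence model needs word_count <= N")
        base = start - start % self.n                # assumed: N-word alignment
        sequence = self.memory[base:base + self.n]   # memory drives N words onto the bus
        offset = start - base
        # The subsystem keeps only the words that belong to the block.
        self.subsystem_words.extend(sequence[offset:offset + word_count])
```

In this model the CPU issues only two writes per block (the trigger and the start address with its word count); the data words themselves never pass through the CPU, and the armed write leaves memory unchanged.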

The foregoing objects are further achieved by the following method and apparatus for performing a three-way transfer of digital information within a digital computer. The digital computer comprises a CPU, a decode circuit coupled to the CPU, a bus coupled to the decode circuit, a memory coupled to the bus, and a subsystem coupled to the bus. The digital information includes (1) a start address, which is the address of the first word of a data block, (2) an end address, which is the address of the last word of the data block, and (3) a word count, which gives the number of words in the data block.

A trigger address for a write operation is sent from the CPU to the decode circuit. The trigger address is decoded, and subsequent write operations are inhibited from modifying the memory. A first start address for the write operation is sent from the CPU to the decode circuit. A bit of the first start address is stored as a tag bit. The tag bit has an order that represents its position within the start address. The first start address is sent from the decode circuit to the bus, and from the bus to the memory and to the subsystem. Data including the word count is sent from the bus to the subsystem. A first sequence of N data words is sent from the memory to the bus, where N is an integer greater than one and greater than or equal to the word count. One data word of the first sequence of N data words resides at the first start address in the memory. The first sequence of N data words is transferred from the bus to the subsystem.

The word count is divided by N, one is subtracted, and, if the result has a fractional part, the result is rounded up to the next integer, producing an integer X. If X is an integer greater than or equal to zero, the preceding steps are repeated X times. Before each repetition, (1) the first start address is incremented by N to produce a second start address, (2) the first start address becomes the second start address, (3) the tag bit is cleared, and (4) the first sequence of N data words becomes the next sequence in a series of consecutive sequences of N data words. The series of consecutive sequences ends once the steps have been repeated X times.

After these steps have completed, an end address for the write operation is sent from the CPU to the decode circuit. The tag bit is compared with the bit of the end address that has the same order. If the two bits are not the same, then (1) the end address is sent from the decode circuit to the bus, (2) the end address is sent from the bus to the memory and to the subsystem, (3) data including the word count is sent from the CPU to the bus, (4) data including the word count is sent from the bus to the subsystem, (5) an ending sequence of N data words is sent from the memory to the bus, where one data word of the ending sequence resides at the end address in the memory, and (6) the ending sequence of N data words is transferred from the bus to the subsystem. The data block is thus transferred from the memory directly to the subsystem over the bus.
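The repeat count X described above is a ceiling calculation, and a short worked example may make it easier to follow. The Python sketch below is a hedged illustration: the function names are hypothetical, the arithmetic follows the rule "divide by N, subtract one, round any fraction up", and it deliberately does not model the tag-bit comparison or the final end-address transfer.

```python
def repeat_count(word_count, n):
    """X = ceil(word_count / n - 1), computed with integer arithmetic."""
    if word_count < 1 or n < 2:
        raise ValueError("expects word_count >= 1 and n >= 2")
    # Subtracting a whole number does not change the fractional part,
    # so ceil(word_count / n - 1) equals ceil(word_count / n) - 1.
    return -(-word_count // n) - 1


def sequence_start_addresses(first_start, word_count, n):
    """Start addresses of the first sequence and of the X repeated sequences."""
    x = repeat_count(word_count, n)
    return [first_start + i * n for i in range(x + 1)]
```

For instance, with N = 4 a word count of 6 gives 6 / 4 - 1 = 0.5, which rounds up to X = 1, so two sequences of four words are sent; a word count of 3 gives X = 0 and a single sequence.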

Other objects, features, and advantages of the present invention will become apparent from the following detailed description, which refers to the accompanying drawings.

[Brief description of drawings]

The accompanying drawings illustrate the present invention by way of example, and the invention is not limited to them. In the drawings, like reference numerals denote like elements.

FIG. 1 is a block diagram of the architecture of a digital computer comprising two CPUs, decode circuits, a bus, a memory, and a graphics subsystem.

FIG. 2 is a diagram of data blocks having various word counts.

FIG. 3 is a timing diagram of a three-way transfer of digital information.

[Detailed Description of the Invention]

Referring to the drawings, FIG. 1 is a block diagram of circuit 10, which makes up the architecture of a cache-based multiprocessor digital computer with enhanced graphics capabilities. Starting at the left side of FIG. 1, CPU 20 is a 32-bit R3000 RISC (reduced instruction set computer) microprocessor commercially available from MIPS Computer Systems, Inc. of Sunnyvale, California. Each data word consists of 32 bits. Instruction cache 22 is coupled to CPU 20, and data cache 24 is likewise coupled to CPU 20. Instruction cache 22 and data cache 24 act as small, fast memories placed between CPU 20 and main memory 80.

Instruction cache 22 and data cache 24 are each direct mapped. Because caches 22 and 24 are direct mapped, CPU 20 knows where in main memory 80 the data entering caches 22 and 24 came from. This dual-cache scheme guarantees that instructions and data do not compete for the same cache locations, even though the caches are direct mapped. Having separate instruction and data caches also increases instruction bandwidth. Circuit 10 further includes a second, direct-mapped data cache 40 coupled between write buffer 30 and bus 70. Graphics subroutines reside in instruction cache 22 and data cache 24 of CPU 20, and in instruction cache 102 and data cache 104 of CPU 100.
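As a brief illustration of what "direct mapped" means here, the following Python sketch shows how a word address determines exactly one cache slot and a tag that identifies which region of main memory currently occupies it. The function name, the line size of four words, and the cache size used in the example are assumptions for illustration, not values stated for caches 22 and 24.

```python
def direct_mapped_slot(word_address, num_lines, words_per_line=4):
    """Return (index, tag) for a word address in a direct-mapped cache.

    index: the single cache line that can hold this address
    tag:   identifies which memory line currently occupies that slot
    """
    line_number = word_address // words_per_line
    index = line_number % num_lines
    tag = line_number // num_lines
    return index, tag
```

Two addresses with the same index but different tags compete for the same slot, which is why keeping instructions and data in separate caches avoids that particular kind of conflict between them.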

In the preferred embodiment of the invention, a three-way transfer of digital information takes place within circuit 10. In a three-way transfer operation, a start address is sent from CPU 20 over bus 70 to memory 80. The start address is also sent from CPU 20 over bus 70 to graphics subsystem 90. A data block (one that has a start address, which is the address of its first word, and a word count, which gives the number of words in the block) is then sent directly from memory 80 or data cache 40 over bus 70 to graphics subsystem 90. Graphics subsystem 90 uses the data block so transferred.

Data cache 24 is a write-through cache. With a write-through cache such as data cache 24, CPU 20 always writes to both the data cache (that is, data cache 24) and main memory 80. Because data cache 24 is write-through, main memory 80 always holds an up-to-date copy of all the data in data cache 24. In a write-through cache architecture, the CPU always behaves as if it had missed the cache and writes to both the cache (that is, cache 24) and memory 80. The tags for cache 40 are also updated.

Second-level data cache 40, however, is a write-back cache. With a write-back scheme, cache 40 can hold a more recent version of a data word than the one at the same address in memory 80. CPU 20 writes only to cache 40 and does not write to cache 40 and memory 80 at the same time. In other words, with a write-back scheme cache 40 can hold the exclusive copy of the most recently modified data word. If CPU 20 writes to the same location twice, memory 80 needs to be written only once, so a write-back scheme minimizes bus traffic.
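The difference in bus traffic between the two write policies can be made concrete with a toy model. The Python sketch below is an assumption-laden illustration, not the patent's hardware: the class names, the dictionary used as memory, and the explicit `flush` step standing in for write-back eviction are all hypothetical.

```python
class WriteThroughCache:
    """Every store updates both the cache and memory."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.memory_writes = 0

    def write(self, address, value):
        self.lines[address] = value
        self.memory[address] = value
        self.memory_writes += 1


class WriteBackCache:
    """Stores update only the cache; memory is updated when dirty data is flushed."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()
        self.memory_writes = 0

    def write(self, address, value):
        self.lines[address] = value
        self.dirty.add(address)          # cache now holds the only current copy

    def flush(self):
        for address in sorted(self.dirty):
            self.memory[address] = self.lines[address]
            self.memory_writes += 1
        self.dirty.clear()
```

Writing twice to one address costs two memory writes in the write-through model but only one, at flush time, in the write-back model. Until that flush, memory holds a stale value, which is the situation the coherency protocol has to account for.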

Write buffer 30 is a four-deep write buffer; that is, it can hold four words of digital information, one in each of its memory locations. Write buffer 30 is also a first-in, first-out ("FIFO") write buffer. Write buffer 30 is coupled to second-level data cache 40 via line 32, and to address decoder 50 via line 34.

Write buffer 30 is coupled to CPU 20 and data cache 24 via line 26. Write buffer 30 spares write-through cache 24 from having to wait for slower main memory 80 to complete a write cycle. CPU 20 dumps the digital information it wants to write (to second-level data cache 40 or to memory 80) into write buffer 30. Write buffer 30 then performs the comparatively slow writes to second-level data cache 40 and main memory 80 while CPU 20 carries on with other computer operations. In the preferred embodiment of the invention, CPU 20 is a 25 MHz microprocessor that transfers digital information relatively quickly. Write buffer 30 thus holds digital information from CPU 20 so that CPU 20 can move on to other computer operations.
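A four-deep FIFO write buffer of this kind can be sketched in a few lines. The Python model below is illustrative only: the class and method names are hypothetical, and the "return False when full" behavior stands in for the CPU stalling, which the text does not describe in detail.

```python
from collections import deque


class WriteBuffer:
    """FIFO buffer that decouples fast CPU stores from slower memory writes."""

    def __init__(self, depth=4):
        self.depth = depth
        self.entries = deque()

    def push(self, address, data):
        """CPU side: queue a store. Returns False if the buffer is full."""
        if len(self.entries) >= self.depth:
            return False                 # in hardware the CPU would have to wait
        self.entries.append((address, data))
        return True

    def drain_one(self, memory):
        """Memory side: retire the oldest queued store, if any."""
        if not self.entries:
            return False
        address, data = self.entries.popleft()
        memory[address] = data
        return True
```

The CPU can queue up to four stores back to back and continue executing, while the buffer retires them in arrival order at whatever rate the slower side allows.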

CPU 20 performs read and write operations. In a read operation (that is, a load), CPU 20 can obtain instructions, addresses, and data from caches 22, 24, and 40 or from memory 80. If a cache miss occurs during a read operation, CPU 20 goes to memory 80, rather than to cache 22, 24, or 40, for the instruction, address, or data. During a read operation CPU 20 must wait for the data. During a write operation (that is, a store), by contrast, CPU 20 sends addresses and data to caches 24 and 40 or to memory 80. If CPU 20 "misses" cache 24 or 40 during a write operation, cache 22 sends the address and data to both caches 24 and 40.

CPU 20 sends and receives addresses and data on line 26. Write buffer 30 sends and receives addresses and data on line 32. Second-level data cache 40 sends and receives addresses and data on line 44.

Referring to the right side of FIG. 1, CPU 100 is likewise a 32-bit R3000 microprocessor commercially available from MIPS Computer Systems, Inc. of Sunnyvale, California. The circuitry on the right side of FIG. 1 is similar to the circuitry on the left side. Instruction cache 102 is coupled to CPU 100, as is data cache 104. Instruction cache 102 and data cache 104 are both direct mapped. Data cache 104 is a write-through data cache. CPU 100 sends addresses and data to write buffer 110 via line 106. Write buffer 110 is likewise a four-deep FIFO write buffer. Write buffer 110 sends addresses and data to second-level data cache 120 via line 112, and is also coupled to address decoder 130 via line 114. Second-level data cache 120 is a direct-mapped write-back data cache. It sends addresses and data to bus 70 via line 124, and is also coupled to address decoder 130 via line 122.

Referring to the lower portion of FIG. 1, addresses and data pass between memory 80 and bus 70 via line 82, and between graphics subsystem 90 and bus 70 via line 92.

In an alternative embodiment of the invention, additional processors can share bus 70, memory 80, and graphics subsystem 90. For example, four microprocessors rather than two can share bus 70.

In yet another alternative embodiment, a single processor rather than multiple processors could be coupled to bus 70. In that embodiment, CPU 20 and the circuitry in the left and lower portions of FIG. 1 are part of the circuit, but CPU 100 is not.

In still another embodiment of the invention, the circuit has neither write buffers nor second-level data caches; that is, write buffers 30 and 110 and data caches 40 and 120 are absent. The digital computer nevertheless comprises a CPU, a decode circuit coupled to the CPU, a bus coupled to the CPU, a memory coupled to the bus, and a subsystem coupled to the bus.

Returning to the preferred embodiment of FIG. 1, in which circuit 10 has multiple data caches, there must be a means of maintaining cache consistency, or cache coherency. Maintaining cache coherency guarantees that, when CPU 20 writes to data cache 40, CPU 100 can access the same data that was written to cache 40.

Two state bits are used as part of the cache coherency scheme of the preferred embodiment. The two state bits can indicate four possible states: (1) valid, (2) invalid, (3) shared, and (4) dirty.

In the preferred embodiment of the invention, the state bits are associated with a line of data (in a cache). In the preferred embodiment, a data line (which corresponds to the unit of cache mapping) consists of four data words. Modifying any one of the data words that make up a data line means that the data line has been modified.

The valid state indicates that a particular data line from memory is present in only one data cache and has not been modified. In other words, the unmodified data line is identical in memory and in the data cache. For example, an unmodified data line appears in both data cache 40 and memory 80.

The invalid state means that there is no data in the cache. Caches are therefore initialized to the invalid state. The invalid state is equivalent to a "miss."

The shared state indicates that a particular data line may be present in more than one cache and has not yet been modified. For example, the data line may be in both data cache 40 and data cache 120.

The dirty state indicates that there is only one copy of a particular data line, that it is the most recent copy, that only this cache holds that data, and that memory 80 does not contain it. In other words, if the state bits for cache 40 are set to indicate the dirty state, cache 40 holds the exclusive copy of that data line, and that data line cannot be found in any other data cache of circuit 10 or in memory 80. Thus, when the state bits indicate that data cache 40 is in the dirty state, neither data cache 120 nor memory 80 holds a copy of that data line.

Every operation that involves bus 70 requires the participation of data cache 40, data cache 120, and memory 80. Caches 40 and 120 are thus both "snooping" caches, in the sense that each looks up the addresses that appear on bus 70.

The cache coherency protocol can be explained by following the cache states through an example. Data cache 40 and data cache 120 are first set to the invalid state. Data cache 40 then enters the valid state. After data cache 40 has entered the valid state, CPU 100 requests a read from main memory 80. Cache 40 and cache 120 then both enter the shared state.

Suppose now that CPU 20 requests a write to the same address in data cache 40; in other words, that CPU 20 requests a write to a shared data line. CPU 20 accordingly performs a bus operation over bus 70 to tell data cache 120 to go from the shared state to the invalid state for that data line. Cache 120 thus becomes invalid, and data cache 40 becomes dirty.

このように、その特定のデータ・ラインの汚染状態
（ダーティ）コピーが、データ・キャッシュ40内にはあ
るが、データ・キャッシュ120内にも主記憶装置80内に
もないことになる。もし、CPU100が、その特定の汚染状
態のデータ・ラインを読み出したいと要求すると、当該
データは、主記憶装置80からではなく、データ・キャッ
シュ40から出てくることになる。その理由は、データ・
キャッシュ40が、その変更されたデータ・ラインの専有
コピーを有しているからである。CPU100によってそのデ
ータが読み出されると、データはキャッシュ40からバス
70を経てキャッシュ120へと向かう。そこでデータ・キ
ャッシュ40とデータ・キャッシュ120が共用状態に戻
る。Thus a dirty copy of that particular data line exists in data cache 40 but not in data cache 120 or main memory 80. If CPU 100 requests a read of that dirty data line, the data comes from data cache 40 rather than from main memory 80, because data cache 40 holds the exclusive copy of the modified data line. When CPU 100 reads the data, the data travels from cache 40 over bus 70 to cache 120. Data cache 40 and data cache 120 then return to the shared state.
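
The state sequence described above can be traced with a minimal sketch. It models only the four states named in the text (invalid, valid, shared, dirty) for a single data line; the names `State`, `read_by_other`, and `write_by_owner` are hypothetical and chosen for illustration, not taken from the patent.

```python
from enum import Enum


class State(Enum):
    INVALID = "invalid"
    VALID = "valid"
    SHARED = "shared"
    DIRTY = "dirty"


def read_by_other(owner, other):
    """The other CPU reads a line held by `owner`; both caches become shared.

    Returns the new (owner, other) states and where the data came from.
    """
    # A dirty line is supplied by the owning cache, otherwise by main memory.
    source = "owner cache" if owner is State.DIRTY else "main memory"
    return State.SHARED, State.SHARED, source


def write_by_owner(owner, other):
    """The owning CPU writes a shared line; the other copy is invalidated."""
    return State.DIRTY, State.INVALID


# Walk through the sequence in the text for caches 40 and 120.
c40, c120 = State.INVALID, State.INVALID
c40 = State.VALID                             # cache 40 becomes valid
c40, c120, src1 = read_by_other(c40, c120)    # CPU 100 reads: both shared
c40, c120 = write_by_owner(c40, c120)         # CPU 20 writes: 40 dirty, 120 invalid
after_write = (c40, c120)
c40, c120, src2 = read_by_other(c40, c120)    # CPU 100 reads the dirty line
```

In this sketch the second read is served from cache 40 rather than from main memory, which matches the behavior the text describes for a dirty line.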

三路転送動作中にキャッシュの一貫性プロトコルの状
態ビットにより、例えば転送すべきデータ・ラインの専
有コピーをデータ・キャッシュ40が有していること（す
なわち汚染状態のコピーをキャッシュ40が有しているこ
と）が示されると、データ・ラインは、記憶装置80から
ではなく、キャッシュ40から、バス70を経て図形サブシ
ステム90へと転送される。同様に、三路転送動作中に転
送すべきデータの変更された専有コピーをデータ・キャ
ッシュ120が有している（すなわち汚染状態のコピーを
キャッシュ120が有している）ならば、データは、記憶
装置80からではなく、キャッシュ120から図形サブシス
テム90へと転送される。三路転送動作中にキャッシュの
一貫性プロトコルが考慮されることによって、データ語
の古い、以前のコピーではなく、データ語のコピーのう
ちで最も新たに変更されたコピーだけが、三路転送動作
の一部として転送されることが保証される。If, during a three-way transfer operation, the state bits of the cache coherency protocol indicate that, for example, data cache 40 holds the exclusive copy of the data line to be transferred (that is, cache 40 holds a dirty copy), the data line is transferred from cache 40, rather than from storage 80, over bus 70 to graphics subsystem 90. Similarly, if data cache 120 holds a modified exclusive copy of the data to be transferred during a three-way transfer operation (that is, cache 120 holds a dirty copy), the data is transferred from cache 120, rather than from storage 80, to graphics subsystem 90. Because the cache coherency protocol is taken into account during the three-way transfer operation, it is guaranteed that only the most recently modified copy of a data word, and not an older copy, is transferred as part of the three-way transfer operation.

バス70は64ビットバスである。バス70は、アドレス線
及びデータ線を個別に含んでいる。CPU20とCPU100は、
別の時間にバス70をアクセスすることができ、従ってバ
ス70は多重化されたバスである。三路転送動作は特殊な
バス動作である。Bus 70 is a 64-bit bus with separate address lines and data lines. CPU 20 and CPU 100 can access bus 70 at different times, so bus 70 is a multiplexed bus. The three-way transfer operation is a special bus operation.

本発明の好ましい実施例では、各キャッシュ転送は４
語の伝送である。このようにバス70を用いる転送では、
アドレスそれぞれ毎に、４つのデータ語（１語32ビット
として128ビット）が転送される。従って、単一のバス
・サイクル（周期）で単一のデータ語が転送されるので
はなく、単一のバス・サイクルで４つのデータ語が転送
される。単一のメモリ読み出し又は書き込みの動作が行
われる毎に、同時に４語が転送される。CPU20が４語を
必要としているか、又は４語以下の語を必要としている
かに拘わらず、同時に４語が転送される。例えば、CPU2
0が４語ではなく３語を必要としている場合でも、４語
未満のデータが決して転送されないから、CPU20は４語
を受信することになる。In the preferred embodiment of the invention, each cache transfer is a four-word transfer. Thus, in a transfer over bus 70, four data words (128 bits at 32 bits per word) are transferred for each address. Therefore, rather than a single data word, four data words are transferred in a single bus cycle. Each time a single memory read or write operation is performed, four words are transferred together. Four words are transferred regardless of whether CPU 20 needs all four or fewer. For example, even if CPU 20 needs three words rather than four, CPU 20 receives four words, because fewer than four words are never transferred.

記憶装置80とキャッシュ22,24,40,102,104及び120
は、キャッシュのワード・バウンダリ（境界）を基礎と
して編成されている。キャッシュのワード・バウンダリ
301,302,304,306及び307が、図２に示されている。デー
タは、“データ”と表記された段の下に示され、アドレ
スは“アドレス”と表記された段の下に示されている。
キャッシュのワード・バウンダリ301,302,304,306及び3
07は、データを４データ語の群へと編成する。例えば、
バウンダリ302とバウンダリ304の間には４語がある。こ
のように、バウンダリ301,302,304,306及び307のそれぞ
れは、４語の境界である。Storage device 80 and caches 22, 24, 40, 102, 104, and 120 are organized on cache word boundaries. Cache word boundaries 301, 302, 304, 306, and 307 are shown in FIG. 2. Data is shown under the column labeled "Data," and addresses are shown under the column labeled "Address." Cache word boundaries 301, 302, 304, 306, and 307 organize the data into groups of four data words. For example, there are four words between boundary 302 and boundary 304. Thus each of boundaries 301, 302, 304, 306, and 307 is a four-word boundary.
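
The grouping into four-word lines can be sketched as follows. This is an illustration only: it assumes addresses are word indices as in FIG. 2, with boundary 301 before address 0 and boundary 302 between addresses 3 and 4, and the names `WORDS_PER_LINE`, `line_base`, and `line_words` are hypothetical.

```python
WORDS_PER_LINE = 4  # four data words between adjacent boundaries


def line_base(word_addr):
    """Return the address of the first word in the line containing word_addr."""
    return word_addr - (word_addr % WORDS_PER_LINE)


def line_words(word_addr):
    """Return all word addresses that travel together with word_addr."""
    base = line_base(word_addr)
    return list(range(base, base + WORDS_PER_LINE))


# Address 5 lies between boundaries 302 and 304, together with 4, 6 and 7.
group = line_words(5)
```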

図１に示されるように、図形サブシステム90は、線92
を経てバス70に結合されている。本発明の好ましい実施
例では、図形サブシステム90は、ディジタル・コンピュ
ータのための、図形情報の変換、レンダリング及びディ
スプレイを処理する。これらの動作は、パイプラインの
“VLSI"プロセッサ、並列有限状態プロセッサ、高速記
憶装置及び32ビット・マイクロプロセッサ（全て図示せ
ず）を利用して局部的に実行される。図形サブシステム
90は、幾何学サブシステム（図示せず）、レンダリング
・サブシステム（図示せず）、及び、ディスプレイ・サ
ブシステム（図示せず）を備えている。図形管理ボード
（図示せず）はレンダリング・サブシステムの一部であ
り、このレンダリング・サブシステムは図形サブシステ
ム90の一部である。図形管理ボードは、マイクロプロセ
ッサ（図示せず）と局部記憶装置（図示せず）とを備え
ている。図形管理ボードのマイクロプロセッサは、回路
10のマイクロプロセッサ20及び100と通信する。図形サ
ブシステム90の図形管理ボードも、図形サブシステム90
内の幾何学的なパイプラインの活動を監視する。As shown in FIG. 1, graphics subsystem 90 is coupled to bus 70 via line 92. In the preferred embodiment of the present invention, graphics subsystem 90 handles the transformation, rendering, and display of graphics information for the digital computer. These operations are performed locally using pipelined VLSI processors, parallel finite state processors, high-speed memory, and a 32-bit microprocessor (all not shown). Graphics subsystem 90 includes a geometry subsystem (not shown), a rendering subsystem (not shown), and a display subsystem (not shown). A graphics management board (not shown) is part of the rendering subsystem, which in turn is part of graphics subsystem 90. The graphics management board includes a microprocessor (not shown) and local storage (not shown). The microprocessor of the graphics management board communicates with microprocessors 20 and 100 of circuit 10. The graphics management board of graphics subsystem 90 also monitors the activity of the geometry pipeline within graphics subsystem 90.

アドレス・デコーダ50は、線34を経て書き込みバッフ
ァ30に結合されている。アドレス・デコーダ50は、書き
込みバッファ30内にあるアドレスをデコードするデコー
ダ回路を含んでいる。アドレス・デコーダ50から得られ
た情報は、バス70の要求を行うか否かの決定に使用さ
れ、読み出し，書き込み，又は三路転送動作のどれを行
うかを決定するためにも利用され、データ・キャッシュ
40に対して何を行うかの決定に利用される。Address decoder 50 is coupled to write buffer 30 via line 34. Address decoder 50 includes a decoder circuit that decodes the address held in write buffer 30. The information obtained from address decoder 50 is used to determine whether to request bus 70, to determine whether to perform a read, a write, or a three-way transfer operation, and to determine what to do with data cache 40.

アドレス・デコーダ50は、線42を経てデータ・キャッ
シュ40に信号を伝送する。アドレス・デコーダ50は、線
52を介してデコード又は状態ビットをレジスタ60に送り
出す。状態又はデコード・ビットはレジスタ60内に記憶
することができる。状態又はデコード・ビットは、三路
転送ビット又は単に三路ビットとも言われる。線54,56
及び58によって、レジスタ60はアドレス・デコーダ50に
結合される。線54,56及び58は、レジスタ60が三路転送
ビットをレジスタ50に伝送する手段を提供する。Address decoder 50 transmits a signal to data cache 40 via line 42. Address decoder 50 sends decode or status bits to register 60 via line 52. The status or decode bits can be stored in register 60. The status or decode bits are also referred to as three-way transfer bits, or simply three-way bits. Lines 54, 56, and 58 couple register 60 to address decoder 50 and provide the means by which register 60 transmits the three-way transfer bits to address decoder 50.

三路転送ビットには、（１）開始アドレス・トリガ・
ビット、（２）終了アドレス・トリガ・ビット、及び、
（３）タグ（標識）ビットが、含まれる。本発明の好ま
しい実施例では、タグ（標識）ビツトは、三路転送動作
と関連する開始アドレスの５番目のビットである。言い
換えると、タグ（標識）ビットは、開始アドレス（A0,A
2,・・,A4,・・）のA4ビットである。The three-way transfer bits include (1) a start address trigger bit, (2) an end address trigger bit, and (3) a tag bit. In the preferred embodiment of the invention, the tag bit is the fifth bit of the start address associated with the three-way transfer operation. In other words, the tag bit is bit A4 of the start address (A0, A2, ..., A4, ...).
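
Extracting the tag bit from an address can be sketched as a shift and mask. This sketch assumes that A0 is the least significant address bit, which the text implies by calling A4 the fifth bit but does not state outright; the function name `tag_bit` is hypothetical.

```python
def tag_bit(address):
    """Return bit A4 (the fifth bit, counting from A0) of an address."""
    return (address >> 4) & 1


# 0x10 has only bit A4 set; 0x0F has bits A0..A3 set and A4 clear.
examples = [tag_bit(0x10), tag_bit(0x0F), tag_bit(0x30)]
```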

開始アドレス・トリガ・ビット、終了アドレス・トリ
ガ・ビット、及びタグ（標識）ビットは、アドレス・デ
コーダ50からを経てレジスタ60へと伝送されて記憶され
る。レジスタ60に記憶されたビットは、クリヤされてリ
セットされることもできる。アドレス・デコーダ50は、
更に書き込みバッファ30内の終了アドレスのビットをサ
ンプル（標本抽出）し、そのビットを記憶をするために
線52を経てレジスタ60に伝送することができる。本発明
の好ましい実施例では、終了アドレスの５番目のビット
がサンプル（標本抽出）され、保持される。言い換える
と、終了アドレスのA4ビットがサンプル（標本抽出）さ
れ、保存される。The start address trigger bit, end address trigger bit, and tag bit are transmitted from address decoder 50 to register 60 and stored there. The bits stored in register 60 can also be cleared and reset. Address decoder 50 can additionally sample a bit of the end address in write buffer 30 and transmit that bit over line 52 to register 60 for storage. In the preferred embodiment of the invention, the fifth bit of the end address is sampled and held. In other words, bit A4 of the end address is sampled and saved.

アドレス・デコーダ50により供給される開始アドレス
・トリガ・ビットは、三路転送動作中に更に書き込みが
行われることを抑止するために使用される。開始アドレ
ス・トリガ・ビットは更に、次の書き込み動作が開始ア
ドレスへの書き込みとなることを指示する。アドレス・
デコーダ50は、開始アドレスへの書き込みがあると、直
ちに終了アドレス・トリガ・ビットを生成する。終了ア
ドレス・トリガ・ビットは、次の書き込み動作が終了ア
ドレスへの書き込みとなることを指示する。The start address trigger bit supplied by address decoder 50 is used to inhibit further writes during a three-way transfer operation. The start address trigger bit also indicates that the next write operation will be a write to the start address. Address decoder 50 generates the end address trigger bit as soon as a write to the start address occurs. The end address trigger bit indicates that the next write operation will be a write to the end address.
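
The interplay of the two trigger bits can be sketched as a small state machine. This is a simplified illustration and not the decoder's actual logic: the class name `ThreeWayBits`, the trigger address value `0xFFF0`, and the clearing of the end trigger bit after the end-address write are assumptions made for the sketch. It also ignores interrupts and the case in which no end-address write is needed.

```python
class ThreeWayBits:
    """Tracks the start trigger, end trigger, and tag bits for one CPU."""

    def __init__(self, trigger_address):
        self.trigger_address = trigger_address  # hypothetical address value
        self.start_trigger = False
        self.end_trigger = False
        self.tag = None

    def on_write(self, address):
        """Classify a write and update the three-way bits."""
        if self.start_trigger:
            # The write following the trigger write is the start address:
            # clear the start trigger, set the end trigger, record bit A4.
            self.start_trigger = False
            self.end_trigger = True
            self.tag = (address >> 4) & 1
            return "start"
        if self.end_trigger:
            # The write following the start-address write is the end address.
            self.end_trigger = False  # assumption: cleared once consumed
            return "end"
        if address == self.trigger_address:
            self.start_trigger = True
            return "trigger"
        return "normal"


bits = ThreeWayBits(trigger_address=0xFFF0)
kinds = [bits.on_write(a) for a in (0x100, 0xFFF0, 0x20, 0x30, 0x40)]
```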

図１の右側のアドレス・デコーダ130及びレジスタ140
は、アドレス・デコーダ50及びレジスタ60とそれぞれ同
様の機能を果たす。アドレス・デコーダ130は、線114を
経て書き込みバッファ110から伝送されるアドレスをデ
コードする。アドレス・デコーダ130は、線122を経てデ
ータ・キャッシュ120に制御信号を伝送する。アドレス
・デコーダ130は、三路ビット（すなわちデコード又は
状態ビット）を、記憶をするため線132を経てレジスタ1
40に伝送する。三路ビットには、開始アドレス・トリガ
・ビット、終了アドレス・トリガ・ビット、タグ（標
識）ビット、及び、タグ（標識）ビットと同じ配列順の
終了アドレスのビットが含まれている。レジスタ140は
クリヤされることができ、レジスタ140に記憶された三
路ビットをリセットできる。レジスタ140は、線134,136
および138を経てアドレス・デコーダ130に結合されてい
る。このようにして、レジスタ140の内容は、線134,186
及び138を経てアドレス・デコーダ130に伝送される。Address decoder 130 and register 140 on the right side of FIG. 1 perform the same functions as address decoder 50 and register 60, respectively. Address decoder 130 decodes the address transmitted from write buffer 110 over line 114. Address decoder 130 transmits control signals to data cache 120 via line 122. Address decoder 130 transmits the three-way bits (that is, the decode or status bits) over line 132 to register 140 for storage. The three-way bits include a start address trigger bit, an end address trigger bit, a tag bit, and the bit of the end address in the same bit position as the tag bit. Register 140 can be cleared, and the three-way bits stored in register 140 can be reset. Register 140 is coupled to address decoder 130 via lines 134, 136, and 138, over which the contents of register 140 are transmitted to address decoder 130.

プログラム可能なアレイ論理素子（“PAL"）150は、
ディジタル・コンピュータの回路10のための制御論理を
備えている。PAL150は、線152を介して入力信号を受
け、線154を経て制御信号を発信する。PAL160も回路10
のための制御論理を備えている。PAL160は、線162を経
て入力を受け、線164を経て制御信号を発信する。Programmable array logic device ("PAL") 150 contains control logic for circuit 10 of the digital computer. PAL 150 receives input signals on line 152 and issues control signals on line 154. PAL 160 also contains control logic for circuit 10. PAL 160 receives input on line 162 and issues control signals on line 164.

PAL150及び160は、バス70のサイクル（周期）の全て
を制御する。PAL150及び160は、バス70がリクエストさ
れているか否かを判定する。PAL150及ぴ160は、読み出
し／書き込み動作が行われるべきか、又は三路転送動作
が行われるべきかを最終的に決定する。PAL150及び160
は、三路転送動作を監視する。PAL150及び160は、書き
込み動作を、三路転送動作中に、抑止すべきかどうかを
決定する。PAL150及び160は、レジスタ60及び140内の三
路ビットの状態を監視して、制御の決定を下すために三
路ビットを受ける。PAL150及び160は、割込みがあるか
どうか、及び三路転送動作中に回路10がどの段階にある
かに応じて、三路ビットが確実に保存、復元またはクリ
ヤされるようにする。PAL150及び160は、回路10の大部
分を通るアドレスとデータの流れ方向を制御するトラン
シーバ（図示せず）に制御信号を送る。PALs 150 and 160 control all cycles of bus 70. PALs 150 and 160 determine whether bus 70 is being requested. PALs 150 and 160 make the final decision on whether a read/write operation or a three-way transfer operation should be performed. PALs 150 and 160 monitor the three-way transfer operation and determine whether write operations should be inhibited during it. PALs 150 and 160 monitor the state of the three-way bits in registers 60 and 140 and receive the three-way bits in order to make control decisions. PALs 150 and 160 ensure that the three-way bits are saved, restored, or cleared, depending on whether an interrupt has occurred and on which stage of the three-way transfer operation circuit 10 is in. PALs 150 and 160 send control signals to transceivers (not shown) that control the direction of address and data flow through most of circuit 10.

本発明の好ましい実施例では、PAL150及び160は、フ
リップ−フロップ、状態機械（ステートマシン）及び論
理ゲートから成っている。本発明の別の実施例では、PA
L150及び160による制御はマイクロコード化されてい
る。PAL150及び160をプログラム、又はマイクロコード
化するために用いられる論理はディジタル情報の三路転
送を行うためにここに記載する方法に従う。In the preferred embodiment of the present invention, PALs 150 and 160 consist of flip-flops, state machines, and logic gates. In another embodiment of the invention, the control provided by PALs 150 and 160 is microcoded. The logic used to program or microcode PALs 150 and 160 follows the method described herein for performing a three-way transfer of digital information.

本発明の三路転送動作によって、データを図形サブシ
ステム90の図形管理ボード（図示せず）に転送する効率
のよい手段が得られる。The three-way transfer operation of the present invention provides an efficient means of transferring data to the graphics management board (not shown) of graphics subsystem 90.

本発明の好ましい実施例では、ディジタル・コンピュ
ータのディスプレイに図形画像をディスプレイするには
多角形タイリングが用いられる。多角形タイリングで
は、画像を複数個の比較的小さいポリゴンで構成する。
このような多角形の各々の座標は、図形サブシステム90
の図形パイプラインで受領できるように、バス70から図
形サブシステム90へと伝送されなければならない。各多
角形の各々の頂点には、X,Y及びＺの空間成分と、R,G,B
（すなわち赤、緑、青）のカラー・コードが関連づけら
れる。このように、各多角形の各頂点は、Ｘ語、Ｙ語、
Ｚ語、Ｒ語、Ｇ語及びＢ語を有しており、これらの語
は、線70を経て図形サブシステム90に伝送されなければ
ならない。これらの語は各々、浮動小数点の数値であ
る。６つの語が各頂点と関連づけられており、多角形に
は通常４つの頂点があると想定すると、多角形毎に24の
語があることになる。デイスプレイされる各画像が多く
の多角形から構成されることを想起すると、図形サブシ
ステム90には多数のデータ語が伝送されなければならな
い。更に、このような語は図形サブシステム90の性能を
最適化するため充分な高速度で図形サブシステム90に伝
送されなければならない。In the preferred embodiment of the invention, polygon tiling is used to display graphical images on the display of the digital computer. In polygon tiling, an image is composed of a plurality of relatively small polygons. The coordinates of each such polygon must be transmitted from bus 70 to graphics subsystem 90 so that they can be received by the graphics pipeline of graphics subsystem 90. Each vertex of each polygon has associated with it X, Y, and Z spatial components and R, G, B (that is, red, green, blue) color codes. Thus each vertex of each polygon has an X word, a Y word, a Z word, an R word, a G word, and a B word, and these words must be transmitted to graphics subsystem 90 over bus 70. Each of these words is a floating-point number. With six words associated with each vertex, and assuming that a polygon usually has four vertices, there are 24 words per polygon. Given that each displayed image is composed of many polygons, a large number of data words must be transmitted to graphics subsystem 90. Moreover, these words must be transmitted to graphics subsystem 90 fast enough to optimize its performance.
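
The word count per polygon can be checked with a short calculation. The sketch below also estimates the minimum number of four-word bus transfers per polygon, under the assumption (not stated in the text) that the polygon data starts on a line boundary; the function names are hypothetical.

```python
WORDS_PER_VERTEX = 6    # X, Y, Z, R, G, B
WORDS_PER_TRANSFER = 4  # four data words move in each bus transfer


def words_per_polygon(vertices=4):
    """Number of data words needed to describe one polygon."""
    return vertices * WORDS_PER_VERTEX


def min_bus_transfers(words):
    """Fewest four-word transfers that can carry `words` aligned words."""
    return -(-words // WORDS_PER_TRANSFER)  # ceiling division


quad_words = words_per_polygon()                # four-vertex polygon
quad_transfers = min_bus_transfers(quad_words)
```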

三路転送動作中にデータ・ブロックは、（１）主記憶
装置80、又は（２）キャッシュ０又はキャッシュ120
（キャッシュ40又はキャッシュ120がデータの専有コピ
ーを有している場合）のいずれかから、線70を経て図形
サブシステム90に転送される。転送されるデータ・ブロ
ックは、（１）データ・ブロック内の最初の語のアドレ
スである開始アドレスと、（２）データ・ブロック内の
最後の語のアドレスである終了アドレスと、（３）デー
タ・ブロック内の語数を表す語カウントとを有してい
る。During a three-way transfer operation, a data block is transferred to graphics subsystem 90 over bus 70 from either (1) main memory 80 or (2) cache 40 or cache 120 (when cache 40 or cache 120 holds the exclusive copy of the data). The data block to be transferred has (1) a start address, which is the address of the first word in the data block, (2) an end address, which is the address of the last word in the data block, and (3) a word count, which is the number of words in the data block.

図２は三路転送動作の一部として転送可能であるデー
タ・ブロックを例示している。図２の各“X"は、バス70
を経て転送されるデータ・ブロックの語を図示してい
る。前述のとおり、記憶装置80及びキャッシュ40及び12
0は、図２の301,302,304,306及び307のようなキャッシ
ュのバウンダリにより編成されている。キャッシュのバ
ウンダリは、４つのデータ語づつのグループに分ける。
バス動作すなわちバスを用いたデータ転送がなされる毎
に、同時に４つの語が転送される。FIG. 2 illustrates data blocks that can be transferred as part of a three-way transfer operation. Each "X" in FIG. 2 represents a word of a data block transferred over bus 70. As described above, storage device 80 and caches 40 and 120 are organized on cache boundaries such as 301, 302, 304, 306, and 307 in FIG. 2. The cache boundaries divide the data into groups of four data words. Every time a bus operation, that is, a data transfer over the bus, is performed, four words are transferred together.

図２に示すように、三路転送動作の一部として転送さ
れるべきデータのブロックは、種々の語長を有し得る。
データ・ブロックは、更に異なる開始アドレス及び異な
る終了アドレスを有することができる。例えば、図２の
列Ａのデータ・ブロックは、語カウント３と、開始アド
レス０と、終了アドレス２とを有している。更に、列Ａ
のブロックは、バウンダリと交叉しない。As shown in FIG. 2, blocks of data to be transferred as part of a three-way transfer operation can have various word lengths. Data blocks can also have different start addresses and different end addresses. For example, the data block in column A of FIG. 2 has a word count of 3, a start address of 0, and an end address of 2. Moreover, the block in column A does not cross a boundary.

列Ｂのデータ・ブロックは、開始アドレス１、終了ア
ドレス３、および語カウント３である。The data block in column B has start address 1, end address 3, and word count 3.

任意のデータをバス70を経て転送するには、同時に４
つの語を転送しなければならない。一方、バス70を経て
列Ａのデータ・ブロックを転送するには、アドレス０か
らアドレス３までのデータ語のシークェンスが転送され
なければならない。従って、図１のCPU20が、アドレス
０から開始されてアドレス２で終了する列Ａのデータ・
ブロックだけを転送したい場合でも、アドレス３にある
データも転送される。このように、回路10のキャッシュ
におけるデータ・ラインの構成では、任意の三路転送動
作の一部として余分なデータが転送されることがある。
同様に、CPU20が図２の列Ｂのデータ・ブロックを転送
したい場合は、開始アドレス１、終了アドレス３および
データ語３を有するデータ・ブロックと共に、アドレス
０にあるデータも転送される。Any transfer of data over bus 70 must move four words at a time. Thus, to transfer the data block in column A over bus 70, the sequence of data words from address 0 through address 3 must be transferred. Therefore, even if CPU 20 of FIG. 1 wants to transfer only the data block in column A, which starts at address 0 and ends at address 2, the data at address 3 is also transferred. In this way, because of how data lines are organized in the caches of circuit 10, extra data may be transferred as part of any three-way transfer operation. Similarly, if CPU 20 wants to transfer the data block in column B of FIG. 2, the data at address 0 is transferred along with the data block, which has start address 1, end address 3, and three data words.

このように、CPU20が転送したいデータ・ブロックの
全てが、隣接の２つのバウンダリにより挟まれた４つの
データ語のシークェンス内にある場合は、CPU20はその
４つの語シークェンスのどの特定の語を転送したいのか
を特定することができず、CPU20は、必要としているか
どうかにかかわりなく、４つの語を全て得る。換言する
と、アドレスの低位のビットが、４つのデータ語のシー
クェンス内において特定のデータ語が何処にあるかを特
定するだけの場合、アドレスのこの低位のビットは、無
意味なものであるということである。すなわち、CPU20
は、４つのデータ語のグループ中から単一のデータ語だ
けを得ることができず、同じグループの他のデータ語の
伝送も受けなければならない。キャッシュにおいてグル
ープとされた４つのデータ語は、まとめて、データのラ
インないしデータ・ラインと称されることもある。Thus, if the whole of the data block that CPU 20 wants to transfer lies within a sequence of four data words enclosed by two adjacent boundaries, CPU 20 cannot specify which particular words of that four-word sequence it wants transferred; CPU 20 gets all four words whether or not it needs them. In other words, to the extent that the low-order bits of an address only identify where a particular data word lies within a sequence of four data words, those low-order bits are meaningless. That is, CPU 20 cannot obtain just a single data word out of a group of four data words; it must also receive the other data words of the same group. The four data words grouped together in a cache are sometimes collectively referred to as a line of data or a data line.

図２の列Ｃのデータ・ブロックは、開始アドレス２、
終了アドレス４、及び語カウント３のデータ語である。
この列Ｃのデータ・ブロックは、バウンダリ302と交叉
する。しかし、バス70を経た単一のデータ転送では、２
つの隣接バウンダリの間にある４つのデータ語のシーク
ェンスだけを転送する。従って、列Ｃのブロックのため
の単一のバス転送動作では、アドレス０で開始され、ア
ドレス３で終了する４つのデータ語のシークェンスが転
送される。列Ｃのデータ・ブロックの中のアドレス４に
あるデータは、最初の単一バス転送動作の一部としては
転送されない。その理由は、バウンダリ301と302の間に
あるデータだけが単一バス転送動作の一部としては転送
されるからである。CPU20は、アドレス４にあるデータ
を転送するためには、次のバス転送動作を開始しなけれ
ばならない。勿論、この第２のバス転送動作の一部とし
て、アドレス４のデータが転送されるだけではなく、ア
ドレス5,6、及び７のデータも（これらがバウンダリ302
と304との間にあるので）、転送される。The data block in column C of FIG. 2 has a start address of 2, an end address of 4, and a word count of 3. This data block crosses boundary 302. A single data transfer over bus 70, however, transfers only a sequence of four data words lying between two adjacent boundaries. Therefore, a single bus transfer operation for the block in column C transfers the sequence of four data words starting at address 0 and ending at address 3. The data at address 4 in the column C data block is not transferred as part of this first bus transfer operation, because only the data between boundaries 301 and 302 is transferred as part of a single bus transfer operation. To transfer the data at address 4, CPU 20 must start a second bus transfer operation. Of course, as part of this second bus transfer operation, not only the data at address 4 but also the data at addresses 5, 6, and 7 is transferred (because these lie between boundaries 302 and 304).

同様に、開始アドレス３、終了アドレス５及び語カウ
ント３を有する列Ｄのデータ・ブロックの転送には、２
度の（三路転送動作のような）バス転送動作が必要とさ
れる。最初のバス転送動作中にアドレス0,1,2及び３に
あるデータ語が転送される。第２のバス転送動作中、ア
ドレス4,5,6及び７にあるデータ語が（データ語が隣接
のバウンダリ302と304の間にあるので）、転送される。
従って、三路転送動作中にデータ・ブロックを図形サブ
システム90に伝送するべき場合でも、４つのデータ語の
シークェンスが各々の単一の転送中に図形サブシステム
90に伝送される。図２の列Ｅは、三路転送動作の一部と
して転送されるべきデータ・ブロックが語カウント１を
有することができることを示している。図２の列Ｆは、
三路転送動作の一部として転送されるべきデータ・ブロ
ックが語カウント４を有することができることを示して
いる。列A,B,E及びＦに関する転送では、各々三路転送
動作の一部としてデータ線の単一の転送だけを必要とす
る。しかし、バウンダリ302と交叉する列Ｃ及び列Ｄの
データ・ブロックでは、三路転送動作の一部として２度
のバス転送を必要とする。列Ｅのデータ・ブロックの転
送は、アドレス1,2、及び３のデータが、アドレス０の
データと同様に転送されてしまうことを意味する。Similarly, transferring the data block in column D, which has start address 3, end address 5, and word count 3, requires two bus transfer operations (such as three-way transfer operations). During the first bus transfer operation, the data words at addresses 0, 1, 2, and 3 are transferred. During the second bus transfer operation, the data words at addresses 4, 5, 6, and 7 are transferred (because they lie between adjacent boundaries 302 and 304). Thus, even when a data block is to be transmitted to graphics subsystem 90 during a three-way transfer operation, a sequence of four data words is transmitted to graphics subsystem 90 during each single transfer. Column E of FIG. 2 shows that a data block to be transferred as part of a three-way transfer operation can have a word count of 1. Column F of FIG. 2 shows that such a data block can have a word count of 4. The transfers for columns A, B, E, and F each require only a single transfer of a data line as part of the three-way transfer operation. The data blocks in columns C and D, however, which cross boundary 302, require two bus transfers as part of the three-way transfer operation. Transferring the data block in column E means that the data at addresses 1, 2, and 3 is transferred along with the data at address 0.

列Ｇデータ・ブロックは、語カウント６、開始アドレ
ス１および終了アドレス６を有している。列Ｇのデータ
・ブロックは、更にキャッシュ境界302と交叉してい
る。従って、列Ｇのデータ・ブロック全体を転送するに
は、２度のバス転送が必要である。The column G data block has a word count of 6, starting address 1 and ending address 6. The data block in column G also intersects cache boundary 302. Therefore, to transfer the entire column G data block, two bus transfers are required.

図２の列Ｈでは、２つのバウンダリ302及び304と交叉
するデータ・ブロックを示している。列Ｈのデータ・ブ
ロックは、語カウント８、開始アドレス２及び終了アド
レス９を有している。列Ｈのデータ・ブロックを完全に
転送するには、三路転送動作の一部として３回のバス転
送が必要である。この場合もバス転送の一部としてアド
レス0,1,10,11の余分なデータが転送される。図２の列
Ｉのデータ・ブロックは、３つのバウンダリ302,304,30
6と交叉する。Column H of FIG. 2 shows a data block that crosses two boundaries, 302 and 304. The data block in column H has a word count of 8, a start address of 2, and an end address of 9. Transferring the column H data block in full requires three bus transfers as part of the three-way transfer operation. Here again, extra data at addresses 0, 1, 10, and 11 is transferred as part of the bus transfers. The data block in column I of FIG. 2 crosses three boundaries, 302, 304, and 306.

列Ｉのデータ・ブロックは、開始アドレス１、語カウ
ント14及び終了アドレス14を有している。列Ｉのデータ
・ブロック全体を転送するには、４回のバス転送が必要
である。この場合も、バス転送の一部としてアドレス０
及び15のデータも転送される。The data block in column I has a start address of 1, a word count of 14, and an end address of 14. Transferring the entire column I data block requires four bus transfers. Here again, the data at addresses 0 and 15 is also transferred as part of the bus transfers.
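
The transfer counts quoted for the FIG. 2 columns can be reproduced with a short sketch. It assumes word-indexed addresses and four-word lines starting at address 0; the names `bus_transfers`, `extra_words`, and `FIG2_BLOCKS` are hypothetical. Columns E and F are omitted because the text gives only their word counts.

```python
WORDS_PER_LINE = 4

# (start address, end address) for the columns whose addresses the text gives.
FIG2_BLOCKS = {
    "A": (0, 2), "B": (1, 3), "C": (2, 4), "D": (3, 5),
    "G": (1, 6), "H": (2, 9), "I": (1, 14),
}


def bus_transfers(start, end):
    """Number of four-word bus transfers needed to cover start..end."""
    return end // WORDS_PER_LINE - start // WORDS_PER_LINE + 1


def extra_words(start, end):
    """Addresses transferred on the bus that lie outside the data block."""
    lo = (start // WORDS_PER_LINE) * WORDS_PER_LINE
    hi = (end // WORDS_PER_LINE) * WORDS_PER_LINE + WORDS_PER_LINE - 1
    return [a for a in range(lo, hi + 1) if a < start or a > end]


transfers = {col: bus_transfers(s, e) for col, (s, e) in FIG2_BLOCKS.items()}
extras = {col: extra_words(s, e) for col, (s, e) in FIG2_BLOCKS.items()}
```

Under these assumptions the sketch gives one transfer for columns A and B, two for C, D, and G, three for H, and four for I, in agreement with the description.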

本発明に従って、図２に示されていない他の開始アド
レス、他の終了アドレス及び他の語カウントを有する他
のデータ・ブロックも転送可能であることが理解されよ
う。例えば、倍精度数を伝送する場合、４以上の語カウ
ントを有するデータ・ブロックを伝送できる。It will be appreciated that other data blocks having other starting addresses, other ending addresses and other word counts not shown in FIG. 2 can also be transferred in accordance with the present invention. For example, when transmitting a double precision number, a data block with a word count of 4 or greater can be transmitted.

図３は三路転送動作中に発生する事象のシークェンス
を示すタイミング図である。図３の上部にクロック信号
が図示されている。本発明の好ましい実施例では、サイ
クルタイム（周期時間）は、クロックの立ち上がり縁と
クロックの次の立ち上がり縁との間の時間の長さとして
定義される。FIG. 3 is a timing diagram showing the sequence of events that occur during a three-way transfer operation. The clock signal is shown at the top of FIG. In the preferred embodiment of the invention, the cycle time is defined as the length of time between the rising edge of the clock and the next rising edge of the clock.

図３の波形NAは“次のアドレス（Next Address）”線
を示す。図３の波形NAは、線34及び32上に書き込みバッ
ファ30から出力されるアドレスを示している。これらの
アドレスは、CPU20から発生され、線26を経て書き込み
バッファ30に伝送されたものである。Waveform NA in FIG. 3 indicates the "Next Address" line. Waveform NA in FIG. 3 shows the address output from write buffer 30 on lines 34 and 32. These addresses were generated by CPU 20 and transmitted to write buffer 30 via line 26.

図３の波形TAは、第２レベルのデータ・キャッシュ40
内に現れるタグ（標識）アドレス（Tag Address）を示
している。このタグ（標識）アドレスはCPU20から発生
されたものである。波形TAに現れるアドレスは、波形NA
に現れるアドレスから単に遅延したものである。Waveform TA in FIG. 3 shows the tag address appearing in second-level data cache 40. This tag address is generated by CPU 20. The address appearing on waveform TA is simply a delayed version of the address appearing on waveform NA.

図３の波形DAは、データ・キャッシュ40内に現れる第
２レベルのデータ・アドレス（Data Address）を示して
いる。波形DAに示したアドレスは、波形TAおよび波形NA
上のアドレスを単に遅延したものである。Waveform DA in FIG. 3 shows the second-level data address appearing in data cache 40. The address shown on waveform DA is simply a delayed version of the address on waveforms TA and NA.

波形CDは、書き込みバッファ30から線32に現れるデー
タを示している。このデータは、CPU20から発生され、
線26を経て書き込みバッファ30に伝送されたものであ
る。Waveform CD shows the data appearing on line 32 from write buffer 30. This data is generated by CPU 20 and transmitted to write buffer 30 via line 26.

別の見方をすると、波形NAは、CPU100から発生され
て、書き込みバッファ110から出て線114及び112に現れ
る次のアドレスを示すものである。同様に、波形TAはデ
ータ・キャッシュ120内のタグ（標識）アドレスを示す
と見ることができ、波形CDは、CPU100から発生されて、
書き込みバッファ110から出て線112に現れるデータを示
すと見ることができる。Viewed another way, waveform NA shows the next address generated by CPU 100 and appearing on lines 114 and 112 out of write buffer 110. Similarly, waveform TA can be viewed as showing the tag address in data cache 120, and waveform CD can be viewed as showing the data generated by CPU 100 and appearing on line 112 out of write buffer 110.

図３に示すように、CPU20は、三路トリガ・アドレス
への書き込みを行うことによって三路転送動作を開始す
る。図３では、“3WT"が三路トリガ・アドレスを表して
いる。これは波形NAでの最初のアドレスである。三路ト
リガ・アドレスは、CPU20から、線26を経て、書き込み
バッファ30に伝送される。書き込みバッファ30と線34お
よびアドレス・デコーダ50が、デコード回路を構成し、
この回路が書き込みバッファ30に送られたアドレスをデ
コードする。アドレス・デコーダ50は、CPU20による次
の書き込み動作が三路転送動作中に転送されるべきデー
タ・ブロックの開始アドレスへの書き込みであることを
指示する開始アドレス・トリガ・ビットをセットするた
めに、三路トリガ・アドレスをデコードする。アドレス
・デコーダ50による開始アドレス・トリガ・ビットのセ
ットによって、第２レベルのデータ・キャッシュ40に対
して、次の２回の書き込み動作が“特別”であることを
通知する機能も果たされる。アドレス・デコーダ50は、
前記の通知をする信号を、アドレス・デコーダ50と第２
レベルのデータ・キャッシュ40との間に結合された線42
を介して第２レベルのデータ・キャッシュ40に伝送す
る。開始アドレス・トリガ・ビットは、それをアドレス
・デコーダから線52を介して送られて、レジスタ60に保
存される。As shown in FIG. 3, CPU 20 starts a three-way transfer operation by writing to the three-way trigger address. In FIG. 3, "3WT" represents the three-way trigger address; it is the first address on waveform NA. The three-way trigger address is transmitted from CPU 20 over line 26 to write buffer 30. Write buffer 30, line 34, and address decoder 50 form a decoding circuit that decodes the address sent to write buffer 30. Address decoder 50 decodes the three-way trigger address in order to set the start address trigger bit, which indicates that the next write operation by CPU 20 will be a write to the start address of the data block to be transferred during the three-way transfer operation. The setting of the start address trigger bit by address decoder 50 also serves to notify second-level data cache 40 that the next two write operations are "special." Address decoder 50 transmits this notification to second-level data cache 40 over line 42, which is coupled between address decoder 50 and second-level data cache 40. The start address trigger bit is sent from the address decoder over line 52 and stored in register 60.

次の二回の書き込みは、“特別”な書き込みである。
両方の書き込みとも第２レベルのデータ・キャッシュ40
への書き込みが抑止される。両方の書き込みでは、更に
記憶装置80を変更することも抑止される。次の二回の書
き込みは、部分的書き込みである。本発明の好ましい実
施例では、次の二回の書き込みは、データ・キャッシュ
24を自動的に不当状態（インバリッド）にする。その理
由は、データ・キャッシュ24への部分的な書き込みが試
みられると、データ・キャッシュ24は不当になるように
構成されるからである。次の二回の書き込みが部分的な
書き込みであることは、更に、図形指令及び語カウント
を含んでいる低位バイトのデータが、最終的にバス70に
転送されることをも意味する。The next two writes are "special" writes. For both writes, writing to second-level data cache 40 is inhibited. For both writes, modification of storage device 80 is also inhibited. The next two writes are partial writes. In the preferred embodiment of the invention, the next two writes automatically put data cache 24 into the invalid state, because data cache 24 is configured to become invalid whenever a partial write to it is attempted. That the next two writes are partial writes also means that the low-order bytes of data, which contain the graphics command and the word count, are eventually transferred to bus 70.

三路転送動作は次の３つの種類の書き込みを含んでい
る。The three-way transfer operation includes the following three types of writing.

（１）トリガ・アドレスへの書き込み（２）最初の開始アドレスへの書き込み、又、データ・
ブロックがより長い場合は、それに後続の中間アドレス
への１回以上の書き込み（３）終了アドレスへの書き込み（但し必要な場合の
み）書き込み動作及び読み出し動作において、データは、
常にアドレスに従う。言い換えると、ディジタル情報の
転送に際してはアドレスが最初に現れる。アドレスが波
形NAに現れた語、データは波形CD上に現れる。図３の波
形CDにD0で示したデータは、使用されないデータを表し
ている。その理由は、CPU20がトリガ・アドレスへの書
き込み動作を行っており、従ってトリガ・アドレスへの
書き込みに関連するデータには関わりないからである。
(1) A write to the trigger address.
(2) A write to the initial start address and, if the data block is longer, one or more subsequent writes to intermediate addresses.
(3) A write to the end address (only if necessary).

In write operations and read operations, data always follows its address. In other words, in a transfer of digital information the address appears first. After the address appears on waveform NA, the data appears on waveform CD. The data marked D0 on waveform CD in FIG. 3 is data that is not used, because CPU 20 is performing a write to the trigger address and the data associated with that write is therefore irrelevant.

図３の波形MPREQは、CPU20からバス70へと伝送される
マイクロプロセッサバス要求信号を示している。図３に
示す波形MPGRNTは、バス70からCPU20およびCPU100に伝
送されるマイクロプロセッサバス許諾線信号を示してい
る。Waveform MPREQ in FIG. 3 shows the microprocessor bus request signal transmitted from CPU 20 to bus 70. Waveform MPGRNT in FIG. 3 shows the microprocessor bus grant signal transmitted from bus 70 to CPU 20 and CPU 100.

図３の波形NAに示すように、線32及び34に現れる次の
信号は書き込み動作用の開始アドレスA1である。書き込
み動作用の最初の開始アドレスが第２レベルのデータ・
キャッシュ40と遭遇すると、マイクロプロセッサバス70
が要求され、マイクロプロセッサバスの三路転送動作が
更に実行される。As shown on waveform NA in FIG. 3, the next signal appearing on lines 32 and 34 is start address A1 for the write operation. When this first start address for the write operation reaches second-level data cache 40, microprocessor bus 70 is requested and the three-way transfer operation on the microprocessor bus is carried out.

As soon as the CPU 20 finds that the waveforms MPREQ and MPGRNT have been driven low, the CPU 20 transmits the information appearing on waveform NA of FIG. 3 to the microprocessor bus 70. This is apparent from FIG. 3, given that waveform MPA represents the address appearing on the address lines of the microprocessor bus 70. Thus, after the start address A1 has appeared on the next-address waveform NA, and after the signals MPREQ and MPGRNT have been driven low, the start address A1 appears on waveform MPA. Before being received by the microprocessor bus 70, the start address has been transmitted from the CPU 20 over line 26 to the write buffer 30. The address decoder 50 decodes the start address (which is sent to the address decoder over line 34). As a result, the three-way start address trigger bit is cleared and the three-way end address trigger bit is set. The three-way start address trigger bit is cleared by transmitting a clear signal to register 60. The three-way end address trigger bit is set by the address decoder 50 and sent over line 52 to register 60, where it is stored. In this way the end address trigger bit is saved. The physical start address held in the write buffer 30 is then placed on the address lines of the microprocessor bus 70. This is indicated by A1 on waveform MPA of FIG. 3.

The word count and the graphics command are placed on the lower 16 data lines of the microprocessor bus 70 during the period normally reserved for the write-back of data. The word count is represented by "W" on waveform MPD (write) of FIG. 3. The graphics command is represented by "C" on waveform MPD (write) of FIG. 3. Waveform MPD (write) and waveform MPD (read) both represent data on the data lines of the microprocessor bus 70.

The graphics command specifies how the data being transferred as part of the three-way transfer operation is to be processed. One example of a graphics command is the command "draw polygon". Graphics commands are generated by the software of the digital computer shown as circuit 10 in FIG. 1.

Data is placed on the bus 70 at the beginning of a clock cycle. At the end of the clock cycle, the graphics subsystem 90 latches the data, whereby the data is transmitted from the bus 70 over line 92 to the graphics subsystem 90. The gap between the beginning and the end of the clock cycle therefore provides settling time for the data. By the time the word count and the graphics command reach the microprocessor bus 70, the graphics subsystem 90 has already latched the start address A1.

The 16-byte data line specified by the physical start address A1 is placed on the microprocessor bus 70 by the second-level cache 40 if, and only if, the cache 40 holds an exclusive copy of the modified 16-byte data line. The 16-byte data line specified by the physical address is placed on the microprocessor bus 70 by the main memory 80 if the main memory 80 holds the most recent version of the 16-byte data line and no other cache holds a modified exclusive copy of that data line.

Next, the graphics subsystem 90 captures, that is, latches, the 16-byte data line (i.e., the data block) from the microprocessor bus 70. Thus, as part of the three-way transfer operation, the graphics subsystem 90 captures the start address, the word count, the graphics command, and the 16-byte data line (i.e., the data block) from the microprocessor bus.

When a second write to a second address enters the data cache 40, a second bus transfer is initiated as part of the three-way bus transfer only if additional words remain that were not transferred during the first three-way transfer operation. This is determined by comparing the fifth address bit (i.e., bit A4) of the second write address with the fifth address bit (i.e., bit A4) saved from the first write address. If the bits differ, an additional microprocessor bus transfer operation is performed as part of the three-way transfer operation, as shown in FIG. 3.
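
The bit-A4 comparison described above can be sketched in Python as follows. This is only an illustration, assuming addresses are plain integers with bit A0 as the least significant bit; the name `needs_additional_transfer` is hypothetical and does not appear in the specification, which performs the comparison in hardware.

```python
A4_MASK = 1 << 4  # fifth address bit (A4), counting A0 as the least significant bit


def needs_additional_transfer(first_write_addr: int, second_write_addr: int) -> bool:
    """Return True when bit A4 of the two write addresses differs.

    With 16-byte data lines, bit A4 toggles between adjacent lines, so a
    difference indicates that the second write falls in a different line
    from the first (illustrative model only).
    """
    return ((first_write_addr ^ second_write_addr) & A4_MASK) != 0
```

For instance, under these assumptions addresses 0x1000 and 0x1010 differ in bit A4, whereas 0x1000 and 0x100C do not.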

The three-way end trigger bit is cleared regardless of whether a subsequent three-way transfer operation takes place.

Referring to FIG. 3, each of the waveforms MCWREQ0, MCWREQ1, MCWREQ2 and MCWREQ3 corresponds to a respective level of the write buffer 30. When one of these waveforms is driven low, it indicates that the corresponding level of the write buffer is full of data.

The waveforms WACK0, WACK1, WACK2 and WACK3 are the handshake signals associated with the respective levels of the write buffer 30.

The waveform CWREQ of FIG. 3 is simply the logical OR of all of the waveforms MCWREQ0 through MCWREQ3. CWREQ therefore indicates that at least one write is pending. The waveform WBSEL of FIG. 3 indicates which level of the write buffer 30 is currently active.

In one embodiment of the present invention, second, third, fourth, fifth and further write operations are initiated as part of the three-way transfer operation only if additional words remain that have not yet been transferred during the three-way transfer operation. As shown in FIG. 3, T1 represents the time of the first write operation on the bus 70 in the three-way transfer operation. Time T2 represents the second write operation on the bus 70. Time T3 represents the last write operation on the bus 70. Performing more write operations for a longer block of data words means that times T4, T5, T6 and so on follow until the data block with the large word count has been completely transferred to the subsystem 90. For example, the data block in column I of FIG. 2 would require four operations rather than the three shown in FIG. 3.

The last write operation of a three-way transfer always includes the transmission of the end address. Note that in FIG. 3 no trigger address is transmitted between the next-to-last write operation and the final write operation. On waveform NA of FIG. 3, a trigger address does appear between the first write operation at T1 and the second write operation at T2. For example, if there were four write operations as part of the three-way transfer operation, a write to the trigger address would occur between the second and third write operations, but not between the third and the last. The reason is that a write to the trigger address serves as a form of reset.
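
The ordering of trigger writes in this paragraph can be sketched as a small Python helper. This is a hedged illustration that follows only the FIG. 3 description given here: it assumes a trigger write precedes every bus write except the last one, and the function name and the labels "trigger", "A1", "A2" and so on are hypothetical.

```python
def write_sequence(num_bus_writes: int) -> list[str]:
    """List the CPU writes for a three-way transfer with the given number
    of bus write operations (T1, T2, ...), per the FIG. 3 description.

    Assumption: at least two bus writes, and a trigger write before every
    bus write except the final one, which carries the end address.
    """
    if num_bus_writes < 2:
        raise ValueError("this sketch only models transfers with two or more bus writes")
    sequence = []
    for i in range(1, num_bus_writes):
        sequence.append("trigger")   # reset-like write to the trigger address
        sequence.append(f"A{i}")     # start or intermediate address
    sequence.append(f"A{num_bus_writes} (end)")  # no trigger write before the end address
    return sequence
```

Under these assumptions, three bus writes give the order trigger, A1, trigger, A2, A3 (end), which matches the pattern described for waveform NA.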

When the three-way transfer operation is complete, the next write can proceed.

In one embodiment of the present invention, before the write to the end address is performed as part of the three-way transfer operation, a preliminary calculation is made to determine the number of writes that need to occur during the three-way transfer operation. The word count of the data block to be transferred is divided by N, where N is the number of data words lying between two adjacent cache boundaries. In other words, N is the number of data words transferred during any one bus transfer operation. Next, 1 is subtracted from the quotient. If the result of this subtraction contains a fraction of an integer, the result is rounded up to the next larger integer. For example, if the result is 1.25, it is rounded up to 2. This yields an integer X. In the preceding example, X is 2. If X is a positive integer other than 0, X write operations are performed after the first write operation of the three-way transfer operation and before the last write operation. This calculation can be carried out by software working in conjunction with PALs 150 and 160.
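
The preliminary calculation above can be written out as a short Python sketch. The function name `intermediate_writes` is hypothetical, and treating a zero or negative result as "no intermediate writes" is an assumption made for the sketch; the specification only states what happens when X is a positive integer.

```python
def intermediate_writes(word_count: int, words_per_transfer: int) -> int:
    """Compute X = ceil(word_count / N - 1), the number of write operations
    between the first and the last write of a three-way transfer.

    words_per_transfer is N, the number of data words moved per bus transfer.
    """
    if word_count < 0 or words_per_transfer < 1:
        raise ValueError("word_count must be >= 0 and N must be >= 1")
    remaining = word_count - words_per_transfer  # equals N * (word_count / N - 1)
    if remaining <= 0:
        return 0  # assumption: no intermediate writes are needed
    # Integer ceiling division avoids floating-point rounding.
    return -(-remaining // words_per_transfer)
```

As an illustration with assumed values, a word count of 9 and N = 4 gives 9 / 4 - 1 = 1.25, which rounds up to X = 2, in line with the 1.25 example in the text.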

In another embodiment of the present invention, the number of data words transferred during each bus transfer operation may be N, where N is an integer that is at least 1 and is greater than or equal to the word count. This means that N data words lie between adjacent cache boundaries.

Exceptions of the CPU 20, and hence context switches, can occur between any of the three writes associated with the three-way transfer of FIG. 3. The circuit shown in FIG. 1 is therefore provided with the ability to disable, save and restore the state produced by the first two writes. This ability is provided by saving the following: (1) the three-way start address trigger bit, (2) the three-way end address trigger bit, and (3) the tag bit (the fifth bit of the start address, i.e., bit A4). These bits can be stored in register 60 of FIG. 1. Circuit 10 of FIG. 1 is also provided with the ability to read and write these three bits in register 60. If an interrupt occurs during a three-way transfer operation, the operating system of the digital computer first reads and stores the state of the three bits held in register 60. The operating system then disables the actual bits in register 60. When the operating system returns to and restores the system state that existed immediately before the interrupt, it restores the three bits in register 60 to their state immediately before the interrupt. To make this possible, the operating system reads and stores the state of the three bits before they are disabled.
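
The save, disable and restore sequence can be modeled in Python as follows. This is a minimal sketch and not the operating system code of the embodiment: the class and function names are hypothetical, and "disabling" the bits is modeled as clearing all three, which is an assumption.

```python
from dataclasses import dataclass, replace


@dataclass
class ThreeWayState:
    """The three bits the text says are kept in register 60."""
    start_trigger: bool = False  # three-way start address trigger bit
    end_trigger: bool = False    # three-way end address trigger bit
    tag_a4: bool = False         # tag bit: bit A4 of the start address


class Register60:
    """Hypothetical software model of register 60 with read/write access."""

    def __init__(self) -> None:
        self._state = ThreeWayState()

    def read(self) -> ThreeWayState:
        return replace(self._state)  # return a copy

    def write(self, state: ThreeWayState) -> None:
        self._state = replace(state)


def save_and_disable(reg: Register60) -> ThreeWayState:
    """On an interrupt: read and store the bits first, then disable them."""
    saved = reg.read()
    reg.write(ThreeWayState())  # assumption: disabling means clearing the bits
    return saved


def restore(reg: Register60, saved: ThreeWayState) -> None:
    """When resuming: put the bits back to their pre-interrupt state."""
    reg.write(saved)
```

The order matters in this model: the bits are read before they are cleared, so the interrupted three-way transfer can resume with the same trigger and tag state.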

In another embodiment of the present invention, the write performed as part of the three-way transfer operation could be made to a serial port, for example, instead of to the graphics subsystem 90. Likewise, the write performed as part of the three-way transfer operation could be made to a second memory instead of to the graphics subsystem 90.

In yet another embodiment of the present invention, a data block could be transferred directly from the subsystem 90 over the bus 70 to the memory 80 as part of the three-way transfer operation.

The invention has thus far been described with reference to specific embodiments. It will be evident, however, that many changes and modifications may be made without departing from the spirit and scope of the invention as set forth in the appended claims. The specification and drawings are accordingly to be regarded in an illustrative rather than a restrictive sense.

Continuation of front page
(72) Inventor: Germorak, Thomas Allan, 892 St. Joseph Avenue, Los Altos, California 94025, United States
(56) References: JP-A-63-70386 (JP, A); JP-A-3-31947 (JP, A)
(58) Fields searched (Int. Cl.⁷, DB name): G06F 13/38 - 13/42; G06F 12/08 - 12/12; G06F 13/10 - 13/14; G06F 15/16

Claims

(57) [Claims]

1. In a digital computer having a central processing unit, a decoding circuit coupled to the central processing unit, a cache memory coupled to the decoding circuit, a bus coupled to the cache memory, a main memory coupled to the bus, and a graphics subsystem coupled to the bus, a method of performing a special write-initiated transmission operation on digital information consisting of a start address and a data block, wherein in the special transmission operation the digital information is transmitted to the graphics subsystem, (1) the start address is the address of the first word of the data block, and (2) a word count represents the number of words in the data block, the method comprising the steps of:
(a) initiating the special transmission by performing a first write operation of the central processing unit to transmit a trigger address from the central processing unit to the decoding circuit;
(b) decoding the trigger address with the decoding circuit and preventing the main memory and the cache memory from being modified by (1) a second write operation of the central processing unit subsequent to the first write operation and (2) a third write operation of the central processing unit subsequent to the second write operation;
(c) performing the second write operation of the central processing unit to transmit, from the central processing unit to the cache memory and the bus, (1) the start address and (2) data including the word count and a graphics command;
(d) transmitting the start address from the bus to the main memory and the graphics subsystem;
(e) transmitting the data including the word count and the graphics command from the bus to the main memory and the graphics subsystem;
(f) if the cache memory holds a modified sequence of N data words that is exclusive to that cache memory, transmitting the modified sequence directly from the cache memory to the graphics subsystem via the bus without sending it to the central processing unit that initiated the special transmission, wherein N is an integer greater than or equal to the word count, one of the data words in the sequence of N data words is located at the start address, and the data block is contained in the sequence of N data words; and
(g) if the main memory holds the most recent version of the sequence of N data words and the cache memory does not hold a modified sequence exclusive to that cache memory, transmitting the most recent version of the sequence of N data words directly from the main memory to the graphics subsystem via the bus without sending it to the central processing unit that initiated the special transmission.