JPH06282492A

JPH06282492A - Method and apparatus for removal of page table entry from plurality of address conversion buffers coupled to plurality of processors in multiprocessor system

Info

Publication number: JPH06282492A
Application number: JP5114182A
Authority: JP
Inventors: M Hayes Norman; ノーマン・エム・ヘイーズ; Pradeep Sindhu; プラディープ・シンドウ; Jean-Marc Frailong; ジャン−マルク・フレイロング; Sunil Nanda; スニル・ナンダ
Original assignee: Xerox Corp; Sun Microsystems Inc
Current assignee: Xerox Corp; Sun Microsystems Inc
Priority date: 1992-04-17
Filing date: 1993-04-19
Publication date: 1994-10-07
Also published as: KR100278034B1; KR930022215A

Abstract

PURPOSE: To remove a page table entry from the address translation buffers ("TLBs") coupled to the processors of a multiprocessor system. CONSTITUTION: In outline, the method comprises: a step of sending a request packet to remove the page table entry from a first TLB 113; a step of sending the request packet, which specifies a particular source, a first-mode address, and a process identifier, onto a packet-switched bus toward a second TLB 123; a step in which the second TLB 123 receives the request packet on the packet-switched bus; a check step in which the second TLB 123 decides whether it contains the page table entry by comparing the first-mode address and the process identifier; and a step of removing the page table entry from the second TLB 123 when the second TLB 123 contains it.

Description

Detailed Description of the Invention

[0001]

FIELD OF THE INVENTION The present invention relates to the operation of virtually addressed memory in a multiprocessor computer system, and more particularly to address translation buffers in such a system.

[0002]

BACKGROUND OF THE INVENTION In computer systems it is very common for the central processing unit ("CPU") to include a cache memory to speed up accesses to the system's main memory. A cache memory is small but operates much faster than main memory, and it sits operationally between the CPU and main memory. While a software program runs, the cache holds the most frequently used instructions and data. Whenever the processor needs information from main memory, it first examines the cache. Only when the processor cannot find the instruction or data in the cache, that is, on a cache miss, must the slower main memory be accessed. In this way the cache reduces the CPU's average memory access time. For more information on cache memory, see John L. Hennessy and David A. Patterson, "Computer Architecture: A Quantitative Approach" (Morgan Kaufmann Publishers, 1990).
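The access-time argument above can be made concrete with the standard average-access-time relation. The symbols and the sample numbers below are assumptions chosen for illustration; they do not come from this specification.

```latex
% t_hit  : cache access time
% m      : miss rate (fraction of accesses that miss the cache)
% t_miss : additional penalty for going to main memory
t_{\text{avg}} = t_{\text{hit}} + m \cdot t_{\text{miss}}
\qquad\text{e.g.}\qquad
t_{\text{avg}} = 1 + 0.05 \times 20 = 2 \ \text{cycles}
```

With these assumed values, a 20-cycle main-memory penalty costs only one extra cycle per access on average, which is the sense in which the cache shortens the average access time.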

[0003] In today's computer technology, processing is carried out only in main memory ("physical memory"), while the programmer or user perceives a much larger memory allocated on external disk ("virtual memory"). Virtual memory makes highly efficient multiprogramming possible and frees the user from being unnecessarily constrained by main memory. To address virtual memory, many processors include a translator that converts virtual addresses in virtual memory into physical addresses in physical memory, and an address translation buffer ("TLB") that caches recently generated virtual-to-physical address pairs. The TLB is important because, when a translated pair already exists, the mapping process can be skipped and main memory accessed quickly. A TLB entry resembles a cache entry: the tag holds part of the virtual address, and the data portion typically holds the physical page frame number, a protection field, a used bit, and a dirty bit.
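The TLB entry layout described above can be sketched as a small lookup table. This is a minimal model, not the patented hardware: the 4 KiB page size, the class names `TlbEntry` and `Tlb`, and the choice to key entries by process identifier plus virtual page number (the identification used later in this document) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

PAGE_SHIFT = 12  # assumed 4 KiB pages; the source does not state a page size


@dataclass
class TlbEntry:
    frame: int            # physical page frame number
    protection: int       # protection field
    used: bool = False    # used bit
    dirty: bool = False   # dirty bit


class Tlb:
    """Toy TLB: the tag is (process id, virtual page number)."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[int, int], TlbEntry] = {}

    def insert(self, process_id: int, vaddr: int, entry: TlbEntry) -> None:
        self.entries[(process_id, vaddr >> PAGE_SHIFT)] = entry

    def translate(self, process_id: int, vaddr: int) -> Optional[int]:
        """Return the physical address on a hit, or None on a TLB miss."""
        entry = self.entries.get((process_id, vaddr >> PAGE_SHIFT))
        if entry is None:
            return None
        entry.used = True
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        return (entry.frame << PAGE_SHIFT) | offset


tlb = Tlb()
tlb.insert(7, 0x4000, TlbEntry(frame=0x2A, protection=0b101))
hit = tlb.translate(7, 0x4123)    # same process, same page: hit
miss = tlb.translate(8, 0x4123)   # different process: miss
```

On a hit the page walk is skipped entirely, which is why a stale entry left in any TLB is dangerous.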

[0004] When the virtual-to-physical page mapping of a given process is swapped out or thrashed according to the needs of the process, the mapping must be discarded. Otherwise, because virtual addresses are reused by each process, the next process that issues the same virtual address could end up obtaining the mapping left by the previous process. In a computer system with a single processor, a flush command is typically sent to the TLB to demap the target page.
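The stale-mapping hazard and the single-processor flush can be illustrated with a toy example. The class `SimpleTlb` and its method names are hypothetical, and the table is deliberately keyed by virtual page number alone so that the reuse problem is visible.

```python
from typing import Dict, Optional


class SimpleTlb:
    """Toy single-processor TLB keyed by virtual page number only."""

    def __init__(self) -> None:
        self.map: Dict[int, int] = {}

    def fill(self, vpn: int, frame: int) -> None:
        self.map[vpn] = frame

    def lookup(self, vpn: int) -> Optional[int]:
        return self.map.get(vpn)

    def flush_page(self, vpn: int) -> None:
        # Demap the target page; flushing an absent page is harmless.
        self.map.pop(vpn, None)


tlb = SimpleTlb()
tlb.fill(0x40, 0x2A)          # process A maps virtual page 0x40
# Process A's page is swapped out. Without a flush, process B reusing
# virtual page 0x40 would still see process A's frame:
stale = tlb.lookup(0x40)
tlb.flush_page(0x40)          # the flush command removes the mapping
after_flush = tlb.lookup(0x40)
```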

[0005] However, in a multiprocessor computer system with shared memory, each processor on the system bus may hold a copy of the page mapping, so sending individual flush commands to the processors is an inefficient task. It is possible to send interrupt requests to the processors, as is common in uniprocessor systems, but interrupting every processor means taking control of the system bus and halting whatever each processor is executing. Moreover, on accepting the interrupt request each processor must issue the same flush command to its own TLB and then, after gaining control of the system bus, respond to the processor that issued the command. As processes grow more complex and the number of processors increases, system-wide interrupts occur almost constantly, because each processor may be running jobs across the whole system and issuing flush commands to all the other processors.

[0006]

PROBLEM TO BE SOLVED BY THE INVENTION It is therefore an object of the present invention to provide a broadcast page removal technique for all processors in a multiprocessor computer system.

[0007] It is a further object to provide a broadcast page removal technique for all processors in a multiprocessor computer system without incurring the penalties associated with sending and receiving interrupt requests.

[0008]

MEANS FOR SOLVING THE PROBLEM To solve the above problems, a method and apparatus are disclosed for removing a page table entry from a plurality of address translation buffers ("TLBs") coupled to a plurality of processors in a multiprocessor system. The method comprises the steps of: sending, by a first controller of a first TLB, a request packet to remove the page table entry from the first TLB; sending the request packet, which specifies a predetermined source, the first-mode address, and the process identifier, onto the packet-switched bus, where it is broadcast to a second controller coupled to a second TLB; receiving the request packet on the packet-switched bus by the second controller; checking, by the second controller comparing the first-mode address and the process identifier, whether the second TLB contains the page table entry; completing the operations pending for the second processor; removing the page table entry from the second TLB by the second controller if the page table entry is stored in the second TLB; issuing, by the second controller, a response packet indicating completion to the first controller; and sending the response packet, which identifies the source, onto the packet-switched bus for delivery to the first controller.

[0009] NOTATION AND TERMINOLOGY The detailed description that follows is presented largely in terms of algorithms and symbolic representations of operations within a computer system. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to convey the substance of their work most effectively to others skilled in the art.

[0010] An algorithm is here, and generally, considered to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to those quantities.

[0011] Further, the manipulations performed are often referred to in terms such as adding or comparing, which are commonly associated with mental operations performed by a human operator. In most cases no such capability of a human operator is necessary, or desirable, in any of the operations described here that form part of the present invention; the operations are machine operations. Machines useful for performing the operations of the present invention include general-purpose digital computers and similar devices. In all cases the distinction between the method of operating a computer and the method of computation itself should be kept in mind. The present invention relates to method steps for operating a computer in processing electrical or other (for example mechanical or chemical) physical signals to generate other desired physical signals.

[0012] The present invention also relates to apparatus for performing these operations. The apparatus may be specially constructed for the required purposes, or it may be a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. The algorithms disclosed here are not inherently related to any particular computer or other apparatus. In particular, various general-purpose machines may be used with programs written in accordance with the teachings of this specification, or it may prove more convenient to construct more specialized apparatus to perform the required method steps. The structure required for a variety of these machines will be apparent from the description below.

[0013] CODING DETAILS No particular programming language is indicated for carrying out the various procedures disclosed here. One reason is that not every language that might be mentioned is universally available. Each user of a particular computer knows the language best suited to the purpose at hand. In practice, it has proven useful to implement the present invention in assembly language, which provides machine-executable object code. Because the computers and monitor systems that may be used in practicing the invention consist of many diverse elements, no detailed program listing is provided. The operations and other procedures described here and illustrated in the accompanying drawings are disclosed in sufficient detail to permit one of ordinary skill to practice the invention.

[0014]

EMBODIMENT A method and apparatus for deallocating memory translation pages in a multiprocessor computer system are disclosed. In the following description, for purposes of explanation, specific memories, organizations, architectures, and the like are set forth to provide a thorough understanding of the invention. It will be apparent to those skilled in the art, however, that the invention can be practiced without these specific details. In other instances, well-known circuits are shown in block diagram form so as not to obscure the invention unnecessarily.

[0015] Referring now to FIG. 1, a simplified block diagram of a multiprocessor computer system is shown. Processor 110 is coupled to system bus 100 through cache controller 111 and bus watcher 112. Processor 110 also uses address translation buffers ("TLBs"), such as I/O TLB 114 and system TLB 113, in conjunction with its cache (not shown) to store virtual-to-physical address mappings that have already been translated. Processors 120 and 130 are likewise coupled to system bus 100 through their respective cache controllers and bus watchers. When processor 110 needs to demap the virtual-to-physical address mapping of one or more virtual pages, its cache controller 111 broadcasts a demap request packet on system bus 100 through bus watcher 112. When the other bus watchers 122 and 132 receive the demap request packet, they forward it to their respective cache controllers 121 and 131 for execution. After the other cache controllers have completed their demaps, removing the target page if it is present, cache controllers 121 and 131 send demap response packets through their respective bus watchers 122 and 132 to inform bus watcher 112 of the requesting processor 110 that the demap is complete in all other caches. Bus watcher 112 then sends a demap response to inform processor 110, which sent the demap request packet, that all other processors have finished.
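The request and response flow of FIG. 1 can be sketched as a small software simulation. This is only an illustrative model under stated assumptions: the class names, the method names, and the use of a Python set as a TLB are hypothetical, and the real mechanism is bus hardware rather than function calls.

```python
from typing import Dict, List, Set, Tuple

Key = Tuple[int, int]  # (process identifier, virtual page), an assumed encoding


class CacheController:
    def __init__(self, name: str) -> None:
        self.name = name
        self.tlb: Set[Key] = set()

    def demap(self, key: Key) -> bool:
        """Remove the entry if present; report completion either way."""
        self.tlb.discard(key)
        return True


class BusWatcher:
    def __init__(self, controller: CacheController) -> None:
        self.controller = controller

    def on_demap_request(self, key: Key) -> bool:
        # Forward the request to the local cache controller and relay its reply.
        return self.controller.demap(key)


class SystemBus:
    def __init__(self) -> None:
        self.watchers: List[BusWatcher] = []

    def attach(self, watcher: BusWatcher) -> None:
        self.watchers.append(watcher)

    def broadcast_demap(self, sender: BusWatcher, key: Key) -> bool:
        """Deliver the request to every other watcher; True once all have replied."""
        replies = [w.on_demap_request(key) for w in self.watchers if w is not sender]
        return all(replies)


bus = SystemBus()
nodes: Dict[str, BusWatcher] = {}
for name in ("cpu110", "cpu120", "cpu130"):
    nodes[name] = BusWatcher(CacheController(name))
    bus.attach(nodes[name])

key = (7, 0x40)
for watcher in nodes.values():
    watcher.controller.tlb.add(key)        # every TLB holds the mapping

nodes["cpu110"].controller.demap(key)      # requester removes its own copy
done = bus.broadcast_demap(nodes["cpu110"], key)
```

No processor is interrupted in this model: the bus watchers and cache controllers handle the demap, and the requester simply waits for the collected completion.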

[0016] In this embodiment the demap request packet consists of two cycles. In the header cycle the address field is set to all zeros to indicate a broadcast packet, while the item to be demapped is specified in the data cycle as shown in FIG. 2. The demap response packet acknowledges the earlier demap request packet. It is currently two cycles long; the second cycle is unused and its contents are "don't care". Internally, a demap transaction is generated by the processor's memory flush operation. The information transferred for a demap includes the virtual address ("VFPA" 21) and Type 22, which are used as the criteria for matching pages to be removed from the TLB. The Process_Id 20 used for the demap transaction, commonly known in the field as the "context", is broadcast in bits 47 through 32 as shown in FIG. 2. The lower 32 bits are equivalent to the data format of a flush operation.
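The data-cycle layout just described can be sketched as bit packing. Only two facts are taken from the text: Process_Id occupies bits 47 through 32, and the lower 32 bits carry the flush-format word. The exact positions of VFPA and Type inside that word are not given here, so the sketch treats it as an opaque 32-bit value; the function names are hypothetical.

```python
from typing import Tuple

PROCESS_ID_SHIFT = 32
PROCESS_ID_MASK = 0xFFFF        # bits 47..32 give a 16-bit field
FLUSH_WORD_MASK = 0xFFFFFFFF    # lower 32 bits, flush-operation format
BROADCAST_ADDRESS = 0           # header-cycle address field is all zeros


def pack_demap_data(process_id: int, flush_word: int) -> int:
    """Build the 64-bit data-cycle word of a demap request."""
    if not 0 <= process_id <= PROCESS_ID_MASK:
        raise ValueError("process_id must fit in 16 bits")
    if not 0 <= flush_word <= FLUSH_WORD_MASK:
        raise ValueError("flush_word must fit in 32 bits")
    return (process_id << PROCESS_ID_SHIFT) | flush_word


def unpack_demap_data(data: int) -> Tuple[int, int]:
    """Recover (process_id, flush_word) from a data-cycle word."""
    return (data >> PROCESS_ID_SHIFT) & PROCESS_ID_MASK, data & FLUSH_WORD_MASK


def make_demap_request(process_id: int, flush_word: int) -> Tuple[int, int]:
    """Return the two cycles of a request: (header address, data word)."""
    return BROADCAST_ADDRESS, pack_demap_data(process_id, flush_word)


data = pack_demap_data(0x1234, 0xDEADB000)
fields = unpack_demap_data(data)
request = make_demap_request(0x1234, 0xDEADB000)
```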

[0017] In addition to broadcasting demap transactions over the system bus, a processor can accept external demaps. Referring to FIG. 1, when a demap request from processor 110 is received by processor 120, processor 120 performs the demap as if the request had been generated internally, except that it uses the supplied process identifier rather than its current internal process identifier. At present, the processor requires a single ready response for a demap operation. The system hardware guarantees that a demap transaction is broadcast to, and completed by, every cache controller in the system. At present, received demaps use a two-phase request/response protocol. To reduce the amount of state that the bus watchers must hold, only a single demap transaction may be pending in the system at any one time.
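Two points from this paragraph lend themselves to a short sketch: the receiver matches on the supplied process identifier rather than its own current one, and at most one demap is pending system-wide. How the hardware enforces the single-pending rule is not described here, so the `DemapArbiter` class below is purely a hypothetical stand-in, as are all other names.

```python
from typing import Optional, Set, Tuple


class DemapArbiter:
    """Hypothetical gate that admits one system-wide demap at a time."""

    def __init__(self) -> None:
        self.pending: Optional[str] = None

    def begin(self, requester: str) -> bool:
        if self.pending is not None:
            return False            # another demap is still pending
        self.pending = requester
        return True

    def complete(self, requester: str) -> None:
        if self.pending != requester:
            raise RuntimeError("no pending demap from this requester")
        self.pending = None


class Processor:
    def __init__(self, current_process_id: int) -> None:
        self.current_process_id = current_process_id
        self.tlb: Set[Tuple[int, int]] = set()

    def external_demap(self, supplied_process_id: int, vpn: int) -> None:
        # Match on the identifier carried in the request, not on
        # self.current_process_id.
        self.tlb.discard((supplied_process_id, vpn))


arbiter = DemapArbiter()
first = arbiter.begin("cpu110")
second = arbiter.begin("cpu120")      # refused while cpu110's demap is pending

cpu120 = Processor(current_process_id=3)
cpu120.tlb.update({(7, 0x40), (3, 0x40)})
cpu120.external_demap(7, 0x40)        # removes only process 7's mapping

arbiter.complete("cpu110")
third = arbiter.begin("cpu120")
```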

[0018] Referring to FIG. 3, a timing diagram of a processor-initiated demap transaction is shown. Here the demap transaction is used by a processor to remove a page table entry from all TLBs in the multiprocessor system. In this kind of system, a demap may act on an I/O TLB or on the system TLBs of other processors, or it may simply be reflected by the cache controller with no action taken by that processor. Demap timing is similar to that of a swap; that is, two responses are currently required. The first response uses WRDY_ to acknowledge acceptance of the demap request. The second response, the RRDY_ signal, informs the processor that the demap has completed successfully throughout the system. Both ready signals, WRDY_ and RRDY_, may be asserted at the same time as shown in FIG. 3, but two distinct ready responses must be given. An exception on the demap can be reported by asserting an exception signal (not shown) together with the RRDY_ response.
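The two-response rule can be sketched as a tiny state tracker on the initiating side. In this sketch a boolean `True` means "asserted", independent of the electrical polarity of WRDY_ and RRDY_; the class and attribute names are assumptions for illustration.

```python
class DemapInitiator:
    """Tracks the two ready responses required for a demap."""

    def __init__(self) -> None:
        self.wrdy_seen = False    # request accepted
        self.rrdy_seen = False    # demap completed system-wide
        self.exception = False    # exception reported with RRDY_

    def sample(self, wrdy: bool, rrdy: bool, exception: bool = False) -> None:
        """Record the ready signals observed in one bus cycle."""
        if wrdy:
            self.wrdy_seen = True
        if rrdy:
            self.rrdy_seen = True
            self.exception = exception

    @property
    def done(self) -> bool:
        return self.wrdy_seen and self.rrdy_seen


# Responses arriving in separate cycles.
split = DemapInitiator()
split.sample(wrdy=True, rrdy=False)
done_after_first = split.done
split.sample(wrdy=False, rrdy=True)

# Both responses asserted in the same cycle, as FIG. 3 allows.
same_cycle = DemapInitiator()
same_cycle.sample(wrdy=True, rrdy=True)
```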

[0019] As shown in FIG. 3, a demap is signaled by asserting the Demap-bar and Address-Strobe-bar signals. All of the information for the demap, including the virtual address, the process identifier, and the command information, is passed on data lines [63:0] during the data cycle. The address is "don't care" for a demap and is set to all zeros as illustrated.

[0020] If a system is coupled to processors that choose to take no action in response to a demap, that system must still respond to the demap with two RRDY_ assertions. In this case the cache controller can hold RRDY_/WRDY_ active for two consecutive cycles.

[0021] Referring now to FIG. 4, a timing diagram of an external demap request transaction is shown. When a demap is broadcast, the received demap uses a two-phase protocol. The first phase is the external demap request 407. The second phase is the response 408 to that request. Other bus activity is permitted between the request and the response.

[0022] The request phase 407 of the demap is sent by an external bus master asserting Address-Strobe-bar 400 and Demap-bar 401. The data cycle [63:0] 402 must contain a demap command 402 in the format shown in FIG. 2. The command needs to be placed on the bus for a single cycle. The processor may not be able to respond to the request right away, because pending operations must be completed before it responds to the demap request.

[0023] Once the external request 407 has been processed internally, the processor initiates the demap response 408 transaction. The response 408 appears on the bus much like an internally generated demap; the main difference is that it is signaled as a read, RD_ 403, rather than a write. Note that WR_ remains unchanged throughout. The response is signaled by the processor asserting Address-Strobe-bar 404 and Demap-bar 405. To complete the transaction, RRDY_ 406 must be returned by the system logic.
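The two-phase handling on the receiving side (request 407, then response 408) can be sketched as follows. The signal names in the returned dictionary mirror the text, with `True` meaning "asserted"; the class, its attributes, and the representation of pending operations as strings are assumptions for illustration only.

```python
from collections import deque
from typing import Deque, Dict, List, Set, Tuple


class ReceivingProcessor:
    def __init__(self) -> None:
        self.tlb: Set[Tuple[int, int]] = set()
        self.pending_ops: Deque[str] = deque()
        self.completed: List[str] = []

    def on_external_demap_request(self, process_id: int, vpn: int) -> Dict[str, bool]:
        # Phase 1 (request 407): finish pending operations before responding.
        while self.pending_ops:
            self.completed.append(self.pending_ops.popleft())
        # Remove the entry if this TLB holds it.
        self.tlb.discard((process_id, vpn))
        # Phase 2 (response 408): looks like an internally generated demap,
        # but is signaled as a read (RD_) and leaves WR_ unasserted.
        return {"address_strobe": True, "demap": True, "rd": True, "wr": False}


cpu = ReceivingProcessor()
cpu.tlb.add((7, 0x40))
cpu.pending_ops.extend(["store A", "load B"])
response = cpu.on_external_demap_request(7, 0x40)
```

In the protocol described above, the system logic would then return RRDY_ to close the transaction; that final step is outside this sketch.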

[0024]

EFFECT OF THE INVENTION With the above configuration, a broadcast page removal technique can be provided for all processors without incurring the penalties associated with sending and receiving interrupt requests.

[Brief description of drawings]

FIG. 1 is a simplified block diagram of a multiprocessor computer system.

FIG. 2 is a symbolic representation of the data cycle in a broadcast demap request.

FIG. 3 is a timing diagram of a processor-initiated demap transaction.

FIG. 4 is a timing diagram of an external demap request transaction.

[Explanation of symbols]

20 Process_ID
21 VFPA (virtual address)
22 Type
100 System bus
110 Processor
111 Cache controller
112 Bus watcher
113 System TLB (address translation buffer)
114 I/O TLB
120 Processor
121 Cache controller
122 Bus watcher
123 TLB
130 Processor
131 Cache controller
132 Bus watcher
133 TLB

Continuation of front page

(71) Applicant: 591174933 Xerox Corporation, 800 Long Ridge Road, Stamford, Connecticut 06904-1600, United States
(72) Inventor: Norman M. Hayes, 1121 Merrimack Drive, Sunnyvale, California 94087, United States
(72) Inventor: Pradeep Sindhu, 1557 Montalto Drive, Mountain View, California 94040, United States
(72) Inventor: Jean-Marc Frailong, 408 Pepper Avenue, Palo Alto, California 94306, United States
(72) Inventor: Sunil Nanda, 1225 Sandalwood Lane, Los Altos, California 94024, United States

Claims

[Claims]

1. A method of removing a page table entry from a plurality of address translation buffers ("TLBs") coupled to a plurality of processors in a multiprocessor system in which data and command packets are transferred on a packet-switched bus, wherein each processor has a controller that controls reading from and writing to its respective TLB, the page table entry represents an address mapping between a first-mode address and a second-mode address, and the page table entry is identified by its first-mode address and a process identifier, the method comprising the steps of: when a page table entry in a first TLB coupled to a first processor contains an invalid address mapping between the first address mode and the second address mode, sending, by a first controller of the first TLB, a request packet to remove the page table entry from the first TLB; sending the request packet, which specifies a predetermined source, the first-mode address, and the process identifier, onto the packet-switched bus for delivery to a second controller coupled to a second TLB; receiving the request packet on the packet-switched bus by the second controller; checking, by the second controller comparing the first-mode address and the process identifier, whether the second TLB contains the page table entry; completing the operations in progress for the second processor; removing, by the second controller, the page table entry from the second TLB if the page table entry is stored in the second TLB; issuing, by the second controller, a response packet indicating completion to the first controller; and sending the response packet, which identifies the source, onto the packet-switched bus for delivery to the first controller.

2. A circuit for removing one of a plurality of page table entries from a plurality of address translation buffers coupled to a plurality of processors of a multiprocessor system having shared memory, wherein each address translation buffer stores a plurality of page table entries, the page table entries represent mappings between virtual addresses and physical addresses, and each page table entry is identified by its process identifier and virtual address, the circuit comprising: a packet-switched bus for transferring data and command packets among the plurality of processors of the multiprocessor system; a transceiver, coupled to each processor and to the packet-switched bus, for sending and receiving data and command packets, the transceiver searching the packet-switched bus for data and command packets whose predetermined destination address designates the corresponding processor, accepting those data and command packets, accepting request packets on the packet-switched bus, and sending response packets; and a controller, coupled to the transceiver, to each processor, and to the corresponding address translation buffer, for reading and writing the address translation buffer, wherein, if the mapping between a corresponding virtual address and physical address is not valid, the controller sends a request packet to remove the page table entry from the plurality of address translation buffers holding that page table entry, and the controller sends onto the packet-switched bus a response packet indicating that removal of the page table entry is complete.

3. An apparatus for removing a page table entry from a plurality of address translation buffers ("TLBs") coupled to a plurality of processors in a multiprocessor system in which data and command packets are transferred on a packet-switched bus, wherein each processor has a controller that controls reading from and writing to its respective TLB, the page table entry represents an address mapping between a first-mode address and a second-mode address, the page table entry is identified by its first-mode address and a process identifier, and the page table entry has, in a first TLB coupled to a first processor, an invalid address mapping between the first address mode and the second address mode, the apparatus comprising: means for sending, by a first controller of the first TLB, a request packet to remove the page table entry from the first TLB; means for sending the request packet, which specifies a predetermined source, the first-mode address, and the process identifier, onto the packet-switched bus for delivery to a second controller coupled to a second TLB; means for receiving the request packet on the packet-switched bus by the second controller; comparing means, coupled to the second controller, for determining whether the second TLB contains the page table entry by comparing the first-mode address and the process identifier; means for completing the operations pending for the second processor; means for removing, by the second controller, the page table entry from the second TLB if the page table entry is stored in the second TLB; means for sending, by the second controller, a response packet indicating completion to the first controller; and means for sending the response packet, which identifies the source, onto the packet-switched bus for delivery to the first controller.