JP2008204488A

JP2008204488A - Multi-processor device

Info

Publication number: JP2008204488A
Application number: JP2008141295A
Authority: JP
Inventors: Yukio Nakamoto; 幸夫中本
Original assignee: Renesas Technology Corp
Current assignee: Renesas Technology Corp
Priority date: 2008-05-29
Filing date: 2008-05-29
Publication date: 2008-09-04

Abstract

PROBLEM TO BE SOLVED: To solve problems such as reduced processing speed, because the monitoring process takes processing time and a write-back cache cannot be used, and increased cost, because an inexpensive cache memory cannot be used.
SOLUTION: A shared bus terminal 11a of a CPU 11 is connected to a global shared bus 15b, and a bus terminal of a local cache memory 12 is connected to a global non-shared bus 15a. The global shared bus 15b is connected to an external shared memory 19b storing shared information used by the CPU, and the global non-shared bus 15a is connected to an external non-shared memory 19a storing non-shared information used by the CPU.
COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a multiprocessor device in which a plurality of processor units are connected to a global bus.

FIG. 21 is a block diagram showing a conventional multiprocessor device. In the figure, reference numerals 1 and 1A denote processor units, each of which has a CPU 5 and a cache memory 6 that is write-through and has a write monitoring function. The cache memory 6 of each processor unit 1, 1A is connected to a common global bus 2, and the global bus 2 is connected to an external memory 4 via an interface 3. Only the data cache is at issue here; the instruction cache is not, and is therefore omitted from the figure.

Next, the operation will be described.
The CPU 5 exchanges the data needed for processing with the external memory 4 via the global bus 2 and the interface 3. Because the global bus 2 and the interface 3 operate slowly, they become a bottleneck, and the CPU 5 cannot reach its inherent processing speed.

A method was therefore devised to improve speed by holding the contents of the external memory 4 that the CPU 5 uses frequently close to the CPU. The local cache memory 6 is a memory provided near the CPU 5 that holds those frequently used contents of the external memory 4.

The operation of the local cache memory 6 is described below.
1. Read by local cache memory.

Suppose the CPU 5 reads address 0013 of the external memory 4. The local cache memory 6 checks whether it holds the contents of address 0013. If it does, it returns them to the CPU 5. The CPU 5 can thus operate at high speed without using the slow global bus 2 and interface 3.

If the contents of address 0013 are not in the local cache memory 6, the local cache memory 6 selects, from the contents it holds, an entry that the CPU 5 is unlikely to use for some time (the selection method is not essential to the present invention, so its description is omitted), erases it (as described later, it is written to memory and then erased from the cache), and transfers address 0013 and its contents into the freed entry. As a result, when the CPU 5 reads address 0013 a second or later time, the local cache memory 6 holds the contents of address 0013 and they can be read at high speed. This mechanism is called purging.
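The read path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text leaves the selection method for the purged entry open, so least-recently-used replacement is assumed here, and the class name `ReadCache`, the `capacity` parameter, and the memory contents are hypothetical.

```python
from collections import OrderedDict


class ReadCache:
    """Small read cache in front of a slow external memory.

    LRU replacement is an assumption; the source does not specify
    how the entry to purge is selected.
    """

    def __init__(self, memory, capacity=2):
        self.memory = memory        # dict: address -> value (stands in for external memory 4)
        self.capacity = capacity    # number of cache entries (hypothetical)
        self.lines = OrderedDict()  # address -> cached value, least recently used first
        self.bus_reads = 0          # reads that had to use the slow bus and interface

    def read(self, address):
        if address in self.lines:             # hit: no bus access needed
            self.lines.move_to_end(address)
            return self.lines[address]
        if len(self.lines) >= self.capacity:  # miss with a full cache: purge one entry
            self.lines.popitem(last=False)
        self.bus_reads += 1                   # fetch over the global bus and interface
        self.lines[address] = self.memory[address]
        return self.lines[address]


memory = {0x0013: 42, 0x0014: 7, 0x0015: 9}
cache = ReadCache(memory)
first = cache.read(0x0013)   # miss: fetched from external memory
second = cache.read(0x0013)  # hit: served from the cache
print(first, second, cache.bus_reads)
```

Only the first read of address 0013 uses the bus; the second is served locally, which is the speed-up the text describes.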

2. Write by local cache memory.
There are two methods by which the CPU 5 writes to the external memory 4. One is called the write-back method, and the other is called the write-through method.

First, the write-through method will be described. When the CPU 5 writes to address 0013 of the external memory 4, the local cache memory 6 checks, as in the case of a read, whether it holds the contents of address 0013. If it does, the contents of address 0013 in the local cache memory are rewritten and the external memory 4 is rewritten as well. If it does not, the local cache memory 6 erases other contents that it judges the CPU 5 will not use, writes the contents of address 0013 into the freed entry, and also writes them to the external memory 4. As a result, the slow global bus 2 and interface 3 are used on every write.

Next, the write-back method will be described. The write-back method differs from the write-through method in the timing of the write to memory. On a write, the local cache memory 6 is written but the external memory 4 is not; the external memory 4 is written only when the local cache memory 6 purges the contents. As a result, the slow global bus 2 and interface 3 are used only on a purge, and processing is faster than with the write-through method.
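The difference in bus traffic between the two write methods can be illustrated with a small model. This is a hedged sketch, not the circuit in the source: the class `WriteCache`, the explicit `purge` call, and the unlimited cache capacity are simplifying assumptions made for illustration.

```python
class WriteCache:
    """Cache that writes to external memory either immediately or on purge."""

    def __init__(self, memory, policy):
        assert policy in ("write-through", "write-back")
        self.memory = memory   # dict: address -> value (stands in for external memory 4)
        self.policy = policy
        self.lines = {}        # address -> cached value
        self.dirty = set()     # addresses modified but not yet written to memory
        self.bus_writes = 0    # writes that used the slow bus and interface

    def write(self, address, value):
        self.lines[address] = value
        if self.policy == "write-through":
            self.memory[address] = value  # every write goes over the bus
            self.bus_writes += 1
        else:
            self.dirty.add(address)       # defer the memory write until purge

    def purge(self, address):
        value = self.lines.pop(address)
        if address in self.dirty:         # write-back: memory is updated only now
            self.dirty.discard(address)
            self.memory[address] = value
            self.bus_writes += 1


def bus_writes_for(policy, n_writes=100):
    """Write the same address n_writes times, then purge it."""
    memory = {0x0013: 0}
    cache = WriteCache(memory, policy)
    for i in range(n_writes):
        cache.write(0x0013, i)
    cache.purge(0x0013)
    return cache.bus_writes, memory[0x0013]


print(bus_writes_for("write-through"))
print(bus_writes_for("write-back"))
```

Both policies leave the same final value in memory, but in this model write-through uses the bus once per write while write-back uses it once per purge.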

3. Application of the local cache memory in a multiprocessor device
When the local cache memory described above is applied to a multiprocessor, it must operate by the write-through method and must have a "monitoring function" that monitors what other CPUs write.

First, the reason the write-through method must be used (that is, the reason the write-back method cannot be used) is as follows. When address 0013 is written by the write-back method, the contents are not written to the external memory 4 until they are purged. Consequently, even if another CPU reads address 0013, it can read only the old contents until the purge takes place.

On the other hand, even with the write-through method, if the cache of another CPU holds the contents of address 0013, those contents are not updated. Each local cache memory 6 must therefore monitor writes by the other local cache memories and, when a write occurs, compare the written address with the address information it holds and invalidate its own entry if the addresses match.
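The need for the monitoring function can be shown with a minimal model of two write-through caches. This is a sketch under assumptions: the class `SnoopingCache`, its `monitor` flag, and the direct peer list standing in for the global bus are all hypothetical.

```python
class SnoopingCache:
    """Write-through cache that can optionally watch writes by other caches."""

    def __init__(self, memory, monitor):
        self.memory = memory    # dict shared by all caches (stands in for external memory 4)
        self.monitor = monitor  # True: invalidate on observed writes
        self.lines = {}         # address -> cached value
        self.peers = []         # other caches on the same bus

    def read(self, address):
        if address not in self.lines:
            self.lines[address] = self.memory[address]
        return self.lines[address]

    def write(self, address, value):
        self.lines[address] = value
        self.memory[address] = value  # write-through: memory is always current
        for peer in self.peers:       # the write is visible on the bus
            peer.observe_write(address)

    def observe_write(self, address):
        if self.monitor:
            self.lines.pop(address, None)  # invalidate the local copy if present


def read_after_remote_write(monitor):
    memory = {0x0013: 1}
    a = SnoopingCache(memory, monitor)
    b = SnoopingCache(memory, monitor)
    a.peers, b.peers = [b], [a]
    b.read(0x0013)      # b now caches the old value
    a.write(0x0013, 2)  # a updates its cache and memory
    return b.read(0x0013)


print(read_after_remote_write(monitor=False))
print(read_after_remote_write(monitor=True))
```

Without monitoring, the second cache keeps returning its old copy even though memory is current; with monitoring, the copy is invalidated and the next read fetches the new value.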

In cache memory configurations for multiprocessor devices, various methods have been devised to keep the local cache memories consistent with one another, or with the shared memory. For example, Japanese Examined Patent Publication Nos. 2-22757 and 4-175946 divide the memory to be accessed according to whether data is shared or non-shared, monitor writes of shared data by the method described above, and thereby invalidate the local cache memory.

U.S. Pat. No. 4,939,641 describes a method in which shared/non-shared information is held in the cache memory and the cache is read and written by the write-back method for non-shared data and by the write-through method for shared data. Multiprocessor and cache memory configurations of this kind, collectively "with write monitoring", are countless, and some take the monitoring function as a precondition.
Japanese Examined Patent Publication No. 2-22757
Japanese Examined Patent Publication No. 4-175946
U.S. Pat. No. 4,939,641

Because the conventional multiprocessor device is configured as described above, it has the following problems.

The first is the time required for the monitoring process.
If the monitoring process is performed on every write, the CPU cannot use the local cache memory while it runs, and the operating speed of the CPU falls. For example, suppose a process performs 1,000,000 reads at 1 clock per read and 10,000 writes at 4 clocks per write (because the cache is write-through, every write goes over the bus), and that the monitoring process takes 2 clocks per write. If five CPUs perform this process simultaneously, the writes of all CPUs total 5 [CPUs] × 10,000 [writes] = 50,000, so the monitoring process requires 50,000 × 2 = 100,000 clocks. Since the time excluding the monitoring process is 1,000,000 + 10,000 × 4 = 1,040,000 clocks, the monitoring process lengthens the processing time by nearly 10%.

In the same example, if the number of writes is 20,000, the processing time excluding the monitoring process is 1,000,000 + 20,000 × 4 = 1,080,000 clocks and the monitoring process takes 5 × 20,000 × 2 = 200,000 clocks, which is about 20% longer. With 20,000 writes and 10 CPUs, the monitoring process takes 400,000 clocks, which is nearly 40% longer. As these examples show, the monitoring time is in general proportional to the number of CPUs and cache memories and to the number of writes.
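The monitoring overhead in these examples follows a simple cost model, sketched below. The function name `monitoring_overhead` and its parameter names are hypothetical; the clock counts are the ones stated in the text. Under this model, the quoted figures of 1,080,000 and 200,000 clocks correspond to 20,000 writes per CPU.

```python
def monitoring_overhead(reads, writes, cpus, read_clk=1, write_clk=4, monitor_clk=2):
    """Return (base clocks, monitoring clocks, monitoring / base).

    Each CPU spends monitor_clk clocks on every write made by any of the
    cpus processors, on top of its own write-through processing time.
    """
    base = reads * read_clk + writes * write_clk
    monitoring = cpus * writes * monitor_clk
    return base, monitoring, monitoring / base


for writes, cpus in [(10_000, 5), (20_000, 5), (20_000, 10)]:
    base, monitoring, ratio = monitoring_overhead(1_000_000, writes, cpus)
    print(f"writes={writes} cpus={cpus}: base={base} monitoring={monitoring} (+{ratio:.1%})")
```

The overhead grows linearly with both the number of CPUs and the number of writes, which is the proportionality stated in the text.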

The second is the loss of processing speed caused by being unable to use a write-back cache.
Suppose the same processing is performed with a write-back cache, 50% of the writes hit the cache memory, and a write that hits takes 1 clock. The processing time (excluding monitoring time) is then 1,000,000 × 1 clock + 5,000 × 1 clock + 5,000 × 4 clocks = 1,025,000 clocks, about 2% shorter than with write-through. If the number of writes is doubled, the time is likewise 1,050,000 clocks, about 3% shorter. The higher the hit rate, the shorter the processing time with a write-back cache. In a multiprocessor, however, as described above, other CPUs can read only old contents when a write-back cache is used, so only the slower write-through cache could be used.

The third is cost.
If a multiprocessor system with a write monitoring function were implemented on a single chip, the monitoring process would add functions to the cache memory. Adding the monitoring function means that the ordinary cache memory already available as a library component cannot be used, or must be revised. If revision is required, design time increases accordingly. The added function also increases the chip layout area. As a result of the longer design time and larger layout area, both the development cost and the manufacturing cost of the chip rise.

Procuring the monitoring function as a component external to the chip is also problematic. A plain write cache or a write-through-only cache can be obtained cheaply. This is because demand for single processors is currently high, and a single processor requires no write monitoring.

However, a cache memory with some form of "write monitoring" function as described above is not readily available at low cost. This is because multiprocessors are currently used only in specialized fields and their market is small, so the parts are produced in small quantities and become expensive.

The present invention has been made to solve the above problems of the prior art. Its object is to obtain a multiprocessor device that requires no write monitoring process for the cache memory, reduces the load on the bus and on the data cache, and achieves faster data cache processing.

A multiprocessor device according to the present invention comprises: processor units each including a CPU and a first cache memory that is connected to the CPU and exclusively stores information for that CPU; a first shared bus connecting each of the plurality of processor units; a second cache memory that is connected to the plurality of processor units and stores information shared by the plurality of processor units; and an interface unit that is connected to the first shared bus and outputs a control signal for accessing an external memory storing shared information shared by the plurality of processor units.

The second cache memory of the multiprocessor device according to the present invention is a cache memory positioned at a lower level than the first cache memory, and is connected to the first shared bus.

According to the present invention, a cache memory is provided on the shared data bus, so that the speed can be further increased.

An embodiment of the present invention will be described below.
Embodiment 1.
FIG. 1 is a block diagram showing the configuration of a multiprocessor device according to Embodiment 1 of the present invention, for the case where a write-back cache is used and there is no shared cache.

Here, "shared" does not denote a resource that every CPU uses, transfers through, or stores in; it simply denotes a resource that transfers or stores "shared data". Likewise, "non-shared" does not denote a resource that only a single CPU uses, transfers through, or stores in; it simply denotes a resource that transfers or stores "non-shared data". A resource used by only a single CPU is called a "dedicated" resource, and a resource used by a plurality of CPUs is called a "common" resource.

In FIG. 1, reference numeral 11 denotes an i-th CPU equipped with a device that determines, from the instruction or from the address to be accessed, whether the data to be read or written is shared data or non-shared data, and that can select a bus according to the result. The determination method is described in Embodiment 10 and later. The i-th CPU 11 has an (i, 1) shared bus terminal 11a and an (i, 1) non-shared bus terminal 11b, and the bus is selected by determining, from the instruction or the address to be accessed, whether the data is shared or non-shared.

Reference numeral 12 denotes an (i, 1) local cache memory that has no function for monitoring writes from other CPUs. The (i, 1) local cache memory 12 handles only data. As noted above, programs in principle need not be rewritten, so the instruction cache is omitted from the figure. The (i, 1) local cache memory 12 has an (i, 1) CPU-side bus terminal 12a and an (i, 1) external-side bus terminal 12b. The (i, 1) CPU-side bus terminal 12a is connected to the (i, 1) non-shared bus terminal 11b. The (i, 1) local cache memory 12 is a dedicated resource of the i-th CPU 11.

Reference numeral 13a denotes an (i, 1) local non-shared bus connected to the (i, 1) external-side bus terminal 12b of the (i, 1) local cache memory 12, and 13b denotes an (i, 1) local shared bus connected to the (i, 1) shared bus terminal 11a of the i-th CPU 11.

Reference numeral 14 denotes an i-th processor unit including the i-th CPU 11, the (i, 1) local cache memory 12, the (i, 1) local non-shared bus 13a, and the (i, 1) local shared bus 13b.

The i-th processor unit 14 has an (i, 1) unit non-shared bus terminal 14a and an (i, 1) unit shared bus terminal 14b, which are connected to the (i, 1) local non-shared bus 13a and the (i, 1) local shared bus 13b, respectively. The total number of processor units is I. Reference numeral 14A denotes an (i+1)-th processor unit adjacent to the i-th processor unit 14, with the same configuration as the i-th processor unit 14.

Reference numeral 15a denotes a first global non-shared bus, which is connected to the (i, 1) unit non-shared bus terminal 14a of the i-th processor unit 14. The first global non-shared bus 15a is a bus for transferring non-shared data from each CPU to an external non-shared memory 19a. It includes a bus arbiter (not shown) that arbitrates access requests from each local non-shared bus terminal 14a. This resource is common to all CPUs (processor units).

Reference numeral 15b denotes a first global shared bus, which is connected to the (i, 1) unit shared bus terminal 14b of the i-th processor unit 14. The first global shared bus 15b is a bus for transferring shared data from each CPU to an external shared memory 19b. It includes a bus arbiter (not shown) that arbitrates access requests from each local shared bus terminal 14b. This resource is common to all CPUs (processor units).

Reference numeral 17a denotes a first non-shared interface, through which the external non-shared memory 19a and the like are accessed. This resource is common to all CPUs (processor units).

Reference numeral 17b denotes a first shared interface, through which the external shared memory 19b and the like are accessed. This resource is common to all CPUs (processor units).

The non-shared memory 19a is a memory for storing non-shared data. It need not be dedicated to each processor unit and may be common to the processor units. The areas of this (common) non-shared memory 19a that each CPU writes to are assumed to be divided, for example, by address. Specifically, if the non-shared memory 19a is assigned to addresses 0000 to 7FFF, the first CPU is assigned the area 0000 to 0FFF, the second CPU the area 1000 to 1FFF, and so on. In this example, the area 0000 to 0FFF of the non-shared memory 19a is therefore "dedicated" to the first CPU.
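The address-based division of the non-shared memory can be sketched as follows. The text gives only the areas of the first and second CPUs; treating every CPU as owning an equally sized area of 0x1000 addresses is an assumption, and the function names are hypothetical.

```python
NON_SHARED_BASE = 0x0000  # start of non-shared memory 19a in the example
NON_SHARED_END = 0x7FFF   # end of non-shared memory 19a in the example
REGION_SIZE = 0x1000      # per-CPU area size (assumed equal for every CPU)


def private_region(cpu_index):
    """Return the inclusive (start, end) area dedicated to the given CPU (1-based)."""
    start = NON_SHARED_BASE + (cpu_index - 1) * REGION_SIZE
    end = start + REGION_SIZE - 1
    if cpu_index < 1 or end > NON_SHARED_END:
        raise ValueError(f"no dedicated area for CPU {cpu_index}")
    return start, end


def owner_of(address):
    """Return the 1-based index of the CPU whose dedicated area contains the address."""
    if not NON_SHARED_BASE <= address <= NON_SHARED_END:
        raise ValueError(f"address {address:#06x} is outside the non-shared memory")
    return (address - NON_SHARED_BASE) // REGION_SIZE + 1


print([f"{a:04X}" for a in private_region(1)])
print([f"{a:04X}" for a in private_region(2)])
print(owner_of(0x0013), owner_of(0x1ABC))
```

Because the areas do not overlap, no two CPUs ever write the same non-shared address, which is what makes each area effectively dedicated even though the memory device is common.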

Reference numeral 19b denotes a shared memory, which is a memory for storing shared data. The areas of the shared memory and the non-shared memory on the address map must not overlap.

(Constraints on what lies beyond the interfaces in the above configuration)
The bus configuration beyond the first non-shared interface 17a and the first shared interface 17b is not essential to the present invention. Accordingly, one memory block may be readable only via the first non-shared interface 17a, while another memory block may be readable via either interface. However, a memory block that holds shared data must be accessible via the first shared interface 17b, and non-shared data must likewise be accessed via the first non-shared interface 17a.

In the following, for convenience of explanation, the case where the shared memory 19b and the non-shared memory 19a are arranged as shown in the figure is described.

(Description of non-shared data and the work area)
The present invention keeps contents used only by a single process (the contents of its work area) closed within the local cache, and writes contents used by a plurality of processes to a single memory only, never placing them in any local cache. This eliminates the write monitoring process and thereby increases speed and reduces cost. A program in which five CPUs compute the average scores of five subjects is used here as an example.

The memory contents in a multiprocessor include contents whose writes must be shared and contents whose writes need not be shared. For example, suppose there is a score database and five CPUs compute the average scores for "English", "Mathematics", "Japanese", "Science", and "Social Studies".

In this case, a memory location for the total "English" score and a memory location for the number of samples are required, but these are not needed to compute the averages for "Mathematics" and the other subjects. A storage area that other processes do not need is generally called a work area. Since other CPUs do not need to know the contents of the work area, they are stored in the (i, 1) local cache memory 12.

This operation is described below. When the CPU accesses the work area, it judges the access to be "non-shared data", selects the (i, 1) CPU non-shared bus terminal 11b, and performs the access. The (i, 1) local cache memory 12 searches for the contents of the address according to the access information from the CPU and, if they are present, returns them to the CPU. If they are not present, the cache memory requests access to the non-shared memory 19a via the (i, 1) local non-shared bus 13a, the first global non-shared bus 15a, and the first non-shared interface 17a.

When arbitration frees the first global non-shared bus 15a and the (i, 1) local cache memory 12 fetches the contents of the address from the non-shared memory 19a, the (i, 1) local cache memory 12 takes in a copy of them. Because each non-shared area of the non-shared memory 19a is "dedicated" to one CPU, no other CPU writes to it, and the other CPUs are unaffected.

From the second access onward, as long as the (i, 1) local cache memory 12 holds the contents of the fetched address, the i-th CPU 11 accesses only the (i, 1) local cache memory 12. Moreover, since other CPUs do not need to know the contents of the (i, 1) local cache memory 12, they need not monitor writes to it even when its contents are rewritten.

(Description of shared data and its operation)
Suppose that deviation values are subsequently computed from the overall average scores of the subjects in order to gauge the difficulty of each subject. The average score obtained for each subject is needed to compute the deviation values and must therefore be shared. Contents of this kind, which other CPUs (other processes) need later, are accessed in the shared memory 19b from the (i, 1) local shared bus 13b through the first global shared bus 15b and the first shared interface 17b, and are not stored in the local cache memory 12.

This operation is described below. The i-th CPU 11 judges the data to be shared data and accordingly selects the (i, 1) CPU shared bus terminal 11a. From the (i, 1) local shared bus 13b connected to this terminal, the i-th CPU 11 requests access to the shared memory 19b via the first global shared bus 15b and the first shared interface 17b. When arbitration frees the first global shared bus 15b, the i-th CPU 11 accesses the corresponding address of the shared memory 19b.

The following explains why no write monitoring device is needed when this operation is a write. Once the write from the i-th CPU 11 has completed, the shared memory 19b holds the most recent written information. When another CPU then reads the data at the same address from the shared memory 19b, the shared memory 19b is certain to hold the latest contents, and the other CPU obtains them. Furthermore, because no local cache memory anywhere takes in shared data, there is no need for the write monitoring that has until now been taken for granted in parallel processing.
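The routing described for Embodiment 1 can be modeled in a short sketch: non-shared accesses go through a write-back local cache with no monitoring, while shared accesses bypass the cache and go directly to the shared memory. The address threshold `SHARED_BASE`, the class `ProcessorUnit`, and the explicit `flush` call are hypothetical; the source defers the actual shared/non-shared determination method to later embodiments.

```python
SHARED_BASE = 0x8000  # hypothetical rule: addresses at or above this hold shared data


class ProcessorUnit:
    """One CPU with a write-back local cache for non-shared data only."""

    def __init__(self, non_shared_memory, shared_memory):
        self.non_shared_memory = non_shared_memory  # stands in for memory 19a
        self.shared_memory = shared_memory          # stands in for memory 19b
        self.local_cache = {}                       # address -> value, never monitored
        self.dirty = set()                          # cached writes not yet in memory 19a

    @staticmethod
    def is_shared(address):
        return address >= SHARED_BASE

    def read(self, address):
        if self.is_shared(address):
            return self.shared_memory[address]      # shared bus: never cached
        if address not in self.local_cache:
            self.local_cache[address] = self.non_shared_memory[address]
        return self.local_cache[address]

    def write(self, address, value):
        if self.is_shared(address):
            self.shared_memory[address] = value     # immediately visible to all CPUs
        else:
            self.local_cache[address] = value       # write-back: stays local for now
            self.dirty.add(address)

    def flush(self):
        for address in self.dirty:                  # purge dirty work-area data
            self.non_shared_memory[address] = self.local_cache[address]
        self.dirty.clear()


non_shared = {0x0010: 0, 0x1010: 0}
shared = {0x8000: 0}
cpu1 = ProcessorUnit(non_shared, shared)
cpu2 = ProcessorUnit(non_shared, shared)

cpu1.write(0x0010, 450)      # work area: running total, private to cpu1
cpu1.write(0x8000, 90)       # shared result: an average score
print(cpu2.read(0x8000))     # cpu2 sees the latest shared value without monitoring
print(non_shared[0x0010])    # the work-area write has not reached memory yet
cpu1.flush()
print(non_shared[0x0010])
```

Since shared data is never held in any local cache, a reader always gets the current value from the shared memory, and the private write-back caches never need to observe each other.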

(Effect 1 of separating shared from non-shared data and keeping shared data out of the cache memory)
Because other CPUs do not need to know each time the contents of the work area are rewritten, the (i, 1) local cache memory 12 need not be a write-through cache and may be a write-back cache. In other words, the contents of the work area stay closed within the (i, 1) local cache memory 12. The work area is normally accessed very frequently.

Contents that are never written from the beginning to the end of processing (constants and the like) may also be accessed via the (i, 1) local cache memory 12. Because such contents are never changed, caching them does not affect other processes.

(Effect 2 of separating shared and non-shared data and not caching shared data)
Separating accesses according to whether the contents are shared or non-shared shows that the (i, 1)-th local cache memory 12 can be a write-back cache with no monitoring function. Suppose there are 1,000,000 reads, of which 5,000 go through the shared bus, and 10,000 writes, of which 5,000 go through the shared bus. Suppose further that reads and writes through the local cache, which is a write-back cache, take 1 clock and that reads and writes through the shared bus take 4 clocks. The time required for this processing is then (995,000 + 5,000) × 1 + (5,000 + 5,000) × 4 = 1,040,000 clocks, which is about 10% faster than the 1,140,000 clocks required when the conventional monitoring function is needed.
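The clock count in the paragraph above can be checked with a short calculation. The following Python sketch is only an illustration: the function name `total_clocks` and its parameter names are assumptions introduced here, and the 1,140,000-clock figure for the conventional design is simply taken from the text rather than derived.

```python
def total_clocks(reads, writes, shared_reads, shared_writes,
                 local_cost=1, shared_cost=4):
    """Clock count when non-shared accesses go through the local cache
    and shared accesses go through the shared bus (illustrative model)."""
    local_accesses = (reads - shared_reads) + (writes - shared_writes)
    shared_accesses = shared_reads + shared_writes
    return local_accesses * local_cost + shared_accesses * shared_cost


# Access counts and per-access costs as stated in the text.
embodiment1 = total_clocks(1_000_000, 10_000, 5_000, 5_000)

# Conventional design with write monitoring: figure quoted in the text.
conventional = 1_140_000

saving = 1 - embodiment1 / conventional
print(embodiment1)        # 1040000
print(f"{saving:.1%}")    # 8.8%
```

Evaluating the expression this way gives 1,040,000 clocks, a saving of roughly 9% against the quoted conventional figure, which is in line with the "about 10%" stated in the text.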

In addition, whereas conventionally only a write-through cache could be used, in Embodiment 1 the local cache can be either a write-back cache or a write-through cache (a write-back cache performs better, but a write-through cache may be chosen for some design reason).

(Effect 3 of separating shared and non-shared data and not caching shared data)
As shown above, the (i, 1)-th local cache memory 12 does not require the special write monitoring associated with multiprocessors. This means that a general-purpose cache memory can be used instead of an expensive cache memory dedicated to multi-microprocessor systems. Using a cache memory without this function reduces cost.
Embodiment 2.
Write-back cache used, with a shared cache
FIG. 2 is a block diagram showing a multiprocessor device according to Embodiment 2 of the invention. Parts identical to those of Embodiment 1 shown in FIG. 1 are given the same reference numerals, and their description is not repeated. In Embodiment 2, a first global shared cache memory 16 is provided partway along the first global shared bus 15b, on the inner side of the first shared interface 17b.

In this configuration, even if the (i+1)-th CPU (not shown) reads the contents of an address immediately after the i-th CPU 11 has written to that address in the first global shared cache memory 16, it fetches the contents just updated in the first global shared cache memory 16 and therefore reads the new contents. Providing the first global shared cache memory 16 also speeds up processing further.

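Assume that, thanks to the first global shared cache memory 16, reads and writes of shared data take 2 clocks. Comparing times for the same processing as in Embodiment 1, the processing time is (995,000 + 5,000) × 1 + (5,000 + 5,000) × 2 = 1,020,000 clocks, which is slightly faster than Embodiment 1. This, however, is a case with few reads and writes of shared data; in general, when there are many reads and writes of shared data, Embodiment 2 is the faster of the two.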
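The comparison above, and the remark that the gap widens as shared accesses increase, can be illustrated with a small sweep. This Python sketch is an assumption-laden illustration: the helper names are introduced here, and the shared-access counts other than 10,000 are hypothetical values chosen only to show the trend.

```python
def total_clocks(total_accesses, shared_accesses, shared_cost, local_cost=1):
    """Clock count for a workload where shared accesses cost shared_cost
    clocks and all other accesses cost local_cost clocks."""
    local_accesses = total_accesses - shared_accesses
    return local_accesses * local_cost + shared_accesses * shared_cost


TOTAL = 1_010_000  # 1,000,000 reads + 10,000 writes, as in the text


def speedup(shared_accesses):
    """Ratio of clocks without the shared cache (4-clock shared access)
    to clocks with it (2-clock shared access)."""
    without_cache = total_clocks(TOTAL, shared_accesses, shared_cost=4)
    with_cache = total_clocks(TOTAL, shared_accesses, shared_cost=2)
    return without_cache / with_cache


# 10,000 shared accesses is the case in the text; the larger counts are
# hypothetical and only illustrate heavier sharing.
for shared in (10_000, 100_000, 500_000):
    print(shared, total_clocks(TOTAL, shared, 2), round(speedup(shared), 3))
```

For the 10,000-access case the sketch yields 1,020,000 clocks with the shared cache, and the speedup ratio grows as the share of shared accesses rises.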
Embodiment 3.
Write-back cache used, with a single interface
FIG. 3 is a block diagram showing a multiprocessor device according to Embodiment 3 of the invention. Parts identical to those of Embodiment 1 shown in FIG. 1 are given the same reference numerals, and their description is not repeated. In Embodiment 3, the first global non-shared bus 15a and the first global shared bus 15b are connected to a shared/non-shared memory 39 through a shared interface 37. The shared and non-shared areas of the shared/non-shared memory 39 do not overlap.

Next, the operation will be described.
When the i-th CPU 11 accesses non-shared data, it first issues an access request from the (i, 1)-th CPU non-shared bus terminal to the (i, 1)-th cache memory 12. The (i, 1)-th cache memory 12 looks up its own contents and, if the requested contents are not present, accesses the non-shared area dedicated to the i-th CPU 11 in the shared/non-shared memory 39 through the first global non-shared bus 15a and the shared interface 37. No other CPU writes to this “dedicated” non-shared area of the i-th CPU 11, and the i-th CPU 11 itself does not write to the areas of other CPUs. Furthermore, this non-shared area dedicated to the i-th CPU 11 is never written as a shared area, so the written data is fully guaranteed for the i-th CPU 11. Write monitoring is therefore naturally unnecessary.

For a shared data access, on the other hand, the i-th CPU 11 starts the access from the CPU shared bus terminal and accesses the allocated shared area of the shared/non-shared memory 39 through the first global shared bus 15b and the shared interface 37. When shared data is written, the new contents are immediately reflected in reads by other CPUs, so write monitoring is unnecessary.
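The behavior described for non-shared and shared accesses can be mimicked with a toy software model. The Python sketch below is not the patented circuit: the class names, the `is_shared` address decoder, and the address split at 0x8000 are all assumptions made for illustration, and the model does not enforce that private areas of different CPUs are disjoint.

```python
class Memory:
    """Single backing store holding both shared and non-shared areas."""
    def __init__(self):
        self.cells = {}


class Cpu:
    def __init__(self, memory, is_shared):
        self.memory = memory
        self.is_shared = is_shared  # hypothetical address decoder
        self.cache = {}             # private write-back cache
        self.dirty = set()          # cached addresses not yet written back

    def write(self, addr, value):
        if self.is_shared(addr):
            # Shared data bypasses the local cache and goes to memory.
            self.memory.cells[addr] = value
        else:
            # Non-shared data stays in the write-back cache for now.
            self.cache[addr] = value
            self.dirty.add(addr)

    def read(self, addr):
        if self.is_shared(addr):
            return self.memory.cells.get(addr, 0)
        if addr not in self.cache:  # miss: fetch from the private area
            self.cache[addr] = self.memory.cells.get(addr, 0)
        return self.cache[addr]

    def flush(self):
        """Write dirty non-shared lines back to memory."""
        for addr in self.dirty:
            self.memory.cells[addr] = self.cache[addr]
        self.dirty.clear()


# Assumed split: addresses below 0x8000 are shared, the rest non-shared.
def below_8000(addr):
    return addr < 0x8000


memory = Memory()
cpu_a = Cpu(memory, below_8000)
cpu_b = Cpu(memory, below_8000)

cpu_a.write(0x0010, 42)            # shared write goes straight to memory
shared_seen_by_b = cpu_b.read(0x0010)

cpu_a.write(0x8000, 7)             # non-shared write stays in cpu_a's cache
in_memory_before_flush = 0x8000 in memory.cells
cpu_a.flush()
in_memory_after_flush = memory.cells[0x8000]
```

In this model `cpu_b` sees the value 42 without any snooping logic, because no cache ever holds shared data, while the non-shared write reaches memory only when `cpu_a` flushes its write-back cache.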

With this method, both shared data and non-shared data pass through the shared interface 37. However, the method is effective particularly when the hit rate of the (i, 1)-th local cache memory 12 is high and there are few reads and writes of shared data, because the bus usage rate is then low.

When a multi-microprocessor system is to be housed on a single chip, its area must be designed to be as small as possible. If the required area must be met and two buses cannot be routed, the first global non-shared bus 15a and the first global shared bus 15b can be merged into a single first global bus 35, as shown in FIG. 4. Embodiment 3 provides the same effects as Embodiment 2, but processing is slightly slower because the bus usage rate is higher in Embodiment 3.
Embodiment 4.
Write-back cache used, with a single interface and a shared cache
FIG. 5 is a block diagram showing a multiprocessor device according to Embodiment 4 of the invention. Parts identical to those of Embodiment 3 shown in FIG. 3 are given the same reference numerals, and their description is not repeated. In Embodiment 4, the first global shared cache memory 16 is provided partway along the first global shared bus 15b, on the inner side of the shared interface 37.

This form is effective when the hit rate of the local cache memory 12 is high (the bus usage rate is low) and shared data is read and written frequently. In such a case, this configuration may be adopted to reduce area. Embodiment 4 provides the same effects as Embodiment 2, but it is slightly slower because its bus usage rate is higher.

Alternatively, as shown in FIG. 6, the first global non-shared bus 15a and the first global shared bus 15b may be merged into a single first global bus 35. In this case, the first global cache memory 16 provided partway along the first global bus 35 also holds non-shared data, so its capacity should be as large as possible. Embodiment 4 is in principle the same as Embodiment 3, so no write monitoring is needed at all.
Embodiment 5.
Recursive configuration
FIG. 7 is a block diagram showing a multiprocessor device according to Embodiment 5 of the invention. In Embodiment 5, recursive processor units 54, 54A are each made up of a plurality of the processor units 14, 14A, ... shown in Embodiment 1. Reference numeral 54a denotes a first (recursive) unit non-shared bus terminal, which is equivalent to the (i, 1)-th unit non-shared bus terminal 14a, and 54b denotes a first (recursive) unit shared bus terminal, which is equivalent to the (i, 1)-th unit shared bus terminal 14b.

55a is a first (recursive) global non-shared bus, equivalent to the first global non-shared bus 15a; 55b is a first (recursive) global shared bus, equivalent to the first global shared bus 15b.

57a is a first (recursive) non-shared interface, equivalent to the first non-shared interface 17a; 57b is a first (recursive) shared interface, equivalent to the first shared interface 17b.

As is clear from the figure, the processor unit 14 of Embodiment 1 and the recursive processor unit 54 have recursively the same structure. Consequently, the recursive processor unit 54 can itself be treated as a single processor unit, allowing double or triple recursion. Although the recursive processor unit 54 has been shown for the case of Embodiment 1, a configuration based on Embodiment 2 may also be used. Recursion thus allows a variety of configurations to be built to suit the purpose.
Embodiment 6.
Multiple non-shared buses
FIG. 8 is a block diagram showing a multiprocessor device according to Embodiment 6 of the invention. Parts identical to those of Embodiment 1 shown in FIG. 1 are given the same reference numerals, and their description is not repeated. Embodiment 6 is described on the basis of Embodiment 1, but the same can be done with Embodiments 2 to 4, so their description is omitted.

The i-th CPU 11 has two CPU non-shared bus terminals 11b and 11c. The existing one is the (i, 1)-th CPU non-shared bus terminal 11b, and the added one is the (i, 2)-th CPU non-shared bus terminal 11c. Reference numeral 62 denotes an (i, 2)-th local cache memory connected to the (i, 2)-th CPU non-shared bus terminal 11c; its function is the same as that of the (i, 1)-th local cache memory 12. The (i, 2)-th local cache memory 62 has an (i, 2)-th CPU-side bus terminal 62a and an (i, 2)-th bus-side bus terminal 62b. The (i, 2)-th CPU-side bus terminal 62a is connected to the added (i, 2)-th CPU non-shared bus terminal 11c of the i-th CPU 11. Reference numeral 63a denotes an (i, 2)-th local non-shared bus, which is connected to the (i, 2)-th bus-side bus terminal 62b of the added (i, 2)-th local cache memory 62.

The i-th processor unit 14 has an added (i, 2)-th unit non-shared bus terminal 14c, to which the (i, 2)-th local non-shared bus 63a is connected. 65a is an added second global non-shared bus, 67a is an added second non-shared interface, and 69a is a second non-shared memory connected to the second non-shared interface 67a.

Although not shown, the first non-shared memory 19a and the second non-shared memory 69a may be made accessible from the first non-shared interface 17a or the second non-shared interface 67a as memory for storing shared data. Other devices may also be connected to each global interface.

Next, the operation will be described.
For example, if a cache memory can hold the contents of only two addresses and three or more addresses are read, purging is likely to occur. When purging occurs, the usage rate of the first global non-shared bus 15a rises and the bus becomes congested. Here, congestion means the state in which the i-th CPU 11 wants to use the first global non-shared bus 15a but cannot because another CPU is using it, and must wait until the bus becomes available (in this state the CPU is kept waiting, so processing performance drops).
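The purging example above can be reproduced with a tiny cache model. The Python sketch below is an illustration under stated assumptions: the text does not name a replacement policy, so least-recently-used (LRU) replacement is assumed here, the function name and the addresses are hypothetical, and only reads are modeled.

```python
from collections import OrderedDict


def count_bus_accesses(addresses, capacity):
    """Count reads that miss a small cache and therefore use the bus.
    LRU replacement is an assumption, not something the text specifies."""
    cache = OrderedDict()
    bus_accesses = 0
    for addr in addresses:
        if addr in cache:
            cache.move_to_end(addr)    # hit: mark as most recently used
            continue
        bus_accesses += 1              # miss: fetch over the global bus
        if len(cache) >= capacity:
            cache.popitem(last=False)  # purge the least recently used entry
        cache[addr] = True
    return bus_accesses


# Hypothetical access patterns: loop over two or three addresses 50 times.
two_addresses = [0x100, 0x104] * 50
three_addresses = [0x100, 0x104, 0x108] * 50

print(count_bus_accesses(two_addresses, capacity=2))    # 2
print(count_bus_accesses(three_addresses, capacity=2))  # 150
```

With two addresses only the first two reads use the bus, whereas cycling through three addresses makes every read miss, which is the kind of bus traffic the text describes.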

When many addresses are used for non-shared data and the capacity of the (i, 1)-th local cache memory 12 is small, so that purging occurs and the bus becomes congested, the bus load can be reduced by adding a second global non-shared bus 65a as in Embodiment 6.

When accessing non-shared data, the i-th CPU 11 selects whether to use the existing bus or the added bus. The simplest way to make this selection is to assign accesses according to whether the address is even or odd. Suppose the (i, 2)-th non-shared bus terminal is selected: the i-th CPU 11 then accesses the second non-shared memory 69a through the (i, 2)-th local cache memory 62 and the second non-shared interface 67a.
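The even/odd selection mentioned above can be sketched as follows. This Python fragment is only an illustration: the function name, the choice of which parity maps to which bus, and the generalization to more than two buses are assumptions, and addresses are treated as word indices so that both parities actually occur.

```python
# Labels for the two paths, using the reference numerals from the text.
BUS_PATHS = {
    0: "terminal 11b -> local cache 12 -> global non-shared bus 15a",
    1: "terminal 11c -> local cache 62 -> global non-shared bus 65a",
}


def select_non_shared_bus(address, num_buses=2):
    """Pick a non-shared bus index from the low-order address bits.
    With two buses this is the even/odd split; mapping even addresses
    to the existing bus is an assumption made for this sketch."""
    return address % num_buses


for address in (0x8000, 0x8001, 0x8002, 0x8003):
    print(hex(address), BUS_PATHS[select_non_shared_bus(address)])
```

Consecutive addresses alternate between the two paths, so sequential accesses to non-shared data spread evenly over both buses.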

Meanwhile, the path by which another CPU, such as the (i+1)-th CPU (not shown), goes from the first global non-shared bus 15a through the first non-shared interface 17a to the first non-shared data storage memory 19a is free. As a result, other CPUs can use this path.

In Embodiment 6, depending on the timing of bus use, two CPUs can access non-shared data memory in parallel. As a result, waiting time decreases, bus congestion decreases, and processing speed improves. The case of adding one global non-shared bus has been described here, but adding more buses in the same way reduces congestion further. Such a configuration is better implemented as a single chip for the whole system than as multiple chips, because with current technology a chip has at most about 300 pins, so the number of buses cannot be increased without limit.

In general, there is no point in adding more buses than the number of processor units (the number of CPUs). For example, even if 100 buses are provided for 10 CPUs, at most as many buses as there are CPUs (10) are in use at any one time, so the remaining 90 stay unused. In general, the optimal number of buses can be expressed as follows.

Number of global buses = number of CPUs × (average number of non-shared data accesses per unit time × access time / unit time)
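The formula above can be evaluated directly. In the Python sketch below, the function names and the workload numbers are hypothetical; rounding up to a whole bus is an assumption added here, while capping the result at the number of CPUs follows the remark that more buses than CPUs serve no purpose.

```python
import math


def estimated_global_buses(num_cpus, accesses_per_unit_time,
                           access_time, unit_time):
    """Bus-count estimate from the formula in the text: number of CPUs
    times the fraction of time each CPU spends on non-shared accesses."""
    return num_cpus * accesses_per_unit_time * access_time / unit_time


def practical_bus_count(num_cpus, accesses_per_unit_time,
                        access_time, unit_time):
    """Round the estimate up to a whole bus (an assumption) and never
    exceed the number of CPUs."""
    estimate = estimated_global_buses(num_cpus, accesses_per_unit_time,
                                      access_time, unit_time)
    return min(num_cpus, max(1, math.ceil(estimate)))


# Hypothetical workload: 10 CPUs, each making 50,000 non-shared bus
# accesses of 4 clocks within a window of 1,000,000 clocks.
print(estimated_global_buses(10, 50_000, 4, 1_000_000))  # 2.0
print(practical_bus_count(10, 50_000, 4, 1_000_000))     # 2
```

Under these assumed numbers each CPU occupies a bus 20% of the time, so two global buses would match the average demand of ten CPUs.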
Embodiment 7.
Multiple shared buses
Although not shown, Embodiment 6 may be applied to provide multiple global shared buses. The same effects as in Embodiment 6 can be expected in this case as well.
Embodiment 8.
External I/O bus for slow peripheral devices
FIG. 9 is a block diagram showing a multiprocessor device according to Embodiment 8 of the invention. Parts identical to those of Embodiment 1 shown in FIG. 1 are given the same reference numerals, and their description is not repeated.

Reference numeral 11 denotes the i-th CPU. The i-th CPU 11 has an (i, 1)-th CPU external device bus terminal 11d. 73c is an (i, 1)-th local external device bus, connected to the (i, 1)-th CPU external device bus terminal 11d. 14 is the i-th processor unit, to which an (i, 1)-th unit external device bus terminal 14d has been added; this terminal is connected internally to the (i, 1)-th local external device bus 73c. 75c is a first global external device bus, which is connected to the (i, 1)-th unit external device bus terminal 14d of each i-th processor unit 14. 77c is a first external device interface, and 79c is an external device. The external device 79c is assumed to have a very long access time.

Description of bus stalls
Embodiment 8 makes it possible to avoid the bus stall caused by accessing the slow external device 79c. Bus stalls are explained here. Suppose that, in a circuit like that of Embodiment 1, there is a slow external device 19b outside the shared interface 17b and that its access time is 10,000 clocks. When the first CPU accesses this slow external device 19b through the first global shared bus 15b, the other CPUs cannot use the first global shared bus 15b until the first CPU's access finishes.

As a result, if another CPU tries to access the first global shared bus, it is kept waiting for up to 10,000 clocks in the worst case until the first CPU's access completes. Until the access completes, no one can do anything, so the bus is in a stalled state. As a worst case, suppose there are 10 CPUs and each CPU accesses the external device once in 1,000,000 clocks. The time spent on external device accesses is 10,000 clocks × 1 [access] × 10 [CPUs] = 100,000 clocks, so about 10% of the time is bus stall time. As a result, every CPU slows down by up to about 10%. In Embodiment 8, a bus for external devices is added to avoid such bus stalls.
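The worst-case stall estimate above is simple arithmetic, sketched here in Python. The function and parameter names are assumptions introduced for illustration; the input values are the ones given in the text.

```python
def stalled_clocks(num_cpus, accesses_per_cpu, access_clocks):
    """Total clocks during which the shared bus is held by slow
    external-device accesses (worst case: accesses never overlap)."""
    return num_cpus * accesses_per_cpu * access_clocks


def stall_fraction(num_cpus, accesses_per_cpu, access_clocks, window_clocks):
    """Share of the observation window lost to bus stalls."""
    stalled = stalled_clocks(num_cpus, accesses_per_cpu, access_clocks)
    return stalled / window_clocks


# Values from the text: 10 CPUs, one 10,000-clock access each
# within a window of 1,000,000 clocks.
print(stalled_clocks(10, 1, 10_000))                        # 100000
print(f"{stall_fraction(10, 1, 10_000, 1_000_000):.0%}")    # 10%
```

The sketch treats the accesses as strictly serialized, which is the worst case assumed in the text.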

Next, the operation will be described.
From the address to be accessed (or from the instruction), the i-th CPU 11 determines whether the address is one assigned to the external device 79c. If it is, the i-th CPU 11 starts the access from the (i, 1)-th CPU external device bus terminal 11d; the access passes through the added (i, 1)-th unit external device bus terminal 14d of the i-th processor unit 14, then through the first global external device bus 75c and the first external device interface 77c, and reaches the external device 79c. The first global shared bus 15b and the first global non-shared bus 15a are not used at all during this access.

As a result, even when other CPUs access shared or non-shared data, they are not caught up in a bus stall and can continue processing without slowing down. Embodiment 8 has been described on the basis of Embodiment 1, but it is not limited to Embodiment 1; the same applies to Embodiments 2 to 4. A recursive configuration as in Embodiment 5 is also possible, and the buses can be multiplexed as in Embodiment 6 or Embodiment 7.
Embodiment 9.
Placing non-shared data in local memory

FIG. 10 is a block diagram showing a multiprocessor device according to Embodiment 9 of the invention. Parts identical to those of Embodiment 1 shown in FIG. 1 are given the same reference numerals, and their description is not repeated.

Here, non-shared data, and the work area in particular, is allocated only to a fixed range of addresses. When the CPU accesses the work area, the address information directs the access to the (i, 1)-th local memory 86. The (i, 1)-th local memory 86 is dedicated to the i-th CPU 11. Because the work area itself is closed within the corresponding process (CPU), it does not need to be exposed outside. If nothing needs to go outside, the first global bus 35 does not have to be used. As a result, the bus usage rate drops dramatically.

If the local memory 86 for the work area is small, memory attached to the first global bus may be used as the work area. If the local memory 86 is slow, it may be accessed through the (i, 1)-th local cache memory 12, as shown in FIG. 11. Either way, Embodiment 9 lowers the bus usage rate, so a further speedup can be expected. A lower bus usage rate is advantageous for implementing Embodiment 3 or 4, and implementing Embodiment 3 or 4 leads to area reduction. From this viewpoint, FIG. 10 and FIG. 11 are based on Embodiment 3. Basing the design on Embodiment 3 yields a very simple configuration.
Embodiment 10.
Shared/non-shared determination method 1: determination by address, part 1
The embodiments so far have been described on the assumption that the CPU can determine whether an access is shared or non-shared. From this embodiment onward, the description covers what kind of CPU is suitable for building the preceding embodiments, and what peripheral circuits should be attached when a general CPU is used.

FIG. 12 is a block diagram showing a multiprocessor device according to Embodiment 10 of the invention. Reference numeral 11 denotes the CPU of Embodiment 1 (or of an embodiment based on it). 101 is an i-th CPU main body. The i-th CPU main body 101 itself does not need a function for determining shared or non-shared. The i-th CPU main body 101 has an i-th CPU main body address bus terminal 101a, which indicates the address to be accessed; an i-th CPU main body data bus terminal 101b, which carries the information read from the address or the information to be written; and an i-th CPU main body control bus terminal 101c, which outputs i-th CPU main body control information such as read or write.

102 is an i-th address shared/non-shared bus selection device. It has an i-th CPU address selection device CPU-side terminal 102a connected to the i-th CPU main body address bus terminal 101a of the CPU main body 101, an i-th address selection device shared-side terminal 102b, an i-th address selection device non-shared-side terminal 102c, and an i-th address selection device determination input terminal 102d. When the information "shared" is input to the i-th address selection device determination input terminal 102d, the i-th address shared/non-shared bus selection device 102 connects the i-th CPU address selection device CPU-side terminal 102a to the i-th address selection device shared-side terminal 102b. When the information "non-shared" is input to the i-th address selection device determination input terminal 102d, it connects the i-th CPU address selection device CPU-side terminal 102a to the i-th address selection device non-shared-side terminal 102c.

103 is an i-th data shared/non-shared bus selection device. It has an i-th CPU data selection device CPU-side terminal 103a connected to the i-th CPU main body data bus terminal 101b of the CPU main body 101, an i-th data selection device shared-side terminal 103b, an i-th data selection device non-shared-side terminal 103c, and an i-th data selection device determination input terminal 103d. When the information "shared" is input to the i-th data selection device determination input terminal 103d, the i-th data shared/non-shared bus selection device 103 connects the i-th CPU data selection device CPU-side terminal 103a to the i-th data selection device shared-side terminal 103b. When the information "non-shared" is input to the i-th data selection device determination input terminal 103d, it connects the i-th CPU data selection device CPU-side terminal 103a to the i-th data selection device non-shared-side terminal 103c.

104 is an i-th control shared/non-shared bus selection device. It has an i-th CPU control selection device CPU-side terminal 104a connected to the i-th CPU main body control bus terminal 101c of the CPU main body 101, an i-th control selection device shared-side terminal 104b, an i-th control selection device non-shared-side terminal 104c, and an i-th control selection device determination input terminal 104d. When the information "shared" is input to the i-th control selection device determination input terminal 104d, the i-th control shared/non-shared bus selection device 104 connects the i-th CPU control selection device CPU-side terminal 104a to the i-th control selection device shared-side terminal 104b. When the information "non-shared" is input to the i-th control selection device determination input terminal 104d, it connects the i-th CPU control selection device CPU-side terminal 104a to the i-th control selection device non-shared-side terminal 104c.

The i-th CPU 11 has an (i, 1)-th CPU-side shared bus terminal 11a and an (i, 1)-th CPU-side non-shared bus terminal 11b. The wiring from the (i, 1)-th CPU-side shared bus terminal 11a of the i-th CPU 11 is divided inside the i-th CPU 11 into address, data, and control lines, which are connected respectively to the i-th address selection device shared-side terminal 102b of the i-th address shared/non-shared bus selection device 102, the i-th data selection device shared-side terminal 103b of the i-th data shared/non-shared bus selection device 103, and the i-th control selection device shared-side terminal 104b of the i-th control shared/non-shared bus selection device 104.

Likewise, the wiring from the (i, 1)-th CPU-side non-shared bus terminal 11b of the i-th CPU 11 is divided inside the i-th CPU 11 into address, data, and control lines, which are connected respectively to the i-th address selection device non-shared-side terminal 102c of the i-th address shared/non-shared bus selection device 102, the i-th data selection device non-shared-side terminal 103c of the i-th data shared/non-shared bus selection device 103, and the i-th control selection device non-shared-side terminal 104c of the i-th control shared/non-shared bus selection device 104.

Reference numeral 105 denotes an i-th shared/non-shared determination device. It has an address bus input terminal 105a connected to the i-th CPU main body address bus terminal 101a, and an i-th selection determination output terminal 105b connected to the i-th address selection device determination input terminal 102d, the i-th data selection device determination input terminal 103d, and the i-th control selection device determination input terminal 104d of the i-th address, data, and control shared/non-shared bus selection devices 102, 103, and 104. The i-th shared/non-shared determination device 105 may be a fixed circuit.

Separating shared and non-shared by the address being accessed
Embodiment 10 uses a very simple method: whether an access is shared or non-shared is known from the address being accessed. The user decides in advance which addresses hold shared data (for example, addresses 0000 to 7FFF) and which hold non-shared data (for example, addresses 8000 to FFFF), and writes programs according to this address assignment. When the i-th CPU main body 101 encounters a data access instruction while decoding a program, it outputs "read" or "write" on the control bus input/output. For a read, it outputs the address to be accessed on the address bus input/output. For a write, it outputs the address to be accessed on the address bus input/output and the data to be written on the data bus input/output.

Next, the i-th shared/non-shared determination device 105 receives the address information output by the i-th CPU main body 101 and determines whether it is an address assigned to shared data or an address assigned to non-shared data. Through the i-th selection determination output terminal 105b, it passes the result ("shared data was accessed" or "non-shared data was accessed") to the i-th address selection device determination input terminal 102d, the i-th data selection device determination input terminal 103d, and the i-th control selection device determination input terminal 104d of the bus selection devices 102, 103, and 104. The i-th address, data, and control shared/non-shared bus selection devices 102, 103, and 104 connect the buses in response to this result.

Separating shared and non-shared data by the assigned address, as described above, greatly simplifies the logic circuit of the i-th shared/non-shared determination device 105. When the split is 0000-7FFF versus 8000-FFFF, the device can be realized by adding, at most, a single inverter to the most significant address line. The functional load required to separate shared from non-shared accesses is therefore small, and it is reduced compared with the conventional example.
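The address split described above can be sketched in a few lines of Python. This is only an illustrative model, assuming a 16-bit address and the example ranges from the text (0000-7FFF shared, 8000-FFFF non-shared); the function names `is_shared` and `select_bus` are hypothetical and do not appear in the patent.

```python
def is_shared(address: int) -> bool:
    """Model of determination device 105 for the example 0000-7FFF / 8000-FFFF split.

    Assumption: addresses are 16 bits wide, so the decision depends only on
    the most significant address line (bit 15).
    """
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    top_bit = (address >> 15) & 1
    # Inverting the top address line gives the "shared" signal.
    return top_bit == 0


def select_bus(address: int) -> str:
    """Model of the selection devices 102-104: pick the bus side to connect."""
    return "shared" if is_shared(address) else "non-shared"
```

For example, `select_bus(0x7FFF)` yields `"shared"` and `select_bus(0x8000)` yields `"non-shared"`, which mirrors the single-inverter argument in the text.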

Alternatively, the i-th CPU main body may be replaced with a general-purpose CPU that has no shared/non-shared determination, with each shared/non-shared determination device and the determination device provided as peripheral circuits. Because an ordinary CPU (one without shared/non-shared determination) can be used, the system can be built from inexpensive parts. Each shared/non-shared determination device 105 is a mere selector, so although the part count increases, the added parts are cheap. However, because this scheme is built in when the CPU is manufactured, the user cannot change the assignment of shared and non-shared memory.
Embodiment 11.
Shared/non-shared determination method 2: determination by address, part 2
FIG. 13 shows Embodiment 11 of the present invention. In Embodiment 11, the shared/non-shared determination device is an i-th shared/non-shared determination RAM 115, whose input is an address and whose output is the stored shared/non-shared setting for that address (block).

In Embodiment 11, the upper bits (for example, 8 bits) of the address to be accessed are input to the high-speed i-th shared/non-shared determination RAM 115. The RAM 115 holds information on whether the corresponding address is shared or non-shared, and passes the result directly to the shared/non-shared decision line. Although not shown, rewriting the contents of the RAM 115 is simple: for example, the RAM 115 can be made accessible whenever the upper 8 bits of the address are "00".
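A minimal Python sketch of such a lookup is shown here. It assumes a 16-bit address whose upper 8 bits index a 256-entry table, so each entry covers one 256-byte block; the class name `ShareJudgeRam`, its methods, and the non-shared default are assumptions made for illustration only.

```python
class ShareJudgeRam:
    """Model of determination RAM 115: one shared/non-shared flag per block.

    Assumptions: 16-bit addresses, indexed by the upper 8 bits, so every
    entry governs a fixed 256-byte block. Entries default to non-shared.
    """

    def __init__(self) -> None:
        self.entries = [False] * 256  # False = non-shared (assumed default)

    def set_block(self, block: int, shared: bool) -> None:
        """Mark the block whose upper address byte is `block`."""
        if not 0 <= block <= 0xFF:
            raise ValueError("block index must fit in 8 bits")
        self.entries[block] = shared

    def is_shared(self, address: int) -> bool:
        """Look up the flag using only the upper 8 bits of the address."""
        if not 0 <= address <= 0xFFFF:
            raise ValueError("address must fit in 16 bits")
        return self.entries[address >> 8]
```

Because only the upper byte is examined, marking block `0x04` as shared affects every address from 0400 to 04FF and nothing else, which is why the granularity and the block boundaries are fixed in this scheme.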

In this way, the user can designate shared and non-shared regions to some extent. There would be no problem if the i-th shared/non-shared determination RAM 115 could be made arbitrarily large, but its size is finite. As a result, setting the shared/non-shared determination in units of one byte is practically impossible. Moreover, because this method looks only at the upper part of the address, only fixed block boundaries and fixed lengths can be specified.
Embodiment 12.
Shared/non-shared determination method 3: determination by address, part 3
FIG. 14 shows Embodiment 12 of the present invention. The i-th CPU main body 101 has an i-th CPU main body address bus terminal 101a that indicates the address to be accessed, an i-th CPU main body data bus terminal 101b that carries information read from that address or information to be written, and an i-th CPU main body control bus terminal 101c that outputs i-th CPU main body control information such as read or write.

Reference numeral 127 denotes an i-th address decoder. The i-th address decoder 127 determines which device is accessed according to the address, and has the signal lines "JdgReg", "PTbl", and "Other". In this embodiment, it sends an "enable" signal on "JdgReg" when an address in 0000-00FF is specified, on "PTbl" when an address in 0100-03FF is specified, and on "Other" in all other cases. Because these outputs control the operation of each device, they are treated as control signal lines in the figure.

The i-th address shared/non-shared bus selection device 122, the i-th data shared/non-shared bus selection device 123, and the i-th control shared/non-shared bus selection device 124 each have an operation enable terminal En. When "enable" is applied to En, the device operates as described in Embodiment 10. When the input is not "enable", the device connects the i-th CPU main body address bus terminal 101a, data bus terminal 101b, and control bus terminal 101c of the i-th CPU main body 101 to neither the shared-side terminals nor the non-shared-side terminals of the i-th address, data, and control selection devices.

The operation enable terminal En of these devices is connected to the "Other" terminal of the i-th address decoder 127. The devices therefore make their connections when the i-th CPU main body 101 accesses 0400-FFFF, and disconnect all buses when it accesses 0000-03FF.
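The decoding rule and its effect on the bus selection devices can be modeled as follows. This is a sketch under the address ranges stated in the text, assuming 16-bit addresses; the function names `decode` and `selectors_enabled` are hypothetical.

```python
def decode(address: int) -> str:
    """Model of address decoder 127: name the signal line that is enabled."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if address <= 0x00FF:
        return "JdgReg"  # shared/non-shared determination device 125
    if address <= 0x03FF:
        return "PTbl"    # pointer table storage memory 126
    return "Other"       # everything from 0400 to FFFF


def selectors_enabled(address: int) -> bool:
    """The En terminals of selection devices 122-124 follow the Other line."""
    return decode(address) == "Other"
```

With this model, `selectors_enabled(0x03FF)` is false (all buses stay disconnected) and `selectors_enabled(0x0400)` is true.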

The i-th shared/non-shared determination device 125 has an address input terminal 125a, a data input terminal 125b, and a control input terminal 125c, which are connected to the i-th CPU main body address bus terminal 101a, data bus terminal 101b, and control bus terminal 101c of the i-th CPU main body 101. The device is connected to the "JdgReg" signal line of the CPU main body control bus. It contains one latch. When the CPU accesses 0000-00FF, the i-th address decoder outputs the "enable" signal from its "JdgReg" terminal, and the device operates and becomes accessible.

Reference numeral 126 denotes an i-th pointer table storage memory. It has an address input terminal 126a, a data input terminal 126b, and a control input terminal 126c, which are connected to the i-th CPU main body address bus terminal 101a, data bus terminal 101b, and control bus terminal 101c of the i-th CPU main body 101. The device is connected to the "PTbl" signal line of the CPU main body control bus. It contains one latch. When the CPU accesses 0100-03FF, the i-th address decoder 127 outputs the "enable" signal from its "PTbl" terminal, and the device operates and becomes accessible.

To make the explanation easier to follow, FIG. 15 shows the memory map of Embodiment 12 as seen from the i-th CPU main body 101.

Embodiment 12 is applied to a computer configuration in which memory management is performed in software by a method called "handles" (described in detail in "Inside Macintosh Vol. I, II", edited by Apple Computer, Berkeley Publishing).

First, when part of memory is used as a block (for example, addresses 0400 to 04FF), the start address of the memory block (0400) and its length (256 bytes = 0100 hex bytes) are written as a pair at some address in the pointer table (here, "0400" at address 0010 and "0100" at address 0014). Software then accesses the memory block through the pointer table address (0010) that holds the block's start address. This pointer table address is called a "handle". Accordingly, when a user program accesses some location in the memory block (say, the eighth from the start), it reads the contents of the handle (the contents of address 0010, namely 0400), adds the offset (8-1) to that value, and accesses the resulting address (0407). In the computer configuration used in this embodiment, all of this is done in software.
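The handle dereference described above can be sketched in Python as follows. The sketch reuses the numbers from the text (base at address 0010, length at address 0014, block 0400-04FF); the sparse `memory` dictionary, the function names, and the bounds check are assumptions added for illustration and are not part of the patent.

```python
# Sparse model of memory: address -> stored value (assumption for illustration).
memory: dict[int, int] = {}


def register_block(handle: int, base: int, length: int) -> None:
    """Write the (start address, length) pair at the handle, as in the text."""
    memory[handle] = base
    memory[handle + 4] = length  # text example: base at 0010, length at 0014


def resolve(handle: int, position: int) -> int:
    """Return the address of the `position`-th location (1-based) of a block."""
    base = memory[handle]        # first access: read the handle's contents
    length = memory[handle + 4]
    offset = position - 1        # the "(8-1)" of the text
    if not 0 <= offset < length:  # bounds check is an added safeguard
        raise IndexError("position lies outside the memory block")
    return base + offset
```

After `register_block(0x0010, 0x0400, 0x0100)`, resolving the eighth location with `resolve(0x0010, 8)` gives address 0407, matching the worked example in the text.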

This embodiment adds a "shared/non-shared" bit to the information in the pointer table. The operation with this bit added is described with reference to FIG. 16. When a user program accesses some location in the memory block (the eighth from the start), it reads the contents of the handle (the contents of address 0010, namely 0400) (step ST121).

At this time, the i-th address decoder 127 enables access only to the i-th pointer table, and the i-th CPU main body 101 can read the contents of address 0100 from the pointer table. Meanwhile, the i-th address, data, and control shared/non-shared bus selection devices 122, 123, and 124 are not enabled and therefore keep the buses disconnected. Next, the shared bit is read in the same way (step ST122).

Next, the shared information that was read is written to the i-th shared/non-shared determination device 125 by accessing an arbitrary address in 0000-00FF (step ST123). The shared/non-shared bus selection devices 122, 123, and 124 still keep the buses disconnected at this point.

In the final step ST124, the i-th CPU main body 101 accesses the address (0407) obtained by adding the offset (8-1) to the contents of the handle (0400, the contents of address 0010). At this point the address decoder enables the shared/non-shared bus selection devices 122, 123, and 124, which connect to the desired bus according to the shared/non-shared determination.
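Steps ST121 to ST124 can be traced with a small Python model. It is a sketch only: the text does not say where the shared bit is stored relative to the base and length, so the model keeps all three in one tuple per handle, and the names `CpuNode`, `judge_latch`, and `bus_log` are hypothetical.

```python
class CpuNode:
    """Toy model of one CPU with its pointer table and determination latch."""

    def __init__(self) -> None:
        # handle -> (base, length, shared bit); the layout is an assumption.
        self.pointer_table: dict[int, tuple[int, int, bool]] = {}
        self.judge_latch = False      # latch inside determination device 125
        self.bus_log: list[str] = []  # bus state observed at each step

    def access(self, handle: int, position: int) -> tuple[int, str]:
        base, _length, shared = self.pointer_table[handle]
        self.bus_log.append("disconnected")  # ST121: read the handle contents
        self.bus_log.append("disconnected")  # ST122: read the shared bit
        self.judge_latch = shared            # ST123: write it to device 125
        self.bus_log.append("disconnected")
        target = base + (position - 1)       # ST124: the real memory access
        bus = "shared" if self.judge_latch else "non-shared"
        self.bus_log.append(bus)             # selectors connect only now
        return target, bus
```

Registering handle 0010 as `(0x0400, 0x0100, True)` and calling `access(0x0010, 8)` returns address 0407 on the shared bus, with the buses disconnected during the first three steps.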

(Greater freedom in placing boundaries)
Comparing Embodiment 12 with Embodiment 11: in Embodiment 11, shared/non-shared could be set only with fixed boundaries and lengths, whereas in Embodiment 12 it can be set with an arbitrary boundary (start address) and an arbitrary length. One point to note in Embodiment 12 is that a CPU must be prevented from accessing the non-shared regions of other CPUs. In addition, because the shared/non-shared determination is handled by the CPU in software (handling it in hardware is difficult), memory access becomes somewhat slower.

Although not shown, if the i-th CPU main body 101 is a CPU component that can output the value written to one of its internal registers directly and immediately to the outside, that register (call it register B) can double as the i-th shared/non-shared determination device 125. Step ST123 can then be omitted, which speeds up access. When the same arrangement is built as a chip, it suffices to bring out a single aluminum wire from the i-th CPU main body 101 and connect it to each shared/non-shared selection terminal.

A brief note on what happens in this example when a memory block is allocated, deleted, or changed. The pointer table is by nature information shared by all CPUs, but in this system it is mostly read, and writes caused by allocating, deleting, or changing a memory block (hereinafter simply "changing") are rare. When a memory block is changed, the CPU that made the change first writes the changed handle and its contents somewhere in the shared region and then interrupts all CPUs at once. Each CPU reads that content and updates its i-th pointer table storage memory 126.
Embodiment 13.
Shared/non-shared determination method 4: determination by address, part 4
Embodiments 10 to 12 described how to apply a single-use CPU (CPU main body) as-is to the multiprocessor devices shown in Embodiments 1 to 9. Embodiment 13 describes applying it to a multiprocessor device after improving the single-use CPU itself by adding the necessary functions.

Embodiment 13 describes an improvement that enables shared/non-shared determination for a single-use CPU (CPU main body) that manages memory by segments. A segment itself is no different from the memory block of Embodiment 12. A segment is described by a segment descriptor (corresponding to the pointer table of Embodiment 12 shown in FIG. 15), which holds a start address (for example, 0400), a length (for example, 0100 bytes), status information bits, and so on. Segment descriptors are placed in a segment descriptor table (corresponding to the pointer table of Embodiment 12), and each is assigned a segment number (0, 1, 2, ..., corresponding to the handle of Embodiment 12). When the CPU accesses memory, a single instruction reads the start address (for segment 7, address 0400) and the status bits from the location where the descriptor for that segment number (for example, 7) is stored, then adds the relative address (8-1) to the start address (0400) and accesses memory at the result (0407). The distinguishing feature of this method is that, from the software point of view, this whole process is carried out in hardware by one instruction.
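The segment translation can be sketched in the same way as the handle example. This Python model is illustrative only: the `SegmentDescriptor` fields follow the text (start address, length, status bits), while the dictionary-based table, the `translate` function, and the bounds check are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentDescriptor:
    """One entry of the segment descriptor table, as described in the text."""
    base: int    # start address, e.g. 0x0400
    length: int  # length in bytes, e.g. 0x0100
    status: int  # status information bits (meaning not modeled here)


def translate(table: dict[int, SegmentDescriptor], segment: int, position: int) -> int:
    """Map (segment number, 1-based position) to a memory address.

    On the CPU described in the text this happens in hardware within one
    instruction; here it is an ordinary function call.
    """
    descriptor = table[segment]
    offset = position - 1
    if not 0 <= offset < descriptor.length:  # added safeguard, not in the text
        raise IndexError("position lies outside the segment")
    return descriptor.base + offset
```

With segment 7 described as `SegmentDescriptor(0x0400, 0x0100, 0)`, `translate(table, 7, 8)` yields address 0407, the same result as the worked example.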

(Revision policy for the single CPU (CPU main body))
In general, however, a CPU that performs memory access under this kind of segment management does not indicate to the outside whether it is "reading a segment" or "reading a segment descriptor". Nor does it indicate which segment it is currently reading. For this reason, shared/non-shared determination by segment cannot be made externally. A simple workaround, based on Embodiment 10, is to place shared segments in the shared region and non-shared segments in the non-shared region, but this is inflexible.

If, as in Embodiment 12, software writes to the shared/non-shared determination device at every memory access, compatibility with the large body of existing software is lost. In this embodiment, therefore, a simple revision is made to the CPU, which is otherwise a black box: wiring that already exists functionally but is closed off inside the CPU is brought out to the outside, so that shared/non-shared determination becomes possible for this single CPU as well.

FIG. 17 is a block diagram showing a multiprocessor device according to Embodiment 13. In the figure, reference numeral 386 denotes a CPU main body that accesses memory by segments. The CPU main body 386 has an "Sgr" terminal 131d that indicates whether it is reading the segment descriptor table or accessing something else. The Sgr terminal 131d electrically outputs the information "ReadSDT" when the CPU reads the segment descriptor table, and "AccMem" when it reads a segment. The CPU main body 386 also has a segment number output "SN0" terminal 131e, which outputs the number of the segment being accessed while the Sgr terminal 131d outputs AccMem.

The SN0 terminal 131e outputs the segment number when the CPU actually accesses memory (Sgr: AccMem). Functionally, signals corresponding to these terminals must already exist inside the CPU main body, and bringing them out with aluminum wiring takes little effort.

Reference numeral 136 denotes a segment descriptor table, which is a RAM. The segment descriptor table has an "enable (EN)" input, which is connected to the Sgr terminal 131d of the CPU main body 386. The segment descriptor table 136 operates when the EN input becomes "ReadSDT" (that is, Sgr: ReadSDT, when the CPU reads the segment descriptor table) and outputs the segment descriptor requested by the CPU. When EN is anything other than ReadSDT, it does nothing and outputs nothing.

Reference numeral 135 denotes a shared/non-shared determination device, which is a RAM with a segment number input terminal 135a and a shared/non-shared determination output terminal 135b. The device itself is the same as in Embodiment 11. The difference is that its input is not the upper 8 bits of the address but the number of the segment being accessed, output from the SN0 terminal of the CPU main body 386. As in Embodiment 11, the shared/non-shared determination device 135 holds shared/non-shared information for each input segment (in place of the upper 8 address bits of Embodiment 11) and outputs the information for the given segment number on the shared/non-shared determination output.

Each of the shared/non-shared bus selection devices 122, 123, and 124 has an operation enable terminal, which is connected to the Sgr terminal of the CPU main body 386. When the operation enable terminal receives "AccMem", the device connects the buses according to the shared/non-shared determination input. For any other value, it does not operate and disconnects all buses.

Next, the operation will be described.
With a single software memory access instruction, the CPU main body 386 performs both the operation of reading the segment descriptor table and the operation of accessing the segment itself. First, when the CPU main body 386 accesses the segment descriptor table 136, it outputs a memory access request according to the prescribed access procedure and outputs the signal "ReadSDT" from the Sgr terminal 131d. Because the signal applied to their operation enable terminals, which are connected to the Sgr terminal 131d, is "ReadSDT", the shared/non-shared bus selection devices 122, 123, and 124 do not operate and disconnect all buses. The shared/non-shared determination device 135 may operate, but its output has no effect because the bus selection devices 122, 123, and 124 are not operating. Meanwhile, the segment descriptor table 136 starts operating because its EN signal is "ReadSDT", and sends the segment descriptor to the CPU main body 386.

Next, when the CPU main body 386 accesses a segment, it outputs "AccMem" from the Sgr terminal 131d and the number of the segment being accessed from the SN0 terminal 131e. The segment descriptor table 136 does not operate and outputs nothing, because its EN signal is "AccMem". The shared/non-shared determination device 135, on the other hand, takes in the segment number output from the SN0 terminal 131e of the CPU main body 386 and, from the information stored in its RAM, outputs the shared/non-shared determination for that segment. Because "AccMem" is applied to their operation enable terminals, the shared/non-shared bus selection devices 122, 123, and 124 connect each bus according to the result on their shared/non-shared determination inputs. No change to the software is needed.
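The two phases of one memory access can be summarized with a short Python sketch. The signal values "ReadSDT" and "AccMem" come from the text; the function `route`, the dictionary used to model determination device 135, and the string results are assumptions made for illustration.

```python
READ_SDT = "ReadSDT"  # Sgr value while the descriptor table is being read
ACC_MEM = "AccMem"    # Sgr value while the segment itself is being accessed


def route(sgr: str, segment_number: int, judge_ram: dict[int, bool]) -> str:
    """Return the bus state chosen by selection devices 122-124.

    judge_ram models determination device 135: segment number -> shared flag.
    """
    if sgr == READ_SDT:
        # Selectors are not enabled; only descriptor table 136 responds.
        return "disconnected"
    if sgr == ACC_MEM:
        # SN0 supplies the segment number to determination device 135.
        return "shared" if judge_ram[segment_number] else "non-shared"
    raise ValueError(f"unknown Sgr value: {sgr!r}")
```

For a table such as `{7: True, 3: False}`, `route(READ_SDT, 7, ...)` leaves the buses disconnected, while `route(ACC_MEM, 7, ...)` selects the shared bus and `route(ACC_MEM, 3, ...)` the non-shared bus.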

According to Embodiment 13, shared/non-shared determination is made possible by a minimal, low-effort revision to the single-use CPU (CPU main body). In Embodiment 13, the effort consists of providing a terminal corresponding to the "Sgr" terminal 131d and a terminal corresponding to the "SN0" terminal 131e. Nothing is required beyond bringing aluminum wiring to the outside. Moreover, these signals must already exist functionally in any single-use CPU that performs the segment management described above, so no further function needs to be added to the CPU. As a result, the revision is relatively inexpensive.

Furthermore, with the above method, although hardware is added internally and externally, nothing is added in software. This means that existing software assets can be carried over. In addition, when a single CPU that performs segment management is used in Embodiment 10, shared segments must be placed in the shared region and non-shared segments in the non-shared region. Embodiment 13 removes that restriction and is therefore more flexible.

As in Embodiment 12, segments may be created, changed, or deleted, and the information in the shared/non-shared determination device 135 must then be updated consistently. A description is omitted because this happens only rarely during processing and is not essential to the invention.

Embodiment 13 assumes that the CPU main body 386 is a stand-alone chip, but as shown in FIG. 18, it may instead include the shared/non-shared determination device 135. In that case, the CPU main body 386 has an "Sgr" terminal and a "shared/non-shared determination" terminal 131f.

Alternatively, as shown in FIG. 19, the CPU main body 386 may include both the shared/non-shared determination device 135 and the segment descriptor table 136. In that case, the CPU main body 386 has only the "shared/non-shared determination" terminal 131f.

The shared/non-shared terminal 131f shown in FIGS. 18 and 19 need not be driven by a RAM serving as the shared/non-shared determination device 135. It may, for example, be driven by a fixed circuit. Also, although the shared/non-shared determination device 135 incorporated in the CPU in the thirteenth and subsequent embodiments makes its determination by segment, other criteria may be used.
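The two options described here, a rewritable RAM-style table and a fixed circuit, can be sketched in software as follows. This is only an illustrative model and not the patent's circuit: the class `SegmentDeterminationTable`, the function `fixed_determination`, and the rule that segments numbered 8 and above are shared are all hypothetical names and values chosen for the example.

```python
from typing import Dict

SHARED = "shared"
NON_SHARED = "non-shared"


class SegmentDeterminationTable:
    """RAM-like table: maps a segment number to a shared/non-shared flag.

    Hypothetical software stand-in for the determination device; entries
    can be rewritten when segments are created, changed, or deleted.
    """

    def __init__(self, default: str = NON_SHARED) -> None:
        self._flags: Dict[int, str] = {}
        self._default = default

    def set_flag(self, segment: int, flag: str) -> None:
        if flag not in (SHARED, NON_SHARED):
            raise ValueError(f"unknown flag: {flag!r}")
        self._flags[segment] = flag

    def clear(self, segment: int) -> None:
        # Deleting a segment removes its entry; lookups fall back to the default.
        self._flags.pop(segment, None)

    def determine(self, segment: int) -> str:
        return self._flags.get(segment, self._default)


def fixed_determination(segment: int) -> str:
    """Fixed-circuit stand-in: the rule cannot be changed at run time.

    Assumption for illustration only: segments numbered 8 and above are shared.
    """
    return SHARED if segment >= 8 else NON_SHARED
```

Both variants answer the same question (is this segment shared?), so either could drive the shared/non-shared terminal in this model. The table trades extra state for the ability to change the classification later.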

As described above, the thirteenth embodiment applies a CPU that determines shared or non-shared status from the segment number to the CPU and cache configuration of the present invention. FIGS. 18 and 19 show that the shared/non-shared determination device 135 may be provided inside the CPU. Although the shared/non-shared determination device 135 may be fixed, shared and non-shared accesses may also be distinguished by instruction.

Embodiment 14.
(Composite system)
FIG. 20 is a block diagram showing a multiprocessor device according to the fourteenth embodiment. The fourteenth embodiment is based on the fourth and eighth embodiments. In the figure, reference numeral 171 denotes the CPU of the twelfth embodiment, and reference numeral 172 denotes the CPU of the thirteenth embodiment.

The fourteenth embodiment is a method of fusing two or more different computer systems into a single system. As the figure makes clear, using the fourteenth embodiment allows two or more computer systems to be fused into one. As a result, two systems can share a single piece of data. The CPUs need not be limited to those of the twelfth and thirteenth embodiments. Any CPU that can at least distinguish clearly between shared data and non-shared data, and select a bus accordingly, can be used to build the CPU and cache configuration of the present invention.
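The effect described above, two systems sharing one piece of data while keeping their other data separate, can be illustrated with a minimal software model. It is a sketch under stated assumptions and not the patent's hardware: the `Memory` and `System` classes, the `is_shared` argument, and the addresses are hypothetical, and caches and bus timing are left out.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Memory:
    """Stand-in for a memory reached over one bus (address -> value)."""
    cells: Dict[int, int] = field(default_factory=dict)


@dataclass
class System:
    """One computer system with its own non-shared memory.

    The same `shared` Memory object is handed to every fused system, which
    models all of them being attached to one shared bus and shared memory.
    """
    name: str
    non_shared: Memory
    shared: Memory

    def _select(self, is_shared: bool) -> Memory:
        # Bus selection: the CPU already knows whether the data is shared.
        return self.shared if is_shared else self.non_shared

    def write(self, address: int, value: int, is_shared: bool) -> None:
        self._select(is_shared).cells[address] = value

    def read(self, address: int, is_shared: bool) -> int:
        # Unwritten cells read as 0 in this model.
        return self._select(is_shared).cells.get(address, 0)


# Two otherwise independent systems fused through one shared memory.
common = Memory()
system_a = System("A", Memory(), common)
system_b = System("B", Memory(), common)

system_a.write(0x10, 7, is_shared=True)    # visible to both systems
system_a.write(0x10, 9, is_shared=False)   # visible to system A only
```

In this model a shared write by system A is read back by system B at the same address, while a non-shared write to that address stays private to system A.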

FIG. 1 is a block diagram of a multiprocessor device according to the first embodiment of the present invention.
FIG. 2 is a block diagram of a multiprocessor device according to the second embodiment of the present invention.
FIG. 3 is a block diagram of a multiprocessor device according to the third embodiment of the present invention.
FIG. 4 is a block diagram of another multiprocessor device according to the third embodiment of the present invention.
FIG. 5 is a block diagram of a multiprocessor device according to the fourth embodiment of the present invention.
FIG. 6 is a block diagram of another multiprocessor device according to the fourth embodiment of the present invention.
FIG. 7 is a block diagram of a multiprocessor device according to the fifth embodiment of the present invention.
FIG. 8 is a block diagram of a multiprocessor device according to the sixth embodiment of the present invention.
FIG. 9 is a block diagram of a multiprocessor device according to the eighth embodiment of the present invention.
FIG. 10 is a block diagram of a multiprocessor device according to the ninth embodiment of the present invention.
FIG. 11 is a block diagram of another multiprocessor device according to the ninth embodiment of the present invention.
FIG. 12 is a block diagram of a multiprocessor device according to the tenth embodiment of the present invention.
FIG. 13 is a block diagram of a multiprocessor device according to the eleventh embodiment of the present invention.
FIG. 14 is a block diagram of a multiprocessor device according to the twelfth embodiment of the present invention.
FIG. 15 is a memory map of the twelfth embodiment.
FIG. 16 shows the memory access software of the twelfth embodiment.
FIG. 17 is a block diagram of a multiprocessor device according to the thirteenth embodiment of the present invention.
FIG. 18 is a block diagram of another multiprocessor device according to the thirteenth embodiment of the present invention.
FIG. 19 is a block diagram of still another multiprocessor device according to the thirteenth embodiment of the present invention.
FIG. 20 is a block diagram of a multiprocessor device according to the fourteenth embodiment of the present invention.
FIG. 21 is a block diagram of a conventional multiprocessor device.

Explanation of symbols

11: CPU; 11a: shared bus terminal; 11b, 11c: non-shared bus terminals; 12, 62: local cache memory; 13a, 63a: local non-shared bus; 13b: local shared bus; 14: processor unit; 15a, 55a, 65a: global non-shared bus; 15b, 55b: global shared bus; 16: shared cache memory; 17a, 57a, 67a: non-shared interface; 17b, 37, 57b: shared interface; 19a, 69a: non-shared memory; 39: external memory; 54: recursive processor unit.

Claims

1. A multiprocessor device comprising:
a plurality of processor units, each including a CPU and a first cache memory that is connected to the CPU and exclusively stores information for that CPU;
a first shared bus connecting the plurality of processor units to one another;
a second cache memory connected to the plurality of processor units and storing information shared by the plurality of processor units; and
an interface unit connected to the first shared bus and outputting a control signal for accessing an external memory that stores shared information shared by the plurality of processor units.

2. The multiprocessor device according to claim 1, wherein the second cache memory is a cache memory positioned at a lower level than the first cache memory and is connected to the first shared bus.