JPH028946A

JPH028946A - Cache memory control system

Info

Publication number: JPH028946A
Application number: JP63159699A
Authority: JP
Inventors: Katsumi Nakamura; 克己中村
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1988-06-28
Filing date: 1988-06-28
Publication date: 1990-01-12
Anticipated expiration: 2009-08-10
Also published as: JPH0661065B2

Abstract

PURPOSE: To carry out processing without invalidating the contents of the cache memories, by broadcasting write data to all the central processing units and writing it into the cache memory corresponding to each recognized central processing unit. CONSTITUTION: During a write operation, the write data is broadcast to all the CPUs (2a-2d), and the broadcast write data is actually written, via a transfer means (data bus 11), into the cache memory corresponding to each CPU recognized by a recognition means (parallel processing mode flags 9a-9d). Assuming that array elements A1 to A8 are in the same block of the main memory device 1, the block is read into all four CPUs (2a-2d), and all of them write to it. In this case, CPU 2a writes element A1 and CPU 2b writes element A2 of the same block.

Description

[Detailed Description of the Invention]

[Industrial Field of Application]

This invention relates to a cache memory control system for a multiprocessor system that shares a main storage device.

[Prior Art]

FIG. 4 is a block diagram showing the configuration of a cache memory control system described, for example, in "COMPUTER DESIGN" by Glen G. Langdon, Jr. (COMPUTEACH PRESS INC, 1982). In the figure, 1 is a main storage device that stores the data required for data processing; 2 is a central processing unit (CPU) that shares the main storage device 1 and performs the computation and control for data processing; 3 is a cache memory for speeding up data transfer between the central processing unit 2 and the main storage device 1; 4 is a read data bus; 5 is a main memory read path; 6 is a write data bus; 7 is a load-through path; and 8 is a store-through path.

Next, the operation will be described. When the central processing unit 2 accesses data from the main storage device 1, the cache memory 3 keeps a copy of that data.

In general, data that has been used once is said to be highly likely to be used again, so subsequent accesses can often be served from the cache memory 3. Because the cache memory 3 can normally be accessed much faster than the main storage device 1, data that is present in the cache memory 3 can be accessed very quickly. When the data is not in the cache memory 3, the main storage device 1 must be accessed, which is slower.

When the central processing unit 2 accesses data, it first issues a request for the required data, and a test is made as to whether the requested data is present in the cache memory 3. If it is, the data in the cache memory 3 is read and delivered directly over the read data bus 4. If the required data is not in the cache memory 3, the data is first read from the main storage device 1 into the cache memory 3 over the main memory read path 5, and the cache memory 3 is then read. Alternatively, the data is read into the central processing unit 2 via the load-through path 7 at the same time as it is read into the cache memory 3 over the main memory read path 5. A write operation performs the same test, and the data is written to the main storage device 1.
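
The read path described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name `Cache`, the per-address (rather than per-block) bookkeeping, and the hit and miss counters are all hypothetical and do not come from the specification.

```python
# Illustrative sketch only; names and per-address bookkeeping are assumptions.
class Cache:
    def __init__(self, main_memory):
        self.main_memory = main_memory  # stands in for the main storage device 1
        self.lines = {}                 # address -> data held in the cache memory 3
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Test whether the requested data is present in the cache.
        if address in self.lines:
            self.hits += 1
            return self.lines[address]
        # Miss: fetch from main memory, fill the cache, and hand the value
        # to the CPU in the same step (the load-through case).
        self.misses += 1
        data = self.main_memory[address]
        self.lines[address] = data
        return data


main_memory = {0x10: "a", 0x14: "b"}
cache = Cache(main_memory)
first = cache.read(0x10)   # miss, filled from main memory
second = cache.read(0x10)  # hit, served from the cache
```

The second read of the same address is served from the cache without touching main memory, which is the speed-up the cache memory 3 is meant to provide.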

Therefore, it is desirable to keep the most frequently used data in the cache memory 3, whose capacity is considerably smaller than that of the main storage device 1. Normally, data in the cache memory 3 is replaced by returning the least recently used data to the main storage device 1, so that the most recently used data remains in the cache memory 3 for a long time.
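
This replacement policy is what is now usually called least recently used (LRU) replacement. A minimal sketch follows, assuming a tiny cache of fixed capacity; the class name `LRUCache` and the use of Python's `collections.OrderedDict` are illustrative choices, not details from the specification.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical cache that evicts the least recently used entry."""

    def __init__(self, capacity, main_memory):
        self.capacity = capacity
        self.main_memory = main_memory
        self.lines = OrderedDict()  # oldest use first, newest use last

    def access(self, address):
        if address in self.lines:
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]
        if len(self.lines) >= self.capacity:
            # Evict the least recently used entry and return it to main memory.
            victim, data = self.lines.popitem(last=False)
            self.main_memory[victim] = data
        self.lines[address] = self.main_memory[address]
        return self.lines[address]


main_memory = {n: n * 10 for n in range(4)}
cache = LRUCache(capacity=2, main_memory=main_memory)
cache.access(0)
cache.access(1)
cache.access(0)   # address 0 is now the most recently used
cache.access(2)   # evicts address 1, the least recently used
resident = list(cache.lines)
```

After these four accesses the cache holds addresses 0 and 2; address 1, which was used least recently, has been returned to main memory.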

Furthermore, as shown in FIG. 5, a multiprocessor system in which a plurality of central processing units, each with its own cache memory, are coupled and share the main storage device 1 has a more complicated problem. Because the central processing units share the same main storage device 1, the cache memories 3a to 3d of the central processing units 2a to 2d may hold copies of the same data from the main storage device 1. If any central processing unit rewrites the contents of its cache memory, the pre-update copies of that data in the cache memories 3a to 3d of all the central processing units 2a to 2d are no longer correct and must be invalidated so that they are not used by mistake. This is called cache memory invalidation. It normally occurs on every write by a central processing unit.

FIG. 6 shows the data store operation (write operation) of one central processing unit 2 in a multiprocessor system. The flowchart assumes a store-through scheme, in which a store writes to the main storage device 1 even on a cache hit.

In FIG. 6, step S1 determines whether the data is present in the cache memory. If it is, the process moves to step S3 and stores the data in the cache memory 3; if it is not, the process moves to step S2 and stores the data in the main storage device 1.

Step S4 then determines whether the target data is present in the cache memory of another central processing unit. If it is, the process moves to step S6 and invalidates the contents of that central processing unit's cache memory; if it is not, the process moves to step S5 and does nothing.

As shown here, for a data store, whether it is a cache hit or a cache miss, the contents of the cache memories of the other central processing units must be tested, and if the target data is also held by another central processing unit, that cache memory must be invalidated.
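
The conventional store flow of steps S1 to S6 can be sketched as follows. The function name `store`, the dictionary-per-cache representation, and the per-address granularity are assumptions made for illustration. In line with the store-through assumption, the sketch writes the main storage device on every store, whether or not the cache hits.

```python
def store(cpu, address, value, caches, main_memory):
    """Store-through write by `cpu`, invalidating copies held by other CPUs."""
    own = caches[cpu]
    if address in own:            # S1: is the data in this CPU's cache?
        own[address] = value      # S3: update the cached copy
    main_memory[address] = value  # store-through: main memory is always written (S2)
    invalidated = []
    for other, cache in caches.items():
        if other != cpu and address in cache:  # S4: does another cache hold it?
            del cache[address]                 # S6: invalidate that copy
            invalidated.append(other)
    return invalidated            # an empty list corresponds to S5 (nothing to do)


main_memory = {0x20: 1}
caches = {"2a": {0x20: 1}, "2b": {0x20: 1}, "2c": {}, "2d": {}}
invalidated = store("2a", 0x20, 99, caches, main_memory)
```

Here CPU 2a's store removes the copy held by CPU 2b, so 2b must go back to main memory the next time it needs that address.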

In recent multiprocessor systems, in addition to the conventional form of parallel processing, in which multiple jobs are assigned to multiple central processing units to raise system throughput, parallel processing is sometimes performed in which several central processing units execute a single job in order to improve that job's response time. In this case, the data of a single block of the main storage device may be divided among several central processing units, and different central processing units frequently access data in the same block.

For example, as shown in FIG. 2, when a loop in a program is to be processed in parallel by several central processing units, it is common to exploit so-called spatial parallelism and divide the work by loop iteration. In the example of FIG. 2, the DOALL statement indicates parallel processing; assuming four central processing units, each of the four expressions in the loop is assigned to one central processing unit. That is, the loop iterations are divided four ways: the central processing unit 2a handles the first expression, that is, iterations 1, 5, 9, ...; the central processing unit 2b handles the second expression, iterations 2, 6, 10, ...; the central processing unit 2c handles the third expression, iterations 3, 7, 11, ...; and the central processing unit 2d handles the fourth expression, iterations 4, 8, 12, .... This deliberately divides spatially concentrated data, such as the elements of array A, among several central processing units, so the same block of data containing elements of array A ends up in the cache memories 3a to 3d of the central processing units 2a to 2d.
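
The interleaved assignment can be made concrete with a short sketch. It assumes that iteration i touches array element A(i) and that eight consecutive elements share one block; both are illustrative assumptions, as are the function names.

```python
CPUS = ["2a", "2b", "2c", "2d"]


def cpu_for_iteration(i):
    """Iterations are numbered from 1, as in the text."""
    return CPUS[(i - 1) % len(CPUS)]


def block_of(i, elements_per_block=8):
    """Assumed layout: element A(i) lives in block (i - 1) // elements_per_block."""
    return (i - 1) // elements_per_block


# Which iterations each CPU executes among the first twelve.
assignment = {
    cpu: [i for i in range(1, 13) if cpu_for_iteration(i) == cpu] for cpu in CPUS
}

# Which CPUs touch the first block (elements A1 to A8 under the assumed layout).
sharers_of_block_0 = sorted(
    {cpu_for_iteration(i) for i in range(1, 9) if block_of(i) == 0}
)
```

Under these assumptions every one of the four CPUs touches the first block, which is why all four cache memories end up holding a copy of it.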

To keep the data in the cache memories 3a to 3d valid, each time a central processing unit writes to the block, the address of the written block must be sent to the other central processing units, and any central processing unit holding the block at that address must invalidate the corresponding block in its cache memory. When a central processing unit whose block was invalidated wants to use the block again, it must fetch the block from the main storage device 1 once more, even though only part of it has changed.

In the example of FIG. 2, this situation arises every time a central processing unit 2a to 2d processes a single operation. Each time, data in the cache memories 3a to 3d of the central processing units 2a to 2d is invalidated, which greatly affects the performance of the multiprocessor system.
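
A rough count shows how quickly this cost accumulates. The sketch below assumes a single block holding eight elements, four CPUs that all start with a valid copy, and one write per element in round-robin order; the function name and the counting model are hypothetical simplifications.

```python
CPUS = ["2a", "2b", "2c", "2d"]


def run_with_invalidation(num_elements=8):
    holders = set(CPUS)  # every cache starts with a valid copy of the block
    invalidations = 0
    refetches = 0
    for i in range(1, num_elements + 1):
        writer = CPUS[(i - 1) % len(CPUS)]
        if writer not in holders:      # its copy was invalidated earlier
            refetches += 1             # re-read the block from main memory
            holders.add(writer)
        invalidations += len(holders) - 1  # all other valid copies are dropped
        holders = {writer}
    return invalidations, refetches


invalidations, refetches = run_with_invalidation()
```

Under these assumptions, eight writes to one block cause ten invalidations and seven block re-reads from main memory, so nearly every write pays for a main memory transfer.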

[Problems to be Solved by the Invention]

Because the conventional cache memory control system operates as described above, in a multiprocessor system executing parallel processing a request to invalidate cache memory contents is issued for every write by each central processing unit, and each time data is transferred between the cache memory and the main storage device. This significantly degrades processing performance.

This invention was made to solve the above problem. Its object is to obtain a cache memory control system that improves processing performance by carrying out processing without invalidating the contents of the cache memories of all the central processing units on every write operation.

[Means for Solving the Problems]

The cache memory control system according to this invention comprises recognition means (parallel processing mode flags 9a to 9d) for recognizing, when parallel data processing is being executed by two or more designated central processing units, that a designated central processing unit is one of the group of central processing units 2a to 2d, and transfer means (data bus 11) for transferring data directly between the cache memories 3a to 3d. During a write operation, the write data is broadcast to all the central processing units 2a to 2d, and is transferred via the transfer means to, and actually written into, the cache memory corresponding to each central processing unit recognized by the recognition means.

[Operation]

In the cache memory control system of this invention, during a write operation the write data is broadcast to all the central processing units 2a to 2d, and the broadcast write data is actually written, via the transfer means (data bus 11), into the cache memory corresponding to each central processing unit recognized by the recognition means (parallel processing mode flags 9a to 9d).

[Embodiment of the Invention]

FIG. 1 is a block diagram showing the configuration of a multiprocessor system employing a cache memory control system according to an embodiment of this invention. In the figure, 1 is a main storage device that stores the data required for data processing; 2a to 2d are a plurality of central processing units (CPUs) that share the main storage device 1 and perform the computation and control for data processing; 3a to 3d are a plurality of cache memories, one provided for each of the central processing units 2a to 2d, for speeding up data transfer between the central processing units 2a to 2d and the main storage device 1; 10 is a global memory path through which the central processing units 2a to 2d access the main storage device 1; 11 is a data bus that serves as the transfer means for transferring data directly between the cache memories 3a to 3d and that broadcasts write data to all the central processing units 2a to 2d during a write operation; and 9a to 9d are parallel processing mode flags serving as the recognition means for recognizing, when parallel data processing is being executed by two or more designated central processing units, that a designated central processing unit is one of the group of central processing units 2a to 2d.

Next, the operation will be described.

FIG. 1 shows a multiprocessor system with four central processing units 2a to 2d. When such a system performs parallel processing, the simplest and most common approach is to divide the loop portions of a program. When parallelization is done automatically, for example by a compiler, the division is often this simple because parallelization is difficult. For example, as shown in FIG. 2, a loop with many iterations is divided four ways and rewritten as indicated by the DOALL statement.

The DOALL statement indicates parallel processing and shows that each of the four expressions in the loop is assigned to one central processing unit. That is, the loop iterations are divided four ways: the central processing unit 2a handles the first expression, that is, iterations 1, 5, 9, ...; the central processing unit 2b handles the second expression, iterations 2, 6, 10, ...; the central processing unit 2c handles the third expression, iterations 3, 7, 11, ...; and the central processing unit 2d handles the fourth expression, iterations 4, 8, 12, ....

FIG. 3 shows the data in the cache memories 3a to 3d of the central processing units 2a to 2d at this time. Assuming that array elements A1 to A8 are in the same block in the main storage device 1, this block is read into all four central processing units 2a to 2d, and all four write to it. In this case, the central processing unit 2a writes element A1, the central processing unit 2b writes the adjacent element A2, the central processing unit 2c writes the next element A3, and the central processing unit 2d writes the element A4 adjacent to A3, all into the same block. This pattern continues, with each central processing unit always writing the element adjacent to the one its neighbor wrote.

At this time, the parallel processing mode flags 9a to 9d in the central processing units 2a to 2d are all on, indicating that all the central processing units 2a to 2d are in parallel processing mode. Accordingly, on a write operation, instead of sending the written address to the other central processing units and invalidating their cache memories, the write data is broadcast over the data bus 11 to all the central processing units together with the address of the written data. Each central processing unit 2a to 2d whose parallel processing mode flag 9a to 9d is on and which holds the block containing the written data takes in the broadcast write data and actually writes it into its cache memory 3a to 3d. The contents of the cache memories 3a to 3d thus remain valid, and invalidation is unnecessary.
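
The broadcast write of this embodiment can be sketched as follows. The class `CpuCache`, the function `broadcast_write`, the store-through write to main memory, and the fallback to invalidation when a flag is off are all assumptions added for illustration; the specification describes only the behavior with the flags on.

```python
class CpuCache:
    def __init__(self, name, parallel_mode):
        self.name = name
        self.parallel_mode = parallel_mode  # parallel processing mode flag (9a to 9d)
        self.block = {}                     # address -> data for the held block

    def snoop(self, address, value):
        """Handle a write broadcast on the shared data bus (11)."""
        if address not in self.block:
            return False                    # this cache does not hold the data
        if self.parallel_mode:
            self.block[address] = value     # take the data in; the copy stays valid
            return True
        del self.block[address]             # assumed fallback: conventional invalidation
        return False


def broadcast_write(writer, address, value, caches, main_memory):
    writer.block[address] = value
    main_memory[address] = value            # assumed store-through to main memory
    return [c.name for c in caches if c is not writer and c.snoop(address, value)]


main_memory = {f"A{i}": 0 for i in range(1, 9)}
caches = [CpuCache(n, parallel_mode=True) for n in ("2a", "2b", "2c", "2d")]
for c in caches:
    c.block = dict(main_memory)             # all four CPUs have read the block

updated_on_first_write = broadcast_write(caches[0], "A1", 1, caches, main_memory)
for i in range(2, 9):                       # CPU 2b writes A2, 2c writes A3, ...
    broadcast_write(caches[(i - 1) % 4], f"A{i}", i, caches, main_memory)

all_consistent = all(c.block == main_memory for c in caches)
```

After all eight writes every cache still holds a valid, up-to-date copy of the block, so no CPU has to re-read it from main memory.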

The above embodiment was described for a multiprocessor system operating in parallel processing mode, but some benefit can also be expected when the system runs in normal mode. Also, in the embodiment, only during parallel processing, instead of invalidating cache memories on a data store, the store data (write data) is broadcast to all the central processing units and written into the cache memories of those whose parallel processing mode flag is on. This reflects the fact that in parallel processing loops are often divided as described above, so it is very likely that several central processing units share the data of the same block. In some real jobs, however, the user explicitly declares in the program that several central processing units are to be used. Data may be shared among central processing units in this case as well. The intended effect is therefore also obtained if the user, knowing that the same block of data will be accessed by several central processing units, designates in software the central processing units to be used, and the designated central processing units broadcast store data without invalidating caches.

[Effects of the Invention]

As described above, according to this invention, during a write operation the write data is broadcast to all the central processing units and is transferred via the transfer means to, and actually written into, the cache memory corresponding to each central processing unit recognized by the recognition means. Processing therefore proceeds without invalidating the contents of the cache memories of all the central processing units on every write operation, and the data in the cache memories remains valid. This reduces the exchange of data between the cache memories and the main storage device and thus improves processing performance.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of a multiprocessor system employing a cache memory control system according to an embodiment of this invention. FIG. 2 is a diagram showing an example of a typical program processed in parallel in this embodiment. FIG. 3 is a diagram showing how data is divided among the cache memories of the central processing units when that program is executed in this embodiment. FIG. 4 is a block diagram showing the configuration of a conventional cache memory control system. FIG. 5 is a block diagram showing the configuration of a multiprocessor system employing the conventional cache memory control system. FIG. 6 is a flowchart showing the operation of the cache memory in this conventional example.

Claims

[Claims]

In a multiprocessor system comprising a main storage device that stores the data required for data processing, a plurality of central processing units that share the main storage device and perform the computation and control for data processing, and a plurality of cache memories, one provided for each of the central processing units, for speeding up data transfer between the central processing units and the main storage device, a cache memory control system comprising: recognition means for recognizing, when parallel data processing is being executed by two or more designated central processing units, that a designated central processing unit is one of the group of the plurality of central processing units; and transfer means for transferring data directly between the cache memories, characterized in that, during a write operation, write data is broadcast to all the central processing units and is transferred via the transfer means to, and actually written into, the cache memory corresponding to each central processing unit recognized by the recognition means.