JPH05233572A

JPH05233572A - Process dispatch system for multiprocessor

Info

Publication number: JPH05233572A
Application number: JP3476092A
Authority: JP
Inventors: Takaaki Sawada; 貴章澤田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1992-02-21
Filing date: 1992-02-21
Publication date: 1993-09-10

Abstract

PURPOSE: To execute processes repeatedly, in time slices, on the single processor corresponding to a local run queue, by holding processes or process groups, taken in order of priority from the processes linked to a global run queue accessible from all processors, in the local run queue for a fixed time. CONSTITUTION: The processes in a global run queue 112 are fetched in order of priority into the local run queues 106-108 provided for the respective processors, and are then executed successively in time slices. The processors 101-103 include processing means that returns the processes fetched into the local run queues and executed there to the global run queue 112 after a fixed time.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a process dispatch method for a multiprocessor system in which each processor has its own cache memory, the method improving the effectiveness of the cache memory (the cache hit rate) while maintaining the load balance among the processors.

[0002]

[Prior Art] Conventionally, in a multiprocessor system, there is a method in which a plurality of executable processes (tasks, jobs, threads, etc.) are placed, in order of arrival, priority, or the like, on a run queue (also called a ready queue) accessible from all processors, and each processor dispatches processes in turn from the head of the queue in time slices.

[0003] However, when process scheduling of this kind is used, processes move frequently among the processors. In the extreme case, the executing processor changes every time a context switch occurs and the process is dispatched again.
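The migration described in paragraph [0003] can be illustrated with a minimal sketch. It assumes a simple lock-step model that is not in the source: in each round every processor takes the process at the head of the single shared queue, runs it for one time slice, and then all of them are put back at the tail. The function name `simulate_shared_queue` and its parameters are hypothetical.

```python
from collections import deque

def simulate_shared_queue(n_cpus, n_procs, rounds):
    """Return, for each process, the sequence of CPUs it ran on."""
    run_queue = deque(range(n_procs))          # one queue shared by all CPUs
    history = {pid: [] for pid in range(n_procs)}
    for _ in range(rounds):
        running = []
        for cpu in range(n_cpus):
            if not run_queue:
                break
            pid = run_queue.popleft()          # dispatch from the head
            history[pid].append(cpu)
            running.append(pid)
        run_queue.extend(running)              # time slice over: back to the tail
    return history

history = simulate_shared_queue(n_cpus=3, n_procs=4, rounds=4)
print(history[0])  # process 0 runs on a different CPU each time
```

With three processors and four processes, process 0 runs on CPU 0, then CPU 1, then CPU 2, so whatever it left in one cache is of no use at its next dispatch.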

[0004] In such a state, the benefit of caching that comes from the access locality of a process (program) is reduced, so the cache hit rate of each CPU falls and the throughput of the entire system drops.

[0005] One way to avoid this problem is to fix the processor on which each process executes. With this method, however, the number of executable processes does not stay even across the processors, and the load balance among the processors deteriorates.
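A small sketch of why fixed assignment can unbalance the load, as paragraph [0005] states. The assignment rule (`pid % 3`) and the set of finished processes are made up for illustration; the source does not say how processes would be pinned.

```python
from collections import Counter

def pinned_load(assignments, finished):
    """Count runnable processes per CPU under a fixed process-to-CPU map."""
    return Counter(cpu for pid, cpu in assignments.items() if pid not in finished)

# Nine processes pinned round-robin to three CPUs (hypothetical rule).
assignments = {pid: pid % 3 for pid in range(9)}
# Suppose processes 1, 4, 7 and 2 have terminated.
load = pinned_load(assignments, finished={1, 4, 7, 2})
print(load[0], load[1], load[2])
```

CPU 1 is left with nothing to run while CPU 0 still has three runnable processes, and a fixed assignment has no mechanism for moving work across.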

[0006]

[Problem to Be Solved by the Invention] As described above, the conventional dispatch methods for multiprocessors cannot enhance the effect of caching without upsetting the load balance among the processors, and so the throughput of the system does not improve.

[0007] An object of the present invention is to eliminate the above drawbacks of the prior art and to provide a process dispatch method that, in a multiprocessor system, enhances the effect of caching and improves system throughput without upsetting the load balance among the processors.

[0008]

[Means for Solving the Problem] To achieve the above object, the process dispatch method for a multiprocessor according to the present invention is characterized in that, in a multiprocessor system having a global run queue accessible from all processors for holding executable processes, a local run queue provided for each individual processor, and a cache memory for each processor, there are provided: means for fetching processes, in order of priority, from among the processes in the global run queue into a local run queue; means for sequentially executing, in time slices, the processes fetched into the local run queue; and means for returning to the global run queue, after a fixed time has elapsed, the processes that were fetched into the local run queue and executed.

[0009] The present invention is further characterized in that, in a multiprocessor system having a global run queue accessible from all processors for holding executable processes, a local run queue provided for each individual processor, and a cache memory for each processor, there are provided: means for fetching, from among the processes in the global run queue, a plurality of processes that share the same address space into a local run queue; means for sequentially executing, in time slices, the processes fetched into the local run queue; and means for returning the processes fetched into the local run queue to the global run queue after a fixed time has elapsed.

[0010]

[Operation] According to the present invention, each of the processors takes a process or process group from among the processes in the global run queue according to priority, address space, or the like, holds it in its local run queue for a certain fixed time only, and executes it repeatedly in time slices, thereby improving the cache hit rate of processor accesses. At the same time, processes or process groups are moved at certain timings between the global run queue, which is accessible from all processors, and each local run queue, which prevents the load balance among the processors from breaking down. In this way, in a multiprocessor, the effect of caching is enhanced and system throughput is improved without upsetting the load balance among the processors.

[0011]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the first embodiment of the present invention.

[0012] The shared-memory multiprocessor system of this embodiment shown in FIG. 1 is illustrated as comprising three processors (PM0, PM1, PM2) 101, 102, 103 and a shared memory (CM) 105 connected to these processors (PM0, PM1, PM2) 101, 102, 103 via a memory bus (BUS) 104. Each of the processors (PM0, PM1, PM2) 101, 102, 103 has its own dedicated cache memory.

[0013] The shared memory (CM) 105 holds: a global run queue (GRQ) 112 accessible from all of the processors (PM0, PM1, PM2) 101, 102, 103; a global run queue counter (GC) 113 that stores the number of processes (P) awaiting execution that are linked to the global run queue (GRQ) 112; local run queues (LRQ0, LRQ1, LRQ2) 106, 107, 108, provided one for each of the processors (PM0, PM1, PM2) 101, 102, 103, to which processes (P) 114 awaiting execution are linked; and local run queue counters (LC0, LC1, LC2) 109, 110, 111 that store the number of processes in the local run queues (LRQ0, LRQ1, LRQ2) 106, 107, 108.

[0014] That is, the local run queue (LRQ0) 106 and the local run queue counter (LC0) 109 are provided for the processor (PM0) 101, the local run queue (LRQ1) 107 and the local run queue counter (LC1) 110 for the processor (PM1) 102, and the local run queue (LRQ2) 108 and the local run queue counter (LC2) 111 for the processor (PM2) 103.

[0015] The values of the global run queue counter (GC) 113 and of the local run queue counters (LC0, LC1, LC2) 109, 110, 111 are changed (updated) whenever a process is created in or deleted from the corresponding run queue.
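The layout of the shared memory described in paragraphs [0013] to [0015] can be sketched as follows. This is only an illustrative model: the class names `RunQueue` and `SharedMemory` and their methods are hypothetical, and a real implementation would also need locking around the shared queues, which the source does not discuss.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class RunQueue:
    """A run queue together with its counter (GC for the GRQ, LCn for LRQn)."""
    procs: deque = field(default_factory=deque)
    counter: int = 0

    def enqueue(self, proc):
        self.procs.append(proc)
        self.counter += 1          # counter is updated on every insertion

    def dequeue(self):
        proc = self.procs.popleft()
        self.counter -= 1          # ... and on every removal
        return proc

@dataclass
class SharedMemory:
    """Contents of the shared memory (CM): one GRQ and one LRQ per processor."""
    n_processors: int = 3
    grq: RunQueue = field(default_factory=RunQueue)
    lrqs: list = field(default_factory=list)

    def __post_init__(self):
        self.lrqs = [RunQueue() for _ in range(self.n_processors)]

cm = SharedMemory()
for name in ["P0", "P1", "P2", "P3"]:
    cm.grq.enqueue(name)
cm.lrqs[0].enqueue(cm.grq.dequeue())   # PM0 fetches one process
print(cm.grq.counter, cm.lrqs[0].counter)
```

Keeping the counter next to the queue means a scheduler can read the queue length without walking the list, which is how the counters are used later in the description.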

[0016] For example, suppose that the number of processes that the scheduler of each processor (PM0, PM1, PM2) 101, 102, 103 fetches at one time from the global run queue (GRQ) 112 into its own local run queue (LRQ0, LRQ1, LRQ2) 106, 107, 108 is fixed at X.

[0017] Each processor (PM0, PM1, PM2) 101, 102, 103 then executes the process group fetched into its local run queue (LRQ0, LRQ1, LRQ2) 106, 107, 108 sequentially in time slices. While the sum of the CPU time consumed by the processes (here called [TCT]) is within Y seconds, the process group is kept in the processor's own local run queue. Once [TCT] has passed Y seconds, all processes linked to that local run queue, together with the process that was executing on the processor, are returned to the global run queue (GRQ) 112. X processes are then fetched again from the global run queue (GRQ) 112 into the emptied local run queue.

[0018] Note that the CPU consumption used in calculating [TCT] is not the time consumed measured from the moment the process was created (taken as "0"), but the time consumed since the process was fetched into the local run queue. FIG. 2 is a flowchart showing the algorithm of the dispatch method according to the embodiment shown in FIG. 1.
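The accounting rule for [TCT] in paragraph [0018] can be made concrete with a short sketch. The class `Proc`, its fields, and the function `tct` are hypothetical names chosen for illustration; the source only states that consumption is counted from the moment of fetch rather than from process creation.

```python
from dataclasses import dataclass

@dataclass
class Proc:
    name: str
    total_cpu: float = 0.0      # CPU time since the process was created
    cpu_at_fetch: float = 0.0   # snapshot of total_cpu taken at fetch time

    def on_fetch(self):
        """Called when the process is fetched into a local run queue."""
        self.cpu_at_fetch = self.total_cpu

    def run(self, seconds):
        self.total_cpu += seconds

    def cpu_since_fetch(self):
        return self.total_cpu - self.cpu_at_fetch

def tct(local_procs):
    """[TCT]: sum of CPU time consumed since fetch by the fetched processes."""
    return sum(p.cpu_since_fetch() for p in local_procs)

a = Proc("A", total_cpu=5.0)    # long-running process with 5 s of history
b = Proc("B", total_cpu=0.5)
for p in (a, b):
    p.on_fetch()
a.run(0.25)
b.run(0.5)
print(tct([a, b]))              # only the time since fetch counts
```

Process A's earlier 5 seconds do not count, so a long-lived process does not trigger an immediate return to the global run queue.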

[0019] The scheduler corresponding to each processor (PM0, PM1, PM2) 101, 102, 103 first fetches X processes from the global run queue (GRQ) 112 into its own local run queue (LRQ0, LRQ1, LRQ2) 106, 107, 108 (step a1 in FIG. 2).

[0020] At this time, the value of the local run queue counter (LC0, LC1, LC2) 109, 110, 111 is set to X, and [TCT] is initialized to "0" (step a2 in FIG. 2).

[0021] Next, the processor (PM0, PM1, PM2) 101, 102, 103 dispatches one of the X processes linked to its own local run queue (LRQ0, LRQ1, LRQ2) 106, 107, 108 and executes it in a time slice (steps a3 and a4 in FIG. 2).

[0022] After the process has used up its time quantum, the CPU time consumed by the process (in this case, the length of the time quantum) is added to [TCT] (step a5 in FIG. 2).

[0023] If the value of [TCT] now exceeds Y seconds, the scheduler of that processor returns all X processes that it fetched into its local run queue to the global run queue (GRQ) 112, and again fetches X new processes from the global run queue (GRQ) 112 into its local run queue (steps a7 and a1 in FIG. 2).

[0024] Conversely, if the value of [TCT] is within Y seconds, the processor returns the process it had been executing to its own local run queue, then dispatches the next process from its local run queue and continues execution (steps a6, a3, a4, ... in FIG. 2).
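Steps a1 to a7 of FIG. 2, as described in paragraphs [0019] to [0024], can be sketched for a single processor as follows. This is a simplified simulation under stated assumptions, not the patented implementation: processes never block or terminate, every process uses its full quantum, the global run queue is a plain FIFO with the highest priority at the head, and returned processes are appended at its tail (the source does not specify the return order). The function name and parameters are hypothetical.

```python
from collections import deque

def run_first_embodiment(grq, x, y, quantum, n_slices):
    """Simulate one processor's scheduler for n_slices time slices.

    grq: deque of process names (global run queue, highest priority first)
    x: processes fetched per batch; y: [TCT] limit; quantum: slice length
    Returns a list of (process, batch_number) pairs in execution order.
    """
    trace = []
    batch = 0
    lrq = deque()
    tct = 0.0
    for _ in range(n_slices):
        if not lrq:
            # a1: fetch up to X processes from the GRQ; a2: reset [TCT]
            for _ in range(min(x, len(grq))):
                lrq.append(grq.popleft())
            tct = 0.0
            batch += 1
            if not lrq:
                break
        proc = lrq.popleft()            # a3: dispatch
        trace.append((proc, batch))     # a4: run for one time slice
        tct += quantum                  # a5: add consumed CPU time to [TCT]
        if tct > y:
            # a7: return the whole batch, including the running process
            grq.extend(lrq)
            grq.append(proc)
            lrq.clear()
        else:
            lrq.append(proc)            # a6: back to the local run queue
    return trace

grq = deque(["A", "B", "C", "D"])
trace = run_first_embodiment(grq, x=2, y=3, quantum=1, n_slices=8)
print(trace)
```

In this run, A and B alternate on the processor until [TCT] passes Y, after which the batch is swapped for C and D. Between swaps no process leaves the processor, which is the property the description relies on for cache reuse.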

[0025] When a dispatch method following this algorithm is used, the state of the global run queue and the local run queues at a given point in time is, for example, as shown in FIG. 3.

[0026] In FIG. 3, 301, 302, 303 correspond to the processors (PM0, PM1, PM2) 101, 102, 103 of FIG. 1; 304, 305, 306 correspond to the local run queues (LRQ0, LRQ1, LRQ2) 106, 107, 108 of FIG. 1; and 308 corresponds to the global run queue (GRQ) 112 of FIG. 1. 307 is a process (P) awaiting execution.

[0027] In the state shown in FIG. 3, at most X-1 processes (P) are linked to each local run queue (LRQ0, LRQ1, LRQ2) 304, 305, 306 at any time. (One process is being executed by the processor. In addition, processes awaiting execution disappear because of I/O waits, termination, and so on, so the number of processes in a local run queue decreases.)

[0028] Until it is returned to the global run queue (GRQ) 308 by the scheduler of the processor (PM0, PM1, PM2) 301, 302, 303 to which it is currently assigned, the process group in each local run queue (LRQ0, LRQ1, LRQ2) 304, 305, 306 is never dispatched to another processor, no matter how many context switches occur. This suppresses frequent movement of processes between processors and makes it possible to improve the caching effect.

[0029] In addition, the load balance among the processors can be prevented from breaking down badly. This is explained with reference to FIG. 4. In FIG. 4, 401, 402, 403 correspond to the processors (PM0, PM1, PM2) 101, 102, 103 of FIG. 1 (301, 302, 303 of FIG. 3); 404, 405, 406 correspond to the local run queues (LRQ0, LRQ1, LRQ2) 106, 107, 108 of FIG. 1 (304, 305, 306 of FIG. 3); and 408 corresponds to the global run queue (GRQ) 112 of FIG. 1 (308 of FIG. 3). 407 is a process (P) awaiting execution.

[0030] FIG. 4 shows an example of the state after several seconds of process execution time have elapsed from the state of FIG. 3 (with [TCT] still within Y seconds). In this example, the disappearance of processes has left the number of processes awaiting execution uneven across the processors. Specifically, the process handling of the processor (PM2) 403 has been interrupted by, for example, switches between processes and waits for I/O (idle time), so that the [TCT] of the processor (PM2) 403 is extremely small compared with that of the other processors (PM0, PM1) 401, 402.

[0031] However, once the [TCT] of the processors (PM0, PM1) 401, 402 exceeds Y seconds, the processes for the local run queues (LRQ0, LRQ1) 404, 405 of those processors are fetched afresh from the global run queue (GRQ) 408, so the number of processes awaiting execution becomes even across the processors (PM0, PM1, PM2) 401, 402, 403 and the system returns to the state of FIG. 3.

[0032] In this way, periodically exchanging processes between the local run queues and the global run queue prevents the load balance among the processors from breaking down badly.

[0033] In the above embodiment, scheduling that takes the access locality of processes into account can be considered in order to further improve the caching effect for each processor.

[0034] For example, suppose there are three files F0, F1, and F2, and that the processes accessing file F0 share address space A0, the processes accessing file F1 share address space A1, and the processes accessing file F2 share address space A2.

[0035] In this case, when the scheduler of each processor (PM0, PM1, PM2) 101, 102, 103 in FIG. 1 selects the processes to fetch from the global run queue (GRQ) 112 into its own local run queue (LRQ0, LRQ1, LRQ2) 106, 107, 108, it does not simply select them in order of priority. Instead, processes that access file F0, that is, processes sharing address space A0, are selected preferentially for the processor (PM0) 101; processes that access file F1, that is, processes sharing address space A1, for the processor (PM1) 102; and processes that access file F2, that is, processes sharing address space A2, for the processor (PM2) 103.

[0036] In this way, the process group assigned to each processor (PM0, PM1, PM2) 101, 102, 103 shares its access space to some extent, so the cache hit rate rises further and the caching effect is greatly improved.
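The address-space-aware selection of paragraphs [0034] to [0036] might look like the sketch below. It assumes that each entry in the global run queue carries a label for its address space and that a scheduler falls back to ordinary priority order when too few matching processes are available; the fallback is an interpretation of "preferentially", and the function name `fetch_with_affinity` is hypothetical.

```python
def fetch_with_affinity(grq, preferred_space, x):
    """Pick up to x processes from grq, preferring one address space.

    grq: list of (name, address_space) tuples, highest priority first.
    Selected entries are removed from grq and returned in selection order.
    """
    preferred = [p for p in grq if p[1] == preferred_space]
    others = [p for p in grq if p[1] != preferred_space]
    chosen = (preferred + others)[:x]   # matching processes first, then the rest
    for p in chosen:
        grq.remove(p)
    return chosen

grq = [("P0", "A1"), ("P1", "A0"), ("P2", "A2"), ("P3", "A0"), ("P4", "A1")]
lrq0 = fetch_with_affinity(grq, "A0", x=2)   # PM0 prefers address space A0
lrq1 = fetch_with_affinity(grq, "A1", x=2)   # PM1 prefers A1
lrq2 = fetch_with_affinity(grq, "A2", x=2)   # PM2 prefers A2
print(lrq0, lrq1, lrq2)
```

Here each processor ends up with processes from a single address space, even though they were interleaved in the global run queue.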

[0037] In the dispatch method of the first embodiment described above, processes are moved between a local run queue and the global run queue by treating a fixed number of processes (X in the above example) as a single group, the move taking place when the [TCT] of the processes in the local run queue (the sum of the CPU time consumed by each process) reaches a fixed value (Y seconds in the above example). In contrast to that embodiment, which handles processes in groups, a second embodiment that handles individual processes is described below. In this second embodiment, the CPU time consumed by each process in a local run queue is denoted [CT].

[0038] Here, [CT] is the amount of CPU time the process has used, measured from "0" at the moment the process was fetched from the global run queue into the local run queue. The [CT] of a process is therefore valid only while the process is in a local run queue.

[0039] The scheduler of each processor examines the [CT] of the processes in its local run queue, returns any process whose [CT] is greater than Z seconds to the global run queue, and in its place takes one process from the global run queue and links it to its own local run queue.

[0040] As the execution of the processes in a local run queue proceeds, processes awaiting execution may disappear, reducing the number of processes in the local run queue.

[0041] The scheduler of each processor therefore checks the number of processes at suitable times by reading the local run queue counter and, if the number is below a fixed value (the upper limit on the number of processes), replenishes the local run queue from the global run queue until that value is reached. The algorithm of the dispatch method of this second embodiment is shown in FIG. 5.

[0042] Let MXP be the upper limit on the number of processes in a local run queue. First, [(MXP) - (local run queue counter)] processes are fetched from the global run queue into the local run queue, and the [CT] of each fetched process is initialized to "0" (steps b1 and b2 in FIG. 5). The processor dispatches one process from the local run queue and executes it in a time slice (steps b3 and b4). When that execution ends, [CT] is updated (step b5).

[0043] After [CT] has been updated, it is compared with Z seconds to decide whether to exchange processes between the global run queue and the local run queue. If [CT] is greater than Z seconds, the process that has just finished executing is returned to the global run queue rather than the local run queue, and the local run queue is in turn replenished with a process from the global run queue (steps b6, b7, b1, ...).

[0044] If, on the other hand, [CT] is Z seconds or less, the process is returned to the local run queue to await execution, and the processor executes another process (steps b6, b3). In this algorithm, replenishment of the local run queue is performed in synchronization with the moment at which a [CT] exceeds Z seconds.
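Steps b1 to b7 of FIG. 5, as described in paragraphs [0042] to [0044], can be sketched for one processor as follows. As with any simulation of this kind, the model is simplified and rests on assumptions not in the source: processes never block or terminate, each uses its full quantum, and a returned process goes to the tail of the global run queue. The top-up check runs at the start of every pass, but in this model it only moves a process right after one has been sent back, which matches the synchronization noted in paragraph [0044]. The function name and parameters are hypothetical.

```python
from collections import deque

def run_second_embodiment(grq, mxp, z, quantum, n_slices):
    """Simulate one processor's scheduler for n_slices time slices.

    grq: deque of process names (the global run queue)
    mxp: upper limit on local run queue length; z: per-process [CT] limit
    Returns the list of process names in execution order.
    """
    lrq = deque()                   # entries are [name, ct] pairs
    trace = []
    for _ in range(n_slices):
        # b1, b2: top the LRQ up to MXP entries, each starting with [CT] = 0
        while len(lrq) < mxp and grq:
            lrq.append([grq.popleft(), 0.0])
        if not lrq:
            break
        entry = lrq.popleft()       # b3: dispatch
        trace.append(entry[0])      # b4: run for one time slice
        entry[1] += quantum         # b5: update [CT]
        if entry[1] > z:            # b6: compare [CT] with Z
            grq.append(entry[0])    # b7: back to the GRQ; next pass refills
        else:
            lrq.append(entry)       # stay on this processor
    return trace

grq = deque(["A", "B", "C"])
trace = run_second_embodiment(grq, mxp=2, z=2, quantum=1, n_slices=8)
print(trace)
```

Unlike the first embodiment, processes leave one at a time: A is returned after its third slice and replaced by C, then B is returned and replaced by A, whose [CT] starts again from zero.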

[0045] FIG. 6 shows the operation of the second embodiment. In FIG. 6, 601, 602, 603 correspond to the processors (PM0, PM1, PM2) 101, 102, 103 of FIG. 1; 604, 605, 606 correspond to the local run queues (LRQ0, LRQ1, LRQ2) 106, 107, 108 of FIG. 1; and 609 corresponds to the global run queue (GRQ) 112 of FIG. 1. 607 is the process to be moved.

[0046] Here, the [CT] of a process executed on the processor (PM0) 601 reached Z seconds or more immediately after it finished executing, satisfying the condition for being moved to the global run queue (GRQ) 609. That process is therefore passed to the global run queue (GRQ) 609, while one process is fetched from the global run queue (GRQ) 609 into the local run queue (LRQ0) 604.

[0047] In this way, a process linked to a local run queue is executed on only one processor for a while (in this example, until CT > Z seconds), so the caching effect improves.

[0048] FIG. 7 shows an example in which the number of processes in the local run queue is checked using the local run queue counter (LC0) 704 and, because it is less than MXP, the queue is replenished from the global run queue (GRQ) 709. In FIG. 7, 701, 702, 703 correspond to the processors (PM0, PM1, PM2) 101, 102, 103 of FIG. 1; 704, 705, 706 correspond to the local run queues (LRQ0, LRQ1, LRQ2) 106, 107, 108 of FIG. 1; and 709 corresponds to the global run queue (GRQ) 112 of FIG. 1. 707 is the process to be moved, and 708 is a replenishing process.

[0049] By checking the number of processes in the local run queue with the local run queue counter in this way, and replenishing from the global run queue (GRQ) 709 when it is less than MXP, the number of processes is kept reasonably even across the processors and the load balance improves.
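The replenishment rule, fetching [(MXP) - (local run queue counter)] processes, can be sketched on its own. In this illustration `len(lrq)` plays the role of the local run queue counter, and the function name `replenish` and the sample queue contents are hypothetical.

```python
from collections import deque

def replenish(grq, lrq, mxp):
    """Move processes from grq to lrq until lrq holds mxp entries
    or grq is empty. Returns the number of processes moved."""
    shortfall = max(0, mxp - len(lrq))      # MXP minus the local counter
    moved = min(shortfall, len(grq))
    for _ in range(moved):
        lrq.append(grq.popleft())
    return moved

grq = deque(["P5", "P6", "P7", "P8"])
# Three local run queues that have drifted apart as processes disappeared.
lrqs = [deque(["P0", "P1", "P2"]), deque(["P3"]), deque()]
moved = [replenish(grq, lrq, mxp=3) for lrq in lrqs]
print(moved, [len(q) for q in lrqs])
```

Queues that have fallen below MXP are topped up while a full queue takes nothing, so the lengths converge as far as the global run queue's contents allow.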

[0050]

[Effects of the Invention] As described in detail above, according to the present invention, in a multiprocessor system in which each processor has a cache memory and which has a global run queue accessible from all processors for holding executable processes and a local run queue specific to each of the processors, a process or process group is held in a local run queue for a certain fixed time only and is executed repeatedly in time slices on the one processor corresponding to that local run queue, whereby the cache hit rate of processor accesses can be improved. At the same time, processes or process groups are moved at certain timings between the global run queue, which is accessible from all processors, and each local run queue, which prevents the load balance among the processors from being badly disturbed. The throughput of the system can thereby be improved.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention.

FIG. 2 is a flowchart showing the algorithm of the dispatch method of the embodiment shown in FIG. 1.

FIG. 3 is a diagram showing the states of the global run queue and the local run queues when the dispatch method of the first embodiment is used in the system of the embodiment shown in FIG. 1.

FIG. 4 is a diagram showing the states of the global run queue and the local run queues several seconds after the state of FIG. 3 in the system of the embodiment shown in FIG. 1.

FIG. 5 is a flowchart showing the algorithm of the dispatch method according to the second embodiment of the present invention.

FIG. 6 is a diagram showing the movement of processes between the global run queue and a local run queue when the dispatch method of the second embodiment is used in the system configured as shown in FIG. 1.

FIG. 7 is a diagram showing the replenishment of processes from the global run queue to a local run queue in order to maintain the load balance among the processors in the dispatch method of the second embodiment.

[Explanation of symbols]

101, 102, 103, 301, 302, 303, 401, 402, 403, 601, 602, 603, 701, 702, 703 ... Processors (processor modules with cache; PM0, PM1, PM2).
104 ... Memory bus (BUS).
105 ... Shared memory (CM).
106, 107, 108, 304, 305, 306, 404, 405, 406, 604, 605, 606, 704, 705, 706 ... Local run queues (LRQ0, LRQ1, LRQ2).
109, 110, 111 ... Local run queue counters (LC0, LC1, LC2).
112, 308, 408, 609, 709 ... Global run queue (GRQ).
113 ... Global run queue counter (GC).
114, 307, 407, 608, ... Processes awaiting execution (P).
607, 707 ... Processes to be moved.
708 ... Replenishing process.

Claims

[Claims]

1. A process dispatch method in a multiprocessor, characterized in that, in a multiprocessor system having a global run queue accessible from all processors for holding executable processes and a local run queue specific to each of a plurality of processors, each processor having a cache memory, the method comprises: means for fetching processes, in order of priority, from among the processes in the global run queue into a local run queue; means for sequentially executing, in time slices, the process group fetched into the local run queue; and means for returning the process group fetched into the local run queue to the global run queue, based on priority, after the process group has been executed for a fixed time.

2. The process dispatch method in a multiprocessor according to claim 1, further comprising means for fetching a plurality of processes sharing the same address space from among the processes in the global run queue to a local run queue.