JP3331695B2

JP3331695B2 - Time synchronization method

Info

Publication number: JP3331695B2
Application number: JP23060293A
Authority: JP
Inventors: 義博草野; 治彦上埜
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1993-09-17
Filing date: 1993-09-17
Publication date: 2002-10-07
Anticipated expiration: 2017-10-07
Also published as: JPH0784964A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] To overcome the performance limits of tightly coupled computers, distributed-main-memory parallel computers of the multiple-instruction, multiple-data (MIMD) type (so-called loosely coupled computers) have recently entered practical use. In a computer of this configuration, however, the central processing unit (CPU) in each processor element (PE) cannot access global storage (so-called common memory) at high speed.

[0002] If a mechanism for accessing the global storage at high speed existed, it could be used in either of two ways: the programs running on all processor elements (PEs) could access a single time-of-day mechanism (hereinafter, the TOD clock) provided in the parallel computer system, or the initial-value setting of the TOD clock performed in each PE could be coordinated through the global storage. Either way, the TOD clocks in the PEs could be synchronized with high accuracy.

[0003] As noted above, however, in the MIMD-type distributed-main-memory parallel computer the CPU in each PE has no means of accessing the global storage (common memory) at high speed, so each PE must have its own TOD clock. The TOD clocks of the PEs must then be synchronized with the highest possible accuracy.

[0004] In a large-scale parallel computer system, moreover, the PE configuration must be changed dynamically for maintenance and power saving. Specifically, PEs must be attached and detached, and power must be partially switched on and off, while the system is in operation. Even in such cases, when a PE is attached, its TOD clock must be accurately synchronized with the others.

[0005] Furthermore, in a system in which each PE has a cache memory, a cache miss during execution of the time synchronization program (for example, during an instruction fetch or an operand fetch) delays processing and therefore delays the time synchronization itself. Highly accurate time synchronization thus requires a measure that prevents cache misses while the time synchronization program is executed.

[0006]

[Prior Art] FIG. 3 illustrates an MIMD-type distributed-main-memory parallel computer system: FIG. 3(a) shows an example of the overall configuration, and FIG. 3(b) shows the TOD clock. FIG. 4 illustrates a conventional method of synchronizing TOD clocks. FIGS. 5 and 6 illustrate barrier synchronization: FIG. 5 shows an example of the hardware configuration for barrier synchronization, FIGS. 6(a1) and 6(a2) show examples of programs that perform barrier synchronization, and FIG. 6(b) shows an operation time chart for barrier synchronization.

[0007] As shown in FIG. 3(a), each processor element 1 of the MIMD-type distributed-main-memory parallel computer system (hereinafter sometimes referred to as PE(0), PE(1), ...) comprises a central processing unit (hereinafter, CPU) 10, a cache memory (CA) 11, and a main memory (L-ME) 12. In addition, it comprises a mover (MV) 13 (for example, a direct memory access (DMA) mechanism) that transfers data in the main memory (L-ME) 12 of one PE(0) 1 to the other PE(1), ... 1 via the inter-PE data communication network (switch-coupled network) 3 used for data communication among the PEs; a barrier processing unit (BA) 14 that performs barrier synchronization among PE(0), PE(1), ... 1 via the barrier processing network 4; and a TOD clock 100 ((T0), (T1), ...).

[0008] The TOD clock 100 ((T0), ...) indicates a consistent elapsed time from which the date and time can be represented, and is implemented, for example, as a binary counter of the format shown in FIG. 3(b). The TOD clock 100 counts under a predetermined clock; for example, it is incremented by adding "1" to bit 51 of FIG. 3(b) every second. The value of the TOD clock 100 can be set to an arbitrary value by the SETCLOCK (SCK) instruction.
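The counter described above can be sketched in a few lines of Python. This is an illustration only: it assumes a 64-bit register whose bits are numbered 0 (most significant) to 63, so that adding "1" at bit 51 adds 2^12 to the raw value. The width and bit numbering are assumptions, since the format of FIG. 3(b) is not reproduced in this text, and `TodClock`, `tick`, and `set_clock` are hypothetical names.

```python
class TodClock:
    """Toy model of a TOD clock held in a binary counter (assumed 64 bits)."""

    WIDTH = 64                        # assumed register width
    UNIT = 1 << (WIDTH - 1 - 51)      # weight of bit 51 when bit 0 is the MSB
    MASK = (1 << WIDTH) - 1

    def __init__(self, value=0):
        self.value = value & self.MASK
        self.stopped = False          # models a STOP control bit

    def tick(self):
        """Advance by one counting unit (adds 1 at bit 51) unless stopped."""
        if not self.stopped:
            self.value = (self.value + self.UNIT) & self.MASK

    def set_clock(self, value):
        """Stand-in for the SETCLOCK (SCK) instruction: load an arbitrary value."""
        self.value = value & self.MASK


clock = TodClock()
clock.set_clock(5 * TodClock.UNIT)    # load a value worth five counting units
clock.tick()                          # one more unit elapses
print(clock.value // TodClock.UNIT)   # units counted so far
```

Keeping the unit weight in one constant means the sketch can be adapted if the real format numbers its bits differently.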

[0009] For example, in the MIMD-type distributed-main-memory parallel computer system shown in FIG. 3(a), a TOD clock 100 ((T0), (T1), ...) is connected independently to the CPU 10 of each PE(0), PE(1), ... 1. The clock is set to a predetermined value by the SETCLOCK (SCK) instruction executed by that CPU 10 and is referenced by the programs that the CPU 10 executes.

[0010] In a real-time data processing system, these TOD clocks 100 ((T0), (T1), ...) hold a value equivalent to real time, so the CPUs 10 of all PE(0), PE(1), ... 1 must hold the same value. If the values differ, the data processing system may malfunction.

[0011] For this reason, for example at system start-up, a synchronization setting process is performed in advance on the TOD clocks of the PEs so that the TOD clocks (T0) to (Tn) 100 of all PE(0) to PE(n) 1 hold the same value.

[0012] This TOD synchronization setting process is performed, for example, as shown in FIG. 4. First, one PE(0) 1 of the MIMD-type distributed-main-memory parallel computer system acts as the main PE and instructs the other PE(1) to PE(n) 1 to start the synchronization setting (see processing step 200 in FIG. 4). Suppose that at this point the TOD clock (T0) 100 of PE(0) 1 indicates t seconds. In response to the instruction, each of the other PE(1) to PE(n) 1 turns on the so-called STOP bit, provided in a control register (not shown), in order to initialize its TOD clock 100 (see processing step 201 in FIG. 4). Then, as is well known, each of these PEs places the value to be set in its own TOD clock (T1), ... 100, for example t+1 seconds, in a predetermined register and executes the SETCLOCK (SCK) instruction. The value t+1 seconds is thereby set in the TOD clocks (T1) to (Tn) 100 of PE(1) to PE(n) 1, and those clocks remain stopped (see processing step 202 in FIG. 4). Only the TOD clock (T0) 100 of the main PE(0) 1 keeps counting. When that clock carries into the next second, that is, when it switches from t seconds to t+1 seconds, a synchronization signal is issued to the TOD clocks (T1) to (Tn) 100 of the other PE(1) to PE(n) 1, which starts their counting.
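The conventional procedure of FIG. 4 can be replayed with a small Python sketch. It is a simplified model under stated assumptions: clocks count whole seconds, and the synchronization signal reaches every PE instantly, so the propagation delay is deliberately ignored. `conventional_sync` and the dictionary fields are hypothetical names used only for illustration.

```python
def conventional_sync(main_time, n_others):
    """Replay steps 200-202 of FIG. 4 for one main PE and n_others other PEs."""
    target = main_time + 1                      # t + 1 seconds

    # Steps 201-202: every other PE stops its clock and preloads t + 1.
    others = [{"value": target, "running": False} for _ in range(n_others)]

    # Only the main clock keeps counting.
    main = {"value": main_time, "running": True}
    main["value"] += 1                          # carry from t to t + 1

    # On that carry the synchronization signal starts the other clocks.
    if main["value"] == target:
        for clk in others:
            clk["running"] = True
    return main, others


main, others = conventional_sync(main_time=10, n_others=3)
print(main["value"], [clk["value"] for clk in others])
```

In this idealized model all clocks restart from the same value; in real hardware the accuracy depends on how quickly the signal reaches each PE.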

[0013] As a result, all TOD clocks (T0) to (Tn) 100 start running simultaneously from exactly t+1 seconds, and synchronization is achieved. The synchronization signal is sent to the TOD clocks (T1) to (Tn) 100 of PE(1) to PE(n) 1 over, for example, a dedicated control line (not shown).

[0014] If the synchronization signal were instead sent over the inter-PE data communication network 3 of FIG. 3(a), the dedicated control line would be unnecessary. However, notification of the synchronization signal over the inter-PE data communication network 3 takes time, so accurate time synchronization cannot be obtained.

[0015] Next, the barrier synchronization mechanism provided in the MIMD-type distributed-main-memory parallel computer system is described with reference to FIGS. 5 and 6. In a parallel computer, synchronization is needed to control the progress of program processing in each PE(0), ... 1. For this purpose, a special synchronization instruction (barrier synchronization instruction) is embedded wherever synchronization is required. The location where the synchronization instruction is embedded is called a synchronization point, and the synchronization instruction inserted there performs a synchronization operation consisting of two parts.

[0016] The first half of the synchronization operation notifies the other PEs 1 that this PE has arrived at the synchronization point. The second half waits until all PEs 1 designated as participants in the synchronization operation have arrived at the synchronization point.

[0017] FIGS. 6(a1) and 6(a2) show examples of barrier-synchronized programs executed by the CPUs 10 of PE(0), PE(1), ... 1. As illustrated, a synchronization instruction is inserted at a predetermined position in each program.

[0018] FIG. 6(b) is a time chart of the synchronization operation. It shows an example in which PE(1) 1 waits until PE(0) 1, through the synchronization instruction that PE(0) 1 executes, arrives at the synchronization point, and in which PE(0) 1 and PE(1) 1 execute the next instruction in synchronization from the moment both have arrived.

[0019] Such processing requires a hardware mechanism that detects whether each PE(0), PE(1), ... 1 has arrived at the synchronization point and notifies each PE of the detection result.

[0020] FIG. 5 shows this hardware mechanism. Each PE(0) to PE(n) 1 is provided with the illustrated synchronization processor selection mask 140 and synchronization detection circuit 141, and a synchronization processing network (barrier processing network) 4 is provided outside the PEs.

[0021] 1) First, the CPU 10 of each PE(0), ... 1 uses the synchronization processor selection mask 140 to set the group of PEs to which the upcoming synchronization applies. The mask 140 consists of as many latches as there are PE(0) to PE(n) 1 in the parallel computer system, with one bit corresponding to each PE. The CPU 10 sets the bits corresponding to the PEs 1 to be synchronized to "1" and all other bits to "0".

[0022] 2) When the program executed by each CPU 10 reaches the synchronization point where the synchronization instruction is inserted, the CPU performs the processing that informs the other PEs of its arrival, that is, the first half of the synchronization instruction described above. Specifically, it sets "1" in the synchronization point arrival indication latch 142 in the barrier processing unit (BA) 14 of its own PE.

[0023] 3) The value "1" of the synchronization point arrival indication latch 142 is reported to the synchronization processing network (barrier processing network) 4.
4) The synchronization processing network (barrier processing network) 4 gathers the signals of the synchronization point arrival indication latches 142 from all PE(0), ... 1 and returns them to all PE(0), ... 1.

[0024] 5) The synchronization detection circuit 141 in the barrier processing unit (BA) 14 of each PE(0) to PE(n) 1 checks the condition "the synchronization point arrival indication latches 142 from all PEs whose bits in the synchronization processor selection mask 140 are '1' are themselves '1'". When this condition is satisfied, synchronization is regarded as complete, the second half of the synchronization instruction ends, and execution can proceed to the next instruction.

[0025] 6) The synchronization detection circuit 141 reports the completion of synchronization to its own CPU 10 by means of the illustrated synchronization completion signal (SYNE), whereupon the CPU 10 resumes execution with the next instruction.
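Steps 1) to 6) reduce to a bitwise test, sketched below in Python. The selection mask and the gathered arrival latches are modelled as integers with one bit per PE; the assignment of bit i to PE(i) and the name `sync_complete` are assumptions made for illustration, not details taken from FIG. 5.

```python
def sync_complete(mask, arrived):
    """Condition checked by the detection circuit: every selected PE has arrived."""
    return (arrived & mask) == mask


mask = 0b0111             # step 1: PE(0), PE(1), PE(2) take part; PE(3) does not
arrived = 0               # arrival latches as gathered by the barrier network

for pe in (1, 0, 3, 2):   # an arbitrary arrival order
    arrived |= 1 << pe    # step 2: the arriving PE sets its arrival latch
    # steps 3-5: the gathered latches are compared against the mask
    print(f"PE({pe}) arrived, SYNE = {sync_complete(mask, arrived)}")
```

The arrival of PE(3) does not complete the synchronization because its mask bit is "0"; the condition becomes true only once PE(2), the last selected PE, has arrived.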

[0026]

[Problems to Be Solved by the Invention] In a conventional parallel computer system, for example, when the entire system is reset, the TOD clocks (T0), ... 100 of all PE(0), ... 1 are reset to a specific value, for example all "0". With such a time synchronization method, when a new PE(i) 1 is attached while the parallel computer system is in operation, the TOD clock (Ti) 100 of the attached PE(i) 1 cannot be synchronized with the time of the other PE(0), ... 1.

[0027] When the number of PEs 1 is small, as in a tightly coupled parallel computer, special hardware for time synchronization can be provided, for example the time synchronization mechanism based on global memory (common memory) mentioned above. The TOD clocks (T0), ... 100 can then be synchronized by a predetermined procedure of a time synchronization program that distributes, at high speed, a time synchronization signal generated at a specific timing to the TOD clocks of all PE(0), ... 1. However, when the number of PEs 1 reaches several tens to several hundreds, as in the MIMD-type distributed-main-memory parallel computer system described above, accurate time synchronization becomes difficult. The same problem applies to the method described with FIG. 4, in which the main PE(0) 1 synchronizes the clocks over a dedicated line.

[0028] In the method in which time synchronization is performed by a program executed by the CPU 10 using the inter-PE data communication network 3 of the parallel computer system, data communication over the inter-PE data communication network 3 takes time, so the synchronization accuracy of the TOD clocks (T0), ... 100 is poor.

[0029] In view of the above drawbacks, an object of the present invention is to provide a method of performing highly accurate time synchronization in an MIMD-type distributed-main-memory parallel computer system such that: the TOD clock (Ti) 100 of a PE(i) can be synchronized even when PE(i) is attached to the parallel computer dynamically; the synchronization can be performed in absolute time; the TOD clocks can be synchronized with higher accuracy than synchronization over the inter-PE data communication network 3 of the parallel computer system; and no special hardware for synchronizing the TOD clocks, such as a dedicated control line for time synchronization, is required. Specifically, noting that the MIMD-type distributed-main-memory parallel computer system is provided with a barrier synchronization mechanism, which is a mechanism for synchronizing program processing, the invention provides a method that uses this barrier synchronization mechanism to perform time synchronization.

[0030]

[Means for Solving the Problems] FIG. 1 schematically shows one embodiment of the present invention. The above problems are solved by a time synchronization method configured as follows.

[0031] (1) In a multiprocessor system provided with a barrier synchronization mechanism 4, in a first program for time synchronization (program a), the parent processor element PE(0) 1 reads its own TOD clock (T0) 100 and notifies all other PE(1), ... 1 of the reference time thus read, that is, the time (A) to be set in the TOD clocks (T1), ... 100 of the other processor elements PE(1), ... 1. Using the barrier synchronization mechanism 4, each PE(0), ... 1 waits up to the synchronization point designated by the barrier synchronization instruction. At the moment it is recognized that all PE(0), ... 1 are synchronized, the parent PE(0) 1 reads its own TOD clock (T0) 100 and obtains the difference (B) between the time thus read and the time (A), while every other PE(1), ... 1 sets the time (A) in its own TOD clock (T1), ... 100. In the next program for time synchronization (program b), the parent PE(0) 1 notifies all other PE(1), ... 1 of the difference (B). Using the barrier synchronization mechanism 4 as before, each PE(0), ... 1 waits up to the synchronization point designated by the barrier synchronization instruction. At the moment it is recognized that all PE(0), ... 1 are synchronized, each PE(1), ... 1 other than the parent PE(0) 1 adds the difference (B) to the value of its own TOD clock (T1), ... 100 and sets the TOD clock (T1), ... 100 to the result.
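The two-phase procedure of item (1) can be replayed on a shared timeline with the Python sketch below. It rests on idealized assumptions that are not statements from the patent: every clock is modelled as an offset plus a common "true time", so all clocks tick at the same rate, and each barrier is assumed to release all PEs at the same instant. `two_phase_sync` and its parameters are hypothetical names.

```python
def two_phase_sync(parent_offset, child_offsets, t_read, t_barrier1, t_barrier2):
    """Replay programs a and b; return (parent clock, child clocks) at t_barrier2.

    A clock value is offset + true_time. Child clocks start with arbitrary
    offsets, which stands in for PEs that were never synchronized.
    """
    # Program a: the parent reads its clock and sends A over the data network.
    a = parent_offset + t_read

    # Barrier of program a completes at t_barrier1.
    b = (parent_offset + t_barrier1) - a                 # parent: difference B
    child_offsets = [a - t_barrier1 for _ in child_offsets]   # children: clock := A

    # Program b: the parent sends B; barrier of program b completes at t_barrier2.
    child_offsets = [off + b for off in child_offsets]        # children: clock += B

    now = t_barrier2
    return parent_offset + now, [off + now for off in child_offsets]


parent, children = two_phase_sync(
    parent_offset=1000, child_offsets=[0, 57, -3],
    t_read=2, t_barrier1=7, t_barrier2=15)
print(parent, children)
```

The slow transfers of A and B over the data network do not affect the result in this model, because only the instants at which the barriers complete enter the arithmetic.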

[0032] (2) In a multiprocessor system provided with the barrier synchronization mechanism 4 and with cache memory 11, program a for time synchronization described in item (1) is dry-run once so that all of program a is stored in the cache memory 11, and program a is then re-executed. Next, program b for time synchronization is dry-run once so that all of program b is stored in the cache memory 11, and program b is then re-executed. The time synchronization described in item (1) is performed in this way.

[0033] (3) In a multiprocessor system provided with the barrier synchronization mechanism 4 and with cache memory 11, program a and program b for time synchronization described in item (1) are dry-run once so that all of program a and program b is stored in the cache memory 11, and program a and program b are then re-executed. The time synchronization described in item (1) is performed in this way.

[0034]

[Operation] The time synchronization method of the present invention takes advantage of the fact that an MIMD-type distributed-main-memory parallel computer is provided with a barrier synchronization mechanism for controlling the progress of each PE(0) to PE(n) 1. In the first program for time synchronization (program a), executed by the CPU(0) 10 of the parent PE(0) 1, the parent PE(0) 1 reads its own TOD clock (T0) 100 and notifies all PE(1), ... 1 of this value, that is, the time (A) to be set in the TOD clocks (T1), ... 100 of PE(1), ... 1. Using the barrier synchronization mechanism 4, each PE(0), ... 1 waits up to the synchronization point designated by the barrier synchronization instruction. At the moment it is recognized that all PE(0), ... 1 are synchronized, that is, when the synchronization detection circuit 141 described above detects that all PE(0), ... 1 have reached the synchronization point (when the detection signal SYNE is sent), the parent PE(0) 1 reads the absolute time indicated by its own TOD clock (T0) 100 and obtains the difference (B) from the time (A) notified to PE(1), ... 1, while each of the other PE(1), ... 1 sets the time (A) in its own TOD clock (T1), ... 100. In the next program for time synchronization (program b), the parent PE(0) 1 notifies all PE(1), ... 1 of the difference (B) thus obtained. Using the barrier synchronization mechanism 4 as before, each PE(0), ... 1 waits up to the synchronization point designated by the barrier synchronization instruction. At the moment it is recognized that all PE(0), ... 1 are synchronized, each PE(1), ... 1 other than the parent adds the difference (B) to the value of its own TOD clock (T1), ... 100 and sets the result in that TOD clock. Time synchronization is performed in this way.

[0035] Specifically, in time synchronization program a, the time (A) at which the parent PE(0) 1 read its own TOD clock (T0) 100 is notified to each PE(1), ... 1 as the reference time for synchronization, for example over the inter-PE data communication network 3 described above. At the barrier synchronization instruction inserted in program a, which the CPU 10 of each PE(0), ... 1 executes, execution of the next instruction waits until the barrier synchronization point. Once PE(0), ... 1 are synchronized, the parent PE(0) 1 obtains the difference (B) between the absolute time read from its own TOD clock (T0) 100 at that moment and the synchronization reference time (A) notified to PE(1), ... 1. Each of the other PE(1), ... 1, for example with the time-setting instruction that follows when execution resumes, sets the time (A) notified from the parent PE(0) 1 in its own TOD clock (T1), ... 100 and starts counting from there.

[0036] Likewise, in the next time synchronization program b, the parent PE(0) 1 notifies each of the other PE(1), ... 1 of the difference (B) obtained above. At the barrier synchronization instruction inserted in program b, which the CPU 10 of each PE(0), ... 1 executes, execution of the next instruction waits until the barrier synchronization point. Once PE(0), ... 1 are synchronized, each of the other PEs, for example with the time-setting instruction that follows when execution resumes, sets its own TOD clock (T1), ... 100 to the time obtained by adding the difference (B) notified from the parent PE(0) 1 to the value of its own TOD clock (T1), ... 100. That the PE(0), ... 1 of the parallel computer are accurately time-synchronized in this way is clear from Table 1.

[Table 1]
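The accuracy claim can also be checked with a short derivation, given here as an idealized sketch rather than as the content of Table 1. Assume that all clocks advance at the same rate and that each barrier releases every PE at the same real instant; the symbols $T_0(s)$ and $T_i(s)$ (parent and child clock readings at real instant $s$) and $s_r$, $s_1$, $s_2$ are notation introduced for this sketch only.

```latex
\begin{align*}
A &= T_0(s_r)
  && \text{parent reads its clock at instant } s_r \\
B &= T_0(s_1) - A, \qquad T_i(s_1) = A
  && \text{barrier of program a completes at } s_1 \\
T_0(s) - T_i(s) &= B \quad (s \ge s_1)
  && \text{both clocks advance at the same rate} \\
T_i(s_2) + B &= T_0(s_2)
  && \text{value set when the barrier of program b completes at } s_2
\end{align*}
```

Under these assumptions the time taken to deliver A and B over the data network cancels out; any residual error comes from PEs not leaving the barrier at exactly the same instant.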

[0037] Next, when each PE(0), ... 1 of the parallel computer system has a cache memory (CA) 11, the state of the cache memory (CA) 11 differs among the PEs depending on the programs each has executed so far. If, when some PE(i) 1 executes time synchronization programs a and b, those programs are not present in the cache memory (CA) 11 and a cache miss occurs, processing in that PE(i) 1 is delayed.

[0038] In the present invention, therefore, so that no such cache miss occurs when time synchronization programs a and b are executed, the programs are first executed once as a trial (called a dry run) so that they are present in the cache memory (CA) 11, and are then executed again. This eliminates the deviation of the synchronized time caused by cache misses.

[0039] Needless to say, the dry run may be performed for both time synchronization programs a and b at once, after which programs a and b are executed again, or each of programs a and b may be dry-run and then re-executed individually.
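The effect of the dry run can be illustrated with a toy cache model in Python. The sketch assumes a cache large enough that nothing is evicted between the dry run and the real run, and it does not model how a dry run avoids changing the TOD clock; the line addresses and the name `run` are hypothetical.

```python
def run(program, cache):
    """Fetch every line of `program`; return the number of cache misses."""
    misses = 0
    for line in program:
        if line not in cache:
            misses += 1
            cache.add(line)       # a miss loads the line into the cache
    return misses


cache = set()                     # cold cache: programs a and b not yet loaded
program_a = [0x100, 0x104, 0x108]
program_b = [0x200, 0x204]

dry = run(program_a + program_b, cache)    # dry run of both programs at once
real = run(program_a + program_b, cache)   # timed execution afterwards
print(dry, real)
```

All misses are absorbed by the dry run, so the timed execution sees none. Dry-running and re-executing program a and then program b separately gives the same result in this model.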

[0040] Thus, according to the time synchronization method of the present invention, time synchronization is performed using the barrier synchronization mechanism with which the parallel computer system is already equipped. The TOD clocks of the processor elements (PEs) can therefore be synchronized with high accuracy without any special hardware mechanism for time synchronization, such as a dedicated control line or a high-speed access mechanism to global memory. Moreover, even when a new processor element (PE) is dynamically attached to the parallel computer system, time synchronization can be performed by executing time synchronization programs a and b of the present invention, which contain the barrier synchronization instruction.

【００４１】[0041]

[Embodiments] Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1, referred to above, schematically shows one embodiment of the present invention, and FIG. 2 illustrates the dry run of the time synchronization programs of the present invention.

[0042] In the present invention, the following means are required in a multiprocessor system equipped with a barrier synchronization mechanism 4. In a first program for time synchronization, the parent PE(0) 1 reads its own TOD clock (T0) 100 and notifies the other PEs, PE(1) to PE(n) 1, of this value, namely the time (A) to be set in their TOD clocks (T1) to (Tn) 100. Using the barrier synchronization mechanism 4, each of PE(0) to PE(n) 1 waits until the synchronization point indicated by the barrier synchronization instruction. On recognizing that synchronization has been established among all of PE(0) to PE(n) 1, the parent PE(0) 1 reads the value of its own TOD clock (T0) 100 and obtains the value (B) of its difference from the time (A), while the other PEs, PE(1) to PE(n) 1, set the time (A) notified by the parent PE(0) 1 in their own TOD clocks (T1) to (Tn) 100. In the next program for time synchronization, program b, the parent PE(0) 1 notifies all of PE(1) to PE(n) 1 of the difference value (B). Using the barrier synchronization mechanism 4 in the same way as before, each of PE(0) to PE(n) 1 waits until the synchronization point indicated by the barrier synchronization instruction, and on recognizing that synchronization has been established among all of PE(0) to PE(n) 1, each PE other than the parent, PE(1) to PE(n) 1, adds the difference value (B) to the value of its own TOD clock (T1) to (Tn) 100 and sets the clock to the result. The means for performing these operations, together with the means for executing the first time synchronization program and program b after dry-running them, are the means needed to practice the present invention. The same reference numerals denote the same objects throughout the drawings.
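The two-program exchange described above can be sketched as a small simulation. This is only an illustrative model, not the patented implementation: the names `TodClock`, `synchronize`, `wait_a`, and `wait_b` are hypothetical, every clock is assumed to advance at the same rate, and each barrier is assumed to complete at the same instant on every PE.

```python
from dataclasses import dataclass


@dataclass
class TodClock:
    """A free-running clock: reading = true time + fixed offset (assumed model)."""
    offset: float

    def read(self, now: float) -> float:
        return now + self.offset

    def set(self, now: float, value: float) -> None:
        # Choose the offset so that a read at `now` returns `value`.
        self.offset = value - now


def synchronize(clocks, t_start, wait_a, wait_b):
    """Run programs a and b on simulated clocks; clocks[0] is the parent PE(0)."""
    parent, children = clocks[0], clocks[1:]

    # Program a: the parent reads its clock and notifies the others of time A.
    time_a = parent.read(t_start)

    # The first barrier completes wait_a later, at the same instant on every PE.
    t1 = t_start + wait_a
    diff_b = parent.read(t1) - time_a   # parent computes the difference B
    for clock in children:
        clock.set(t1, time_a)           # the others set their clocks to A

    # Program b: the parent notifies B; the second barrier completes wait_b later.
    t2 = t1 + wait_b
    for clock in children:
        clock.set(t2, clock.read(t2) + diff_b)  # the others add B to their clocks

    # Read every clock at the instant the second barrier completes.
    return [clock.read(t2) for clock in clocks]


# Three PEs whose clocks start out with unrelated values.
clocks = [TodClock(1000.0), TodClock(-37.5), TodClock(4.25)]
readings = synchronize(clocks, t_start=10.0, wait_a=0.75, wait_b=1.5)
print(readings)
```

In this model the result does not depend on how long either barrier wait lasts or on the initial offsets; it depends only on the two barrier completions being simultaneous across PEs, which is the property the method relies on.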

[0043] The time synchronization method of the present invention is described below with reference to FIGS. 1 and 2, and with reference to the barrier synchronization mechanism explained in FIGS. 5 and 6. The method presupposes a parallel computer system equipped with a barrier synchronization mechanism. To restate how that mechanism works: a barrier synchronization instruction is inserted at a predetermined position in the program. When each of PE(0) to PE(n) 1 executes the instruction, it performs processing to inform the other PEs 1 that it has reached the synchronization point in the program (this is called the first half of the synchronization processing); specifically, it sets the synchronization point arrival indication latch 142 to "1". The value of the latch 142 is reported to the synchronization processing network (barrier synchronization network) 4, which gathers the latch 142 signals from all of PE(0) to PE(n) 1 and returns them to every PE. When the synchronization detection circuit 141 of each PE detects that the latches 142 of all of PE(0) to PE(n) 1 have become "1", it treats synchronization as complete and notifies the CPU 10 in that PE (this is called the second half of the synchronization processing). (See FIG. 5.)

Time synchronization according to the present invention uses this barrier synchronization mechanism. First, for example when the parallel computer system is powered on, the parent PE(0) 1 is determined, and the time synchronization program a is executed on the parent PE(0) 1 and the other PEs. The parent PE(0) 1 reads its own TOD clock (T0) 100 and sends the time (A) read at that moment to program a on the other PEs, PE(1) to PE(n) 1, via the inter-PE data communication network (switch-coupled network) 3. (See processing steps 300 and 400 in FIG. 1.)

Next, each PE executes the barrier synchronization instruction inserted in the time synchronization program a and waits until the barrier synchronization point (see the operation time chart in FIG. 6(b)). Once barrier synchronization is established among PE(0) to PE(n) 1, the parent PE(0) 1 obtains the difference B between the time (A) and its own TOD clock (T0), and each of PE(1) to PE(n) 1 sets the time (A) notified by the parent PE(0) 1 in its own TOD clock (T1) to (Tn) 100. This setting may be performed in hardware by the synchronization completion signal SYNE from the synchronization detection circuit 141 of each PE, or by a time-setting (SCK) instruction executed immediately after barrier synchronization completes. If the time from the time (A) until this barrier synchronization is established is denoted α, as shown in the figure, then the difference B = α. The inter-PE synchronization processing ends here for the moment. (See processing steps 301 and 401 in FIG. 1.)

Next, the parent PE(0) 1 executes the time synchronization program b and sends the difference B (= α) to the program on the other PEs, PE(1) to PE(n) 1, via the inter-PE data communication network (switch-coupled network) 3. (See processing steps 302 and 402 in FIG. 1.)

Each PE then waits until the barrier synchronization point (see the operation time chart in FIG. 6(b)). Once barrier synchronization is established among PE(0) to PE(n) 1, each of PE(1) to PE(n) 1 adds the difference B (= α) notified by the parent PE(0) 1 to its own TOD clock (T1) to (Tn) 100 and sets the clock again. The waiting time from the first barrier synchronization point to this next synchronization point is denoted β. (See processing steps 303 and 403 in FIG. 1.)

At the moment this barrier synchronization is established, that is, when the synchronization detection circuit 141 of each PE detects that all of PE(0) to PE(n) 1 have reached the synchronization point, the TOD clock (T0) 100 of the parent PE(0) 1 reads time (A) + α + β, as shown in the figure, while the TOD clocks (T1) to (Tn) 100 of the other PEs read time (A) + β. Adding the value B (= α) notified by the parent PE(0) 1 at that moment brings them to the same time as the TOD clock (T0) 100 of the parent PE(0) 1.
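The arithmetic behind this paragraph can be written out as a short worked derivation. The notation is assumed for illustration: $T_0$ and $T_i$ denote readings of the TOD clocks of the parent and of the i-th other PE, and $t_1$ and $t_2$ denote the instants at which the first and second barrier synchronizations complete. All clocks are assumed to advance at the same rate.

```latex
\begin{align*}
B        &= T_0(t_1) - A = \alpha
         && \text{(parent, after the first barrier)} \\
T_i(t_1) &= A
         && \text{(other PEs set the notified time)} \\
T_0(t_2) &= A + \alpha + \beta
         && \text{(parent, at the second barrier)} \\
T_i(t_2) &= A + \beta
         && \text{(other PEs, before correction)} \\
T_i(t_2) + B &= A + \beta + \alpha = T_0(t_2),
         && i = 1, \dots, n.
\end{align*}
```

The correction cancels α exactly because both barrier completions are observed at the same instant on every PE; β drops out because it elapses equally on all clocks.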

[0044] Next, when each of PE(0) to PE(n) 1 of the parallel computer system has a cache memory (CA) 11, the time synchronization programs a and b may not be present in the cache memory (CA) 11, depending on its state.

[0045] In that case the time synchronization processing may be delayed. For this reason, in the present invention, the time synchronization programs a and b are dry-run before they are executed, so that they are certain to be present in the cache memory (CA) 11, and only then are the actual time synchronization programs a and b executed.

[0046] FIG. 2 shows examples of how the dry run may be carried out in this case. FIG. 2(a) shows the case in which each of the two time synchronization programs a and b is dry-run independently and then executed. FIG. 2(b) shows an example in which both time synchronization programs a and b are dry-run together, after which the actual time synchronization programs a and b are executed in succession.

[0047] Since the dry run of the time synchronization programs a and b is processing whose purpose is to ensure that the programs are present in the cache memory (CA) 11, it goes without saying that either of the above dry-run forms may be used before the actual execution of the time synchronization programs a and b.
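The two dry-run orderings can be compared with a toy cache model. This is a sketch under stated assumptions, not a description of real cache hardware: the cache is modeled as a set of resident program names with no eviction, `run_schedule` is a hypothetical helper, and the first ordering reads FIG. 2(a) as "dry-run a, run a, dry-run b, run b".

```python
def run_schedule(schedule):
    """Return the programs whose real (non-dry) run suffered a cache miss.

    `schedule` is a list of (program, is_dry_run) pairs executed in order.
    """
    cache = set()
    real_misses = []
    for program, is_dry_run in schedule:
        miss = program not in cache
        cache.add(program)  # executing a program brings it into the cache
        if miss and not is_dry_run:
            real_misses.append(program)
    return real_misses


# Assumed reading of FIG. 2(a): dry-run each program just before its real run.
separate = [("a", True), ("a", False), ("b", True), ("b", False)]
# FIG. 2(b): dry-run both programs together, then run both for real.
together = [("a", True), ("b", True), ("a", False), ("b", False)]
# No dry run at all, for comparison.
cold = [("a", False), ("b", False)]

print(run_schedule(separate), run_schedule(together), run_schedule(cold))
```

Under this model both orderings leave the real runs free of misses, which matches the statement that either form may be used. A real system would also have to ensure that the dry run itself does not alter the TOD clocks; how that is done is not specified here.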

【００４８】[0048]

【００４９】[0049]

[Effects of the Invention] As described in detail above, according to the time synchronization method of the present invention, time synchronization is performed using the barrier synchronization mechanism that the parallel computer system already provides. The TOD clocks of the processor elements (PEs) can therefore be synchronized with high accuracy without any special hardware for time synchronization, such as dedicated control lines or a high-speed access mechanism to global memory. In addition, even when a new processor element (PE) is dynamically incorporated into the parallel computer system, time synchronization can be achieved by executing the time synchronization programs a and b of the present invention, which contain the barrier synchronization instruction.

[Brief description of the drawings]

FIG. 1 is a diagram schematically showing one embodiment of the present invention.

FIG. 2 is a diagram illustrating the dry run of the time synchronization programs of the present invention.

FIG. 3 is a diagram illustrating a MIMD-type distributed main memory parallel computer system.

FIG. 4 is a diagram illustrating a conventional method of synchronizing TOD clocks.

FIG. 5 is a diagram illustrating barrier synchronization (part 1).

FIG. 6 is a diagram illustrating barrier synchronization (part 2).

[Explanation of symbols]

1: processor elements {PE(0) to PE(n)}
10: central processing unit (CPU)
11: cache memory (CA)
12: main memory (L-ME)
13: mover (MV)
14: barrier processing unit (BA)
100: TOD clocks (T0) to (Tn)
2: global memory (common memory)
3: inter-PE data communication network (switch-coupled network)
4: barrier processing network, barrier synchronization mechanism
200 to 205, 300 to 303, 400 to 404: processing steps
a, b: time synchronization programs
Barrier synchronization instruction, synchronization instruction

Continuation of front page
(56) References: JP-A-63-191217 (JP, A); JP-A-3-288956 (JP, A); JP-A-4-277953 (JP, A)
(58) Fields searched (Int. Cl.7, DB name): G06F 15/16 - 15/177

Claims

(57) [Claims]

1. A time synchronization method in a multiprocessor system having a barrier synchronization mechanism, wherein: in a first program for time synchronization, a parent processor element reads its own clock and notifies all other processor elements of a reference time based on the value read; using the barrier synchronization mechanism, each processor element waits until the synchronization point indicated by a barrier synchronization instruction; on recognizing that synchronization has been established among all the processor elements, the parent processor element calculates the value of the difference between the time of its own clock and the reference time, and all the other processor elements set the reference time in their own clocks; in a second program for the next time synchronization, the parent processor element notifies all the other processor elements of the value of the difference; using the barrier synchronization mechanism in the same manner as above, each processor element waits until the synchronization point indicated by the barrier synchronization instruction; and on recognizing that synchronization has been established among all the processor elements, each processor element other than the parent processor element adds the value of the difference to the value of its own clock and sets the result as its own time.