JP3569341B2

JP3569341B2 - Parallel computer system

Info

Publication number: JP3569341B2
Application number: JP07494195A
Authority: JP
Inventors: 健司吉村; 光一郎原田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1995-03-31
Filing date: 1995-03-31
Publication date: 2004-09-22
Anticipated expiration: 2019-09-22
Also published as: JPH08272729A

Description

【０００１】
【産業上の利用分野】
本発明は，複数のプロセッサエレメントとコントロールプロセッサとがネットワークで結合された計算機システムにおいて，Ｉ／Ｏ装置を持たないプロセッサエレメントからの，Ｉ／Ｏ装置を持つコントロールプロセッサを経由したＩ／Ｏ処理を，高速に効率よく実行できるようにした並列型計算機システムに関するものである。
【０００２】
【従来の技術】
図８は従来技術の説明図である。
図８（Ａ），（Ｃ）に示す並列型計算機システムにおいて，プロセッサエレメントＰＥ〔ＰＥ（０），ＰＥ（１），…〕は，演算を専門に行うプロセッサである。コントロールプロセッサＣＰは，プロセッサエレメントＰＥを管理する装置であり，これには記憶装置８０のようなＩ／Ｏ装置が接続されている。また，コントロールプロセッサＣＰと，各プロセッサエレメントＰＥとは，クロスバネットワークＸＢで接続されている。
【０００３】
各プロセッサエレメントＰＥが記憶装置８０上に格納されているデータを参照する場合，プロセッサエレメントＰＥには記憶装置８０が接続されていないため，コントロールプロセッサＣＰに存在するサーバにＩ／Ｏ要求を出す。Ｉ／Ｏ要求を受けたコントロールプロセッサＣＰのサーバは，記憶装置８０へのＩ／Ｏを実行し，読み込んだデータをクロスバネットワークＸＢを経由して要求元のプロセッサエレメントＰＥに転送する。
【０００４】
【発明が解決しようとする課題】
図８において，記憶装置８０とコントロールプロセッサＣＰとの間のデータ転送速度が８００Ｍｂｙｔｅ／秒，コントロールプロセッサＣＰと各プロセッサエレメントＰＥとの間のデータ転送速度が４００Ｍｂｙｔｅ／秒であるとする。
【０００５】
従来，図８（Ａ）に示すように，プロセッサエレメントＰＥ（０）が，Ｉ／Ｏ（１）の要求と，Ｉ／Ｏ（２）の要求を行い，プロセッサエレメントＰＥ（１）がＩ／Ｏ（３）の要求と，Ｉ／Ｏ（４）の要求をこの順序で行った場合，コントロールプロセッサＣＰは，これらのＩ／Ｏ要求を１つずつ順番に処理していた。そのため，記憶装置８０からコントロールプロセッサＣＰへの１つのＩ／Ｏ処理によるデータ転送に例えば６秒かかるとすると，図８（Ｂ）に示すように，記憶装置８０からコントロールプロセッサＣＰへの全部のデータ転送では，６秒×４＝２４秒かかることになった。
【０００６】
また，図８（Ｃ）に示すように，プロセッサエレメントＰＥのクライアントから１つのｒｅａｄ要求が出された場合に，記憶装置８０からコントロールプロセッサＣＰへのデータ転送に例えば６秒かかると，そのＩ／Ｏ処理が完了してからＩ／Ｏ要求元のプロセッサエレメントＰＥへのＩ／Ｏ処理を開始するので，さらにコントロールプロセッサＣＰからプロセッサエレメントＰＥへのデータ転送に１２秒かかり，１つのＩ／Ｏ要求が完了するまで，図８（Ｄ）に示すように，全部で１８秒の時間がかかっていた。
【０００７】
以上のように，従来技術では，プロセッサエレメントＰＥからのＩ／Ｏ要求の処理に時間がかかるという問題があった。
また，コントロールプロセッサＣＰでは，プロセッサエレメントＰＥを管理するための処理や，運用に関する処理，ジョブに関する処理等，いろいろな処理が動いており，プロセッサエレメントＰＥが例えば１００台〜２００台の規模になった場合，Ｉ／Ｏ要求が１台のコントロールプロセッサＣＰに集中し，コントロールプロセッサＣＰの負荷が非常に高くなることがあるという問題があった。
【０００８】
本発明は上記問題点の解決を図り，Ｉ／Ｏ処理の分散化とＩ／Ｏ分割並列化により，Ｉ／Ｏ処理の高速化を可能にするとともに，コントロールプロセッサの負荷を軽減することを目的とする。
【０００９】
【課題を解決するための手段】
図１は本発明の構成例を示す。
プロセッサエレメント（ＰＥ）１０−０，１０−１，…は，演算を専門に行うプロセッサである。記憶装置３１のようなＩ／Ｏ装置が接続されているコントロールプロセッサが複数台設けられ，その中の１台がマスタコントロールプロセッサ（以下，マスタＣＰという）１２として設定され，その他がスレーブコントロールプロセッサ（以下，スレーブＣＰという）２６として設定される。
【００１０】
マスタＣＰ１２，スレーブＣＰ２６およびプロセッサエレメントＰＥは，クロスバネットワーク（ＸＢ）１１で接続されている。
マスタＣＰ１２には，プロセッサエレメントＰＥの管理，運用に関する処理，ジョブに関する処理などのために，オペレーション用の入力装置１３が接続されている。
【００１１】
Ｉ／Ｏ分割指示入力手段１４は，Ｉ／Ｏ要求を複数のＩ／Ｏ実行単位に分割する分割数を，入力装置１３からのコマンド等により入力する手段である。Ｉ／Ｏ分散指示入力手段１５は，Ｉ／Ｏ要求を複数のコントロールプロセッサに分散するか否か，または各コントロールプロセッサに何％分散させるかを，入力装置１３からのコマンド等により入力する手段である。
【００１２】
Ｉ／Ｏ要求受付け手段１６は，各プロセッサエレメントＰＥからのＩ／Ｏ要求を受け付ける手段である。分散判定処理手段１７は，マスタＣＰ１２およびスレーブＣＰ２６の負荷状況およびＩ／Ｏ分散指示入力手段１５からの分散指示情報に基づいて，Ｉ／Ｏ要求の実行を複数のコントロールプロセッサに分散させるか，または特定のコントロールプロセッサで行うかを決定する手段である。
【００１３】
Ｉ／Ｏ要求転送手段２０は，分散判定処理手段１７の判定結果に従い，Ｉ／Ｏ要求をスレーブＣＰ２６で実行する場合に，Ｉ／Ｏ要求をスレーブＣＰ２６へ転送する手段である。Ｉ／Ｏ処理結果受信手段２１は，スレーブＣＰ２６でＩ／Ｏ要求を実行した場合に，そのＩ／Ｏ処理結果を受け取る手段である。
【００１４】
Ｉ／Ｏ分割処理手段１８は，Ｉ／Ｏ分割指示入力手段１４の指示に従い，プロセッサエレメントＰＥからのＩ／Ｏ要求を，複数のＩ／Ｏ実行単位に分割する手段である。Ｉ／Ｏ並列実行手段１９は，分割されたＩ／Ｏ実行単位について，記憶装置３１とマスタＣＰ１２との間のデータの入出力と，マスタＣＰ１２とＩ／Ｏ要求元のプロセッサエレメントＰＥとの間のデータの転送とを，並列に実行する手段である。
【００１５】
Ｉ／Ｏ処理完了通知手段２２は，マスタＣＰ１２で実行したＩ／Ｏ処理結果もしくはスレーブＣＰ２６で実行したＩ／Ｏ処理結果またはそれらの双方のＩ／Ｏ処理結果をＩ／Ｏ要求元のプロセッサエレメントＰＥへ通知する手段である。
【００１６】
異常検出手段２３は，要求したＩ／Ｏのタイムアウトまたは所定の運転制御等により，スレーブＣＰ２６の異常を検出する手段である。分散抑止手段２４は，異常検出手段２３によってスレーブＣＰ２６に異常が検出された場合に，そのスレーブＣＰ２６にＩ／Ｏ要求の実行を分担させないように分散判定処理手段１７に指示を出す手段である。
【００１７】
リカバリ処理手段２５は，Ｉ／Ｏ要求を分担して実行していたスレーブＣＰ２６の異常を検出した場合に，このスレーブＣＰ２６が処理していたＩ／Ｏ要求を再実行する手段である。
【００１８】
スレーブＣＰ２６におけるＩ／Ｏ要求受信手段２７は，Ｉ／Ｏの分散化のために，マスタＣＰ１２から送られてきたＩ／Ｏ要求を受信する手段である。Ｉ／Ｏ分割処理手段２８は，Ｉ／Ｏ分割指示入力手段１４からの指示に従い，マスタＣＰ１２からのＩ／Ｏ要求を，複数のＩ／Ｏ実行単位に分割する手段である。Ｉ／Ｏ並列実行手段２９は，分割されたＩ／Ｏ実行単位について，記憶装置３１とスレーブＣＰ２６との間のデータの入出力と，スレーブＣＰ２６からプロセッサエレメントＰＥまたはマスタＣＰ１２へのデータ転送とを並列に実行する手段である。Ｉ／Ｏ処理結果転送手段３０は，スレーブＣＰ２６で実行したＩ／Ｏ処理結果を，マスタＣＰ１２またはＩ／Ｏ要求元のプロセッサエレメントＰＥへ転送する手段である。
【００１９】
図１では，スレーブＣＰ２６が１台であるが，スレーブＣＰは２台以上存在してもよい。
【００２０】
【作用】
本発明では，コントロールプロセッサの数を増やし，マスタＣＰ１２以外にスレーブＣＰ２６を設ける。そして，各プロセッサエレメントＰＥからのＩ／Ｏ要求を，複数のコントロールプロセッサで分担して処理を行う。ここでは，これをＩ／Ｏの分散化と呼ぶ。
【００２１】
また，記憶装置３１とコントロールプロセッサ間，コントロールプロセッサとプロセッサエレメントＰＥ間でデータ転送が行われることに着目して，プロセッサエレメントＰＥから発行された１つのＩ／Ｏ要求を，複数に分割する。これによってトータルのＩ／Ｏ処理のスピードアップを図る。ここでは，これをＩ／Ｏの分割並列化と呼ぶ。
【００２２】
以上のようなＩ／Ｏの分散化とＩ／Ｏの分割並列化とによって，コントロールプロセッサの負荷分散およびＩ／Ｏ性能の向上が可能になる。これにより，ジョブのスループット性能，Ｉ／Ｏのピーク性能が向上し，システム全体としてのスピードアップが図れる。
【００２３】
プロセッサエレメントＰＥからのＩ／Ｏ要求を，すべてマスタＣＰ１２が受け取り，それを１台または複数台のスレーブＣＰ２６に分担させるので，マスタＣＰ１２におけるデータの排他制御等を簡単に実現することができる。
【００２４】
また，スレーブＣＰ２６に異常があった場合，Ｉ／Ｏの分散化を自動抑止し，さらにスレーブＣＰ２６がＩ／Ｏを実行中に異常になった場合には，マスタＣＰ１２でＩ／Ｏ処理をリトライすることにより，信頼性の向上も可能になる。
【００２５】
【実施例】
図２は本発明の実施例によるＩ／Ｏの分散化を説明する図である。
図２（Ａ）に示すように，複数台のコントロールプロセッサは，マスタＣＰ１２とスレーブＣＰ２６とに分かれて動作する。プロセッサエレメントＰＥからＩ／Ｏ要求を受けるのは，マスタＣＰ１２だけである。プロセッサエレメントＰＥからＩ／Ｏ要求を受けたマスタＣＰ１２では，所定の負荷アルゴリズムにより，自分でＩ／Ｏ処理を行うか，スレーブＣＰ２６にＩ／Ｏ処理を任せるかを判断する。
【００２６】
あるコントロールプロセッサでＩ／Ｏの分散化を行うかどうかは，オペレータが，図２（Ｃ）に示すような分散化コマンドによって指定することができる。コマンド名は“ｂａｌａｎｃｅ”であり，−ｓは分散化の指定である。ＣＰ識別番号は，どのコントロールプロセッサに対して本コマンドを適用するかを指定するものである。ｏｎ／ｏｆｆは分散化の開始／終了を表す。分散する場合には，例えばマスタＣＰ１２が３０％，スレーブＣＰ２６が７０％というように分散率を指定することができる。分散率を省略した場合，ディフォルト値としてマスタＣＰ１２とスレーブＣＰ２６の分担が１対２となる分散率が与えられる。ディフォルト値でマスタＣＰ１２の負担を少なくしているのは，プロセッサエレメントＰＥからの要求の受け付け，完了の通知等，スレーブＣＰ２６に比べて処理負担が大きいためである。
【００２７】
分散化コマンドが投入されている状態で，図１に示す分散判定処理手段１７が用いる負荷アルゴリズムは，以下のとおりである。
▲１▼ マスタＣＰ１２が全くＩ／Ｏ処理を行っていないならば，マスタＣＰ１２がＩ／Ｏを実行する。スレーブＣＰ２６に処理を依頼するには通信コストがかかるからである。
【００２８】
▲２▼ マスタＣＰ１２がある量のＩ／Ｏ処理を行っている最中で，負荷の低い，Ｉ／Ｏ処理を行っていないスレーブＣＰ２６があれば，このスレーブＣＰ２６にＩ／Ｏ処理を依頼する。
【００２９】
▲３▼ マスタＣＰ１２およびスレーブＣＰ２６が共にＩ／Ｏ処理中であれば，Ｉ／Ｏ量で分担し，マスタＣＰ１２，スレーブＣＰ２６でバランシングする。マスタＣＰ１２およびスレーブＣＰ２６が１台ずつで，１対２の分散率の場合には，マスタＣＰ１２が１／３のＩ／Ｏ量を分担し，スレーブＣＰ２６が２／３のＩ／Ｏ量を分担する。
【００３０】
図２（Ａ）において，例えばプロセッサエレメント１０−０がＩ／Ｏ（１），Ｉ／Ｏ（３）の要求を出し，プロセッサエレメント１０−１がＩ／Ｏ（２），Ｉ／Ｏ（４）の要求を出したとする。前述した負荷アルゴリズムにより，マスタＣＰ１２はＩ／Ｏ（２），Ｉ／Ｏ（４）の要求をスレーブＣＰ２６に転送し，２つのコントロールプロセッサで記憶装置３１へのＩ／Ｏを実行する。
【００３１】
これによって，記憶装置３１から各コントロールプロセッサへのデータ転送は，図２（Ｂ）に示すように行われ，１つのＩ／Ｏ処理に６秒かかったとすると，トータルで１２秒となる。従来技術では，図８（Ｂ）に示すように２４秒かかっていたので，分散化の効果が現れていることが分かる。
【００３２】
図３は本発明の実施例によるＩ／Ｏの分割並列化を説明する図である。
図３（Ａ）に示すように，マスタＣＰ１２がプロセッサエレメント１０−０からのＩ／Ｏ要求（ｒｅａｄ要求）を処理する場合，Ｉ／Ｏの実行単位をいくつかに分割する。例えば，Ｉ／Ｏ（１）とＩ／Ｏ（２）の２つに分割したとすると，まず，Ｉ／Ｏ（１）を実行し，そのＩ／Ｏ（１）によって読み取ったデータをプロセッサエレメント１０−０に転送している間に，Ｉ／Ｏ（２）による残りの読み出しを行う。すなわち，記憶装置３１からマスタＣＰ１２へのデータ入力と，マスタＣＰ１２からプロセッサエレメント１０−０へのデータ転送とを並列に行う。
【００３３】
スレーブＣＰ２６がＩ／Ｏを実行する場合にも，同様に分割して実行することができる。
このＩ／Ｏ要求の分割は，例えばプロセッサエレメント１０−０からのＩ／Ｏ要求パケットが図３（Ｂ）のＰ１のような場合に，分割数が２であるとすると，要求されたデータサイズを１／２にしたパケットＰ２と，アドレスをデータサイズの１／２だけ進めて，データサイズを１／２にしたパケットＰ３とを作成することによって，行うことができる。
【００３４】
例えば，記憶装置３１とマスタＣＰ１２間のデータ転送速度が８００Ｍｂｙｔｅ／秒，マスタＣＰ１２とプロセッサエレメント１０−０間のデータ転送速度が４００Ｍｂｙｔｅ／秒であるとして，Ｉ／Ｏ要求を２つに分割した場合のタイムチャートは，図３（Ｃ）に示すようになる。なお，データ転送以外の時間については，非常に小さいので無視している。Ｉ／Ｏの分割並列化によって，Ｉ／Ｏ完了までのトータルの時間は，１５秒となっている。従来技術では，図８（Ｄ）に示すように１８秒かかっていたので，分割並列化の効果が現れているのが分かる。
【００３５】
この分割並列化を行うか否かは，例えば図３（Ｄ）に示す分割並列化コマンドによって指定することができる。コマンド名は“ｂａｌａｎｃｅ”であり，−ｐは分割並列化の指定である。ＣＰ識別番号は，どのコントロールプロセッサに対して本コマンドを適用するかを指定するものである。ｏｎ／ｏｆｆは分割並列化を行うか否かを表す。分割並列化を行う場合には，分割数を指定することができる。分割数を省略した場合，ディフォルト値は２である。理論的には分割数を増やすと効率がよくなるように思えるが，通信回数などのオーバーヘッドも同時に大きくなるため，そのコストも無視できなくなる。
【００３６】
図４は本発明の実施例による全体のＩ／Ｏ処理の流れを示している。
記憶装置３１は半導体記憶装置であり，転送速度は比較的大きい。プロセッサエレメントＰＥは３台である。プロセッサエレメント１０−０，１０−１，１０−２がそれぞれＩ／Ｏ要求（１），Ｉ／Ｏ要求（２），Ｉ／Ｏ要求（３）を出したとすると，マスタＣＰ１２では，それらの要求をキューイングし，要求順に処理していく。Ｉ／Ｏ要求の実行では，まずデータの排他制御を行い，各コントロールプロセッサの負荷を考慮して，分散判定処理を行う。
【００３７】
この結果，自プロセッサでＩ／Ｏ処理を行う場合には，指定に応じてＩ／Ｏの実行単位を分割して，記憶装置３１のデータにアクセスする。一方，スレーブＣＰ２６でＩ／Ｏ処理を行う場合には，スレーブＣＰ２６に処理を依頼する。その後，マスタＣＰ１２は，スレーブＣＰ２６のＩ／Ｏ完了を待ち合わせる。スレーブＣＰ２６では，マスタＣＰ１２からのＩ／Ｏ要求を受信すると，指定に応じてＩ／Ｏの実行単位を分割して，記憶装置３１のデータにアクセスする。Ｉ／Ｏ処理が完了すると，その処理結果をマスタＣＰ１２に通知する。
【００３８】
マスタＣＰ１２では，自プロセッサによるＩ／Ｏ処理およびスレーブＣＰ２６からのＩ／Ｏ処理完了の報告を受けると，Ｉ／Ｏ要求元のプロセッサエレメントＰＥにＩ／Ｏ処理完了を通知する。
【００３９】
例えば，記憶装置３１からデータをｒｅａｄするＩ／Ｏ要求に対して，そのＩ／ＯをスレーブＣＰ２６だけで行った場合には，スレーブＣＰ２６から直接Ｉ／Ｏ要求元のプロセッサエレメントＰＥへｒｅａｄしたデータを転送する。ただし，スレーブＣＰ２６がダウンした場合等のリカバリ処理のために，マスタＣＰ１２はスレーブＣＰ２６からのＩ／Ｏ処理の完了通知を待っているので，スレーブＣＰ２６は，マスタＣＰ１２にもＩ／Ｏが終了したことを通知する。
【００４０】
また，データをｒｅａｄするＩ／Ｏを，マスタＣＰ１２とスレーブＣＰ２６とが分担して処理する場合には，スレーブＣＰ２６が読み出したデータをマスタＣＰ１２が集め，それをマスタＣＰ１２から自プロセッサが読み出したデータとともに，プロセッサエレメントＰＥへ転送する。
【００４１】
上記処理において，マスタＣＰ１２がスレーブＣＰ２６にＩ／Ｏを依頼した後，スレーブＣＰ２６に異常が発生した場合，またはスレーブＣＰ２６から所定の時間以上の応答がなく，Ｉ／Ｏの依頼がタイムアウトになった場合，マスタＣＰ１２は，スレーブＣＰ２６へ依頼したＩ／Ｏを自プロセッサが実施し，リカバリ処理を行う。
【００４２】
図５はマスタＣＰ１２におけるＩ／Ｏ分散化処理フローチャートである。
まず，ステップ５１では，Ｉ／Ｏ要求のキューを調べ，Ｉ／Ｏ要求があればそれをキューから取り外す。ステップ５２では，データの排他制御を行う。このデータの排他制御については，従来から種々の方式が知られているので，ここでは詳しい説明を省略する。
【００４３】
ステップ５３では，マスタＣＰ１２の分散化フラグがｏｎになっているかどうかを判定する。この分散化フラグは，分散化コマンドまたはその他の環境設定手段によりｏｎ／ｏｆｆされるシステム運用制御のためのフラグである。マスタＣＰ１２の分散化フラグがｏｆｆの場合，ステップ５５へ進む。
【００４４】
マスタＣＰ１２の分散化フラグがｏｎの場合，ステップ５４では，マスタＣＰ１２がＩ／Ｏ処理中であるか否かを判定する。Ｉ／Ｏ処理中でない場合，Ｉ／Ｏ実行装置をマスタＣＰ１２とする。Ｉ／Ｏ処理中の場合，次のステップ５５へ進む。
【００４５】
ステップ５５では，スレーブＣＰ２６が使用可能かどうかを判定する。故障等によりスレーブＣＰ２６が使用できない場合，Ｉ／Ｏ実行装置をマスタＣＰ１２とする。
【００４６】
スレーブＣＰ２６が使用可能である場合，ステップ５６では，スレーブＣＰ２６の分散化フラグがｏｎになっているかどうかを判定する。スレーブＣＰ２６の分散化フラグがｏｆｆであれば，Ｉ／Ｏ実行装置をマスタＣＰ１２とする。なお，通常の運用時には，スレーブＣＰ２６の分散化フラグは常時ｏｎであることが一般的に望ましいと考えられる。
【００４７】
スレーブＣＰ２６の分散化フラグがｏｎであれば，ステップ５７では，スレーブＣＰ２６がＩ／Ｏ処理中であるか否かを判定する。Ｉ／Ｏ処理中でない場合，Ｉ／Ｏ実行装置をスレーブＣＰ２６とする。Ｉ／Ｏ処理中の場合，所定の分散率に従ってマスタＣＰ１２とスレーブＣＰ２６とでＩ／Ｏを分担し実行する。この場合，Ｉ／Ｏ要求の個数ごとに分散させてもよく，また，１つのＩ／Ｏ要求を分割して分散率に応じてマスタＣＰ１２とスレーブＣＰ２６とが処理を分担するようにしてもよい。
【００４８】
図６は，本発明の実施例によるＩ／Ｏ実行装置におけるＩ／Ｏ分割化処理フローチャートである。
マスタＣＰ１２またはスレーブＣＰ２６においてＩ／Ｏを実行する場合，まず図６（Ａ）のステップ６１では，指定されたＩ／Ｏ分割数が２以上かどうかを判定する。Ｉ／Ｏの実行単位を分割しない場合，すなわち分割数が１の場合，ステップ６３へ進む。
【００４９】
分割数が２以上であれば，ステップ６２により，指定された分割数になるようにＩ／Ｏ要求の実行単位を分割する。
ステップ６３では，現在，記憶装置３１に対しＩ／Ｏ発行可能であるかどうかを調べ，発行可能であればステップ６４へ進む。発行可能でなければ，発行可能になるのを待つ。
【００５０】
ステップ６４では，記憶装置３１に対して分割したＩ／Ｏ要求のＩ／Ｏを発行する。ステップ６５の判定により，全Ｉ／Ｏ発行が終了するまで，ステップ６３〜６５を繰り返す。
【００５１】
以上のＩ／Ｏ発行に対するＩ／Ｏ割込みがあった場合，またはマスタＣＰ１２がスレーブＣＰ２６からＩ／Ｏの処理結果を受信した場合には，図６（Ｂ）に示す転送処理を行う。
【００５２】
まず，ステップ７１では，自コントロールプロセッサ（ＣＰ）がマスタＣＰ１２かスレーブＣＰ２６かを判定し，スレーブＣＰ２６であれば，ステップ７４へ進む。
【００５３】
自ＣＰがマスタＣＰ１２であれば，ステップ７２により，Ｉ／Ｏ要求元のプロセッサエレメントＰＥへＩ／Ｏの処理結果を転送可能であるかどうかを判定し，転送可能であれば，ステップ７３により，プロセッサエレメントＰＥへＩ／Ｏ処理結果を転送する。プロセッサエレメントＰＥへ転送可能でなければ，一旦，割込み元へ復帰し，次の転送契機を待つ。
【００５４】
自ＣＰがスレーブＣＰ２６の場合，ステップ７４により，マスタＣＰ１２へＩ／Ｏ処理結果を転送可能であるかどうかを判定し，転送可能であれば，ステップ７５により，マスタＣＰ１２へＩ／Ｏ処理結果を転送する。マスタＣＰ１２へ転送可能でなければ，一旦，割込み元へ復帰し，次の転送契機を待つ。
【００５５】
図７は，本発明の一実施例においてＩ／Ｏ分散化と分割並列化を併用したときのタイムチャートを示す。
マスタＣＰ１２が１台とスレーブＣＰ２６が１台あり，指定されたＩ／Ｏ分割数が２であるとする。また，分散率はマスタ：スレーブ＝１：２であるとする。プロセッサエレメントＰＥからのＩ／Ｏ要求のｒｅａｄデータは，図７（Ａ）に示すように，マスタＣＰ１２およびスレーブＣＰ２６によって実行される。すなわち，マスタＣＰ１２では，データＡ，Ｂのｒｅａｄが実行され，スレーブＣＰ２６では，データＣ，Ｄのｒｅａｄが実行される。
【００５６】
時間的な流れは，図７（Ｂ）に示すようになる。まず，マスタＣＰ１２から記憶装置３１へのデータＡに関するＩ／Ｏ実行により，記憶装置３１からマスタＣＰ１２へ時間Ａ１だけデータが転送される。Ａ１の転送が終了すると，マスタＣＰ１２はデータＢに関するＩ／Ｏを実行するとともに，Ａ１で入力したデータＡをＩ／Ｏ要求元のプロセッサエレメントＰＥへ，時間Ａ２をかけて転送する。この間，記憶装置３１からマスタＣＰ１２へのデータ転送が時間Ｂ１の間，並列に実行されることになる。時間Ａ２の後，マスタＣＰ１２からプロセッサエレメントＰＥへ時間Ｂ２分のデータ転送が行われる。
【００５７】
スレーブＣＰ２６においても，データＣ，Ｄのそれぞれについて，図７（Ｂ）に示すようにデータ転送が行われる。まず，記憶装置３１からスレーブＣＰ２６へのデータ転送が時間Ｃ１，Ｄ１のように行われ，スレーブＣＰ２６からマスタＣＰ１２へのデータ転送が時間Ｃ２，Ｄ２のように行われ，マスタＣＰ１２からプロセッサエレメントＰＥへ時間Ｃ３，Ｄ３のように行われる。
【００５８】
本実施例において，スレーブＣＰ２６のＩ／Ｏ処理に異常があった場合，マスタＣＰ１２がそのＩ／Ｏ処理を引き継いでリカバリ処理を行う。マスタＣＰ１２に異常があった場合には，他のスレーブＣＰの１台がマスタＣＰになって，同様に処理を引き継ぐことができる。
【００５９】
【発明の効果】
以上説明したように，本発明によれば，Ｉ／Ｏの分散化，Ｉ／Ｏの分割並列化により，コントロールプロセッサの負荷が分散され，またプロセッサエレメントが要求したＩ／Ｏの実行完了までの時間が短縮される。したがって，ジョブのスループット性能，Ｉ／Ｏのピーク性能が向上し，システム全体としてのスピードアップが可能になる。また，スレーブＣＰのＩ／Ｏ処理に何らかの異常があった場合，マスタＣＰがリカバリするので，信頼性の維持も可能である。
【図面の簡単な説明】
【図１】本発明の構成例を示す図である。
【図２】本発明の実施例によるＩ／Ｏの分散化を説明する図である。
【図３】本発明の実施例によるＩ／Ｏの分割並列化を説明する図である。
【図４】本発明の実施例による全体のＩ／Ｏ処理の流れを示す図である。
【図５】マスタＣＰにおけるＩ／Ｏ分散化処理フローチャートである。
【図６】本発明の実施例によるＩ／Ｏ分割化処理フローチャートである。
【図７】本発明の一実施例においてＩ／Ｏ分散化と分割並列化を併用したときのタイムチャートである。
【図８】従来技術の説明図である。
【符号の説明】
１０−０，１０−１，… プロセッサエレメント
１１クロスバネットワーク
１２マスタＣＰ
１３入力装置
１４Ｉ／Ｏ分割指示入力手段
１５Ｉ／Ｏ分散指示入力手段
１６Ｉ／Ｏ要求受付け手段
１７分散判定処理手段
１８Ｉ／Ｏ分割処理手段
１９Ｉ／Ｏ並列実行手段
２０Ｉ／Ｏ要求転送手段
２１Ｉ／Ｏ処理結果受信手段
２２Ｉ／Ｏ処理完了通知手段
２３異常検出手段
２４分散抑止手段
２５リカバリ処理手段
２６スレーブＣＰ
２７Ｉ／Ｏ要求受信手段
２８Ｉ／Ｏ分割処理手段
２９Ｉ／Ｏ並列実行手段
３０Ｉ／Ｏ処理結果転送手段
３１記憶装置

[0001]
[Industrial applications]
The present invention relates to a parallel computer system in which a plurality of processor elements and control processors are connected by a network, and in which I/O processing requested by a processor element that has no I/O device, and carried out via a control processor that has an I/O device, can be executed quickly and efficiently.
[0002]
[Prior art]
FIG. 8 is an explanatory diagram of the prior art.
In the parallel computer system shown in FIGS. 8(A) and 8(C), the processor elements PE [PE(0), PE(1), ...] are processors dedicated to computation. The control processor CP is a device that manages the processor elements PE, and an I/O device such as a storage device 80 is connected to it. The control processor CP and each processor element PE are connected by a crossbar network XB.
[0003]
When a processor element PE needs to refer to data stored in the storage device 80, it issues an I/O request to a server running on the control processor CP, because the storage device 80 is not connected to the processor element PE. On receiving the I/O request, the server on the control processor CP executes the I/O on the storage device 80 and transfers the data it reads to the requesting processor element PE via the crossbar network XB.
[0004]
[Problems to be solved by the invention]
In FIG. 8, assume that the data transfer speed between the storage device 80 and the control processor CP is 800 Mbyte/sec, and that the data transfer speed between the control processor CP and each processor element PE is 400 Mbyte/sec.
[0005]
Conventionally, as shown in FIG. 8(A), when processor element PE(0) issued requests I/O(1) and I/O(2) and processor element PE(1) issued requests I/O(3) and I/O(4), in that order, the control processor CP processed these I/O requests one at a time in sequence. Consequently, if the data transfer from the storage device 80 to the control processor CP for one I/O operation takes, for example, 6 seconds, then all the data transfers from the storage device 80 to the control processor CP take 6 seconds × 4 = 24 seconds, as shown in FIG. 8(B).
[0006]
Further, as shown in FIG. 8(C), when a client on a processor element PE issues a single read request and the data transfer from the storage device 80 to the control processor CP takes, for example, 6 seconds, the transfer to the requesting processor element PE begins only after that I/O has completed. The data transfer from the control processor CP to the processor element PE then takes a further 12 seconds, so a single I/O request takes 18 seconds in total to complete, as shown in FIG. 8(D).
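The two prior-art figures above (24 seconds and 18 seconds) follow from simple arithmetic. The sketch below reproduces them from the transfer speeds and the 6-second read time given in the text; the function and constant names are illustrative and do not appear in the patent.

```python
# Transfer speeds from the prior-art example; all names here are illustrative.
STORAGE_TO_CP_MB_S = 800.0   # storage device 80 -> control processor CP
CP_TO_PE_MB_S = 400.0        # control processor CP -> processor element PE


def serial_storage_time(num_requests: int, seconds_per_request: float) -> float:
    """Storage-to-CP time when a single CP handles the requests one after another."""
    return num_requests * seconds_per_request


def store_and_forward_time(read_seconds: float) -> float:
    """Time for one read when the CP-to-PE transfer starts only after the read ends."""
    data_mb = read_seconds * STORAGE_TO_CP_MB_S      # amount of data that was read
    forward_seconds = data_mb / CP_TO_PE_MB_S        # time to push it to the PE
    return read_seconds + forward_seconds


print(serial_storage_time(4, 6.0))    # 24.0
print(store_and_forward_time(6.0))    # 18.0
```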
[0007]
As described above, the prior art has the problem that processing an I/O request from a processor element PE takes a long time.
In addition, many kinds of processing run on the control processor CP, such as processing for managing the processor elements PE, operations-related processing, and job-related processing. When the system grows to, for example, 100 to 200 processor elements PE, I/O requests concentrate on the single control processor CP, and its load can become extremely high.
[0008]
An object of the present invention is to solve the above problems: to speed up I/O processing and to reduce the load on the control processor by distributing I/O processing and by dividing I/O requests and processing the parts in parallel.
[0009]
[Means for Solving the Problems]
FIG. 1 shows a configuration example of the present invention.
Processor elements (PE) 10-0, 10-1, ... are processors dedicated to computation. A plurality of control processors, each connected to an I/O device such as a storage device 31, are provided. One of them is set as a master control processor (hereinafter, master CP) 12, and the others are set as slave control processors (hereinafter, slave CPs) 26.
[0010]
The master CP 12, the slave CP 26, and the processor element PE are connected by a crossbar network (XB) 11.
An input device 13 for operator use is connected to the master CP 12 for management of the processor elements PE, operations-related processing, job-related processing, and the like.
[0011]
The I/O division instruction input means 14 is a means for inputting, by a command from the input device 13 or the like, the number of I/O execution units into which an I/O request is divided. The I/O distribution instruction input means 15 is a means for inputting, by a command from the input device 13 or the like, whether I/O requests are to be distributed among a plurality of control processors, or what percentage is to be distributed to each control processor.
[0012]
The I/O request receiving means 16 is a means for receiving I/O requests from each processor element PE. The distribution determination processing means 17 is a means for deciding, based on the load status of the master CP 12 and the slave CP 26 and on the distribution instruction information from the I/O distribution instruction input means 15, whether execution of an I/O request is to be distributed among a plurality of control processors or performed by a specific control processor.
[0013]
The I/O request transfer means 20 is a means for transferring an I/O request to the slave CP 26 when, according to the determination result of the distribution determination processing means 17, the I/O request is to be executed by the slave CP 26. The I/O processing result receiving means 21 is a means for receiving the I/O processing result when the slave CP 26 has executed an I/O request.
[0014]
The I/O division processing means 18 is a means for dividing an I/O request from a processor element PE into a plurality of I/O execution units according to the instruction from the I/O division instruction input means 14. The I/O parallel execution means 19 is a means for executing in parallel, for the divided I/O execution units, data input/output between the storage device 31 and the master CP 12 and data transfer between the master CP 12 and the requesting processor element PE.
[0015]
The I/O processing completion notifying means 22 is a means for notifying the requesting processor element PE of the result of the I/O processing executed by the master CP 12, the result of the I/O processing executed by the slave CP 26, or both.
[0016]
The abnormality detecting means 23 is a means for detecting an abnormality of the slave CP 26 through a timeout of a requested I/O, predetermined operation control, or the like. The distribution suppressing means 24 is a means that, when the abnormality detecting means 23 detects an abnormality in the slave CP 26, instructs the distribution determination processing means 17 not to assign execution of I/O requests to that slave CP 26.
[0017]
The recovery processing means 25 is a means that, when an abnormality is detected in a slave CP 26 that was executing its share of the I/O requests, re-executes the I/O request that the slave CP 26 was processing.
[0018]
The I/O request receiving means 27 in the slave CP 26 is a means for receiving I/O requests sent from the master CP 12 for I/O distribution. The I/O division processing means 28 is a means for dividing an I/O request from the master CP 12 into a plurality of I/O execution units according to the instruction from the I/O division instruction input means 14. The I/O parallel execution means 29 is a means for executing in parallel, for the divided I/O execution units, data input/output between the storage device 31 and the slave CP 26 and data transfer from the slave CP 26 to the processor element PE or the master CP 12. The I/O processing result transfer means 30 is a means for transferring the result of the I/O processing executed by the slave CP 26 to the master CP 12 or to the requesting processor element PE.
[0019]
In FIG. 1, there is one slave CP 26, but two or more slave CPs may exist.
[0020]
[Operation]
In the present invention, the number of control processors is increased, and a slave CP 26 is provided in addition to the master CP 12. I/O requests from the processor elements PE are then shared among the plurality of control processors for processing. Here, this is called I/O distribution.
[0021]
Focusing on the fact that data is transferred both between the storage device 31 and a control processor and between the control processor and a processor element PE, a single I/O request issued by a processor element PE is divided into a plurality of parts. This speeds up the I/O processing as a whole. Here, this is called split parallelization of I/O.
[0022]
The I/O distribution and split parallelization of I/O described above make it possible to distribute the load on the control processors and to improve I/O performance. As a result, job throughput and peak I/O performance improve, and the system as a whole runs faster.
[0023]
Because the master CP 12 receives all I/O requests from the processor elements PE and assigns them to one or more slave CPs 26, exclusive control of data and the like can be implemented simply in the master CP 12.
[0024]
In addition, when an abnormality occurs in the slave CP 26, I/O distribution is automatically suppressed, and when the slave CP 26 becomes abnormal while executing I/O, the master CP 12 retries the I/O processing. Reliability can thereby also be improved.
[0025]
[Embodiments]
FIG. 2 is a diagram illustrating I/O distribution according to an embodiment of the present invention.
As shown in FIG. 2(A), the plurality of control processors operate in two roles, as a master CP 12 and a slave CP 26. Only the master CP 12 receives I/O requests from the processor elements PE. On receiving an I/O request from a processor element PE, the master CP 12 decides, according to a predetermined load algorithm, whether to perform the I/O processing itself or to hand it to the slave CP 26.
[0026]
The operator can specify whether a given control processor takes part in I/O distribution by means of a distribution command such as that shown in FIG. 2(C). The command name is "balance", and -s specifies distribution. The CP identification number specifies the control processor to which the command applies. on/off indicates the start/end of distribution. When distributing, a distribution ratio can be specified, for example 30% for the master CP 12 and 70% for the slave CP 26. If the distribution ratio is omitted, a default ratio is applied under which the master CP 12 and the slave CP 26 share the work 1:2. The default gives the master CP 12 the smaller share because its processing burden is greater than that of the slave CP 26: it also accepts requests from the processor elements PE, sends completion notifications, and so on.
[0027]
When the distribution command has been entered, the load algorithm used by the distribution determination processing means 17 shown in FIG. 1 is as follows.
(1) If the master CP 12 is not performing any I/O processing, the master CP 12 executes the I/O. This is because requesting the slave CP 26 to do the processing incurs a communication cost.
[0028]
(2) If the master CP 12 is in the middle of performing some amount of I/O processing and there is a lightly loaded slave CP 26 that is not performing I/O processing, the I/O processing is requested of that slave CP 26.
[0029]
(3) If both the master CP 12 and the slave CP 26 are performing I/O processing, the work is shared by I/O amount and balanced between the master CP 12 and the slave CP 26. With one master CP 12, one slave CP 26, and a distribution ratio of 1:2, the master CP 12 takes 1/3 of the I/O amount and the slave CP 26 takes 2/3.
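Rule (3) shares the I/O amount in proportion to the distribution ratio. The following is a minimal sketch of one way to compute such shares; the function name `apportion`, the byte-based unit, and the choice to give the rounding remainder to the last control processor are assumptions made for illustration and are not stated in the patent.

```python
from typing import List, Sequence


def apportion(total_bytes: int, ratio: Sequence[int] = (1, 2)) -> List[int]:
    """Split total_bytes among control processors in proportion to ratio.

    ratio lists the master's weight first, e.g. (1, 2) or (30, 70).
    The last entry absorbs the rounding remainder so the shares sum to total_bytes.
    """
    if total_bytes < 0 or len(ratio) == 0 or min(ratio) < 0 or sum(ratio) == 0:
        raise ValueError("need a non-negative total and a non-empty, non-zero ratio")
    weight_sum = sum(ratio)
    shares = [total_bytes * weight // weight_sum for weight in ratio]
    shares[-1] += total_bytes - sum(shares)   # hand the remainder to the last CP
    return shares


print(apportion(3000))            # [1000, 2000]  default 1:2 ratio
print(apportion(100, (30, 70)))   # [30, 70]      operator-specified 30% / 70%
```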
[0030]
In FIG. 2(A), suppose for example that processor element 10-0 issues requests I/O(1) and I/O(3) and processor element 10-1 issues requests I/O(2) and I/O(4). According to the load algorithm described above, the master CP 12 transfers the I/O(2) and I/O(4) requests to the slave CP 26, and the two control processors execute the I/O on the storage device 31.
[0031]
As a result, data is transferred from the storage device 31 to each control processor as shown in FIG. 2(B), and if one I/O operation takes 6 seconds, the total is 12 seconds. Since the prior art took 24 seconds as shown in FIG. 8(B), the benefit of distribution is evident.
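The 12-second figure can be checked with a small model in which each control processor works through its own requests serially while the control processors run at the same time. This sketch assumes, as the time chart implies, that the storage device can serve both control processors concurrently at full speed; the name `makespan` is illustrative.

```python
from typing import Dict, List


def makespan(assignments: Dict[str, List[float]]) -> float:
    """Elapsed time when every CP processes its own list of requests serially
    and all CPs run concurrently: the slowest CP determines the total."""
    return max(sum(times) for times in assignments.values())


one_cp = {"CP": [6.0, 6.0, 6.0, 6.0]}                    # prior art: one CP does all four
two_cps = {"master": [6.0, 6.0], "slave": [6.0, 6.0]}    # I/O(1),(3) and I/O(2),(4)

print(makespan(one_cp))    # 24.0
print(makespan(two_cps))   # 12.0
```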
[0032]
FIG. 3 is a diagram illustrating split parallelization of I/O according to an embodiment of the present invention.
As shown in FIG. 3(A), when the master CP 12 processes an I/O request (read request) from processor element 10-0, it divides the I/O into several execution units. For example, if the request is divided into two units, I/O(1) and I/O(2), I/O(1) is executed first, and while the data read by I/O(1) is being transferred to processor element 10-0, the remaining data is read by I/O(2). In other words, data input from the storage device 31 to the master CP 12 and data transfer from the master CP 12 to processor element 10-0 are performed in parallel.
[0033]
When the slave CP 26 executes I/O, the request can be divided and executed in the same way.
For example, when the I/O request packet from processor element 10-0 is like P1 in FIG. 3(B) and the number of divisions is 2, the I/O request can be divided by creating a packet P2 in which the requested data size is halved, and a packet P3 in which the address is advanced by half the data size and the data size is halved.
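The packet division described above generalizes naturally to any number of divisions. The sketch below models a request as an address and a size only; the actual packet layout of FIG. 3(B) is not reproduced in the text, so the `IoRequest` fields and the handling of sizes that do not divide evenly are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IoRequest:
    address: int   # start address of the data in the storage device
    size: int      # number of bytes requested


def split_request(req: IoRequest, parts: int = 2) -> List[IoRequest]:
    """Divide one request into `parts` consecutive requests covering the same range.

    With parts=2 this yields P2 (same address, half the size) and
    P3 (address advanced by half the size, half the size).
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(req.size, parts)
    pieces = []
    address = req.address
    for index in range(parts):
        size = base + (1 if index < extra else 0)   # spread any remainder over the first pieces
        pieces.append(IoRequest(address, size))
        address += size
    return pieces


p1 = IoRequest(address=0x1000, size=4096)
print(split_request(p1))
# [IoRequest(address=4096, size=2048), IoRequest(address=6144, size=2048)]
```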
[0034]
For example, assuming that the data transfer speed between the storage device 31 and the master CP 12 is 800 Mbyte/sec and that between the master CP 12 and processor element 10-0 is 400 Mbyte/sec, the time chart when the I/O request is divided into two is as shown in FIG. 3(C). Time spent on anything other than data transfer is very small and is ignored. With split parallelization of I/O, the total time until the I/O completes is 15 seconds. Since the prior art took 18 seconds as shown in FIG. 8(D), the benefit of split parallelization is evident.
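The 15-second result can be reproduced with a two-stage pipeline model: the read of each part must finish before that part is forwarded, and each link carries one part at a time. This is a sketch under the same simplification as the text (all time other than data transfer is ignored); the function name is illustrative.

```python
def pipelined_time(read_seconds: float, forward_seconds: float, parts: int) -> float:
    """Completion time when a request is cut into equal parts and the read of the
    next part overlaps the forwarding of the previous one.

    read_seconds:    storage -> CP time for the whole request
    forward_seconds: CP -> PE time for the whole request
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    read_part = read_seconds / parts
    forward_part = forward_seconds / parts
    read_done = 0.0      # time at which the current part has been read
    forward_done = 0.0   # time at which the CP-to-PE link becomes free
    for _ in range(parts):
        read_done += read_part
        # Forwarding starts once the part is read and the link is free.
        forward_done = max(forward_done, read_done) + forward_part
    return forward_done


print(pipelined_time(6.0, 12.0, parts=1))   # 18.0  (no division, as in FIG. 8(D))
print(pipelined_time(6.0, 12.0, parts=2))   # 15.0  (two parts, as in FIG. 3(C))
```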
[0035]
Whether to perform split parallelization can be specified by, for example, the split parallelization command shown in FIG. 3(D). The command name is "balance", and -p specifies split parallelization. The CP identification number specifies the control processor to which the command applies. on/off indicates whether split parallelization is performed. When it is performed, the number of divisions can be specified. If the number of divisions is omitted, the default value is 2. In theory, increasing the number of divisions might seem to improve efficiency, but overhead such as the number of communications grows at the same time, and its cost can no longer be ignored.
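The trade-off in the last sentence can be made concrete with a simple cost model. The model below is an assumption introduced for illustration, not something the patent states: it charges a fixed overhead c for every part and assumes the CP-to-PE link is the slower stage, as in the example.

```latex
% R: total storage-to-CP read time, X: total CP-to-PE transfer time (X >= R),
% n: number of divisions, c: assumed fixed overhead per part (hypothetical).
\begin{align}
T(n) &= \frac{R}{n} + X + n\,c, \\
\frac{dT}{dn} &= -\frac{R}{n^{2}} + c = 0
  \quad\Longrightarrow\quad n^{*} = \sqrt{R/c}.
\end{align}
% Check against the example with c = 0: T(1) = 6 + 12 = 18 s, T(2) = 3 + 12 = 15 s.
```

Under this model the gain from each further division shrinks as R/n falls while the overhead term n c grows linearly, so beyond n* additional divisions make the request slower.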
[0036]
FIG. 4 shows the overall flow of I/O processing according to an embodiment of the present invention.
The storage device 31 is a semiconductor storage device with a relatively high transfer speed. There are three processor elements PE. Suppose processor elements 10-0, 10-1, and 10-2 issue I/O request (1), I/O request (2), and I/O request (3), respectively. The master CP 12 queues these requests and processes them in the order requested. In executing an I/O request, it first performs exclusive control of the data and then performs distribution determination processing, taking the load on each control processor into account.
[0037]
As a result, when the master CP 12 performs the I/O processing itself, it divides the I/O into execution units as specified and accesses the data in the storage device 31. When the slave CP 26 is to perform the I/O processing, the master CP 12 requests the slave CP 26 to do so and then waits for the slave CP 26 to complete the I/O. On receiving the I/O request from the master CP 12, the slave CP 26 divides the I/O into execution units as specified and accesses the data in the storage device 31. When the I/O processing is complete, it notifies the master CP 12 of the result.
[0038]
When its own I/O processing has finished and it has received the I/O completion report from the slave CP 26, the master CP 12 notifies the requesting processor element PE that the I/O processing is complete.
[0039]
For example, when an I/O request to read data from the storage device 31 is carried out by the slave CP 26 alone, the slave CP 26 transfers the data it has read directly to the requesting processor element PE. However, because the master CP 12 waits for the I/O completion notification from the slave CP 26 so that it can perform recovery processing if, for example, the slave CP 26 goes down, the slave CP 26 also notifies the master CP 12 that the I/O has finished.
[0040]
When the master CP 12 and the slave CP 26 share the processing of an I/O that reads data, the master CP 12 collects the data read by the slave CP 26 and transfers it, together with the data the master CP 12 read itself, to the processor element PE.
[0041]
In the above processing, if an abnormality occurs in the slave CP 26 after the master CP 12 has requested I/O from it, or if the slave CP 26 does not respond within a predetermined time and the I/O request times out, the master CP 12 itself executes the I/O that it had requested of the slave CP 26, thereby performing recovery processing.
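The recovery behaviour can be sketched as a fallback around the delegated call. The callables `ask_slave` and `read_locally`, the `SlaveFailure` exception, and the timeout value are all hypothetical stand-ins for the crossbar messaging and storage access, which the text does not specify.

```python
from typing import Callable, Tuple


class SlaveFailure(Exception):
    """Raised when the slave CP reports an abnormality."""


def read_with_recovery(
    request: str,
    ask_slave: Callable[[str, float], bytes],
    read_locally: Callable[[str], bytes],
    timeout_seconds: float = 5.0,
) -> Tuple[bytes, str]:
    """Delegate the request to the slave CP; if it fails or times out,
    the master CP re-executes the same request itself.

    ask_slave is expected to raise TimeoutError when no reply arrives within
    timeout_seconds, or SlaveFailure when the slave reports an abnormality.
    Returns the data together with the name of the CP that produced it.
    """
    try:
        return ask_slave(request, timeout_seconds), "slave"
    except (TimeoutError, SlaveFailure):
        return read_locally(request), "master"


def silent_slave(request: str, timeout: float) -> bytes:
    raise TimeoutError("no reply from slave CP")


print(read_with_recovery("read A", silent_slave, lambda req: b"data"))
# (b'data', 'master')
```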
[0042]
FIG. 5 is a flowchart of the I/O distribution processing in the master CP 12.
First, in step 51, the I/O request queue is checked, and if there is an I/O request, it is removed from the queue. In step 52, exclusive control of the data is performed. Various methods of exclusive data control are already known, so a detailed description is omitted here.
[0043]
In step 53, it is determined whether the distribution flag of the master CP 12 is on. The distribution flag is a flag for system operation control that is turned on or off by the distribution command or by other environment setting means. If the distribution flag of the master CP 12 is off, the process proceeds to step 55.
[0044]
If the distribution flag of the master CP 12 is on, step 54 determines whether the master CP 12 is performing I/O processing. If it is not, the master CP 12 is selected as the I/O execution device. If it is, the process proceeds to step 55.
[0045]
In step 55, it is determined whether the slave CP 26 is usable. If the slave CP 26 cannot be used because of a failure or the like, the master CP 12 is selected as the I/O execution device.
[0046]
If the slave CP 26 is usable, step 56 determines whether the distribution flag of the slave CP 26 is on. If it is off, the master CP 12 is selected as the I/O execution device. In normal operation, it is generally considered desirable for the distribution flag of the slave CP 26 to be on at all times.
[0047]
If the distribution flag of the slave CP 26 is on, step 57 determines whether the slave CP 26 is performing I/O processing. If it is not, the slave CP 26 is selected as the I/O execution device. If it is, the master CP 12 and the slave CP 26 share and execute the I/O according to the predetermined distribution ratio. In this case, the I/O may be distributed by number of I/O requests, or a single I/O request may be divided so that the master CP 12 and the slave CP 26 share its processing according to the distribution ratio.
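Steps 53 to 57 amount to a short decision procedure. The sketch below follows the branches as described for FIG. 5, for the case of one master and one slave; the type and field names (`CpState`, `Executor`, and so on) are illustrative and not taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class Executor(Enum):
    MASTER = "master"
    SLAVE = "slave"
    SHARED = "shared"   # master and slave share the I/O by the distribution ratio


@dataclass
class CpState:
    flag_on: bool            # per-CP flag set by the "balance -s" command
    busy: bool               # currently performing I/O processing
    available: bool = True   # False after a failure has been detected


def choose_executor(master: CpState, slave: CpState) -> Executor:
    """Select the I/O execution device for one dequeued request."""
    # Steps 53-54: with its flag on, an idle master executes the I/O itself.
    # With the flag off, the flow goes straight to step 55.
    if master.flag_on and not master.busy:
        return Executor.MASTER
    # Step 55: an unusable slave leaves only the master.
    if not slave.available:
        return Executor.MASTER
    # Step 56: a slave whose flag is off takes no part.
    if not slave.flag_on:
        return Executor.MASTER
    # Step 57: an idle slave takes the whole request; otherwise the work is shared.
    if not slave.busy:
        return Executor.SLAVE
    return Executor.SHARED


print(choose_executor(CpState(True, busy=True), CpState(True, busy=False)))
# Executor.SLAVE
```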
[0048]
FIG. 6 is a flowchart of the I/O division processing in the I/O execution device according to an embodiment of the present invention.
When the master CP 12 or the slave CP 26 executes I/O, step 61 of FIG. 6(A) first determines whether the specified number of I/O divisions is two or more. If the I/O is not to be divided into execution units, that is, if the number of divisions is 1, the process proceeds to step 63.
[0049]
If the number of divisions is two or more, at step 62, the execution unit of the I / O request is divided so as to have the designated number of divisions.
In step 63, it is checked whether or not I / O can currently be issued to the storage device 31. If it can, the process proceeds to step 64. If it cannot, the process waits until issuance becomes possible.
[0050]
In step 64, I / O for one unit of the divided I / O request is issued to the storage device 31. In step 65, it is determined whether all I / O has been issued, and steps 63 to 65 are repeated until issuance is complete.
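The division and issuance loop of steps 61 to 65 might look like the sketch below. The specification does not say how a request is cut into execution units, so the equal-size split by byte range is an assumption, and `split_request`, `issue_all`, and the callback parameters are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple

Unit = Tuple[int, int]  # (offset, length) of one I/O execution unit


def split_request(offset: int, length: int, divisions: int) -> List[Unit]:
    """Steps 61-62: split one request into `divisions` near-equal units.

    Assumption: units are contiguous byte ranges of near-equal size.
    """
    if divisions < 2:                 # step 61: a division number of 1 means no split
        return [(offset, length)]
    base, extra = divmod(length, divisions)
    units: List[Unit] = []
    pos = offset
    for i in range(divisions):
        size = base + (1 if i < extra else 0)
        if size:
            units.append((pos, size))
        pos += size
    return units


def issue_all(units: List[Unit],
              can_issue: Callable[[], bool],
              issue: Callable[[Unit], None],
              wait: Callable[[], None]) -> None:
    """Steps 63-65: issue every unit, waiting whenever issuance is not possible."""
    for unit in units:                # step 65: loop until all units are issued
        while not can_issue():        # step 63: can I/O be issued right now?
            wait()
        issue(unit)                   # step 64: issue I/O to the storage device
```

Passing `can_issue`, `issue`, and `wait` as callbacks keeps the sketch independent of any particular storage interface.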
[0051]
When there is an I / O interrupt for the above I / O issuance, or when the master CP 12 receives an I / O processing result from the slave CP 26, the transfer processing shown in FIG. 6B is performed.
[0052]
First, in step 71, it is determined whether the own control processor (CP) is the master CP 12 or the slave CP 26.
[0053]
If the own CP is the master CP 12, it is determined in step 72 whether or not the I / O processing result can be transferred to the processor element PE that is the I / O request source. If transfer is possible, the I / O processing result is transferred to the processor element PE in step 73. If transfer to the processor element PE is not possible, the process temporarily returns to the interrupt source and waits for the next transfer trigger.
[0054]
If the own CP is the slave CP 26, it is determined in step 74 whether the I / O processing result can be transferred to the master CP 12. If transfer is possible, the I / O processing result is transferred to the master CP 12 in step 75. If transfer to the master CP 12 is not possible, the process temporarily returns to the interrupt source and waits for the next transfer trigger.
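The transfer processing of FIG. 6B can be sketched as a single handler invoked on each transfer trigger. The names `Role` and `handle_transfer_trigger` and the callback parameters are hypothetical; a return value of `False` stands for returning to the interrupt source and waiting for the next trigger.

```python
from enum import Enum
from typing import Any, Callable


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"


def handle_transfer_trigger(role: Role,
                            result: Any,
                            pe_ready: Callable[[], bool],
                            master_ready: Callable[[], bool],
                            send_to_pe: Callable[[Any], None],
                            send_to_master: Callable[[Any], None]) -> bool:
    """Forward one I/O processing result; return False if it must wait."""
    if role is Role.MASTER:           # step 71: which kind of CP is this?
        if pe_ready():                # step 72: can the requesting PE receive it?
            send_to_pe(result)        # transfer to the requesting PE
            return True
        return False                  # wait for the next transfer trigger
    if master_ready():                # step 74: can the master CP receive it?
        send_to_master(result)        # step 75: transfer to the master CP
        return True
    return False                      # wait for the next transfer trigger
```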
[0055]
FIG. 7 is a time chart when the I / O decentralization and the split parallelization are used in one embodiment of the present invention.
It is assumed that there is one master CP 12 and one slave CP 26, and that the designated number of I / O divisions is two. It is also assumed that the distribution ratio is master : slave = 1 : 2. The read data of the I / O request from the processor element PE is handled by the master CP 12 and the slave CP 26 as shown in FIG. 7. That is, the master CP 12 reads data A and B, and the slave CP 26 reads data C and D.
[0056]
The temporal flow is as shown in FIG. 7. First, the master CP 12 executes the I / O for data A against the storage device 31, and data is transferred from the storage device 31 to the master CP 12 during time A1. When the transfer of A1 is completed, the master CP 12 executes the I / O for data B and, at the same time, transfers the data A received during A1 to the processor element PE that requested the I / O, which takes time A2. While this transfer is under way, the data transfer from the storage device 31 to the master CP 12 for data B proceeds in parallel during time B1. After time A2, the data transfer from the master CP 12 to the processor element PE for data B is performed during time B2.
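The overlap of reading and forwarding on the master CP 12 can be checked with a small timing sketch. The durations are hypothetical: they are obtained by halving the 6-second read and 12-second transfer of the prior-art example, which the specification does not state for this embodiment. The function name `pipeline_finish_times` is likewise illustrative, and the slave path (C and D) is not modeled.

```python
from typing import List, Tuple


def pipeline_finish_times(units: List[Tuple[float, float]]) -> List[float]:
    """Return the time at which each unit finishes its transfer to the PE.

    units: (read_time, send_time) per I/O execution unit, in issue order.
    Reads are issued back to back; a send starts once its data has been
    read and the previous send has finished.
    """
    read_done = 0.0
    send_done = 0.0
    finishes: List[float] = []
    for read_time, send_time in units:
        read_done += read_time                    # A1, then B1
        send_start = max(read_done, send_done)    # needs the data and a free path
        send_done = send_start + send_time        # A2, then B2
        finishes.append(send_done)
    return finishes


# Undivided request: read 6 s, then send 12 s.
undivided = pipeline_finish_times([(6.0, 12.0)])
# Divided in two: B1 overlaps A2, and B2 follows A2.
divided = pipeline_finish_times([(3.0, 6.0), (3.0, 6.0)])
```

Under these assumed durations the undivided request completes at 18 seconds and the divided one at 15 seconds, because the read of data B is hidden behind the transfer of data A.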
[0057]
In the slave CP 26 as well, data transfer is performed for each of the data C and D as shown in FIG. 7. First, data transfer from the storage device 31 to the slave CP 26 is performed at times C1 and D1, data transfer from the slave CP 26 to the master CP 12 is performed at times C2 and D2, and data transfer from the master CP 12 to the processor element PE is performed at times C3 and D3.
[0058]
In this embodiment, when there is an abnormality in the I / O processing of the slave CP 26, the master CP 12 takes over the I / O processing and performs recovery processing. If there is an abnormality in the master CP 12, one of the slave CPs becomes the master CP and can take over the processing in the same way.
[0059]
[Effects of the invention]
As described above, according to the present invention, the load on the control processor is spread out by distributing I / O and by dividing and parallelizing I / O, and the time until the I / O requested by a processor element completes is reduced. Therefore, the throughput performance of jobs and the peak performance of I / O are improved, and the speed of the entire system can be increased. Further, if there is any abnormality in the I / O processing of the slave CP, the master CP performs recovery, so that reliability is maintained.
[Brief description of the drawings]
FIG. 1 is a diagram showing a configuration example of the present invention.
FIG. 2 is a diagram illustrating I / O decentralization according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining I / O division and parallelization according to an embodiment of the present invention.
FIG. 4 is a diagram showing a flow of an entire I / O process according to an embodiment of the present invention.
FIG. 5 is a flowchart of an I / O decentralization process in a master CP.
FIG. 6 is a flowchart of an I / O division process according to an embodiment of the present invention.
FIG. 7 is a time chart when the I / O decentralization and the split parallelization are used together in one embodiment of the present invention.
FIG. 8 is an explanatory diagram of a conventional technique.
[Explanation of symbols]
10-0, 10-1, ... Processor element
11 Crossbar network
12 Master CP
13 Input device
14 I / O division instruction input means
15 I / O distribution instruction input means
16 I / O request receiving means
17 Distribution determination processing means
18 I / O division processing means
19 I / O parallel execution means
20 I / O request transfer means
21 I / O processing result receiving means
22 I / O processing completion notifying means
23 Abnormality detecting means
24 Distribution suppressing means
25 Recovery processing means
26 Slave CP
27 I / O request receiving means
28 I / O division processing means
29 I / O parallel execution means
30 I / O processing result transfer means
31 Storage device

Claims

A parallel computer system having a plurality of processor elements for performing operations, a plurality of control processors for processing I / O requests from each of the processor elements, a network connecting the processor elements and the control processors, and a storage device or an input / output device that is connected to the control processors and is the target of I / O, wherein
one of the control processors is a master control processor, and
the master control processor comprises:
means for receiving an I / O request from each of the processor elements;
distribution determination processing means for determining, based on the load status of each control processor and predetermined distribution instruction information, whether execution of the I / O request is to be distributed to a plurality of control processors or performed by a specific control processor;
I / O request transfer means for transferring the I / O request to another control processor when the I / O request is to be executed by that control processor;
I / O processing result receiving means for receiving an I / O processing result when the I / O request has been executed by another control processor; and
I / O processing completion notifying means for notifying the processor element that is the I / O request source of the I / O processing result executed by its own control processor, the I / O processing result executed by another control processor, or both.

The parallel computer system according to claim 1,
further comprising I / O distribution instruction input means for externally inputting whether to distribute I / O requests to a plurality of control processors, or for externally inputting the ratio at which I / O requests are distributed to each control processor.

The parallel computer system according to claim 1 or 2, wherein
the master control processor comprises:
abnormality detection means for detecting an abnormality of another control processor; and
distribution suppressing means for preventing another control processor in which an abnormality has been detected from sharing execution of an I / O request.

The parallel computer system according to claim 1, claim 2, or claim 3, wherein
the master control processor comprises:
recovery processing means for re-executing, when an abnormality is detected in another control processor that has been sharing execution of an I / O request, the I / O request that was being processed by that control processor.

A parallel computer system having a plurality of processor elements for performing operations, a plurality of control processors for processing I / O requests from each of the processor elements, a network connecting the processor elements and the control processors, and a storage device or an input / output device that is connected to the control processors and is the target of I / O, wherein
one of the control processors is a master control processor, and
the master control processor comprises:
I / O distribution instruction input means for externally inputting whether to distribute an I / O request to a plurality of control processors;
I / O division instruction input means for externally inputting a division number for dividing an I / O request into a plurality of I / O execution units;
means for receiving an I / O request from each of the processor elements;
distribution determination processing means for determining, based on the load status of each control processor and the distribution instruction information from the I / O distribution instruction input means, whether execution of the I / O request is to be distributed to a plurality of control processors or performed by a specific control processor;
I / O request transfer means for transferring the I / O request to another control processor when the I / O request is to be executed by that control processor;
I / O processing result receiving means for receiving an I / O processing result when the I / O request has been executed by another control processor;
I / O processing completion notifying means for notifying the processor element that is the I / O request source of the I / O processing result executed by its own control processor, the I / O processing result executed by another control processor, or both;
processing means for dividing an I / O request from the processor element into a plurality of I / O execution units in accordance with an instruction from the I / O division instruction input means; and
I / O parallel execution means for executing in parallel, for the plurality of divided I / O execution units, the input / output of data between the storage device or the input / output device and the control processor and the data transfer between the control processor and the processor element that is the I / O request source.