JP3732648B2

JP3732648B2 - Process allocation method

Info

Publication number: JP3732648B2
Application number: JP07049398A
Authority: JP
Inventors: 熊本 乃親; 田中 竜太
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1998-03-19
Filing date: 1998-03-19
Publication date: 2006-01-05
Anticipated expiration: 2018-03-19
Also published as: JPH11272623A

Description

【０００１】
【発明の属する技術分野】
本発明は、複数のプロセッサを直列接続したリニアアレイ型マルチプロセッサシステムの前記各プロセッサに対し、複数のサブプロセスを構成要素として含むプロセスを動的に割当てるプロセス割当て方法に関する。
【０００２】
近年、ディジタル通信システム、マルチメディアシステムなど、実時間処理が要求される分野においては、高速、低コストで、かつ複雑なアルゴリズムに柔軟に対応可能なディジタル信号処理プロセッサ（ＤＳＰ）を始めとして、信号処理を高速に実行可能なプロセッサが広く使われている。
【０００３】
ところが、画像処理、音響処理ＣＧ、などの処理を複数同時に処理しようとした場合、単一のプロセッサでは実時間で処理を行うことが困難となっている。また、単一プロセッサで実現される処理においても、処理すべきデータ量が増加した場合や、要求される速度が早くなった場合、単一のプロセッサでは実現できなくなってしまうことがある。そこで、複数のプロセッサを使用して処理の高速化を図る必要がある。
【０００４】
【従来の技術】
以下、図に基づいて従来例を説明する。
§１：従来例１の説明
従来、マルチプロセッサシステムにおいて、各プロセッサにプロセスを割当てる場合、空いているプロセッサから順番にプロセスを割当てる方法がとられている。これは、リニアアレイ型マルチプロセッサシステムではプロセッサ間通信を行う場合、転送経路が１つ（直線状）又は２つ（リング状）しかないため、通信時間をほとんど考慮する必要がないためである。
【０００５】
また、このように空いているプロセッサを見つけて順に割当てる方法では、動的なプロセス割当てに際し、重要な割当てに要する時間が小さいという利点がある。しかしながら、アプリケーションによっては、プロセッサ間のデータ転送に用いるリンクに流れるデータ転送量がかなり多い場合があり、このような場合、或る特定のプロセッサ間のリンクのデータ転送量が大きくなり、他のプロセッサ間のリンクのデータ転送量が小さくなってしまうことが起こる。
【０００６】
このような場合、プロセッサの能力はまだ十分に空いているにも関わらず、データ転送ができないため、全体の処理が実時間で実現不可能になってしまうことがある。
【０００７】
§２：従来例２の説明
データ転送量を考慮した割当て方法としては、例えば、特開平９−３４８４７号公報に記載された発明がある。以下、この例を従来例２として説明する。従来例２では、通信路にスイッチを置き、そのスイッチでデータ転送量を計数し、それに応じて通信経路を変更することで、データ転送量を考慮したプロセスの割当てを行う。
【０００８】
ところが、このプロセス割当て方法では、通信経路が複数あれば効果はあるが、リニアアレイ型プロセッサシステムでは、通信路は１つ、リング構成でも２つしかなく、ほとんど効果がない。
【０００９】
§３：従来例３の説明
プロセッサ間のデータ転送量の最大値が最小となるプロセスを割当てることで、プロセッサの能力を生かした処理の実現を可能にしたものが既に提案されている。ところが、ここで割当てる各プロセスは、一般に複数のプロセッサ資源を要求するため、単一プロセッサで処理されるサブプロセスを複数接続した構成になっている。
【００１０】
従って、このサブプロセスの実行スケジュールによって、そのプロセスが要求する資源は変化する。このようなサブプロセスの実行スケジュールについては、特開平８−１５２９０３号公報に示されているように、予め制御グループを設定し、グループ内をスケジューリングする方法が知られている。
【００１１】
しかし、この方法では、制御グループを先に静的に決定してしまうため、グループ間のデータ転送量の負荷分散を行うには、その都度、制御グループの割当て変更が必要となり、負荷分散が動的に行われない。
【００１２】
§４：従来例４の説明
複数プロセッサで構成されるシステムの中で、複数のプロセッサを直列接続して得られるリニアアレイ型マルチプロセッサシステムが、先に提出した特願平９−２２１６１７号（以下「従来例４」と記す）に記載されている。この従来例４に示されているリニアアレイ型マルチプロセッサシステムは、構成が簡単で、最も安価なシステムを構成できる。
【００１３】
前記リニアアレイ型マルチプロセッサシステムでは、プロセッサは左右のいずれかのプロセッサへは、直接データ転送できるが、それ以外のプロセッサへのデータ転送は、他のプロセッサを経由してしかデータ転送できない。従って、複数のプロセッサに対し、ユーザの要求によりダイナミックに処理を行ったり、停止する場合、プロセスの割当て方によって、複数のプロセッサの能力を十分に生かしきれない。
【００１４】
以下、従来例４について、図を参照しながらその概要を説明する。図８は従来例の説明図（その１）であり、信号処理アクセラレータ（リニアアレイ型プロセッサシステム）の説明図である。以下、図８に基づいて、信号処理アクセラレータ（リニアアレイ型プロセッサシステム）の構成を説明する。
【００１５】
図８に示した信号処理アクセラレータは、複数で同一の情報処理ユニット１０を含む。各情報処理ユニット１０は互いに接続されると共に、ホストメモリバス３０に接続される。各情報処理ユニット１０は、信号処理プロセッサ１１、命令キャッシュ１２、データＲＡＭ１３、リンク制御部１４、１５、メインキャッシュ１６、リンクキャッシュ１７、ＤＲＡＭコントローラ１９を含む。なお、前記信号処理プロセッサ１１及びデータＲＡＭ１３は、信号処理部２５を構成する。また、リンク制御部１４、１５、メインキャッシュ１６、及びリンクキャッシュ１７は、通信制御部２６を構成する。リンク制御部１４、１５には通信リンク２０が接続される。各情報処理ユニット１０は通信リンク２０を介して直列に配列され、隣接する情報処理ユニット１０と通信する。通信内容を或る情報処理ユニット１０から次の情報処理ユニット１０へと伝達していくことで、任意の情報処理ユニット間で通信を行うことができる。この場合、図８では、３個の情報処理ユニット１０が示されているが、個数は３個に限らず任意である。
【００１６】
また、各情報処理ユニット１０はＤＲＡＭコントローラ１９を介してホストメモリバス３０に接続される。更に、ホストプロセッサ３１がホストメモリバス３０に接続される。信号処理プロセッサ１１は信号処理機能を実現するプロセッサであり、命令キャッシュ１２は信号処理プロセッサ１１が頻繁に用いる命令を格納しておくためのキャッシュメモリである。
【００１７】
なお、信号処理プロセッサ１１が実行するプログラムは、命令キャッシュ１２以外にＤＲＡＭ１８に格納されている。データＲＡＭ１３は、信号処理プロセッサ１１がデータ処理する際に中間結果を格納するため等のワーク領域として用いられる。メインキャッシュ１６及びリンクキャッシュ１７は、信号処理プロセッサ１１が処理するデータを格納しておくためのキャッシュメモリである。
【００１８】
メインキャッシュ１６には、情報処理ユニット１０自身のＤＲＡＭ１８から取り込んだデータを格納し、リンクキャッシュ１７には、リンク制御部１４、１５を介して他の情報処理ユニット１０から取り込んだデータを格納する。
【００１９】
メインキャッシュ１６のデータがスワップアウトされても、そのデータが必要になった場合には、自分のＤＲＡＭ１８からデータを再度読み出すことができる。それに対して、リンクキャッシュ１７のデータがスワップアウトされると、別の情報処理ユニット１０から通信リンク２０を介してデータを再度持ってくる必要がある。
【００２０】
メインキャッシュ１６とリンクキャッシュ１７を同一のキャッシュメモリとしてしまうと、例えば、通信負荷が重い状態なのに、自身のＤＲＡＭ１８からのデータをキャッシュメモリに格納することによって、他の情報処理ユニット１０から獲得したデータをスワップアウトしてしまう等の問題が生じる。この例では、メインキャッシュ１６とリンクキャッシュ１７とを機能別に分けている。
【００２１】
データ処理を行う際、複数の情報処理ユニット１０は互いに通信しながら、並列処理、或いはパイプライン処理を行う。たとえば、いくつかの情報処理ユニット１０が画像データ処理を並列に実行している間に、別の幾つかの情報処理ユニット１０が音声データを並列に処理すること等が可能である。
【００２２】
複数の情報処理ユニット１０間の通信は、通信リンク２０によって行われる。従って、ホストメモリバス３０は、複数の情報処理ユニット１０間での通信には全く関与せず、ホストプロセッサ３１が実行するＯＳプロセス等の他のプロセスにデータ転送経路を提供できる。
【００２３】
各情報処理ユニット１０は処理後のデータをＤＲＡＭ１８に書き込む。ホストプロセッサ３１はホストメモリバス３０を介してＤＲＡＭ１８にアクセスすることによって、データ処理後のデータを書き込むことができる。図８の信号処理アクセラレータは、ホストメモリバス３０を介在しないで通信可能な複数の情報処理ユニット１０を設けて、それら情報処理ユニット１０に並列処理を実行させるので、バス競合によりデータ処理速度が低下することなく、高速な信号処理を実現することができる。
【００２４】
また、画像処理プロセスや音声処理プロセス等の複数のプロセスの各々に、別の情報処理ユニット１０を割当てることができるので、複数の異なった信号を処理する必要があるマルチメディア信号処理に適している。また、信号処理部２５（信号処理プロセッサ１１、命令キャッシュ１２、データＲＡＭ１３）、通信制御部２６（キャッシュメモリ１６、及び１７と、リンク制御部１４及び１５）、及びメモリ（ＤＲＡＭ１８及びＤＲＡＭコントローラ１９）を１つのチップ上に集積回路化して、従来のメモリと同等な形でパーソナルコンピュータに搭載することができる。
【００２５】
従って、コストを従来のメモリバスに重複させることが可能であると共に、メモリバス内に埋め込まれた信号処理アクセラレータをソフトウェアで活用できるという利点がある。従って、ハードウェアの追加、拡張にかかる費用を削減すると共に、機能拡張に優れたシステムを構築できる。
【００２６】
§５：プロセス割当て方法の説明・・・図９参照
図９は従来例の説明図（その２）であり、２つの異なったプロセス割当ての方法（Ａ図は例１、Ｂ図は例２）を示した図である。以下、図９に基づいて、２つの異なったプロセス割当ての方法を説明する。図９のＡ図のように、プロセス１をプロセッサエレメントＰＥ１及びＰＥ３に割当て、プロセス２をプロセッサエレメントＰＥ２及びＰＥ４に割当てた場合を説明する。
【００２７】
一つのプロセッサ内で２つのＰＥ間のデータ転送量をＭとすれば、ＰＥ１及びＰＥ３間でＰＥ２を介してのデータ転送量はＭとなり、また、ＰＥ２及びＰＥ４間でＰＥ３を介してのデータ転送量もＭとなる。従って、データ転送量は、ＰＥ１及びＰＥ２間でＭ、ＰＥ２及びＰＥ３間で２Ｍ、ＰＥ３及びＰＥ４間でＭとなる。
【００２８】
次に、図９のＢ図に示したように、プロセス１をプロセッサエレメントＰＥ１及びＰＥ２に割当て、プロセス２をプロセッサエレメントＰＥ３及びＰＥ４に割当てた場合を説明する。この場合、ＰＥ１及びＰＥ２の間でのデータ転送量がＭ、ＰＥ３及びＰＥ４間でのデータ転送量がＭとなり、ＰＥ２及びＰＥ３間ではデータ転送は行われない。
【００２９】
各ＰＥ間のデータ転送能力が、たとえば、１．５Ｍであるとすると、図９のＡ図の場合（例１）には、一方のプロセスの処理が実現不可能となるのに対して、図９のＢ図の場合（例２）には、両方のプロセスの処理を同時に実現できる。このようにプロセスの割当て方によっては各リンクのデータ転送量が異なる結果となり、複数のプロセスの完全な同時処理が可能な場合と、不可能な場合とが生じる。
【００３０】
同時処理が不可能な場合には、当然ながら全体でのデータ処理速度が劣化することになる。しかも、他のプロセッサエレメントから要求される数やタイミングは、実際の要求が発行される前には全く未知であるため、プロセスの割当てを動的に行う必要がある。従って、効率の良い動的プロセス割当てアルゴリズムが必要になる。
【００３１】
§６：動的プロセス割当てアルゴリズムの説明・・・図１０〜１３参照
図１０〜図１３は従来例の説明図（その３）〜（その６）であり、動的プロセス割当てアルゴリズムを示した図である。以下、図１０〜図１３に基づいて、動的プロセス割当てアルゴリズムを説明する。
【００３２】
(1) ：メインルーチンの説明・・・図１０参照
図１０は従来例の説明図（その３）であり、動的プロセス割当てアルゴリズムのメインルーチンを示すフローチャートである。以下、図１０に基づいて、動的プロセス割当てアルゴリズムのメインルーチンを説明する。なお、Ｓ１〜Ｓ５は各処理ステップを示す。
【００３３】
この動的プロセス割当てアルゴリズムは、リソース管理プログラム（図示省略）が実行する処理である。この動的プロセス割当てアルゴリズムは、以下の２つの指標に基づいてリソース（この場合のリソースはＰＥを指す）の割当てを行う。第１の指標は、割当てるプロセスのデータ転送ができるだけ他のデータ転送と重ならないことである。第２の指標は、当該プロセスを割当てた結果、次のプロセスを割当てる際に、できるだけ他のデータ転送と重ならないことである。
【００３４】
先ず、第１の指標によって、割当てようとしているプロセスによって生じる通信リンク２０上のデータ転送量の最大値が最小になるように設定する。次に、同一の最大値を有する割当て方の中から、第２の指標を用いて、次に割当てるプロセスができるだけ阻害されないような割当て方を求める。
【００３５】
図１０に示したように、このアルゴリズムにおいては、ＰＥを一個要求する場合と、複数個要求する場合とで、別々に割当て方を求める。これはＰＥを１個だけ要求する場合には、当該プロセスによる通信リンク２０上のデータ転送は生じないために、次のプロセス割当てに対する影響を考えればよいからである。
【００３６】
それに対してＰＥを複数個割当てる場合には、通信リンク２０を介したデータ転送が必要であるため、そのプロセスの割当て方によって、当該プロセス自身の効率が影響を受けてしまう。
【００３７】
図１０のステップＳ１において、フリーリソースとして利用可能なＰＥの数を検査する。利用可能なＰＥが存在しない場合にはアルゴリズムを終了し、存在する場合にはステップＳ２に進む。ステップＳ２において、要求されるＰＥの数が１個であるのか否かを判断する。１個の場合はステップＳ３に進み、複数の場合はステップＳ４に進む。
【００３８】
ステップＳ３において、ＰＥの１個要求に対する割当てを行う。割当てが失敗した場合にはアルゴリズムを終了し、成功した場合は次にステップＳ５に進む。ステップＳ４において、ＰＥの複数個要求に対する割当てを行う。割当てが失敗した場合にはアルゴリズムを終了し、成功した場合にはステップＳ５に進む。ステップＳ５において、プロセスＩＤを更新する。すなわち、新規のプロセスに、新規のプロセスＩＤを割当てる。これでアルゴリズムを終了する。
【００３９】
(2) ：ＰＥの１個要求に対する割当ての説明・・・図１１参照
図１１は、従来例の説明図（その４）であり、図１０のステップＳ３におけるＰＥの１個要求に対する割当て処理を示すフローチャートである。なお、Ｓ１１〜Ｓ１７は各処理ステップを示す。
【００４０】
ステップＳ１１において、利用可能なＰＥを検索する。次に、ステップＳ１２において、全ての利用可能なＰＥに対してループを構成する。すなわち、全ての利用可能なＰＥに対して以下のステップを繰り返す。ステップＳ１３において、１個のＰＥを仮割当てする。ステップＳ１４において、次の割当て効率を計算する。なお、割当て効率を計算した結果を「Ｒｅｓｕｌｔ」と記す。
【００４１】
ステップＳ１５において、Ｒｅｓｕｌｔが最小の値を保持する。ステップＳ１６においてループを終了する。ステップＳ１７において、Ｒｅｓｕｌｔが最小の値であるＰＥを、実際にプロセスに割当てる。以上で処理を終了する。
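The single-PE allocation of steps S11 to S17 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `allocate_single` and the representation of free PEs as a set of 1-based array positions are assumptions made here. The "next allocation efficiency" (Result) is taken, as in FIG. 13, to be the number of links between the leftmost and rightmost PE that would remain free; ties are broken in favor of the leftmost candidate, which the source does not specify.

```python
def allocate_single(free_pes):
    """Choose one PE following steps S11 to S17.

    free_pes holds the 1-based positions of the free PEs in the array.
    Returns the chosen PE, or None when no PE is free.
    """
    def result(pe):
        # S13, S14: tentatively take `pe`, then count the links between the
        # leftmost and rightmost PE that would remain free (Result).
        remaining = set(free_pes) - {pe}
        return max(remaining) - min(remaining) if remaining else 0

    # S15 to S17: keep and return the candidate with the smallest Result.
    return min(sorted(free_pes), key=result) if free_pes else None


print(allocate_single({2, 3, 6}))  # 6
```

Taking PE6 leaves PE2 and PE3 free, which are adjacent (Result = 1), so the remaining free PEs stay clustered for the next request.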
【００４２】
(3) ：ＰＥの複数個要求に対する割当ての説明・・・図１２参照
図１２は従来例の説明図（その５）であり、ＰＥの複数個要求に対する割当てを示すフローチャートである。以下、図１０のステップＳ４におけるＰＥの複数個要求に対する割当てを説明する。なお、Ｓ２１〜Ｓ３１は各処理ステップを示す。
【００４３】
ステップＳ２１において、利用可能なＰＥを検索する。ステップＳ２２において、要求される個数の利用可能なＰＥの全ての組み合わせに対して、第１ループを構成する。すなわち、要求される個数の利用可能なＰＥの全ての組み合わせに対して、以下のステップを繰り返す。
【００４４】
ステップＳ２３において、当該プロセスの割当てによって生じる通信リンク２０のデータ転送量を計算する。ステップＳ２４において、通信リンク２０のデータ転送量の最大値が最小の割当てを保持する。ステップＳ２５において、第１のループを終了する。ステップＳ２６において、通信リンク２０のデータ転送量の増加が最小であるような割当てを第１のループで選択された割当てとして、全ての選択された割当てに対して第２のループを構成する。
【００４５】
ステップＳ２７において、選択された割当て方で、複数個のＰＥを仮割当てする。ステップＳ２８において、次の割当て効率を計算する。なお、次の割当て効率を計算した結果を、Ｒｅｓｕｌｔと記す。ステップＳ２９において、結果Ｒｅｓｕｌｔが最小の割当てを保持する。ステップＳ３０において、第２のループを終了する。ステップＳ３１において、Ｒｅｓｕｌｔが最小の割当てである割当て方で、実際にプロセスに割当てる。以上の処理を終了する。
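The two loops of steps S21 to S31 can be sketched in Python as follows. This is a hedged illustration under stated assumptions: the name `allocate_multiple`, the list `link_load` (current load on each link, where index k is the link joining PE k+1 and PE k+2), and the traffic model (the new process adds `amount` to every link between its leftmost and rightmost PE) are all introduced here and are not taken from the source. The first loop follows S24 (keep the allocations whose peak link load is smallest); the second loop picks, among those, the allocation with the smallest Result.

```python
from itertools import combinations


def allocate_multiple(free_pes, link_load, count, amount):
    """Pick `count` free PEs following steps S21 to S31.

    Returns a tuple of 1-based PE positions, or None if too few PEs are free.
    """
    best_peak, candidates = None, []
    # First loop (S22 to S25): every combination of `count` free PEs.
    for combo in combinations(sorted(free_pes), count):
        loads = list(link_load)
        # S23: assumed traffic model, see the lead-in above.
        for link in range(combo[0] - 1, combo[-1] - 1):
            loads[link] += amount
        peak = max(loads)
        # S24: keep the allocations whose peak link load is smallest.
        if best_peak is None or peak < best_peak:
            best_peak, candidates = peak, [combo]
        elif peak == best_peak:
            candidates.append(combo)

    def result(combo):
        # S28: links between the leftmost and rightmost PE left free.
        remaining = set(free_pes) - set(combo)
        return max(remaining) - min(remaining) if remaining else 0

    # Second loop (S26 to S31): smallest Result among the kept allocations.
    return min(candidates, key=result) if candidates else None


# Six PEs, PE3 busy, and the PE1-PE2 link already carrying load 1.
print(allocate_multiple({1, 2, 4, 5, 6}, [1, 0, 0, 0, 0], 2, 1))  # (5, 6)
```

In the usage line, every pair that crosses the PE1-PE2 link would raise the peak load to 2, so only pairs avoiding that link survive the first loop; of those, (5, 6) leaves the most compact set of free PEs.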
【００４６】
(4)：次の割当て効率計算処理の説明・・・図１３参照
図１３は従来例の説明図（その６）であり、図１１のステップＳ１４及び図１２のステップＳ２８に於ける次の割当て効率計算処理を示すフローチャートである。以下、図１３に基づいて、前述した次の割当て効率計算処理を説明する。なお、Ｓ４１〜Ｓ４３は各処理ステップを示す。
【００４７】
ステップＳ４１において、利用可能なＰＥの内で、最も左端のＰＥを選択してＰＥ−Ｌとする。ステップＳ４２において、利用可能なＰＥの内でも最も右端のＰＥを選択してＰＥ−Ｒとする。ステップＳ４３において、ＰＥ−ＬからＰＥ−Ｒまでの通信リンク数を求め、この通信リンク数をＲｅｓｕｌｔとする。以上で処理を終了する。
【００４８】
図１３のフローチャートにおいては、最も左端のＰＥと最も右端のＰＥを選択して、両ＰＥ間での通信リンク数を求める。すなわち、この通信リンク数によって、次のプロセスを割当てる際の割当て効率を求めていることになる。これは以下のように考えれば良い。すなわち、両ＰＥ間での通信リンク数が少ないということは、利用可能なＰＥがまとまって存在していることを意味する。逆に、両ＰＥ間での通信リンク数が多いと、利用可能なＰＥが広い範囲に広がって存在していることになる。
【００４９】
狭い範囲にまとまったＰＥにプロセスを割当てる方が、当然ながら割当てられたＰＥ間に介在するＰＥの数も少なく、割当て後のデータ転送量の最大値も小さくなる可能性が高い。広い範囲で互いに離れたＰＥにプロセスを割当てたのでは介在するＰＥの数が多く、別のプロセスのデータ転送と重なる可能性が高く、従って、割当て後のデータ転送量の最大値も大きくなる可能性が高い。
【００５０】
すなわち、図１３のフローチャートは、プロセス割当て後に残された利用可能なＰＥがまとまってなるべく狭い範囲に存在するような指標を与えることになる。すなわち、プロセス割当て後に、残された利用可能なＰＥを次回割当てる際に、データ転送効率がなるべくよくなるような指標を与えていることになる。
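The "next allocation efficiency" calculation of steps S41 to S43 reduces to a span measurement, sketched below in Python. The function name and the representation of free PEs as 1-based positions are assumptions made for illustration; returning 0 when no PE is free is also an assumption, since the source does not cover that case.

```python
def next_allocation_efficiency(free_pes):
    """Number of links between the leftmost and rightmost free PE."""
    if not free_pes:
        return 0  # assumption: the source does not define this case
    pe_l = min(free_pes)  # S41: leftmost free PE
    pe_r = max(free_pes)  # S42: rightmost free PE
    return pe_r - pe_l    # S43: links between PE-L and PE-R


print(next_allocation_efficiency({3, 4, 5}))  # 2
print(next_allocation_efficiency({1, 4, 8}))  # 7
```

A small value means the free PEs are clustered; a large value means they are spread across the array, so a later multi-PE process is more likely to route traffic through PEs used by other processes.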
【００５１】
【発明が解決しようとする課題】
前記のような従来のものにおいては、次のような課題があった。
(1) ：従来例１では、プロセッサの能力はまだ十分に空いているにも関わらず、データ転送ができないため、全体の処理が実時間で実現不可能になってしまうことがある。
【００５２】
(2) ：従来例２では、前記プロセス割当て方法は通信経路が複数あれば効果はあるが、リニアアレイ型プロセッサシステムでは、通信路は１つ、リング構成でも２つしかなく、ほとんど効果がない。
【００５３】
(3) ：従来例３では、制御グループを先に静的に決定してしまうため、グループ間のデータ転送量の負荷分散を行うには、その都度、制御グループの割当て変更が必要となり、負荷分散が動的に行われない。
【００５４】
(4) ：従来例４に示したリニアアレイ型プロセッサシステムにおける動的プロセス割当てにおいて、割当てるべきプロセスは複数のプロセッサ資源を要求する。従って、一般的に、各プロセスは単一プロセッサで処理されるサブプロセスを複数接続した構成となっている。サブプロセスの実行スケジュールによっては、そのプロセスが要求するプロセッサ数や、データ転送量は可変である。
【００５５】
従って、このサブプロセスの実行スケジュールが動的プロセス割当てによるプロセッサ間データ転送量の増減に影響し、このことで、システム全体の能力を充分に発揮することができなくなってしまうという課題がある。
【００５６】
また、プロセスが要求するプロセッサ数やデータ転送量の少ないサブプロセスの実行スケジュールを使うことで、他のプロセスの実行速度は満足できても、そのプロセス単体の処理速度が低下してしまい、全体の処理が実現できなくなるという課題がある。
【００５７】
本発明は、このような従来の課題を解決し、リニアアレイ型プロセッサシステムにおける動的プロセス割当てによってシステム全体の能力を最大限引き出すために、サブプロセスの実行スケジュールを、予め静的スケジューリング手段によって求めておくことで、効率的な処理を可能にすることを目的とする。
【００５８】
【課題を解決するための手段】
本発明は前記の目的を達成するため、次のように構成した。
(1) ：複数のプロセッサを直列接続したリニアアレイ型マルチプロセッサシステムの前記各プロセッサに対し、複数のサブプロセスを構成要素として含むプロセスを動的に割当てるプロセス割当て方法において、前記動的プロセス割当てを行うためのプロセスを構成するサブプロセスの、前記プロセスの要求するプロセッサ数やプロセッサ間データ転送量を含む実行スケジュール情報を、予め、サブプロセスの静的スケジューリング手段により静的に求めておく手順と、動的プロセス割り当て手段が、前記求められたサブプロセスの実行スケジュール情報を受け取り、その実行スケジュール情報を基に、前記各プロセッサに対して動的プロセス割り当てを行う手順とを備えると共に、前記サブプロセスの静的スケジューリング手段は、静的スケジューリングを行う時、指定した時間内にそのプロセスが終了するために要求されるプロセッサ数が最小となるサブプロセススケジュールを求める手順と、前記求めたプロセッサ数で割り当てを試み、割り当てが可能か否かを判断して、割り当てが不可能であれば前記プロセッサ数を増加させて再度割り当てが可能か否かを判断することを繰り返し、割り当てが可能になった時点で、そのプロセスが要求するプロセッサ数最小となるスケジュールを得る手順とを備えていることを特徴とする。
【００５９】
(2) ：複数のプロセッサを直列接続したリニアアレイ型マルチプロセッサシステムの前記各プロセッサに対し、複数のサブプロセスを構成要素として含むプロセスを動的に割当てるプロセス割当て方法において、前記動的プロセス割当てを行うためのプロセスを構成するサブプロセスの、前記プロセスの要求するプロセッサ数やプロセッサ間データ転送量を含む実行スケジュール情報を、予め、サブプロセスの静的スケジューリング手段により静的に求めておく手順と、動的プロセス割り当て手段が、前記求められたサブプロセスの実行スケジュール情報を受け取り、その実行スケジュール情報を基に、前記各プロセッサに対して動的プロセス割り当てを行う手順とを備えると共に、前記サブプロセスの静的スケジューリング手段は、静的スケジューリングを行う時、指定したプロセッサ数でそのプロセスが終了するための処理時間が最小となるサブプロセススケジュールを求める手順と、サブプロセスの処理時間の総和を目的のプロセッサ数で割る演算を行うことで前記処理時間を求める手順と、前記求めた処理時間以上の処理時間Ｔの条件で割り当てを試み、割り当てが可能か否かを判断して、割り当てが不可能であれば前記処理時間Ｔを更新し、再度割り当てが可能か否かを判断することを繰り返し、割り当てが可能になった時点で、そのプロセス単体の処理時間が最小となるスケジュールを得る手順とを備えていることを特徴とする。
【００６０】
(3) ：複数のプロセッサを直列接続したリニアアレイ型マルチプロセッサシステムの前記各プロセッサに対し、複数のサブプロセスを構成要素として含むプロセスを動的に割当てるプロセス割当て方法において、前記動的プロセス割当てを行うためのプロセスを構成するサブプロセスの、前記プロセスの要求するプロセッサ数やプロセッサ間データ転送量を含む実行スケジュール情報を、予め、サブプロセスの静的スケジューリング手段により静的に求めておく手順と、動的プロセス割り当て手段が、前記求められたサブプロセスの実行スケジュール情報を受け取り、その実行スケジュール情報を基に、前記各プロセッサに対して動的プロセス割り当てを行う手順とを備えると共に、前記サブプロセスの静的スケジューリング手段は、静的スケジューリングを行う時、前記プロセスを、サブプロセスの数まで分割する手順と、前記分割されたプロセスに対してスケジューリングを行い、各サブプロセスの実行時間を求める手順と、この時の時間制約の条件を満たすか否かを判断し、前記条件を満たさなければ、再度別の分割を試み、再度前記条件を満たすか否かを判断することを繰り返し、前記条件を満たした場合、その時の各分割間のデータ転送量の最大値が最小となるか否かを判断し、最小とならなければ処理を繰り返し、最小となれば、データ転送量の最大値が最小のものを保存する手順と、前記手順を全ての分割パターンに対して行い、前記保存してあるデータ転送量の最大値が最小のものを、そのプロセスのサブプロセススケジュールとする手順とを備えていることを特徴とする。
【００６１】
(4) ：複数のプロセッサを直列接続したリニアアレイ型マルチプロセッサシステムの前記各プロセッサに対し、複数のサブプロセスを構成要素として含むプロセスを動的に割当てるプロセス割当て方法において、前記動的プロセス割当てを行うためのプロセスを構成するサブプロセスの、前記プロセスの要求するプロセッサ数やプロセッサ間データ転送量を含む実行スケジュール情報を、予め、サブプロセスの静的スケジューリング手段により静的に求めておく手順と、動的プロセス割り当て手段が、前記求められたサブプロセスの実行スケジュール情報を受け取り、その実行スケジュール情報を基に、前記各プロセッサに対して動的プロセス割り当てを行う手順とを備えると共に、前記サブプロセスの静的スケジューリング手段が複数個で構成されていた場合、複数個のサブプロセスの静的スケジューリング手段により、予め求めておいた複数のサブプロセススケジュールから、動的プロセス割り当てに最適なサブプロセススケジュール結果を選択して使用する手順と、前記手順において、現在のプロセッサへのプロセス割り当て状況、現在のプロセッサ間のデータ転送量、及び将来のプロセス割り当て解除情報を含む複数の情報を、プロセスに割り当てる毎に更新しておき、次のプロセスを割り当てる時、前記複数の情報から最適な情報を選択して割り当てる手順とを備えていることを特徴とする。
【００６５】
（作用）
前記構成に基づく本発明の作用を説明する。
(a) ：前記(1) の作用
サブプロセスの静的スケジューリング手段は、動的プロセス割当てを行うためのサブプロセスのスケジュール情報を、予め、静的に求めておく。このように、動的プロセス割当てを行う時、割当てを行うプロセスを構成するサブプロセスを動的プロセス割当てを行う前に、予め静的にスケジュール情報を求めておく。
【００６６】
そして、そこで求められたサブプロセスの実行スケジュールからそのプロセスの要求するプロセッサ数やプロセッサ間データ転送量を基に、動的プロセス割当てを行うことで、全体の処理の効率の良い実現に寄与するところが大である。
【００６７】
(b) ：前記(2) の作用
サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、そのプロセスが要求するプロセッサ数最小となるスケジュール情報を得る。このように、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、そのプロセスが要求するプロセッサ数が最小となるサブプロセススケジュールを求めることで、プロセッサ数の少ないシステムに適した動的プロセス割当てを行うことができる。
【００６８】
(c) ：前記(3) の作用
サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、そのプロセス単体の処理時間最小となるスケジュール情報を得る。このように、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、そのプロセス単体の処理時間が最小となるサブプロセススケジュールを求めることで、全プロセスの処理時間の短い動的プロセス割当てを行うことができる。
【００６９】
(d) ：前記(4) の作用
サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、そのプロセスが要求するプロセッサ間データ転送量最小となるスケジュール情報を得る。このように、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、そのプロセスが要求するプロセッサ間データ転送量の最大値が最小となるサブプロセススケジュールを求めることで、プロセッサ間データ転送のための、容量が小さなシステムに適した動的プロセス割当てを行うことができる。
【００７０】
(e) ：前記(5) の作用
サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、前記サブプロセスのスケジュール情報を、複数の条件により求めておき、プロセスの要求条件に応じて適当なスケジュール情報を選択して割当てる。
【００７１】
このようにすれば、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、複数のサブプロセススケジュールを求めておき、その中から動的プロセス割当て時に、最適なサブプロセススケジュールを選択することで、その時の割当て状況に適応した動的プロセス割当てを行うことができる。
【００７２】
(f) ：前記(6) の作用
静的スケジューリング装置のサブプロセスの静的スケジューリング手段は、動的プロセス割当てを行うためのサブプロセスのスケジュール情報を、予め、静的に求めておく。このように、動的プロセス割当てを行う時、割当てを行うプロセスを構成するサブプロセスを動的プロセス割当てを行う前に、予め静的にスケジュール情報を求めておく。
【００７３】
そして、そこで求められたサブプロセスの実行スケジュールからそのプロセスの要求するプロセッサ数やプロセッサ間データ転送量を基に、動的プロセス割当てを行うことで、全体の処理の効率の良い実現に寄与するところが大である。
【００７４】
(g) ：前記(7) の作用
コンピュータが、前記記録媒体のプログラムを読み出して実行することにより、動的プロセス割当てを行うためのサブプロセスのスケジュール情報を、予め、静的に求めておく。このように、動的プロセス割当てを行う時、割当てを行うプロセスを構成するサブプロセスを動的プロセス割当てを行う前に、予め静的にスケジュール情報を求めておく。
【００７５】
そして、そこで求められたサブプロセスの実行スケジュールからそのプロセスの要求するプロセッサ数やプロセッサ間データ転送量を基に、動的プロセス割当てを行うことで、全体の処理の効率の良い実現に寄与するところが大である。
【００７６】
【発明の実施の形態】
以下、発明の実施の形態を図面に基づいて詳細に説明する。
§１：システムの説明・・・図１参照
図１はシステムの説明図である。以下、図１に基づいて本実施の形態で使用するシステムについて説明する。図１に示したように、このシステムは、リニアアレイ型マルチプロセッサシステムに対し、動的プロセス割当てを行うシステムである。
【００７７】
前記リニアアレイ型マルチプロセッサシステムは、複数のプロセッサ３２（ＰＥ１、ＰＥ２、ＰＥ３、ＰＥ４・・・）が直列接続されたシステムであり、前記従来例４と同じ構成のシステムである。この場合、前記プロセッサ３２は、それぞれ、前記従来例４の情報処理ユニット１０に対応している。
【００７８】
そして、各プロセッサ３２に対して、動的プロセス割当てを行うホストプロセッサ３１を前記各プロセッサ３２とバスにより接続されている。また、このホストプロセッサ３１により、前記各プロセッサ３２に動的プロセス割当てを行う前に、サブプロセスの静的割当てを行うための静的スケジューリング装置３３を前記ホストプロセッサ３１に接続しておく。
【００７９】
ホストプロセッサ３１には、各プロセッサ３２に対して動的プロセス割当てを行うための動的プロセス割当て手段３４が設けてあり、静的スケジューリング装置３３には、サブプロセスの静的スケジューリングを行うためのサブプロセスの静的スケジューリング手段３５が設けてある。
【００８０】
この場合、静的スケジューリング装置３３は、リニアアレイ型プロセッサシステムに対し、バス結合等により直接接続しても良いが、このような接続でなくても良い。例えば、静的スケジューリング装置３３と、ホストプロセッサ３１の間は、何らかの手段によりデータ通信が可能な状態にしておけば良く、ＬＡＮ等の通信回線を介して接続し、互いにデータ伝送可能な状態で使用することも可能である。
【００８１】
すなわち、静的スケジューリング装置３３は、前記ホストプロセッサ３１の近く（例えば、同一パーソナルコンピュータ内）に置いても良いし、離れた場所（異なる装置内）に置いても良い。少なくとも、動的プロセス割当て手段３４を備えた装置へデータ伝送できれば実施可能である。この場合、前記動的プロセス割当て手段３４、及びサブプロセスの静的スケジューリング手段３５は、プログラムの実行により実現するものである。
【００８２】
§２：プロセス割当て方法の説明・・・図２参照
図２はサブプロセスの処理説明図である。以下、図２に基づいて、サブプロセスの処理を説明する。図１に示したシステムにおいて、ホストプロセッサ３１が各プロセッサ３２に対し、動的プロセス割当て処理を行うことで、前記リニアアレイ型プロセッサシステムによる処理を行う。
【００８３】
この場合、割当てる各プロセスは、一般に複数のプロセッサ資源を要求するため、単一プロセッサで処理されるサブプロセスを複数接続した構成になっている。このサブプロセスの実行によって、そのプロセスが要求する資源は変化する。以下の説明では、前記サブプロセスは、１つのプロセスの要素を構成しており、１つのプロセスに複数のサブプロセスが存在する場合について説明する。
【００８４】
前記のように、ホストプロセッサ３１が動的プロセス割当てを行う時、前記動的割当てを行うプロセスを構成する複数のサブプロセスを、動的プロセス割当てを行う前に、静的スケジューリング装置３３に設けたサブプロセスの静的スケジューリング手段３５が、静的にスケジューリングを行う。
【００８５】
そこで求められたサブプロセスの実行スケジュール情報から、そのプロセスの要求するプロセッサ数や、プロセッサ間データ転送量を、前記ホストプロセッサ３１へ転送し、動的プロセス割当て手段３４が受信する。その後、ホストプロセッサ３１の動的プロセス割当て手段３４は、受信した前記プロセスの要求するプロセッサ数や、プロセッサ間データ転送量を基に、各プロセッサ３２に対して動的プロセス割当てを行う。このようにして、全体の処理の効率の良い実現を可能にしている。
【００８６】
この場合、前記リニアアレイ型マルチプロセッサシステムの各プロセッサ３２に対し、サブプロセスを構成要素として含むプロセスの割当てを行うプロセス割当て方法には、次のような方法がある。
【００８７】
▲１▼：プロセス割当て方法１は、前記サブプロセスの静的スケジューリング手段３５が、そのプロセスが要求するプロセッサ数最小となるスケジュールを得る方法（時間制約付スケジューリング）である。
【００８８】
▲２▼：プロセス割当て方法２は、サブプロセスの静的スケジューリング手段３５が、そのプロセス単体の処理時間が最小となるスケジュールを得る方法（プロセッサ数制約付スケジューリング）である。
【００８９】
▲３▼：プロセス割当て方法３は、サブプロセスの静的スケジューリング手段３５が、そのプロセスが要求するプロセッサ間データ転送量が最小となるスケジュールを得る方法である。
【００９０】
▲４▼：プロセス割当て方法４は、前記サブプロセスのスケジュール情報を、複数の条件により求めておき、プロセスの要求条件に応じて適当なスケジュールを選択して割当てる方法である。
【００９１】
§３：プロセス割当て方法１の説明・・・図３参照
図３はプロセス割当て処理フローチャート（その１）である。プロセス割当て方法１は、サブプロセスの静的スケジューリング手段３５が、そのプロセスが要求するプロセッサ数が最小となるスケジュールを得る方法（時間制約付スケジューリング）であり、以下、図３に基づいて説明する。なお、以下の処理はサブプロセスの静的スケジューリング手段３５が行う処理であり、Ｓ５１〜Ｓ５４は各処理ステップを示す。
【００９２】
サブプロセスの静的スケジューリング手段３５は静的スケジューリングを行う時、指定した時間内にそのプロセスが終了するために要求されるプロセッサ数Ｎｐが最小となるサブプロセススケジュールを求める。従って、先ず、前記プロセッサ数Ｎｐを、Ｎｐ＝（サブプロセスの処理時間の総和）／（目的の処理時間）の式により求める（Ｓ５１）。
【００９３】
次に、Ｎｐプロセッサで割当てを試み（Ｓ５２）、割当てが可能か否かを判断する（Ｓ５３）。その結果、割当てが可能であれば処理を終了するが、割当てが不可能であれば、Ｎｐを更新（Ｎｐ＝Ｎｐ＋１）し（Ｓ５４）、前記Ｓ５２の処理から繰り返して行う。このようにして、前記Ｓ５３の処理で割当てが可能になった時点で、そのプロセスが要求するプロセッサ数が最小となるスケジュールを得ることができる。
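The loop of steps S51 to S54 (time-constrained scheduling) can be sketched in Python as follows. Several points are assumptions made for this illustration and are not stated in the source: the subprocesses are treated as independent (no precedence constraints), the feasibility test `fits` is a hypothetical longest-first greedy packing, and the initial quotient Np is rounded up to an integer. Because the greedy test is a heuristic, the returned count is an upper bound on the true minimum rather than a guaranteed optimum.

```python
import math


def fits(durations, num_procs, deadline):
    """Hypothetical feasibility test: longest-first greedy packing."""
    finish = [0.0] * num_procs
    for d in sorted(durations, reverse=True):
        # Place each subprocess on the processor that frees up earliest.
        finish[finish.index(min(finish))] += d
    return max(finish) <= deadline


def min_processors(durations, deadline):
    """Smallest processor count found by the S51 to S54 loop."""
    if max(durations) > deadline:
        raise ValueError("a single subprocess exceeds the target time")
    # S51: Np = (sum of subprocess times) / (target time), rounded up.
    n = max(1, math.ceil(sum(durations) / deadline))
    while not fits(durations, n, deadline):  # S52, S53
        n += 1                               # S54: Np = Np + 1
    return n


print(min_processors([4, 3, 3, 2], 6))  # 2
print(min_processors([5, 5, 5], 8))     # 3
```

In the second call the initial estimate is 2 processors, but two subprocesses of length 5 cannot share a processor within a target time of 8, so the loop increments Np once.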
【００９４】
§４：プロセス割当て方法２の説明・・・図４参照
図４はプロセス割当て処理フローチャート（その２）である。プロセス割当て方法２は、サブプロセスの静的スケジューリング手段３５が、そのプロセス単体の処理時間が最小となるスケジュールを得る方法（プロセッサ数制約付スケジューリング）であり、以下、図４に基づいて説明する。なお、以下の処理はサブプロセスの静的スケジューリング手段３５が行う処理であり、Ｓ６１〜Ｓ６４は各処理ステップを示す。
【００９５】
サブプロセスの静的スケジューリング手段３５は静的スケジューリングを行う時、指定したプロセッサ数でそのプロセスが終了するための処理時間Ｔが最小となるサブプロセススケジュールを求める。従って、先ず、前記処理時間Ｔを、Ｔ＝（サブプロセスの処理時間の総和）／（目的のプロセッサ数）の式により求める（Ｓ６１）。
【００９６】
次に、処理時間≦Ｔの条件で割当てを試み（Ｓ６２）、割当てが可能か否かを判断する（Ｓ６３）。その結果、割当てが可能であれば処理を終了するが、割当てが不可能であれば、Ｔを更新｛Ｔ＝Ｔ＋（単位時間）｝し（Ｓ６４）、前記Ｓ６２の処理から繰り返して行う。このようにして、前記Ｓ６３の処理で割当てが可能になった時点で、そのプロセス単体の処理時間が最小となるスケジュールを得ることができる。
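The loop of steps S61 to S64 (processor-count-constrained scheduling) can be sketched in the same way. As before, the independence of subprocesses and the greedy feasibility test `fits` are assumptions made for illustration; the unit time added in S64 is exposed as the hypothetical parameter `step`. The result is accurate only to within one `step` of what the feasibility test can achieve.

```python
def fits(durations, num_procs, limit):
    """Hypothetical feasibility test: longest-first greedy packing."""
    finish = [0.0] * num_procs
    for d in sorted(durations, reverse=True):
        # Place each subprocess on the processor that frees up earliest.
        finish[finish.index(min(finish))] += d
    return max(finish) <= limit


def min_processing_time(durations, num_procs, step=1.0):
    """Smallest processing time T found by the S61 to S64 loop."""
    # S61: T = (sum of subprocess times) / (target number of processors).
    t = sum(durations) / num_procs
    while not fits(durations, num_procs, t):  # S62, S63
        t += step                             # S64: T = T + (unit time)
    return t


print(min_processing_time([4, 3, 3, 2], 2))  # 6.0
print(min_processing_time([5, 5, 5], 2))     # 10.5
```

In the second call the ideal bound 7.5 is not reachable with two processors, so T grows in unit steps until the packing (which needs 10) fits.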
【００９７】
§５：プロセス割当て方法３の説明・・・図５参照
図５はプロセス割当て処理フローチャート（その３）である。また、図６はプロセス割当て処理説明図（その１）である。プロセス割当て方法３は、サブプロセスの静的スケジューリング手段３５が、そのプロセスが要求するプロセッサ間データ転送量が最小となるスケジュールを得る方法であり、以下、図５、図６に基づいて説明する。なお、以下の処理はサブプロセスの静的スケジューリング手段３５が行う処理であり、Ｓ７１〜Ｓ７９は各処理ステップを示す。また、Ｎはサブプロセスの数、Ｍｉはｉ分割の分割パターン数である。
【００９８】
サブプロセスの静的スケジューリング手段３５は、先ず、プロセスを、サブプロセスの数をＮとした時、ｉ＝１からＮまで分割を試みる。すなわち、ｉ＝１〜Ｎループとして（Ｓ７１）、プロセスを分割する（Ｓ７２）。この場合、ｉ分割する時の分割パターンの数をＭｉとすると、分割パターンは、ｊ＝１からＭｉループまで存在する（Ｓ７３）。
【００９９】
例えば、Ｎ＝３で、｛１，２，３｝を２分割する場合、｛｛１，２｝，｛３｝｝，｛｛１｝，｛２，３｝｝のＭ２＝２パターン存在する。従って、ｊ＝１〜Ｍｉループとして（Ｓ７３）、分割されたプロセスに対してスケジューリングを行い、各サブプロセスの実行開始時刻を求める（Ｓ７４）。この時、条件（時間制約）を満たすことができるか否かを判断し（Ｓ７５）、前記条件を満たせばＳ７６へ進み、満たさなければ、Ｓ７８へ進み、別の分割が試される。
【０１００】
前記条件を満たした場合（時間制約を満たす場合）、その時の各分割間のデータ転送量の最大値が最小となるか否かを判断し（Ｓ７６）、最小とならなければＳ７８の処理を行い、最小となれば、データ転送量の最大値が最小となるものをメモリ等に保存する（Ｓ７７）。
【０１０１】
これを繰り返し実行し、全ての分割パターンに対して行う。そして、２重ループ（Ｓ７８、Ｓ７９）が終了した時、前記メモリ等に保存されているスケジュールをそのプロセスのサブプロセススケジュールとする。
【０１０２】
以下、図６に基づいて、前記プロセス割当て方法３の処理を具体的に説明する。図６に示したプロセスは、８個のサブプロセス｛１，２，３，４，５，６，７，８｝から構成される。これをｉ＝３の時、すなわち、３分割の場合、例えば、｛１，５，７｝，｛２，４｝，｛３，６，８｝のように分割される。
【０１０３】
この結果に対してスケジューリングを行い、３プロセッサに対して、各サブプロセスの開始時刻を決定する。そして、条件（時間制約）を満たすスケジュールが存在すれば、転送量を計算し、その内の最大値が、以前に求めた割当てより小さい場合は、これを保存する。
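The double loop of steps S71 to S79 can be sketched in Python as follows. This is only one possible reading and rests on several assumptions that are not in the source: the subprocesses form a chain, only contiguous splits are enumerated (which matches the N = 3 example in paragraph [0099], where M2 = 2, but not the non-contiguous grouping shown for FIG. 6), the time constraint is modeled as "the total duration of each block must not exceed `deadline`", and `transfers[k]` is the hypothetical data amount passed from subprocess k+1 to subprocess k+2.

```python
from itertools import combinations


def best_partition(durations, transfers, deadline):
    """Return (peak_transfer, blocks) with the smallest peak, or None."""
    n = len(durations)
    best = None
    for i in range(1, n + 1):                          # S71: i = 1..N
        for cuts in combinations(range(1, n), i - 1):  # S73: j = 1..Mi
            bounds = (0, *cuts, n)
            spans = list(zip(bounds, bounds[1:]))
            # S74, S75: assumed time constraint on every block.
            if any(sum(durations[a:b]) > deadline for a, b in spans):
                continue
            # S76: largest transfer crossing a block boundary.
            peak = max((transfers[c - 1] for c in cuts), default=0)
            if best is None or peak < best[0]:
                # S77: remember the best split so far (1-based numbering).
                blocks = [list(range(a + 1, b + 1)) for a, b in spans]
                best = (peak, blocks)
    return best


print(best_partition([2, 2, 2, 2], [5, 1, 4], 4))  # (1, [[1, 2], [3, 4]])
```

In the usage line, a single block is too slow, and the only two-way split that meets the time constraint cuts the chain where the transfer amount is 1, so that split is kept over the finer splits, which would cross the heavier links.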
【０１０４】
§６：プロセス割当て方法４の説明・・・図７参照
図７はプロセス割当て処理説明図（その２）である。プロセス割当て方法４は、サブプロセスのスケジュール情報を、複数の条件により求めておき、プロセスの要求条件に応じて適当なスケジュールを選択して割当てる方法であり、以下、図７に基づいて説明する。
【０１０５】
この処理では、前記サブプロセスの静的スケジューリング手段３５がＮ個で構成されており、これらのＮ個の手段によりサブプロセスの静的スケジューリングを行う。前記静的スケジューリングを行う時、前記Ｎ個のサブプロセスの静的スケジューリング手段（１〜Ｎ）により、予め求めておいた複数のサブプロセススケジュールから、動的プロセス割当て時に最適なサブプロセススケジュール結果を選択して使用する。
【０１０６】
１．現在のプロセッサへのプロセス割当て状況（どのプロセッサが空いているか）
２．現在のプロセッサ間のデータ転送量
３．将来のプロセス割当て解除情報（いつ、どのプロセッサから、どのプロセスが割当て解除されるか）
前記処理において、前記１〜３に示した３つの情報をプロセスに割当てる毎に更新している。次のプロセスを割当てる時、この３つの情報から例えば、プロセッサの空きが少ない場合は、要求プロセッサ数が最小となるサブプロセッサ割当てを選択してプロセス割当てを行う。また、データ転送量の空きが少ない時には、データ転送量が最小となるサブプロセッサ割当てを選択してプロセス割当てを行う。
【０１０７】
更に、次に行うプロセスの処理時間が最小となるサブプロセス割当ては、この処理時間と割当て解除になる時間との和が、このプロセスの時間制約以内に納まる時使用される。
【０１０８】
§７：記録媒体とプログラムの説明
前記静的スケジューリング装置３３のサブプロセスの静的スケジューリング手段３５が行うサブプロセスの静的スケジューリング処理は、静的スケジューリング装置３３内のＣＰＵがプログラムを実行することにより、次のようにして実現する。
【０１０９】
静的スケジューリング装置３３にはハードディスク装置が設けてあり、このハードディスク装置の記録媒体（ハードディスク）に、前記処理を実現するためのプログラムやその他の各種データ等を格納しておく。そして、前記処理を行う場合は、ＣＰＵの制御によりハードディスク装置の記録媒体に格納されている前記プログラムやデータを読み出して静的スケジューリング装置３３内に設けたメモリに取り込む。
【０１１０】
その後、ＣＰＵが前記メモリに格納してあるプログラムの内、必要なプログラムから順次読み出して実行することにより、前記処理を行う。なお、前記ハードディスク装置の記録媒体に記録するプログラムは、次のようにして記録（記憶）する。
【０１１１】
▲１▼：フレキシブルディスク（フロッピィディスク）に格納されているプログラム（他の装置で作成したプログラムデータ）を、静的スケジューリング装置３３に設けたフレキシブルディスクドライブ装置により読み取り、ハードディスク装置の記録媒体（ハードディスク）に格納する。
【０１１２】
▲２▼：光磁気ディスク、或いはＣＤ−ＲＯＭ等の記憶媒体に格納されているデータを、静的スケジューリング装置３３に設けたドライブ装置により読み取り、ハードディスク装置の記録媒体（ハードディスク）に格納する。
【０１１３】
▲３▼：ＬＡＮ等の通信回線を介して他の装置から伝送されたデータを静的スケジューリング装置３３で受信し、そのデータをハードディスク装置の記録媒体（ハードディスク）に格納する。
【０１１４】
【発明の効果】
以上説明したように、本発明によれば次のような効果がある。
(1) ：サブプロセスの静的スケジューリング手段は、動的プロセス割当てを行うためのサブプロセスのスケジュール情報を、予め、静的に求めておく。このように、動的プロセス割当てを行う時、割当てを行うプロセスを構成するサブプロセスを動的プロセス割当てを行う前に、予め静的にスケジュール情報を求めておく。
【０１１５】
そして、そこで求められたサブプロセスの実行スケジュールからそのプロセスの要求するプロセッサ数やプロセッサ間データ転送量を基に、動的プロセス割当てを行うことで、全体の処理の効率の良い実現に寄与するところが大である。
【０１１６】
(2) ：サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、そのプロセスが要求するプロセッサ数最小となるスケジュール情報を得る。このように、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、そのプロセスが要求するプロセッサ数が最小となるサブプロセススケジュールを求めることで、プロセッサ数の少ないシステムに適した動的プロセス割当てを行うことができる。
【０１１７】
(3) ：サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、そのプロセス単体の処理時間が最小となるスケジュール情報を得る。このように、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、そのプロセス単体の処理時間が最小となるサブプロセススケジュールを求めることで、全プロセスの処理時間の短い動的プロセス割当てを行うことができる。
【０１１８】
(4) ：サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、そのプロセスが要求するプロセッサ間データ転送量が最小となるスケジュール情報を得る。このように、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、そのプロセスが要求するプロセッサ間データ転送量の最大値が最小となるサブプロセススケジュールを求めることで、プロセッサ間データ転送のための、容量が小さなシステムに適した動的プロセス割当てを行うことができる。
【０１１９】
(5) ：サブプロセスの静的スケジューリング手段は、サブプロセスのスケジュール情報を求める時、前記サブプロセスのスケジュール情報を、複数の条件により求めておき、プロセスの要求条件に応じて適当なスケジュール情報を選択して割当てる。
【０１２０】
このようにすれば、サブプロセスの静的スケジューリング手段が静的スケジューリングを行う時、複数のサブプロセススケジュールを求めておき、その中から動的プロセス割当て時に、最適なサブプロセススケジュールを選択することで、その時の割当て状況に適応した動的プロセス割当てを行うことができる。
【０１２１】
(6) ：請求項６では、静的スケジューリング装置のサブプロセスの静的スケジューリング手段は、動的プロセス割当てを行うためのサブプロセスのスケジュール情報を、予め、静的に求めておく。このように、動的プロセス割当てを行う時、割当てを行うプロセスを構成するサブプロセスを動的プロセス割当てを行う前に、予め静的にスケジュール情報を求めておく。
【０１２２】
そして、そこで求められたサブプロセスの実行スケジュールからそのプロセスの要求するプロセッサ数やプロセッサ間データ転送量を基に、動的プロセス割当てを行うことで、全体の処理の効率の良い実現に寄与するところが大である。
【０１２３】
(7) ：請求項７では、コンピュータが、前記記録媒体のプログラムを読み出して実行することにより、動的プロセス割当てを行うためのサブプロセスのスケジュール情報を、予め、静的に求めておく。このように、動的プロセス割当てを行う時、割当てを行うプロセスを構成するサブプロセスを動的プロセス割当てを行う前に、予め静的にスケジュール情報を求めておく。
【０１２４】
そして、そこで求められたサブプロセスの実行スケジュールからそのプロセスの要求するプロセッサ数やプロセッサ間データ転送量を基に、動的プロセス割当てを行うことで、全体の処理の効率の良い実現に寄与するところが大である。
【図面の簡単な説明】
【図１】本発明の実施の形態におけるシステムの説明図である。
【図２】本発明の実施の形態におけるサブプロセスの処理説明図である。
【図３】本発明の実施の形態におけるプロセス割当て処理フローチャート（その１）である。
【図４】本発明の実施の形態におけるプロセス割当て処理フローチャート（その２）である。
【図５】本発明の実施の形態におけるプロセス割当て処理フローチャート（その３）である。
【図６】本発明の実施の形態におけるプロセス割当て処理説明図（その１）である。
【図７】本発明の実施の形態におけるプロセス割当て処理説明図（その２）である。
【図８】従来例の説明図（その１）である。
【図９】従来例の説明図（その２）である。
【図１０】従来例の説明図（その３）である。
【図１１】従来例の説明図（その４）である。
【図１２】従来例の説明図（その５）である。
【図１３】従来例の説明図（その６）である。
【符号の説明】
１０情報処理ユニット
１１信号処理プロセッサ
１２命令キャッシュ
１３データＲＡＭ
１４、１５リンク制御部
１６メインキャッシュ
１７リンクキャッシュ
１８ＤＲＡＭ
１９ＤＲＡＭコントローラ
２０通信リンク
２５信号処理部
２６通信制御部
３０ホストメモリバス
３１ホストプロセッサ
３３静的スケジューリング装置
３４動的プロセス割当て手段
３５サブプロセスの静的スケジューリング手段

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a process allocation method for dynamically allocating a process that includes a plurality of subprocesses as constituent elements to each processor of a linear array type multiprocessor system in which a plurality of processors are connected in series.
[0002]
In recent years, in fields where real-time processing is required, such as digital communication systems and multimedia systems, processors that can execute signal processing at high speed have come into wide use, beginning with digital signal processors (DSPs), which are fast, low-cost, and flexible enough to handle complex algorithms.
[0003]
However, when several kinds of processing such as image processing, audio processing, and CG are to be performed simultaneously, it is difficult for a single processor to perform them in real time. Furthermore, even processing that a single processor can handle may become infeasible on a single processor when the amount of data to be processed increases or the required speed rises. It is therefore necessary to increase the processing speed by using a plurality of processors.
[0004]
[Prior art]
A conventional example will be described below with reference to the drawings.
§1: Description of Conventional Example 1
Conventionally, when processes are assigned to the processors of a multiprocessor system, they are assigned to free processors in order. This is because, in a linear array type multiprocessor system, communication between processors has only one transfer path (linear topology) or two (ring topology), so there is little need to consider communication time.
[0005]
In addition, this method of finding free processors and assigning them in order has the advantage that allocation takes little time, which is important in dynamic process allocation. However, depending on the application, the amount of data flowing through the links used for data transfer between processors can be considerable. In such a case, the data transfer amount on the link between certain processors becomes large while the data transfer amount on the links between other processors remains small.
[0006]
In such a case, although the processor capacity is still sufficiently free, data transfer cannot be performed, and the entire processing may not be realized in real time.
[0007]
§2: Description of Conventional Example 2
As an allocation method considering the data transfer amount, for example, there is an invention described in JP-A-9-34847. Hereinafter, this example will be described as Conventional Example 2. In Conventional Example 2, a switch is placed on a communication path, the data transfer amount is counted with the switch, and the communication path is changed accordingly, thereby assigning a process in consideration of the data transfer amount.
[0008]
However, this process allocation method is effective only when there are multiple communication paths. A linear array type processor system has only one communication path, or two even in a ring configuration, so the method has almost no effect.
[0009]
§3: Description of Conventional Example 3
A method has already been proposed that makes full use of processor capability by allocating processes so that the maximum data transfer amount between processors is minimized. However, each process allocated here generally requires a plurality of processor resources and is therefore configured as a plurality of connected sub-processes, each of which is processed by a single processor.
[0010]
Therefore, the resource required by the process varies depending on the execution schedule of the sub-process. As for the execution schedule of such a sub-process, a method is known in which a control group is set in advance and the inside of the group is scheduled, as disclosed in JP-A-8-152903.
[0011]
However, because this method statically determines the control groups first, distributing the data transfer load between groups requires changing the control group allocation each time, so load distribution is not performed dynamically.
[0012]
§4: Description of Conventional Example 4
Among systems composed of a plurality of processors, a linear array type multiprocessor system obtained by connecting a plurality of processors in series is described in the previously filed Japanese Patent Application No. 9-221617 (hereinafter referred to as “Conventional Example 4”). The linear array type multiprocessor system shown in Conventional Example 4 has a simple configuration and allows the least expensive system to be built.
[0013]
In the linear array type multiprocessor system, each processor can transfer data directly to either its left or right neighbor, but data can reach any other processor only by passing through intermediate processors. Therefore, when processing is started or stopped dynamically on a plurality of processors at the user's request, the capability of the processors may not be fully utilized, depending on how the processes are allocated.
[0014]
Hereinafter, the outline of Conventional Example 4 will be described with reference to the drawings. FIG. 8 is an explanatory diagram (part 1) of a conventional example, and is an explanatory diagram of a signal processing accelerator (linear array type processor system). The configuration of the signal processing accelerator (linear array type processor system) will be described below with reference to FIG.
[0015]
The signal processing accelerator shown in FIG. 8 includes a plurality of identical information processing units 10. The information processing units 10 are connected to each other and to the host memory bus 30. Each information processing unit 10 includes a signal processor 11, an instruction cache 12, a data RAM 13, link control units 14 and 15, a main cache 16, a link cache 17, and a DRAM controller 19. The signal processor 11 and the data RAM 13 constitute a signal processing unit 25. Further, the link control units 14 and 15, the main cache 16, and the link cache 17 constitute a communication control unit 26. A communication link 20 is connected to the link control units 14 and 15. The information processing units 10 are arranged in series via the communication links 20, and each communicates with its adjacent information processing units 10. By relaying the communication contents from one information processing unit 10 to the next, communication can be performed between any pair of information processing units. Although three information processing units 10 are shown in FIG. 8, the number is not limited to three and is arbitrary.
[0016]
Each information processing unit 10 is connected to the host memory bus 30 via the DRAM controller 19. Further, a host processor 31 is connected to the host memory bus 30. The signal processor 11 is a processor for realizing a signal processing function, and the instruction cache 12 is a cache memory for storing instructions frequently used by the signal processor 11.
[0017]
A program executed by the signal processor 11 is stored in the DRAM 18 in addition to the instruction cache 12. The data RAM 13 is used as a work area for storing intermediate results when the signal processor 11 processes data. The main cache 16 and the link cache 17 are cache memories for storing data to be processed by the signal processor 11.
[0018]
The main cache 16 stores data fetched from the DRAM 18 of the information processing unit 10 itself, and the link cache 17 stores data fetched from other information processing units 10 via the link control units 14 and 15.
[0019]
Even if the data in the main cache 16 is swapped out, if the data becomes necessary, the data can be read again from its own DRAM 18. On the other hand, when the data in the link cache 17 is swapped out, it is necessary to bring the data from another information processing unit 10 through the communication link 20 again.
[0020]
If the main cache 16 and the link cache 17 were a single cache memory, problems would arise. For example, even when the communication load is heavy, storing data from the unit's own DRAM 18 in the cache memory could swap out data acquired from another information processing unit 10. In this example, the main cache 16 and the link cache 17 are therefore separated by function.
[0021]
When performing data processing, the plurality of information processing units 10 perform parallel processing or pipeline processing while communicating with each other. For example, while some information processing units 10 execute image data processing in parallel, some other information processing units 10 can process audio data in parallel.
[0022]
Communication between the plurality of information processing units 10 is performed by the communication link 20. Therefore, the host memory bus 30 is not involved in communication between the plurality of information processing units 10 and can provide a data transfer path to other processes such as an OS process executed by the host processor 31.
[0023]
Each information processing unit 10 writes the processed data in the DRAM 18. The host processor 31 can write data after data processing by accessing the DRAM 18 via the host memory bus 30. The signal processing accelerator of FIG. 8 includes a plurality of information processing units 10 that can communicate without interposing the host memory bus 30 and causes the information processing units 10 to execute parallel processing. Therefore, data processing speed is reduced due to bus contention. Thus, high-speed signal processing can be realized.
[0024]
Also, since a different information processing unit 10 can be assigned to each of a plurality of processes such as an image processing process and a sound processing process, the system is suitable for multimedia signal processing, which requires processing of a plurality of different signals. In addition, the signal processor 25 (signal processor 11, instruction cache 12, data RAM 13), the communication controller 26 (cache memories 16 and 17, and link control units 14 and 15), and the memory (DRAM 18 and DRAM controller 19) can be integrated on a single chip and mounted on a personal computer in the same form as a conventional memory.
[0025]
Therefore, there is an advantage that the cost can overlap with that of the conventional memory, and that the signal processing accelerator embedded in the memory can be utilized by software. Accordingly, it is possible to reduce the cost of adding and expanding hardware and to construct a system that is easy to extend functionally.
[0026]
§5: Explanation of process allocation method ... See FIG. 9
FIG. 9 is an explanatory diagram (part 2) of the conventional example and shows two different process allocation methods (Example 1 in FIG. 9A and Example 2 in FIG. 9B). These two methods are described below with reference to FIG. 9. First, consider the case where process 1 is assigned to processor elements PE1 and PE3 and process 2 is assigned to processor elements PE2 and PE4, as shown in FIG. 9A.
[0027]
If the data transfer amount between the two PEs of one process is M, the data transfer amount between PE1 and PE3 via PE2 is M, and the data transfer amount between PE2 and PE4 via PE3 is also M. Therefore, the data transfer amount is M between PE1 and PE2, 2M between PE2 and PE3, and M between PE3 and PE4.
[0028]
Next, as shown in FIG. 9B, the case where the process 1 is assigned to the processor elements PE1 and PE2 and the process 2 is assigned to the processor elements PE3 and PE4 will be described. In this case, the data transfer amount between PE1 and PE2 is M, the data transfer amount between PE3 and PE4 is M, and no data transfer is performed between PE2 and PE3.
[0029]
If the data transfer capacity between PEs is, for example, 1.5M, then in the case of FIG. 9A (Example 1) the processing of one of the processes cannot be realized, whereas in the case of FIG. 9B (Example 2) both processes can be realized simultaneously. As described above, the data transfer amount of each link varies depending on how the processes are allocated, so that fully simultaneous processing of a plurality of processes is possible in some cases and impossible in others.
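The comparison of FIG. 9A and FIG. 9B can be checked numerically. The following is a minimal sketch, assuming that each process places a load of M on every link lying between its leftmost and rightmost PE; the function names `link_loads` and `feasible` are hypothetical and do not appear in the source.

```python
from typing import Dict, List


def link_loads(num_pes: int, assignment: Dict[str, List[int]], m: float = 1.0) -> List[float]:
    """Return the load on each link of a linear array.

    Link k (0-based) joins PE(k+1) and PE(k+2). Assumption: a process
    loads every link between its leftmost and rightmost PE with m.
    """
    loads = [0.0] * (num_pes - 1)
    for pes in assignment.values():
        lo, hi = min(pes), max(pes)
        for link in range(lo - 1, hi - 1):
            loads[link] += m
    return loads


def feasible(loads: List[float], capacity: float) -> bool:
    """True if no link carries more than the link capacity."""
    return all(load <= capacity for load in loads)


# Example 1 (FIG. 9A) and Example 2 (FIG. 9B), with M = 1.
example_1 = {"process 1": [1, 3], "process 2": [2, 4]}
example_2 = {"process 1": [1, 2], "process 2": [3, 4]}

loads_1 = link_loads(4, example_1)  # [1.0, 2.0, 1.0]
loads_2 = link_loads(4, example_2)  # [1.0, 0.0, 1.0]

print(loads_1, feasible(loads_1, 1.5))
print(loads_2, feasible(loads_2, 1.5))
```

With a link capacity of 1.5M, only the second assignment keeps every link within capacity, which matches the conclusion drawn in the text.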
[0030]
If simultaneous processing is not possible, the overall data processing speed naturally deteriorates. In addition, since how many processor elements will be requested, and when, is completely unknown until an actual request is issued, processes must be allocated dynamically. Therefore, an efficient dynamic process allocation algorithm is required.
[0031]
§6: Description of dynamic process allocation algorithm ... See FIGS. 10 to 13
FIGS. 10 to 13 are explanatory diagrams (parts 3 to 6) of the conventional example, showing a dynamic process allocation algorithm. The dynamic process allocation algorithm is described below with reference to FIGS. 10 to 13.
[0032]
(1): Description of the main routine ... See FIG. 10
FIG. 10 is an explanatory diagram (part 3) of the conventional example and is a flowchart showing the main routine of the dynamic process allocation algorithm. The main routine is described below with reference to FIG. 10. S1 to S5 indicate the processing steps.
[0033]
This dynamic process allocation algorithm is executed by a resource management program (not shown). It allocates resources (here, a resource means a PE) based on the following two indexes. The first index is that the data transfer of the process being allocated overlaps with other data transfers as little as possible. The second index is that, when the next process is allocated after this allocation, its data transfer overlaps with other data transfers as little as possible.
[0034]
First, the first index is set so that the maximum value of the data transfer amount on the communication link 20 caused by the process to be allocated is minimized. Next, among the allocation methods having the same maximum value, an allocation method is obtained by using the second index so that the next allocation process is not inhibited as much as possible.
[0035]
As shown in FIG. 10, this algorithm obtains the allocation separately for the case where one PE is requested and the case where a plurality of PEs are requested. This is because, when only one PE is requested, the process causes no data transfer on the communication link 20, so only the influence on the next process allocation needs to be considered.
[0036]
On the other hand, when a plurality of PEs are allocated, since data transfer via the communication link 20 is necessary, the efficiency of the process itself is affected by how the processes are allocated.
[0037]
In step S1 of FIG. 10, the number of PEs that can be used as free resources is checked. If there is no PE that can be used, the algorithm is terminated, and if it exists, the process proceeds to step S2. In step S2, it is determined whether the required number of PEs is one. If there is one, the process proceeds to step S3, and if there is more than one, the process proceeds to step S4.
[0038]
In step S3, an assignment is made for one PE request. If the allocation is unsuccessful, the algorithm is terminated. If the allocation is successful, the process proceeds to step S5. In step S4, allocation is performed for a plurality of PE requests. If the allocation fails, the algorithm is terminated. If the allocation is successful, the process proceeds to step S5. In step S5, the process ID is updated. That is, a new process ID is assigned to a new process. This ends the algorithm.
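The control flow of steps S1 to S5 can be sketched as follows. This is only an illustration under stated assumptions: the allocators for S3 and S4 are passed in as functions, `first_fit` is a hypothetical stand-in rather than the allocation method of FIGS. 11 and 12, and the process ID is modeled as a plain integer.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical allocator signature: (free PEs, requested count) -> chosen PEs or None.
Allocator = Callable[[List[int], int], Optional[List[int]]]


def first_fit(free_pes: List[int], requested: int) -> Optional[List[int]]:
    """Stand-in allocator (assumption): take the lowest-numbered free PEs."""
    if len(free_pes) < requested:
        return None
    return sorted(free_pes)[:requested]


def allocate_process(
    free_pes: List[int],
    requested: int,
    allocate_one: Allocator,
    allocate_many: Allocator,
    process_id: int,
) -> Optional[Tuple[int, List[int]]]:
    """Main routine: returns (process ID, chosen PEs), or None on failure."""
    if not free_pes:                      # S1: no usable PE, terminate
        return None
    if requested == 1:                    # S2: one PE or several?
        chosen = allocate_one(free_pes, 1)            # S3
    else:
        chosen = allocate_many(free_pes, requested)   # S4
    if not chosen:                        # allocation failed, terminate
        return None
    return process_id, chosen             # S5: the new process receives its ID


print(allocate_process([1, 2, 4], 2, first_fit, first_fit, process_id=7))
```

The caller is expected to advance the ID counter after a successful return; that bookkeeping is left out to keep the sketch short.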
[0039]
(2): Explanation of allocation for a one-PE request ... See FIG. 11
FIG. 11 is an explanatory diagram (part 4) of the conventional example and is a flowchart showing the allocation process for a one-PE request in step S3 of FIG. 10. S11 to S17 indicate the processing steps.
[0040]
In step S11, an available PE is searched. Next, in step S12, a loop is formed for all available PEs. That is, the following steps are repeated for all available PEs. In step S13, one PE is provisionally allocated. In step S14, the next allocation efficiency is calculated. The result of calculating the allocation efficiency is referred to as “Result”.
[0041]
In step S15, the PE for which Result is smallest so far is retained. In step S16, the loop is terminated. In step S17, the PE whose Result is the minimum is actually assigned to the process. The process ends here.
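Steps S11 to S17 amount to trying each free PE in turn and keeping the one that leaves the remaining free PEs most tightly clustered. The sketch below assumes PEs are numbered 1, 2, 3, ... along the array, uses the link-count measure of FIG. 13 as Result, and breaks ties in favor of the lowest-numbered PE; the tie-breaking rule and the function names are assumptions, not taken from the source.

```python
from typing import List, Optional


def next_allocation_efficiency(available: List[int]) -> int:
    """Result of FIG. 13: links between the leftmost and rightmost free PE."""
    if not available:
        return 0
    return max(available) - min(available)


def allocate_one_pe(available: List[int]) -> Optional[int]:
    """Choose one PE so that Result for the remaining free PEs is minimal."""
    best_pe: Optional[int] = None
    best_result: Optional[int] = None
    for pe in sorted(available):                          # S12: loop over free PEs
        remaining = [p for p in available if p != pe]     # S13: provisional allocation
        result = next_allocation_efficiency(remaining)    # S14: compute Result
        if best_result is None or result < best_result:   # S15: keep the minimum
            best_pe, best_result = pe, result
    return best_pe                                        # S17: actual allocation


# Free PEs are 1, 3 and 4: taking PE1 leaves {3, 4}, the tightest cluster.
print(allocate_one_pe([1, 3, 4]))
```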
[0042]
(3): Explanation of allocation for a request for a plurality of PEs ... See FIG. 12
FIG. 12 is an explanatory diagram (part 5) of the conventional example and is a flowchart showing allocation for a request for a plurality of PEs. The allocation for a multiple-PE request in step S4 of FIG. 10 is described below. S21 to S31 indicate the processing steps.
[0043]
In step S21, an available PE is searched. In step S22, the first loop is configured for all combinations of the required number of available PEs. That is, the following steps are repeated for all combinations of the required number of available PEs.
[0044]
In step S23, the data transfer amount on the communication link 20 produced by this allocation of the process is calculated. In step S24, the allocation giving the minimum data transfer amount on the communication link 20 is retained. In step S25, the first loop is terminated. In step S26, the allocations that minimize the increase in the data transfer amount on the communication link 20 are taken as the allocations selected in the first loop, and the second loop is formed over all of the selected allocations.
[0045]
In step S27, a plurality of PEs are provisionally allocated according to the selected allocation. In step S28, the next allocation efficiency is calculated, and the result is denoted Result. In step S29, the allocation with the smallest Result so far is retained. In step S30, the second loop is terminated. In step S31, the PEs are actually assigned to the process according to the allocation with the smallest Result. The process ends here.
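The two loops of FIG. 12 can be illustrated as a two-stage filter. The sketch below is one possible reading, not the source's exact procedure: the first loop is taken to minimize the peak load on the links the new process would use, each process is assumed to add a load `m` to every link between its leftmost and rightmost PE, and ties in the second loop go to the first candidate found. All function and parameter names are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Tuple


def next_allocation_efficiency(available: List[int]) -> int:
    """Result of FIG. 13: links between the leftmost and rightmost free PE."""
    return max(available) - min(available) if available else 0


def allocate_many_pes(
    available: List[int], requested: int, current_loads: List[float], m: float = 1.0
) -> Optional[Tuple[int, ...]]:
    """Pick `requested` PEs; current_loads[k] is the load on link PE(k+1)-PE(k+2)."""
    if requested < 2:
        raise ValueError("use the one-PE allocation for a single PE")
    best_peak: Optional[float] = None
    candidates: List[Tuple[int, ...]] = []
    # First loop (S22 to S25): all combinations of the required number of free PEs.
    for combo in combinations(sorted(available), requested):
        used_links = range(combo[0] - 1, combo[-1] - 1)
        peak = max(current_loads[k] + m for k in used_links)   # S23
        if best_peak is None or peak < best_peak:               # S24
            best_peak, candidates = peak, [combo]
        elif peak == best_peak:
            candidates.append(combo)
    # Second loop (S26 to S30): among the retained allocations, minimize Result.
    best_combo: Optional[Tuple[int, ...]] = None
    best_result: Optional[int] = None
    for combo in candidates:
        remaining = [p for p in available if p not in combo]    # S27
        result = next_allocation_efficiency(remaining)          # S28
        if best_result is None or result < best_result:         # S29
            best_combo, best_result = combo, result
    return best_combo                                           # S31


# Six PEs; an existing process on PE2 and PE3 already loads link PE2-PE3.
chosen = allocate_many_pes([1, 4, 5, 6], 2, [0.0, 1.0, 0.0, 0.0, 0.0])
print(chosen)
```

In this example every pair that includes PE1 would cross the already loaded link, so the first loop keeps only pairs drawn from PE4 to PE6, and the second loop prefers PE5 and PE6 because PE1 and PE4 then remain closest together.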
[0046]
(4): Explanation of the next allocation efficiency calculation process ... See FIG. 13
FIG. 13 is an explanatory diagram (part 6) of the conventional example and is a flowchart showing the next allocation efficiency calculation process in step S14 of FIG. 11 and step S28 of FIG. 12. This calculation process is described below with reference to FIG. 13. S41 to S43 indicate the processing steps.
[0047]
In step S41, the leftmost PE among the available PEs is selected as PE-L. In step S42, the rightmost PE among the available PEs is selected as PE-R. In step S43, the number of communication links from PE-L to PE-R is obtained, and this number of communication links is defined as Result. The process ends here.
[0048]
In the flowchart of FIG. 13, the leftmost PE and the rightmost PE are selected, and the number of communication links between the two PEs is obtained. That is, the allocation efficiency for the next process is measured by this number of communication links. This can be understood as follows. A small number of communication links between the two PEs means that the available PEs are clustered together. Conversely, a large number of communication links between the two PEs means that the available PEs are spread over a wide range.
[0049]
If a process is allocated to PEs clustered in a narrow range, the number of PEs interposed between the allocated PEs is naturally small, and the maximum value of the data transfer amount after allocation is likely to be small. If a process is allocated to PEs that are separated from each other over a wide range, the number of intervening PEs is large, and its data transfer is highly likely to overlap with the data transfer of another process.
[0050]
That is, the flowchart of FIG. 13 provides an index that favors leaving the available PEs that remain after the process allocation within as narrow a range as possible. In other words, it provides an index for making the data transfer as efficient as possible when the remaining available PEs are allocated to the next process.
[0051]
[Problems to be solved by the invention]
The conventional apparatus as described above has the following problems.
(1): In Conventional Example 1, even though sufficient processor capacity is still available, data transfer cannot be performed, and thus the entire processing may not be realizable in real time.
[0052]
(2): In Conventional Example 2, the process allocation method is effective if there are a plurality of communication paths, but a linear array type processor system has only one communication path, or two even in a ring configuration, so there is almost no effect.
[0053]
(3): In Conventional Example 3, since the control group is statically determined first, distributing the load of the data transfer amount between the groups requires changing the allocation of the control group each time, and load distribution is not performed dynamically.
[0054]
(4): In the dynamic process allocation in the linear array processor system shown in the conventional example 4, the process to be allocated requires a plurality of processor resources. Therefore, in general, each process has a configuration in which a plurality of sub-processes processed by a single processor are connected. Depending on the execution schedule of the sub-process, the number of processors required by the process and the data transfer amount are variable.
[0055]
Therefore, there is a problem that the execution schedule of the sub-process affects the increase / decrease in the data transfer amount between the processors due to the dynamic process allocation, and this makes it impossible to fully demonstrate the capacity of the entire system.
[0056]
In addition, if a sub-process execution schedule is used that keeps the number of processors and the data transfer amount required by the process small, the execution speed of other processes may be satisfied, but the processing speed of the process itself decreases, and there is a problem that its processing cannot be realized.
[0057]
The present invention solves these conventional problems. Its object is to enable efficient processing by having static scheduling means obtain the execution schedule of the sub-processes in advance, so that dynamic process allocation in a linear array type processor system can make maximum use of the capacity of the entire system.
[0058]
[Means for Solving the Problems]
In order to achieve the above object, the present invention is configured as follows.
  (1): A process allocation method for dynamically allocating a process including a plurality of sub-processes as constituent elements to each processor of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising: a procedure in which sub-process static scheduling means statically obtains in advance, for the sub-processes constituting the process to be allocated, execution schedule information including the number of processors required by the process and the inter-processor data transfer amount; and a procedure in which dynamic process allocation means receives the obtained sub-process execution schedule information and performs dynamic process allocation to each of the processors based on the execution schedule information. In addition, when performing the static scheduling, the sub-process static scheduling means obtains a sub-process schedule that minimizes the number of processors required for the process to finish within a specified time, by a procedure of attempting allocation with the determined number of processors, determining whether the allocation is possible, increasing the number of processors and determining again whether the allocation is possible if it is not, and, when the allocation becomes possible, obtaining the schedule that minimizes the number of processors required by the process.
[0059]
  (2): A process allocation method for dynamically allocating a process including a plurality of sub-processes as constituent elements to each processor of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising: a procedure in which sub-process static scheduling means statically obtains in advance execution schedule information including the number of processors required by the process and the inter-processor data transfer amount; and a procedure in which dynamic process allocation means receives the obtained sub-process execution schedule information and performs dynamic process allocation to each processor based on the execution schedule information. In addition, when performing the static scheduling, the sub-process static scheduling means obtains a sub-process schedule that minimizes the processing time required to complete the process with the specified number of processors, by a procedure of obtaining a processing time by dividing the total processing time of the sub-processes by the target number of processors, and a procedure of attempting allocation under the condition of a processing time T equal to or greater than the obtained processing time, determining whether the allocation is possible, increasing T and repeatedly determining again whether the allocation is possible if it is not, and, when the allocation becomes possible, obtaining the schedule that minimizes the processing time of the process alone.
[0060]
  (3): A process allocation method for dynamically allocating a process including a plurality of sub-processes as constituent elements to each processor of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising: a procedure in which sub-process static scheduling means statically obtains in advance execution schedule information including the number of processors required by the process and the inter-processor data transfer amount; and a procedure in which dynamic process allocation means receives the obtained sub-process execution schedule information and performs dynamic process allocation to each processor based on the execution schedule information. In addition, when performing the static scheduling, the sub-process static scheduling means performs: a procedure of dividing the process into up to as many parts as there are sub-processes; a procedure of scheduling the divided process and obtaining the execution time of each sub-process; a procedure of determining whether the time constraint condition is satisfied at this time, trying another division if it is not, and repeating this determination; a procedure of determining, when the condition is satisfied, whether the maximum value of the data transfer amount between the divisions at that time is the minimum so far, and storing it if so; and a procedure of performing this processing for all division patterns and setting the stored schedule whose maximum data transfer amount is the minimum as the sub-process schedule of the process.
[0061]
  (4): A process allocation method for dynamically allocating a process including a plurality of sub-processes as constituent elements to each processor of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising: a procedure in which sub-process static scheduling means statically obtains in advance execution schedule information including the number of processors required by the process and the inter-processor data transfer amount; and a procedure in which dynamic process allocation means receives the obtained sub-process execution schedule information and performs dynamic process allocation to each processor based on the execution schedule information. In addition, the sub-process static scheduling means is composed of a plurality of sub-process static scheduling means, and the method comprises: a procedure of selecting and using, from the plurality of sub-process schedules obtained in advance, the sub-process schedule result that is optimum for dynamic process allocation; and a procedure of updating, for each process, a plurality of pieces of information including the current process allocation status of the processors, the current data transfer amount between the processors, and future process deallocation information, and, when the next process is allocated, selecting and assigning the optimum one based on the plurality of pieces of information.
[0065]
(Function)
The operation of the present invention based on the above configuration will be described.
(a): Action of (1) above
The sub-process static scheduling means statically obtains the sub-process schedule information for performing dynamic process allocation in advance. As described above, when dynamic process allocation is performed, schedule information is statically obtained in advance before performing dynamic process allocation for sub-processes constituting the allocation process.
[0066]
Then, based on the sub-process execution schedule obtained there, dynamic process allocation is performed using the number of processors required by the process and the inter-processor data transfer amount, which contributes greatly to efficient realization of the entire processing.
[0067]
(b): Action of (2) above
When obtaining the sub-process schedule information, the sub-process static scheduling means obtains the schedule information that minimizes the number of processors required by the process. In this way, when the sub-process static scheduling means performs static scheduling, obtaining a sub-process schedule that minimizes the number of processors required by the process makes it possible to perform dynamic process allocation suited to a system with a small number of processors.
[0068]
(c): Action of (3) above
When obtaining the sub-process schedule information, the sub-process static scheduling means obtains the schedule information that minimizes the processing time of the process alone. In this way, when the sub-process static scheduling means performs static scheduling, obtaining a sub-process schedule that minimizes the processing time of the process itself makes it possible to perform dynamic process allocation with a short processing time for all processes.
[0069]
(d): Action of (4) above
When obtaining the sub-process schedule information, the sub-process static scheduling means obtains the schedule information that minimizes the inter-processor data transfer amount required by the process. In this way, when the sub-process static scheduling means performs static scheduling, obtaining a sub-process schedule that minimizes the maximum value of the inter-processor data transfer amount required by the process makes it possible to perform dynamic process allocation suited to a system with a small inter-processor data transfer capacity.
[0070]
(e): Action of (5) above
The sub-process static scheduling means obtains the sub-process schedule information based on a plurality of conditions, and appropriate schedule information is selected and assigned according to the process requirements.
[0071]
In this way, when the sub-process static scheduling means performs static scheduling, a plurality of sub-process schedules are obtained, and the dynamic process allocation selects the optimum sub-process schedule from among them, so that dynamic process allocation adapted to the allocation status at that time can be performed.
[0072]
(f): Action of (6) above
The static scheduling means of the subprocess of the static scheduling apparatus statically obtains the schedule information of the subprocess for performing dynamic process allocation in advance. As described above, when dynamic process allocation is performed, schedule information is statically obtained in advance before performing dynamic process allocation for sub-processes constituting the allocation process.
[0073]
Then, based on the sub-process execution schedule obtained there, dynamic process allocation is performed using the number of processors required by the process and the inter-processor data transfer amount, which contributes greatly to efficient realization of the entire processing.
[0074]
(g): Action of (7) above
By reading and executing the program on the recording medium, the computer statically obtains in advance schedule information of sub-processes for performing dynamic process allocation. As described above, when dynamic process allocation is performed, schedule information is statically obtained in advance before performing dynamic process allocation for sub-processes constituting the allocation process.
[0075]
Then, based on the sub-process execution schedule obtained there, dynamic process allocation is performed using the number of processors required by the process and the inter-processor data transfer amount, which contributes greatly to efficient realization of the entire processing.
[0076]
[Detailed Description of the Invention]
Embodiments of the present invention will be described in detail below with reference to the drawings.
§1: System description ... See Fig. 1
FIG. 1 is an explanatory diagram of the system. Hereinafter, a system used in the present embodiment will be described with reference to FIG. As shown in FIG. 1, this system is a system for performing dynamic process allocation to a linear array type multiprocessor system.
[0077]
The linear array type multiprocessor system is a system in which a plurality of processors 32 (PE1, PE2, PE3, PE4...) Are connected in series, and is a system having the same configuration as that of the conventional example 4. In this case, each of the processors 32 corresponds to the information processing unit 10 of the conventional example 4.
[0078]
A host processor 31 that performs dynamic process allocation to each processor 32 is connected to each processor 32 by a bus. In addition, a static scheduling device 33, which performs static scheduling of sub-processes before the host processor 31 performs dynamic process allocation to the processors 32, is connected to the host processor 31.
[0079]
The host processor 31 is provided with dynamic process allocation means 34 for performing dynamic process allocation to each processor 32. The static scheduling device 33 is provided with sub-process static scheduling means 35 for performing static scheduling of sub-processes.
[0080]
In this case, the static scheduling device 33 may be directly connected to the linear array processor system by bus coupling or the like, but such a connection is not required. For example, it is sufficient that the static scheduling device 33 and the host processor 31 can exchange data by some means; they may also be connected via a communication line such as a LAN and used in a state where data can be transmitted between them.
[0081]
That is, the static scheduling device 33 may be placed near the host processor 31 (for example, in the same personal computer) or at a remote location (in a different apparatus). The method can be implemented as long as data can be transmitted to the device provided with the dynamic process allocation means 34. In this case, the dynamic process allocation means 34 and the sub-process static scheduling means 35 are realized by executing programs.
[0082]
§2: Explanation of process allocation method ... See FIG. 2
FIG. 2 is an explanatory diagram of sub-process processing. The processing of the sub-processes is described below with reference to FIG. 2. In the system shown in FIG. 1, the host processor 31 performs dynamic process allocation to each processor 32 so that processing is carried out by the linear array processor system.
[0083]
In this case, since each process to be allocated generally requires a plurality of processor resources, it is configured as a connection of a plurality of sub-processes, each of which is processed by a single processor. The resources required by the process change depending on how these sub-processes are executed. In the following description, a sub-process is a constituent element of one process, and one process contains a plurality of sub-processes.
[0084]
As described above, before the host processor 31 performs dynamic process allocation, the sub-process static scheduling means 35 provided in the static scheduling device 33 performs static scheduling of the plurality of sub-processes constituting the process to be dynamically allocated.
[0085]
From the sub-process execution schedule information obtained there, the number of processors requested by the process and the inter-processor data transfer amount are transferred to the host processor 31, where the dynamic process allocation means 34 receives them. The dynamic process allocation means 34 of the host processor 31 then performs dynamic process allocation to each processor 32 based on the received number of processors and inter-processor data transfer amount. In this way, the entire processing can be realized efficiently.
[0086]
In this case, as a process allocation method for allocating a process including a sub-process as a component to each processor 32 of the linear array type multiprocessor system, there are the following methods.
[0087]
① Process allocation method 1 is a method in which the sub-process static scheduling means 35 obtains a schedule that minimizes the number of processors required by the process (time-constrained scheduling).
[0088]
② Process allocation method 2 is a method in which the sub-process static scheduling means 35 obtains a schedule that minimizes the processing time of a single process (scheduling with a limited number of processors).
[0089]
③ Process allocation method 3 is a method in which the sub-process static scheduling means 35 obtains a schedule that minimizes the inter-processor data transfer amount requested by the process.
[0090]
④ Process allocation method 4 is a method in which schedule information of the sub-processes is obtained based on a plurality of conditions, and an appropriate schedule is selected and allocated according to the process requirement conditions.
[0091]
§3: Explanation of process allocation method 1 ... See FIG. 3
FIG. 3 is a flowchart (part 1) of process allocation processing. Process allocation method 1 is a method in which the sub-process static scheduling means 35 obtains a schedule that minimizes the number of processors required by the process (time-constrained scheduling), and is described below with reference to FIG. 3. The following processing is performed by the sub-process static scheduling means 35, and S51 to S54 indicate the processing steps.
[0092]
When performing static scheduling, the sub-process static scheduling means 35 obtains a sub-process schedule that minimizes the number Np of processors required for the process to be completed within a specified time. Therefore, first, the number of processors Np is obtained by the equation Np = (total processing time of sub-processes) / (target processing time) (S51).
[0093]
Next, allocation is attempted with Np processors (S52), and it is determined whether the allocation is possible (S53). If the allocation is possible, the process is terminated. If the allocation is impossible, Np is updated (Np = Np + 1) (S54), and the process is repeated from S52. In this way, the schedule that minimizes the number of processors required by the process is obtained at the point where allocation becomes possible in S53.
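The loop S51 to S54 can be sketched as follows. The source does not specify how an allocation attempt is carried out, so this sketch makes several assumptions: sub-processes are treated as independent tasks, the feasibility check `try_allocate` is a hypothetical longest-first greedy packing, and the quotient in S51 is rounded up to an integer.

```python
import math
from typing import List, Optional, Tuple


def try_allocate(durations: List[float], num_procs: int, deadline: float) -> Optional[List[List[int]]]:
    """Hypothetical allocation attempt: longest-first greedy packing.

    Returns the sub-process indices per processor, or None if some
    sub-process cannot finish by the deadline.
    """
    finish = [0.0] * num_procs
    schedule: List[List[int]] = [[] for _ in range(num_procs)]
    for idx in sorted(range(len(durations)), key=lambda i: -durations[i]):
        p = min(range(num_procs), key=lambda k: finish[k])  # least-loaded processor
        if finish[p] + durations[idx] > deadline:
            return None
        finish[p] += durations[idx]
        schedule[p].append(idx)
    return schedule


def min_processors(durations: List[float], deadline: float) -> Tuple[int, List[List[int]]]:
    """Time-constrained scheduling: smallest Np for which allocation succeeds."""
    if max(durations) > deadline:
        raise ValueError("a single sub-process already exceeds the target time")
    np_ = max(1, math.ceil(sum(durations) / deadline))      # S51
    while True:
        schedule = try_allocate(durations, np_, deadline)   # S52
        if schedule is not None:                            # S53
            return np_, schedule
        np_ += 1                                            # S54


# Total time 15 with target time 8 gives Np = 2 first, which fails; Np = 3 succeeds.
print(min_processors([5, 5, 5], 8))
```

Because the greedy check is a heuristic, the count it returns is the smallest one for which this particular attempt succeeds, which may exceed the true optimum for some inputs.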
[0094]
§4: Explanation of process allocation method 2 ... See FIG. 4
FIG. 4 is a flowchart (part 2) of process allocation processing. Process allocation method 2 is a method in which the sub-process static scheduling means 35 obtains a schedule that minimizes the processing time of the single process (processor-constrained scheduling), and is described below with reference to FIG. 4. The following processing is performed by the sub-process static scheduling means 35, and S61 to S64 indicate the processing steps.
[0095]
When performing static scheduling, the sub-process static scheduling means 35 obtains a sub-process schedule that minimizes the processing time T for completing the process with the specified number of processors. Therefore, first, the processing time T is obtained by the equation T = (total processing time of subprocesses) / (target number of processors) (S61).
[0096]
Next, allocation is attempted under the condition of processing time ≦ T (S62), and it is determined whether the allocation is possible (S63). As a result, if the allocation is possible, the process is terminated. If the allocation is impossible, T is updated {T = T + (unit time)} (S64), and the process is repeated from S62. In this way, a schedule that minimizes the processing time of a single process can be obtained at the time when assignment is possible in the processing of S63.
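The loop S61 to S64 mirrors method 1 with the roles of time and processor count exchanged. The following sketch rests on the same assumptions as before: independent sub-processes and a hypothetical greedy feasibility check named `try_allocate`, neither of which is prescribed by the source.

```python
from typing import List, Optional, Tuple


def try_allocate(durations: List[float], num_procs: int, limit: float) -> Optional[List[List[int]]]:
    """Hypothetical allocation attempt: longest-first greedy packing within `limit`."""
    finish = [0.0] * num_procs
    schedule: List[List[int]] = [[] for _ in range(num_procs)]
    for idx in sorted(range(len(durations)), key=lambda i: -durations[i]):
        p = min(range(num_procs), key=lambda k: finish[k])  # least-loaded processor
        if finish[p] + durations[idx] > limit:
            return None
        finish[p] += durations[idx]
        schedule[p].append(idx)
    return schedule


def min_processing_time(
    durations: List[float], num_procs: int, unit: float = 1.0
) -> Tuple[float, List[List[int]]]:
    """Processor-constrained scheduling: smallest T (in steps of `unit`) that fits."""
    t = sum(durations) / num_procs                          # S61
    while True:
        schedule = try_allocate(durations, num_procs, t)    # S62
        if schedule is not None:                            # S63
            return t, schedule
        t += unit                                           # S64


# Three sub-processes of length 5 on two processors: T starts at 7.5 and grows to 10.5.
print(min_processing_time([5, 5, 5], 2))
```

The returned T is accurate only to within one `unit`, since T is raised in fixed steps from the lower bound; a smaller unit gives a tighter result at the cost of more attempts.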
[0097]
§5: Explanation of process allocation method 3 ... See FIGS. 5 and 6
FIG. 5 is a flowchart (part 3) of process allocation processing. FIG. 6 is an explanatory diagram of process allocation processing (part 1). Process allocation method 3 is a method in which the sub-process static scheduling means 35 obtains a schedule that minimizes the inter-processor data transfer amount required by the process, and is described below with reference to FIGS. 5 and 6. The following processing is performed by the sub-process static scheduling means 35, and S71 to S79 indicate the processing steps. N is the number of sub-processes, and Mi is the number of division patterns for division into i parts.
[0098]
The sub-process static scheduling means 35 first attempts to divide the process into i parts for each i from 1 to N, where N is the number of sub-processes. That is, it loops over i = 1 to N (S71) and divides the process into i parts (S72). If the number of division patterns for a division into i parts is Mi, it then loops over the division patterns j = 1 to Mi (S73).
[0099]
For example, when N = 3 and {1,2,3} is divided into two, there are M2 = 2 patterns: {{1,2},{3}} and {{1},{2,3}}. Within the loop over j = 1 to Mi (S73), the divided process is scheduled and the execution start time of each sub-process is obtained (S74). It is then determined whether the condition (time constraint) is satisfied (S75). If the condition is satisfied, the processing proceeds to S76. If not, it proceeds to S78 and another division is tried.
[0100]
When the above condition (time constraint) is satisfied, it is determined whether the maximum value of the data transfer amount between the divisions is the smallest found so far (S76). If not, the processing proceeds to S78. If it is, the schedule whose maximum data transfer amount is the minimum is stored in a memory or the like (S77).
[0101]
This is repeatedly executed for all division patterns. When the double loop (S78, S79) ends, the schedule stored in the memory or the like is set as the sub-process schedule of the process.
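The double loop of S71 to S79 can be sketched as follows. This is an illustrative sketch under several assumptions that are not in the source: the sub-processes form a simple chain, `transfer[k]` is the amount of data passed from sub-process k to sub-process k+1, only divisions into consecutive groups are enumerated (which reproduces the M2 = 2 patterns of the N = 3 example but not non-consecutive groupings such as the one shown for FIG. 6), and the time constraint is reduced to "no group's total duration exceeds `deadline`". The function names are hypothetical.

```python
from itertools import combinations


def contiguous_splits(n, parts):
    """Yield every way to cut sub-processes 0..n-1 into `parts` consecutive groups."""
    for cuts in combinations(range(1, n), parts - 1):
        bounds = (0, *cuts, n)
        yield [list(range(bounds[k], bounds[k + 1])) for k in range(parts)]


def best_division(durations, transfer, deadline):
    """Return (max transfer, groups) for the division with the smallest
    maximum inter-group transfer that meets the simplified time constraint,
    or None if no division qualifies."""
    n = len(durations)
    best = None
    for i in range(1, n + 1):                       # S71: i = 1..N
        for groups in contiguous_splits(n, i):      # S72/S73: j = 1..Mi
            # S74/S75: simplified time constraint (assumption).
            if max(sum(durations[s] for s in g) for g in groups) > deadline:
                continue                            # S78: try another division
            # Data crossing each boundary between consecutive groups.
            crossing = [transfer[g[-1]] for g in groups[:-1]]
            worst = max(crossing, default=0)
            if best is None or worst < best[0]:     # S76
                best = (worst, groups)              # S77: remember this one
    return best                                     # after S78/S79


print(best_division([2, 2, 2], [5, 1], deadline=4))
```

In this example the single-group division violates the deadline, the cut after the first sub-process would send 5 units across a link, and the cut after the second sends only 1 unit, so the latter division is kept.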
[0102]
The processing of process allocation method 3 is described in detail below with reference to FIG. 6. The process shown in FIG. 6 consists of eight sub-processes {1, 2, 3, 4, 5, 6, 7, 8}. When i = 3, that is, for a division into three, the process is divided into, for example, {1, 5, 7}, {2, 4}, and {3, 6, 8}.
[0103]
The result is then scheduled, and the start time of each sub-process is determined for the three processors. If a schedule that satisfies the condition (time constraint) exists, the transfer amount is calculated, and if its maximum value is smaller than that of the previously stored allocation, the schedule is stored.
[0104]
§6: Explanation of process allocation method 4 ... See FIG. 7
FIG. 7 is an explanatory diagram of process allocation processing (part 2). Process allocation method 4 is a method in which sub-process schedule information is obtained under a plurality of conditions and an appropriate schedule is selected and allocated according to the process requirements; it is described below with reference to FIG. 7.
[0105]
In this processing, N sub-process static scheduling means 35 are provided, and the sub-processes are statically scheduled by each of these N means. At the time of dynamic process allocation, the optimum sub-process schedule result is selected from the plurality of sub-process schedules obtained in advance by the N sub-process static scheduling means (1 to N) and is used.
[0106]
1. Current status of process allocation to the processors (which processors are free)
2. Current data transfer amount between processors
3. Future process deallocation information (when, from which processor, and which process will be deallocated)
In this processing, the three pieces of information 1 to 3 above are updated every time a process is allocated. When the next process is allocated, a schedule is chosen on the basis of these three pieces of information. For example, if few processors are free, the sub-process schedule that minimizes the number of requested processors is selected and the process is allocated accordingly. Likewise, when the available data transfer amount is small, the sub-process schedule that minimizes the data transfer amount is selected and the process is allocated accordingly.
[0107]
Furthermore, the sub-process allocation that minimizes the processing time of the next process is used when the sum of this processing time and the time for deallocating is within the time constraints of this process.
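The selection among precomputed schedules can be sketched as a small rule-based function. This is only one possible reading of the description: the `Schedule` fields, the thresholds `few_processors` and `low_transfer`, the order in which the rules are tested, the interpretation of the deallocation time as a waiting time `release_wait`, and the fallback when the fastest schedule misses the time constraint are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    name: str
    processors: int    # number of processors the schedule requests
    max_transfer: int  # largest inter-processor transfer it requires
    time: int          # processing time of the process under this schedule


def select_schedule(schedules, free_processors, free_transfer,
                    release_wait, time_limit,
                    few_processors=2, low_transfer=10):
    """Pick one of the statically precomputed schedules for the next process."""
    # Few free processors: prefer the schedule requesting the fewest.
    if free_processors <= few_processors:
        return min(schedules, key=lambda s: s.processors)
    # Little transfer capacity left: prefer the smallest transfer amount.
    if free_transfer <= low_transfer:
        return min(schedules, key=lambda s: s.max_transfer)
    # Otherwise use the fastest schedule if processing time plus the wait
    # for deallocation still meets the time constraint.
    fastest = min(schedules, key=lambda s: s.time)
    if fastest.time + release_wait <= time_limit:
        return fastest
    # Fallback (assumption): request as few processors as possible.
    return min(schedules, key=lambda s: s.processors)


CANDIDATES = [
    Schedule("min-processors", processors=2, max_transfer=40, time=12),
    Schedule("min-time", processors=4, max_transfer=30, time=6),
    Schedule("min-transfer", processors=3, max_transfer=8, time=9),
]

print(select_schedule(CANDIDATES, free_processors=6, free_transfer=50,
                      release_wait=3, time_limit=10).name)
```

The three candidates stand in for the outputs of the static scheduling means (1 to N); in a real system the free-processor count, transfer headroom, and deallocation times would come from the three pieces of information updated at each allocation.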
[0108]
§7: Description of recording medium and program
The static scheduling processing of the subprocess performed by the static scheduling means 35 of the subprocess of the static scheduling device 33 is realized as follows by the CPU in the static scheduling device 33 executing the program.
[0109]
The static scheduling device 33 is provided with a hard disk device, and a program for realizing the processing, various other data, and the like are stored in a recording medium (hard disk) of the hard disk device. When the processing is performed, the program and data stored in the recording medium of the hard disk device are read out and taken into a memory provided in the static scheduling device 33 under the control of the CPU.
[0110]
Thereafter, the CPU performs the processing by sequentially reading and executing the necessary programs from among the programs stored in the memory. The program to be recorded on the recording medium of the hard disk device is recorded (stored) as follows.
[0111]
(1): A program (program data created by another device) stored on a flexible disk (floppy disk) is read by a flexible disk drive device provided in the static scheduling device 33 and stored in the recording medium (hard disk) of the hard disk device.
[0112]
(2): Data stored in a storage medium such as a magneto-optical disk or CD-ROM is read by a drive device provided in the static scheduling device 33 and stored in a recording medium (hard disk) of the hard disk device.
[0113]
(3): Data transmitted from another device via a communication line such as a LAN is received by the static scheduling device 33, and the data is stored in a recording medium (hard disk) of the hard disk device.
[0114]
[Effect of the invention]
As described above, the present invention has the following effects.
(1): The sub-process static scheduling means statically obtains in advance the schedule information of the sub-processes for performing dynamic process allocation. Thus, when dynamic process allocation is performed, the schedule information of the sub-processes constituting the process to be allocated is obtained statically beforehand.
[0115]
Then, based on the sub-process execution schedule obtained in this way, dynamic process allocation is performed according to the number of processors required by the process and the data transfer amount between processors, which contributes greatly to the efficient realization of the entire processing.
[0116]
(2): When obtaining the sub-process schedule information, the sub-process static scheduling means obtains the schedule information that minimizes the number of processors required by the process. Because the static scheduling thus yields a sub-process schedule that minimizes the number of processors required by the process, dynamic process allocation suited to a system with a small number of processors can be performed.
[0117]
(3): When obtaining the sub-process schedule information, the sub-process static scheduling means obtains the schedule information that minimizes the processing time of the single process. Because the static scheduling thus yields a sub-process schedule that minimizes the processing time of the process itself, dynamic process allocation with a short processing time for all processes can be performed.
[0118]
(4): When obtaining the sub-process schedule information, the sub-process static scheduling means obtains the schedule information that minimizes the inter-processor data transfer amount required by the process. Because the static scheduling thus yields a sub-process schedule that minimizes the maximum value of the inter-processor data transfer amount required by the process, dynamic process allocation suited to a system with a small inter-processor data transfer capacity can be performed.
[0119]
(5): The sub-process static scheduling means obtains the sub-process schedule information based on a plurality of conditions, and appropriate schedule information is selected and assigned according to the process requirements.
[0120]
In this way, when the sub-process static scheduling means performs static scheduling, a plurality of sub-process schedules are obtained, and the optimum one is selected from among them at the time of dynamic process allocation, so that dynamic process allocation adapted to the allocation status at that time can be performed.
[0121]
(6): In claim 6, the sub-process static scheduling means of the static scheduling device statically obtains in advance the schedule information of the sub-processes for performing dynamic process allocation. Thus, when dynamic process allocation is performed, the schedule information of the sub-processes constituting the process to be allocated is obtained statically beforehand.
[0122]
Then, based on the sub-process execution schedule obtained in this way, dynamic process allocation is performed according to the number of processors required by the process and the data transfer amount between processors, which contributes greatly to the efficient realization of the entire processing.
[0123]
(7): In claim 7, the computer reads out and executes the program on the recording medium and statically obtains in advance the schedule information of the sub-processes for performing dynamic process allocation. Thus, when dynamic process allocation is performed, the schedule information of the sub-processes constituting the process to be allocated is obtained statically beforehand.
[0124]
Then, based on the sub-process execution schedule obtained in this way, dynamic process allocation is performed according to the number of processors required by the process and the data transfer amount between processors, which contributes greatly to the efficient realization of the entire processing.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram of a system in an embodiment of the present invention.
FIG. 2 is a process explanatory diagram of a sub process in the embodiment of the present invention.
FIG. 3 is a process allocation process flowchart (part 1) according to the embodiment of the present invention.
FIG. 4 is a process allocation process flowchart (part 2) according to the embodiment of the present invention.
FIG. 5 is a process allocation process flowchart (part 3) according to the embodiment of the present invention.
FIG. 6 is an explanatory diagram of process allocation processing (part 1) according to the embodiment of the present invention.
FIG. 7 is an explanatory diagram of process allocation processing (part 2) according to the embodiment of the present invention.
FIG. 8 is an explanatory diagram of a conventional example (part 1).
FIG. 9 is an explanatory diagram of a conventional example (part 2).
FIG. 10 is an explanatory diagram of a conventional example (part 3).
FIG. 11 is an explanatory diagram of a conventional example (part 4).
FIG. 12 is an explanatory diagram of a conventional example (part 5).
FIG. 13 is an explanatory diagram of a conventional example (part 6).
[Explanation of symbols]
10 Information processing unit
11 Signal processor
12 Instruction cache
13 Data RAM
14, 15 Link control unit
16 Main cache
17 Link cache
18 DRAM
19 DRAM controller
20 Communication link
25 Signal processor
26 Communication control unit
30 Host memory bus
31 Host processor
33 Static scheduling device
34 Dynamic process allocation means
35 Static scheduling means for subprocesses

Claims

A process allocation method for dynamically allocating a process that includes a plurality of sub-processes as constituent elements to the processors of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising:
a step in which sub-process static scheduling means statically obtains, in advance, execution schedule information on the sub-processes constituting the process for use in the dynamic process allocation, the execution schedule information including the number of processors required by the process and the amount of data transferred between processors; and
a step in which dynamic process allocation means receives the obtained sub-process execution schedule information and, based on that information, dynamically allocates the process to the processors,
wherein the sub-process static scheduling means performs:
a procedure of obtaining, when performing static scheduling, a sub-process schedule that minimizes the number of processors required for the process to finish within a specified time; and
a procedure of attempting allocation with the determined number of processors, determining whether the allocation is possible, increasing the number of processors and determining again whether the allocation is possible if it is not, and, once the allocation becomes possible, obtaining the schedule that minimizes the number of processors required by the process.

A process allocation method for dynamically allocating a process that includes a plurality of sub-processes as constituent elements to the processors of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising:
a step in which sub-process static scheduling means statically obtains, in advance, execution schedule information on the sub-processes constituting the process for use in the dynamic process allocation, the execution schedule information including the number of processors required by the process and the amount of data transferred between processors; and
a step in which dynamic process allocation means receives the obtained sub-process execution schedule information and, based on that information, dynamically allocates the process to the processors,
wherein the sub-process static scheduling means performs:
a procedure of obtaining, when performing static scheduling, a sub-process schedule that minimizes the processing time for the process to finish with the specified number of processors;
a procedure of obtaining the processing time by dividing the total processing time of the sub-processes by the target number of processors; and
a procedure of attempting allocation under the condition of a processing time T equal to or greater than the calculated processing time, determining whether the allocation is possible, updating the processing time T and determining again whether the allocation is possible if it is not, and, once the allocation becomes possible, obtaining the schedule that minimizes the processing time of the single process.

A process allocation method for dynamically allocating a process that includes a plurality of sub-processes as constituent elements to the processors of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising:
a step in which sub-process static scheduling means statically obtains, in advance, execution schedule information on the sub-processes constituting the process for use in the dynamic process allocation, the execution schedule information including the number of processors required by the process and the amount of data transferred between processors; and
a step in which dynamic process allocation means receives the obtained sub-process execution schedule information and, based on that information, dynamically allocates the process to the processors,
wherein the sub-process static scheduling means performs:
a procedure of dividing the process, when performing static scheduling, into up to as many parts as there are sub-processes;
a procedure of scheduling the divided process and obtaining the execution time of each sub-process;
a procedure of determining whether the time constraint condition is satisfied, trying another division and repeating the determination if the condition is not satisfied, determining, if the condition is satisfied, whether the maximum value of the data transfer amount between the divisions at that time is the minimum, repeating the processing if it is not, and storing the schedule whose maximum data transfer amount is the minimum if it is; and
a procedure of performing the above procedures for all division patterns and setting the stored schedule whose maximum data transfer amount is the minimum as the sub-process schedule of the process.

A process allocation method for dynamically allocating a process that includes a plurality of sub-processes as constituent elements to the processors of a linear array type multiprocessor system in which a plurality of processors are connected in series, the method comprising:
a step in which sub-process static scheduling means statically obtains, in advance, execution schedule information on the sub-processes constituting the process for use in the dynamic process allocation, the execution schedule information including the number of processors required by the process and the amount of data transferred between processors; and
a step in which dynamic process allocation means receives the obtained sub-process execution schedule information and, based on that information, dynamically allocates the process to the processors,
wherein the sub-process static scheduling means is composed of a plurality of sub-process static scheduling means, and the method further comprises:
a procedure of selecting and using, at the time of dynamic process allocation, the optimum sub-process schedule result from the plurality of sub-process schedules obtained in advance by the plurality of sub-process static scheduling means; and
a procedure of updating, every time a process is allocated, a plurality of pieces of information including the current status of process allocation to the processors, the current data transfer amount between processors, and future process deallocation information, and, when the next process is allocated, selecting the optimum information from the plurality of pieces of information and performing the allocation.