JP3986316B2

JP3986316B2 - Data transfer method and apparatus

Info

Publication number: JP3986316B2
Application number: JP2002009101A
Authority: JP
Inventors: 和幸佐藤; 裕嗣野中; 伸欣寺島; 智行神崎
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2002-01-17
Filing date: 2002-01-17
Publication date: 2007-10-03
Anticipated expiration: 2022-01-17
Also published as: JP2003218896A

Description

【０００１】
【発明の属する技術分野】
本発明は、データ転送方法及び装置に関し、特に伝送装置等のプロセッサのファームウェアおよびデータを旧版から新版へ更新する際のソフトウェア・ダウンロード方法及びこれを用いた装置に関するものである。
【０００２】
近年のネットワーク・システムにおいては、各地域に分散するプロセッサに対してファームウェアを旧版から新版へ更新する際、監視端末からの遠隔操作によるダウンロード機能が要求されている。
このため、各プロセッサに不揮発性の読み書き可能なメモリデバイスを持たせ、これにファームウェア・プログラムやデータベース等を書き込むようなソフトウェア・ダウンロード機能が提供されている。
【０００３】
今後、システムの複雑化とともにダウンロード対象のファームウェア数およびファームウェアのプログラムサイズも増加する傾向にあるため、旧版から新版への更新を円滑に実行し、装置の立ち上げ時間を短縮する必要がある。
【０００４】
【従来の技術】
図15は、従来のデータ転送方法及びこれを用いた装置としてのマルチプロセッサシステムを概念的に示したものである。
この例において、メインプロセッサ1は、n個のサブプロセッサ3_1〜3_n（以下、符号「3」で総称することがある。）に、片方向設定ノードとしてのリピータハブ5_1, 5_21〜5_2p-1（以下、符号「5」で総称することがある。）を介して例えば100BASE-FXにより接続されている。
【０００５】
すなわち、メインプロセッサ1は、ポート数pを有するリピータハブ5_1からｐ-1個のリピータハブ5に接続され、そして、これらのリピータハブ5は、さらにそれぞれp−1個のサブプロセッサに接続されており、以ってリピータハブ5は全体でn個のサブプロセッサ3に接続されている。
【０００６】
そして、メインプロセッサ1は、各サブプロセッサ3の運用状態を、自局に内蔵するデータベースから参照し、接続されている監視端末4からのダウンロード要求コマンドを受けて、運用中または運用予定のサブプロセッサ3へダウンロードを行う。
【０００７】
図16は、このような従来のマルチプロセッサシステムにおけるデータ転送のシーケンスが示されている。なお、この図では、サブプロセッサ31及び32のみを例示し、また図15に示したリピータハブ5は説明の便宜上省略されている。
まず、メインプロセッサ1は、監視端末4からファイル転送要求S20またはダウンロード要求S21を受けると、内蔵しているメモリデバイス6のRAMディスクに対してデータ書込S22を行う。
【０００８】
これに応答して、メモリデバイス6のRAMディスクはデータ読出S23をメインプロセッサ1に対して行うと共に、データ書込S24をメモリデバイス6のフラッシュメモリに対して行い、さらにRAMディスクからFTPデータ転送S25を例えばサブプロセッサ3_1にまず行う。そして、このFTPデータ転送の完了通知S26を同様にしてサブプロセッサ3_1に送る。
【０００９】
サブプロセッサ3_1では、そのCPUからフラッシュメモリに対してデータ書込S27を行うと、フラッシュメモリからサムチェックS28がCPUに返される。これと共に、フラッシュメモリはFTPデータ転送の結果がOKである通知S29をメインプロセッサ1に対して行う。
【００１０】
同様にして、メインプロセッサ1のメモリデバイス6は、別のサブプロセッサ3_2に対しても、FTPデータ転送S30を行い、FTPデータ転送完了通知S31をサブプロセッサ3_2のCPUに送ると、サブプロセッサ3_2のCPUはそのフラッシュメモリに対してデータ書込S32を行うと共に、上記と同様にサムチェックS33をCPUに返し、このFTPデータ転送の結果通知S34をサブプロセッサ3_2のCPUからメインプロセッサ1に送る。
【００１１】
このように、FTPデータ転送技術を用いて、メインプロセッサ1からサブプロセッサ3へポイント−ポイントでデータ転送している。
一方、データをダウンロード（転送）する際、ブロードキャスト(UDP)送信により複数のサブプロセッサに同時にダウンロードを行う方法があるが、この方法では、UDPのプロトコルでは送達確認ができないため、結局、アプリケーションレベルでの送達確認の処理が必要となり、またUDPではパケットの特性上、TCPに比べて信頼性が低いから使用することはできない。
【００１２】
従って上記のように、ポイント-ポイントでデータ転送することになる。
【００１３】
【発明が解決しようとする課題】
このように、メインプロセッサから複数のサブプロセッサへポイント−ポイントでデータ転送しているが、この場合にメインプロセッサのCPUの処理能力が高く、LAN回線の転送レートも非常に高い場合は問題ないが、安価なメインプロセッサを使用してコスト削減するような場合には、サブプロセッサの数が多い程全てのサブプロセッサに対してのダウンロードが終了するまでに膨大な時間を要してしまう。
【００１４】
例えば、100Kbpsで2Mバイトのデータを176台のサブプロセッサに対してダウンロードする場合、1サブプロセッサあたり約20秒かかり、全サブプロセッサの転送には、単純計算で58分(約1時間)が必要となる。
このようにダウンロードに膨大な時間がかかるため、顧客へのサービス提供が遅れ、ビジネスチャンスを逸してしまうという問題があった。
【００１５】
従って本発明は、メインプロセッサから複数のサブプロセッサへデータ転送する方法及びこの方法を用いた装置において、安価なプロセッサを使用しても、データ転送に要する時間を大幅に削減できるようにすることを目的とする。
【００１６】
【課題を解決するための手段】
上記の目的を達成するため、本発明では、メインプロセッサから、パラレル設定可能なノードを介して複数のサブプロセッサにデータを転送する方法であって、該メインプロセッサが、最小ホップ数を各々が与える複数のマルチキャストツリーを予め記憶しており、該サブプロセッサの接続可能数から該ノードのポート数を決定すると共に、該ポート数に対応した該最小ホップ数のマルチキャストツリーを選択し、このマルチキャストツリーにおいて、該最小ホップ数でグループ化された各グループ内で該最小ホップ数を維持するように決定されたパスに沿って該ノードを介して各サブプロセッサに所定のデータを転送することを特徴とした方法が提供される。
【００１７】
また、本発明では、メインプロセッサと、パラレル設定可能なノードと、該ノードを介して該メインプロセッサに接続される複数のサブプロセッサと、を備え、該メインプロセッサが、最小ホップ数を各々が与える複数のマルチキャストツリーを予め記憶しており、該サブプロセッサの接続可能数から該ノードのポート数を決定すると共に、該ポート数に対応した該最小ホップ数のマルチキャストツリーを選択し、このマルチキャストツリーにおいて、該最小ホップ数でグループ化された各グループ内で該最小ホップ数を維持するように決定されたパスに沿って該ノードを介して各サブプロセッサに所定のデータを転送することを特徴としたデータ転送装置が提供される。
【００１９】
今、上記のサブプロセッサの接続可能数をNとし、該ポート数をpとすると、該最大サブプロセッサ数Nmaxが、(p-1)²で２ⁿ（nは自然数）に一番近い数であり、該グループ数Ngrが(log₂Nmax)+1であり、各グループ内のサブプロセッサ数Nsubが２**(log₂Nmax-グループ番号)(**はべき乗を示す)であればよい。
【００２０】
さらに、上記のメインプロセッサは、各グループ毎にサブプロセッサのアクセス順序を決めておき、該ノードを経由したメインプロセッサ−サブプロセッサ間及びサブプロセッサ同士間のパスを決定することができる。
さらに、上記のノードとしてスイッチングハブを用いることができる。
【００２１】
図1には、上記の本発明に係るデータ転送方法を実現する装置としてのマルチプロセッサシステムの概念構成が示されている。この図1の概念構成は、図15に示した従来例の概念構成と比較すると分かるように、リピータハブ5_1, 5_21〜5_2p-1の代わりに、それぞれノードとしてのスイッチングハブ2_1, 2_21〜2_2p-1（以下、符号「2」で総称することがある。）を用いている点が異なっている。
【００２２】
すなわち、このスイッチングハブ2は、ストア・フォワード専用方式のパラレル設定可能なノードであるため、複数のポート間で転送を行う場合に互いに影響されることがないという特徴があり、メインプロセッサ1からサブプロセッサ3への転送を、スイッチングハブ経由でパラレル処理することにより、データ転送時間を短縮するものである。
【００２３】
図1に示す概念構成を分かり易くするため、図2に示すマルチプロセッサシステムでは、ポート数p=5のスイッチングハブ2_1の1つのポートをメインプロセッサ1に接続し、同じくポート数p=5のスイッチングハブ2_21〜2_24の各1つのポートをそのスイッチングハブ2_1の残りのp-1個のポートに接続する。
【００２４】
そして各スイッチングハブ2_21〜2_24の残りの4個のポートには、それぞれ4台のサブプロセッサ3_1〜3_4, 3_5〜3_8, 3_9〜3_12,及び3_13〜3_16がそれぞれ接続され、最終的に5個のスイッチングハブを用いて16台のサブプロセッサを接続している。
【００２５】
このような構成において、本発明の特徴である、メインプロセッサ1が、サブプロセッサ3の接続可能数からノード（スイッチングハブ）2のポート数を決定し、該ポート数に対応した最小ホップ数のマルチキャストツリーを選択し、このマルチキャストツリーに沿って各サブプロセッサ3に所定のデータを転送することについて、以下に図を参照して説明する。
【００２６】
まず、図2に示したマルチプロセッサシステムから、データ転送を実際に行うためのマルチキャストツリーがどのように選択されるかを説明する。
メインプロセッサ1は、図3のフローチャートに示すように、監視端末4からのダウンロードコマンド(COPY-MEMコマンド)などのソフトウエアダウンロ−ド指示を受けると（ステップS1）、メインプロセッサ1内のデータベースを検索する（ステップS2）。
【００２７】
このデータベースには、図4に示すような種々の数のサブプロセッサから成るマルチキャストツリーが予め格納されており、以下のステップS3〜S5により、これらのマルチキャストの最も好ましいものが選択されることになる。なお、図中の数字はデータ転送のホップ数を示している。
【００２８】
マルチキャストツリーの選択及びグループ数の算出（ステップ S3 ）
図4(1)〜(3)…に示す各マルチキャストツリーは、サブプロセッサ数2ⁿ（nは自然数）に応じて全てのツリー構造の中でそれぞれ最もホップ数（グループ数）が少なく好ましいもの（ダウンロード時間が少ないもの）として選択されたものである。
【００２９】
例えば、同図(1)に示すように、サブプロセッサ数が8台(n=3)の場合には4つのグループに分け、第1のグループは4台で構成し、このグループ内では2回のデータ転送で済み、第2のグループでは1回のデータ転送で済む。また第3及び第4のグループではメインプロセッサ1から直接それぞれ1回のデータ転送が行われるだけである。
【００３０】
従って、この場合にはデータ転送を行う回数、すなわちホップ数は“4”となる。
同様にして、同図(2)に示すようにサブプロセッサ数が16台(n=4)の場合には、5つのグループに分け、第1のグループは3回のデータ転送で済み、第2のグループは2回のデータ転送で済み、そして第3のグループは1回のデータ転送で済む。第4及び第5のグループは同図(1)の場合と同様にメインプロセッサ1から直接それぞれ1回データ転送を受けるだけである。
【００３１】
また、同図(3)に示すように、サブプロセッサ数が32台(n=5)の場合には、6つのグループに分け、第1のグループは4回のデータ転送で済み、第2のグループは3回のデータ転送、第3のグループは2回のデータ転送、第4のグループは1回のデータ転送で済む事になる。第5及び第6のグループは上記と同様にそれぞれ1回ずつである。
なお、最後のグループにおいては、常にデータ転送回数及びサブプロセッサの数は固定値“1”である。
【００３２】
このように、ホップ数とグループ数とは対応している。
このようなマルチキャストツリーをどのように選択するかを示したものが図5のフローチャートである。
まず、メインプロセッサ1は局データより、接続可能なサブプロセッサ数Nを検索する（ステップS11）。なお、図4に示したマルチキャストツリーを構成するサブプロセッサ数Nmaxが2ⁿであり、実際にメインプロセッサ1が接続可能なサブプロセッサ3の台数Nとは異なる場合が当然存在する。これについては後述する。
【００３３】
次に、スイッチングハブ2のポート数pを初期値=1に設定し（ステップS12）、ステップS13でN＞(p−1)²であるか否かを判定し、“yes”の場合にはポート数pを“1”だけインクリメント（ステップS14）するが、“no”の場合には、N=(p−1)²であることが判明したので、この時のポート数pを(p−1)²に代入して一番近い最大サブプロセッサ数Nmax=2ⁿを検索する（ステップS15）。
【００３４】
すなわち、これらのステップS12〜S15においては、図示の例の如く、▲１▼N＝16の場合には、(p−1)²=(5−1)²=16で、ポート数P=5となり、この場合一番近い最大サブプロセッサ数Nmax=2ⁿは16で最初に設定したサブプロセッサ数Nと一致する。
一方、▲２▼N=196の場合には、(p−1)²=(15−1)²=196で、p=15となるので、この場合のサブプロセッサ数Nmax=2ⁿは128＜Nmax＜256であるが、より多い方として最大値=256が必要であり、最初に設定したサブプロセッサ数Nより大きな値（小さくなることはない。）になる。
【００３５】
このようにして最大のサブプロセッサ数Nmaxが求められると、これに該当するマルチキャストツリーを図4に示した種々のマルチキャストツリーの中から選択する。
すなわち、サブプロセッサ数=8の場合には、Nmax=8のツリーを選択し（ステップS16）、サブプロセッサ数=16の場合にはNmax=16のマルチキャストツリーを選択し（ステップS17）、サブプロセッサ数=32の場合にはNmax=32のマルチキャストツリーを選択し、同様にしてサブプロセッサ数が2ⁿの場合には、Nmax=2ⁿのマルチキャストツリーを選択することになる（ステップS18）。
【００３６】
そしてこのようにして選択されたマルチキャストツリーにおいて、上記のNmaxの値からグループ数Ngr及び各グループのプロセッサ数Nsubを図示の算出式により求めることができる。
この結果、例えばサブプロセッサ数=8の場合にはグループ数Ngr=4となり、この各グループのサブプロセッサ数はNsub=4, 2, 1, 1となり、図4(1)に示したサブプロセッサ数Nmax=8の場合と一致することが分かる。なお、いずれのグループにおいても、最後のグループについては常に“1”が固定値として与えられるのでグループ数=ホップ数となる。
【００３７】
グループ化検索（ステップ S4 ）
上記のようにマルチキャストツリーの選択とグループ数、及び各グループのサブプロセッサ数の算出を行った後、今度は図6に示すグループ化を行う。
ここでは、上記のようにマルチキャストホップ数でグループ化されたサブプロセッサを算出したグループ数Ngrと各グループ内のサブプロセッサ数Nsubが既に算出されているので、選択したマルチキャストツリーのプロセッサ数Nmaxの値に従い、各グループ毎にサブプロセッサを点線で示すようにサブプロセッサ3_1〜3_16までシーケンシャルにこのサブプロセッサ番号を割り当てて行く。
【００３８】
すなわち、この図6のグループ化概念図に対応する図4(2)においては、16台のサブプロセッサがホップ毎に数字で示されているだけであるが、図6では、これらの各グループの各サブプロセッサが、図2に例示したマルチプロセッサシステムにおけるサブプロセッサのどこに該当するのかをハントしながら決定している。
【００３９】
サブプロセッサのパス接続（ステップ S5 ）
図6に示したように、図4(2)に示すサブプロセッサと図2に示すサブプロセッサとの対応関係を決めた後、今度はどのような順序でデータ転送（ソフトウエアダウンロード）を行うかを決める必要があり、図7はこのような場合のパス接続の概念を示したものである。
【００４０】
すなわち、図7の例では、まずメインプロセッサ1からスイッチングハブ2_1及び2_21を経由してサブプロセッサ3_1にデータ転送を行う。これが第1ホップである。
データ転送を受けたサブプロセッサ3_1は、スイッチングハブ2_21を経由してサブプロセッサ3_2に対してデータ転送を行う。これと同時に、メインプロセッサ1はスイッチングハブ2_1及び2_23を経由してサブプロセッサ3_9にデータ転送を行う。これが第2ホップであり、2台のサブプロセッサに同時にデータ転送が行われる。
【００４１】
第3ホップでは、データ転送を受けたサブプロセッサ3_2が、スイッチングハブ2_21を経由してサブプロセッサ3_3に対しデータ転送を行い、これと同時にサブプロセッサ3_1がスイッチングハブ2_21, 2_1,及び2_22を経由してサブプロセッサ3_6に対してデータ転送を行う。さらに、サブプロセッサ3_9はスイッチングハブ2_23を経由してサブプロセッサ3_10にデータ転送を行い、さらにメインプロセッサ1はスイッチングハブ2_1及び2_24を経由してサブプロセッサ3_13にデータ転送を行う。従って、第3ホップでは4台のサブプロセッサに同時にデータ転送が行われる。
【００４２】
第4ホップにおいては、サブプロセッサ3_3からスイッチングハブ2_21を経由してサブプロセッサ3_4にデータ転送が行われ、サブプロセッサ3_2からスイッチングハブ2_21, 2_1,及び2_22を経由してサブプロセッサ3_5にデータ転送が行われ、サブプロセッサ3_6からスイッチングハブ2_22を経由してサブプロセッサ3_7にデータ転送が行われ、サブプロセッサ3_1からスイッチングハブ2_21, 2_1,及び2_22を経由してサブプロセッサ3_8にデータ転送が行われる。さらに、サブプロセッサ3_10からスイッチングハブ2_23を経由してサブプロセッサ3_11にデータ転送が行われ、サブプロセッサ3_9からスイッチングハブ2_23を経由してサブプロセッサ3_12へデータ転送が行われる。そして、サブプロセッサ3_13からスイッチングハブ2_24を経由してサブプロセッサ3_14にデータ転送が行われ、さらにメインプロセッサ1からスイッチングハブ2_1及び2_24を経由してサブプロセッサ3_15へデータ転送が行われる。従って第4ホップでは8台のサブプロセッサにデータ転送が同時に行われることになる。
【００４３】
そして最後の第5ホップにおいては、メインプロセッサ1からスイッチングハブ2_1及び2_24を経由してサブプロセッサ3_16にデータ転送が行われる。
このようにして、全てのデータ転送が5回で終了することになる。
このようにデータ転送の手順が決まった後、メインプロセッサ1はデータの転送（ダウンロード）を実行する（ステップS6）。
【００４４】
このようなデータ転送方法及び装置により、メインプロセッサの処理能力が高くなくても、各サブプロセッサにこの処理を分散させ、本発明によるパス接続を使用すれば従来の方法及び装置より転送方法を格段に高速化させることが可能となる。
【００４５】
【発明の実施の形態】
図8は、図7に示すように、メインプロセッサ1が各サブプロセッサ3に対してパス接続を行った後、データの転送をする際のフォーマットを示したものである。この転送データはメインプロセッサからサブプロセッサへの転送データまたはサブプロセッサからサブプロセッサへの転送データのいずれも配信するものであり、両者の区別は、フォーマットの番号“0”に記載されている“Notification Code”（通知情報種別）により判別が可能になっている。その他、“Sequence ID”（シーケンスNO.；01h〜64h）、“FromCPU”（送信元プロセッサ；0001h〜00B0h：Subプロセッサ00FEh：メインプロセッサ）、“To CPU”（送信先プロセッサ；0001h〜00B0h：Subプロセッサ00FEh：メインプロセッサ）、“User Information”（ユーザー判別情報）、“Frame Page”（フレームページ番号）、“Total Frame Page”（総フレームページ数）、そして“data”（データ部；パス情報他、各種データ転送）で構成されている。
【００４６】
図9は、本発明に係るデータ転送方法を実現する装置としてのマルチプロセッサシステムにおけるメインプロセッサ1とサブプロセッサ3のソフトウエア構成例を示したものである。
まずメインプロセッサ1においては、CPY-MEMコマンド処理構造11として、CPY-MEMコマンド処理部12と構成計算部13とグループ作成部14とパス作成部15とサブプロセッサ通信部16と送信データ作成部17とデータ送信部18とを有しており、グループ作成部14及びサブプロセッサ通信部16はパス/マップデータ19を保有し、パス作成部15はグループデータ20を保有し、そして送信データ作成部17及びデータ送信部18は送信データ21を保有している。
【００４７】
サブプロセッサ3においては、CPY-MEMコマンド処理構造31として、CPY-MEMコマンド処理部32とメイン/サブプロセッサ通信部33とデータ送信部34とデータ受信部35とを有し、プロセッサ通信部33はパス/マップデータ36及びグループデータ37を保有し、データ送信部34は送信データ38を保有し、そしてデータ受信部35は受信データ39を保有している。
【００４８】
なお、図4に示したマルチキャストツリーはパス/マップデータ19及び36に予め格納されている。また、コマンド処理部12及び32は点線で示すように上記の全てのデータの処理に関っている。
図10は、メインプロセッサ1が各サブプロセッサ3にデータ転送を行う時のシーケンス例を示したものである。この転送シーケンスは、図16に示した従来例の転送シーケンスにおいて、メインプロセッサ1のメモリデバイス6から各サブプロセッサ3へFTPデータ転送を行うステップS41及びFTPデータ転送完了通知ステップS42の代わりに、例えばサブプロセッサ3_1からサブプロセッサ3_2に対してFTPデータ転送（ステップS30）を行い、且つFTPデータ転送完了通知（ステップS31）を行う点が異なっている。
【００４９】
すなわち、従来例においてはメインプロセッサから全てのサブプロセッサに対してダウンロードを行っていたが、本発明ではサブプロセッサから受信したグループデータ及びパスデータに従って該当するサブプロセッサにダウンロードを実施するようにしている点が異なっている。但し、ダウンロード処理結果については従来通りメインプロセッサに通知を返信する（ステップS29, S34）。
【００５０】
図11〜図14は、サブプロセッサが64台の場合のデータ転送を扱った実施例を示している。
この実施例においても、まず図11に示すように、メインプロセッサ1内のデータベースを基に、図3のステップS3、すなわち図5に示したツリー選択及びグループ数及び各グループのサブプロセッサ数の算出を行う。
【００５１】
そして、この情報を基に図12に示すグループ化を、図6と同様に実施する。
そして、図13に示すようにパスツリーを図3のステップS5、すなわち図7と同様に実施してパス接続を行う。
図14には、メインプロセッサ1からサブプロセッサ3へ、及びサブプロセッサ3からサブプロセッサ3へのデータの送受信タイミングが示されている。すなわち各サブプロセッサ3は図13に示したパスに従い、受信したデータをさらに次のサブプロセッサへ転送する。このように受信と送信を繰り返すが、末端のサブプロセッサについては受信のみで送信を行わない場合がある。
【００５２】
この結果、サブプロセッサ64台でのCPU占有率はおよそ26%となり、大きく負荷分散されることが分かる。
また、この例では、サブプロセッサ3_1の負荷が一番高くなっており、定常的に他のタスク処理などにより、付加が非常に高いことが事前に判明している場合は、上記のステップS3〜S5のデータ転送を実施した後に、休んでいるサブプロセッサへその役割を交換する処理を行うこともできる（パスの入れ換え）ため、性能の最適化を事前に予測する措置を講じることが可能となる。
【００５３】
（付記１）
メインプロセッサから、パラレル設定可能なノードを介して複数のサブプロセッサにデータを転送する方法であって、
該メインプロセッサが、該サブプロセッサの接続可能数から該ノードのポート数を決定し、該ポート数に対応した最小ホップ数のマルチキャストツリーを選択し、該マルチキャストツリーに沿って各サブプロセッサに所定のデータを転送することを特徴とした方法。
【００５４】
（付記２）付記1において、
該メインプロセッサが、最小ホップ数を与える複数のマルチキャストツリーを予め記憶しており、該ポート数により決定される最大サブプロセッサ数を有する一つの該マルチキャストツリーを選択し、該マルチキャストツリーをさらに、該ホップ数でグループ化し、各グループ内のパスを決定することを特徴とした方法。
【００５５】
（付記３）付記2において、
該サブプロセッサの接続可能数をNとし、該ポート数をpとすると、該最大サブプロセッサ数Nmaxが(p-1)²で与えられる2ⁿ（nは自然数）に一番近い最大数であり、該グループ化の数Ngrが(log₂Nmax)+1であり、各グループ内のサブプロセッサ数Nsubが2**(log₂Nmax-グループ番号)(**はべき乗を示す)であることを特徴とした方法。
【００５６】
（付記４）付記1から3のいずれか一つにおいて、
該メインプロセッサが、各グループ毎にサブプロセッサのアクセス順序を決めておき、該ノードを経由したメインプロセッサ−サブプロセッサ間及びサブプロセッサ同士間のパスを決定することを特徴とした方法。
【００５７】
（付記５）付記1から4のいずれか一つにおいて、
該ノードが、スイッチングハブであることを特徴とした方法。
（付記６）
メインプロセッサと、
パラレル設定可能なノードと、
該ノードを介して該メインプロセッサに接続される複数のサブプロセッサと、を備え、
該メインプロセッサが、該サブプロセッサの接続可能数から該ノードのポート数を決定し、該ポート数に対応した最小ホップ数のマルチキャストツリーを選択し、該マルチキャストツリーに沿って各サブプロセッサに所定のデータを転送することを特徴としたデータ転送装置。
【００５８】
（付記７）付記６において、
該メインプロセッサが、最小ホップ数を与える複数のマルチキャストツリーを予め記憶しており、該ポート数により決定される最大サブプロセッサ数を有する一つの該マルチキャストツリーを選択し、該マルチキャストツリーをさらに、該ホップ数でグループ化し、各グループ内のパスを決定することを特徴としたデータ転送装置。
【００５９】
（付記８）付記7において、
該サブプロセッサの接続可能数をNとし、該ポート数をpとすると、該最大サブプロセッサ数Nmaxが(p-1)²で与えられる2ⁿ（nは自然数）に一番近い最大数であり、該グループ化の数Ngrが(log₂Nmax)+1であり、各グループ内のサブプロセッサ数Nsubが２**(log₂Nmax-グループ番号)(**はべき乗を示す)であることを特徴としたデータ転送装置。
【００６０】
（付記９）付記6から8のいずれか一つにおいて、
該メインプロセッサが、各グループ毎にサブプロセッサのアクセス順序を決めておき、該ノードを経由したメインプロセッサ−サブプロセッサ間及びサブプロセッサ同士間のパスを決定することを特徴としたデータ転送装置。
【００６１】
（付記１０）付記6から9のいずれか一つにおいて、
該ノードが、スイッチングハブであることを特徴としたデータ転送装置。
【００６２】
【発明の効果】
以上述べたように本発明に係るデータ転送方法及び装置によれば、以下のような効果が得られる。
【００６３】
(1)本発明を採用することにより、従来例よりデータをダウンロードする時間を、大幅に短縮することができる。これは、サブプロセッサの数が多くなればなるほど効果を発揮する。
従来例と本発明を比べると下記のようになり（ホップ数で示す）、その効果が大きいことが分かる。
・サブプロサセッサ数=16個→従来例=16ホップ、本発明=5ホップ→3.2倍
・サブプロサセッサ数=32個→従来例=32ホップ、本発明=7ホップ→4.5倍
・サブプロサセッサ数=64個→従来例=64ホップ、本発明=9ホップ→7.1倍
・サブプロサセッサ数=n 個→〜以降、上記の如く指数関数的に処理能力が向上する。
【図面の簡単な説明】
【図１】本発明に係るデータ方法及び装置の概念構成を示したブロック図である。
【図２】図1に示した概念構成において16台のサブプロセッサを用いた場合のマルチプロセッサシステムを示したブロック図である。
【図３】本発明に係るデータ転送方法及び装置の概念動作を示したフローチャート図である。
【図４】本発明に係るデータ転送方法及び装置に用いるマルチキャストツリーの一般化概念を示したブロック図である。
【図５】本発明に係るデータ転送方法及び装置においてツリー選択及びグループ数算出を行う場合の概念を示したフローチャート図である。
【図６】本発明に係るデータ転送方法及び装置におけるグループ化の概念を示した図である。
【図７】本発明に係るデータ転送方法及び装置におけるパス接続の概念を示した図である。
【図８】本発明に係るデータ転送方法および装置におけるメインプロセッサとサブプロセッサ間のデータ転送時のフォーマットを示した図である。
【図９】本発明に係るデータ転送方法及び装置に用いられるメインプロセッサとサブプロセッサのソフトウエア構成例を示したブロック図である。
【図１０】本発明に係るデータ転送方法及び装置におけるデータ転送シーケンス例を示した図である。
【図１１】本発明に係るデータ転送方法及び装置において64台のサブプロセッサを用いた場合のグループ数の算出例を示した図である。
【図１２】図12のマルチキャストツリーにおけるグループ化を示した図である。
【図１３】図13におけるグループ化の後、パス接続を行った例を示した図である。
【図１４】図14におけるパス接続を行った後のデータの送信/受信タイミングを示した図である。
【図１５】従来例の概念構成例を示したブロック図である。
【図１６】従来例のデータ転送シーケンスを示した図である。
【符号の説明】
1 メインプロセッサ
2（2_1, 2_21〜2_2P-1）ノード（スイッチングハブ）
3（3_1〜3_n）サブプロセッサ
4 監視端末
図中、同一符号は同一又は相当部分を示す。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a data transfer method and apparatus, and more particularly to a software download method, and an apparatus using it, for updating the firmware and data of a processor in a transmission apparatus or the like from an old version to a new version.
[0002]
In recent network systems, a download function operated remotely from a monitoring terminal is required when updating firmware from an old version to a new version on processors distributed across different regions.
For this reason, each processor is given a non-volatile readable/writable memory device, and a software download function is provided that writes firmware programs, databases, and the like into it.
[0003]
In the future, as systems become more complex, both the number of firmware programs to be downloaded and their program size will tend to increase. It is therefore necessary to update smoothly from the old version to the new version and to shorten the startup time of the apparatus.
[0004]
[Prior art]
FIG. 15 conceptually shows a multiprocessor system as a conventional data transfer method and an apparatus using it.
In this example, the main processor 1 is connected to n sub-processors 3_1 to 3_n (hereinafter sometimes collectively referred to as “3”) via repeater hubs 5_1, 5_21 to 5_2p-1 serving as one-way setting nodes (hereinafter sometimes collectively referred to as “5”), for example by 100BASE-FX.
[0005]
That is, the main processor 1 is connected through the repeater hub 5_1, which has p ports, to p-1 repeater hubs 5, and each of these repeater hubs 5 is in turn connected to p-1 sub-processors, so that the repeater hubs 5 are connected to n sub-processors 3 in total.
[0006]
The main processor 1 then refers to the operation status of each sub-processor 3 in the database built into its own station and, on receiving a download request command from the connected monitoring terminal 4, downloads to the sub-processors 3 that are in operation or scheduled for operation.
[0007]
FIG. 16 shows the data transfer sequence in such a conventional multiprocessor system. In this figure, only the sub-processors 3_1 and 3_2 are illustrated, and the repeater hubs 5 shown in FIG. 15 are omitted for convenience of explanation.
First, upon receiving a file transfer request S20 or a download request S21 from the monitoring terminal 4, the main processor 1 performs a data write S22 to the RAM disk of the built-in memory device 6.
[0008]
In response to this, the RAM disk of the memory device 6 performs a data read S23 toward the main processor 1 and a data write S24 to the flash memory of the memory device 6, and an FTP data transfer S25 is then performed from the RAM disk, first to the sub-processor 3_1, for example. A completion notification S26 for this FTP data transfer is then sent to the sub-processor 3_1 in the same manner.
[0009]
In the sub-processor 3_1, when its CPU performs a data write S27 to the flash memory, a sum check S28 is returned from the flash memory to the CPU. At the same time, the flash memory sends the main processor 1 a notification S29 that the result of the FTP data transfer is OK.
[0010]
Similarly, the memory device 6 of the main processor 1 performs an FTP data transfer S30 to another sub-processor 3_2 and sends an FTP data transfer completion notification S31 to the CPU of the sub-processor 3_2. The CPU of the sub-processor 3_2 then performs a data write S32 to its flash memory, a sum check S33 is returned to the CPU as described above, and a result notification S34 for this FTP data transfer is sent from the CPU of the sub-processor 3_2 to the main processor 1.
[0011]
In this way, data is transferred point-to-point from the main processor 1 to the sub-processors 3 using FTP data transfer.
On the other hand, there is a method of downloading (transferring) data to multiple sub-processors simultaneously by broadcast (UDP) transmission. However, delivery cannot be confirmed with the UDP protocol, so delivery confirmation would ultimately have to be handled at the application level, and UDP is also less reliable than TCP because of its packet characteristics. This method therefore cannot be used.
[0012]
Therefore, as described above, data is transferred point-to-point.
[0013]
[Problems to be solved by the invention]
Data is thus transferred point-to-point from the main processor to the plurality of sub-processors. This poses no problem when the CPU of the main processor has high processing capability and the LAN line also has a very high transfer rate. When an inexpensive main processor is used to reduce cost, however, the time needed to finish downloading to all sub-processors becomes enormous as the number of sub-processors grows.
[0014]
For example, when downloading 2 Mbytes of data at 100 Kbps to 176 sub-processors, it takes about 20 seconds per sub-processor, and a simple calculation shows that transferring to all sub-processors requires 58 minutes (about 1 hour).
Because downloading takes such a long time, there has been a problem that service provision to customers is delayed and business opportunities are missed.
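The estimate in paragraph [0014] is a plain multiplication, sketched below for reference. The function name is illustrative and not part of the patent; the 20-second per-unit figure is taken from the text as given (it matches 2 Mbytes at about 100 kilobytes per second).

```python
def sequential_download_seconds(num_subprocessors, seconds_each):
    """Total time when the main processor serves sub-processors one after another."""
    return num_subprocessors * seconds_each


# Figures from paragraph [0014]: 176 sub-processors, about 20 s each.
total = sequential_download_seconds(176, 20)
print(total, "seconds, i.e.", round(total / 60, 1), "minutes")
```

The total grows linearly with the number of sub-processors, which is the cost the invention sets out to reduce.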
[0015]
Accordingly, an object of the present invention is to make it possible, in a method for transferring data from a main processor to a plurality of sub-processors and in an apparatus using this method, to greatly reduce the time required for data transfer even when an inexpensive processor is used.
[0016]
[Means for Solving the Problems]
In order to achieve the above object, the present invention provides a method for transferring data from a main processor to a plurality of sub-processors via nodes that can be set in parallel, wherein the main processor stores in advance a plurality of multicast trees each giving a minimum number of hops, determines the number of ports of the node from the connectable number of sub-processors, selects the multicast tree with the minimum number of hops corresponding to that number of ports, and, in this multicast tree, transfers predetermined data to each sub-processor via the node along paths determined so as to maintain the minimum number of hops within each group formed by grouping according to the minimum number of hops.
[0017]
The present invention also provides a data transfer apparatus comprising a main processor, a node that can be set in parallel, and a plurality of sub-processors connected to the main processor via the node, wherein the main processor stores in advance a plurality of multicast trees each giving a minimum number of hops, determines the number of ports of the node from the connectable number of sub-processors, selects the multicast tree with the minimum number of hops corresponding to that number of ports, and, in this multicast tree, transfers predetermined data to each sub-processor via the node along paths determined so as to maintain the minimum number of hops within each group formed by grouping according to the minimum number of hops.
[0019]
Here, where N is the connectable number of sub-processors and p is the number of ports, it suffices that the maximum number of sub-processors Nmax is the number of the form 2ⁿ (n is a natural number) closest to (p-1)², the number of groups Ngr is (log₂Nmax) + 1, and the number of sub-processors in each group Nsub is 2 ** (log₂Nmax - group number) (** indicates a power).
[0020]
Further, the main processor can determine the access order of the sub processors for each group, and can determine the paths between the main processor and the sub processors and between the sub processors via the node.
Furthermore, a switching hub can be used as the node.
[0021]
FIG. 1 shows the conceptual configuration of a multiprocessor system as an apparatus for realizing the data transfer method according to the present invention. As a comparison with the conventional configuration shown in FIG. 15 makes clear, the configuration of FIG. 1 differs in that switching hubs 2_1, 2_21 to 2_2p-1 (hereinafter sometimes collectively referred to as “2”) are used as the nodes in place of the repeater hubs 5_1, 5_21 to 5_2p-1.
[0022]
That is, because the switching hub 2 is a parallel-configurable node dedicated to the store-and-forward method, transfers between different pairs of ports do not affect one another. The data transfer time is shortened by processing the transfers from the main processor 1 to the sub-processors 3 in parallel via the switching hubs.
[0023]
To make the conceptual configuration of FIG. 1 easier to understand, in the multiprocessor system shown in FIG. 2 one port of the switching hub 2_1, which has p = 5 ports, is connected to the main processor 1, and one port of each of the switching hubs 2_21 to 2_24, which likewise have p = 5 ports, is connected to one of the remaining p-1 ports of the switching hub 2_1.
[0024]
The remaining four ports of the switching hubs 2_21 to 2_24 are connected to four sub-processors each, namely 3_1 to 3_4, 3_5 to 3_8, 3_9 to 3_12, and 3_13 to 3_16, respectively, so that in the end 16 sub-processors are connected using five switching hubs.
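As a minimal sketch of the two-level wiring just described, the following builds the hub-to-sub-processor map for a given port count p. The function name and the dictionary representation are illustrative assumptions; the hub and sub-processor labels follow the reference numerals of FIG. 2.

```python
def build_topology(p):
    """Map each second-level hub to the sub-processors on its p-1 free ports."""
    per_hub = p - 1                      # one port of each hub faces hub 2_1
    topology = {}
    for k in range(1, p):                # hubs 2_21 .. 2_2(p-1)
        first = (k - 1) * per_hub + 1
        topology[f"2_2{k}"] = [f"3_{i}" for i in range(first, first + per_hub)]
    return topology


topo = build_topology(5)
for hub, subs in topo.items():
    print(hub, subs)
print("hubs in total:", len(topo) + 1)   # second-level hubs plus hub 2_1
```

With p = 5 this reproduces the FIG. 2 arrangement: four second-level hubs, each serving four sub-processors.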
[0025]
The following describes, with reference to the drawings, the feature of the present invention in such a configuration: the main processor 1 determines the number of ports of the node (switching hub) 2 from the connectable number of sub-processors 3, selects the multicast tree with the minimum number of hops corresponding to that number of ports, and transfers predetermined data to each sub-processor 3 along this multicast tree.
[0026]
First, it will be described how a multicast tree for actually performing data transfer is selected for the multiprocessor system shown in FIG. 2.
As shown in the flowchart of FIG. 3, when the main processor 1 receives a software download instruction such as a download command (COPY-MEM command) from the monitoring terminal 4 (step S1), it searches the database in the main processor 1 (step S2).
[0027]
This database stores in advance multicast trees composed of various numbers of sub-processors, as shown in FIG. 4, and the most preferable of these multicast trees is selected by the following steps S3 to S5. The numbers in the figure indicate the number of data transfer hops.
[0028]
Multicast tree selection and group count calculation (step S3)
Each multicast tree shown in FIG. 4 (1) to (3) ... is the one selected, for a sub-processor count of 2ⁿ (n is a natural number), as preferable among all possible tree structures because it has the fewest hops (number of groups), that is, the shortest download time.
[0029]
For example, as shown in (1) of the figure, when the number of sub-processors is 8 (n = 3), they are divided into 4 groups. The first group consists of 4 units and needs only two data transfers within the group, and the second group needs only one data transfer. In the third and fourth groups, only one data transfer each is performed directly from the main processor 1.
[0030]
Therefore, in this case, the number of times of data transfer, that is, the number of hops is “4”.
Similarly, when the number of sub-processors is 16 (n = 4) as shown in (2) of the figure, they are divided into 5 groups: the first group needs only three data transfers, the second group needs only two, and the third group needs only one. The fourth and fifth groups each receive only one data transfer directly from the main processor 1, as in the case of (1) of the figure.
[0031]
Also, as shown in (3) of the figure, when the number of sub-processors is 32 (n = 5), they are divided into 6 groups: the first group needs only four data transfers, the second group three, the third group two, and the fourth group one. The fifth and sixth groups each receive one data transfer, as above.
In the last group, the number of data transfers and the number of sub-processors are always fixed values “1”.
[0032]
Thus, the number of hops corresponds to the number of groups.
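The hop counts quoted for the three trees of FIG. 4 follow the relation hops = groups = log₂(Nmax) + 1. The short sketch below reproduces them; the function name is illustrative and not taken from the patent.

```python
def tree_hops(nmax):
    """Hops (= groups) for a stored tree of nmax = 2**n sub-processors: n + 1."""
    if nmax < 2 or nmax & (nmax - 1):
        raise ValueError("nmax must be a power of two, 2**n with n >= 1")
    # For a power of two, bit_length() equals log2(nmax) + 1.
    return nmax.bit_length()


for nmax in (8, 16, 32):
    print(nmax, "sub-processors:", tree_hops(nmax), "hops")
```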
FIG. 5 is a flowchart showing how to select such a multicast tree.
First, the main processor 1 looks up the number N of connectable sub-processors in the station data (step S11). Note that the number Nmax of sub-processors constituting a multicast tree shown in FIG. 4 is 2ⁿ, and naturally there are cases where it differs from the number N of sub-processors 3 that the main processor 1 can actually connect to. This will be described later.
[0033]
Next, the number of ports p of the switching hub 2 is set to an initial value of 1 (step S12), and step S13 determines whether N > (p−1)². If “yes”, the number of ports p is incremented by 1 (step S14). If “no”, it has been found that N = (p−1)², so the number of ports p at this point is substituted into (p−1)² and the closest maximum number of sub-processors Nmax = 2ⁿ is searched for (step S15).
[0034]
That is, in these steps S12 to S15, as in the illustrated example, (1) when N = 16, (p−1)² = (5−1)² = 16 and the number of ports is p = 5. In this case the closest maximum number of sub-processors Nmax = 2ⁿ is 16, which matches the initially set number N of sub-processors.
On the other hand, (2) when N = 196, (p−1)² = (15−1)² = 196 and p = 15. In this case the number of sub-processors Nmax = 2ⁿ falls in the range 128 < Nmax < 256, but the larger value, 256, is required as the maximum, so Nmax is larger than the initially set number N of sub-processors (it never becomes smaller).
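Steps S12 to S15 can be sketched as two small functions. This is only one reading of the flowchart: the names `ports_for` and `nmax_for` are illustrative, and rounding (p−1)² up to the next power of two is an assumption based on the N = 196 example, where the text says Nmax never becomes smaller than N. As written, the loop stops at the smallest p with N ≤ (p−1)²; in both worked examples this is an exact equality.

```python
def ports_for(n_subprocessors):
    """Steps S12 to S14: raise p from 1 until N no longer exceeds (p-1)**2."""
    p = 1                                    # step S12
    while n_subprocessors > (p - 1) ** 2:    # step S13
        p += 1                               # step S14
    return p


def nmax_for(p):
    """Step S15 (assumed reading): smallest 2**n, n >= 1, not below (p-1)**2."""
    capacity = (p - 1) ** 2
    nmax = 2
    while nmax < capacity:
        nmax *= 2
    return nmax


for n in (16, 196):
    p = ports_for(n)
    print("N =", n, "p =", p, "Nmax =", nmax_for(p))
```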
[0035]
When the maximum number of sub-processors Nmax is obtained in this way, the corresponding multicast tree is selected from the various multicast trees shown in FIG.
That is, when the number of sub-processors is 8, the tree with Nmax = 8 is selected (step S16); when it is 16, the multicast tree with Nmax = 16 is selected (step S17); when it is 32, the multicast tree with Nmax = 32 is selected; and likewise, when the number of sub-processors is 2ⁿ, the multicast tree with Nmax = 2ⁿ is selected (step S18).
[0036]
In the multicast tree selected in this way, the number of groups Ngr and the number of processors Nsub of each group can be obtained from the above-described Nmax value using the calculation formula shown in the figure.
As a result, for example, when the number of sub-processors is 8, the number of groups is Ngr = 4 and the numbers of sub-processors in the groups are Nsub = 4, 2, 1, 1, which agrees with the case of Nmax = 8 shown in FIG. 4 (1). In every tree, “1” is always given as a fixed value for the last group, so the number of groups equals the number of hops.
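The calculation of Ngr and Nsub can be sketched as follows. The function name is illustrative; the formula Nsub = 2 ** (log₂Nmax − group number) is applied to every group except the last, which the text fixes at 1.

```python
def group_sizes(nmax):
    """Return Nsub for groups 1..Ngr of a tree with nmax = 2**n sub-processors."""
    n = nmax.bit_length() - 1            # log2(nmax) for a power of two
    ngr = n + 1                          # Ngr = log2(Nmax) + 1
    sizes = [2 ** (n - group) for group in range(1, ngr)]
    sizes.append(1)                      # the last group is always fixed at 1
    return sizes


for nmax in (8, 16):
    sizes = group_sizes(nmax)
    print(nmax, "->", len(sizes), "groups", sizes, "sum", sum(sizes))
```

The group sizes always add up to Nmax, so every sub-processor in the selected tree belongs to exactly one group.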
[0037]
Grouping search (step S4)
After the multicast tree has been selected and the number of groups and the number of sub-processors in each group have been calculated as described above, the grouping shown in FIG. 6 is performed next.
Here, the number of groups Ngr (the sub-processors being grouped by multicast hop count as described above) and the number of sub-processors Nsub in each group have already been calculated. Following the sub-processor count Nmax of the selected multicast tree, sub-processor numbers are therefore assigned sequentially, group by group, from sub-processor 3_1 to 3_16, as indicated by the dotted lines.
[0038]
That is, in FIG. 4 (2), which corresponds to the grouping conceptual diagram of FIG. 6, the 16 sub-processors are merely indicated by numbers for each hop, whereas in FIG. 6 it is determined, by hunting, which sub-processor of the multiprocessor system illustrated in FIG. 2 each sub-processor of each group corresponds to.
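A minimal sketch of the sequential numbering in step S4 is given below, assuming group sizes computed as in step S3 and labels of the form 3_i. The function name is hypothetical. For Nmax = 16 the first members of the groups come out as 3_1, 3_9, 3_13, 3_15, and 3_16, which are the sub-processors that receive directly from the main processor in the FIG. 7 description.

```python
def assign_groups(nmax):
    """Assign sub-processor numbers 3_1 .. 3_nmax to the groups in order."""
    n = nmax.bit_length() - 1                        # log2(nmax)
    sizes = [2 ** (n - g) for g in range(1, n + 1)] + [1]
    groups, next_id = [], 1
    for size in sizes:
        groups.append([f"3_{i}" for i in range(next_id, next_id + size)])
        next_id += size
    return groups


for number, members in enumerate(assign_groups(16), start=1):
    print("group", number, members)
```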
[0039]
Sub-processor path connection (step S5)
After the correspondence between the sub-processors shown in FIG. 4 (2) and those shown in FIG. 2 has been determined as shown in FIG. 6, it is next necessary to decide in what order the data transfer (software download) is to be performed. FIG. 7 shows the concept of path connection in this case.
[0040]
That is, in the example of FIG. 7, first, data transfer is performed from the main processor 1 to the sub processor 3_1 via the switching hubs 2_1 and 2_21. This is the first hop.
The sub processor 3_1 that has received the data transfer performs data transfer to the sub processor 3_2 via the switching hub 2_21. At the same time, the main processor 1 performs data transfer to the sub processor 3_9 via the switching hubs 2_1 and 2_23. This is the second hop, and data is transferred to two sub-processors simultaneously.
[0041]
In the third hop, the sub processor 3_2 that has received the data transfer transfers data to the sub processor 3_3 via the switching hub 2_21. At the same time, the sub processor 3_1 passes through the switching hubs 2_21, 2_1, and 2_22. The data is transferred to the sub processor 3_6. Further, the sub processor 3_9 transfers data to the sub processor 3_10 via the switching hub 2_23, and the main processor 1 transfers data to the sub processor 3_13 via the switching hubs 2_1 and 2_24. Therefore, in the third hop, data transfer is simultaneously performed to four sub processors.
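The hub sequences named in these hops follow directly from the FIG. 2 wiring: a transfer stays on one hub when both ends share it, and otherwise crosses hub 2_1. The sketch below assumes that wiring with p = 5; `MAIN`, `hub_of`, and `hub_path` are hypothetical names introduced for illustration.

```python
MAIN = 0   # hypothetical id for the main processor; sub-processors are 1..16


def hub_of(sub, p=5):
    """Second-level hub serving sub-processor 3_<sub> in the FIG. 2 wiring."""
    return f"2_2{(sub - 1) // (p - 1) + 1}"


def hub_path(src, dst, p=5):
    """Switching hubs traversed by one transfer from src to dst."""
    if src == MAIN:
        return ["2_1", hub_of(dst, p)]
    a, b = hub_of(src, p), hub_of(dst, p)
    return [a] if a == b else [a, "2_1", b]


print(hub_path(MAIN, 1))   # first hop
print(hub_path(1, 2))      # second hop, same hub
print(hub_path(1, 6))      # third hop, crossing hub 2_1
```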
[0042]
In the fourth hop, data transfer is performed from sub-processor 3_3 to sub-processor 3_4 via switching hub 2_21, and data transfer is performed from sub-processor 3_2 to sub-processor 3_5 via switching hubs 2_21, 2_1, and 2_22. The sub processor 3_6 transfers data to the sub processor 3_7 via the switching hub 2_22, and the sub processor 3_1 transfers data to the sub processor 3_8 via the switching hubs 2_21, 2_1, and 2_22. Further, data transfer is performed from the sub processor 3_10 to the sub processor 3_11 via the switching hub 2_23, and data transfer is performed from the sub processor 3_9 to the sub processor 3_12 via the switching hub 2_23. Then, data transfer is performed from the sub processor 3_13 to the sub processor 3_14 via the switching hub 2_24, and further data transfer is performed from the main processor 1 to the sub processor 3_15 via the switching hubs 2_1 and 2_24. Therefore, in the fourth hop, data transfer is performed simultaneously to eight sub-processors.
[0043]
In the final fifth hop, data transfer is performed from the main processor 1 to the sub processor 3_16 via the switching hubs 2_1 and 2_24.
In this way, all data transfers are completed in five hops.
After the data transfer procedure is determined in this way, the main processor 1 executes data transfer (download) (step S6).
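The five hops described for FIG. 7 can be written down as data and checked mechanically. The sketch below transcribes the sender and receiver pairs from paragraphs [0040] to [0043] ("M" stands for the main processor, integers for sub-processors 3_1 to 3_16) and verifies that every sender already holds the data, that no node sends twice in one hop, and that every sub-processor receives exactly once. The checker itself is illustrative and not part of the patent.

```python
SCHEDULE = [
    [("M", 1)],
    [(1, 2), ("M", 9)],
    [(2, 3), (1, 6), (9, 10), ("M", 13)],
    [(3, 4), (2, 5), (6, 7), (1, 8), (10, 11), (9, 12), (13, 14), ("M", 15)],
    [("M", 16)],
]


def check_schedule(schedule, total):
    """Validate the schedule and return the number of receivers per hop."""
    have = {"M"}                         # nodes that already hold the data
    per_hop = []
    for hop in schedule:
        senders = [s for s, _ in hop]
        receivers = [r for _, r in hop]
        assert all(s in have for s in senders), "sender has no data yet"
        assert len(set(senders)) == len(senders), "a node sends twice in one hop"
        assert len(set(receivers)) == len(receivers), "duplicate receiver in a hop"
        assert not set(receivers) & have, "a node receives the data twice"
        have.update(receivers)
        per_hop.append(len(receivers))
    assert have == {"M", *range(1, total + 1)}, "some sub-processor was missed"
    return per_hop


print(check_schedule(SCHEDULE, 16))
```

The per-hop receiver counts double from hop to hop until the last hop, which serves the single sub-processor of the final group.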
[0044]
With such a data transfer method and apparatus, even if the processing capability of the main processor is not high, distributing this processing to the sub-processors and using the path connection according to the present invention makes the transfer markedly faster than with the conventional method and apparatus.
[0045]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 8 shows the format used for transferring data after the main processor 1 makes a path connection to each sub-processor 3 as shown in FIG. This format carries both transfer data from the main processor to a sub-processor and transfer data from one sub-processor to another. The two are distinguished by the “Notification Code” (notification information type) in format number “0”. The format also contains “Sequence ID” (sequence No.; 01h to 64h), “From CPU” (source processor; 0001h to 00B0h: sub processor, 00FEh: main processor), “To CPU” (destination processor; 0001h to 00B0h: sub processor, 00FEh: main processor), “User Information” (user identification information), “Frame Page” (frame page number), “Total Frame Page” (total number of frame pages), and “data” (data section; path information and other transferred data).
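As a rough illustration of the FIG. 8 format, the following sketch packs and unpacks the listed fields. The field order follows the text, and the value ranges for “Sequence ID”, “From CPU”, and “To CPU” are taken from it. The field widths (1 byte for the notification code and sequence ID, 2 bytes for the others), the big-endian byte order, and the names `TransferFrame` and `HEADER` are assumptions made for illustration only.

```python
import struct
from dataclasses import dataclass

MAIN_PROCESSOR = 0x00FE              # "00FEh: main processor" in the text
HEADER = struct.Struct(">BBHHHHH")   # assumed widths and byte order


@dataclass
class TransferFrame:
    notification_code: int   # notification information type
    sequence_id: int         # 01h to 64h
    from_cpu: int            # 0001h to 00B0h, or 00FEh for the main processor
    to_cpu: int              # same ranges as from_cpu
    user_information: int    # user identification information
    frame_page: int          # frame page number
    total_frame_page: int    # total number of frame pages
    data: bytes = b""        # data section (path information, etc.)

    def validate(self) -> None:
        if not 0x01 <= self.sequence_id <= 0x64:
            raise ValueError("sequence_id out of range 01h-64h")
        for cpu in (self.from_cpu, self.to_cpu):
            if not (0x0001 <= cpu <= 0x00B0 or cpu == MAIN_PROCESSOR):
                raise ValueError(f"invalid processor id {cpu:#06x}")

    def pack(self) -> bytes:
        self.validate()
        return HEADER.pack(self.notification_code, self.sequence_id,
                           self.from_cpu, self.to_cpu, self.user_information,
                           self.frame_page, self.total_frame_page) + self.data

    @classmethod
    def unpack(cls, raw: bytes) -> "TransferFrame":
        fields = HEADER.unpack_from(raw)
        frame = cls(*fields, data=raw[HEADER.size:])
        frame.validate()
        return frame


# Main processor -> sub processor 0001h, first of two frame pages.
frame = TransferFrame(0, 0x01, MAIN_PROCESSOR, 0x0001, 0, 1, 2, b"path")
assert TransferFrame.unpack(frame.pack()) == frame
```

Validating the processor identifiers on both pack and unpack keeps a sub processor from forwarding a frame whose source or destination lies outside the ranges given in the text.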
[0046]
FIG. 9 shows a software configuration example of the main processor 1 and the sub processor 3 in the multiprocessor system as a device for realizing the data transfer method according to the present invention.
First, the main processor 1 includes, as a CPY-MEM command processing structure 11, a CPY-MEM command processing unit 12, a configuration calculation unit 13, a group creation unit 14, a path creation unit 15, a sub processor communication unit 16, a transmission data creation unit 17, and a data transmission unit 18. The group creation unit 14 and the sub processor communication unit 16 hold path / map data 19, the path creation unit 15 holds group data 20, and the transmission data creation unit 17 and the data transmission unit 18 hold transmission data 21.
[0047]
The sub processor 3 includes, as a CPY-MEM command processing structure 31, a CPY-MEM command processing unit 32, a main / sub processor communication unit 33, a data transmission unit 34, and a data reception unit 35. The sub processor 3 holds path / map data 36 and group data 37, the data transmission unit 34 holds transmission data 38, and the data reception unit 35 holds reception data 39.
[0048]
The multicast tree shown in FIG. 4 is stored in advance in the path / map data 19 and 36. The command processing units 12 and 32 are involved in the processing of all the above data as indicated by dotted lines.
FIG. 10 shows a sequence example when the main processor 1 performs data transfer to each sub-processor 3. This transfer sequence is the same as the transfer sequence of the conventional example shown in FIG. 16, except that, instead of step S41 for performing FTP data transfer from the memory device 6 of the main processor 1 to each sub processor 3 and the FTP data transfer completion notification step S42, the FTP data transfer (step S30) and the FTP data transfer completion notification (step S31) are performed from the sub processor 3_1 to the sub processor 3_2.
[0049]
That is, in the conventional example, downloading is performed from the main processor to all the sub processors, whereas in the present invention a sub processor that has received the data downloads it to the corresponding sub processors according to the group data and path data. However, the notification of the download processing result is returned to the main processor as before (steps S29 and S34).
[0050]
FIGS. 11 to 14 show an embodiment in which data transfer is handled when 64 sub-processors are used.
Also in this embodiment, as shown in FIG. 11, step S3 in FIG. 3 is first performed based on the database in the main processor 1; that is, the tree is selected and the number of groups and the number of sub processors in each group are calculated as shown in FIG.
[0051]
Based on this information, the grouping shown in FIG. 12 is performed in the same manner as in FIG.
Then, as shown in FIG. 13, the path tree is implemented in the same manner as in step S5 of FIG. 3, that is, in FIG.
FIG. 14 shows data transmission / reception timings from the main processor 1 to the sub processor 3 and from the sub processor 3 to the sub processor 3. That is, each sub processor 3 further transfers the received data to the next sub processor according to the path shown in FIG. Reception and transmission are repeated in this manner, although a terminal sub-processor only receives and does not transmit.
[0052]
As a result, the CPU occupancy rate of 64 sub-processors is about 26%, and it can be seen that the load is greatly distributed.
Further, in this example, the load on the sub-processor 3_1 is the highest. When it is known in advance that its load will be very high because of other task processing or the like, its role can be exchanged with that of an idle sub processor (path replacement) after the processing of steps S3 to S5 above, so that performance can be optimized in advance on the basis of the predicted load.
[0053]
(Appendix 1)
A method of transferring data from a main processor to a plurality of sub-processors via nodes that can be set in parallel,
A method wherein the main processor determines the number of ports of the node from the connectable number of the sub processors, selects a multicast tree having the minimum number of hops corresponding to the number of ports, and transfers predetermined data to each sub processor along the multicast tree.
[0054]
(Appendix 2) In Appendix 1,
A method wherein the main processor stores in advance a plurality of multicast trees giving the minimum number of hops, selects one multicast tree having the maximum number of sub-processors determined by the number of ports, further groups the multicast tree by the number of hops, and determines a path within each group.
[0055]
(Appendix 3) In Appendix 2,
A method wherein, when the number of connectable sub-processors is N and the number of ports is p, the maximum number of sub-processors Nmax is the largest number closest to 2ⁿ (n is a natural number) given by (p−1)², the grouping number Ngr is (log₂Nmax) + 1, and the number of sub-processors in each group Nsub is 2 ** (log₂Nmax − group number) (** indicates a power).
[0056]
(Appendix 4) In any one of Appendices 1 to 3,
A method in which the main processor determines an access order of sub-processors for each group, and determines paths between the main processor and sub-processors and between sub-processors via the node.
[0057]
(Appendix 5) In any one of Appendices 1 to 4,
A method wherein the node is a switching hub.
(Appendix 6)
A main processor;
Parallel configurable nodes,
A plurality of sub-processors connected to the main processor via the node,
A data transfer apparatus wherein the main processor determines the number of ports of the node from the connectable number of the sub processors, selects a multicast tree having the minimum number of hops corresponding to the number of ports, and transfers predetermined data to each sub processor along the multicast tree.
[0058]
(Appendix 7) In Appendix 6,
A data transfer apparatus wherein the main processor stores in advance a plurality of multicast trees giving the minimum number of hops, selects one multicast tree having the maximum number of sub-processors determined by the number of ports, further groups the multicast tree by the number of hops, and determines a path within each group.
[0059]
(Appendix 8) In Appendix 7,
A data transfer apparatus wherein, when the number of connectable sub-processors is N and the number of ports is p, the maximum number of sub-processors Nmax is the largest number closest to 2ⁿ (n is a natural number) given by (p−1)², the grouping number Ngr is (log₂Nmax) + 1, and the number of sub-processors in each group Nsub is 2 ** (log₂Nmax − group number) (** indicates a power).
[0060]
(Appendix 9) In any one of Appendices 6 to 8,
A data transfer apparatus, wherein the main processor determines an access order of sub-processors for each group, and determines paths between the main processor and sub-processors and between sub-processors via the node.
[0061]
(Appendix 10) In any one of Appendices 6 to 9,
A data transfer apparatus, wherein the node is a switching hub.
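The Nmax and Ngr formulas of Appendices 3 and 8 can be sketched as follows. The sketch assumes that "the largest number closest to 2ⁿ given by (p−1)²" means the largest power of two not exceeding (p−1)²; this reading and the function names are assumptions. Nsub is left out because the starting index of the group number is not stated in this section.

```python
def max_sub_processors(ports: int) -> int:
    """Nmax: largest 2**n (n >= 1) not exceeding (ports - 1) ** 2.

    The "not exceeding" reading of the appendix wording is an assumption.
    """
    limit = (ports - 1) ** 2
    if limit < 2:
        raise ValueError("ports must be at least 3")
    return 1 << (limit.bit_length() - 1)


def group_count(n_max: int) -> int:
    """Ngr = log2(Nmax) + 1, for Nmax a power of two."""
    return (n_max.bit_length() - 1) + 1


n_max = max_sub_processors(5)
print(n_max, group_count(n_max))  # 16 5
```

With p = 5, which is consistent with the four second-level switching hubs 2_21 to 2_24 of the 16-sub-processor example, the sketch gives Nmax = 16 and Ngr = 5, in line with the five hops described for that example.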
[0062]
[Effect of the invention]
As described above, according to the data transfer method and apparatus according to the present invention, the following effects are obtained.
[0063]
(1) By adopting the present invention, it is possible to greatly reduce the time for downloading data compared to the conventional example. This becomes more effective as the number of sub-processors increases.
  Comparison between the conventional example and the present invention is as follows (indicated by the number of hops), and it can be seen that the effect is great.
・ Number of sub processors = 16 → Conventional example = 16 hops, Invention = 5 hops → 3.2 times
・ Number of sub processors = 32 → Conventional example = 32 hops, Invention = 7 hops → 4.5 times
・ Number of sub processors = 64 → Conventional example = 64 hops, Invention = 9 hops → 7.1 times
・ Number of sub processors = n → the processing capability continues to improve exponentially in the same manner as above.
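The ratios in the comparison above follow directly from dividing the conventional hop count by the hop count of the invention. The short sketch below recomputes them from the hop counts listed in the text; the function name `speedup` is hypothetical.

```python
def speedup(conventional_hops: int, invention_hops: int) -> float:
    """Ratio of conventional hop count to the invention's hop count."""
    return conventional_hops / invention_hops


# Hop counts as listed in the comparison: {sub processors: invention hops}.
# The conventional example needs one hop per sub processor.
cases = {16: 5, 32: 7, 64: 9}
for n, hops in cases.items():
    print(f"{n} sub processors: {n}/{hops} = {speedup(n, hops):.2f} times")
```

This prints 3.20, 4.57, and 7.11, so the "4.5 times" figure for 32 sub processors appears to be truncated rather than rounded.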
[Brief description of the drawings]
FIG. 1 is a block diagram showing a conceptual configuration of a data transfer method and apparatus according to the present invention.
FIG. 2 is a block diagram showing a multiprocessor system when 16 subprocessors are used in the conceptual configuration shown in FIG. 1;
FIG. 3 is a flowchart showing a conceptual operation of a data transfer method and apparatus according to the present invention.
FIG. 4 is a block diagram showing a generalized concept of a multicast tree used in the data transfer method and apparatus according to the present invention.
FIG. 5 is a flowchart showing a concept when tree selection and group count calculation are performed in the data transfer method and apparatus according to the present invention.
FIG. 6 is a diagram showing the concept of grouping in the data transfer method and apparatus according to the present invention.
FIG. 7 is a diagram showing a concept of path connection in the data transfer method and apparatus according to the present invention.
FIG. 8 is a diagram showing a format at the time of data transfer between the main processor and the sub-processor in the data transfer method and apparatus according to the present invention.
FIG. 9 is a block diagram showing a software configuration example of a main processor and a sub processor used in the data transfer method and apparatus according to the present invention.
FIG. 10 is a diagram showing an example of a data transfer sequence in the data transfer method and apparatus according to the present invention.
FIG. 11 is a diagram showing a calculation example of the number of groups when 64 sub-processors are used in the data transfer method and apparatus according to the present invention.
FIG. 12 is a diagram showing grouping in the multicast tree.
FIG. 13 is a diagram showing an example in which path connection is performed after grouping.
FIG. 14 is a diagram showing data transmission / reception timing after the path connection is performed.
FIG. 15 is a block diagram showing a conceptual configuration example of a conventional example.
FIG. 16 is a diagram showing a data transfer sequence of a conventional example.
[Explanation of symbols]
1 Main processor
2 (2_1, 2_21 to 2_2P-1) nodes (switching hubs)
3 (3_1-3_n) subprocessors
4 Monitoring terminal
In the drawings, the same reference numerals indicate the same or corresponding parts.

Claims

A method of transferring data from a main processor to a plurality of sub-processors via nodes that can be set in parallel,
A method wherein the main processor stores in advance a plurality of multicast trees each giving a minimum number of hops, determines the number of ports of the node from the connectable number of the sub-processors, selects the multicast tree having the minimum number of hops corresponding to the number of ports, and, in this multicast tree, transfers predetermined data to each sub-processor via the nodes along paths determined so as to maintain the minimum number of hops within each group grouped by the minimum number of hops.

A main processor;
Parallel configurable nodes,
A plurality of sub-processors connected to the main processor via the node,
A data transfer apparatus wherein the main processor stores in advance a plurality of multicast trees each giving a minimum number of hops, determines the number of ports of the node from the connectable number of the sub-processors, selects the multicast tree having the minimum number of hops corresponding to the number of ports, and, in this multicast tree, transfers predetermined data to each sub-processor via the nodes along paths determined so as to maintain the minimum number of hops within each group grouped by the minimum number of hops.