JP5084197B2

JP5084197B2 - Processor node system and processor node cluster system

Info

Publication number: JP5084197B2
Application number: JP2006217953A
Authority: JP
Inventors: 英幸斎藤; 和由堀江
Original assignee: Sony Interactive Entertainment Inc; Sony Computer Entertainment Inc
Current assignee: Sony Interactive Entertainment Inc
Priority date: 2006-08-10
Filing date: 2006-08-10
Publication date: 2012-11-28
Anticipated expiration: 2026-08-10
Also published as: JP2008041027A

Description

The present invention relates to a processor node system and a processor node cluster system in which a plurality of processors are interconnected.

Various peripheral devices are connected to a personal computer or server via a PCI (Peripheral Component Interconnect) bus to form an information processing system. Because the input/output bus of the processor and the PCI bus, which is the input/output bus of the peripheral devices, follow different standards, the processor and the peripheral devices are usually connected via a bridge.

To extend the functions and enhance the performance of an information processing system, a graphics processor or a high-speed memory device is sometimes connected as a PCI device, and there is demand for connecting more peripheral devices via the PCI bus. For this reason, a PCI Express (trademark or registered trademark) switch is used to connect a plurality of devices to one processor. In addition, an ultra-high-speed interface technology called Infiniband is sometimes used to interconnect a plurality of processor nodes or to interconnect a processor node and a device.

A cluster system in which a plurality of processor nodes are interconnected using 10 Gigabit Ethernet (trademark or registered trademark) or Infiniband technology has the advantage of high-speed communication between processors. However, because the switches are still expensive, it is difficult to offer such a cluster system at a low price, and there is a limit to how far the number of processor nodes in the cluster can be increased. Furthermore, Ethernet (trademark or registered trademark) and Infiniband have the drawback of large software overhead for packet generation, protocol processing, and the like.

The present invention has been made in view of these problems. Its object is to provide a technique that couples a plurality of processors by inexpensive means to achieve high-speed inter-processor communication, and to provide a processor node system and a processor node cluster system that use the technique.

To solve the above problems, a processor node system according to one aspect of the present invention includes a plurality of processor boards, each carrying a processor and a bridge that relays data between the input/output bus of the processor and PCI Express, to which peripheral devices are connected. A port of the bridge can be set either to a root complex mode, in which the processor acts as a host, or to an endpoint mode, in which the processor acts as a peripheral device. The plurality of processor boards are interconnected by connecting a port set to the root complex mode on the bridge of one processor board to a port set to the endpoint mode on the bridge of another processor board.

The PCI Express connector provided at the port set to the root complex mode on the bridge of the one processor board and the PCI Express connector provided at the port set to the endpoint mode on the bridge of the other processor board may be wired together by a flexible board.

A single backplane board may further be provided to interconnect the PCI Express connector provided at the port set to the root complex mode on the bridge of the one processor board and the PCI Express connector provided at the port set to the endpoint mode on the bridge of the other processor board.

Another aspect of the present invention is also a processor node system. This processor node system includes four processor boards, each carrying two sets of a processor and a bridge that relays data between the input/output bus of the processor and PCI Express, to which peripheral devices are connected. Each bridge has a port set to a root complex mode, in which the processor acts as a host, and a port set to an endpoint mode, in which the processor acts as a peripheral device. On the condition that a port set to the root complex mode on one processor board is connected to a port set to the endpoint mode on the bridge of another processor board, any two of the four processor boards are interconnected using three of the total of four ports on each processor board.

The four processor boards may be installed in a housing with their board surfaces parallel to one another, with the PCI Express connectors provided at the bridge ports of each processor board arranged at the rear of the housing, and the PCI Express connectors of the processor boards arranged at the rear of the housing may be connected to one another by flexible boards.

The four processor boards may be installed in a housing with their board surfaces parallel to one another, with the PCI Express connectors provided at the bridge ports of each processor board arranged at the rear of the housing, and a single backplane board may further be provided to interconnect the PCI Express connector provided at the port set to the root complex mode on the bridge of the one processor board and the PCI Express connector provided at the port set to the endpoint mode on the bridge of the other processor board.

Yet another aspect of the present invention is a processor node cluster system. This processor node cluster system includes a plurality of processor node systems. The plurality of processor node systems are interconnected by connecting, between two adjacent processor node systems, free ports that are not used for connections between processor boards within each processor node system.

Any combination of the above constituent elements, and any conversion of the expression of the present invention between a method, an apparatus, a system, a computer program, a data structure, a recording medium, and the like, is also effective as an aspect of the present invention.

According to the present invention, an inexpensive and high-performance system can be configured by interconnecting a plurality of processors.

The cluster system according to the embodiment is configured by tightly coupling boards on which processors are mounted using flexible boards. The configuration of each processor board is described with reference to FIG. 1, and the configuration of a node in which four processor boards are tightly coupled by flexible boards is described with reference to FIG. 2. A cluster system configured by linking a plurality of nodes with flexible boards is described with reference to FIG. 3. The manner in which processor boards are connected by flexible boards is described with reference to FIGS. 4 to 8.

FIG. 1 is a configuration diagram of the processor board 50. Two multicore processors (hereinafter "MCPs") 20 and 21 are mounted on the processor board 50. Each of the MCPs 20 and 21 integrates a plurality of processor cores in one package and includes, as processor cores, one processing element (PE) and a plurality of sub-processing elements (SPEs). The PE has a cache memory and performs information processing while caching data read from the DRAM 10. The PE also exercises overall control of the entire MCP 20 or 21. Each SPE has its own internal local memory and performs information processing while reading and writing data in that local memory. The SPEs operate asynchronously.

The two MCPs 20 and 21 are connected to each other via an input/output interface (hereinafter "IOIF") 64, which enables high-speed data communication. In addition, the MCPs 20 and 21 are connected to the upstream ports of the bridges 30 and 31 via the IOIFs 62 and 63, respectively. Various peripheral devices and other processor boards are connected to the downstream ports of the bridges 30 and 31 via PCI Express 66 and 67.

Here, PCI Express 66 and 67 follow the PCI Express (trademark or registered trademark) specification, but this is not intended as a limitation to the current PCI Express standard. They may conform to the current PCI Express standard or to a standard that further extends or develops it. Peripheral devices and other processor boards connected via PCI Express 66 and 67 are hereinafter called "PCI devices".

The IOIFs 62, 63, and 64 each have two channels, one upstream and one downstream, and achieve a high bandwidth comparable to that of a memory bus, for example several tens of gigabytes per second. A predetermined memory area of each of the MCPs 20 and 21 is memory-mapped into an I/O address space that can be referenced via the IOIFs 62, 63, and 64. Each of the MCPs 20 and 21 can access, via the IOIFs 62, 63, and 64, the memory area of another MCP mapped into the I/O address space, which realizes high-speed inter-processor communication.

Each of the bridges 30 and 31 bridges the IOIFs 62 and 63 and PCI Express 66 and 67, thereby interconnecting the MCPs 20 and 21 and the PCI devices. Because the IOIFs 62 and 63 and PCI Express 66 and 67 follow different bus standards, the bridges 30 and 31 convert protocols between the two buses and adapt the format of the data exchanged between the MCPs 20 and 21 and the PCI devices to the specification of each bus.

When further PCI devices are connected via PCI Express beyond the PCI devices connected to PCI Express 66 and 67, a tree structure of PCI devices is formed, with the MCP 20 or 21 as the root and PCI devices at the leaves. This tree structure of PCI devices is hereinafter called a "PCI tree".

Each of the bridges 30 and 31 has two downstream ports; one can be configured and used as a root complex (RC) and the other as an endpoint (EP). Alternatively, a single port may be configured so that it can be switched between the root complex mode and the endpoint mode. When a downstream port of the bridge 30 or 31 is used as a root complex, the MCP 20 or 21 becomes the root of a PCI tree and functions as a host to which PCI devices are connected. When a downstream port of the bridge 30 or 31 is used as an endpoint, the MCP 20 or 21 functions as a PCI device connected to a host.
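The pairing rule described above can be sketched as a small Python model. This is an illustration only, not part of the patent: the names `PortMode`, `BridgePort`, `can_link`, and `host_of` are hypothetical, and the model assumes a link is valid only between a root complex port and an endpoint port on different processor boards.

```python
from dataclasses import dataclass
from enum import Enum


class PortMode(Enum):
    ROOT_COMPLEX = "RC"  # the processor behind this port acts as the host
    ENDPOINT = "EP"      # the processor behind this port acts as a PCI device


@dataclass(frozen=True)
class BridgePort:
    board: int      # processor board index (hypothetical field)
    bridge: int     # bridge index on that board (hypothetical field)
    mode: PortMode


def can_link(a: BridgePort, b: BridgePort) -> bool:
    """A link needs exactly one RC port and one EP port on different boards."""
    modes = {a.mode, b.mode}
    return a.board != b.board and modes == {PortMode.ROOT_COMPLEX, PortMode.ENDPOINT}


def host_of(a: BridgePort, b: BridgePort) -> BridgePort:
    """Return the port whose processor becomes the root of the PCI tree."""
    if not can_link(a, b):
        raise ValueError("ports cannot be linked")
    return a if a.mode is PortMode.ROOT_COMPLEX else b
```

With this model, the processor behind the RC port is the host regardless of the order in which the two ports are passed to `host_of`.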

The downstream ports of the bridges 30 and 31 are provided with root complex connectors 40R and 41R and endpoint connectors 40E and 41E. In this embodiment, of the total of four connectors 40R, 40E, 41R, and 41E provided on the processor board 50, three are used for connections to the other three processor boards in the same node, and the remaining one is used for connection to a processor board of another node.

FIG. 2 is a configuration diagram of the node 100, in which four processor boards are tightly coupled. The node 100 consists of a first processor board 50, a second processor board 51, a third processor board 52, and a fourth processor board 53 connected to one another by flexible boards. The configuration of each of the processor boards 50 to 53 is as described with reference to FIG. 1.

Hereinafter, the first processor board 50, second processor board 51, third processor board 52, and fourth processor board 53 are called "processor board 0", "processor board 1", "processor board 2", and "processor board 3", respectively.

The two MCPs 20 and 21 mounted on processor board 0 are called "MCP0" and "MCP1", respectively, and the bridges 30 and 31 connected to MCP0 and MCP1 are called "bridge 0" and "bridge 1", respectively. Similarly, the two MCPs 22 and 23 mounted on processor board 1 are called "MCP2" and "MCP3", and the bridges 32 and 33 connected to MCP2 and MCP3 are called "bridge 2" and "bridge 3". The two MCPs 24 and 25 mounted on processor board 2 are called "MCP4" and "MCP5", and the bridges 34 and 35 connected to MCP4 and MCP5 are called "bridge 4" and "bridge 5". The two MCPs 26 and 27 mounted on processor board 3 are called "MCP6" and "MCP7", and the bridges 36 and 37 connected to MCP6 and MCP7 are called "bridge 6" and "bridge 7".

The IOIF that interconnects the two MCPs on each processor board is called "IOIF0", and the IOIF between an MCP and the upstream port of its bridge is called "IOIF1".

The RC connector and the EP connector of bridge 0 are called "connector RC0" and "connector EP0", respectively. Similarly, the RC connectors of bridges 1 to 7 are called "connector RC1" to "connector RC7", and the EP connectors of bridges 1 to 7 are called "connector EP1" to "connector EP7".

The connector RC0 of bridge 0 on the MCP0 side of processor board 0 is connected to the connector EP3 of bridge 3 on the MCP3 side of processor board 1. With this connection, as seen from MCP0 of processor board 0, MCP0 functions as the root complex and MCP3 of processor board 1 functions as an endpoint. That is, MCP0 of processor board 0 is the host, processor board 1 is connected to it as a PCI device, and a PCI tree is formed with MCP0 as the root and MCP3 attached to it.

The connector RC1 of bridge 1 on the MCP1 side of processor board 0 is connected to the connector EP4 of bridge 4 on the MCP4 side of processor board 2. As seen from MCP1 of processor board 0, which is the root complex, a PCI tree is formed with MCP1 of processor board 0 as the host and processor board 2 as the device.

The connector EP1 of bridge 1 on the MCP1 side of processor board 0 is connected to the connector RC6 of bridge 6 on the MCP6 side of processor board 3. As seen from MCP6 of processor board 3, which is the root complex, a PCI tree is formed with MCP6 of processor board 3 as the host and processor board 0 as the device.

Similarly, the connector RC2 of bridge 2 on the MCP2 side of processor board 1 is connected to the connector EP5 of bridge 5 on the MCP5 side of processor board 2, and the connector EP2 of bridge 2 is connected to the connector RC7 of bridge 7 on the MCP7 side of processor board 3. The connector RC4 of bridge 4 on the MCP4 side of processor board 2 is connected to the connector EP7 of bridge 7 on the MCP7 side of processor board 3.

The bridge connectors that are not used for connections between processor boards within the node 100, namely the connector EP0 of bridge 0 on processor board 0, the connector RC3 of bridge 3 on processor board 1, the connector RC5 of bridge 5 on processor board 2, and the connector EP6 of bridge 6 on processor board 3, serve as free slots used for connection to processor boards of other nodes.
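The intra-node wiring described above can be checked mechanically. The following Python sketch is an illustration, not part of the patent: it lists the six RC-to-EP links by bridge index as given in the text, assumes that bridge i sits on processor board i // 2 (which matches the numbering of bridges 0 to 7 on boards 0 to 3), and uses hypothetical names such as `INTRA_NODE_LINKS` and `check_node`.

```python
from itertools import combinations

# (bridge index of the RC connector, bridge index of the EP connector),
# e.g. (0, 3) means connector RC0 is wired to connector EP3.
INTRA_NODE_LINKS = [(0, 3), (1, 4), (6, 1), (2, 5), (7, 2), (4, 7)]


def board_of(bridge: int) -> int:
    """Bridges 0-1 are on board 0, bridges 2-3 on board 1, and so on."""
    return bridge // 2


def check_node(links):
    """Validate the wiring and return the connectors left free."""
    used = set()
    pairs = set()
    for rc, ep in links:
        for connector in (("RC", rc), ("EP", ep)):
            if connector in used:
                raise ValueError(f"connector used twice: {connector}")
            used.add(connector)
        a, b = board_of(rc), board_of(ep)
        if a == b:
            raise ValueError("link must join two different boards")
        pairs.add(frozenset((a, b)))
    # Every pair of the four boards must be directly connected.
    expected = {frozenset(p) for p in combinations(range(4), 2)}
    if pairs != expected:
        raise ValueError("not every pair of boards is connected")
    every = {(kind, i) for kind in ("RC", "EP") for i in range(8)}
    return sorted(f"{kind}{i}" for kind, i in every - used)


free_connectors = check_node(INTRA_NODE_LINKS)
print(free_connectors)  # ['EP0', 'EP6', 'RC3', 'RC5']
```

The result agrees with the text: each board uses three of its four connectors inside the node, and EP0, RC3, RC5, and EP6 remain free for connections to other nodes.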

FIG. 3 is a configuration diagram of a cluster system 200 in which a plurality of nodes are linked. The cluster system 200 consists of nodes 100 to 102, 110 to 112, and 120 to 122, each having the configuration described with reference to FIG. 2, linked vertically and horizontally. For example, the node 101 is connected to the right of the node 100, and the node 102 is connected further to the right of the node 101. The node 110 is connected below the node 100, and the node 120 is connected further below the node 110.

As shown in the figure, two nodes side by side are coupled by connecting the connector EP6 of processor board 3 of the left node to the connector RC3 of processor board 1 of the right node. Two nodes one above the other are coupled by connecting the connector RC5 of processor board 2 of the upper node to the connector EP0 of processor board 0 of the lower node.
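The two coupling rules above are enough to enumerate every inter-node link in a rectangular grid of nodes. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and `node_id` simply mimics the reference numerals of FIG. 3 (100, 101, ... in the first row, 110, 111, ... in the second), so it is only meaningful for grids with at most ten columns.

```python
def node_id(row: int, col: int) -> int:
    """Hypothetical numbering that mirrors the reference numerals in FIG. 3."""
    return 100 + 10 * row + col


def inter_node_links(rows: int, cols: int):
    """List ((node, connector), (node, connector)) pairs for a rows x cols grid."""
    links = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                # Left node EP6 <-> right node RC3
                links.append(((node_id(r, c), "EP6"), (node_id(r, c + 1), "RC3")))
            if r + 1 < rows:
                # Upper node RC5 <-> lower node EP0
                links.append(((node_id(r, c), "RC5"), (node_id(r + 1, c), "EP0")))
    return links


def free_edge_connectors(rows: int, cols: int):
    """Connectors left unused at the edges of the grid."""
    used = {end for link in inter_node_links(rows, cols) for end in link}
    every = {
        (node_id(r, c), name)
        for r in range(rows)
        for c in range(cols)
        for name in ("EP0", "RC3", "RC5", "EP6")
    }
    return sorted(every - used)
```

For a 2 x 2 grid, `inter_node_links(2, 2)` yields four links, including node 100 EP6 to node 101 RC3 and node 100 RC5 to node 110 EP0. The connectors returned by `free_edge_connectors` are those on the sides of edge nodes that have no neighbor.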

In the cluster system 200, the connectors of a node at the edge, on the side where no adjacent node exists, are free slots. The system can be expanded by connecting various peripheral devices or further nodes to these free slots.

In this way, the cluster system 200 has the advantage that the number of nodes can be increased freely, because the nodes are arranged on a plane and coupled vertically and horizontally.

In the cluster system 200, flexible boards are used for the connections between the four processor boards in each node and for the connections between nodes. Connection using flexible boards is described below with reference to FIGS. 4 to 8.

FIG. 4 is a schematic diagram of the wiring on the back surface of the processor board 50. The figure shows the wiring between MCP0 and a plurality of DRAMs 10, the wiring between MCP0 and bridge 0, and the wiring between bridge 0 and the connectors RC0 and EP0. It also shows the wiring between MCP1 and a plurality of DRAMs 11, the wiring between MCP1 and bridge 1, and the wiring between bridge 1 and the connectors RC1 and EP1. Each of the connectors RC0, EP0, RC1, and EP1 is a PCI-Express x16 connector to which a flexible board can be connected.

FIG. 5 shows a configuration in which the four processor boards 50 to 53 in the node 100 are connected by flexible boards. A flexible board is a type of printed wiring board, also called an FPC (Flexible Printed Circuit), and is thin and bendable.

To make connection by flexible boards easier, the processor boards 50 to 53 (processor boards 0 to 3) described with reference to FIG. 2 are arranged with their board surfaces parallel to one another in the following order: processor board 1 (reference numeral 51), processor board 0 (reference numeral 50), processor board 2 (reference numeral 52), and processor board 3 (reference numeral 53).

The connector RC2 of processor board 1 is connected to the connector EP5 of processor board 2 by the flexible board 201. The connector EP2 of processor board 1 is connected to the connector RC7 of processor board 3 by the flexible board 202. The connector EP3 of processor board 1 is connected to the connector RC0 of processor board 0 by the flexible board 203.

The connector RC1 of processor board 0 is connected to the connector EP4 of processor board 2 by the flexible board 204. The connector EP1 of processor board 0 is connected to the connector RC6 of processor board 3 by the flexible board 205. The connector RC4 of processor board 2 is connected to the connector EP7 of processor board 3 by the flexible board 206.

The flexible board 202, which connects the connector EP2 of processor board 1 and the connector RC7 of processor board 3, passes over the flexible board 204, which connects the connector RC1 of processor board 0 and the connector EP4 of processor board 2. Using flexible boards in this way allows one wiring to pass over another, so that four processor boards can be arranged in parallel and tightly coupled to one another while saving space.
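The crossing of flexible boards 202 and 204 follows from the physical order of the boards. The Python sketch below is a deliberately simplified, one-dimensional illustration: it looks only at which slots each flexible board spans in the order board 1, board 0, board 2, board 3, and ignores where the connectors sit along each board edge. The names `SLOT_ORDER`, `FLEX_BOARDS`, `slot_span`, and `nested_pairs` are hypothetical.

```python
# Physical order of the processor boards in the housing, as described in the text.
SLOT_ORDER = [1, 0, 2, 3]

# Flexible board reference numeral -> the two processor boards it joins.
FLEX_BOARDS = {
    201: (1, 2),
    202: (1, 3),
    203: (1, 0),
    204: (0, 2),
    205: (0, 3),
    206: (2, 3),
}


def slot_span(boards):
    """Return the (leftmost, rightmost) slot positions joined by one flexible board."""
    slots = sorted(SLOT_ORDER.index(b) for b in boards)
    return slots[0], slots[1]


def nested_pairs():
    """Find (outer, inner) flexible boards where the inner span lies strictly inside the outer."""
    spans = {numeral: slot_span(boards) for numeral, boards in FLEX_BOARDS.items()}
    result = []
    for outer, (lo, hi) in spans.items():
        for inner, (ilo, ihi) in spans.items():
            if lo < ilo and ihi < hi:
                result.append((outer, inner))
    return result


print(nested_pairs())  # [(202, 204)]
```

In this simplified model, flexible board 202 is the only one whose span strictly encloses another (204), which is consistent with the description of 202 passing over 204.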

In addition, because the processor boards are connected using general-purpose PCI Express connectors and flexible boards, this configuration is far less expensive than one in which the processor boards are interconnected by a PCI Express switch or the like, and the manufacturing cost can be reduced.

Furthermore, increasing the component mounting density of the processor board and making the processor board smaller allows the processor boards to be interconnected with shorter flexible boards, which makes it possible to handle high-speed signals. PCI-Express assumes high-speed communication; with cable connections, signal propagation is slow, and it is difficult to achieve tight coupling between processor boards by cable. In this embodiment, the processor boards are wired together with flexible boards, so high-speed signals can be propagated.

FIG. 6 shows a configuration in which a plurality of nodes in the cluster system 200 are connected by flexible boards. The figure shows how the four adjacent nodes 100, 101, 110, and 111 of FIG. 3 are connected. The four processor boards in each of the nodes 100, 101, 110, and 111 are connected by flexible boards as described with reference to FIG. 5. For the node 110, however, the flexible boards connecting the processor boards within the node are not shown, so that the connections between nodes are easier to see.

The connector EP6 of processor board 3 of the node 100 is connected to the connector RC3 of processor board 1 of the node 101 by the flexible board 211. This couples the two nodes 100 and 101 in the horizontal direction. Similarly, the connector EP6 of processor board 3 of the node 110 is connected to the connector RC3 of processor board 1 of the node 111 by the flexible board 221, coupling the two nodes 110 and 111 in the horizontal direction.

The connector RC5 of processor board 2 of the node 100 is connected to the connector EP0 of processor board 0 of the node 110 by the flexible board 214. This couples the two nodes 100 and 110 in the vertical direction. Similarly, the connector RC5 of processor board 2 of the node 101 is connected to the connector EP0 of processor board 0 of the node 111 by the flexible board 224, coupling the two nodes 101 and 111 in the vertical direction.

In the cluster system 200, flexible boards are also used for the connections between nodes, which saves space and reduces cost. Because the cluster system 200 arranges and connects a plurality of nodes vertically and horizontally on a plane, the distance between adjacent nodes can be kept short, the flexible boards used for inter-node connections can be made sufficiently short, and high-speed PCI-Express signals can be handled.

FIG. 7 illustrates the housing of the node 100. The housing of the node 100 contains the four processor boards 50 to 53, and the connectors at the rear are connected by the six flexible boards 201 to 206 as described with reference to FIG. 5. In addition, as described with reference to FIG. 6, processor board 0 is provided with a flexible board 213 for connection to processor board 2 of the node adjacent above, and processor board 2 is provided with a flexible board 214 for connection to processor board 0 of the node adjacent below. Likewise, processor board 1 is provided with a flexible board 212 for connection to processor board 3 of the node adjacent on the left, and processor board 3 is provided with a flexible board 211 for connection to processor board 1 of the node adjacent on the right.

FIG. 8 is a diagram illustrating the housing of the cluster system 200. The housings of the node 100 shown in FIG. 7 are arranged vertically and horizontally; the flexible boards 211 and 212 described with reference to FIG. 7 connect nodes in the horizontal direction, and the flexible boards 213 and 214 connect nodes in the vertical direction. The cluster system 200 can thus be built easily by arranging node housings on a plane and connecting them with flexible boards. Nodes are also easy to add, so the system is scalable, and a node cluster combining a large number of nodes can be provided in a small space and at low cost.
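The planar arrangement can be sketched as a small wiring plan. The Python sketch below assumes a rectangular grid of nodes addressed by (row, column) with row 0 at the top; the grid coordinates and the name `inter_node_links` are illustrative assumptions, while the board pairings follow the description of FIG. 7.

```python
# Board pairing per direction, following the description of FIG. 7:
# direction -> (processor board on this node, processor board on the neighbor).
BOARD_PAIRS = {
    "down": (2, 0),   # board 2 connects to board 0 of the node below
    "right": (3, 1),  # board 3 connects to board 1 of the node on the right
}
OFFSETS = {"down": (1, 0), "right": (0, 1)}


def inter_node_links(rows, cols):
    """List every inter-node connection of a rows x cols grid exactly once.

    Only "down" and "right" are visited, so each physical link (which the
    neighbor sees as "up" or "left") is counted a single time.
    """
    links = []
    for r in range(rows):
        for c in range(cols):
            for direction, (dr, dc) in OFFSETS.items():
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    board, peer_board = BOARD_PAIRS[direction]
                    links.append(((r, c), board, (nr, nc), peer_board))
    return links


links = inter_node_links(2, 2)  # a hypothetical 2 x 2 cluster of four nodes
```

Under these assumptions a 2 x 2 cluster needs four inter-node flexible boards, and adding a row or column only adds links along the new edge, which is the scalability property the text describes.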

FIGS. 4 to 8 described a form in which the processor boards are provided with connectors for flexible boards and these connectors are connected by flexible boards. Connecting general-purpose PCI Express connectors with flexible boards in this way is only one example of a connection form, and other forms are possible. As another connection form, a single backplane board carrying general-purpose PCI Express connectors that accept the card edges of the processor boards may be prepared, and the four processor boards may be inserted into it so that the connections between PCI Express connectors described with reference to FIG. 5 are realized on the backplane board. As yet another connection form, the processor boards may be provided with ZD connectors, which are connector pairs for differential signals, and the ZD connectors may be connected on the backplane board. Like the connection form using flexible boards, a connection form using such a backplane board can realize inexpensive high-speed communication and save space.

With reference to FIGS. 9A to 9D, the PCI trees formed by the full-mesh coupling of the four processor boards in the node 100 described with reference to FIG. 2 will now be described. A PCI tree is obtained when each MCP in the node searches for the PCI devices connected to it via PCI Express.

FIG. 9A is a diagram illustrating the PCI trees as seen with MCP0 or MCP1 at the center. In the figure, MCPs at the same level of the PCI tree structure are arranged horizontally, with those closer to the root placed higher and those closer to the leaves placed lower.

MCP3 is connected as an endpoint at the level immediately below MCP0, which is a root complex. MCP2 is at the same level as MCP3 and is connected to MCP3. This forms a first PCI tree rooted at MCP0. MCP1 is at the same level as MCP0 and is connected to MCP0. MCP4 is connected as an endpoint at the level immediately below MCP1, which is a root complex. MCP5 is at the same level as MCP4 and is connected to MCP4. This forms a second PCI tree rooted at MCP1. MCP6, which is a root complex, is at the level immediately above MCP1 and has MCP1 connected to it as an endpoint. MCP7 is at the same level as MCP6 and is connected to MCP6. This forms a third PCI tree rooted at MCP6.

FIG. 9B is a diagram illustrating the PCI trees as seen with MCP2 or MCP3 at the center. MCP5 is connected as an endpoint at the level immediately below MCP2, which is a root complex. MCP4 is at the same level as MCP5 and is connected to MCP5. This forms a first PCI tree rooted at MCP2. MCP7, which is a root complex, is at the level immediately above MCP2 and has MCP2 connected to it as an endpoint. MCP6 is at the same level as MCP7 and is connected to MCP7. This forms a second PCI tree rooted at MCP7. MCP3 is at the same level as MCP2 and is connected to MCP2. Endpoints of other nodes are connected at the level immediately below MCP3, which is a root complex. MCP0, which is a root complex, is at the level immediately above MCP3 and has MCP3 connected to it as an endpoint. MCP1 is at the same level as MCP0 and is connected to MCP0. This forms a third PCI tree rooted at MCP0.

FIG. 9C is a diagram illustrating the PCI trees as seen with MCP4 or MCP5 at the center. MCP7 is connected as an endpoint at the level immediately below MCP4, which is a root complex. MCP6 is at the same level as MCP7 and is connected to MCP7. This forms a first PCI tree rooted at MCP4. MCP1, which is a root complex, is at the level immediately above MCP4 and has MCP4 connected to it as an endpoint. MCP0 is at the same level as MCP1 and is connected to MCP1. This forms a second PCI tree rooted at MCP1. MCP5 is at the same level as MCP4 and is connected to MCP4. Endpoints of other nodes are connected at the level immediately below MCP5, which is a root complex. MCP2, which is a root complex, is at the level immediately above MCP5 and has MCP5 connected to it as an endpoint. MCP3 is at the same level as MCP2 and is connected to MCP2. This forms a third PCI tree rooted at MCP2.

FIG. 9D is a diagram illustrating the PCI trees as seen with MCP6 or MCP7 at the center. MCP1 is connected as an endpoint at the level immediately below MCP6, which is a root complex. MCP0 is at the same level as MCP1 and is connected to MCP1. This forms a first PCI tree rooted at MCP6. MCP7 is at the same level as MCP6 and is connected to MCP6. MCP2 is connected as an endpoint at the level immediately below MCP7, which is a root complex. MCP3 is at the same level as MCP2 and is connected to MCP2. This forms a second PCI tree rooted at MCP7. MCP4, which is a root complex, is at the level immediately above MCP7 and has MCP7 connected to it as an endpoint. MCP5 is at the same level as MCP4 and is connected to MCP4. This forms a third PCI tree rooted at MCP4.
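The connections listed for FIGS. 9A to 9D can be collected into one table and checked mechanically. The Python sketch below is a simplified model, not the patent's own notation: it assumes the two MCPs on one processor board are numbered 2k and 2k+1, and the names `RC_TO_EP`, `companion`, and `reachable_from_pair` are illustrative.

```python
# (root complex, endpoint) links inside the node, as listed for FIGS. 9A to 9D.
RC_TO_EP = [(0, 3), (1, 4), (6, 1), (2, 5), (7, 2), (4, 7)]


def companion(mcp):
    """Return the MCP on the same board (MCP0/MCP1, MCP2/MCP3, and so on)."""
    return mcp ^ 1


def reachable_from_pair(mcp):
    """MCPs covered by the PCI trees drawn around `mcp` and its companion.

    Each tree contributes the MCP at the far end of one root-complex/endpoint
    link together with that MCP's companion at the same level.
    """
    pair = {mcp, companion(mcp)}
    reached = set(pair)
    for root, endpoint in RC_TO_EP:
        if root in pair:
            reached |= {endpoint, companion(endpoint)}
        elif endpoint in pair:
            reached |= {root, companion(root)}
    return reached


coverage = {mcp: reachable_from_pair(mcp) for mcp in range(8)}
```

In this model the three trees around any board cover all eight MCPs of the node, which is consistent with the full-mesh coupling of the four processor boards.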

As described above, connecting the RC connectors and EP connectors among the four processor boards in the node 100 as described with reference to FIG. 2 forms a plurality of PCI trees, each rooted at a particular MCP. Each of MCP0 to MCP7 in the node 100 can perform data communication with the other MCPs either within the PCI tree rooted at itself or by crossing between different PCI trees. An MCP on a processor board of an adjacent node that is connected to a connector in an empty slot of the node 100 is in the same PCI tree, so data communication across nodes is possible. An MCP of another node that is not connected to a connector in an empty slot of the node 100, however, is not in the same PCI tree, so communicating with it requires routing. Each MCP in the node 100 therefore performs routing in software to enable communication with MCPs in other PCI trees.
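The text does not say which routing algorithm the software uses. As one possible illustration only, the Python sketch below finds a shortest hop sequence between nodes by breadth-first search. The vertical links 100-110 and 101-111 follow the description of FIG. 6; the horizontal links 100-101 and 110-111, the table `LINKS`, and the function `route` are assumptions made for the example.

```python
from collections import deque

# Node adjacency of a hypothetical 2 x 2 cluster (node number -> neighbors).
LINKS = {
    100: [110, 101],
    101: [111, 100],
    110: [100, 111],
    111: [101, 110],
}


def route(src, dst):
    """Return a shortest list of nodes from src to dst, or None if unreachable."""
    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in LINKS[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


path = route(100, 111)
```

Here nodes 100 and 111 are not directly connected, so traffic between them has to be relayed through an intermediate node, which is the case the text assigns to software routing.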

So that each MCP in the node 100 can access the predetermined shared areas of the other MCPs, either within its own PCI tree or by crossing between different PCI trees, the predetermined shared areas of the other MCPs are memory-mapped into the memory space of each MCP. This memory mapping is described with reference to FIGS. 10 to 17.

FIG. 10 is a diagram illustrating the memory space 300 of each MCP in the node 100. The memory space 300 includes a coherent local memory area 351 and a non-coherent shared memory area 352. The coherent local memory area 351 is an area in which the atomicity of memory accesses is guaranteed and synchronization control is performed; it cannot be accessed from other MCPs. The non-coherent shared memory area 352 is mapped into the memory spaces of other MCPs and is accessed from them. The memory space 300 further includes a non-coherent area 353 into which the SPE and PE registers of each MCP and the local stores of the SPEs are mapped. At least part of this non-coherent area 353 is mapped into the memory spaces of other MCPs and is accessed from them.

In the memory space 300, the I/O address space accessible via IOIF0 is memory-mapped as the IOIF0 area 360, and the I/O address space accessible via IOIF1 is memory-mapped as the IOIF1 area 370.

Each MCP opens the SPE/PE registers and the SPE local stores contained in the non-coherent area 353 of its own memory space 300, together with the non-coherent shared memory area 352, to other MCPs via IOIF0 as a shared area and permits access to it. Each MCP stores information on the shared area that other MCPs are permitted to access in an I/O page table for IOIF0 (hereinafter “IOPT”) 310. The other MCPs refer to this IOPT 310 and map the shared area into their own memory spaces to make it accessible.

FIG. 11 is a diagram illustrating how the shared area of MCP0 is mapped into the memory space 301 of MCP1 and the shared area of MCP1 is mapped into the memory space 300 of MCP0. When the IOPT 310 for IOIF0 of MCP0 (“IOPT0”) is presented to MCP1 via IOIF0, the shared area 321 of MCP0 specified by IOPT0 is mapped into the IOIF0 area 361 of the memory space 301 of MCP1. The shared area 321 of MCP0 contains the SPE/PE registers of MCP0, the SPE local stores of MCP0, and the shared memory of MCP0.

Conversely, when the IOPT 311 for IOIF0 of MCP1 (“IOPT1”) is presented to MCP0 via IOIF0, the shared area 320 of MCP1 specified by IOPT1 is mapped into the IOIF0 area 360 of the memory space 300 of MCP0. The shared area 320 of MCP1 contains the SPE/PE registers of MCP1, the SPE local stores of MCP1, and the shared memory of MCP1.

In this way, MCP0 and MCP1, which are connected via IOIF0, each have the other's shared area memory-mapped into their own memory spaces 300 and 301, and can therefore access each other's shared area.
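The mutual mapping of FIG. 11 can be modeled in a few lines. The Python sketch below is a toy model under stated assumptions: the class `Mcp`, the region labels, and the region sizes are hypothetical, and a real IOPT holds page-level translation entries rather than the simple tuples used here.

```python
from dataclasses import dataclass, field

# Regions an MCP opens to its peers; the sizes are made-up illustration values.
SHARED_REGIONS = {
    "spe_pe_registers": 0x1000,
    "spe_local_store": 0x40000,
    "shared_memory": 0x100000,
}


@dataclass
class Mcp:
    name: str
    shared: dict
    # Entries mapped into the local IOIF0 area: (offset, owner, region, size).
    ioif0_area: list = field(default_factory=list)

    def iopt_for_ioif0(self):
        """Entries this MCP presents to its peer: (owner, region, size)."""
        return [(self.name, region, size) for region, size in self.shared.items()]

    def map_peer(self, iopt):
        """Map each region named in a peer's IOPT into the local IOIF0 area."""
        offset = sum(size for _, _, _, size in self.ioif0_area)
        for owner, region, size in iopt:
            self.ioif0_area.append((offset, owner, region, size))
            offset += size


mcp0 = Mcp("MCP0", SHARED_REGIONS)
mcp1 = Mcp("MCP1", SHARED_REGIONS)
mcp1.map_peer(mcp0.iopt_for_ioif0())  # IOPT0 presented to MCP1
mcp0.map_peer(mcp1.iopt_for_ioif0())  # IOPT1 presented to MCP0
```

After both calls, each object's `ioif0_area` lists the peer's three regions at consecutive offsets, mirroring how each MCP sees the other's shared area inside its own IOIF0 area.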

FIG. 12 is a diagram illustrating how the shared areas of MCP0 and MCP1, which are interconnected by IOIF0, are mapped into the memory space of another MCP connected via IOIF1.

As described with reference to FIG. 11, the shared area 321 of MCP0 is mapped into the IOIF0 area 361 of the memory space of MCP1. MCP1 opens the shared area 321 of MCP0, together with its own shared area, to the other MCPs connected via IOIF1 and permits access to them. MCP1 stores in the IOPT 331 for IOIF1 the information on its own shared area, namely the SPE/PE registers of MCP1, the local store of MCP1, and the shared memory of MCP1. MCP1 further stores in the IOPT 331 for IOIF1 the information on the shared area 321 of MCP0, namely the SPE/PE registers of MCP0, the local store of MCP0, and the shared memory of MCP0.

As described with reference to FIG. 9A, in the connection between MCP1 and MCP6, MCP6 is the root complex and MCP1 is the endpoint, so MCP1, the endpoint, presents the information on its shared areas to MCP6, the root complex. MCP1 presents the IOPT 331 for IOIF1 to MCP6, which is connected via IOIF1. MCP6 maps the shared area 342 of both MCP0 and MCP1 specified by the IOPT 331 for IOIF1 into the IOIF1 area 376 of its own memory space 306.

FIGS. 13(a) and 13(b) are diagrams illustrating how the shared areas of other MCPs are mapped into the memory space 300 of MCP0 when MCP0 is presented with an IOPT for IOIF1 by another MCP connected via IOIF1.

As shown in FIG. 13(a), MCP0, which is a root complex, is connected via IOIF1 to MCP3, which is an endpoint. Because MCP3 is interconnected with MCP2 by IOIF0, the shared area of MCP2 is mapped into the IOIF0 area of the memory space of MCP3. In the same way as MCP1 presents the IOPT for IOIF1 to MCP6 as described with reference to FIG. 12, MCP3, the endpoint, stores the information on its own shared area and the shared area of MCP2 in an IOPT for IOIF1 and presents it to MCP0, the root complex.

On being presented with the IOPT for IOIF1 by MCP3, MCP0 maps the shared area 340 of MCP2 and MCP3 into the IOIF1 area 370 of the memory space 300, as shown in FIG. 13(b).

FIGS. 14(a) and 14(b) are diagrams illustrating how the shared areas of other MCPs are mapped into the memory space 301 of MCP1 when MCP1 is presented with an IOPT for IOIF1 by another MCP connected via IOIF1.

As shown in FIG. 14(a), MCP1, which is a root complex, is connected via IOIF1 to MCP4, which is an endpoint. Because MCP4 is interconnected with MCP5 by IOIF0, the shared area of MCP5 is mapped into the IOIF0 area of the memory space of MCP4. MCP4, the endpoint, stores the information on its own shared area and the shared area of MCP5 in an IOPT for IOIF1 and presents it to MCP1, the root complex. On being presented with the IOPT for IOIF1 by MCP4, MCP1 maps the shared area 346 of MCP4 and MCP5 into the IOIF1 area 371 of the memory space 301, as shown in FIG. 14(b).

Similarly, MCP6 stores the information on its own shared area and the shared area of MCP7 in an IOPT for IOIF1 and presents it to MCP1. On being presented with the IOPT for IOIF1 by MCP6, MCP1 maps the shared area 348 of MCP6 and MCP7 into the IOIF1 area 371 of the memory space 301, as shown in FIG. 14(b).

Next, MCP0 and MCP1 exchange with each other the information on the shared areas of the other MCPs connected via IOIF1, which are mapped into the IOIF1 areas 370 and 371 of the memory spaces 300 and 301 shown in FIGS. 13(b) and 14(b).

FIG. 15 is a diagram illustrating how MCP0 and MCP1 exchange information on the memory-mapped shared areas. MCP0 gives MCP1, via IOIF0, the information on the shared area 340 of MCP2 and MCP3 mapped into the IOIF1 area 370 of the memory space 300. Based on the information given by MCP0, MCP1 maps the shared area of MCP2 and MCP3 into the IOIF0 area 361 of its own memory space 301.

MCP1, in turn, gives MCP0, via IOIF0, the information on the shared area 346 of MCP4 and MCP5 and the information on the shared area 348 of MCP6 and MCP7, both mapped into the IOIF1 area 371 of the memory space 301. Based on the information given by MCP1, MCP0 maps the shared area of MCP4 and MCP5 and the shared area of MCP6 and MCP7 into the IOIF0 area 360 of its own memory space 300.

Once the shared areas of the other MCPs have been memory-mapped into its memory space by the procedure described with reference to FIGS. 11, 13(b), 14(b), and 15, MCP0 can access the shared areas of MCP1 to MCP7 in the first to third PCI trees described with reference to FIG. 9A. This is because a single address map has been built across the first to third PCI trees. Similarly, MCP1 can access the shared areas of MCP0 and MCP2 to MCP7 in the first to third PCI trees described with reference to FIG. 9A.
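The three-step procedure of FIGS. 11 to 15 can be traced with sets of MCP numbers. The Python sketch below is a simplified model: it tracks only which MCPs' shared areas end up in each IOIF area, not addresses, and it assumes that both ends of every IOIF1 link present an IOPT to each other (the text states this explicitly only for the links to MCP1). The names `IOIF1_LINKS`, `companion`, and `build_maps` are illustrative.

```python
# (root complex, endpoint) pairs connected via IOIF1, from FIGS. 9A to 9D.
IOIF1_LINKS = [(0, 3), (1, 4), (6, 1), (2, 5), (7, 2), (4, 7)]


def companion(mcp):
    """Return the MCP connected via IOIF0 (MCP0/MCP1, MCP2/MCP3, and so on)."""
    return mcp ^ 1


def build_maps():
    """Return (ioif0, ioif1): for each MCP, whose shared areas each area holds."""
    # Step 1 (FIG. 11): IOIF0 partners map each other's shared area.
    ioif0 = {m: {companion(m)} for m in range(8)}
    ioif1 = {m: set() for m in range(8)}
    # Step 2 (FIGS. 12 to 14): over each IOIF1 link, an MCP presents its own
    # shared area together with that of its IOIF0 partner.
    for rc, ep in IOIF1_LINKS:
        ioif1[rc] |= {ep, companion(ep)}
        ioif1[ep] |= {rc, companion(rc)}
    # Step 3 (FIG. 15): IOIF0 partners pass on what sits in their IOIF1 areas.
    for m in range(8):
        ioif0[m] |= ioif1[companion(m)]
    return ioif0, ioif1


ioif0, ioif1 = build_maps()
```

For MCP0 this model yields MCP1, MCP4, MCP5, MCP6, and MCP7 in the IOIF0 area and MCP2 and MCP3 in the IOIF1 area, and every MCP ends up with all seven other MCPs mapped, in line with the single address map across the three PCI trees.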

In this way, each MCP in the node 100 has the shared areas of the other MCPs in the first to third PCI trees memory-mapped into its own memory space. It can therefore access those shared areas and carry out data communication and synchronization control with the other MCPs in the first to third PCI trees through them. The processor boards in the node 100 are connected by flexible boards, a hardware configuration that supports high-speed PCI Express communication. Each MCP in the node 100 can therefore access the memory-mapped shared areas at high speed and exchange data with the other MCPs efficiently.

FIG. 16 is a diagram illustrating the PCI trees including the connection with an MCP of a linked node. The connector EP0 of bridge 0 of MCP0 is connected to the connector RC5 of bridge 5' of the adjacent node, so that MCP5' is a root complex with respect to MCP0. MCP4' is at the same level as MCP5' and is connected to MCP5'. MCP0 is presented with the IOPT for IOIF1 by MCP5' of the adjacent node and maps the shared area 349 of MCP5' and MCP4' into the IOIF1 area 370 of the memory space 300.

FIG. 17 is a diagram illustrating the memory space 300 of MCP0 in the case of the PCI trees of FIG. 16. As shown in FIG. 17, the shared area 349 of MCP4' and MCP5' is memory-mapped into the IOIF1 area 370 in addition to the shared area 340 of MCP2 and MCP3. The shared area 326 of MCP4 and MCP5, the shared area 328 of MCP6 and MCP7, and the shared area 320 of MCP1 are memory-mapped into the IOIF0 area 360.

In summary, a PCI memory map is configured by the host processor at the root of the PCI tree setting the base addresses of devices and switches. A device acting as an endpoint notifies the host processor of the size of the address area it requires, and the host processor builds the memory map according to the requested size. Specifically, the size of the requested address range is specified by the number of bits implemented in the BAR field of the configuration register.
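How the number of implemented BAR bits encodes a size can be shown with the conventional PCI sizing procedure, in which the host writes all ones to a BAR and reads it back; unimplemented low-order address bits read as zero. The Python sketch below decodes such a read-back value for a 32-bit memory BAR. The function name `bar_size` and the sample values are illustrative, and the text itself does not describe this procedure in detail.

```python
def bar_size(readback):
    """Decode the address-range size requested by a 32-bit memory BAR.

    `readback` is the value read from the BAR after writing 0xFFFFFFFF to it.
    """
    address_bits = readback & 0xFFFFFFF0  # the low 4 bits are type/prefetch flags
    if address_bits == 0:
        return 0  # no address bits implemented: the BAR is unused
    # The lowest implemented address bit gives the size (two's complement trick).
    return (~address_bits + 1) & 0xFFFFFFFF


one_mib = bar_size(0xFFF00000)       # upper 12 address bits implemented
sixty_four_kib = bar_size(0xFFFF0008)  # upper 16 bits implemented, prefetchable flag set
```

Implementing fewer address bits therefore requests a larger range, which is the knob the bridge device uses to choose the size of the window it asks each host processor to allocate.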

When the bridge device of this embodiment operates as an endpoint, it implements the configuration register accessible from the outside and the configuration register accessible from the inside as separate registers, and the size of the requested address range, that is, the number of implemented BAR bits, can be set for each register. Each host processor thus builds its PCI address map using address ranges of the sizes set at system initialization. Here, “each host processor” means the processor acting as the root complex and the processor operating as the endpoint.

The IOIF memory map, on the other hand, is set by storing shared-area information in an IOPT and presenting it to other MCPs. This step sets the area to which a memory access arriving from the outside is mapped. It is required whenever an MCP accepts a transaction from PCI and permits it as a memory access, whether the MCP is operating as a root complex or as an endpoint.

In actual operation, the PCI memory map is built with address sizes that leave some margin, and the range within it into which memory is actually mapped is set by the IOPT. As for the address ranges of the PCI memory map, the endpoint reports them and the root complex assigns the addresses in accordance with the PCI Express standard. Information on which ranges within them are mapped to memory or local stores, however, must be exchanged by a proprietary protocol via shared memory, as described with reference to FIG. 15.

As described above, according to this embodiment, directly connecting the general-purpose PCI Express connectors of the processor boards with inexpensive flexible boards or a backplane board makes it possible to build an inexpensive, high-performance cluster system that does not require a PCI Express switch.

The present invention has been described above based on an embodiment. Those skilled in the art will understand that the embodiment is illustrative, that various modifications of the combinations of its constituent elements and processes are possible, and that such modifications are also within the scope of the present invention.

In the above embodiment, a multi-core processor is mounted on each processor board, but a single processor may be used instead. Also, the embodiment described an example in which two multi-core processors are mounted on each processor board and four processor boards form one node, but the number of processors mounted on a processor board, the number of processor boards in a node, the number of connectors on a bridge, and so on are open design choices. As long as the processor boards in a node can be tightly coupled by flexible boards and the nodes can in turn be linked by flexible boards to form a node cluster, various patterns are possible for the number and arrangement of processor boards in a node and for the arrangement of nodes in the node cluster. In any case, a connection form that satisfies the requirements of low cost, small footprint, and high-speed communication is preferable.

In the above embodiment, a port of another processor board is connected to a port of the bridge of a processor board; however, peripheral devices may be connected to the ports of the bridge of a processor board to form a system in which the processor and various peripheral devices are interconnected. Also, although the bridge connects the input/output bus of the processor to PCI Express, an interface other than PCI Express may be used as the external interface to which other processor boards and peripheral devices are connected.

FIG. 1 is a block diagram of a processor board.
FIG. 2 is a block diagram of a node in which four processor boards are tightly coupled.
FIG. 3 is a block diagram of a cluster system in which a plurality of nodes are linked.
FIG. 4 is a schematic diagram of the wiring on the back side of a processor board.
FIG. 5 is a diagram showing a configuration in which the four processor boards in a node are connected by flexible boards.
FIG. 6 is a diagram showing a configuration in which a plurality of nodes in the cluster system are connected by flexible boards.
FIG. 7 is a diagram illustrating the housing of a node.
FIG. 8 is a diagram illustrating the housing of the cluster system.
FIGS. 9A to 9D are diagrams illustrating the PCI trees formed in the node of FIG. 2.
FIG. 10 is a diagram illustrating the memory space of each MCP in a node.
FIGS. 11 to 14 are diagrams illustrating how the shared areas of other MCPs are mapped into the memory space of an MCP.
FIG. 15 is a diagram illustrating how two MCPs exchange information on memory-mapped shared areas.
FIG. 16 is a diagram illustrating the PCI trees including the connection with an MCP of a linked node.
FIG. 17 is a diagram illustrating the memory space of an MCP in the case of the PCI trees of FIG. 16.

Explanation of symbols

10 DRAM, 20 multi-core processor, 30 bridge, 50 processor board, 100 node, 200 cluster system, 201-206 and 211-214 flexible boards, 300 memory space, 310 IOPT.

Claims

1. A processor node system comprising a plurality of processor boards, each including a processor and a bridge that relays data between an input/output bus of the processor and PCI Express to which peripheral devices are connected, wherein
each port of the bridge is configured to be settable to either a root complex mode, in which the processor acts as a host, or an endpoint mode, in which the processor acts as a peripheral device, and
the plurality of processor boards are mutually coupled by connecting a port set to the root complex mode of the bridge of one processor board to a port set to the endpoint mode of the bridge of another processor board.

2. The processor node system according to claim 1, wherein a PCI Express connector provided at the port set to the root complex mode of the bridge of the one processor board and a PCI Express connector provided at the port set to the endpoint mode of the bridge of the other processor board are connected by wiring.

3. The processor node system according to claim 1, further comprising a single backplane board that interconnects a PCI Express connector provided at the port set to the root complex mode of the bridge of the one processor board and a PCI Express connector provided at the port set to the endpoint mode of the bridge of the other processor board.

4. A processor node system comprising four processor boards, each carrying two sets of a processor and a bridge that relays data between an input/output bus of the processor and PCI Express to which peripheral devices are connected, wherein
each bridge has a port set to a root complex mode, in which the processor acts as a host, and a port set to an endpoint mode, in which the processor acts as a peripheral device, and
each processor board has three ports for connecting a port set to the root complex mode of the bridge of one processor board to a port set to the endpoint mode of the bridge of another processor board, whereby any two of the four processor boards are mutually coupled.

5. The processor node system according to claim 4, wherein, when the three ports of each processor board are designated a first port, a second port, and a third port, and the three processor boards other than that processor board are designated a first processor board, a second processor board, and a third processor board, the first port is connected to the first processor board, the second port is connected to the second processor board, and the third port is connected to the third processor board.

6. The processor node system according to claim 5, wherein the two processors on each processor board are connected via an input/output bus, and, by connecting the ports between the processor boards, any two of the total of eight processors on the four processor boards are interconnected so as to be able to communicate with each other.

7. The processor node system according to any one of claims 4 to 6, wherein a PCI Express connector provided at the port set to the root complex mode of the bridge of the one processor board and a PCI Express connector provided at the port set to the endpoint mode of the bridge of the other processor board are connected by wiring.

8. The processor node system according to claim 7, wherein the four processor boards are installed in a housing with their board surfaces parallel to one another, the PCI Express connectors provided at the bridge ports of each processor board are arranged on the rear side of the housing, and the PCI Express connectors of the processor boards are connected by flexible boards.

9. The processor node system according to any one of claims 4 to 6, wherein the four processor boards are installed in a housing with their board surfaces parallel to one another, the PCI Express connectors provided at the bridge ports of each processor board are arranged on the rear side of the housing, and the system further comprises a single backplane board that interconnects a PCI Express connector provided at the port set to the root complex mode of the bridge of the one processor board and a PCI Express connector provided at the port set to the endpoint mode of the bridge of the other processor board.

10. The processor node system according to any one of claims 4 to 9, wherein each processor is configured to be able to access the shared area of another, interconnected processor by mapping that shared area into its own memory space as an I/O address space.

11. A processor node cluster system that includes a plurality of processor node systems according to any one of claims 4 to 10 and has a cluster connection form in which the processor node systems are interconnected as a cluster, wherein
the plurality of processor node systems are mutually coupled by connecting, between two adjacent processor node systems, the empty ports that are not used for connections between processor boards within each processor node system.

12. The processor node cluster system according to claim 11, wherein a PCI Express connector is provided at each empty port, and the PCI Express connectors of the empty ports are connected by wiring using a flexible board.
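
As an illustration only, and not part of the claims, the port arrangement recited in claims 4 and 5 can be sketched as follows. The sketch treats each of the four processor boards as having three ports, with port k wired to the k-th of the other three boards, and checks that every pair of boards is then directly linked. The names build_links and ports_used are hypothetical, and the rule that the lower-numbered board takes the root complex side of each link is an assumption made purely for the example; the claims only require that each link joins a root complex mode port to an endpoint mode port.

```python
from itertools import combinations

NUM_BOARDS = 4  # four processor boards per node, as in claim 4


def build_links(num_boards=NUM_BOARDS):
    """Return links as (rc_board, rc_port, ep_board, ep_port) tuples.

    Port k of a board goes to the k-th of the other boards, counted in
    ascending board order (the first/second/third wording of claim 5).
    ASSUMPTION: the lower-numbered board is the root complex side of a link.
    """
    links = []
    for a, b in combinations(range(num_boards), 2):
        others_a = [x for x in range(num_boards) if x != a]
        others_b = [x for x in range(num_boards) if x != b]
        links.append((a, others_a.index(b), b, others_b.index(a)))
    return links


def ports_used(links, num_boards=NUM_BOARDS):
    """Collect, per board, the set of port indices occupied by the links."""
    used = {board: set() for board in range(num_boards)}
    for rc_board, rc_port, ep_board, ep_port in links:
        used[rc_board].add(rc_port)
        used[ep_board].add(ep_port)
    return used


links = build_links()
for rc_board, rc_port, ep_board, ep_port in links:
    print(f"board {rc_board} port {rc_port} (root complex) <-> "
          f"board {ep_board} port {ep_port} (endpoint)")
```

Four boards give six board-to-board links, and each board ends up using exactly its three ports, one per peer board.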