JP4413198B2

JP4413198B2 - Floating point data summation processing method and computer system

Info

Publication number: JP4413198B2
Application number: JP2006080535A
Authority: JP
Inventors: 淳一稲垣; 正夫小薮; 宏明石畑
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2006-03-23
Filing date: 2006-03-23
Publication date: 2010-02-10
Anticipated expiration: 2026-03-23
Also published as: KR100824189B1; KR20070096740A; CN101042638B; JP2007257269A; EP1837754A2; US20070226288A1; US7873688B2; EP1837754A3; CN101042638A

Description

The present invention relates to a floating-point data summation processing method and a computer system that compute the sum of floating-point data, and in particular to a floating-point data summation processing method and a computer system that compute the sum of floating-point data held by a plurality of computer nodes.

Parallel computer systems are available in which a plurality of nodes, each including a computer, are connected by a network. In such a parallel computer, one job is processed in parallel by a plurality of nodes, and the nodes exchange their processing data over the network. A large-scale parallel computer of this kind consists of several hundred to several thousand nodes.

In such a parallel computer, the data held by a plurality of nodes is collected and a specified operation is executed on it. This is called reduction processing. Examples of reduction processing include computing the sum of the data of all nodes and computing the maximum or minimum value of the data of all nodes.

Among the data formats handled by computers, the floating-point format, which represents a number by an exponent part and a mantissa part, can represent a wider range of values than the fixed-point format, in which the position of the radix point is fixed. FIG. 19 is an explanatory diagram of the floating-point format and shows the IEEE standard floating-point formats.

FIG. 19 shows 32-bit single-precision floating-point data and 64-bit double-precision floating-point data. Each consists of a sign bit, an exponent part, and a mantissa part. The sign bit indicates the sign of the number: "1" represents a negative number and "0" a positive number. The exponent part represents an integer power of 2, and the mantissa part represents a value of 1.0 or more and less than 2.0 (a normalized number). The value given by the exponent is multiplied by the mantissa to obtain the actual number.
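The field layout described above can be illustrated with a short sketch. It assumes the standard IEEE 754 double-precision layout (1 sign bit, 11 exponent bits with a bias of 1023, 52 stored mantissa bits); the function names `decode_double` and `value_of` are hypothetical and not part of the patent.

```python
import struct

def decode_double(x: float):
    """Split a 64-bit IEEE 754 double into (sign, exponent field, fraction field)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 = negative, 0 = positive
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent field
    fraction = bits & ((1 << 52) - 1)    # 52 stored mantissa bits (leading 1 is implicit)
    return sign, exponent, fraction

def value_of(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normalized value: (-1)^sign * 1.fraction * 2^(exponent - 1023)."""
    mantissa = 1 + fraction / 2**52      # 1.0 <= mantissa < 2.0
    return (-1) ** sign * mantissa * 2.0 ** (exponent - 1023)

fields = decode_double(-6.5)
print(fields, value_of(*fields))
```

The sketch covers normalized numbers only; zero, subnormals, infinities, and NaN use reserved exponent field values and are not handled by `value_of`.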

In a summation of such floating-point data, when three or more floating-point values are added, the numerical result differs depending on the order in which the values are added. FIGS. 20 and 21 are explanatory diagrams of the summation. Here, the values of the double-precision floating-point data are shown in hexadecimal.

As shown in FIG. 20, when floating-point data 1, 2, 3, and 4, each consisting of an exponent part and a mantissa part, are added in the order 1, 2, 3, 4, data 1 and data 2 are added first, then addition result 1 and data 3 are added, and finally addition result 2 and data 4 are added.

On the other hand, as shown in FIG. 21, when the data are added in the order 1, 3, 4, 2, data 1 and data 3 are added first, then addition result 1 and data 4 are added, and finally addition result 2 and data 2 are added.

As the numerical examples of FIGS. 20 and 21 show, the results of adding the four values differ. The cause is that the result of each individual addition is normalized, so low-order digits of the mantissa are lost.
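The order dependence can be reproduced with ordinary double-precision arithmetic. The values below are chosen for illustration and are not the values shown in FIGS. 20 and 21.

```python
# Three doubles whose sum depends on the order of the additions.
a, b, c = 1e17, 1.0, -1e17

left_to_right = (a + b) + c   # 1.0 is lost when it is added to 1e17 first
reordered = (a + c) + b       # the large values cancel first, so 1.0 survives

print(left_to_right, reordered)
```

Near 1e17 adjacent doubles are 16 apart, so adding 1.0 leaves the value unchanged after rounding; that is the digit loss the text describes.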

In a parallel computer, one job is executed in parallel by a plurality of computers, so it is sometimes necessary to collect the intermediate or final results of the parallel execution and perform an operation such as computing their sum. If the data is in floating-point format, the fact that the result depends on the order of the operations affects the accuracy of the parallel computation. For this reason, methods have been proposed that guarantee identical results even when the order of the operations is not preserved.

FIG. 22 is an explanatory diagram of such a conventional floating-point summation and shows a method that guarantees identical results even when the order of the operations is not preserved.

As shown in FIG. 22, providing a reduction mechanism separate from the nodes to perform operations such as summing the floating-point data of a plurality of nodes is effective in terms of processing efficiency. First, each node extracts only the exponent part of its floating-point data and instructs the reduction mechanism to find the maximum of the exponent parts.

The reduction mechanism compares the exponent data sent from each node and retains only the largest exponent. When the exponent data from all nodes has been compared, it returns the maximum exponent to all nodes.

Each node aligns the digits of its mantissa to the maximum exponent returned from the reduction mechanism. Each node then instructs the reduction mechanism to compute the sum of the aligned mantissa data.

The reduction mechanism adds the mantissa data sent from each node and, when the mantissa data from all nodes has been added, returns the result to all nodes.

Each node creates normalized floating-point data from the maximum exponent and the sum of the mantissa data.

Thus, in the conventional technique, each node aligns the digits of its mantissa data to the maximum exponent and sends the aligned data to the reduction mechanism, so the sum can be computed without regard to the order of the additions (see, for example, Patent Document 1).
Patent Document 1: JP 2005-506596 A
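The conventional two-round procedure can be sketched as follows. This is an illustrative model only: a single Python function stands in for the nodes and the reduction mechanism, the 53-bit fixed-point scale is an assumption, and `two_round_sum` is a hypothetical name.

```python
import math
from itertools import permutations

def two_round_sum(values):
    # Round 1: every node sends its exponent; the reduction mechanism returns the maximum.
    max_exp = max(math.frexp(v)[1] for v in values)
    # Each node aligns its mantissa to max_exp as a fixed-point integer
    # (bits below the chosen scale are truncated).
    aligned = [int(math.ldexp(v, 53 - max_exp)) for v in values]
    # Round 2: integer addition is exact, so the arrival order does not matter.
    total = sum(aligned)
    # Each node rebuilds a normalized floating-point value from the sum and max_exp.
    return math.ldexp(total, max_exp - 53)

values = [0.1, 0.2, 0.3, 1e3, -7.25]
results = {two_round_sum(p) for p in permutations(values)}
print(results)
```

Every ordering yields the same value, but only because the maximum exponent is known before any mantissa is added, which is what forces the second exchange with the reduction mechanism.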

However, in the conventional technique, computing the sum of floating-point data requires two operations: a magnitude comparison of the exponent parts and an addition of the mantissa parts. Data must therefore be exchanged between each node and the reduction mechanism twice, which lengthens the reduction processing time. In particular, as the number of nodes grows to several hundred or several thousand, the processing time grows and hinders the speeding up of parallel processing.

Alternatively, to preserve the order of the operations, the reduction mechanism could be provided with a storage circuit that holds the data of all nodes, and the additions could be performed in a fixed order after the data of all nodes has been received. However, as the number of nodes increases, the storage circuit grows and raises the cost. Moreover, starting the calculation only after the data of all nodes has been received lengthens the processing time accordingly. In particular, when the number of nodes grows to several hundred or several thousand, the circuit scale becomes large and the processing time becomes conspicuously long.

An object of the present invention is to provide a floating-point data summation processing method and a computer system that speed up the summation of the floating-point data of a large number of nodes.

Another object of the present invention is to provide a floating-point data summation processing method and a computer system that speed up the summation of the floating-point data of a large number of nodes without preserving the order of the operations and that are effective for parallel processing.

A further object of the present invention is to provide a floating-point data summation processing method and a computer system that speed up the summation of the floating-point data of a large number of nodes without providing an unnecessary storage circuit.

To achieve these objects, the present invention provides a floating-point data summation processing method for computing, using a computer, the sum of three or more floating-point data. The method comprises: a step in which an arithmetic circuit of the computer calculates, among a plurality of groups into which the floating-point data are divided according to the value of the upper bits of the exponent part, the sum of the mantissa parts of the group whose exponent upper bits are the largest and the sum of the mantissa parts of the group whose exponent upper bits are the second largest; and a step in which the arithmetic circuit of the computer adds the sum of the mantissa parts of the group whose exponent upper bits are the largest to the sum of the mantissa parts of the group whose exponent upper bits are the second largest.

The computer system of the present invention has a plurality of nodes and a reduction mechanism that receives floating-point data from each node and computes the sum of the received floating-point data. The reduction mechanism calculates, among a plurality of groups into which the received floating-point data are divided according to the value of the upper bits of the exponent part, the sum of the mantissa parts of the group whose exponent upper bits are the largest and the sum of the mantissa parts of the group whose exponent upper bits are the second largest, and adds the sum of the mantissa parts of the group with the largest exponent part to the sum of the mantissa parts of the group with the second-largest exponent part.

Another computer system of the present invention has a plurality of nodes and a reduction mechanism that receives floating-point data from each node and computes the sum of the received floating-point data. Each node calculates, among a plurality of groups into which the floating-point data in the node are divided according to the value of the upper bits of the exponent part, the sum of the mantissa parts of the group whose exponent upper bits are the largest and the sum of the mantissa parts of the group whose exponent upper bits are the second largest, and sends the results calculated for each group to the reduction mechanism. The reduction mechanism calculates, among a plurality of groups into which the floating-point data received from the nodes are divided according to the value of the upper bits of the exponent part, the sum of the mantissa parts of the group whose exponent upper bits are the largest and the sum of the mantissa parts of the group whose exponent upper bits are the second largest, and returns the results calculated for each group to each node. Each node adds the sum of the mantissa parts of the group with the largest exponent part returned from the reduction mechanism to the sum of the mantissa parts of the group with the second-largest exponent part.

Furthermore, in the present invention, the calculating step preferably comprises a step of comparing the upper bits of the exponent parts and, according to the comparison result, calculating the sum of the mantissa parts of the group with the largest exponent part and the sum of the mantissa parts of the group with the second-largest exponent part.

Furthermore, in the present invention, the calculating step preferably has a step of shifting the mantissa part according to the value of the lower bits of the exponent part to create a mantissa part with an extended data width, and a step of using the extended-width mantissa parts to calculate the sum of the mantissa parts of the group with the largest exponent part and the sum of the mantissa parts of the group with the second-largest exponent part.

Furthermore, in the present invention, the adding step preferably has a step of aligning the digits of the mantissa sum of the group with the second-largest exponent part to those of the mantissa sum of the group with the largest exponent part, and a step of adding the mantissa sum of the group with the largest exponent part and the aligned mantissa sum of the group with the second-largest exponent part.

Furthermore, the present invention preferably has a step of creating the floating-point data from the result of adding the mantissa parts and the upper bits of the exponent part.

In the present invention, the result for the group with the largest exponent part is not affected by the results of groups whose exponent value is smaller by 2 or more. Therefore, only the sums of the group with the largest exponent part and the group with the second-largest exponent part are computed, and these two sums are then added together. This guarantees identical results regardless of the order in which the values are processed.

Embodiments of the present invention are described below in the order of the configuration of the computer system, the configuration of the reduction mechanism, the first embodiment, the second embodiment, and other embodiments. The present invention is not limited to these embodiments.

--Computer system configuration--
FIG. 1 is a block diagram of one embodiment of the computer system of the present invention, FIG. 2 is a block diagram of a node in FIG. 1, FIG. 3 is a block diagram of a network adapter in FIG. 1, and FIG. 4 is a frame format diagram of the transfer data in FIG. 1.

FIG. 1 shows a parallel computer as the computer system. As shown in FIG. 1, the parallel computer has a plurality of (here, four) nodes 10, 11, 12, and 13, two crossbar switches (SWA and SWB in the figure) 20 and 21, and a reduction mechanism 22. Each node 10, 11, 12, 13 has three network adapters (indicated by A, B, and C in the figure) 14A, 14B, and 14C. The network adapters 14A and 14B of the nodes 10, 11, 12, and 13 communicate with one another via the crossbar switches 20 and 21, respectively. The network adapter 14C of each node 10, 11, 12, 13 communicates with the reduction mechanism 22. That is, the network adapters 14A, 14B, and 14C of each node 10, 11, 12, 13 are connected to the crossbar switches 20 and 21 and the reduction mechanism 22, respectively, over transmission lines through an interface such as Ethernet (registered trademark).

As shown in FIG. 2, the node 10 (11, 12, 13) is a computer in which a CPU 40, a memory 44, an IO adapter 46, and the network adapters 14A to 14C described above are connected via a system controller 42. A plurality of CPUs 40, memories 44, and IO adapters 46 may be provided according to the processing capability required of the node.

As shown in FIG. 3, the network adapter 14A (14B, 14C) of FIGS. 1 and 2 consists of a host interface control circuit 50 connected to the system controller 42, a transmission control circuit 52, a network interface control circuit 54 connected to the transmission line, and a reception control circuit 56. The network adapter 14A (14B, 14C) handles data communication between nodes and with the reduction mechanism 22.

Data transferred via the network adapter 14A (14B, 14C) is sent in the frame format shown in FIG. 4. FIG. 4 shows the frame format used in Ethernet (registered trademark), which consists of a destination address, a source address, a frame type (for example, command type and data size), data, and a frame checksum (for example, a CRC (Cyclic Redundancy Code)). The data length (data size) of the data field is variable, and the transfer data is divided into a plurality of frames as necessary.

--Reduction mechanism configuration--
FIG. 5 is a configuration diagram of the reduction mechanism of FIG. 1. As shown in FIG. 5, the main part of the reduction mechanism 22 has a network control unit 22-1 that controls transmission to and reception from each node; a data conversion unit 22-2 that converts the floating-point data from each node into a predetermined data format, described later, and converts the operation result back into floating-point data; a register 22-3 that holds the received data after conversion; arithmetic circuits (ALU1, ALU2) 22-4 and 22-7 that execute the various reduction operations; registers (R1, R2) 22-5 and 22-8 that hold operation results; a comparison circuit (CMP) 22-6 that compares data; and a multiplexer 22-9 that selects between the registers 22-5 and 22-8.

The received data converted by the data conversion unit 22-2 is held in the first register 22-3 and input to the first arithmetic circuit 22-4, the second arithmetic circuit 22-7, and the comparison circuit 22-6. The comparison circuit 22-6 compares the upper bits of the exponent parts, as described later. The operation result of the first arithmetic circuit 22-4 is held in the second register 22-5 and input to the first arithmetic circuit 22-4, the comparison circuit 22-6, and the third register 22-8.

The data held in the third register 22-8 is input to the second arithmetic circuit 22-7. The first and second arithmetic circuits 22-4 and 22-7 perform addition according to the comparison result of the comparison circuit 22-6. The second register 22-5 holds the mantissa result for the group with the largest exponent part, and the third register 22-8 holds the mantissa result for the group with the second-largest exponent part.

In this embodiment, the data conversion unit 22-2, the arithmetic circuit 22-7, the register 22-8, and the multiplexer 22-9 are added to the configuration of a conventional reduction mechanism.

--First embodiment--
FIG. 6 is an explanatory diagram of the first embodiment of the floating-point summation processing of the present invention. FIG. 7 is an explanatory diagram of the data conversion processing of FIG. 6. FIG. 8 is an explanatory diagram of the processing when a complement is taken in the data conversion processing of FIG. 6. FIG. 9 is an explanatory diagram of the arithmetic processing based on the comparison result in FIGS. 5 and 6. FIGS. 10 and 11 are explanatory diagrams of the processing that converts the operation result into floating-point data. FIG. 12 is a diagram of the relationship between the exponent upper bits and the absolute value of the mantissa.

As shown in FIG. 6, the nodes 10, 11, 12, and 13 send the floating-point data to be reduced to the reduction mechanism 22 as is and instruct it to compute the sum.

The reduction mechanism 22 adds the floating-point data from all nodes in order of arrival and returns the result to all nodes. This addition consists of the data conversion processing described later with FIGS. 7 and 8, the addition processing based on magnitude comparison described later with FIG. 9, and the processing that converts the operation result into floating-point data described later with FIGS. 10 and 11. The nodes 10, 11, 12, and 13 then receive the result from the reduction mechanism 22.

Next, the summation processing of the reduction mechanism 22 is described. The description below uses the 64-bit double-precision floating-point data shown in FIG. 19 as an example, but 32-bit single-precision floating-point data can be processed in the same way.

First, the data width of the summation is determined, as shown in FIG. 7. If at most 127 values are to be summed, the number of significant digits can grow by up to 7 digits (2^7 = 128) during the summation. The number of mantissa digits of the floating-point data (52 bits in double precision) and this number of digits (7 bits) are therefore added together, giving 52 + 7 = 59 bits.

Next, the number of lower bits to remove from the exponent part is determined. The condition is that the number of digits expressible by the removed bits must be at least the total number of digits obtained above. Removing 5 bits covers 31 digits, and removing 6 bits covers 63 digits (2^6 = 64). In double precision the total is 59 bits, so removing the lower 6 bits of the exponent part satisfies the condition.

The required data width is therefore 52 (mantissa part) + 7 (digit growth) + 63 (shift amount) + 2 (other) = 124 bits. The 2 "other" bits are the omitted most significant digit of the mantissa and the sign bit.
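The width calculation above can be written out as a small sketch. The function name `summation_width` is hypothetical, and applying the same rule to parameters other than the double-precision case described in the text is an assumption.

```python
def summation_width(mantissa_bits: int, max_count: int):
    """Return (growth bits, removed exponent bits, total data width)."""
    # Extra high-order digits needed so that max_count values can be summed
    # without overflow: 127 values need 7 bits (2^7 = 128).
    growth_bits = max_count.bit_length()
    needed = mantissa_bits + growth_bits
    # Smallest number of removed exponent bits whose maximum shift amount
    # (2^k - 1) is at least the mantissa width plus the growth.
    removed = 1
    while (1 << removed) - 1 < needed:
        removed += 1
    max_shift = (1 << removed) - 1
    # +2 for the hidden leading mantissa bit and the sign bit.
    width = mantissa_bits + growth_bits + max_shift + 2
    return growth_bits, removed, width

print(summation_width(52, 127))
```

For double precision with up to 127 values this reproduces the 7 growth bits, 6 removed exponent bits, and 124-bit width given in the text.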

Once the operation data width has been determined in this way, the floating-point data is converted into converted data of that width, as shown in FIG. 7. Taking the 124-bit width for double-precision floating-point data as an example, the most significant digit of the mantissa is restored, and the mantissa is placed at a position shifted by the value of the lower 6 bits of the exponent part. All bit positions other than the mantissa are set to "0". The most significant digit must be restored because, in floating-point format, the leading "1" is omitted whenever the value is non-zero.

If the sign indicates a negative number, the value is converted to the 124-bit width and then converted to two's complement representation, as shown in FIG. 8. This mantissa conversion is executed by the data conversion unit 22-2 of FIG. 5, and the exponent upper bits and the converted mantissa are set in the first register 22-3.
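The conversion of FIGS. 7 and 8 can be sketched as follows for double precision. The function name `to_wide` is hypothetical, a Python integer stands in for the 124-bit register value, and the handling of exponent field 0 (no leading 1 restored) is an assumption; subnormals, infinities, and NaN are not treated.

```python
import struct

WIDTH = 124
MASK = (1 << WIDTH) - 1

def to_wide(x: float):
    """Convert a double into (exponent upper bits, 124-bit two's complement mantissa)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)

    exp_upper = exponent >> 6        # upper 5 bits of the 11-bit exponent: the group
    shift = exponent & 0x3F          # lower 6 bits of the exponent: shift amount 0..63

    # Restore the omitted leading 1 for non-zero values (assumption: exponent
    # field 0 is treated as having no leading 1).
    mantissa = (fraction | (1 << 52)) if exponent != 0 else fraction
    wide = mantissa << shift         # all other bit positions stay 0

    if sign:                         # negative: take the two's complement in 124 bits
        wide = (-wide) & MASK
    return exp_upper, wide

print(to_wide(1.0))
print(to_wide(-1.0))
```

For 1.0 the exponent field is 1023, so the group is 15 and the shift is 63, which places the restored leading 1 at bit 115 and leaves room above it for the 7 growth bits and the sign bit.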

Next, the summation processing is described with reference to FIG. 9. In FIG. 9, exponent 1 and mantissa 1 denote the exponent upper bits and mantissa of newly received data; exponent 3 and mantissa 3 denote the largest exponent upper bits among the results so far and the corresponding mantissa; and mantissa 4 denotes the mantissa corresponding to the second-largest exponent upper bits among the results so far.

In FIG. 5, exponent 1 and mantissa 1 are set in the first register 22-3, exponent 3 and mantissa 3 in the second register 22-5, and mantissa 4 in the third register 22-8. When the exponent upper bits and mantissa of newly received floating-point data are set in the first register 22-3, the comparison circuit 22-6 compares exponent 1 with exponent 3 in the second register 22-5.

As shown in FIG. 9, when the comparison result of the comparison circuit 22-6 is exponent 1 > exponent 3 + 1, exponent 1 becomes the maximum. Exponent 1 and mantissa 1 are therefore set in the second register 22-5, via the arithmetic circuit 22-4, as the new exponent 3 and new mantissa 3. Because exponent 3 is not the second-largest value, "0" is set in the third register 22-8.

When the comparison result of the comparison circuit 22-6 is exponent 1 = exponent 3 + 1, exponent 1 becomes the maximum. Exponent 1 and mantissa 1 are therefore set in the second register 22-5, via the arithmetic circuit 22-4, as the new exponent 3 and new mantissa 3. Because exponent 3 is now the second-largest value, mantissa 3 from the second register 22-5 is set in the third register 22-8.

When the comparison result of the comparison circuit 22-6 is exponent 1 = exponent 3, exponent 1 and exponent 3 belong to the same maximum group. The arithmetic circuit 22-4 is therefore instructed to add mantissa 1 to mantissa 3 in the second register 22-5, and exponent 3 and mantissa 1 + mantissa 3 are set in the second register 22-5 as the new exponent 3 and new mantissa 3. The value in the third register 22-8 (mantissa 4) is not changed.

When the comparison result of the comparison circuit 22-6 is exponent 1 + 1 = exponent 3, exponent 3 is the maximum, so exponent 3 and mantissa 3 in the second register 22-5 are not changed. Because exponent 1 is the second-largest value, the arithmetic circuit 22-7 is instructed to add mantissa 1 to mantissa 4 of the third register 22-8, and mantissa 1 + mantissa 4 is set in the third register 22-8 as the new mantissa 4.

When the comparison result of the comparison circuit 22-6 is exponent 1 + 1 < exponent 3, exponent 3 is the maximum and exponent 1 is not the second-largest value, so exponent 3 and mantissa 3 in the second register 22-5 and mantissa 4 in the third register 22-8 are all left unchanged.
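
The five cases of FIG. 9 can be sketched in software as a single update function. This is only an illustrative sketch, not the circuit itself: Python integers stand in for the fixed-width registers, the names `accumulate` and `reduce_all` are hypothetical, and seeding the registers with the first received item is an assumption that this passage does not state.

```python
from functools import reduce
from itertools import permutations

def accumulate(state, item):
    """Fold one (exp1, mant1) pair into (exp3, mant3, mant4), following FIG. 9."""
    exp3, mant3, mant4 = state
    exp1, mant1 = item
    if exp1 > exp3 + 1:        # new maximum; the old groups are too small to matter
        return (exp1, mant1, 0)
    if exp1 == exp3 + 1:       # new maximum; the old maximum becomes second largest
        return (exp1, mant1, mant3)
    if exp1 == exp3:           # same group as the current maximum
        return (exp3, mant1 + mant3, mant4)
    if exp1 + 1 == exp3:       # belongs to the second-largest group
        return (exp3, mant3, mant1 + mant4)
    return state               # exp1 + 1 < exp3: nothing changes

def reduce_all(items):
    """Assumption: the first item seeds the registers as (exp, mant, 0)."""
    first, *rest = items
    return reduce(accumulate, rest, (first[0], first[1], 0))

# Hypothetical (exponent upper bits, extended mantissa) pairs.
items = [(17, 5), (16, 3), (17, -2), (14, 9), (16, 4)]
# Every arrival order yields the same register contents.
results = {reduce_all(p) for p in permutations(items)}
print(results)
```

Because the final state depends only on the largest exponent group and the one directly below it, the set `results` contains a single element whatever the arrival order.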

In this way, three values are obtained: the maximum value of the exponent upper bits (new exponent 3), the sum of the mantissas whose exponent upper bits are the maximum (new mantissa 3), and the sum of the mantissas whose exponent upper bits are the second largest (new mantissa 4).

Next, the conversion of the three values thus obtained (new exponent 3, new mantissa 3, and new mantissa 4) into normalized floating-point data will be described with reference to FIGS. 10 and 11.

First, as shown in FIG. 10, mantissa 4, which is the sum for the group whose exponent upper bits are the second largest, is shifted right by 64 bits to align it with the mantissa of the group whose exponent upper bits are the maximum. The vacated upper bits are filled with the value of bit 123 (all "0" or all "1"). The aligned mantissa 4 is then added to mantissa 3 to obtain the total sum.
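
A minimal sketch of the alignment step of FIG. 10, assuming the mantissas are modeled as signed Python integers rather than 124-bit two's-complement words. The function name `combine` is hypothetical. Python's `>>` on a negative integer fills with the sign, which plays the role of filling the upper bits with the value of bit 123.

```python
def combine(mant3: int, mant4: int) -> int:
    """Align the second-largest group's sum to the largest group's and add them."""
    aligned = mant4 >> 64      # arithmetic shift: the sign is preserved
    return mant3 + aligned

print(combine(10, 5 << 64))      # 15
print(combine(10, -(1 << 64)))   # 9
```

The low 64 bits of `mant4` are discarded by the shift, just as the digits below the alignment point are dropped in the figure.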

Next, as shown in FIG. 11, exponent 3 (the maximum value of the exponent upper bits) and the mantissa sum obtained in FIG. 10 are converted into double-precision floating-point data. For example, from the 5-bit exponent (bits 62 to 58) and the 124-bit mantissa, a 1-bit sign, an 11-bit exponent part, and a 52-bit mantissa part are created, as described later.

In FIG. 5, the data conversion unit 22-2 reads the values held in the second register 22-5 and the third register 22-8 and performs the alignment, summation, and conversion described above.

FIG. 12 shows the relationship between the exponent upper bits and the range of absolute values represented by the mantissa. As described above, the lower bits of the exponent part are removed and reflected in the mantissa part, so that the operand is expressed by a 5-bit exponent part (bits 62 to 58) and a 124-bit mantissa part. The width of the mantissa part is determined in consideration of the maximum number of data items to be summed (127 in FIG. 7 described above), so that no overflow occurs even when the total sum is computed.

As shown in FIG. 12, consider the groups that share the same exponent upper bits and the range of absolute values represented by the exponent part and mantissa part of each group's sum. The least significant bit of a given exponent group (here, n) represents a larger value than the most significant bit of the exponent group two steps away (here, n-2).

In other words, the result for the group with the maximum exponent is not affected by the result of any group whose exponent value is smaller by 2 or more. When the mantissas are aligned according to the exponent difference, no significant digits of the smaller group remain, so adding it is equivalent to adding zero.

In the summation within a group that shares the same exponent, the mantissa part is shifted according to the lower bits of the exponent part (here, 6 bits), which increases the number of significant digits, so no digits of the mantissa are lost. For this reason, the additions within a same-exponent group shown in FIG. 9 give the same result regardless of the order of operations.

Furthermore, as described above, the result for the group with the maximum exponent is not affected by groups whose exponent value is smaller by 2 or more, so only the sums of the maximum group and the second-largest group are computed. The sums of these two groups are calculated separately and are aligned and added together only at the end. As a result, the calculation result is guaranteed to be identical regardless of the order in which the values are processed.

Next, a worked example with actual numerical values will be described with reference to FIGS. 13, 14, and 15. The example uses IEEE double-precision floating-point data, removes the lower 6 bits of the exponent part, extends the mantissa part, and sums four data items. All values are written in hexadecimal, and a field whose bit count is not a multiple of 4 is written right-justified.

FIG. 13 shows the converted data obtained by removing the lower 6 bits of the exponent part of data 1, 2, 3, and 4 and extending the mantissa part. In decimal notation, data 1, 2, 3, and 4 are "2.59407338536541E+18", "2.88230376151712E+18", "-2.26673591177743E+23", and "2.26677049942257E+23", respectively, where "E+18" denotes 10 to the 18th power.

As shown in FIG. 13, before conversion data 1 has exponent = 44C, mantissa = 8001800000000, and a positive sign. In intermediate step 1, the omitted most significant digit "1" is restored and the mantissa part is extended to 124 bits. Next, the 124-bit mantissa part is shifted left by 12 bits, as given by the lower 6 bits of the exponent part (= 0C). The exponent part keeps only its upper 5 bits. This 5-bit exponent value identifies the exponent group.
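
The conversion described for data 1 can be sketched as follows. This is an illustrative sketch under stated assumptions: `to_extended` is a hypothetical name, only normal (non-zero, finite) doubles are handled, and a signed Python integer replaces the 124-bit two's-complement mantissa, so the separate complement step for negative values becomes a plain negation.

```python
import struct

def to_extended(x: float):
    """Split a normal double into (exponent upper 5 bits, extended signed mantissa)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exp = (bits >> 52) & 0x7FF            # 11-bit biased exponent
    frac = bits & ((1 << 52) - 1)         # 52 stored mantissa bits
    mant = (1 << 52) | frac               # restore the omitted leading "1"
    mant <<= exp & 0x3F                   # shift left by the lower 6 exponent bits
    if sign:
        mant = -mant                      # stands in for the complement operation
    return exp >> 6, mant                 # the upper 5 bits identify the group

# Exponent 0x44C and mantissa 0x8001800000000, as quoted for data 1.
bits = (0x44C << 52) | 0x8001800000000
x = struct.unpack(">d", struct.pack(">Q", bits))[0]
exp_hi, mant = to_extended(x)
print(hex(exp_hi), hex(mant))
```

For this input the lower 6 exponent bits are 0x0C, so the mantissa is shifted left by 12 bits and the group value is 0x11.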

Data 2 is handled in the same way, except that a complement operation on the converted data is added because its sign indicates a negative number. The converted data for data 3 and 4 are obtained likewise.

Next, data 1, 2, 3, and 4 are summed per exponent group. As can be seen from FIG. 13, data 1 and 2 belong to one exponent group, and data 3 and 4 belong to another. As shown in FIG. 14, the mantissa part of data 1 and the mantissa part of data 2 are added to obtain mantissa 3 (see FIG. 9) of the exponent group with exponent = 11.

Similarly, the mantissa part of data 3 and the mantissa part of data 4 are added to obtain mantissa 4 (see FIG. 9) of the exponent group with exponent = 10. The exponent of mantissa 4 differs from that of mantissa 3 by 64 (= 6 bits), so, following the principle of FIG. 10, mantissa 4 is shifted right by 64 bits to match the exponents. The shifted value is then added to mantissa 3 to obtain the final result.

This final result is converted into double-precision floating-point format as shown in FIG. 15. In intermediate step 1, because the exponent group is given only by the upper 5 bits, the omitted lower 6 bits are filled with zeros. In intermediate step 2, because the mantissa of a double-precision value has 53 significant bits, the result is converted to a 53-bit mantissa whose leftmost bit is "1". In FIG. 15, the mantissa is the lower 53 bits shifted left by 3 bits, and because of this 3-bit left shift the exponent part is decreased by 3. The sign after conversion is taken directly from the leftmost bit of the 124-bit mantissa.

In intermediate step 3, the leftmost bit of the 53-bit mantissa is omitted, so 52 bits are used in the floating-point format. After conversion, double-precision floating-point data consisting of a 1-bit sign, an 11-bit exponent part, and a 52-bit mantissa part is obtained.
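
The three intermediate steps of FIG. 15 can be sketched as below. This is a sketch under assumptions rather than the patented circuit: `to_double` is a hypothetical name, the sum is a signed Python integer, bits that do not fit in 53 bits are truncated (the rounding policy is not stated in this passage), and the resulting exponent is assumed to stay within the normal range.

```python
import struct

def to_double(exp_hi: int, total: int) -> float:
    """Rebuild a double from the 5-bit exponent group and the signed mantissa sum."""
    if total == 0:
        return 0.0
    sign = 1 if total < 0 else 0
    mag = -total if sign else total
    exp = exp_hi << 6                     # step 1: lower 6 exponent bits are zero
    shift = 53 - mag.bit_length()         # step 2: make the leading "1" bit 52
    if shift >= 0:
        mant53 = mag << shift
    else:
        mant53 = mag >> -shift            # truncation (assumed rounding policy)
    exp -= shift                          # a left shift by k lowers the exponent by k
    frac = mant53 & ((1 << 52) - 1)       # step 3: drop the leading "1"
    bits = (sign << 63) | (exp << 52) | frac
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

print(to_double(16, 1 << 51))       # 1.0
print(to_double(16, -(3 << 50)))    # -1.5
```

In the first call the magnitude has 52 significant bits, so it is shifted left by 1 bit and the exponent drops from 0x400 to 0x3FF, which mirrors the 3-bit shift and the exponent decrease of 3 in the figure.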

-Second embodiment-
FIG. 16 is an explanatory diagram of the floating-point summation processing according to the second embodiment of the present invention, FIG. 17 is a configuration diagram of its reduction mechanism, and FIG. 18 shows the relationship between the comparison results and the arithmetic processing of FIG. 17. In this embodiment, each node contains a plurality of CPUs 40. The floating-point summation within each node is performed first, and the reduction mechanism 22 then performs the floating-point summation across all nodes.

As shown in FIG. 16, each of the nodes 10, 11, 12, and 13 sums the floating-point data of its CPUs using the processing of FIGS. 7 to 9 described above, obtaining a sum for each exponent group. The resulting exponent part and mantissa parts are then sent to the reduction mechanism 22 with an instruction to calculate the sum across nodes.

As shown in FIG. 17, the reduction mechanism 22 differs from the configuration of FIG. 5 in that it has no data conversion unit 22-2. Because the exponent and mantissa parts arrive already converted, no conversion is needed. The reduction mechanism 22 adds the exponent and mantissa data from all nodes in order of arrival and returns the result to all nodes. This addition uses the magnitude-comparison-based processing of FIG. 18, described below. The nodes 10, 11, 12, and 13 then receive the result from the reduction mechanism 22 and create normalized floating-point data as shown in FIGS. 10 and 11.

Next, the summation processing of the reduction mechanism 22 will be described with reference to FIG. 18. In FIG. 18, as in FIG. 9, exponent 1 and mantissa 1 are the newly received exponent upper bits and mantissa part, and mantissa 2 is the mantissa part of the newly received data corresponding to its second-largest exponent upper bits. Exponent 3 and mantissa 3 are the maximum exponent upper bits of the running result and the corresponding mantissa part, and mantissa 4 is the mantissa part of the running result corresponding to the second-largest exponent upper bits.

In FIG. 17, exponent 1, mantissa 1, and mantissa 2 are held in the first register 22-3, exponent 3 and mantissa 3 in the second register 22-5, and mantissa 4 in the third register 22-8. When the exponent upper bits and the mantissa parts of newly received floating-point data are set in the first register 22-3, the comparison circuit 22-6 compares exponent 1 with exponent 3 held in the second register 22-5.

As shown in FIG. 18, when the comparison result of the comparison circuit 22-6 is exponent 1 > exponent 3 + 1, exponent 1 is the maximum. Exponent 1 and mantissa 1 are set in the second register 22-5, via the arithmetic circuit 22-4, as the new exponent 3 and new mantissa 3, and mantissa 2 is set in the third register 22-8.

When the comparison result of the comparison circuit 22-6 is exponent 1 = exponent 3 + 1, exponent 1 is the maximum. Exponent 1 and mantissa 1 are set in the second register 22-5, via the arithmetic circuit 22-4, as the new exponent 3 and new mantissa 3. Because the second-largest value is now exponent 3, the arithmetic circuit 22-7 computes mantissa 2 + mantissa 3, and the result is set in the third register 22-8.

When the comparison result of the comparison circuit 22-6 is exponent 1 = exponent 3, exponent 1 and exponent 3 belong to the same maximum-value group. The arithmetic circuit 22-4 is instructed to add mantissa 1 to mantissa 3 of the second register 22-5, and exponent 3 and mantissa 1 + mantissa 3 are set in the second register 22-5 as the new exponent 3 and new mantissa 3. The arithmetic circuit 22-7 computes mantissa 2 + mantissa 4, and the result is set in the third register 22-8.

When the comparison result of the comparison circuit 22-6 is exponent 1 + 1 = exponent 3, exponent 3 is the maximum, so exponent 3 and mantissa 3 in the second register 22-5 are not changed. Because exponent 1 is the second-largest value, the arithmetic circuit 22-4 is instructed to add mantissa 1 to mantissa 4 of the third register 22-8, and mantissa 1 + mantissa 4 is set in the third register 22-8 as the new mantissa 4.
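
The cases of FIG. 18 can be sketched as a merge of two partial results, each already reduced to a triple. This is a hedged sketch: `merge` is a hypothetical name, Python integers replace the registers, and the final branch (exponent 1 + 1 < exponent 3, registers unchanged) is assumed by analogy with FIG. 9 because it is not spelled out in the text above.

```python
def merge(state, incoming):
    """Fold an incoming (exp1, mant1, mant2) into (exp3, mant3, mant4), per FIG. 18."""
    exp3, mant3, mant4 = state
    exp1, mant1, mant2 = incoming
    if exp1 > exp3 + 1:        # incoming result replaces the state entirely
        return (exp1, mant1, mant2)
    if exp1 == exp3 + 1:       # old maximum joins the incoming second-largest group
        return (exp1, mant1, mant2 + mant3)
    if exp1 == exp3:           # both groups line up
        return (exp3, mant1 + mant3, mant2 + mant4)
    if exp1 + 1 == exp3:       # incoming maximum joins the second-largest group
        return (exp3, mant3, mant1 + mant4)
    return state               # assumed: incoming data too small to contribute

node_a = (17, 3, 7)            # hypothetical per-node partial results
node_b = (16, 2, 5)
print(merge(node_a, node_b))   # (17, 3, 9)
print(merge(node_b, node_a))   # (17, 3, 9)
```

Merging in either order gives the same triple, which is what allows the reduction mechanism to add node results in order of arrival.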

Finally, the three values exponent 3, mantissa 3, and mantissa 4 are returned to all nodes. Each node creates normalized floating-point data from the exponent 3, mantissa 3, and mantissa 4 it receives.

In this way, the floating-point summation within each node can be performed in the node, and the floating-point summation across nodes can be performed by the reduction mechanism.

-Other embodiments-
The embodiments above were described using 64-bit double-precision floating-point data, but the invention can also be applied to 32-bit single-precision floating-point data. In that case the number of added digits depends on the maximum number of data items and therefore remains 7 bits, but only 5 exponent bits need to be removed, so the shift amount is smaller and the data width becomes 23 (mantissa part) + 7 + 31 + 2 = 63 bits.
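
The width arithmetic can be checked with a short sketch. The function name and parameters are hypothetical, and two readings are assumptions rather than statements from the text: that the final "+ 2" covers the restored leading "1" and the sign bit, and that the headroom is the number of bits needed to count the data items.

```python
def extended_mantissa_width(fraction_bits: int, dropped_exp_bits: int, max_items: int) -> int:
    """Width of the extended mantissa for a given format and data count."""
    headroom = (max_items - 1).bit_length()      # 7 bits for up to 127 items
    max_shift = (1 << dropped_exp_bits) - 1      # largest left shift of the mantissa
    return fraction_bits + headroom + max_shift + 2   # +2: hidden bit and sign (assumed)

print(extended_mantissa_width(52, 6, 127))   # 124 (double precision)
print(extended_mantissa_width(23, 5, 127))   # 63 (single precision)
```

Both results agree with the widths quoted in the description: 124 bits for double precision and 63 bits for single precision.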

Although a parallel computer with four nodes has been described, the invention can be applied to any parallel computer with two or more nodes. The node configuration has been described as a computer unit with a CPU, memory, and so on, but other computer configurations may be used. Furthermore, the transmission format is not limited to Ethernet (registered trademark); other network protocols can be applied.

(Supplementary note 1) A floating-point data summation processing method for calculating the sum of three or more floating-point data, comprising: a step of calculating, among a plurality of groups divided according to the magnitude of the exponent part of the floating-point data, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part; and a step of adding the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 2) The floating-point data summation processing method according to supplementary note 1, wherein the calculating step comprises a step of comparing the upper bits of the exponent parts and calculating, according to the comparison result, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 3) The floating-point data summation processing method according to supplementary note 1, wherein the calculating step comprises: a step of shifting the mantissa part according to the value of the lower bits of the exponent part to create a mantissa part with an extended data width; and a step of calculating, using the extended mantissa parts, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 4) The floating-point data summation processing method according to supplementary note 1, wherein the adding step comprises: a step of aligning the mantissa sum of the group having the second-largest exponent part with the mantissa sum of the group having the maximum exponent part; and a step of adding the mantissa sum of the group having the maximum exponent part and the aligned mantissa sum of the group having the second-largest exponent part.

(Supplementary note 5) The floating-point data summation processing method according to supplementary note 1, further comprising a step of creating the floating-point data from the result of adding the mantissa parts and the upper bits of the exponent part.

(Supplementary note 6) A computer system comprising a plurality of nodes and a reduction mechanism that calculates the sum of the floating-point data of the nodes, wherein the reduction mechanism calculates, among a plurality of groups divided according to the magnitude of the exponent part of the floating-point data, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part, and adds the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 7) The computer system according to supplementary note 6, wherein the reduction mechanism compares the upper bits of the exponent parts and calculates, according to the comparison result, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 8) The computer system according to supplementary note 6, wherein the reduction mechanism shifts the mantissa part according to the value of the lower bits of the exponent part to create a mantissa part with an extended data width, and calculates, using the extended mantissa parts, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 9) The computer system according to supplementary note 6, wherein the reduction mechanism aligns the mantissa sum of the group having the second-largest exponent part with the mantissa sum of the group having the maximum exponent part, and adds the mantissa sum of the group having the maximum exponent part and the aligned mantissa sum of the group having the second-largest exponent part.

(Supplementary note 10) The computer system according to supplementary note 6, wherein the reduction mechanism creates the floating-point data from the result of adding the mantissa parts and the upper bits of the exponent part.

(Supplementary note 11) A computer system comprising a plurality of nodes and a reduction mechanism that calculates the sum of the floating-point data of the nodes, wherein: each node calculates, among a plurality of groups divided according to the magnitude of the exponent part of the floating-point data within the node, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part, and sends the results to the reduction mechanism; the reduction mechanism calculates, over the plurality of nodes, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part, and returns the results to each node; and each node adds the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part returned from the reduction mechanism.

(Supplementary note 12) The computer system according to supplementary note 11, wherein each node compares the upper bits of the exponent parts and calculates, according to the comparison result, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 13) The computer system according to supplementary note 11, wherein each node shifts the mantissa part according to the value of the lower bits of the exponent part to create a mantissa part with an extended data width, and calculates, using the extended mantissa parts, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 14) The computer system according to supplementary note 11, wherein each node aligns the mantissa sum of the group having the second-largest exponent part with the mantissa sum of the group having the maximum exponent part, and adds the mantissa sum of the group having the maximum exponent part and the aligned mantissa sum of the group having the second-largest exponent part.

(Supplementary note 15) The computer system according to supplementary note 11, wherein each node creates the floating-point data from the result of adding the mantissa parts and the upper bits of the exponent part.

(Supplementary note 16) A computer system comprising a plurality of nodes and a reduction mechanism that calculates the sum of the floating-point data of the nodes, wherein: each node calculates, among a plurality of groups divided according to the magnitude of the exponent part of the floating-point data within the node, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part, and sends the results to the reduction mechanism; and the reduction mechanism calculates, over the plurality of nodes, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part, and adds the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

(Supplementary note 17) A program for causing a computer to calculate the sum of three or more floating-point data, the program causing the computer to execute: a step of calculating, among a plurality of groups divided according to the magnitude of the exponent part of the floating-point data, the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part; and a step of adding the sum of the mantissa parts of the group having the maximum exponent part and the sum of the mantissa parts of the group having the second-largest exponent part.

Because the result for the group with the maximum exponent is not affected by groups whose exponent value is smaller by 2 or more, only the sums of the maximum group and the second-largest group are computed, and these two sums are then added together. As a result, the calculation result is guaranteed to be identical regardless of the order in which the values are processed.

FIG. 1 is a configuration diagram of a computer system according to an embodiment of the present invention.
FIG. 2 is a configuration diagram of the node of FIG. 1.
FIG. 3 is a configuration diagram of the network adapter of FIGS. 1 and 2.
FIG. 4 is a format diagram of the transmission frame of FIG. 1.
FIG. 5 is a configuration diagram of the reduction mechanism of FIG. 1.
FIG. 6 is an explanatory diagram of the floating-point data summation processing according to the first embodiment of the present invention.
FIG. 7 is an explanatory diagram of the data conversion processing of FIG. 6.
FIG. 8 is an explanatory diagram of the complement data creation processing of FIG. 7.
FIG. 9 is a diagram of the relationship between the comparison results and the arithmetic processing of FIG. 5.
FIG. 10 is an explanatory diagram of the sum addition processing of FIG. 6.
FIG. 11 is an explanatory diagram of the conversion to floating-point data of FIG. 6.
FIG. 12 is a diagram of the relationship between the exponent upper bits and the absolute value of the mantissa part of FIG. 6.
FIG. 13 is an explanatory diagram of a worked example of the data conversion processing of FIG. 6.
FIG. 14 is an explanatory diagram of a worked example of the sum addition processing of FIG. 6.
FIG. 15 is an explanatory diagram of a worked example of the conversion to floating-point data of FIG. 6.
FIG. 16 is an explanatory diagram of the floating-point data summation processing according to the second embodiment of the present invention.
FIG. 17 is a configuration diagram of the reduction mechanism of FIG. 16.
FIG. 18 is a diagram of the relationship between the comparison results and the arithmetic processing of FIG. 17.
FIG. 19 is an explanatory diagram of the floating-point data format.
FIG. 20 is an explanatory diagram of conventional floating-point data summation processing.
FIG. 21 is an explanatory diagram of conventional floating-point data summation processing with the calculation order of FIG. 20 rearranged.
FIG. 22 is an explanatory diagram of conventional floating-point data summation processing that does not require the calculation order to be preserved.

Explanation of symbols

10, 11, 12, 13 Nodes
14A, 14B, 14C Network adapter
20, 21 Crossbar switch
22 Reduction mechanism (floating point summation circuit)
40 CPU
42 System controller
44 Memory
46 IO adapter
50 Host interface control circuit
52 Transmission control circuit
54 Network interface control circuit
56 Reception control circuit

Claims

A floating point data sum operation processing method for calculating the sum of three or more floating point data using a computer, the method comprising:
a step in which an arithmetic circuit of the computer calculates, among a plurality of groups divided according to the magnitude of the upper bits of the exponent part of the floating point data, the sum of the mantissa parts of the group whose exponent-part upper bits have the maximum value and the sum of the mantissa parts of the group whose exponent-part upper bits have the second-largest value; and
a step in which the arithmetic circuit of the computer adds the sum of the mantissa parts of the group whose exponent-part upper bits have the maximum value to the sum of the group whose exponent-part upper bits have the second-largest value.

A computer system comprising:
a plurality of nodes; and
a reduction mechanism that receives floating point data from each of the nodes and calculates the sum of the received floating point data,
wherein the reduction mechanism calculates, among a plurality of groups divided according to the magnitude of the upper bits of the exponent part of the received floating point data, the sum of the mantissa parts of the group whose exponent-part upper bits have the maximum value and the sum of the mantissa parts of the group whose exponent-part upper bits have the second-largest value, and adds the sum of the mantissa parts of the group whose exponent part has the maximum value to the sum of the mantissa parts of the group whose exponent part has the second-largest value.

A computer system comprising:
a plurality of nodes; and
a reduction mechanism that receives floating point data from each of the nodes and calculates the sum of the received floating point data,
wherein each node calculates, among a plurality of groups divided according to the magnitude of the upper bits of the exponent part of the floating point data, the sum of the mantissa parts of the group whose exponent-part upper bits have the maximum value within the node and the sum of the mantissa parts of the group whose exponent-part upper bits have the second-largest value, and sends the results calculated for each group to the reduction mechanism,
the reduction mechanism calculates, among the groups divided according to the magnitude of the upper bits of the exponent part of the floating point data received from the nodes, the sum of the mantissa parts of the group whose exponent-part upper bits have the maximum value across the floating point data received from the plurality of nodes and the sum of the mantissa parts of the group whose exponent-part upper bits have the second-largest value across that data, and returns the results calculated for each group to each node, and
each node adds the sum of the mantissa parts of the group whose exponent part has the maximum value, returned from the reduction mechanism, to the sum of the mantissa parts of the group whose exponent part has the second-largest value.