JP2522372B2

JP2522372B2 - Data driven computer

Info

Publication number: JP2522372B2
Application number: JP63321574A
Authority: JP
Inventors: 伸史小守; 浩乃坪田; 憲司嶋
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1988-12-19
Filing date: 1988-12-19
Publication date: 1996-08-07
Anticipated expiration: 2011-08-07
Also published as: JPH02165286A

Description

[Technical field]

The present invention relates to a so-called data-driven computer, which drives processing on the basis of the dependency relationships among the data to be processed.

[Conventional technology]

Representative examples of conventional data-driven computers are disclosed on pages 19 to 22 of the proceedings of the presentation meeting on research and development results for high-speed computing systems for science and technology (June 25, 1984) and on pages 486 to 490 of the IEEE COMPCON '84 SPRING proceedings (in English). Both documents describe the configuration and operation of a data-driven computer called Sigma-1. The prior art is described below on the basis of these documents.

FIG. 5 shows the configuration of the conventional data-driven computer mentioned above.

This data-driven computer consists of a network interface 14, which is the interface to the outside of the computer; a matching memory 10, which detects, among the data to be processed, two data items whose destination node numbers match; an instruction fetch unit 15, which fetches the operation instruction code on the basis of the node number; an instruction execution unit 12, which executes the operation according to the instruction code; a destination designation unit 16, which designates the next destination node number of the data after the operation; and the data paths that connect them.

FIG. 4 is an example of a program (data flow graph) used to explain the specific operation of the conventional data-driven computer.

In this data flow graph, data A and B are added at node #0, and the resulting data G is multiplied by another data item C at node #2 to obtain result data I; meanwhile, data D and E are added at node #1, and the resulting data H is multiplied by another data item F at node #4 to obtain result data J; ...; data W and X are multiplied at node #7096, and the resulting data Y is added at node #7097 to the previously obtained data I to obtain result data Z; and so on.

A packet (data carrying tag information) A (201), input from outside through the network interface 14, consists of a destination node number <0> and data [A]. This packet 201 is sent to the matching memory 10, and at the same time the destination node number <0> (203) in packet 201 is also sent to the instruction fetch unit 15. Assuming that packet B, which carries data [B], the other operand of the binary operation on data [A], has already been input and is waiting in the matching memory 10, the destination nodes of the two packets A and B are both #0 and therefore match, so the matching memory 10 outputs a packet 206 in which data [A] and data [B] are paired.
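The waiting-and-pairing behavior of the matching memory can be sketched as follows. This is only a minimal illustration, not the Sigma-1 design: the class name `MatchingMemory`, the method `accept`, and the use of a dictionary keyed by destination node number are assumptions made for the example, and the distinction between left and right operands is not modeled.

```python
from typing import Dict, Optional, Tuple


class MatchingMemory:
    """Toy model of matching memory 10: pairs operands by destination node number."""

    def __init__(self) -> None:
        # destination node number -> data of the packet that arrived first
        self._waiting: Dict[int, str] = {}

    def accept(self, node: int, data: str) -> Optional[Tuple[int, str, str]]:
        """Store the first operand; return (node, first, second) when the partner arrives."""
        if node in self._waiting:
            first = self._waiting.pop(node)
            return (node, first, data)
        self._waiting[node] = data
        return None


memory = MatchingMemory()
waiting_result = memory.accept(0, "B")  # packet B arrives first and waits
paired_result = memory.accept(0, "A")   # packet A arrives and the pair is output
```

Here `waiting_result` is `None` because packet B has no partner yet, while `paired_result` carries both operands for node #0, which corresponds to packet 206 in the text.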

Meanwhile, the instruction fetch unit 15 reads out and outputs the instruction code "+" (205), which is the content of the address corresponding to node number <0>.

The memory configuration of the instruction fetch unit 15 is shown in FIG. 6(a).

Next, the destination node number <0> input to the instruction fetch unit 15 is passed on unchanged, as node number <0> (20), to the destination designation unit 16. The instruction code "+" (205) read out by the instruction fetch unit 15 is given to the instruction execution unit 12 as packet 207, together with packet 206 containing the data pair [A], [B]. At this time, the destination designation unit 16 reads out <2> (208), which represents node #2, the node to which the result of the addition corresponding to node number <0> in the data flow graph of FIG. 4 should go next.

The memory configuration of the destination designation unit 16 is shown in FIG. 6(b).

At the same time, the instruction execution unit 12 performs the operation [A] + [B] and outputs the result data [G] (209). The output 208 of the destination designation unit 16 and the output 209 of the instruction execution unit 12, that is, the destination node number <2> and data [G], pass through the network interface 14 as packet 210 and are sent again to the instruction fetch unit 15 and the matching memory 10.

Through a chain of such processing, the operations corresponding to all the nodes of the data flow graph shown in FIG. 4 are performed, and execution of the program ends. Here, processing at nodes between which a data dependency exists, for example node #0 and node #2, can only be executed sequentially in that order, whereas processing at nodes between which no data dependency exists, for example node #0 and node #1, can be executed in parallel as far as processing resources allow. A data dependency here means a connection between two nodes such that the input data needed for processing at one node is supplied only when processing at the other node has completed.
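The firing rule described above (a node executes as soon as both of its operands have arrived) can be illustrated with a small interpreter. This is a hedged sketch for a fragment of the FIG. 4 graph only: the table `PROGRAM`, the function `run`, the numeric input values, and the single software queue standing in for the hardware data paths are all assumptions made for the example, and the destinations of nodes #2 and #4 are left out.

```python
import operator
from collections import deque

# node number -> (operation, destination nodes); a fragment of the FIG. 4 graph
PROGRAM = {
    0: (operator.add, [2]),  # A + B = G, sent to node #2
    1: (operator.add, [4]),  # D + E = H, sent to node #4
    2: (operator.mul, []),   # G * C = I (destinations omitted in this fragment)
    4: (operator.mul, []),   # H * F = J (destinations omitted in this fragment)
}


def run(packets):
    """Fire each node once both operands have arrived; return node -> result."""
    waiting = {}   # node -> operand that arrived first
    results = {}
    queue = deque(packets)
    while queue:
        node, value = queue.popleft()
        if node not in waiting:
            waiting[node] = value          # wait for the partner operand
            continue
        operation, destinations = PROGRAM[node]
        result = operation(waiting.pop(node), value)
        results[node] = result
        queue.extend((dest, result) for dest in destinations)
    return results


# A=1, B=2 to node #0; C=10 to node #2; D=4, E=5 to node #1; F=6 to node #4
results = run([(0, 1), (0, 2), (2, 10), (1, 4), (1, 5), (4, 6)])
```

Nodes #0 and #1 do not depend on each other, so their order of execution is decided only by packet arrival; node #2 cannot fire until node #0 has produced G.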

The memory configurations of the instruction fetch unit 15 and the destination designation unit 16 are shown in FIGS. 6(a) and 6(b), respectively, as noted above. Because the bit width of each item of information is not stated explicitly in the cited documents, the instruction code is assumed here to be 4 bits wide and the destination node number 16 bits wide. Likewise, because the cited documents give no specific description of the data copy operation, a widely known method is assumed for data copying.

The most significant bit "COPY" of the memory in the destination designation unit 16 is a copy bit. It is used when the result of the operation at a node in the data flow graph of FIG. 4 is sent to more than one node, so that the data is copied and each copy is given its own destination node number. For example, because the result of the operation at node #2 is sent to both node #7097 and node #5, the copy bit at address 2 of this memory holds "1". In this case, destination node number #7097 is read out, attached to the operation result [I], and output; the next address (address 3) is then read out in succession, and destination node number #5 is attached to the same operation result [I] and output.
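The copy mechanism of the conventional destination memory can be sketched as a loop that keeps reading consecutive addresses while the copy bit is set. The table `DEST_MEMORY` and the function `destinations` are illustrative names, the entries for addresses 0, 2 and 3 follow the text, and the copy bit of address 3 being 0 is an assumption implied by the description.

```python
# address -> (COPY bit, 16-bit destination node number), after FIG. 6(b)
DEST_MEMORY = {
    0: (0, 2),      # result of node #0 goes to node #2
    2: (1, 7097),   # result of node #2 goes to node #7097; COPY=1: read on
    3: (0, 5),      # continuation entry: the same result also goes to node #5
}


def destinations(node):
    """Return every destination node number of the result produced at `node`."""
    found = []
    address = node
    while True:
        copy_bit, destination = DEST_MEMORY[address]
        found.append(destination)
        if not copy_bit:
            return found
        address += 1  # COPY=1: the next address holds another destination


fan_out = destinations(2)
```

With this table, `fan_out` lists node #7097 first and node #5 second, in the order in which the text says they are read out.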

[Problems to be Solved by the Invention]

The conventional data-driven computer described above, however, has the problem that its program memory becomes large.

In a von Neumann computer, the type in widespread use today, instructions are processed sequentially in the order written in the program, so the execution address is managed centrally by a single register called the program counter. Consequently, except for branch instructions, the address of the next instruction to be executed (the jump destination) does not need to be specified, and the same program can be stored in a smaller program memory than a data-driven computer requires. In a data-driven computer, by contrast, instructions are processed in parallel, so the execution address cannot be managed centrally. In principle, therefore, every instruction must specify the address of the instruction to be executed next (the destination node number), and this enlarges the program memory. In the conventional example above as well, the destination node number occupies 16 bits of the 21-bit width of each instruction (a 4-bit instruction code plus a 17-bit destination field made up of the COPY bit and the destination node number).

In view of these circumstances, the present invention aims to reduce the hardware scale of a data-driven computer by compressing the program memory, and at the same time to improve performance through the shorter memory access time that results from the smaller memory.

[Means for solving the problem]

The data-driven computer of the present invention adopts a configuration in which a destination node number in the program memory is expressed, for example, as a variable-length relative address measured from the storage address of the current instruction.

[Action]

In the data-driven computer according to the present invention, the storage address of the instruction to be executed next is expressed, for example, as an address relative to the current instruction. The storage address of the next instruction is therefore obtained by performing an operation on the address of the current instruction and the relative address of the next instruction.

[Embodiment of the invention]

The present invention is described in detail below with reference to the drawings, which show an embodiment.

FIG. 1 is a block diagram showing the configuration of an embodiment of the data-driven computer according to the present invention.

The data-driven computer of the present invention consists of a network interface 14, which is the interface to the outside; a matching memory 10, which serves as a match detection unit that detects, among the data to be processed, two data items whose destination node numbers match; a program memory 11, which has two functions, namely fetching the operation instruction code on the basis of the node number and designating the next destination node number of the data after the operation; an instruction execution unit 12, which serves as an arithmetic processing unit that executes the operation according to the instruction code; an adder 13, which serves as the arithmetic means for updating the destination node number; and the data paths that connect them.

The operation of this data-driven computer is described for the case of executing the program (data flow graph) of FIG. 3. The data flow graph of FIG. 3 is exactly the same program as that of FIG. 4, but the node numbers are assigned differently for the purpose of explaining the present invention.

In this data flow graph, data A and B are added at node #0, and the resulting data G is multiplied by another data item C at node #2 to obtain result data I; meanwhile, data D and E are added at node #1, and the resulting data H is multiplied by another data item F at node #5 to obtain result data J; ...; data W and X are multiplied at node #7096, and the resulting data Y is added at node #7097 to the previously obtained data I to obtain result data Z; and so on.

Packet 101, input from outside via the network interface 14, contains the destination node number <0>, data [A], and the instruction code "+".

The data-driven computer of the present invention performs two-cycle pipeline processing. The first stage is the wait for data in the matching memory 10. Packet 101, input from outside, is sent to the matching memory 10. If the packet containing data [B], the other operand of the binary operation "+", has not yet arrived, packet 101 is stored temporarily in the matching memory and waits for the packet containing data [B] to arrive. If the packet containing data [B] has already arrived and is stored in the matching memory, the match between the destination node numbers <0> of the two packets is detected, and the matching memory 10 sends out packet 102, which is input packet 101 with data [B] added.

Of the information contained in packet 102, the destination node number <0> (103) is sent to the program memory 11 and to the adder 13 described later, and the remaining data [A], [B] and the instruction code "+" are sent to the instruction execution unit 12 as packet 104.

The instruction execution unit 12 adds data [A] and [B] according to the instruction code "+" and outputs the result data [G] (107).

The processing in the program memory 11, which is executed in parallel with the processing in the instruction execution unit 12, is described below.

FIG. 2 shows the memory configuration of the program memory 11 of the data-driven computer according to the present invention.

In the memory of FIG. 2, the most significant bit "EXT" of each word is an extended address flag, described later.

The next lower bit, "COPY", has the same meaning as the "COPY" bit in the conventional example. If this "COPY" bit is "1", the operation result has more than one destination node, and the instruction code and destination node number for each of those destination nodes are stored at the following addresses.

The 4 bits below the "COPY" bit hold the instruction code (OPECODE) to be executed in the next cycle, and the 6 bits below that hold the destination node number as an address relative to the current node number (RNODE).
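The 12-bit word layout described above (EXT, COPY, a 4-bit OPECODE, and a 6-bit RNODE, from the most significant bit down) can be written out as bit-field packing. This is a minimal sketch: the function names `pack_word` and `unpack_word` are illustrative, and the opcode value `ADD = 0x1` is a hypothetical encoding, since the text does not give the numeric instruction codes.

```python
ADD = 0x1  # hypothetical opcode encoding; the source does not specify one


def pack_word(ext, copy, opecode, rnode):
    """Pack EXT (bit 11), COPY (bit 10), OPECODE (bits 9:6), RNODE (bits 5:0)."""
    if ext not in (0, 1) or copy not in (0, 1):
        raise ValueError("EXT and COPY are single bits")
    if not 0 <= opecode < 16 or not 0 <= rnode < 64:
        raise ValueError("OPECODE is 4 bits and RNODE is 6 bits")
    return (ext << 11) | (copy << 10) | (opecode << 6) | rnode


def unpack_word(word):
    """Return (ext, copy, opecode, rnode) from a 12-bit word."""
    return (word >> 11) & 1, (word >> 10) & 1, (word >> 6) & 0xF, word & 0x3F


# Word for node #2 of FIG. 3: EXT=0, COPY=1, next instruction "+", RNODE=4
word = pack_word(0, 1, ADD, 4)
fields = unpack_word(word)
```

The round trip through `unpack_word` recovers the four fields, and every valid word fits in 12 bits.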

The specific operation of the data-driven computer of the present invention described above is as follows.

When the matching memory 10 detects a match for packet 101 and outputs packet 102, the node number <0> (103) of packet 102 is output to the program memory 11. Address 0 of the memory shown in FIG. 2 is then accessed, and the instruction code "*" and the relative destination node number <2> are read out and written to the instruction code register (OPE-REG) 131 and the destination node number register (DEST-REG) 132, respectively.

The instruction code is output unchanged from the instruction code register 131. The relative destination node number <2>, on the other hand, is sent from the destination node number register 132 to the adder 13, where it is added to the input node number <0> (103) supplied earlier, yielding the absolute destination node number <2> (106).

Packet 108, made up of the instruction code "*" and destination node number <2> obtained in this way and the data [G] output from the instruction execution unit 12, is supplied to the matching memory 10 again via the network interface 14. The whole program is executed by repeating this processing.

In the present invention, however, the field of the program memory 11 that stores the relative destination node number is limited to 6 bits, so an overflow occurs when the actual relative destination node number exceeds 63 (= 2⁶ − 1).

The relative destination node number corresponding to the arrow from node #2 to node #7097 in the data flow graph of FIG. 3 is one such case. For such cases, the present invention is configured so that setting the most significant bit "EXT" of the program memory word for the node to "1" allows the next address to be used as an extended address bit area that stores the overflowed digits.
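How an oversized relative destination node number divides between the 6-bit RNODE field and the extension word can be checked with a few lines of arithmetic. The names `split_relative` and `needs_extension` are illustrative; the numbers come from the arrow from node #2 to node #7097, whose relative value is 7095.

```python
RNODE_BITS = 6
RNODE_MAX = (1 << RNODE_BITS) - 1  # largest value a 6-bit field can hold


def needs_extension(relative):
    """True when the relative destination does not fit in the RNODE field."""
    return relative > RNODE_MAX


def split_relative(relative):
    """Return (low 6 bits kept in RNODE, overflowed upper bits for the next word)."""
    return relative & RNODE_MAX, relative >> RNODE_BITS


relative = 7097 - 2                      # node #2 to node #7097
low, high = split_relative(relative)     # parts stored in two consecutive words
rebuilt = (high << RNODE_BITS) | low     # what the destination register ends up holding
```

The low part is the value stored in the word with EXT set to "1", and the high part is what the following address holds.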

Consider the case in which the input data [G] and [C] needed for processing at node #2 of FIG. 3 are both present and the matching memory 10 outputs a packet made up of the destination node number <2>, the instruction code "*", and data [G] and [C].

As described above, the instruction code and data are sent to the instruction execution unit 12, the multiplication is executed, and the result data [I] is output. In parallel, the destination node number <2> is sent to the program memory 11, and the destination node numbers are read out. The detailed logical operation of the program memory 11 for an input node number <N> is shown in the flowchart of FIG. 7.

In this flowchart, "READ M" denotes the operation of reading address M of the memory; a notation such as bits (6:0) denotes the bit string made up of bit 6 through bit 0; and "EXT" and "COPY" denote the values of the extended address flag bit and the copy flag bit in the program memory, respectively.

When a packet with destination node number <2> is input to the program memory 11, the content of address 2 of the program memory 11 is read out (step S4). Of the content read out, the instruction code "+" is written to the instruction code register 131, and the relative destination node number <4> is written to the lower 6 bits, that is, bits (5:0), of the destination node number register 132 (steps S5, S6).

Because the "EXT" bit of the word read out is "0" (step S7), the instruction code is output together with the destination node number <6>, which the adder 13 obtains by adding the value <4> of the destination node number register to the input node number <2> (step S8).

Because the "COPY" bit of the word read out is "1", the read address <2> is incremented (step S10), the loop count <L> and the content of the destination node number register 132 are cleared (steps S2, S3), and address 3, the incremented address, is read out (step S4).

As in the first read, the instruction code "+" and the relative destination node number <55> (= "110111") in the word read out are written to the instruction code register 131 and to the lower 6 bits of the destination node number register 132, respectively (steps S5, S6). Because the "EXT" bit of the word read out is "1" (step S7), the read address is incremented (step S11), the content of address 4 is read out, and the lower 10 bits of that word are written to bits 15 through 6 of the destination node number register 132 (step S13). Because the EXT bit of the last word read out is "0" (step S15), the current instruction code "+" is output together with <7097>, the sum of the content <7095> (= "0001 1011 1011 0111") of the destination node number register 132 and the input node number #2 (step S8).

Because the COPY bit of the last word read out is "0" (step S9), the series of processing for the input packet with node number <2> ends.
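The walk-through for node #2 can be reproduced with a small model of the program memory lookup. This is a sketch under stated assumptions rather than the circuit of FIG. 7: the step labels in the comments follow the steps named in the text, the opcode encodings `ADD` and `MUL` are hypothetical, only a single extension word is modeled (the loop count <L> is not), the COPY flag is taken from the last word read as the text describes, and the COPY bit stored at address 3 is assumed to be 0.

```python
ADD, MUL = 0x1, 0x2  # hypothetical opcode encodings; the source gives none


def pack(ext, copy, opecode, rnode):
    """Build a 12-bit word: EXT | COPY | OPECODE (4 bits) | RNODE (6 bits)."""
    return (ext << 11) | (copy << 10) | (opecode << 6) | rnode


def lookup(memory, node):
    """Return [(opecode, absolute destination), ...] for input node number `node`."""
    outputs = []
    address = node
    while True:
        word = memory[address]                        # S4: READ M
        ext, copy = (word >> 11) & 1, (word >> 10) & 1
        ope_reg = (word >> 6) & 0xF                   # S5: write OPE-REG
        dest_reg = word & 0x3F                        # S6: DEST-REG bits (5:0)
        if ext:                                       # S7: an extension word follows
            address += 1                              # S11
            word = memory[address]
            ext, copy = (word >> 11) & 1, (word >> 10) & 1
            dest_reg |= (word & 0x3FF) << 6           # S13: DEST-REG bits (15:6)
            if ext:                                   # S15
                raise NotImplementedError("only one extension word is modeled")
        outputs.append((ope_reg, node + dest_reg))    # S8: adder 13
        if not copy:                                  # S9
            return outputs
        address += 1                                  # S10: next destination entry


PROGRAM_MEMORY = {
    0: pack(0, 0, MUL, 2),   # node #0: next instruction "*", relative <2>
    2: pack(0, 1, ADD, 4),   # node #2: next instruction "+", relative <4>, COPY=1
    3: pack(1, 0, ADD, 55),  # second destination: low 6 bits of 7095, EXT=1
    4: 110,                  # extension word: EXT=0, COPY=0, upper bits of 7095
}

node2_outputs = lookup(PROGRAM_MEMORY, 2)
```

For node #2 the model yields two outputs, one for node #6 and one for node #7097, matching the two passes through the flowchart described above.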

As described above, when destination node numbers are given as relative destination node numbers, processing is completed even if a relative destination node number overflows the predetermined bit width, because the next address of the program memory is used as an extended address area for the relative destination node number. The present invention is therefore applicable to a program of any scale and with any assignment of node numbers.

In general, connections between nodes are local, so a relative destination node number is expected to take a large value only rarely, regardless of program scale. To check this, the statistical distribution of relative destination node numbers was evaluated for a test program of 3987 nodes, with node numbers assigned so that nodes closer to the input receive smaller numbers. The result is shown in the graph of FIG. 8. The horizontal axis of the graph is the relative destination node number, the vertical axis is the cumulative frequency as a percentage, and the number given for each interval is the number of nodes that fall within that interval.

The graph of FIG. 8 shows that for about 90% (88%) of the total, the relative destination node number is within 63 addresses. In other words, even with a 6-bit relative destination node number, extension bits of the kind described above are needed in only about 10% of cases.
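The kind of measurement behind FIG. 8 can be sketched as a count of how many arcs have a relative destination that fits in a given number of bits. The function `fraction_within` is an illustrative name, and the arc list below contains only the five arcs of FIG. 3 mentioned in the text; it is not the 3987-node test program, so the resulting share says nothing about the 88% figure.

```python
def fraction_within(edges, bits=6):
    """Share of arcs whose relative destination (dest - src) fits in `bits` bits."""
    limit = (1 << bits) - 1
    relatives = [dest - src for src, dest in edges]
    return sum(1 for r in relatives if 0 <= r <= limit) / len(relatives)


# (source node, destination node) pairs taken from the FIG. 3 description
FIG3_EDGES = [(0, 2), (1, 5), (2, 6), (2, 7097), (7096, 7097)]
share = fraction_within(FIG3_EDGES)
```

Only the arc from node #2 to node #7097 falls outside the 6-bit range in this small sample.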

The program memory of the conventional example is 21 bits wide (4 bits for the instruction fetch unit plus 17 bits for the destination designation unit), whereas that of the present embodiment is 12 bits wide. Even if, according to the statistics above, implementing the present invention requires about 10% more words to store extended addresses, the memory scale is compressed to (12 × 1.1)/21 ≈ 0.63 times. The program memory space is 16 bits in the present embodiment, but the compression effect becomes more pronounced as the program memory space grows. For example, with a 32-bit program memory space, even if the relative destination node number field is widened to 8 bits and the proportion of words needing extended address bits is estimated at 20%, the memory scale can still be compressed to less than half.
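The compression estimate can be recomputed directly. The 16-bit case uses the figures given in the text. For the 32-bit case the expression is not preserved in this text, so the word widths below are assumptions: 37 bits for the conventional word (4-bit instruction code, COPY bit, 32-bit destination node number) and 14 bits for the relative-address word (EXT, COPY, 4-bit instruction code, 8-bit RNODE). The function name `compression_ratio` is illustrative.

```python
def compression_ratio(new_width, extra_word_fraction, old_width):
    """Memory scale relative to the conventional layout, for the same program."""
    return new_width * (1 + extra_word_fraction) / old_width


# 16-bit program memory space: figures stated in the text
ratio_16 = compression_ratio(12, 0.10, 21)

# 32-bit program memory space: the widths 14 and 37 are assumptions (see above)
ratio_32 = compression_ratio(14, 0.20, 37)
```

Under these assumed widths the 32-bit case comes out below one half, which is consistent with the conclusion stated in the text.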

In the embodiment above, the relative destination node number was limited to positive numbers to simplify the explanation. If it is instead treated as a signed number (two's complement representation), data can be sent to a node whose node number is smaller than that of the sending node. The present invention can thus also be applied to programs that contain loop structures.
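Treating the RNODE field as a two's complement number amounts to sign-extending it before the addition. A minimal sketch follows; the function names `sign_extend` and `next_node` are illustrative, and the 6-bit width is the one used in the embodiment.

```python
def sign_extend(value, bits=6):
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def next_node(current, rnode_field, bits=6):
    """Absolute destination node number for a signed relative field."""
    return current + sign_extend(rnode_field, bits)


backward = next_node(10, 0b111110)  # field value 62 read as -2: back to node #8
forward = next_node(10, 0b000100)   # field value 4 read as +4: on to node #14
```

With a signed 6-bit field the reachable range becomes -32 to +31 instead of 0 to 63, so a backward arc of a loop can be expressed without an extension word as long as it is short.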

Furthermore, in the present embodiment the overflow bits of the relative destination node number are assigned to the extension bits. As an alternative, the program may be divided into pages, with the address within a page expressed in, for example, 6 bits, and a relative page number assigned to the extension bits when the destination node lies beyond the current page.

In the embodiment above, the instruction fetch unit and the destination designation unit of the conventional data-driven computer are merged into a single program memory, and this program memory is placed in parallel with the instruction execution unit. Redundant input address latches, decoders, and the like are thereby omitted, reducing the hardware scale.

Because the program memory and the instruction execution unit are placed in parallel, the basic cycle for executing one instruction consists of two pipeline stages: processing in the matching memory, and processing in "program memory + instruction execution unit". The resulting basic cycle is as short as that of the conventional data-driven computer, and the response time for a single isolated instruction can also be kept short.

[Effect of the invention]

As described above, in the data-driven computer of the present invention, a destination node number in the program memory is specified, for example, as an address relative to the storage address of the current instruction. The bit width needed to store destination node numbers is thereby greatly reduced, and the scale of the program memory can be reduced.

In general, a data-driven computer is built on a computation model suited to parallel distributed processing, and is considered to have the distinctive feature that multiple tasks execute in parallel autonomously without any special control. With current schemes, however, in which the program memory address of each task is fixed, a currently running task Ta cannot execute at the same time as another task Tb that might share the same region of program memory, even if the program memory has free space. As a result, current data-driven computers cannot fully exploit their inherent suitability for multitasking.

In the data-driven computer of the present invention, by contrast, the node number of every node is specified as a relative destination node number from another node. Therefore, if the program memory has a free region, a new task can be loaded from outside into any free region, regardless of its absolute address, and executed in parallel with the other tasks.
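The position independence described above can be sketched as follows. This is a simplified model under stated assumptions, not the patented circuit: the program memory is a Python list, each entry is a hypothetical (opcode, relative destination) pair, and the names `load_task` and `trace` are invented for the example.

```python
# Illustrative sketch only: the memory model and all names are hypothetical.

def load_task(memory, base, task):
    """Copy a task into memory[base:base+len(task)] if that region is free."""
    end = base + len(task)
    if base < 0 or end > len(memory):
        raise ValueError("region is outside the program memory")
    if any(slot is not None for slot in memory[base:end]):
        raise ValueError("region is not free")
    memory[base:end] = task
    return base


def trace(memory, entry, steps):
    """Follow relative destinations from `entry`; report offsets from entry."""
    addr = entry
    visited = []
    for _ in range(steps):
        opcode, rel = memory[addr]
        visited.append((addr - entry, opcode))
        addr = addr + rel  # adder: current address + relative destination
    return visited


# A three-node task whose destinations are all relative to the current entry.
task = [("add", 1), ("mul", 1), ("sub", -2)]

memory = [None] * 16
load_task(memory, 3, task)    # first copy at absolute address 3
load_task(memory, 10, task)   # second copy at absolute address 10
```

Because every entry stores only an offset, `trace(memory, 3, 4)` and `trace(memory, 10, 4)` visit the same sequence of nodes. The same task image runs unchanged wherever a free region is found, while a load into an occupied region is rejected.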

The above concerns multitask execution on a single processor. However, in a configuration in which a plurality of data-driven computers (processors) #0 to #N are connected via a network, as shown in FIG. 9, if each of the tasks Ta, Tb, and Tc spanning the processors is made to occupy the same program memory region in every processor, then multitask execution and task creation are also possible on a multiprocessor whose processors exchange packets with one another over the network. High-speed parallel distributed processing using a plurality of data-driven computers can thus be carried out easily.
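One way to satisfy the "same program memory region in every processor" condition is to search for a region that is free on all processors at once. The specification does not state how such a region is chosen, so the first-fit search below, the list-based memory model, and the names `find_common_base` and `load_on_all` are all assumptions made for illustration.

```python
# Illustrative sketch only: the allocation policy and all names are hypothetical.

def find_common_base(memories, size):
    """Lowest base whose [base, base+size) region is free in every memory."""
    limit = min(len(m) for m in memories)
    for base in range(limit - size + 1):
        if all(m[a] is None for m in memories for a in range(base, base + size)):
            return base
    return None


def load_on_all(memories, task):
    """Place the same task image at the same base address on every processor."""
    base = find_common_base(memories, len(task))
    if base is None:
        raise MemoryError("no region is free on every processor")
    for m in memories:
        m[base:base + len(task)] = task
    return base


# Three processors with 8-entry program memories; "x" marks entries already in use.
memories = [[None] * 8 for _ in range(3)]
memories[0][0:2] = ["x", "x"]
memories[1][0:3] = ["x", "x", "x"]
memories[2][5] = "x"

base_b = load_on_all(memories, ["Tb0", "Tb1"])
base_c = load_on_all(memories, ["Tc0", "Tc1"])
```

With this occupancy the first task lands at address 3 and the second at address 6 on all three processors. A destination node number carried in a packet then refers to the same entry whichever processor receives it.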

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the data-driven computer according to the present invention. FIG. 2 is a schematic diagram showing the memory configuration of the program memory of the data-driven computer according to the present invention, more specifically the memory contents when the program of FIG. 3 is stored. FIG. 3 is a schematic diagram showing an example of a program (data flow graph) executed by the data-driven computer of the present invention. FIG. 4 is a schematic diagram showing an example of a program (data flow graph) executed by a conventional data-driven computer; it shows the same program as FIG. 3 except for the assignment of node numbers. FIG. 5 is a block diagram showing an example of a conventional data-driven computer. FIG. 6(a) is a diagram showing the memory configuration of the instruction fetch unit in the conventional data-driven computer, and FIG. 6(b) is a schematic diagram showing the memory configuration of the destination designation unit; both show the memory contents when the program of FIG. 4 is stored. FIG. 7 is a flowchart showing the detailed processing procedure in the program memory of the data-driven computer of the present invention. FIG. 8 is a graph showing the distribution of the values of the relative destination node numbers. FIG. 9 is a conceptual diagram showing a map of the program memory during multitask execution in a multiprocessor configuration.

10 ... matching memory, 11 ... program memory, 12 ... instruction execution unit, 13 ... adder

In the drawings, the same reference numerals denote the same or corresponding parts.

Continuation of the front page

(56) References
Komori et al., "A Proposal on a Program Storage Method for Data-Driven Processors," Parallel Processing Symposium JSSP '89 Proceedings (February 2-4, 1989), pp. 85-90.
Arvind & V. Kathail, "A Multiple Processor Data Flow Machine That Supports Generalized Procedures," Proceedings of the 8th Annual Symposium on Computer Architecture (May 12-14, 1981), pp. 291-302.

Claims

(57) [Claims]

1. A data-driven computer comprising: a match detection unit that detects, among the data to be processed, two pieces of data whose attached destination node numbers match; an arithmetic processing unit that performs arithmetic processing on the two pieces of data for which the match detection unit has detected a match, in accordance with the instruction code attached to the data; and a program memory whose contents are read out using, as an address, the destination node number attached to the data processed by the arithmetic processing unit, and which updates the destination node number and the instruction code of the data based on the contents read out, characterized in that the program memory includes address calculation means for performing a predetermined calculation between the destination node number attached to the data processed by the arithmetic processing unit and variable-length address information read out from the program memory using that destination node number as an address, and the result of the calculation by the address calculation means is set as the updated destination node number.