JPH0644299B2

JPH0644299B2 - Data Flow Processor

Info

Publication number: JPH0644299B2
Application number: JP18582887A
Authority: JP
Inventors: 薫内田
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1987-07-24
Filing date: 1987-07-24
Publication date: 1994-06-08
Anticipated expiration: 2009-06-08
Also published as: JPS6429937A

Description

[Detailed description of the invention]

(Industrial Field of Application) The present invention relates to a data flow processor in which a memory unit and an arithmetic unit are connected by a pipelined bus and the order of operations is controlled by the data flow method.

(Prior Art) One conventional data flow processor is the μPD7281 manufactured by NEC Corporation.

The μPD7281 has the configuration shown in FIG. 6. A token, the unit of data input to the device from the external bus, carries a data value, a link table address used to reference the link table 92 after input, and a module number indicating the device that should process the token. The token input unit 91 takes a token passing along the external bus into the device when the token's module number matches the device's own number; otherwise it passes the token unchanged back to the external bus through the token output unit 97. An input token references the link table 92 using its link table address, obtains there a function table address for referencing the function table 93 and a link table address for its next reference to the link table 92, and is then sent to the function table 93.

In the function table 93 the token performs a lookup using its function table address. There the management information for the data memory 94 is referenced and updated, and at the same time the token obtains a processing code indicating the operation to be performed in the processing unit 96 and an access address for the data memory 94. The token is then sent to the data memory 94, where, as required, it waits for the other operand of a binary operation or reads a constant for a constant operation. The queue memory 95 temporarily holds tokens while the processing unit 96 is still processing an earlier token and cannot accept the next one. When the processing unit 96 is not busy, a token is sent from the queue memory 95 to the processing unit 96 and undergoes one operation selected by its processing code: an arithmetic operation, a logical operation, a shift, a comparison, a bit inversion, priority encoding, branching, number generation, copying, or an operation using an internal register. If the token's processing code indicates output, the token is sent from the queue memory 95 to the token output unit 97, converted into the input token format, and output to the external bus. A token processed by the processing unit 96 is sent to the link table 92, where it again performs a lookup using its link table address. In this way the token circulates around the internal ring bus, with the required processing applied to its data value, until an output instruction is executed.

(Problems to Be Solved by the Invention) When the data flow processor described above handles multiple data items together in the form of vector data, the processing must be decomposed into a series of binary operations, and each partial binary operation can proceed only after its two operands have been matched inside the processor by a wait instruction. Because a token must travel once around the internal ring for each binary operation, processing is slow.

One remedy is to provide a register in the processing unit, stream the data items that are to be processed together into the processing unit one after another, and process them at high speed using the internal state held in the register. In that case, however, the processor can tell that the continuous operation on the vector data has finished and that the result should be output only if the number of data items is programmed in advance as part of the instruction and the passing tokens are counted. Vector data of variable length therefore cannot be input and processed.

For example, consider processing a vector that holds only the non-zero elements of an original matrix, as occurs in numerical computation on sparse matrices. The length of the vector data is then variable. When such data are read continuously from memory and input to a data flow processor that sums them, they flow as continuous data, so they must be given the same identifier and sent along the same arc of the flow graph that describes the processing program. To mark the end of the stream, a token carrying data indicating the end of the vector data must be appended as the last token. Furthermore, when several program modules that compute such sums reside on the same data flow processor and are to run independently, they share a single register. For them to operate correctly, synchronization must be implemented in software so that only one program module performs cumulative addition at any one time.

FIG. 7 shows an example of a flow graph describing the program for that case. In FIG. 7, as explained above, each item of a data token stream input with the same identifier is first tested to determine whether it is the end mark; the cumulative addition is then performed, and the register is read when the end mark is detected. In addition, to prevent the cumulative additions of several such program modules from interfering with one another, control is applied so that at most one program module operates at a time.

Thus, for the problem of inputting vector data of variable length, whose last element carries end-identifying information, into this data flow processor one after another and computing the sum of the vector elements, the processor cannot recognize the end of the vector and complete the processing by passing each input item through the processing unit only once. End detection must instead be implemented in software, and synchronization among multiple program modules must also be programmed. The software burden therefore grows, and the synchronization overhead inevitably slows processing.

The object of the present invention is to provide a data flow processor that achieves high-speed processing by reducing token traffic inside the processor. It does so by introducing a mechanism with which a token stream carrying variable-length vector data completes its processing by passing through the processing unit only once per token for the addition, and with which no interference arises between multiple program modules.

(Means for Solving the Problems) The data flow processor of the present invention comprises, connected in order on a ring bus: a link table that stores destination addresses of tokens; a function table address generation unit that generates addresses for referencing instruction codes; a function table that stores instruction codes; a data memory that temporarily stores data used in binary operations; a queue memory that temporarily holds tokens; and a processing unit that processes token data and outputs the result to the link table. It further comprises a token output unit that sends tokens from the queue memory to the external bus, and a token input unit that receives tokens from the external bus and sends them to the link table or to the token output unit.

(Operation) In the present invention, the sum of vector data of length N is obtained using a set token of length N that carries the vector data. In addition to the link table address, data value, and other fields that tokens have conventionally carried, each of these tokens carries a set identifier. For example, the set identifier is "1" for the first of the N tokens, "2" for the second through (N-1)th tokens, and "0" for the last token. The tokens all carry the same link table address, obtain the same function table base address from the link table lookup, and are then input consecutively to the function table address generation unit. There, the access address for the function table is obtained by an operation on the token's function table base address and its set identifier. As a result, in this example, the first token receives the instruction code "set the data value in the register of the processing unit, and the token disappears"; the second through (N-1)th tokens receive the instruction code "add the data value to the register of the processing unit, and the token disappears"; and the Nth token receives the instruction code "add the data value to the register of the processing unit, then make the register value the result data token".

With this method, even if the length of the vector data to be processed is not supplied in advance by the program in the data flow processor, the intended processing can be completed by sending the N tokens to the arithmetic unit only once, because the tokens of the input set token execute different instructions.

(Embodiment) An embodiment of the present invention will now be described with reference to the drawings.

FIG. 1 is an internal block diagram showing the configuration of a data flow processor according to one embodiment of the present invention. Reference numeral 11 denotes a token input unit, 12 a link table, 20 a function table address generation unit, 13 a function table, 14 a data memory, 15 a queue memory, 16 a processing unit, and 17 a token output unit. As shown in the figure, the link table 12, function table address generation unit 20, function table 13, data memory 14, queue memory 15, and processing unit 16 are connected in this order in a ring by a pipelined bus. Tokens are transferred on this internal ring bus in synchronization with the pipeline clock of the data flow processor.

FIG. 2 is an overall configuration diagram of an example data processing device using one embodiment of the present invention. In this device, a plurality of data flow processors 1 and 2 and one memory interface circuit 3 are connected by an external bus 5, and the external bus 5 is connected to a memory 4 through the memory interface circuit 3. Tokens are transferred asynchronously on the external bus 5 by handshaking.

FIG. 3 shows the formats of the token, the unit of data.

A token 60 on the external bus 5 consists of a module number 61, a set identifier 62, a link table address 63, and a data field 64.

Reference numeral 65 denotes the format of a token sent from the link table 12 to the function table address generation unit 20, and 66 denotes the format of a token sent from the function table address generation unit 20 to the function table 13.

The tokens used in this embodiment can be processed as a single unit in the form of a set token consisting of one or more tokens. The tokens of a set token always flow consecutively through the processing device, and they carry the same module number 61 and the same link table address 63, 83, 87.

The set identifier 62, 82, 86 identifies a token within its set token. It is "0" when the token forms a set token by itself. When the tokens of a multi-token set token need to be distinguished from one another, they may carry different values. The last token of a set token, however, always carries "0" as its set identifier.
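
The rule that only the last token of a set token carries the set identifier "0" is enough to recover set-token boundaries from a stream. The following is a minimal Python sketch of that idea, not the patent's hardware; the function name `split_into_set_tokens` and the list-of-integers representation are assumptions made for illustration.

```python
def split_into_set_tokens(set_ids):
    """Group a stream of set identifiers into set tokens.

    Assumption for illustration: the stream is given as a list of integer
    set identifiers. A set token ends at the first identifier equal to 0.
    """
    groups, current = [], []
    for set_id in set_ids:
        current.append(set_id)
        if set_id == 0:          # identifier 0 marks the last token of a set token
            groups.append(current)
            current = []
    if current:
        # The stream stopped before a terminating 0 arrived.
        raise ValueError("stream ended inside a set token")
    return groups
```

A lone token with identifier 0 forms a set token by itself, so a stream such as `[0, 1, 2, 2, 0]` splits into a single-token set token followed by a four-token one.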

The token input unit 11 receives tokens announced by the token input request signal 42 from the preceding data flow processor or memory interface circuit. It takes into the processor only those tokens whose module number 61 equals the number assigned to the device, and sends them to the link table 12 in synchronization with the pipeline cycle; all other tokens are sent unchanged to the token output unit 17 as pass-through tokens. However, if the link table 12 or the token output unit 17 is busy processing a set token, as described below, the token input unit 11 does not send the token on, and it halts input by withholding the handshake acknowledge signal 43 from the preceding data flow processor or memory interface circuit.
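
The routing decision of the token input unit can be sketched as follows. This is a hedged Python model, not the actual circuit: the `Token` class, the function `route_input`, and its return convention are hypothetical, and the sketch assumes that a stall occurs only when the unit the token is destined for is busy, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Token:
    # Fields of token format 60; names are illustrative.
    module_number: int
    set_id: int
    link_table_address: int
    data: int


def route_input(token, own_module, link_table_busy=False, token_output_busy=False):
    """Return (destination, acknowledge) for one token offered on the external bus.

    Assumption: only the busy state of the token's own destination matters.
    """
    if token.module_number == own_module:
        destination, busy = "link_table", link_table_busy
    else:
        destination, busy = "token_output", token_output_busy  # pass-through token
    if busy:
        return None, False   # hold the token and withhold the acknowledge signal
    return destination, True
```

Withholding the acknowledge is what propagates back-pressure to the preceding device on the asynchronous external bus.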

The link table 12 receives tokens from the processing unit 16 or the token input unit 11. When both request input at the same time, input from the token input unit 11 normally takes priority. However, while the processing unit 16 is generating consecutive tokens by a copy operation, the processing unit 16 takes priority. In addition, whichever source a token comes from, a set identifier other than "0" shows that the token is not the last token of a multi-token set token and that the rest of the set token will follow consecutively. The link table 12 therefore signals busy to the unit that is not the source of those tokens, halting its input, so that the whole set token is input consecutively with priority. This guarantees the contiguity of the set token.
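
The arbitration rule at the link table input can be modeled as a small state machine that locks onto one source while a set token is in flight. The sketch below is an illustrative Python model under stated assumptions: the class `LinkTableArbiter`, the source names, and the convention that each source offers either a set identifier or `None` per cycle are all hypothetical.

```python
class LinkTableArbiter:
    """Chooses which source feeds the link table in a given cycle."""

    def __init__(self):
        # Source currently in the middle of a set token, or None.
        self.locked_source = None

    def select(self, input_set_id, pu_set_id, pu_copying=False):
        """Each *_set_id is the set identifier offered by that source, or None."""
        offers = {"token_input": input_set_id, "processing_unit": pu_set_id}
        if self.locked_source is not None:
            source = self.locked_source          # the other source sees "busy"
        elif pu_copying and pu_set_id is not None:
            source = "processing_unit"           # copy operation has priority
        elif input_set_id is not None:
            source = "token_input"               # normal priority
        else:
            source = "processing_unit"
        set_id = offers[source]
        if set_id is None:
            return None                          # nothing to accept this cycle
        # A non-zero identifier means more tokens of the same set token follow.
        self.locked_source = source if set_id != 0 else None
        return source
```

Once a token with a non-zero set identifier is accepted, the arbiter keeps selecting the same source until that source delivers a token with identifier 0, which is how contiguity is preserved in this model.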

A token input to the link table 12 from the processing unit 16 or the token input unit 11 references the link table's internal memory using its link table address 63, as in the conventional device. It is then sent to the function table address generation unit 20 carrying the function table base address 81 obtained there and the link table address 83 for its next reference to the link table 12. The token format at this point is shown at 65 in FIG. 3.

The function table address generation unit 20 generates the function table address 85 as described later and sends a token in the format shown at 66 in FIG. 3 to the function table 13.

The function table 13 accesses its internal table using the function table address 85 of the input token 66. It references and updates the management information for the data memory 14 and, at the same time, appends to the token a processing code indicating the operation to be performed in the processing unit 16 and an access address for the data memory 14.

The data memory 14 is accessed with this access address. As required, it serves as a queue for matching the operands of a binary operation, or for matching the token carrying the data with the token carrying the address when a write token is output to the memory interface circuit, or as a memory for storing constants used in constant operations.

The queue memory 15 temporarily holds tokens before they are input to the processing unit 16 or the token output unit 17. It sends the token at the head of the queue to the processing unit 16 or to the token output unit 17 according to that token's processing code.

As in the conventional device, the processing unit 16 has some or all of the following functions applied to token data: arithmetic operations, logical operations, shifts, comparisons, bit inversion, priority encoding, branching, number generation, copying, and manipulation of an internal register and operations that use it. It processes the data of a token input from the queue memory 15 according to the processing code carried by that token and sends a token carrying the result to the link table 12.

When the processing code 68 of the token at the head of the FIFO that forms the queue memory 15 indicates output, the token is sent to the token output unit 17. If the token output unit 17 is busy, however, output to it is halted.
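
The dispatch behavior of the queue memory can be sketched as a FIFO whose head token selects its own destination. This is a simplified Python model under assumptions: tokens are dictionaries with a `"code"` key, the processing-code name `OUTPUT` is hypothetical, and a busy destination simply leaves the head token in place.

```python
from collections import deque

OUTPUT = "output"  # hypothetical name for the processing code that means "output"


class QueueMemory:
    def __init__(self):
        self.fifo = deque()

    def push(self, token):
        self.fifo.append(token)

    def dispatch(self, pu_busy, output_busy):
        """Send the head token onward, or return None if it must wait."""
        if not self.fifo:
            return None
        head = self.fifo[0]
        if head["code"] == OUTPUT:
            destination, busy = "token_output", output_busy
        else:
            destination, busy = "processing_unit", pu_busy
        if busy:
            return None                      # head token stays queued
        return destination, self.fifo.popleft()
```

Because only the head of the FIFO is inspected, a stalled output token also holds back the tokens queued behind it in this model.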

The token output unit 17 converts a token input from the queue memory 15 or the token input unit 11 into the token format 60 used on the external bus 5a, and outputs it through the external bus 5a to the following data flow processor or memory interface circuit. When the queue memory 15 and the token input unit 11 both request output at the same time, input from the token input unit 11 normally takes priority. However, whichever source a token comes from, a set identifier other than "0" shows that the token is not the last token of a multi-token set token and that the rest of the set token will follow consecutively. The token output unit 17 therefore signals busy to the unit that is not the source of those tokens, halting its input, so that the whole set token is input consecutively with priority. This guarantees the contiguity of the set token.

When the memory interface circuit 3 receives from data flow processor 1 or 2 a token 60 whose module number 61 designates it as the destination, it performs the operation specified by the link table address 63 and the data 64 in that token. For example, when given a read request token, it reads the memory 4 using the value in the data 64 as the address and sends a token carrying the value read to the data flow processor specified by the link table address 63. When a token carrying write data and a token carrying a write address are given as a set token, it uses them to write to the memory 4. Furthermore, when given a token carrying a value N and a "read set token generation instruction", the memory interface circuit 3 performs N read operations on the memory 4, generates a set token of length N whose tokens carry the values read and the appropriate set identifiers, and sends it to the data flow processor.
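
The "read set token generation instruction" can be illustrated with a short Python sketch. The text does not say how the N read addresses are chosen, so this sketch assumes N consecutive addresses beginning at a start address; the function name, its parameters, the dictionary token layout, and the 1/2/0 identifier scheme (taken from the summation example in this document) are illustrative assumptions.

```python
def generate_read_set_token(memory, start, n, link_table_address, module_number):
    """Build a set token of length n from n memory reads.

    Assumptions: reads are from consecutive addresses start..start+n-1, and
    set identifiers follow the 1, 2, ..., 2, 0 scheme of the summation example.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        set_ids = [0]                        # a lone token is its own set token
    else:
        set_ids = [1] + [2] * (n - 2) + [0]  # first, middle tokens, last
    return [
        {
            "module_number": module_number,
            "set_id": set_id,
            "link_table_address": link_table_address,
            "data": memory[start + i],
        }
        for i, set_id in enumerate(set_ids)
    ]
```

Every generated token shares the same module number and link table address, which is what lets the receiving processor treat the tokens as one set token.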

Next, the embodiment of the function table address generation unit 20 is described in detail with reference to FIG. 4.

The function table address generation unit 20 has a register 50, in which it holds a token of format 65 input from the link table 12, in synchronization with the pipeline clock. The value of the function table base address 81 and the value of the set identifier 82 of the token 65 held in the register 50 are input to an adder 51, and the adder 51 outputs their sum as the value of the function table address 85 field of the output token of format 66. The other fields 86, 87, 88 of the output token 66 take the values of the corresponding fields 82, 83, 84 in the register 50 unchanged.
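
The address generation step amounts to one addition plus a copy of the remaining fields. The Python sketch below models it with dictionaries; the key names are illustrative, and treating field 84/88 as the data field is an assumption, since the text does not name those fields.

```python
def generate_function_table_token(token65):
    """Model of adder 51: format-65 token in, format-66 token out."""
    return {
        # Field 85: function table base address (81) plus set identifier (82).
        "function_table_address": (
            token65["function_table_base_address"] + token65["set_id"]
        ),
        # Fields 86, 87, 88 are copied unchanged from fields 82, 83, 84.
        "set_id": token65["set_id"],
        "link_table_address": token65["link_table_address"],
        "data": token65["data"],  # assumed meaning of field 84/88
    }
```

With base address b, tokens carrying set identifiers 1, 2, and 0 are mapped to function table addresses b+1, b+2, and b respectively.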

Consequently, even though the tokens of a set token carry the same function table base address, because they entered the link table with the same link table address, they can carry different function table addresses by carrying different set identifiers. By using these to access different addresses of the function table 13, the individual tokens of a set token, which share the link table address 63 and always flow consecutively through the data flow processor, can each receive different processing.

Next, the method and operation for obtaining the sum of vector data of length N with this data flow processor are described.

In advance, b is set as the function table base address at address a of the link table 12. In the function table 13, the instruction code "set the data value in the register of the processing unit, and the token disappears" is set at address (b+1), the instruction code "add the data value to the register of the processing unit, and the token disappears" at address (b+2), and the instruction code "add the data value to the register of the processing unit, then make the register value the result data token" at address b.

A set token of length N is prepared in the memory interface circuit. Each token carries one of the N data items to be summed as its data and the value a as its link table address. As set identifiers, the first token carries "1", the second through (N-1)th tokens carry "2", and the last token carries "0".

The tokens of this set token are input consecutively to the link table 12 and access address a, indicated by their link table address. As a result, N consecutive tokens in format 65, carrying b as the function table base address 81, are sent to the function table address generation unit 20. In the function table address generation unit 20, the first token obtains (b+1) as its function table address 85 by adding the set identifier "1" to the function table base address; the second through (N-1)th tokens likewise obtain (b+2); and the last token obtains b.

These tokens are sent consecutively, as they are, to the function table 13 and access the table with their function table addresses. The first token thereby receives the instruction code "set the data value in the register of the processing unit, and the token disappears"; the second through (N-1)th tokens receive the instruction code "add the data value to the register of the processing unit, and the token disappears"; and the Nth token receives the instruction code "add the data value to the register of the processing unit, then make the register value the result data token". The processing unit 16 then obtains the sum by using the register according to these codes.
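
The whole summation walk-through can be traced in a few lines of Python. This is a behavioral sketch under assumptions, not the hardware: the instruction names `SET`, `ADD`, and `ADD_OUT`, the function `run_sum`, and the default values of the addresses a and b are illustrative, and the vector is assumed to have at least two elements so that the first token (identifier 1) and last token (identifier 0) are distinct.

```python
SET, ADD, ADD_OUT = "set", "add", "add_and_output"  # hypothetical code names


def run_sum(values, a=3, b=8):
    """Trace one set token of length N >= 2 through the tables and the register."""
    n = len(values)
    if n < 2:
        raise ValueError("this sketch assumes N >= 2")
    link_table = {a: b}                                   # address a -> base b
    function_table = {b + 1: SET, b + 2: ADD, b: ADD_OUT}
    set_ids = [1] + [2] * (n - 2) + [0]

    register = None
    results = []
    for data, set_id in zip(values, set_ids):
        base = link_table[a]                  # same base for every token
        code = function_table[base + set_id]  # adder 51: base + set identifier
        if code == SET:
            register = data                   # token disappears
        elif code == ADD:
            register += data                  # token disappears
        else:
            register += data
            results.append(register)          # register value becomes the result token
    return results
```

Each token passes through the processing unit exactly once, and no token count appears anywhere in the tables, so the same table setup works for any N of two or more.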

Unlike the present embodiment, it is also possible to use "0" as the set identifier of the last token of a set token of length N and "1" for all other tokens, with instructions such that the first through (N-1)th tokens add to the register and the last token simultaneously performs the addition, outputs the result, and clears the register. Alternatively, a set token of length N+1 consisting of N tokens with set identifier "1" and one token with set identifier "0" can be used, where the former perform register addition and the latter outputs the result and clears the register without performing an addition.
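
The two alternative encodings can be compared with a behavioral Python sketch. The function names are hypothetical, and the sketch assumes the register holds zero before the first set token arrives, which the text implies through the clear operation but does not state.

```python
def sum_variant_last_adds(vectors):
    """Length-N set token: identifiers 1, ..., 1, 0; the last token adds, outputs, clears."""
    register, results = 0, []          # assumption: register starts cleared
    for vec in vectors:
        set_ids = [1] * (len(vec) - 1) + [0]
        for data, set_id in zip(vec, set_ids):
            register += data
            if set_id == 0:
                results.append(register)
                register = 0           # clear for the next set token
    return results


def sum_variant_terminator(vectors):
    """Length-(N+1) set token: N data tokens with 1, then a terminator with 0."""
    register, results = 0, []          # assumption: register starts cleared
    for vec in vectors:
        tokens = [(data, 1) for data in vec] + [(None, 0)]
        for data, set_id in tokens:
            if set_id == 1:
                register += data
            else:
                results.append(register)   # output without adding
                register = 0
    return results
```

Both variants need only two function table entries instead of three, at the cost of relying on the register being cleared between set tokens; the second also spends one extra token per vector.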

As an example, FIG. 5 shows a flow graph describing a program that performs, with the present embodiment, the cumulative addition discussed in the section on the problems of the conventional device. In FIG. 5, as explained above, the data token stream is input with a single identifier and its end is identified by the set identifier of the token, so no instruction for testing the end is needed. Moreover, each vector always flows contiguously because of the control applied to set tokens, so multiple program modules cannot interfere with one another and no synchronization mechanism needs to be built in. This lightens the programming burden and, because there is no program overhead for synchronization control, speeds up processing.

(Effects of the Invention) As described above, the present invention makes it possible for individual tokens within continuous data to receive different processing. In particular, in cases where preprocessing and postprocessing have conventionally been performed before and after identical processing of consecutive tokens, as when a register is used, the preprocessing and postprocessing can be performed continuously with the intermediate processing by sending the N tokens to the arithmetic unit only once, even if the number of consecutive data items is not known in advance. This lightens the programming burden and speeds up processing.

[Brief description of drawings]

FIG. 1 is a configuration diagram of the data flow processor of the present invention; FIG. 2 is an overall configuration diagram showing an example of a processing device using the data flow processor; FIG. 3 is a diagram showing the token formats used in explaining the present invention; FIG. 4 is a detailed configuration diagram showing an embodiment of the function table address generation unit 20; FIG. 5 is a flow graph showing how cumulative addition is implemented in the present invention; FIG. 6 is a configuration diagram of a data flow processor for explaining the conventional device; and FIG. 7 is a flow graph showing how cumulative addition is implemented in the conventional device.

In the figures: 11, token input unit; 12, link table; 13, function table; 14, data memory; 15, queue memory; 16, processing unit; 17, token output unit; 20, function table address generation unit.

Claims

[Claims]

1. A data flow processor that performs processing by circulating tokens, the unit of data, on an internal ring-shaped bus, comprising, connected in order on the bus: a link table that stores destination addresses of tokens; a function table address generation unit that generates addresses for referencing instruction codes; a function table that stores instruction codes; a data memory that temporarily stores data used in binary operations; a queue memory that temporarily holds tokens; and a processing unit that processes token data and outputs the result to the link table; and further comprising a token output unit that sends tokens from the queue memory to an external bus, and a token input unit that receives tokens from the external bus and sends them to the link table or the token output unit; wherein, when a set token consisting of a plurality of consecutively flowing tokens that carry set identifiers for distinguishing them from one another is given as input to the data flow processor, the function table address generation unit generates the address for referencing the processing instruction for each token, stored in the function table, by an operation on a function table base address common to the tokens of the set token and the set identifier that differs from token to token.