JP5145659B2

JP5145659B2 - Vector renaming method and vector computer

Info

Publication number: JP5145659B2
Application number: JP2006169010A
Authority: JP
Inventors: 聡多賀谷
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2006-06-19
Filing date: 2006-06-19
Publication date: 2013-02-20
Anticipated expiration: 2026-06-19
Also published as: JP2007334819A

Description

The present invention relates to a vector computer, and more particularly to a register renaming method that improves computational throughput by enabling register renaming, and to a vector computer that employs such a register renaming method.

To improve the performance of a computer system, it is necessary to increase the number of instructions that can be executed per unit time. In particular, to improve sustained performance, it is necessary to raise the rate at which instructions are issued (the issue rate).

In a computer system, arithmetic processing is generally performed between registers. When pipeline processing is used, however, a certain number of machine cycles must elapse after an instruction is issued for a register before that same register can be used by another instruction. If the hardware provides more registers than the program assumes, then when the program contains several instructions that target the same register, the hardware can use, at execution time, a register different from the one specified by the program, and the computation can thereby be sped up. The technique that speeds up computation in this way is register renaming.

In register renaming as generally practiced, an alias is given to the register into which an operation result is written, which relaxes resource dependencies. For example, suppose that instruction A is in the middle of reading data from logical resource X and a subsequent instruction B writes to the same logical resource X. If physical resources X1 and X2 are assigned to A and B, respectively, for logical resource X, the data can be protected from being overwritten by the execution of B. This relaxes the issue restriction on instruction B, so an improvement in the issue rate can be expected.

However, when register renaming is to be applied to a vector computer, there is a problem: a vector computer has a mask function that can switch computation on or off for each element, so a renaming mechanism cannot be realized in a straightforward way. For example, if the above-mentioned instruction B is masked, the masked elements must hold the original value of the destination, that is, the corresponding elements of the execution result of instruction A. This is because, as long as the operation is masked, data already written to a register may be referenced by an instruction executed later. The mask function in a vector computer is described, for example, in Patent Documents 1 and 2.
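The problem described above can be illustrated with a minimal Python sketch. The function name `masked_write` and all data values are hypothetical; the convention that a mask bit of 1 means "element not computed" follows the embodiment described later in this document.

```python
def masked_write(old, result, mask):
    """Return the register contents the program expects after a masked write.

    mask[i] == 1 means element i is masked: it is not computed and keeps
    its old value. mask[i] == 0 means element i receives the new result.
    """
    return [o if m else r for o, r, m in zip(old, result, mask)]


# Logical register X currently holds the result of instruction A (physical X1).
x1 = [10, 20, 30, 40]
# Instruction B produces new values, but elements 1 and 3 are masked.
b_result = [1, 2, 3, 4]
mask = [0, 1, 0, 1]

# Naive renaming: B writes only its unmasked elements into a fresh register X2.
x2_naive = [None] * len(x1)
for i, m in enumerate(mask):
    if not m:
        x2_naive[i] = b_result[i]

# Contents the program expects logical register X to have after B.
expected = masked_write(x1, b_result, mask)
```

With naive renaming the masked elements of X2 are never filled in, so X2 alone does not represent logical register X; the values of X1 have to be carried over for those elements.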

For this reason, simply assigning a plurality of independent resources to A and B is not sufficient; values must be passed between those resources, and existing register renaming techniques could not simply be introduced. A configuration in which values are passed between resources is described, for example, in Patent Document 3.
Patent Document 1: JP S60-150473 A
Patent Document 2: JP H01-284970 A
Patent Document 3: JP 2000-172505 A

As described above, the conventional register renaming technique is widely used as a way to raise processing performance by improving the instruction issue rate, but it has the problem that it cannot be applied to a vector computer as is. Adopting a configuration in which values can be passed between resources, as described in Patent Document 3, has the problem that the hardware configuration becomes complicated.

Accordingly, an object of the present invention is to provide a vector renaming method that can realize masked operation and register renaming at the same time in a vector computer using simple hardware, and a vector computer that employs such a vector renaming method.

The vector renaming method of the present invention enables register renaming in a vector computer while retaining the mask operation function. It comprises a plurality of renaming registers that correspond to a logical vector register and form a set with one another. Each renaming register has cells, each holding 1 bit of data, and a selector provided for each cell. According to a mask signal, the selector supplies to its cell either the write data or the value held by another cell in the set. When the mask is set, the cell value is copied between cells; when the mask is not set, the write data is written to one of the cells.

The vector computer of the present invention comprises: control means for controlling register renaming; a plurality of renaming registers that correspond to a logical vector register and form a set with one another; an arithmetic unit; a first selector that is controlled by the control means and supplies the output of one of the plurality of renaming registers to the arithmetic unit; and write control means that receives the operation result of the arithmetic unit and stores it in one of the plurality of renaming registers. Each renaming register has cells, each holding 1 bit of data, and a second selector provided for each cell. According to a mask signal, the second selector supplies to its cell either the write data from the write control means or the value held by another cell in the set. When the mask is set, the cell value is copied between cells; when the mask is not set, the write data is written to one of the cells.

In the present invention, the plurality of renaming registers mapped to a logical vector register are coupled to one another so that the value copies needed for masked operation can be performed. This has the effect that register renaming can be realized, with masked operation retained, using a simple cell configuration. In particular, because the present invention makes masked operation and register renaming compatible without requiring large-scale hardware, the issue rate of vector instructions can be greatly improved at low cost.

Next, a preferred embodiment of the present invention will be described with reference to the drawings. FIG. 1 shows the configuration of a cell set (a set of cells) used in the vector renaming method according to an embodiment of the present invention.

The cell set shown in FIG. 1 includes 1-bit RAM (random access memory) cells C0 and C1, and the two bits correspond to the two physical registers mapped to the same logical register. Here, one logical vector register is assumed to be renamed to either of two physical vector registers, so the cell set has two cells; in practice, depending on the form of renaming, one cell set may be composed of three or more cells.

These cells are preferably laid out close to one another in hardware. A unit in which cells are grouped in this way is called a cell set.

Selectors S0 and S1 are provided in the data write paths of cells C0 and C1, respectively. Write data WD is input to the cell set and serves as the input data when writing to either or both of cells C0 and C1. The output RD1 of cell C1 is also connected to selector S0, and the output RD0 of cell C0 is also connected to selector S1. As a result, cells C0 and C1 are cross-coupled through selectors S0 and S1. A mask signal MASK is supplied to selectors S0 and S1. If the MASK signal is active ("1"), selectors S0 and S1 supply the output of the partner cell (the outputs of cells C1 and C0, respectively) to cells C0 and C1 as write data; if the MASK signal is not active ("0"), they supply the write data WD. Read data RD0 and RD1 are read from cells C0 and C1, respectively. Which of the read data RD0 and RD1 is supplied to each arithmetic unit of the vector computer (FMAC 70 and 73, FDV 71 and 74, and FLOGIC 72 and 75, described later) is selected by a selector 80, described later, which is controlled by a selection signal 42 from the resource management unit 4. Selector 80 is provided to select, from the physical vector registers provided in pairs, the one currently mapped to the logical register and to supply its output to each arithmetic unit.

Write enable signals WE0 and WE1 are supplied to cells C0 and C1, respectively. WE0 is the write enable for C0; if this signal is active, the signal currently selected by selector S0 is written to cell C0. That is, if WE0 is active, either the data WD or the output value of cell C1 is written to cell C0. For example, if signal WE0 becomes active ("1") while the mask signal MASK is "1", the contents of cell C1 are written to cell C0, that is, the value is copied. If signal WE0 becomes active while MASK is "0", the value of the write data WD is written to cell C0. Similarly, signal WE1 is the write enable for cell C1; when WE1 becomes active, either the data WD or the output value of cell C0 is written to cell C1.

In short, by specifying the mask signal (mask bit) MASK, this cell set can switch the data written to a cell between the external write data WD and the data of the opposite cell within the cell set (cell C1 if the cell being written is cell C0).
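The behaviour of one cell set can be sketched as a small Python model. This is a behavioural illustration only, not the circuit of FIG. 1: the class name `CellSet` and method `clock` are hypothetical, and the model assumes that both selector outputs are sampled before either cell is updated.

```python
class CellSet:
    """Behavioural model of one cell set: two 1-bit cells C0 and C1,
    cross-coupled through selectors S0 and S1."""

    def __init__(self, c0=0, c1=0):
        self.cells = [c0, c1]

    def clock(self, wd, mask, we0, we1):
        """Apply one write cycle and return the new (C0, C1) contents.

        S0 feeds C0 with RD1 when mask == 1, otherwise with WD.
        S1 feeds C1 with RD0 when mask == 1, otherwise with WD.
        A cell is updated only if its write enable is active.
        """
        rd0, rd1 = self.cells          # read both cells first (assumption)
        s0 = rd1 if mask else wd       # selector S0 output
        s1 = rd0 if mask else wd       # selector S1 output
        if we0:
            self.cells[0] = s0
        if we1:
            self.cells[1] = s1
        return tuple(self.cells)


cs = CellSet(c0=1, c1=0)
# MASK = 1, WE1 = 1: C1 receives a copy of C0; WD is ignored.
after_copy = cs.clock(wd=0, mask=1, we0=0, we1=1)
# MASK = 0, WE0 = 1: C0 receives the external write data WD.
after_write = cs.clock(wd=0, mask=0, we0=1, we1=0)
```

In this model the masked case never consumes WD, which mirrors the description that a masked element inherits the value held by the partner cell.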

A vector computer has an operation mask function that can specify, for each element, whether the operation is to be performed, and it has vector-maskable instructions, which leave the value of a masked element unchanged. A renaming mechanism in a vector computer therefore has to take the masked, "unchanged" data into account. When physical registers VRR100 and VRR108 correspond to logical register VR0, a path for passing information is required between the corresponding elements of the physical vector registers VRR100 and VRR108 that back the logical vector register VR0.

With the cell set of this embodiment, if the original data is held in one of the VRR (physical vector register) cells of the cell set, that value can be inherited when the cell on the opposite side is updated, so the renaming operation and the mask operation can both be supported. In this embodiment, there are as many such cell sets as there are bits in the logical vector registers. For example, an architecture with eight logical vector registers, each having 256 entries of 64 bits, requires 64 × 256 × 8 = 131,072 of the cell sets described above. If each cell set consists of two cells, there are 262,144 physical cells.
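The cell counts quoted above can be checked with a few lines of Python; the constant names are illustrative only.

```python
BITS_PER_ELEMENT = 64        # width of one vector element
ENTRIES_PER_REGISTER = 256   # elements per vector register
LOGICAL_REGISTERS = 8        # VR0 to VR7
CELLS_PER_SET = 2            # two physical registers per logical register

# One cell set per bit of every logical vector register.
cell_sets = BITS_PER_ELEMENT * ENTRIES_PER_REGISTER * LOGICAL_REGISTERS
# Each cell set holds one cell per physical register in the pair.
physical_cells = cell_sets * CELLS_PER_SET
```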

FIG. 2 shows an example of the configuration of a vector computer that performs register renaming using the cell set described above. This vector computer includes an instruction supply unit 1 that contains a general instruction cache and the like, a decoder 2 that decodes instructions from the instruction supply unit 1, an issue control unit 3 that controls the issue of decoded instructions, and a resource management unit 4 that manages the resources used by the instructions issued by the issue control unit 3. For the instructions issued by the issue control unit 3, the resource management unit 4 maintains a correspondence table 41 between logical vector register numbers and physical vector register numbers, specifically for the register renaming function. FIG. 3 shows an example of the contents of the correspondence table 41.

Here, VRR10 is the vector renaming register: the group of actual registers that logically constitute vector registers VR0 to VR7. It consists of 16 physical vector registers, VRR100 to VRR115. Each physical vector register has 256 entries of 64-bit data registers. A control signal 31 is supplied from the issue control unit 3 to the vector renaming register VRR10.

The physical value of logical register VR0 is stored in either physical vector register VRR100 or physical vector register VRR108. Similarly, the physical value of logical register VR1 is stored in either physical vector register VRR101 or physical vector register VRR109, and the remaining logical registers are associated with physical vector registers in the same manner.
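The pairing of logical and physical registers can be sketched in Python as follows. The names `candidates` and `RenameTable` are hypothetical, and the initial mapping of every logical register to the lower-numbered register of its pair is an assumption made for illustration; the actual contents of correspondence table 41 are given in FIG. 3, which is not reproduced here.

```python
NUM_LOGICAL = 8  # logical vector registers VR0 to VR7


def candidates(logical):
    """Return the two physical register numbers that can back VR<logical>."""
    if not 0 <= logical < NUM_LOGICAL:
        raise ValueError("logical register number out of range")
    return (100 + logical, 108 + logical)


class RenameTable:
    """Toy stand-in for the logical-to-physical correspondence table."""

    def __init__(self):
        # Assumption: each logical register starts on the first of its pair.
        self.current = {n: candidates(n)[0] for n in range(NUM_LOGICAL)}

    def rename(self, logical):
        """Switch VR<logical> to its partner register; return (old, new)."""
        first, second = candidates(logical)
        old = self.current[logical]
        new = second if old == first else first
        self.current[logical] = new
        return old, new


table = RenameTable()
first_rename = table.rename(0)    # VR0: VRR100 -> VRR108
second_rename = table.rename(0)   # VR0: VRR108 -> VRR100
```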

The vector computer is provided with FMAC (arithmetic units) 70 and 73, which perform addition and multiplication, FDV (dividers) 71 and 74, which perform division, and FLOGIC (logic units) 72 and 75, which perform logical operations, as well as a data write control unit 9 to which each calculation result is sent. Based on the correspondence between the logical vector register number and the assigned physical vector register number (that is, VRR100 to VRR115) and on the mask information, the data write control unit 9 writes, to the appropriate location in the vector renaming register VRR10, either the data obtained as the calculation result or the data already written in the physical vector registers VRR100 to VRR115.

A selector 80 is provided to select one of the outputs of a pair of physical vector registers, and the vector data read from the physical vector register selected by selector 80 is supplied to FMAC 70, FDV 71, FLOGIC 72, FMAC 73, FDV 74, and FLOGIC 75. Selector 80 is controlled by the selection signal 42 from the resource management unit 4. For example, selector 80 selects one of physical vector register VRR100 and physical vector register VRR108.

The vector computer is further provided with an LS unit 5 and an external storage memory 6. The LS unit 5 manages the operation of storing data from the physical vector registers VRR100 to VRR115 into the external storage memory 6, and the operation of loading data from the external storage memory 6 and writing it to one of the physical vector registers VRR100 to VRR115.

Next, the register renaming operation in this vector computer will be described.

In the following description, a "vector" means a data sequence that contains a plurality of element data. Here, a vector is assumed to hold at most 256 elements of 64 bits each. The processing performed when the instruction sequence shown in FIG. 4 is executed is described below. In the instruction sequence shown in FIG. 4, "VADD" is an instruction that adds, element by element, the vector data stored in the two logical vector registers specified by its operands and stores the result in the destination vector register.

"VSUB" is an instruction that subtracts, element by element, the vector data stored in the two logical vector registers specified by its operands and stores the result in the destination vector register.

Both the VADD instruction and the VSUB instruction allow elements on which no operation is performed to be specified selectively (that is, they can be masked). The VFMK instruction changes the vector mask. The number of elements on which a vector operation is actually performed (VL, the vector length) can be changed by software between 0 and 256. The LVL instruction changes the number of vector elements to be operated on. For example, if VL is set to 10 by the LVL instruction, a vector operation is performed on only the first 10 elements contained in the vector register.
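The combined effect of the vector length and the mask on an instruction such as VADD can be sketched at the element level in Python. The function name `vector_op` is hypothetical, and leaving elements at index VL and beyond unchanged is an assumption; the text states only that the operation is performed on the first VL elements.

```python
import operator


def vector_op(op, src_a, src_b, dest_old, mask, vl):
    """Apply `op` element-wise to the first `vl` elements.

    Elements whose mask bit is 1 keep their old destination value.
    Elements at index >= vl are left untouched (assumption).
    """
    out = list(dest_old)
    for i in range(vl):
        if not mask[i]:
            out[i] = op(src_a[i], src_b[i])
    return out


# Hypothetical data: VL = 3, element 1 masked.
result = vector_op(
    operator.add,
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
    dest_old=[0, 0, 0, 0, 0],
    mask=[0, 1, 0, 0, 0],
    vl=3,
)
```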

First, in the instruction group shown in FIG. 4, the LVL instruction on the first line is supplied from the instruction supply unit 1 and decoded by the decoder 2; the issue control unit 3 issues the LVL instruction, and the vector length VL is set to 4. Because 4 has been specified as VL by the LVL instruction, vector operations from this point on are performed on the first four elements contained in a vector register. FIG. 5 shows the correspondence table 41 in the resource management unit 4, the mask information, VL, and the values of renaming registers (physical vector registers) VRR100, VRR101, VRR102, and VRR108 at the end of this LVL instruction. As shown in FIG. 5, VL is set to 4. At this point, VRR100 is allocated as the physical vector register for logical vector register VR0. This means that the value of physical vector register VRR100 at this point is the value of logical register VR0.

Next, the VFMK instruction on the second line of the instruction group shown in FIG. 4 is supplied from the instruction supply unit 1 and decoded by the decoder 2. The issue control unit 3 issues the VFMK instruction, and "0101" is set in the vector mask. A VFMK instruction with the operand "0101" applies a mask such that no operation is performed on the elements whose mask bit is "1", in this case the second and fourth elements.

FIG. 6 shows the correspondence table 41 in the resource management unit 4, the mask information, VL, and the values of renaming registers VRR100, VRR101, VRR102, and VRR108 at the end of this VFMK instruction. As shown in FIG. 6, the mask information is set to "0101".

Next, the VADD instruction on the third line is supplied from the instruction supply unit 1 and decoded by the decoder 2. In the illustrated example, this VADD instruction performs a vector addition of the values stored in logical vector registers VR1 and VR2 and stores the result in logical vector register VR0.

The issue control unit 3 issues the VADD instruction. At that time, the resource management unit 4 assigns the physical vector register VRR108 to logical register VR0, into which the result of this VADD instruction is written, and rewrites the correspondence information in the resource management unit 4 to physical vector register VRR108. The VADD instruction is executed by FMAC 70, and the calculation result is sent to the data write control unit 9.

The operation of writing the result of executing this VADD instruction is performed by the data write control unit 9. FIG. 7 shows the control data from the data write control unit 9. The data write control unit 9 sends the element number, the register name of the physical vector register VRR to be written, the write data WD (the calculation result), and the mask information MASK to the vector renaming register VRR10. The appropriate write is then performed in the corresponding register within the vector renaming register VRR10, that is, in the cell sets belonging to physical vector register VRR108.

For example, for the operation result of element 0, the mask information is "0", so the calculation result "12" is written as write data WD to the cell on the physical vector register VRR108 side. In the cell set shown in FIG. 1, assuming that cell C0 holds the data of physical vector register VRR100 and cell C1 holds the data of physical vector register VRR108, the write enable signal WE1 becomes "1" while the MASK signal remains "0", and the write data WD, "12", is written to cell C1.

For the operation result of element 1, the mask information is "1", that is, active, so "1", the data on the physical vector register VRR100 side, is written to the cell on the opposite side, the physical vector register VRR108 side. In the cell set of FIG. 1, assuming that cell C0 holds the data of physical vector register VRR100 and cell C1 holds the data of physical vector register VRR108, the write enable signal WE1 becomes "1" and the MASK signal becomes "1", so that "1", the value stored in cell C0, is written to cell C1.
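How the per-element control signals might be derived from the destination register and the mask bit can be sketched as follows. The function `write_controls` is hypothetical, and the rule that registers VRR100 to VRR107 sit on the C0 side and VRR108 to VRR115 on the C1 side is an assumption that generalizes the VRR100/VRR108 example in the text.

```python
def write_controls(dst_phys, mask_bit):
    """Return (WE0, WE1, MASK) for writing one element to register VRR<dst_phys>.

    Assumption: VRR100..VRR107 are held in cell C0 of each cell set and
    VRR108..VRR115 in cell C1, so exactly one write enable is active.
    """
    if not 100 <= dst_phys <= 115:
        raise ValueError("physical register number out of range")
    we1 = 1 if dst_phys >= 108 else 0
    we0 = 1 - we1
    return (we0, we1, mask_bit)


# VADD writing to VRR108: element 0 is unmasked, element 1 is masked.
vadd_elem0 = write_controls(108, 0)   # WD is written to C1
vadd_elem1 = write_controls(108, 1)   # C1 receives a copy of C0
```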

FIG. 8 shows the correspondence table 41 in the resource management unit 4, the mask information, VL, and the values of renaming registers VRR100, VRR101, VRR102, and VRR108 at the end of execution of this VADD instruction. As shown in FIG. 8, the execution result of the VADD instruction is stored in the assigned physical vector register VRR108 in a form that reflects the mask information.

Finally, the VSUB instruction on the fourth line is supplied from the instruction supply unit 1 and decoded by the decoder 2. In the illustrated example, this VSUB instruction performs a vector subtraction of the contents of logical vector register VR1 from the contents of logical vector register VR2 and stores the result in logical vector register VR0.

The issue control unit 3 issues the VSUB instruction. At that time, the resource management unit 4 assigns the physical vector register VRR100 to logical register VR0, into which the result of this VSUB instruction is written, and rewrites the correspondence information in the resource management unit 4 to physical vector register VRR100. The VSUB instruction is executed by FMAC 70, and the calculation result is sent to the data write control unit 9.

As with the VADD instruction described above, the operation of writing the execution result of this VSUB instruction is performed by the data write control unit 9. FIG. 9 shows the control data from the data write control unit 9. The appropriate write is performed in the corresponding register, that is, in the cell sets belonging to physical vector register VRR100.

For example, for the operation result of element 0, the mask information is "0", so the calculation result "4" is written as write data WD to the cell on the physical vector register VRR100 side. In the cell set of FIG. 1, assuming that cell C0 holds the data of physical vector register VRR100 and cell C1 holds the data of physical vector register VRR108, the write enable signal WE0 becomes "1" while the MASK signal remains "0", and the write data WD, "4", is written to cell C0.

For the operation result of element 1, the mask information is "1", so "1", the data on the physical vector register VRR108 side, is written to the cell on the physical vector register VRR100 side. In the cell set of FIG. 1, assuming that cell C0 holds the data of physical vector register VRR100 and cell C1 holds the data of physical vector register VRR108, the write enable signal WE0 becomes "1" and the MASK signal becomes "1", so that "1", the value stored in cell C1, is written to cell C0.

FIG. 10 shows the correspondence table 41 in the resource management unit 4, the mask information, VL, and the values of renaming registers VRR100, VRR101, VRR102, and VRR108 at the end of this VSUB instruction. As shown in FIG. 10, the execution result of the VSUB instruction is stored in the assigned physical vector register VRR100 in a form that reflects the mask information.

Thus, according to the present embodiment, even when a mask operation is performed in the vector computer, it can be guaranteed that the correct result is also written to the renamed vector register.
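At the level of a whole vector, the effect of a masked write into a renamed register can be sketched as below. This is a simplified software analogy, not the circuit of the embodiment: the function name `masked_renamed_write`, the logical register name `VR0`, and all element values are hypothetical. Only the register names VRR100 and VRR108, the correspondence table, the mask information, and VL are taken from the description.

```python
def masked_renamed_write(phys, table, logical, new_phys, results, mask, vl):
    """Write `results` into a newly allocated physical register under a mask.

    phys:     dict mapping physical register name -> list of element values
    table:    dict mapping logical register name -> current physical register
    logical:  logical destination register of the instruction
    new_phys: physical register newly allocated for the destination
    results:  per-element results from the arithmetic unit
    mask:     per-element mask bits (0 = write result, 1 = keep old value)
    vl:       vector length (number of elements processed)
    """
    old_phys = table[logical]
    for i in range(vl):
        if mask[i]:
            # Masked element: copy the old value across, as the cell set
            # does when the MASK signal is "1".
            phys[new_phys][i] = phys[old_phys][i]
        else:
            # Unmasked element: store the computed result.
            phys[new_phys][i] = results[i]
    table[logical] = new_phys  # the logical register now maps to new_phys


phys = {"VRR100": [0, 0, 0, 0], "VRR108": [9, 1, 5, 3]}
table = {"VR0": "VRR108"}
masked_renamed_write(phys, table, "VR0", "VRR100",
                     results=[4, 8, 6, 2], mask=[0, 1, 0, 1], vl=4)
# phys["VRR100"] is now [4, 1, 6, 3]
```

Because the masked elements are carried over from the previous physical register, the newly mapped register holds a complete, correct vector even though only the unmasked elements were computed.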

The above description covers an example in which the instructions operate one at a time in a non-pipelined manner. Naturally, however, the instructions may operate in a pipelined manner, with parts of their operations overlapping one another.

FIG. 1 is a circuit diagram showing a cell set used in the vector renaming method according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a schematic configuration of a vector computer according to an embodiment of the present invention.
FIG. 3 is a diagram showing an example of the contents of the correspondence table provided in the resource management unit.
FIG. 4 is a diagram showing an example of an instruction group given to the vector computer.
FIG. 5 is a diagram showing the correspondence table, the mask information, VL, and the values of the renaming registers at the end of the LVL instruction.
FIG. 6 is a diagram showing the correspondence table, the mask information, VL, and the values of the renaming registers at the end of the VFMK instruction.
FIG. 7 is a diagram showing control data from the data write control unit.
FIG. 8 is a diagram showing the correspondence table, the mask information, VL, and the values of the renaming registers at the end of the VADD instruction.
FIG. 9 is a diagram showing control data from the data write control unit.
FIG. 10 is a diagram showing the correspondence table, the mask information, VL, and the values of the renaming registers at the end of the VSUB instruction.

Explanation of symbols

1 Instruction supply unit
2 Decoder
3 Issue control unit
4 Resource management unit
5 LS unit
6 External storage memory
9 Data write control unit
31 Control signal
41 Correspondence table
42 Selection signal
80, S0, S1 Selector
70, 73 FMAC (arithmetic unit)
71, 74 FDV (divider)
72, 75 FLOGIC (logic operation unit)
C0, C1 Cell
VR Logical register
VRR Physical vector register

Claims

A vector renaming method that enables register renaming while realizing a mask operation function in a vector computer, wherein
a plurality of renaming registers that form a set with one another are provided for a corresponding logical vector register,
each of the renaming registers includes cells that each hold 1-bit data, and a selector provided for each cell,
the selector supplies either write data or a value held by another cell in the set to the corresponding cell according to a mask signal, and
a cell value is copied between the cells when a mask is set, and the write data is written to one of the cells when the mask is not set.

The vector renaming method according to claim 1, wherein two physical vector registers are provided as the plurality of renaming registers for each logical vector register.

A vector computer comprising:
control means for controlling register renaming;
a plurality of renaming registers that form a set with one another for a corresponding logical vector register;
an arithmetic unit;
a first selector that is controlled by the control means and supplies an output of one of the plurality of renaming registers to the arithmetic unit; and
write control means for receiving a calculation result of the arithmetic unit and storing it in one of the plurality of renaming registers,
wherein
each of the renaming registers includes cells that each hold 1-bit data, and a second selector provided for each cell,
the second selector supplies either write data from the write control means or a value held by another cell in the set to the corresponding cell according to a mask signal, and
the vector computer copies a cell value between the cells when a mask is set, and writes the write data to one of the cells when the mask is not set.