JP2014225201A

JP2014225201A - Semiconductor device design apparatus, semiconductor device design method, and semiconductor device design program

Info

Publication number: JP2014225201A
Application number: JP2013105141A
Authority: JP
Inventors: Kanenaga Seto (瀬戸 謙修); Hiroaki Takehana (竹鼻 宏晃); Hidenori Matsuzaki (松崎 秀則); Hideki Kazama (風間 英樹)
Original assignee: Semiconductor Technology Academic Research Center
Current assignee: Semiconductor Technology Academic Research Center
Priority date: 2013-05-17
Filing date: 2013-05-17
Publication date: 2014-12-04

Abstract

PROBLEM TO BE SOLVED: To provide a semiconductor device design apparatus, a semiconductor device design method, and a semiconductor device design program that avoid adding redundant registers to a behavioral description, avoid increasing the code size of the behavioral description, avoid increasing the area of the generated hardware, and enable high-speed operation. SOLUTION: An analysis unit analyzes, from a behavioral description, the range that the subscripts of a first array access can take. A first processing unit changes the range of a first loop to the range of a second loop so that it covers the range that the subscripts of the first array access can take. A second processing unit adds, within the range of the second loop, a second array access that covers the entire subscript range of the first array access. A third processing unit examines the data reuse relationship between the first and second array accesses, calculates the reuse distance between them, adds a shift register description to the second loop, and replaces the first array access with an access to the register whose number equals the reuse distance.

Description

The present invention relates to a design apparatus, a design method, and a design program for a semiconductor device that optimize a behavior-level description (hereinafter simply called a behavioral description) describing the operation of a digital circuit.

High-level synthesis (Non-Patent Document 1) can automatically generate an RTL (Register Transfer Level) description from a behavioral description written as a program in a language such as C (hereinafter also called a C description), and can therefore greatly improve VLSI (Very Large Scale Integration) design efficiency. However, generating a high-performance RTL description usually requires optimizing memory accesses in the C description. At present, users of high-level synthesis tools must perform this optimization on the C description by hand, which increases design time.

One memory access optimization technique is called scalar replacement (Non-Patent Documents 2, 3, and 4). Arrays and temporary variables in a C description become memories and registers, respectively, in hardware. Scalar replacement replaces multiple array accesses in the C description with accesses to temporary variables, that is, it replaces memory accesses with register accesses; for example, providing a shift register reduces the number of array accesses. Applying it to the C description before high-level synthesis reduces the memory accesses of the automatically generated hardware, and a performance improvement can be expected.
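As a minimal sketch of the idea, assuming a one-dimensional loop that is not taken from the figures of this publication, the two functions below compute the same result. The function and variable names (`sum_pairs_naive`, `sum_pairs_scalar_replaced`, `b_reg_0`, `b_reg_1`) are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Baseline: two reads of b[] per iteration (b[i] and b[i - 1]). */
void sum_pairs_naive(const int *b, int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        a[i] = b[i] + b[i - 1];
    }
}

/* Scalar-replaced: b[i - 1] is reused from a temporary variable
 * (a register in hardware), so one read of b[] per iteration remains. */
void sum_pairs_scalar_replaced(const int *b, int *a, size_t n)
{
    if (n < 2) {
        return;                   /* nothing to compute */
    }
    int b_reg_1 = b[0];           /* initial value set before the loop */
    for (size_t i = 1; i < n; i++) {
        int b_reg_0 = b[i];       /* the only memory read in the loop */
        a[i] = b_reg_0 + b_reg_1;
        b_reg_1 = b_reg_0;        /* shift: this value is reused next time */
    }
}
```

The statement that loads `b[0]` before the loop is the kind of initialization that the following paragraphs are concerned with.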

However, existing scalar replacement sometimes requires initial values to be set in the shift register prepared in advance. In that case, the code size grows, the hardware area increases, and the operating speed drops.

JP 2012-118835 A

Non-Patent Document 1: Daniel D. Gajski, “High Level Synthesis: An Introduction to Chip and System Design,” Kluwer Academic Publishers, 1992.
Non-Patent Document 2: Steve Carr, Ken Kennedy, “Scalar Replacement in the Presence of Conditional Control Flow,” Software-Practice and Experience, Vol. 24, Issue 1, pp. 51-77, 1994.
Non-Patent Document 3: Byoungro So, “An Efficient, Design Space Exploration For Balance Between Computation And Memory,” Ph.D. University of Southern California, 2003.
Non-Patent Document 4: Nastaran Baradaran, Pedro C. Diniz, and Joonseok Park, “Extending the Applicability of Scalar Replacement to Multiple Induction Variables,” pp. 455-469, Languages and Compilers for High Performance Computing, 2004.

The present invention provides a design apparatus, a design method, and a design program for a semiconductor device that, when a shift register is initialized, prevent an increase in hardware area and enable high-speed operation.

One aspect of the semiconductor device design apparatus of the present invention comprises: a storage unit that stores a behavioral description describing the operation of a digital circuit and including a plurality of first array accesses and a first loop; an analysis unit that analyzes, from the behavioral description stored in the storage unit, the range that the subscripts of each first array access can take; a first processing unit that changes the range of the first loop to the range of a second loop so that there exists a second array access covering the ranges, analyzed by the analysis unit, that the subscripts of the first array accesses can take; a second processing unit that adds, within the range of the second loop, the second array access covering the entire subscript range of the first array accesses; and a third processing unit that examines the data reuse relationships between the first and second array accesses, calculates the reuse distance between the array accesses, adds to the second loop a description of a shift register whose size equals the maximum reuse distance, and replaces each first array access with an access to the register whose number equals its reuse distance.

FIG. 1 is a diagram showing an example of a behavioral description.
FIG. 2 is a diagram showing the intermediate state when scalar replacement, initialization, and loop peeling by an existing method are applied to the behavioral description shown in FIG. 1.
FIG. 3 is a diagram showing the result when scalar replacement, initialization, and loop peeling by an existing method are applied to the behavioral description shown in FIG. 1.
FIG. 4 is a block diagram showing an example of a behavioral synthesis apparatus to which this embodiment is applied.
FIG. 5 is a block diagram showing the functions of the control unit and the storage device shown in FIG. 4.
FIG. 6 is a flowchart showing the operation of this embodiment.
FIG. 7 is a diagram showing an example of a behavioral description to which this embodiment is applied.
FIG. 8 is a diagram showing the range that the subscripts of each array access in the behavioral description shown in FIG. 7 can take.
FIG. 9 is a diagram showing the processing result of the first processing unit according to this embodiment.
FIG. 10 is a diagram showing the processing result of the second processing unit according to this embodiment.
FIGS. 11(a) and 11(b) are diagrams showing the reuse relationships between array accesses in this embodiment.
FIG. 12 is a diagram summarizing the reuse relationships shown in FIG. 11(b).
FIG. 13 is a diagram showing the processing result of the third processing unit according to this embodiment, as an example of an optimized behavioral description.
FIG. 14 is a diagram showing the behavioral description obtained when existing array access removal is applied and initialization array accesses are added to the behavioral description shown in FIG. 7.
FIG. 15 is a diagram showing the behavioral description obtained when one iteration of the outer loop (loop i) of the behavioral description shown in FIG. 14 is peeled by existing loop peeling.
FIG. 16 is a diagram showing the behavioral description obtained when one iteration of the inner loop (loop j) of the behavioral description shown in FIG. 15 is peeled by existing loop peeling.

As described above, scalar replacement sometimes requires initial values to be set in a shift register prepared in advance. For example, when an existing scalar replacement technique and an existing initialization technique are applied to the behavioral description shown in FIG. 1, the final result is as shown in FIG. 3. FIG. 2 shows the intermediate code on the way from FIG. 1 to FIG. 3. In FIGS. 2 and 3, "shift_registers(B_reg_5, ..., B_reg_0)" is shorthand for the shift register description needed when the array accesses of FIG. 1 are replaced with shift register accesses. In FIG. 2, lines 3 and 5 are statements that initialize the shift register.

The array access B[i][j] on line 3 of FIG. 1 is a reuse generator (reuse source); that is, it supplies the data that is reused by the array access B[i-1][j-1]. The statement on line 3 of FIG. 2 is therefore needed to copy the data accessed by B[i][j] into the shift register. With this initialization alone, however, only register B_reg_0 is assigned a value, and register B_reg_5 on line 6 of FIG. 2 holds no value. Therefore, until the shift register has filled with values, an appropriate value must be set in register B_reg_5, as on line 5 of FIG. 2.

In the existing method, after the scalar replacement and initialization shown in FIG. 2 are applied, loop peeling is applied to split the loop, which removes the array accesses introduced by initialization from the main loop. Loop peeling executes some of the first or last iterations of a loop separately from the main loop. As a result, a statement such as line 5 of FIG. 2, which repeatedly performs an array access inside the main loop, is moved outside the loop; this shortens the initiation interval of the main loop and improves performance.
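The effect of loop peeling can be sketched on a simple one-dimensional loop. This is an illustrative assumption, not the code of FIG. 2 or FIG. 3, and the names `before_peeling` and `after_peeling` are hypothetical. Before peeling, the initialization access sits inside the loop behind a conditional; after peeling, the first iteration is written out separately and the main loop reads `b[]` once per iteration.

```c
#include <stddef.h>

/* Before peeling: the register is initialized inside the loop, so the
 * first-iteration test and the extra read of b[] stay in the loop body. */
void before_peeling(const int *b, int *a, size_t n)
{
    int b_reg_1 = 0;
    for (size_t i = 1; i < n; i++) {
        if (i == 1) {
            b_reg_1 = b[i - 1];   /* initialization access */
        }
        int b_reg_0 = b[i];
        a[i] = b_reg_0 + b_reg_1;
        b_reg_1 = b_reg_0;
    }
}

/* After peeling the first iteration: the initialization access is outside
 * the main loop, at the cost of duplicating the loop body. */
void after_peeling(const int *b, int *a, size_t n)
{
    if (n < 2) {
        return;
    }
    /* Peeled first iteration (i == 1). */
    int b_reg_1 = b[0];
    int b_reg_0 = b[1];
    a[1] = b_reg_0 + b_reg_1;
    b_reg_1 = b_reg_0;

    /* Main loop: one read of b[] per iteration, no conditional. */
    for (size_t i = 2; i < n; i++) {
        b_reg_0 = b[i];
        a[i] = b_reg_0 + b_reg_1;
        b_reg_1 = b_reg_0;
    }
}
```

The duplicated body in `after_peeling` is a small-scale picture of the code growth that loop peeling causes.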

FIG. 3 shows the result of applying loop peeling to the behavioral description shown in FIG. 2. With loop peeling, the innermost loop on lines 12 to 16 of FIG. 3 accesses array B only once per iteration. However, loop peeling increases the code size, which increases the area when the description is turned into hardware by high-level synthesis.

This embodiment therefore describes shift register initialization that does not require loop peeling.

(Embodiment)
Embodiments are described in detail below with reference to the drawings.

This embodiment describes, as an example, optimization of integrated circuit design through optimization of a behavioral description. Behavioral description optimization is a subordinate concept of high-level synthesis. Like high-level synthesis, it is used when generating an RTL description from a behavioral description written in a programming language such as C, and it is widely applied, for example, to the design of image processing and audio signal processing circuits. However, this embodiment is not limited to behavioral description optimization and can also be applied to high-level synthesis.

(Device configuration)
FIG. 4 shows an example of a behavioral description optimization apparatus to which this embodiment is applied. The behavioral description optimization apparatus 10 is implemented, for example, by a computer comprising an input device 12, an output device 13, a storage device 14, and a memory 15, each connected to a control unit 11.

The control unit 11 is, for example, a CPU (Central Processing Unit) and executes predetermined operations according to a program stored in the memory 15.

The input device 12 is, for example, a keyboard used to enter the information needed for the behavioral description optimization process, such as a behavioral description of the semiconductor integrated circuit's operation and constraints on the semiconductor integrated circuit.

The output device 13 outputs the generated behavioral description and the like, and is, for example, a printer.

The storage device 14 is a computer-readable storage medium such as a hard disk drive or flash memory. It stores, for example, the programs needed for the operation of the control unit 11, the programs needed for the behavioral description optimization process, and behavioral descriptions written in, for example, C.

The memory 15 is, for example, a RAM (Random Access Memory), into which the programs needed for the operation of the control unit 11 are loaded after being read from the storage device 14.

FIG. 5 shows the functions implemented by the control unit 11 and the storage device 14 shown in FIG. 4.

The control unit 11 includes an analysis processing unit 11a, a first processing unit 11b, a second processing unit 11c, and a third processing unit 11d.

The storage device 14 need not be connected directly to the control unit 11; it may be connected via a network.

The storage device 14 also stores the behavioral description 14a, information generated in the course of the control unit 11's operation such as the array access range information 14b, and the optimized behavioral description 14c.

(Operation)
FIG. 6 shows the operation of the control unit 11. The functions shown in FIG. 5 are described below with reference to FIG. 6.

First, the analysis processing unit 11a of the control unit 11 reads the behavioral description 14a from the storage device 14 and analyzes the ranges taken by the subscripts of the array accesses in the behavioral description 14a (S11).

FIG. 7 shows an example of the behavioral description stored in the storage device 14. An array access is an actual reference to an array that appears inside a "for" loop. In the behavioral description of FIG. 7, array A has one array access, A[i][j], and array B has three array accesses, B[i][j+1], B[i][j], and B[i-1][j].

The subscripts of an array access are, for an array access B[x][y], "x" and "y"; for B[i-1][j] in FIG. 7, they are "i-1" and "j". The range taken by a subscript is the range over which the subscript varies within the given loop. Analyzing the range taken by the subscripts of an array access therefore means determining the range over which those subscripts move as the variables that control loop execution, that is, the loop variables "i" and "j", move over the range specified by the loop (1 <= i <= 32 and 1 <= j <= 16 in FIG. 7). For the three array accesses B[i][j+1], B[i][j], and B[i-1][j], the subscript ranges are as follows.

B[i][j+1]: 1 <= i <= 32, 2 <= j+1 <= 17
B[i][j]: 1 <= i <= 32, 1 <= j <= 16
B[i-1][j]: 0 <= i-1 <= 31, 1 <= j <= 16

The broken lines in FIG. 8 indicate the subscript ranges of the array accesses B[i][j+1], B[i][j], and B[i-1][j].
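The range analysis of step S11 can be sketched as follows for subscripts of the form "loop variable plus constant", which is the only form that appears in this example. The `Range` type and the function name `subscript_range` are hypothetical names introduced for this sketch; the publication does not specify an implementation.

```c
/* Closed integer interval [lo, hi]. */
typedef struct {
    int lo;
    int hi;
} Range;

/* Range taken by the subscript (var + offset) while the loop variable
 * var moves over loop_range. Handles only "variable plus constant". */
Range subscript_range(Range loop_range, int offset)
{
    Range r;
    r.lo = loop_range.lo + offset;
    r.hi = loop_range.hi + offset;
    return r;
}
```

With the loop ranges 1 <= i <= 32 and 1 <= j <= 16, calling `subscript_range` with offset 1 on the j range gives [2, 17], and with offset -1 on the i range gives [0, 31], matching the ranges listed above.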

Next, the analysis processing unit 11a analyzes the range taken by the subscripts of each array, based on the subscript ranges of the array accesses analyzed above (S12).

The range taken by the subscripts of an array is the union, computed per array, of the ranges taken by the subscripts of each of its array accesses. For example, in the behavioral description of FIG. 7, the range taken by the subscripts [X][Y] of array B is as follows.

{1 <= X <= 32} ∪ {1 <= X <= 32} ∪ {0 <= X <= 31} = {0 <= X <= 32},
{2 <= Y <= 17} ∪ {1 <= Y <= 16} ∪ {1 <= Y <= 16} = {1 <= Y <= 17}

The solid-line frame in FIG. 8 indicates the range that the subscripts [X][Y] of the two-dimensional array B can take.
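Step S12 can be sketched as taking the smallest interval that covers all per-access ranges. For the overlapping ranges of this example that interval equals the set union; treating the union as a single interval in general is an assumption of this sketch. The names `Range`, `range_hull`, and `range_hull_all` are hypothetical.

```c
/* Closed integer interval [lo, hi]. */
typedef struct {
    int lo;
    int hi;
} Range;

/* Smallest interval covering both a and b. */
Range range_hull(Range a, Range b)
{
    Range r;
    r.lo = (a.lo < b.lo) ? a.lo : b.lo;
    r.hi = (a.hi > b.hi) ? a.hi : b.hi;
    return r;
}

/* Smallest interval covering all n ranges; n must be at least 1. */
Range range_hull_all(const Range *ranges, int n)
{
    Range r = ranges[0];
    for (int k = 1; k < n; k++) {
        r = range_hull(r, ranges[k]);
    }
    return r;
}
```

Applied to the three X ranges [1, 32], [1, 32], [0, 31] this yields [0, 32], and applied to the three Y ranges [2, 17], [1, 16], [1, 16] it yields [1, 17].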

The ranges taken by the array access subscripts and by the array subscripts, analyzed as described above, are stored in the storage device 14 as the array access range information 14b.

Next, based on the analysis result, the first processing unit 11b changes the loop range (S13).

That is, the initial and final values of the loop variables are changed based on the range taken by the array subscripts. In the behavioral description of FIG. 7, the array access that supplies reuse data to the other array accesses, that is, the generator, is B[i][j+1], whose subscripts are "i" and "j+1". It suffices for the range taken by these subscripts to cover the range taken by the array subscripts, so the subscripts must satisfy 0 <= i <= 32 and 1 <= j+1 <= 17. The changed loop variable range is therefore 0 <= i <= 32 and 0 <= j <= 16. Under this range, an array access B[i][j+1] with the same subscripts as the reuse generator B[i][j+1] takes a subscript range that covers the range that the subscripts of every array access can take. To preserve the behavior of the original behavioral description shown in FIG. 7, the loop range is changed as follows.

for (i = 0; i < 33; i++) {
  for (j = 0; j < 17; j++) {
    if ((i >= 1) && (j >= 1)) {

Put differently, the loop range to be changed is determined as follows.

In a behavioral description that requires initialization, such as that of FIG. 7, the range taken by the array subscripts must be contained in the range taken by the subscripts of the generator array access.

For example, in the behavioral description of FIG. 7, the range taken by the generator array access B[i][j+1] is, as described above, 1 <= i <= 32 and 2 <= j+1 <= 17. The range taken by the subscripts of array B, B[X][Y], is {0 <= X <= 32} and {1 <= Y <= 17}. For an array access with the same subscripts as the reuse generator B[i][j+1], the loop variable range needed to make its subscript range coincide with the array subscript range is computed. In FIG. 7 the reuse generator's subscripts are "i" and "j+1", and these must satisfy 0 <= i <= 32 and 1 <= j+1 <= 17, so the loop variable range is 0 <= i <= 32 and 0 <= j <= 16.
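The computation of the new loop range in step S13 amounts to subtracting the generator's constant offset from the array subscript range. The sketch below assumes subscripts of the form "loop variable plus constant"; the names `Range` and `loop_range_for_generator` are hypothetical.

```c
/* Closed integer interval [lo, hi]. */
typedef struct {
    int lo;
    int hi;
} Range;

/* Loop-variable range for which the generator subscript
 * (var + generator_offset) sweeps exactly array_range. */
Range loop_range_for_generator(Range array_range, int generator_offset)
{
    Range r;
    r.lo = array_range.lo - generator_offset;
    r.hi = array_range.hi - generator_offset;
    return r;
}
```

For the generator B[i][j+1], the offset is 0 in the first dimension and 1 in the second, so the array ranges [0, 32] and [1, 17] give loop ranges [0, 32] for i and [0, 16] for j. The original statement must then be guarded by the original lower bounds, here i >= 1 and j >= 1, so that the extra iterations only load data.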

FIG. 9 shows the behavioral description after the loop range is changed. On line 1 of FIG. 9, "i=1" is changed to "i=0"; on line 2, "j=1" is changed to "j=0"; and a conditional statement is inserted on line 3.

The second processing unit 11c then adds an array access (S14). Specifically, as shown on line 3 of FIG. 10, the array access "Breg0 = B[i][j+1]", which covers the entire range of the array subscripts, is added at the head of the innermost loop.

Next, the third processing unit 11d removes array accesses.

Specifically, the reuse dependencies among the array accesses are analyzed (S15).

The array accesses B[i][j+1], B[i][j], and B[i-1][j] are in a dependency relationship, because the data of the array access B[i][j+1] is reused by B[i][j+1], B[i][j], and B[i-1][j].

For this behavioral description, the reuse relationships between array accesses are analyzed and the reuse distances are calculated.

FIG. 11 shows the reuse graph corresponding to the behavioral description of FIG. 10. A reuse graph has one node per array access, each being either a generator, that is, a source of reuse data (reuse source), or a destination of reuse data (hereinafter also called a reuse destination), and connects each source node to its destination node with an arrow called an edge. Each edge is annotated with a reuse vector.

In the reuse graphs of FIGS. 11(a) and 11(b), the four circles represent the array accesses B[i][j+1], B[i][j+1], B[i][j], and B[i-1][j], and the edges represent the reuse relationships. The tail of an edge is the generator (reuse source), and the head of the edge is the reuse destination.

In FIGS. 11(a) and 11(b), the array access B[i][j+1] drawn with a double circle has only outgoing edges and no incoming edges. The double-circled array access B[i][j+1] is therefore the generator.

As shown in FIG. 11(a), each edge is annotated with a reuse vector <dy, dx>. The reuse vector <dy, dx> consists of the differences between the subscripts of the two array accesses connected by the edge. For example, the reuse vector between the array accesses B[i][j+1] and B[i][j+1] is <0, 0>, the reuse vector between B[i][j+1] and B[i][j] is <0, 1>, and the reuse vector between B[i][j+1] and B[i-1][j] is <1, 1>.

Next, based on the dependencies between the array accesses, the third processing unit 11d calculates the reuse distance d (S16).

The reuse vector <dy, dx> shown in FIG. 11(a) expresses how many loop iterations after the generator each array access reuses the data. For a doubly nested loop, a reuse vector <dy, dx> indicates that reuse from the reuse source to the reuse destination occurs after the outer loop has advanced dy iterations and the inner loop has advanced dx iterations.

The reuse distance d shown in FIG. 11(b) expresses, converted into a number of innermost-loop iterations, how many iterations after the generator each array access reuses the data. It is calculated from the reuse vectors of FIG. 11(a): for a reuse vector <dy, dx> and an inner-loop iteration count j (here j denotes the iteration count, not the loop variable), d = dy × j + dx. From FIG. 10, the inner loop runs 17 times. For the reuse vector <0, 0>, the reuse distance is d1 = 0 × 17 + 0 = 0. For <0, 1>, the reuse distance is d3 = 0 × 17 + 1 = 1. For <1, 1>, the reuse distance is d4 = 1 × 17 + 1 = 18.
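The reuse vector and reuse distance calculations of steps S15 and S16 can be sketched as below for a doubly nested loop. The types `Access` and `ReuseVector` and the function names are hypothetical; each access is represented only by its constant offsets from the loop variables, which is an assumption that fits this example.

```c
/* Constant offsets of an access B[i + di][j + dj]. */
typedef struct {
    int di;
    int dj;
} Access;

/* Reuse vector <dy, dx>. */
typedef struct {
    int dy;
    int dx;
} ReuseVector;

/* Subscript differences between the generator and a reuse destination. */
ReuseVector reuse_vector(Access generator, Access destination)
{
    ReuseVector v;
    v.dy = generator.di - destination.di;
    v.dx = generator.dj - destination.dj;
    return v;
}

/* Reuse distance in innermost-loop iterations: d = dy * count + dx,
 * where count is the number of iterations of the inner loop. */
int reuse_distance(ReuseVector v, int inner_iteration_count)
{
    return v.dy * inner_iteration_count + v.dx;
}
```

With the generator B[i][j+1] (offsets 0, 1) and an inner iteration count of 17, the destinations B[i][j+1], B[i][j], and B[i-1][j] give distances 0, 1, and 18, the values d1, d3, and d4 above.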

FIG. 12 is a table summarizing the information needed to remove array accesses, listing the access type, the array access, and the reuse distance. The table lists the arrows of the reuse graph for the array of interest (array B) from the top in ascending order of reuse distance.

The third processing unit 11d then adds the shift register description at the end of the innermost loop and replaces the array accesses with temporary variable accesses (S17).

The size of the shift register, that is, the number of registers that make it up, is set to the maximum reuse distance. In this example the maximum reuse distance is d4 = 18, so the shift register consists of 18 registers.

In addition, each reuse-destination array access is replaced with a temporary variable access so that it refers to the shift register element whose number is the corresponding reuse distance, which optimizes the behavioral description.

FIG. 13 shows an example of the optimized behavioral description with the array accesses removed. On line 5 of FIG. 13, each reuse-destination array access is replaced with a temporary variable access (register access) numbered by its reuse distance, and the shift register description is added on lines 7 to 24. This description indicates that on every loop iteration the data in Breg(i) is shifted to Breg(i+1), where i here is the register number.
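The overall transformation can be sketched as follows. The figures are not reproduced in this text, so the loop body is an assumption: the sketch supposes that FIG. 7 computes A[i][j] as the sum of B[i][j+1], B[i][j], and B[i-1][j]. The function names, the array dimensions, and the use of an array `breg` with a shift loop in place of individually named registers are likewise choices made for this sketch.

```c
#define ROWS  33   /* first subscript X in 0..32 */
#define COLS  18   /* second subscript Y in 0..17 */
#define DEPTH 18   /* maximum reuse distance */

/* Assumed original nest: three reads of B per iteration. */
void original_nest(int B[ROWS][COLS], int A[ROWS][COLS])
{
    for (int i = 1; i < 33; i++) {
        for (int j = 1; j < 17; j++) {
            A[i][j] = B[i][j + 1] + B[i][j] + B[i - 1][j];
        }
    }
}

/* Transformed nest: extended loop range, one read of B per iteration,
 * and no loop peeling. breg[k] holds the value loaded k iterations ago. */
void optimized_nest(int B[ROWS][COLS], int A[ROWS][COLS])
{
    int breg[DEPTH + 1] = {0};
    for (int i = 0; i < 33; i++) {
        for (int j = 0; j < 17; j++) {
            breg[0] = B[i][j + 1];              /* added generator access */
            if ((i >= 1) && (j >= 1)) {
                /* reuse distances 0, 1, and 18 */
                A[i][j] = breg[0] + breg[1] + breg[18];
            }
            for (int k = DEPTH; k >= 1; k--) {  /* shift by one position */
                breg[k] = breg[k - 1];
            }
        }
    }
}
```

Under these assumptions both functions write the same values to A for 1 <= i <= 32 and 1 <= j <= 16. The iterations with i = 0 or j = 0 only load data into the shift register, which takes the place of the separate initialization code and peeled loops of the existing method.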

As shown in FIG. 5, the optimized behavioral description 14c is stored in the storage device 14.

(Effects of the Embodiment)
According to the above embodiment, the range that the subscript of each array access in the behavioral description can take, and a range that encompasses those ranges, are analyzed. Based on the analysis result, the loop range is changed by extending it, and an array access is added based on the changed loop range. The reuse relationships among the array accesses, including the added one, are then analyzed, and the reuse distances are calculated from those relationships. A shift register whose size corresponds to the maximum reuse distance is added to the behavioral description, and each array access is replaced with an access to the register numbered with its reuse distance, so that the access refers to the shift register. Compared with existing loop peeling, this reduces the code size of the behavioral description and prevents redundant shift registers from being added. It is therefore possible to prevent an increase in the area of the generated hardware.
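The analysis steps summarized above can be sketched for the simplest case: a one-dimensional loop in which every access has the form `B[i + k]` for a constant offset `k`. The function name, its parameters, and the returned dictionary keys are hypothetical, and the sketch assumes that the individual subscript ranges overlap, so that the enclosing range is a single interval.

```python
def plan_scalar_replacement(offsets, lo, hi):
    """Plan for accesses B[i + k], k in offsets, in a loop i = lo..hi (inclusive)."""
    lead = max(offsets)  # in an ascending loop, this access reaches each element first
    # Step 1: the range each subscript can take.
    ranges = {k: (lo + k, hi + k) for k in offsets}
    # Step 2: the range that encloses all of those ranges.
    enclosing = (min(r[0] for r in ranges.values()),
                 max(r[1] for r in ranges.values()))
    # Step 3: widen the loop so the leading access alone sweeps the enclosing range.
    loop_range = (enclosing[0] - lead, enclosing[1] - lead)
    # Step 4: reuse distance = iterations between the leading access and each reuse.
    distances = {k: lead - k for k in offsets}
    return {
        "enclosing_range": enclosing,
        "loop_range": loop_range,
        "reuse_distance": distances,
        "shift_register_size": max(distances.values()),
    }
```

For the accesses `B[i]`, `B[i - 1]`, and `B[i - 2]` in a loop over `i = 2..7`, the call `plan_scalar_replacement([0, -1, -2], 2, 7)` reports an enclosing range of `(0, 7)`, a widened loop range of `(0, 7)`, and a shift register of size 2.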

Specifically, the existing initialization technique based on loop peeling splits the loop and therefore produces multiple loops, whereas this embodiment performs scalar replacement within a single loop, so the code can be shortened.

FIGS. 14 to 16 show the behavioral descriptions obtained by applying existing scalar replacement and loop peeling to the behavioral description shown in FIG. 7.

FIG. 14 shows the code after array accesses have been deleted from, and initialization array accesses added to, the behavioral description shown in FIG. 7.

FIG. 15 shows the code obtained by peeling one iteration of the outer loop (loop i) from the behavioral description shown in FIG. 14.

FIG. 16 shows the code obtained by peeling one iteration of the inner loop (loop j), which appears on line 27 of FIG. 15, from the behavioral description shown in FIG. 15. As shown on lines 50 to 68, loop peeling removes all initialization array accesses from the main loop, which is enclosed by the broken line.

Consider the code size obtained from the original behavioral description of FIG. 7 under the two approaches. The existing method, which combines scalar replacement with loop peeling, requires 70 lines of code, as shown in FIG. 16. This embodiment, which applies scalar replacement without loop peeling, requires 25 lines, as shown in FIG. 13. The embodiment therefore reduces the code size by 63% compared with existing scalar replacement and loop peeling. Consequently, the area of the generated hardware can be reduced significantly compared with the existing method.

Moreover, because the hardware area is reduced and redundant registers are not added, high-speed operation becomes possible.

Note that creating the reuse graphs shown in FIGS. 11(a) and 11(b) and the reuse occurrence graph shown in FIG. 12 is optional. The array accesses may instead be replaced with the shift register directly on the basis of the calculated reuse distances.

In addition, the present invention is not limited to the above embodiments as they are. At the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the constituent elements disclosed in the above embodiments. For example, some constituent elements may be removed from the full set shown in an embodiment, and constituent elements of different embodiments may be combined as appropriate.

Description of symbols: 10: behavioral description optimization apparatus; 11: control unit; 14: storage device; 11a: analysis processing unit; 11b: first processing unit; 11c: second processing unit; 11d: third processing unit; 14a: behavioral description; 14b: array access range information; 14c: optimized behavioral description.

Claims

1. A semiconductor device design apparatus comprising: a storage unit that stores a behavioral description describing the operation of a digital circuit including a plurality of first array accesses and a first loop;
an analysis unit that analyzes, from the behavioral description stored in the storage unit, the range that the subscript of each first array access can take;
a first processing unit that changes the range of the first loop to the range of a second loop such that there exists a second array access that encompasses the ranges, analyzed by the analysis unit, that the subscripts of the first array accesses can take;
a second processing unit that adds, within the range of the second loop, the second array access that includes all the subscript ranges of the first array accesses; and
a third processing unit that examines the data reuse relationships between the first and second array accesses, calculates the reuse distance between the array accesses, adds to the second loop a description of a shift register whose size equals the maximum reuse distance, and replaces each first array access with an access to the register whose number equals its reuse distance.

2. The semiconductor device design apparatus according to claim 1, wherein the first processing unit calculates the union of the ranges that the subscripts of the plurality of array accesses analyzed by the analysis unit can take.

3. The semiconductor device design apparatus according to claim 1, wherein the second processing unit adds an array access that includes the entire subscript range of the array to the head of the innermost loop of the second loop.

4. The semiconductor device design apparatus according to claim 1, wherein the third processing unit adds the shift register to the end of the innermost loop of the second loop.

5. A semiconductor device design method comprising: analyzing, from a behavioral description that is stored in a storage unit and describes the operation of a digital circuit including a plurality of first array accesses and a first loop, the range that the subscript of each first array access can take;
changing the range of the first loop to the range of a second loop such that there exists a second array access that encompasses the analyzed ranges that the subscripts of the first array accesses can take;
adding, within the range of the second loop, the second array access that includes all the subscript ranges of the first array accesses; and
examining the data reuse relationships between the first and second array accesses, calculating the reuse distance between the array accesses, adding to the second loop a description of a shift register whose size equals the maximum reuse distance, and replacing each first array access with an access to the register whose number equals its reuse distance.

6. A semiconductor device design program for causing a computer to execute a process comprising: analyzing, from a behavioral description that is stored in a storage unit and describes the operation of a digital circuit including a plurality of first array accesses and a first loop, the range that the subscript of each first array access can take;
changing the range of the first loop to the range of a second loop such that there exists a second array access that encompasses the analyzed ranges that the subscripts of the first array accesses can take;
adding, within the range of the second loop, the second array access that includes all the subscript ranges of the first array accesses; and
examining the data reuse relationships between the first and second array accesses, calculating the reuse distance between the array accesses, adding to the second loop a description of a shift register whose size equals the maximum reuse distance, and replacing each first array access with an access to the register whose number equals its reuse distance.