JP7225904B2

JP7225904B2 - Vector operation processing device, array variable initialization method by vector operation processing device, and array variable initialization program by vector operation processing device

Info

Publication number: JP7225904B2
Application number: JP2019033557A
Authority: JP
Inventors: 壮也藤本
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2019-02-27
Filing date: 2019-02-27
Publication date: 2023-02-21
Anticipated expiration: 2039-02-27
Also published as: JP2020140284A

Description

The present invention relates to a technique for initializing array variables using an information processing device capable of executing vector operations.

In recent years, as software has become more sophisticated across various technical fields, its scale has also grown rapidly. Because the array variables (array data) used in such software have grown correspondingly, techniques for initializing and setting data in large array variables efficiently and quickly are in demand.

As related art, Patent Document 1 discloses an array data distribution/collection processing device that, when a plurality of processors each process the data allocated to them, divides all or part of the data input from an input device, or of the array data stored in a storage device, and allocates it to each storage space or each processor. This device creates a transfer table holding information about the range of each dimension for each divided block, where the blocks are obtained by dividing the data along one or more arbitrary dimensions and are the units allocated to each storage space or each processor. Based on the created transfer table, the device transfers data to each storage space or each processor block by block. The device thereby distributes data automatically according to an arbitrary transfer pattern.

Patent Document 2 discloses a constant assignment method that can initialize arrays at high speed without consuming large amounts of memory. In this method, a physical memory block storing "0" values, used to pseudo-clear the array space of an array to zero, is prepared in physical memory in advance, together with a table of physical memory pages that points to that block. Even when a user program is started and an array appears, no physical memory block (physical memory page) is specifically reserved for its array space. Instead, in the initialized state, every array space is set to link to the prepared physical memory block (physical memory page).

Patent Document 1: JP-A-04-321158
Patent Document 2: Japanese Patent Application Laid-Open No. 2004-030353

A vector operation processing device capable of executing vector operations, such as a supercomputer, generally initializes array variables by issuing vector store instructions, which are vector operation instructions that write to memory. That is, in such a device, a predetermined initial value is written by a vector store instruction to the array variables stored in memory. Initialization of array variables tends to be concentrated at particular times, such as when software starts executing. Because executing a vector operation instruction usually incurs a certain overhead (the cost that accompanies executing the processing), the vector store instructions that initialize array variables may stall in the arithmetic core when software that uses many large array variables is executed.

In such a case, subsequent vector operation instructions also stall in the arithmetic core without being executed, so performance degrades. The problem, therefore, is to initialize array variables at high speed in a vector operation processing device. Patent Documents 1 and 2 described above do not address this problem. A main object of the present invention is to provide a vector operation processing device and the like that solve this problem.

A vector operation processing device according to one aspect of the present invention includes: vector control means for controlling execution, by vector operation instructions, of operations on array variables stored in a memory; determination means for determining whether an instruction to be executed is an initialization instruction that initializes the array variables; and execution means for executing the initialization instruction, without using the vector operation instructions, by converting it into a scalar operation instruction that accesses the memory in the same manner as the vector operation instructions.

In another aspect for achieving the above object, in an array variable initialization method by a vector operation processing device according to one aspect of the present invention, the vector operation processing device controls execution, by vector operation instructions, of operations on array variables stored in a memory; determines whether an instruction to be executed is an initialization instruction that initializes the array variables; and executes the initialization instruction, without using the vector operation instructions, by converting it into a scalar operation instruction that accesses the memory in the same manner as the vector operation instructions.

In a further aspect for achieving the above object, an array variable initialization program for a vector operation processing device according to one aspect of the present invention causes the vector operation processing device to execute: vector control processing for controlling execution, by vector operation instructions, of operations on array variables stored in a memory; determination processing for determining whether an instruction to be executed is an initialization instruction that initializes the array variables; and execution processing for executing the initialization instruction, without using the vector operation instructions, by converting it into a scalar operation instruction that accesses the memory in the same manner as the vector operation instructions.

Furthermore, the present invention can also be realized by a computer-readable, non-volatile recording medium storing the array variable initialization program (computer program) for the vector operation processing device.

The present invention makes it possible to execute processing for initializing array variables at high speed in a vector operation processing device.

Block diagram showing the configuration of the vector operation processing device 10 according to the first embodiment of the present invention.
Diagram illustrating the range of addresses accessed in the memory module 161 when the vector operation processing device 10 according to the first embodiment of the present invention initializes array variables stored in the memory 16.
Flowchart showing the operation by which the vector operation processing device 10 according to the first embodiment of the present invention initializes array variables.
Time chart (1/2) of array variable initialization processing by the vector operation processing device 10 according to the first embodiment of the present invention.
Time chart (2/2) of array variable initialization processing by the vector operation processing device 10 according to the first embodiment of the present invention.
Block diagram showing the configuration of the vector operation processing device 20 according to the second embodiment of the present invention.
Block diagram showing the configuration of an information processing device 900 capable of implementing the vector operation processing device according to each embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

<First Embodiment>
FIG. 1 is a block diagram showing the configuration of a vector operation processing device 10 according to the first embodiment of the present invention. The vector operation processing device 10 is an information processing device capable of executing vector operations, such as a supercomputer. The vector operation processing device 10 broadly comprises one or more arithmetic cores 11, a memory access control unit 15, and a memory 16.

The arithmetic core 11 is, for example, a CPU (Central Processing Unit). It reads and executes a program (software) stored in the memory 16, which is, for example, the main memory, and stores the execution result in the memory 16. The memory access control unit 15 controls accesses (reading or writing of data) from the arithmetic core 11 to the memory 16.

The arithmetic core 11 includes a vector control unit 12, an instruction processing unit 13, and an address control unit 14. In this embodiment, the instruction processing unit 13 and the address control unit 14 may be collectively referred to as an execution unit.

The instruction processing unit 13 includes a determination unit 131 and a scalar register 132. The instruction processing unit 13 fetches the instructions contained in the program to be executed, decodes the fetched instructions, and executes the decoded instructions.

The instructions executed by the instruction processing unit 13 according to this embodiment are broadly classified into vector operation instructions and scalar operation instructions. A vector operation instruction is an operation instruction that applies the same operation, by pipeline processing, to a plurality of elements, for example the elements of an array variable. A scalar operation instruction is any operation instruction other than a vector operation instruction, for example one that performs an individual operation on a variable that is not an array variable (that is, one that does not contain a plurality of elements). The instruction processing unit 13 can determine whether an instruction is a vector operation instruction or a scalar operation instruction from the information indicating the instruction type (the instruction code) contained in the instruction to be executed.

The remainder of this embodiment describes the operation of writing data to the memory 16 (the store operation) by a scalar operation instruction or a vector operation instruction.

When the instruction to be executed is a scalar operation instruction (store instruction), the instruction processing unit 13 stores information about the instruction in the scalar register 132, including the data that the instruction writes to the memory 16 and the write destination address. The instruction processing unit 13 then transmits to the address control unit 14 information instructing it to generate, based on the information about the instruction stored in the scalar register 132, a memory access request that stores the data in the memory 16.

When the instruction to be executed is a vector operation instruction (vector store instruction), the instruction processing unit 13 transfers the vector store instruction to the vector control unit 12, thereby instructing the vector control unit 12 to acquire or generate the vector data that the vector store instruction writes to the memory 16. The instruction processing unit 13 transmits to the address control unit 14 information instructing it to generate a memory access request that stores the vector data in the memory 16. The vector control unit 12 transmits the vector data to the address control unit 14 through the processing described below.

The vector control unit 12 includes a vector arithmetic unit 121 and a vector register 122. Based on the vector store instruction transferred from the instruction processing unit 13, the vector control unit 12 acquires or generates the vector data that the vector store instruction writes to the memory 16, and stores the vector data in the vector register 122. The vector arithmetic unit 121 has a function of executing vector operations using the vector data stored in the vector register 122. The vector control unit 12 transmits the vector data stored in the vector register 122 to the address control unit 14.

Based on the information received from the instruction processing unit 13 as described above, the address control unit 14 generates a memory access request that writes data to the memory 16. When the instruction to be executed is a store instruction, the address control unit 14 receives the data to be written to the memory 16 from the instruction processing unit 13; when the instruction to be executed is a vector store instruction, it receives the data to be written to the memory 16 from the vector control unit 12. The address control unit 14 transmits the generated memory access request to the memory access control unit 15.

The memory access control unit 15 transmits the memory access requests received from the address control unit 14 to the four access ports of the memory 16, which has four memory modules 161 to 164. The number of memory modules and access ports in the memory 16 is not limited to four; the memory 16 may have a different number of memory modules and access ports.

The memory access control unit 15 includes a routing control unit 150 and memory controllers 151 to 154. The memory controllers 151 to 154 are communicably connected to the memory modules 161 to 164 via the four access ports of the memory 16. Details of the routing control unit 150 and the memory controllers 151 to 154 are described later.

Next, the operation by which the vector operation processing device 10 according to this embodiment initializes array variables stored in the memory 16 will be described.

The determination unit 131 in the instruction processing unit 13 determines whether an instruction fetched by the instruction processing unit 13 is an initialization instruction for an array variable stored in the memory 16. In the programs executed by the vector operation processing device 10 according to this embodiment, an initialization instruction for an array variable is assumed to be expressed as a vector store instruction that sets a predetermined initial value (for example, "0") in all the elements of the array variable.

For example, when a fetched instruction indicates that the value to be written to all the elements of an array variable is "0", the determination unit 131 determines that the instruction is an initialization instruction for the array variable. Alternatively, the determination unit 131 may determine that an instruction is an initialization instruction for an array variable when the instruction type indicated by the fetched instruction denotes such an initialization instruction. In that case, information representing an instruction type (instruction code) that identifies initialization instructions for array variables is assumed to be defined in the instruction set.
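The two determination criteria described above can be sketched as follows. This is a minimal illustration, not the patented logic: the opcode names `vst` and `vinit`, the `Instruction` record, and the idea of carrying the written values in the instruction are all hypothetical, since the description does not specify an instruction encoding.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical opcodes; the description names no concrete instruction codes.
OP_SCALAR_STORE = "st"
OP_VECTOR_STORE = "vst"
OP_ARRAY_INIT = "vinit"  # assumed dedicated "initialize array" instruction code


@dataclass(frozen=True)
class Instruction:
    opcode: str
    base_address: int
    values: Optional[Sequence[int]] = None  # value written to each element


def is_array_init(inst: Instruction) -> bool:
    """Return True if the instruction should be treated as an array initialization."""
    # Criterion 2: the instruction code itself identifies an initialization.
    if inst.opcode == OP_ARRAY_INIT:
        return True
    # Criterion 1: a vector store whose value for every element is 0.
    return (
        inst.opcode == OP_VECTOR_STORE
        and inst.values is not None
        and len(inst.values) > 0
        and all(v == 0 for v in inst.values)
    )
```

Under these assumptions, a vector store of all zeros and a `vinit` instruction are both classified as initialization instructions, while a scalar store or a vector store containing any nonzero value is not.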

When the determination unit 131 determines that the fetched instruction is an initialization instruction for an array variable, the instruction processing unit 13 decides to process the instruction as a scalar operation instruction rather than as a vector operation instruction. That is, the instruction processing unit 13 converts the initialization instruction from a vector operation instruction into a scalar operation instruction.

The instruction processing unit 13 stores information about the initialization instruction, including the initial value "0" to be used, in the scalar register 132. This information is assumed to include, for example, the base address (start address) in the memory 16 at which initialization begins and the number of elements of the array variable to be initialized. The instruction processing unit 13 transmits the information about the initialization instruction stored in the scalar register 132, including the initial value "0", to the address control unit 14.

Based on the information about the initialization instruction received from the instruction processing unit 13, the address control unit 14 generates memory access requests for accessing the four access ports of the memory 16 described above. In doing so, the address control unit 14 changes the destination address every operation cycle of the vector operation processing device 10, for example, and continuously generates as many memory access requests to the memory 16 as there are destination addresses. That is, the access to the memory 16 made by an initialization instruction that has been converted into a scalar operation instruction is the same as the access to the memory 16 made by an initialization instruction executed as a vector operation instruction.

The address control unit 14 transmits the memory access requests generated for each access port to the memory access control unit 15.

FIG. 2 is a diagram illustrating the range of addresses accessed in the memory module 161 when the vector operation processing device 10 according to this embodiment initializes array variables stored in the memory 16. In FIG. 2, each rectangle represents an 8-byte memory area holding one element of an array variable. The upper value in each rectangle is the address within the memory 16 as a whole, and the lower value is the local address within the memory module 161.

In the example shown in FIG. 2, addresses "0" to "127" in the memory 16 are assigned to the memory module 161, addresses "128" to "255" to the memory module 162, addresses "256" to "383" to the memory module 163, and addresses "384" to "511" to the memory module 164. Addresses from "512" onward in the memory 16 are assigned to the memory modules 161 to 164 according to the same rule.
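The assignment rule in this example can be sketched as a small address-mapping function. This is an illustrative reading of the FIG. 2 example, not a definitive implementation: the function name `locate` and the formula for the local address are assumptions, chosen so that they agree with the global and local address pairs the description gives for the memory module 161.

```python
BLOCK_BYTES = 128                    # bytes assigned to one module before moving to the next
MODULE_IDS = (161, 162, 163, 164)    # reference numerals of the four memory modules


def locate(global_address: int) -> tuple:
    """Map an address in the memory 16 to (memory module, local address in that module)."""
    block, offset = divmod(global_address, BLOCK_BYTES)
    module = MODULE_IDS[block % len(MODULE_IDS)]
    # Assumed local layout: each module packs the blocks it owns back to back.
    local = (block // len(MODULE_IDS)) * BLOCK_BYTES + offset
    return module, local
```

With this mapping, address 512 wraps around to the memory module 161 at local address 128, and global addresses 64 and 4159 map to local addresses 64 and 1087 in the memory module 161.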

In FIG. 2, the shaded rectangles represent the memory area to be initialized. That is, in the example shown in FIG. 2, the vector operation processing device 10 initializes 128 elements in the memory module 161, starting from the first element at address "64". Because the vector operation processing device 10 initializes the array variable elements stored in the memory modules 162 to 164 in the same way, the total number of array variable elements initialized in this case is 512.

In the example shown in FIG. 2, the address control unit 14 generates, for the access port to the memory module 161, a memory access request that writes "0" to addresses "64" to "4159" in the memory 16 (that is, local addresses "64" to "1087" in the memory module 161). If the unit of access to the memory module 161 in this embodiment is, for example, 128 bytes (that is, the memory area covered by one row of rectangles in FIG. 2), the address control unit 14 generates one memory access request covering the nine rows of memory area that contain the shaded rectangles. The address control unit 14 likewise generates memory access requests for the access ports to the memory modules 162 to 164.
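The figures in this example (local addresses 64 to 1087, nine rows) follow from simple arithmetic, which can be checked with the short sketch below. The helper name `touched_rows` is hypothetical; the element size and access unit are the values stated for FIG. 2.

```python
ELEMENT_BYTES = 8        # one array element, as in FIG. 2
ACCESS_UNIT_BYTES = 128  # one row of rectangles in FIG. 2


def touched_rows(first_local_address: int, num_elements: int) -> tuple:
    """Return (last local address, number of access-unit rows covered)."""
    last_local_address = first_local_address + num_elements * ELEMENT_BYTES - 1
    first_row = first_local_address // ACCESS_UNIT_BYTES
    last_row = last_local_address // ACCESS_UNIT_BYTES
    return last_local_address, last_row - first_row + 1


# 128 elements starting at local address 64 in the memory module 161
last, rows = touched_rows(64, 128)
```

Because the initialized area starts in the middle of row 0 and ends in the middle of row 8, 1024 bytes of data span nine 128-byte rows rather than eight.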

The memory access control unit 15 receives from the address control unit 14 the memory access requests that the address control unit 14 generated for the access ports to the memory modules 161 to 164. The routing control unit 150 in the memory access control unit 15 inputs the memory access request for the access port to the memory module 161 to the memory controller 151. The routing control unit 150 inputs the memory access requests for the access ports to the memory modules 162 to 164 to the memory controllers 152 to 154, respectively.

The memory controller 151 stores the memory access request for the access port to the memory module 161, input from the routing control unit 150, in a buffer (not shown) provided in the memory controller 151. The memory controller 151 then transmits the memory access request to the memory module 161. If the unit of access to the memory module 161 in this embodiment is, for example, the 128 bytes described above, the memory controller 151 accesses, with a single memory access request, the nine rows of memory area in FIG. 2 that contain the shaded rectangles. In this way, the memory controller 151 initializes the array variable elements stored in the memory module 161.

Similarly, the memory controllers 152 to 154 store the memory access requests for the access ports to the memory modules 162 to 164, input from the routing control unit 150, in buffers (not shown) provided in the memory controllers 152 to 154. The memory controllers 152 to 154 then transmit these memory access requests to the memory modules 162 to 164, respectively. In this way, the memory controllers 152 to 154 initialize the array variable elements stored in the memory modules 162 to 164.

Next, the operation (processing) by which the vector operation processing device 10 according to this embodiment initializes array variables will be described in detail with reference to the flowchart of FIG. 3.

The determination unit 131 in the instruction processing unit 13 determines whether the fetched instruction is an initialization instruction for an array variable stored in the memory 16 (step S101). If the fetched instruction is not an initialization instruction for an array variable (No in step S102), the entire process ends.

If the fetched instruction is an initialization instruction for an array variable (Yes in step S102), the instruction processing unit 13 converts the initialization instruction from a vector operation instruction into a scalar operation instruction and stores information about the initialization instruction, including the initial value "0", in the scalar register 132 (step S103). The instruction processing unit 13 transmits the information about the initialization instruction, including the initial value "0", to the address control unit 14 (step S104).

Based on the information about the initialization instruction received from the instruction processing unit 13, the address control unit 14 generates memory access requests for accessing the four access ports of the memory 16 and transmits the memory access request generated for each access port to the memory access control unit 15 (step S105).

The memory access control unit 15 initializes the array variables stored in the memory 16 by transmitting the memory access requests for each access port, received from the address control unit 14, to the memory modules 161 to 164 in the memory 16 (step S106), and the entire process ends.
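Steps S101 to S106 can be traced in software with the following sketch. It is a simplified model under stated assumptions, not the hardware described here: the class and function names are hypothetical, the memory 16 is modeled as a dictionary from address to value, one write request is generated per 8-byte element, and the routing to individual memory modules is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ELEMENT_BYTES = 8  # element size used in the FIG. 2 example


@dataclass
class VectorStore:
    """Hypothetical vector store instruction: one value per array element."""
    base_address: int
    values: List[int]


@dataclass
class ScalarInitInfo:
    """Information placed in the scalar register in step S103."""
    base_address: int
    num_elements: int
    initial_value: int


def is_init(inst: VectorStore) -> bool:
    # S101/S102: treat a vector store of all zeros as an initialization.
    return len(inst.values) > 0 and all(v == 0 for v in inst.values)


def convert(inst: VectorStore) -> ScalarInitInfo:
    # S103: convert to scalar form (base address, element count, initial value).
    return ScalarInitInfo(inst.base_address, len(inst.values), 0)


def generate_requests(info: ScalarInitInfo) -> List[Tuple[int, int]]:
    # S105: one (address, value) write request per element, address advancing each step.
    return [
        (info.base_address + i * ELEMENT_BYTES, info.initial_value)
        for i in range(info.num_elements)
    ]


def initialize(inst: VectorStore, memory: Dict[int, int]) -> bool:
    """Run S101 to S106; return False if the instruction is not an initialization."""
    if not is_init(inst):
        return False
    info = convert(inst)  # S103, handed to the address control in S104
    for address, value in generate_requests(info):
        memory[address] = value  # S106: the write reaches memory
    return True
```

For example, applying `initialize(VectorStore(64, [0, 0]), memory)` to a memory holding nonzero values at addresses 64, 72, and 80 clears the first two and leaves address 80 untouched. The vector register and vector arithmetic unit play no part in this path, which is the point of the conversion.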

The vector operation processing device 10 according to this embodiment can initialize array variables at high speed. This is because the vector operation processing device 10 executes an initialization instruction that initializes an array variable, without using a vector operation instruction, by converting it into a scalar operation instruction that accesses the memory in the same manner as a vector operation instruction.

The effects achieved by the vector operation processing device 10 according to this embodiment are described in detail below.

A vector operation processing device capable of executing vector operations generally initializes array variables by issuing vector store instructions and writing a predetermined initial value to the array variables. Initialization of array variables tends to be concentrated at particular times, such as when software starts executing. Because executing a vector operation instruction usually incurs a certain overhead, the vector store instructions that initialize array variables may stall in the arithmetic core when software that uses many large array variables is executed. In this case, subsequent vector operation instructions also stall in the arithmetic core without being executed, so performance degrades.

To address this problem, the vector operation processing device 10 according to the present embodiment includes a vector control unit 12, a determination unit 131, and an execution unit (the instruction processing unit 13 and the address control unit 14), and operates, for example, as described above with reference to FIGS. 1 to 3. That is, the vector control unit 12 controls execution, by vector operation instructions, of operations on the array variable stored in the memory 16. The determination unit 131 determines whether an instruction to be executed is an initialization instruction that initializes the array variable. The execution unit then executes the initialization instruction by converting it into a scalar operation instruction that accesses the memory 16 in the same manner as a vector operation instruction, without using the vector operation instruction.

FIGS. 4A and 4B illustrate time charts of the array variable initialization process performed by the vector operation processing device 10 according to the present embodiment. To clarify the effect achieved by the vector operation processing device 10, FIGS. 4A and 4B also compare that time chart with the time chart of the initialization process in the case where the vector operation processing device 10 initializes the array variable using a vector operation instruction, as a general vector operation processing device does. The horizontal axis of the time charts in FIGS. 4A and 4B represents the time elapsed since execution of the initialization instruction started, expressed in operation cycles T of the vector operation processing device 10. The operation cycle T is, for example, the clock period at which the vector operation processing device 10 operates.

When the vector operation processing device 10 initializes an array variable using a vector operation instruction (vector store instruction), as a general vector operation processing device does, the instruction processing unit 13 must transfer the initialization instruction to the vector control unit 12, as shown in FIGS. 4A and 4B. In addition, the vector control unit 12 must expand the initial value "0" into the vector register 122 and transmit the vector store data expanded in the vector register 122 to the address control unit 14.

When the vector operation processing device 10 initializes the array variable stored in the memory 16 using a vector operation instruction, it completes the initialization 28T after starting execution of the initialization instruction, as shown in FIGS. 4A and 4B.

In contrast, when the vector operation processing device 10 initializes the array variable stored in the memory 16 using a scalar operation instruction, it completes the initialization 18T after starting execution of the initialization instruction, as shown in FIGS. 4A and 4B. That is, the vector operation processing device 10 according to the present embodiment completes the initialization of the array variable 10T earlier than when it initializes the array variable using a vector operation instruction, as a general vector operation processing device does.
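As a worked check of the figures quoted from FIGS. 4A and 4B, the saving and its share of the vector-path time can be written out. The symbols t_vector and t_scalar are introduced here only for illustration and do not appear in the original text.

```latex
\[
t_{\mathrm{vector}} - t_{\mathrm{scalar}} = 28T - 18T = 10T,
\qquad
\frac{t_{\mathrm{vector}} - t_{\mathrm{scalar}}}{t_{\mathrm{vector}}} = \frac{10T}{28T} \approx 0.357
\]
```

For the example in the time charts, the scalar path therefore shortens the initialization by roughly 36 percent.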

The reason why the vector operation processing device 10 according to the present embodiment can complete the initialization of an array variable earlier than when a vector operation instruction is used is as follows. During execution of a program, the results of operations on an array variable generally differ from element to element. To store such results in memory at high speed, vector operation processing (a vector store) must be performed using a vector control unit equipped with a vector register that holds a different value for each element. In that case, however, as described above, the overhead associated with performing vector operation processing through the vector control unit arises (for example, transferring data and expanding data into the vector register).

The vector operation processing device 10 according to the present embodiment is configured with attention to the characteristic that, when an array variable is initialized, the values written to it (the initial values) are all the same (for example, "0"). Without using the vector control unit 12, it uses a scalar operation instruction to continuously issue memory access requests that write that same value to the elements of the array variable all at once. That is, the vector operation processing device 10 initializes the array variable by making continuous accesses similar to those a vector store instruction makes to the memory 16, without incurring the overhead described above. As a result, the vector operation processing device 10 according to the present embodiment can execute the process of initializing array variables at high speed.
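The idea of issuing one store request per operation cycle, each carrying the same initial value and a different destination address, can be sketched in software. This is only an illustrative model under stated assumptions: the `StoreRequest` type, the function names, element-indexed addresses, and the `block` size per request are hypothetical and are not taken from the original text.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class StoreRequest:
    cycle: int    # operation cycle T in which the request is issued
    address: int  # index of the first element written by this request
    length: int   # number of consecutive elements written
    value: int    # the single initial value, held once as in a scalar register


def scalar_init_requests(base: int, count: int, block: int, value: int = 0) -> Iterator[StoreRequest]:
    """Yield one request per cycle, advancing the address by `block` elements."""
    if block <= 0:
        raise ValueError("block must be positive")
    end = base + count
    for cycle, address in enumerate(range(base, end, block)):
        # The last request is shortened so it never writes past the array.
        yield StoreRequest(cycle, address, min(block, end - address), value)


def apply_requests(memory: List[int], requests: Iterable[StoreRequest]) -> None:
    """Write each request's value into the modelled memory (assumed in range)."""
    for req in requests:
        memory[req.address:req.address + req.length] = [req.value] * req.length


memory = [7] * 20
requests = list(scalar_init_requests(base=4, count=10, block=4))
apply_requests(memory, requests)
```

No per-element data is expanded anywhere in this model; only the address changes between requests, which is the property the paragraph above relies on.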

In addition, the memory access according to the present embodiment, which uses a store instruction that writes the initial value "0" all at once to the memory area holding the array variable, requires fewer memory access requests from the operation core 11 to the memory access control unit 15 than memory access by a vector store instruction that writes the initial value "0" element by element. The vector operation processing device 10 according to the present embodiment can therefore suppress congestion of memory access requests in the memory access control unit 15, and so can execute the process of initializing array variables at high speed. This congestion-suppressing effect can be expected to grow as the number of operation cores 11 in the vector operation processing device 10 increases.
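The reduction in request count can be illustrated with a rough model. The element size and the number of bytes covered by one block store are not given in the text, so the values below (8-byte elements, 128-byte blocks) and the function names are assumptions chosen only to make the comparison concrete.

```python
def per_element_requests(num_elements: int) -> int:
    """Requests issued when every element is written by its own request."""
    return num_elements


def block_store_requests(num_elements: int, element_bytes: int, block_bytes: int) -> int:
    """Requests issued when each request covers `block_bytes` of the array."""
    total_bytes = num_elements * element_bytes
    return -(-total_bytes // block_bytes)  # ceiling division


# Hypothetical sizes: 1024 elements of 8 bytes, 128 bytes per block store.
elementwise = per_element_requests(1024)
blockwise = block_store_requests(1024, element_bytes=8, block_bytes=128)
```

With these assumed sizes the block store issues one sixteenth as many requests. The actual ratio depends on the hardware's request granularity, which the text leaves open.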

Further, because the vector operation processing device 10 according to the present embodiment adds no function specific to the present embodiment to the memory 16, existing products such as general-purpose memory can be used. That is, the vector operation processing device 10 according to the present embodiment can realize, at low cost, the configuration described above that initializes array variables at high speed.

<Second Embodiment>
FIG. 5 is a block diagram showing the configuration of a vector operation processing device 20 according to a second embodiment of the present invention.

The vector operation processing device 20 according to the present embodiment includes a vector control unit 21, a determination unit 22, and an execution unit 23.

The vector control unit 21 controls execution, by a vector operation instruction 210, of operations on an array variable 240 stored in a memory 24.

The determination unit 22 determines whether an instruction 200 to be executed is an initialization instruction 220 that initializes the array variable 240.

The execution unit 23 executes the initialization instruction 220 by converting it into a scalar operation instruction 230 that accesses the memory 24 in the same manner as the vector operation instruction 210, without using the vector operation instruction 210.

The vector operation processing device 20 according to the present embodiment can execute the process of initializing array variables at high speed. This is because the vector operation processing device 20 executes the initialization instruction 220, which initializes the array variable 240, by converting it into the scalar operation instruction 230 that accesses the memory 24 in the same manner as the vector operation instruction 210, without using the vector operation instruction 210.
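The division of work between the determination unit 22 and the execution unit 23 can be sketched as a small dispatch routine. This is a software analogy under stated assumptions, not the device's logic: the `Kind` and `Instruction` types, the path names returned by `dispatch`, and the choice to combine both determination criteria from the claims (instruction type, or a predetermined written value) with a logical OR are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Kind(Enum):
    VECTOR_STORE = auto()  # stands in for a vector operation instruction 210
    ARRAY_INIT = auto()    # instruction type that explicitly marks initialization
    OTHER = auto()


@dataclass(frozen=True)
class Instruction:
    kind: Kind
    store_value: Optional[int] = None  # value written to every element, if uniform


def is_initialization(instr: Instruction, predetermined_value: int = 0) -> bool:
    """Determination step: match by instruction type or by the written value."""
    if instr.kind is Kind.ARRAY_INIT:
        return True
    return instr.store_value is not None and instr.store_value == predetermined_value


def dispatch(instr: Instruction) -> str:
    """Execution step: send initialization down the scalar store path."""
    if is_initialization(instr):
        return "scalar_store"  # converted; the vector path is bypassed
    if instr.kind is Kind.VECTOR_STORE:
        return "vector"
    return "scalar"


path = dispatch(Instruction(Kind.VECTOR_STORE, store_value=0))
```

A store whose elements differ (modelled here as `store_value=None` or a non-predetermined value) still takes the vector path, so only uniform initialization is redirected.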

<Hardware Configuration Example>
Each unit of the vector operation processing devices shown in FIGS. 1 and 5 in the embodiments described above can be implemented by dedicated hardware (HW) (electronic circuits). In FIGS. 1 and 5, at least the following components can also be regarded as functional (processing) units (software modules) of a software program.
- vector control units 12 and 21,
- instruction processing unit 13,
- determination units 131 and 22,
- address control unit 14,
- execution unit 23,
- memory access control unit 15.

However, the division into units shown in these drawings is a configuration adopted for convenience of explanation, and various configurations are conceivable in an actual implementation. An example of the hardware environment in such a case is described with reference to FIG. 6.

FIG. 6 is a diagram illustrating the configuration of an information processing device 900 (computer) capable of implementing the vector operation processing device according to each embodiment of the present invention. That is, FIG. 6 shows the configuration of a computer (information processing device) capable of realizing the vector operation processing devices shown in FIGS. 1 and 5, and represents a hardware environment capable of realizing the functions of the embodiments described above.

The information processing device 900 shown in FIG. 6 includes the following components.
- CPU (Central Processing Unit) 901,
- ROM (Read Only Memory) 902,
- RAM (Random Access Memory) 903,
- hard disk (storage device) 904,
- communication interface 905,
- bus 906 (communication line),
- reader/writer 908 capable of reading and writing data stored on a recording medium 907 such as a CD-ROM (Compact Disc Read Only Memory),
- input/output interface 909 for a monitor, speakers, a keyboard, and the like.

That is, the information processing device 900 including the above components is a general computer in which these components are connected via the bus 906. The information processing device 900 may include a plurality of CPUs 901, or may include a CPU 901 with a multi-core configuration.

The present invention, described above using the embodiments as examples, is achieved by supplying the information processing device 900 shown in FIG. 6 with a computer program capable of realizing the following functions, and then having the CPU 901 of that hardware read, interpret, and execute the computer program. The functions are those of the components in the block diagrams (FIGS. 1 and 5) or of the flowchart (FIG. 3) referred to in the description of the embodiments. The computer program supplied to the device may be stored in a readable and writable volatile memory (RAM 903) or in a non-volatile storage device such as the ROM 902 or the hard disk 904.

In the above case, a procedure that is now common can be used to supply the computer program to the hardware. Examples include installing the program in the device via a recording medium 907 such as a CD-ROM, and downloading it from outside via a communication line such as the Internet. In such cases, the present invention can be regarded as being constituted by the code that makes up the computer program, or by the recording medium 907 on which that code is stored.

The present invention has been described above using the embodiments as exemplary examples. However, the present invention is not limited to these embodiments. That is, various aspects that can be understood by those skilled in the art can be applied to the present invention within its scope.

REFERENCE SIGNS LIST
10 vector operation processing device
11 operation core
12 vector control unit
121 vector arithmetic unit
122 vector register
13 instruction processing unit
131 determination unit
132 scalar register
14 address control unit
15 memory access control unit
150 routing control unit
151 to 154 memory controllers
16 memory
161 to 164 memory modules
20 vector operation processing device
200 instruction to be executed
21 vector control unit
210 vector operation instruction
22 determination unit
220 initialization instruction
23 execution unit
230 scalar operation instruction
24 memory
240 array variable
900 information processing device
901 CPU
902 ROM
903 RAM
904 hard disk (storage device)
905 communication interface
906 bus
907 recording medium
908 reader/writer
909 input/output interface

Claims

1. A vector operation processing device comprising:
vector control means for controlling execution, by a vector operation instruction, of an operation on an array variable stored in a memory;
determination means for determining whether an instruction to be executed is an initialization instruction that initializes the array variable; and
execution means for executing the initialization instruction by converting it, without using the vector operation instruction, into a scalar operation instruction that accesses the memory in the same manner as the vector operation instruction.

2. The vector operation processing device according to claim 1, wherein the execution means converts the initialization instruction into the scalar operation instruction that accesses the memory while changing an access destination address in each operation cycle of the device itself.

3. The vector operation processing device according to claim 1 or 2, wherein the execution means includes a scalar register that stores write data to be written to the memory by the scalar operation instruction, and stores the write data indicated by the initialization instruction in the scalar register.

4. The vector operation processing device according to any one of claims 1 to 3, wherein the execution means converts the initialization instruction into the scalar operation instruction including access instructions for a plurality of access ports provided in the memory.

5. The vector operation processing device according to claim 4, further comprising memory access control means for executing the initialization instruction converted into the scalar operation instruction by accessing the plurality of access ports provided in the memory.

6. The vector operation processing device according to any one of claims 1 to 5, wherein the determination means determines that the instruction to be executed is the initialization instruction when, in the instruction to be executed, the value written to the elements included in the array variable is a predetermined value.

7. The vector operation processing device according to any one of claims 1 to 5, wherein the determination means determines whether an instruction type indicated by the instruction to be executed indicates the initialization instruction.

8. The vector operation processing device according to any one of claims 1 to 6, further comprising the memory.

9. An array variable initialization method performed by a vector operation processing device, the method comprising:
controlling execution, by a vector operation instruction, of an operation on an array variable stored in a memory;
determining whether an instruction to be executed is an initialization instruction that initializes the array variable; and
executing the initialization instruction by converting it, without using the vector operation instruction, into a scalar operation instruction that accesses the memory in the same manner as the vector operation instruction.

10. An array variable initialization program for a vector operation processing device, the program causing the vector operation processing device to execute:
vector control processing for controlling execution, by a vector operation instruction, of an operation on an array variable stored in a memory;
determination processing for determining whether an instruction to be executed is an initialization instruction that initializes the array variable; and
execution processing for executing the initialization instruction by converting it, without using the vector operation instruction, into a scalar operation instruction that accesses the memory in the same manner as the vector operation instruction.