JP2014225088A

JP2014225088A - Arithmetic unit

Info

Publication number: JP2014225088A
Application number: JP2013103260A
Authority: JP
Inventors: Masaru Ito (伊藤 大)
Original assignee: Olympus Corp
Current assignee: Olympus Corp
Priority date: 2013-05-15
Filing date: 2013-05-15
Publication date: 2014-12-04
Anticipated expiration: 2033-05-15
Also published as: JP6161395B2

Abstract

PROBLEM TO BE SOLVED: To provide an arithmetic unit in which a plurality of processors process in coordination and in which the necessary data can be prepared by the time each processor uses it.

SOLUTION: An arithmetic unit includes: a plurality of processing operation parts that perform arithmetic processing according to input tasks and output information on the subsequent arithmetic processing as tasks; a data storage part that stores data used by the respective processing operation parts for the arithmetic processing, or data resulting from the arithmetic processing; a memory control part that reads data used for the arithmetic processing from an external storage part and writes the data stored in the data storage part to the external storage part; and a task control part that includes a task queue, outputs the stored tasks to any one of the processing operation parts, and outputs to the memory control part an access instruction directing access to the external storage part, based on the timing at which the processing operation part performs the arithmetic processing of the tasks.

Description

The present invention relates to an arithmetic device.

Conventionally, there are arithmetic devices that include a processor that executes arithmetic processing according to a program. In such a device, each arithmetic process executed by the processor is divided into a plurality of tasks, and the processor performs the process by executing the divided tasks in sequence. Some arithmetic processes are executed while accessing an external memory, that is, while reading the data they use from the external memory or writing data to it. Because accessing an external memory takes a long time, an arithmetic process that involves external memory access takes longer than one that can be performed entirely inside the device. For this reason, to improve processing speed, many conventional arithmetic devices read the data used for arithmetic processing from the external memory in advance and hold it temporarily in a memory faster than the external memory, a so-called cache memory.

However, even an arithmetic device with a cache memory cannot, given constraints on its circuit scale, include a cache memory large enough to hold all the data needed for arithmetic processing. Consequently, when the data needed for arithmetic processing is not already held in the cache memory, that is, on a so-called cache miss, the external memory must still be accessed, and the processing speed of the arithmetic device may not improve.

To address this, there is an arithmetic device designed to prevent cache misses (see Patent Document 1). It extracts operation instructions that include a load instruction, by which the processor reads data from memory, or a store instruction, by which the processor writes data to memory (hereinafter, "load/store instructions"). It then executes a prefetch instruction for the memory address accessed by each extracted load/store instruction at a timing earlier than the operation instruction that uses the data.

There are also so-called distributed parallel processing arithmetic devices, which include a plurality of processors that share a series of processes, such as image processing, and perform arithmetic processing in parallel. In a distributed parallel processing arithmetic device, the time required for arithmetic processing can be shortened by having the plurality of processors share the requested operation instructions.

In such a distributed parallel processing arithmetic device, the cache-miss prevention technique disclosed in Patent Document 1 could conceivably be applied to processing that requires data to be held for a certain time, such as processors waiting on each other's tasks or line buffer processing, or to processing that consists of more tasks than the number of processors in the device.

Patent Document 1: JP 2011-76314 A

However, in actual arithmetic processing by a processor in an arithmetic device, the number of cycles between the issue of a prefetch instruction and the execution of the operation instruction that actually uses the prefetched data varies with how the program is written. The number of cycles from issuing the prefetch instruction to using the data cannot be controlled. Therefore, even in an arithmetic device that applies the technique disclosed in Patent Document 1, there is no guarantee that the necessary data will always be ready by the time an operation instruction uses it.

For example, suppose a prefetch instruction is executed one cycle before the operation instruction that uses the data, the external memory access needed to fetch that data takes 30 cycles, and the preceding operation instruction completes in one cycle. The prefetch instruction then cannot have the data ready in time, and execution of the operation instruction is stalled for 29 cycles. During this stall, that is, while the cache miss persists, the processor cannot perform arithmetic processing, which lowers the processing speed of the arithmetic device.
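The arithmetic in this example can be checked with a short sketch. The function name `stall_cycles` and its parameters are illustrative assumptions, not part of the patent; the model simply assumes the stall equals the external access latency minus the lead time of the prefetch, floored at zero.

```python
def stall_cycles(access_latency: int, prefetch_lead: int) -> int:
    """Cycles the consuming instruction waits for prefetched data.

    Assumed model (hypothetical): the data arrives `access_latency` cycles
    after the prefetch is issued, and the consuming instruction is ready to
    run `prefetch_lead` cycles after the prefetch is issued.
    """
    return max(0, access_latency - prefetch_lead)


# Prefetch issued 1 cycle ahead, 30-cycle external access: 29-cycle stall.
print(stall_cycles(access_latency=30, prefetch_lead=1))
# Prefetch issued far enough ahead: no stall.
print(stall_cycles(access_latency=30, prefetch_lead=40))
```

Under this model the stall disappears only when the prefetch lead is at least as long as the access latency, which is exactly the quantity the program structure does not let the processor control.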

Thus, even an arithmetic device that applies the technique disclosed in Patent Document 1 cannot guarantee that data is always prefetched into the cache memory in time, and so cannot be said to reliably prevent cache misses.

The present invention has been made in recognition of the above problem. Its object is to provide an arithmetic device in which a plurality of processors process in cooperation and which can prevent cache misses by having the necessary data ready by the time each processor actually uses it.

To solve the above problem, the arithmetic device of the present invention comprises: a plurality of processing operation units, each having a processing function for performing arithmetic processing according to an input task and outputting information on the arithmetic processing to be executed next as a task; a data storage unit that stores data used by each processing operation unit when executing the arithmetic processing according to a task, or data resulting from that processing; a memory control unit that reads data used when executing the arithmetic processing according to a task from a connected external storage unit and stores it in the data storage unit, or writes the result data stored in the data storage unit to the connected external storage unit; and a task control unit that includes a task queue for sequentially storing the tasks, outputs each task stored in the task queue to one of the plurality of processing operation units, and, based on the timing at which the processing operation unit executes the arithmetic processing according to each task stored in the task queue, outputs to the memory control unit an access instruction directing access to the external storage unit.

According to the present invention, it is possible to provide an arithmetic device in which a plurality of processors process in cooperation and which can prevent cache misses by having the necessary data ready by the time each processor actually uses it.

A block diagram showing an example of the schematic configuration of the arithmetic device according to an embodiment of the present invention.
A diagram illustrating the schematic configuration of the task control unit provided in the arithmetic device of the present embodiment and an example of the tasks stored in the task control unit.
A timing chart showing the timing of task distribution and data transfer in the first operation of the task control unit provided in the arithmetic device of the present embodiment.
A flowchart showing the processing procedure in the first operation of the task control unit provided in the arithmetic device of the present embodiment.
A flowchart showing the processing procedure in the second operation of the task control unit provided in the arithmetic device of the present embodiment.
A diagram illustrating the schematic configuration of the task control unit provided in the arithmetic device of the present embodiment and another example of the tasks stored in the task control unit.
A flowchart showing the processing procedure in the third operation of the task control unit provided in the arithmetic device of the present embodiment.
A timing chart showing the timing of task distribution and data transfer in the third operation of the task control unit provided in the arithmetic device of the present embodiment.
A diagram illustrating an example of reordering the tasks output to the processing operation units in the third operation of the task control unit provided in the arithmetic device of the present embodiment.
A diagram illustrating an example of reordering the tasks output to the processing operation units in the third operation of the task control unit provided in the arithmetic device of the present embodiment.
A diagram illustrating an example of reordering the tasks output to the processing operation units in the third operation of the task control unit provided in the arithmetic device of the present embodiment.
A flowchart showing the processing procedure in the fourth operation of the task control unit provided in the arithmetic device of the present embodiment.
A flowchart showing the processing procedure by which the task control unit provided in the arithmetic device of the present embodiment saves the data stored in the data storage unit.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing an example of the schematic configuration of the arithmetic device according to the present embodiment. The arithmetic device 10 shown in FIG. 1 includes n processing operation units 11a to 11n, a task control unit 12, a memory control unit 13, and n data storage units 14a to 14n. An external storage unit 20 is connected to the arithmetic device 10. The arithmetic device 10 is a distributed parallel processing arithmetic device in which the requested arithmetic processing is shared among the processing operation units 11a to 11n.

In the following description, the processing operation units 11a to 11n are referred to as the "processing operation unit 11" when they need not be distinguished from one another. Likewise, the data storage units 14a to 14n are referred to as the "data storage unit 14" when they need not be distinguished.

The external storage unit 20 is a memory, for example a DRAM (Dynamic Random Access Memory), shared by the processing operation units 11a to 11n. The external storage unit 20 stores the programs with which each of the processing operation units 11a to 11n starts up and the data each of them uses to execute arithmetic processing. The external storage unit 20 also temporarily stores data that each of the processing operation units 11a to 11n generates in the course of arithmetic processing.

Each of the processing operation units 11a to 11n is a processor with the same processing function. Each executes the tasks, input from the task control unit 12, that make up the arithmetic processing requested of the arithmetic device 10, writing data to and reading data from the external storage unit 20 connected to the arithmetic device 10 as it does so. In the arithmetic device 10, however, the processing operation units 11a to 11n do not write to or read from the external storage unit 20 directly. Instead, when executing a task input from the task control unit 12, each processing operation unit writes data to and reads data from the data storage unit 14.

After executing a task, each of the processing operation units 11a to 11n outputs to the task control unit 12, as an execution request for the next task, information for causing another processing operation unit 11 or itself to execute the next task and information describing the content of that task. This execution request includes the address in the external storage unit 20 that holds the data used when executing the task and information designating the data storage unit 14. It also includes the various parameter data needed to execute the next task. In the following description, the execution request for the next task that a processing operation unit 11 outputs to the task control unit 12 is also called a task.
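As a minimal sketch of what such an execution request might carry, the record below groups the items named in this paragraph. The class name `Task`, the field names, and the example values (including the address) are hypothetical placeholders chosen for illustration; the patent does not define a concrete layout.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical layout of a next-task execution request."""
    kind: int            # which task to run next (assumed encoding)
    ext_address: int     # address in the external storage unit holding the input data
    data_store_id: str   # designated data storage unit, e.g. "14a"
    params: dict = field(default_factory=dict)  # parameters needed to run the task


# Example request; every value here is a placeholder.
next_task = Task(kind=3, ext_address=0x1000, data_store_id="14b",
                 params={"width": 640})
print(next_task)
```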

Each of the processing operation units 11a to 11n also executes the next task input from the task control unit 12, that is, the task that resulted when another processing operation unit 11 or the unit itself executed the previous task. If there is a further task to be executed by another processing operation unit 11 or by itself, the unit again outputs to the task control unit 12, as the next task, the information for causing that task to be executed and the information describing its content.

Each of the processing operation units 11a to 11n also outputs to the task control unit 12 a signal indicating whether it can accept the next task. When a unit has finished processing its current task and is ready to execute the next one, it outputs to the task control unit 12 a signal indicating that it can accept the next task.

The task control unit 12 accepts the tasks input from each of the processing operation units 11a to 11n and, based on the signals from those units indicating whether they can accept the next task, assigns each accepted task to one of the processing operation units 11a to 11n.

More specifically, based on the tasks input from the processing operation units 11a to 11n, the task control unit 12 selects one processing operation unit 11 to execute the next task from among those that are signaling that they can accept it. The task control unit 12 then outputs the task to the selected processing operation unit 11, thereby distributing the tasks of the arithmetic processing requested of the arithmetic device 10 among the processing operation units 11a to 11n.
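The selection step can be sketched as follows. The function `select_unit` and its tie-breaking rule (the lowest-named ready unit wins) are assumptions made only for illustration; this passage does not state which of several ready units is chosen.

```python
from typing import Optional


def select_unit(ready: dict[str, bool]) -> Optional[str]:
    """Pick one unit whose ready signal is asserted, or None if all are busy.

    Assumption: ties are broken by name order; the source text does not
    fix a selection policy at this point.
    """
    candidates = sorted(unit for unit, is_ready in ready.items() if is_ready)
    return candidates[0] if candidates else None


# Units 11b and 11c signal that they can accept a task; 11b is chosen.
print(select_unit({"11a": False, "11b": True, "11c": True, "11d": False}))
```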

The task control unit 12 includes a task queue 121 for accepting tasks from each of the processing operation units 11a to 11n. The task queue 121 is a queue memory that stores the input tasks. The tasks input from the processing operation units 11a to 11n are stored in the task queue 121 sequentially, in the order in which they were input. The tasks stored in the task queue 121 are basically output in the order in which they were stored, but in the arithmetic device 10 the task control unit 12 controls both which processing operation unit 11 receives each stored task and the order in which the tasks are output.
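The queue behavior described above can be sketched with a double-ended queue. This is only an illustration under assumed names (`task_queue`, the task labels); how the task control unit actually decides to depart from first-in, first-out order is not specified in this passage.

```python
from collections import deque

# Tasks are stored in the order in which they arrive.
task_queue = deque()
for name in ["task0", "task5", "task9"]:
    task_queue.append(name)

# Default behavior: output in stored (FIFO) order.
first = task_queue.popleft()

# Controlled order: the controller may pull a specific entry instead of the head.
chosen = "task9"
task_queue.remove(chosen)

print(first, chosen, list(task_queue))
```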

The task control unit 12 also outputs to the memory control unit 13 an instruction to access the external storage unit 20 (hereinafter, "access instruction"), based on the timing at which each task stored in the task queue 121 is executed. For example, the task control unit 12 outputs to the memory control unit 13 an access instruction that writes data to the external storage unit 20 and reads data from the external storage unit 20 by DMA (Direct Memory Access).

More specifically, based on the tasks sequentially stored in the task queue 121, the task control unit 12 outputs to the memory control unit 13 a DMA access instruction to acquire (read) in advance the data stored in the external storage unit 20 that a task will use, so that the data is available by the time the processing operation unit 11 assigned the task actually executes it. Likewise, based on the tasks sequentially stored in the task queue 121, the task control unit 12 outputs to the memory control unit 13 a DMA access instruction to save (write) to the external storage unit 20, in advance, data that the processing operation units 11 will not use when executing their assigned tasks. The access instruction that the task control unit 12 outputs to the memory control unit 13 includes information such as the address in the external storage unit 20 and the amount (size) of data to be read or written.

How the task control unit 12 controls which processing operation unit 11 receives a task and the order in which tasks are output, and the access instructions for the external storage unit 20 that it outputs to the memory control unit 13, are described in detail later.

The memory control unit 13 reads data from and writes data to the external storage unit 20 connected to the arithmetic device 10 according to the access instructions input from the task control unit 12.

More specifically, in response to an access instruction from the task control unit 12 to acquire (read) in advance data stored in the external storage unit 20, the memory control unit 13 reads the amount of data designated by the access instruction from the address in the external storage unit 20 designated by the access instruction, and stores the data in the data storage unit 14 designated by the access instruction. In response to an access instruction from the task control unit 12 to save (write) data to the external storage unit 20, the memory control unit 13 reads the data stored in the data storage unit 14 designated by the access instruction and writes it to the storage area at the address in the external storage unit 20 designated by the access instruction.
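The read and save paths just described can be sketched as a small software model. Everything here is an assumption for illustration: the names `AccessInstruction` and `MemoryController`, the use of byte arrays for the memories, and the choice to place transferred data at offset 0 of the data storage unit. The source only says that the instruction designates an external address, a data amount, and a data storage unit.

```python
from dataclasses import dataclass


@dataclass
class AccessInstruction:
    op: str             # "read": external -> data store, "write": data store -> external
    ext_address: int    # address in the external storage unit
    size: int           # amount of data to transfer
    data_store_id: str  # designated data storage unit, e.g. "14a"


class MemoryController:
    """Toy model of the memory control unit (hypothetical)."""

    def __init__(self, external: bytearray, data_stores: dict[str, bytearray]):
        self.external = external
        self.data_stores = data_stores

    def execute(self, ins: AccessInstruction) -> None:
        store = self.data_stores[ins.data_store_id]
        lo, hi = ins.ext_address, ins.ext_address + ins.size
        if ins.op == "read":
            # Prefetch: copy from external storage into the data store.
            store[:ins.size] = self.external[lo:hi]
        elif ins.op == "write":
            # Save: copy from the data store back to external storage.
            self.external[lo:hi] = store[:ins.size]
        else:
            raise ValueError(f"unknown op: {ins.op}")


external = bytearray(range(16))
stores = {"14a": bytearray(4)}
mc = MemoryController(external, stores)
mc.execute(AccessInstruction("read", ext_address=8, size=4, data_store_id="14a"))
print(list(stores["14a"]))
```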

In addition, according to the access instructions input from the task control unit 12, the memory control unit 13 stores in the data storage unit 14, or saves to the external storage unit 20, the various parameter data needed to execute a task, which are included in the next-task execution requests input from the processing operation units 11a to 11n.

More specifically, in response to an access instruction from the task control unit 12 to store (write) parameter data, the memory control unit 13 stores the parameter data designated by the access instruction in whichever of the data storage units 14a to 14n the access instruction designates. In response to an access instruction from the task control unit 12 to save (write) parameter data to the external storage unit 20, the memory control unit 13 writes the parameter data designated by the access instruction to the storage area at the address in the external storage unit 20 designated by the access instruction. Parameter data saved to the external storage unit 20 in this way is read back from the external storage unit 20 in response to an access instruction to read parameter data, which the task control unit 12 inputs as needed, and is stored again in the data storage unit 14 designated by that access instruction.

Each of the data storage units 14a to 14n corresponds to one of the processing operation units 11a to 11n. Each is a memory, for example an SRAM (Static Random Access Memory), serving as a so-called cache memory that stores the data the corresponding processing operation unit 11 uses when executing a task and the data to be used when executing the next task (for example, the result of executing the current task).

Although FIG. 1 shows a configuration in which the arithmetic device 10 includes n data storage units 14, one for each processing operation unit 11, the configuration of the data storage unit 14 is not limited to that of the present embodiment. For example, the same reasoning applies to a configuration in which the arithmetic device includes a single data storage unit whose storage area is divided into as many regions as there are processing operation units 11. However, considering that a plurality of processing operation units 11 may write to and read from different regions of the data storage unit at the same time, it is considered preferable for the data storage unit to be provided per processing operation unit 11, as shown in FIG. 1.

Thus, in the arithmetic device 10, a task output from any processing operation unit 11 passes through the task control unit 12 before being input to the processing operation unit 11 that executes it as the next task. The arithmetic device 10 stores the data used to execute a task in the data storage unit 14 in advance, earlier than the time at which the processing operation unit 11 assigned the next task uses the data stored in the external storage unit 20. In addition, when a processing operation unit 11 does not use certain data in executing a task, the arithmetic device 10 saves that data from the corresponding data storage unit 14 to the external storage unit 20.

Next, the operation of the arithmetic device 10 will be described. In the following description, the arithmetic device 10 is assumed to include four processing operation units 11 and four data storage units 14, namely processing operation units 11a to 11d and data storage units 14a to 14d. Each of the four processing operation units 11a to 11d can execute any of the following ten types of tasks, and the execution time (number of cycles) of each task is as listed below. The average number of cycles over the ten task types is 100 cycles.

Task 0 = 110 cycles
Task 1 = 120 cycles
Task 2 = 130 cycles
Task 3 = 140 cycles
Task 4 = 150 cycles
Task 5 = 90 cycles
Task 6 = 80 cycles
Task 7 = 70 cycles
Task 8 = 60 cycles
Task 9 = 50 cycles
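The stated average of 100 cycles can be checked directly from the list. The following is a minimal sketch; the name `TASK_CYCLES` is illustrative and does not appear in the original text.

```python
# Execution time (in cycles) of each task type, copied from the list above.
# TASK_CYCLES is an illustrative name, not a term used in the patent.
TASK_CYCLES = {0: 110, 1: 120, 2: 130, 3: 140, 4: 150,
               5: 90, 6: 80, 7: 70, 8: 60, 9: 50}

average_cycles = sum(TASK_CYCLES.values()) / len(TASK_CYCLES)
shortest_cycles = min(TASK_CYCLES.values())

print(average_cycles)   # 100.0
print(shortest_cycles)  # 50
```

The shortest execution time, 50 cycles (task 9), is the value that later enters the timing formulas.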

It is assumed that the task control unit 12 knows in advance the number of cycles listed above for each task. When the memory control unit 13 acquires (reads) data stored in the external storage unit 20 in advance, or saves (writes) data to the external storage unit 20, it accesses the external storage unit 20 by DMA. The data transfer time (number of cycles) for a DMA access by the memory control unit 13 to the external storage unit 20 is assumed to be 100 cycles.

<First operation>
First, among the operations of the arithmetic device 10, a first operation is described in which the task control unit 12 outputs to the memory control unit 13, on the basis of the order of the tasks stored in the task queue 121, an access instruction for acquiring (reading) in advance by DMA the data stored in the external storage unit 20 (hereinafter, "DMA read access instruction"). FIG. 2 is a diagram illustrating the schematic configuration of the task control unit 12 included in the arithmetic device 10 of the present embodiment, together with an example of the tasks stored in the task control unit 12.

As described above, the task control unit 12 outputs to the memory control unit 13 a DMA read access instruction for accessing the external storage unit 20 by DMA. For this purpose, the task control unit 12 includes a DMA request generation unit 122, as shown in FIG. 2. The DMA request generation unit 122 outputs the DMA read access instruction to the memory control unit 13 at a timing determined on the basis of the tasks stored sequentially in the task queue 121.

FIG. 2 also shows a state in which a task is stored in each entry of the task queue 121 in the task control unit 12. In FIG. 2, the number following “#” in the task queue 121 is a task number indicating the order in which the task was stored in the task queue 121: “#0” is the task stored first (task 6 in FIG. 2), and “#1” is the task stored second (task 3 in FIG. 2). The task number also indicates the order in which the tasks are output to the processing operation units 11.

The task control unit 12 basically outputs tasks to the processing operation units 11 in the order in which they were stored in the task queue 121. When task 6 at “#0”, which was stored first, is output to a processing operation unit 11, the task numbers of the tasks at “#1” to “#9” each decrease by one. That is, task 3 at “#1” in FIG. 2 becomes task 3 at “#0”, and likewise the tasks at “#2” to “#9” become the tasks at “#1” to “#8”. Because the task to be output to a processing operation unit 11 is always the task at “#0”, the task control unit 12 can easily control the output order of the tasks.
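The renumbering described here is ordinary first-in, first-out behaviour. The following minimal sketch models it with a double-ended queue; the function name `dispatch` is hypothetical, and only the four FIG. 2 entries that the text names explicitly (“#0” to “#3”) are included.

```python
from collections import deque

# Task types at "#0".."#3" as named in the text: task 6, task 3, task 4, task 0.
# The remaining FIG. 2 entries are omitted from this sketch.
queue = deque([6, 3, 4, 0])

def dispatch(q):
    """Remove and return the "#0" task; every remaining task moves up by one."""
    return q.popleft()

first = dispatch(queue)
print(first)        # 6
print(list(queue))  # [3, 4, 0]  (the old "#1" is now "#0")
```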

In the following description, however, for simplicity, the task numbers are treated as fixed rather than changing each time the task at “#0” is output to a processing operation unit 11; the task control unit 12 is described as outputting the tasks stored in the task queue 121 in ascending order of task number, starting from task 6 at “#0”. The description covers the case in which, before task 0 at “#8” and task 1 at “#9” are output to the processing operation units 11, the task control unit 12 acquires (reads) in advance the data used in executing them. The tasks at “#0” to “#7” do not use data stored in the external storage unit 20; that is, no DMA read access instruction is output for the tasks at “#0” to “#7”.

Here, the approach by which the task control unit 12 determines the timing for outputting the DMA read access instruction to the memory control unit 13 is described. The task control unit 12 ensures that, before execution of a task that uses data stored in the external storage unit 20 begins, the access to the external storage unit 20 has finished and the data used in executing the task is already stored in the data storage unit 14.

To this end, the task control unit 12 sets the timing for outputting the DMA read access instruction to the memory control unit 13 so that the execution time (hereinafter, "number of execution cycles") of the tasks executed before the task that uses data stored in the external storage unit 20 (hereinafter, "preceding tasks" and "target task", respectively) is longer than the data transfer time for accessing the external storage unit 20 (hereinafter, "number of transfer cycles"). This timing must satisfy the following formula (1).

((data transfer start order + number of cores) × shortest task execution time ÷ number of cores) ≥ data transfer time ... (1)

The data transfer start order that satisfies formula (1) above is given by the following formula (2).

data transfer start order ≥ (number of cores × (data transfer time ÷ shortest task execution time − 1)) ... (2)
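The step from formula (1) to formula (2) is a short rearrangement. As a sketch, write s for the data transfer start order, N for the number of cores, T_min for the shortest task execution time, and T_DMA for the data transfer time; these symbols are shorthand introduced here and do not appear in the original text. The rearrangement assumes N > 0 and T_min > 0, so the direction of the inequality is preserved.

```latex
\begin{align*}
\frac{(s + N)\,T_{\min}}{N} &\ge T_{\mathrm{DMA}} \\
s + N &\ge \frac{N\,T_{\mathrm{DMA}}}{T_{\min}} \\
s &\ge N\left(\frac{T_{\mathrm{DMA}}}{T_{\min}} - 1\right)
\end{align*}
```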

In the configuration of the arithmetic device 10 in the first operation, the number of cores in formula (2) is 4, the data transfer time is 100 cycles, and the shortest task execution time, that is, the smallest number of execution cycles among the ten task types, is 50 cycles. The data transfer start order for the arithmetic device 10 in the first operation is therefore given by the following formula (3).

data transfer start order ≥ (4 × (100 ÷ 50 − 1)) = 4 ... (3)
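The arithmetic of formula (3) can be reproduced with a small helper. This is a sketch under stated assumptions: the function name and parameter names are hypothetical, and rounding up to an integer position (and clamping at zero) is an assumption for cases where the bound is fractional or negative, which the text does not address.

```python
import math

def data_transfer_start_order(num_cores, transfer_cycles, shortest_task_cycles):
    """Smallest non-negative integer queue position satisfying formula (2)."""
    bound = num_cores * (transfer_cycles / shortest_task_cycles - 1)
    # Rounding up and clamping at 0 are assumptions, not stated in the text.
    return max(0, math.ceil(bound))

# Values of the first operation: 4 cores, 100-cycle DMA, 50-cycle shortest task.
print(data_transfer_start_order(4, 100, 50))  # 4
```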

The task control unit 12 sets the timing for outputting the DMA read access instruction for a target task to the memory control unit 13 to the point at which the target task reaches the data transfer start order of formula (3). More specifically, when task 0 at “#8” in FIG. 2 is the target task, the DMA read access instruction for it is output to the memory control unit 13 when its position in the output order becomes 4, that is, “#4”. Likewise, when task 1 at “#9” in FIG. 2 is the target task, the DMA read access instruction for it is output to the memory control unit 13 when its position in the output order becomes 4, that is, “#4”.

At the determined timing, the task control unit 12 instructs the DMA request generation unit 122 to output the DMA read access instruction to the memory control unit 13. In response, the DMA request generation unit 122 outputs the DMA read access instruction to the memory control unit 13, and the memory control unit 13 reads the data from the external storage unit 20 in accordance with the DMA read access instruction received from the DMA request generation unit 122.

The task control unit 12 stores the data transfer start order = 4 of formula (3) in advance. However, the data transfer start order need not be stored in the task control unit 12; for example, it may be stored in a control unit (not shown) that controls the arithmetic device 10, and that control unit may output the stored data transfer start order to the task control unit 12.

FIG. 3 is a timing chart showing the timing of task distribution and data transfer in the first operation by the task control unit 12 included in the arithmetic device 10 of the present embodiment. The timing chart in FIG. 3 shows the case in which the four processing operation units 11a to 11d finish executing their previous tasks at the same time, after which the task control unit 12 outputs tasks to the processing operation units 11 one after another.

More specifically, the task control unit 12 outputs the first task, task 6 at “#0”, to the processing operation unit 11a; the second task, task 3 at “#1”, to the processing operation unit 11b; the third task, task 4 at “#2”, to the processing operation unit 11c; and the fourth task, task 0 at “#3”, to the processing operation unit 11d. At this point task 0 at “#8” becomes the task at position 4 (“#4”), so the task control unit 12 instructs the DMA request generation unit 122 to output to the memory control unit 13 the DMA read access instruction for task 0 at “#8” in FIG. 2. The DMA request generation unit 122 accordingly outputs the DMA read access instruction for task 0 at “#8” to the memory control unit 13. In accordance with that instruction, the memory control unit 13 reads the data for task 0 at “#8” from the external storage unit 20 and stores it in the data storage unit 14d, which corresponds to the processing operation unit 11d that will execute task 0 at “#8”.

Then, when the processing operation unit 11a finishes executing the first task, task 6 at “#0”, the task control unit 12 outputs the fifth task, task 3 at “#4”, to the processing operation unit 11a. At this point task 1 at “#9” becomes the task at position 4 (“#4”), so the task control unit 12 instructs the DMA request generation unit 122 to output to the memory control unit 13 the DMA read access instruction for task 1 at “#9” in FIG. 2. The DMA request generation unit 122 accordingly outputs the DMA read access instruction for task 1 at “#9” to the memory control unit 13. After it has finished reading the data for task 0 at “#8” from the external storage unit 20, the memory control unit 13 reads the data for task 1 at “#9” from the external storage unit 20 in accordance with the DMA read access instruction received from the DMA request generation unit 122, and stores it in the data storage unit 14b, which corresponds to the processing operation unit 11b that will execute task 1 at “#9”.
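The two reads described above are served one after the other. The following minimal sketch shows the resulting timing, assuming the first four tasks are dispatched at cycle 0 with no dispatch overhead and that the memory control unit handles requests strictly in arrival order; the function `serialize_dma` and the cycle-0 origin are assumptions made for illustration.

```python
DMA_TRANSFER_CYCLES = 100  # transfer time stated in the text

def serialize_dma(request_times, transfer_cycles=DMA_TRANSFER_CYCLES):
    """Return (start, end) per request when reads run one at a time, in order."""
    schedule = []
    busy_until = 0
    for requested_at in request_times:
        start = max(requested_at, busy_until)  # wait for the previous read
        end = start + transfer_cycles
        schedule.append((start, end))
        busy_until = end
    return schedule

# Request for "#8" at cycle 0; request for "#9" at cycle 80, when task 6
# (80 cycles) finishes on processing operation unit 11a.
print(serialize_dma([0, 80]))  # [(0, 100), (100, 200)]
```

Under these assumptions the read for “#9” is requested at cycle 80 but cannot start until the read for “#8” completes at cycle 100.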

In this way, the task control unit 12 stores the data used in executing the target task in the data storage unit 14 in advance, earlier than the timing at which the processing operation unit 11 assigned the target task uses the data stored in the external storage unit 20.

When the data used in executing the target task is to be stored in the corresponding data storage unit 14 in advance, it is first checked whether that data storage unit 14 still holds earlier data needed for arithmetic processing, and the data for the target task is stored in advance only if no such data remains. Accordingly, if the corresponding data storage unit 14 still holds earlier data needed for arithmetic processing, the currently stored data must first be saved (written) to the external storage unit 20. For this reason, it is preferable that the task control unit 12 output the DMA read access instruction for the target task to the memory control unit 13 at a timing earlier than the determined data transfer start order. One conceivable method is to determine the data transfer start order by doubling the data transfer time in formula (2).
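As a worked example of the doubling suggested above, formula (2) can be evaluated with the transfer time set to 200 cycles instead of 100. This is a minimal sketch using the values of the first operation; the constant names are illustrative, and integer division is used only because 200 is an exact multiple of 50.

```python
NUM_CORES = 4
TRANSFER_CYCLES = 100
SHORTEST_TASK_CYCLES = 50

# Formula (2) with the transfer time doubled to leave room for a write-back.
doubled = 2 * TRANSFER_CYCLES
start_order = NUM_CORES * (doubled // SHORTEST_TASK_CYCLES - 1)
print(start_order)  # 12
```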

Next, the processing procedure by which the task control unit 12 stores the data used by the target task in the data storage unit 14 before the target task is executed is described. FIG. 4 is a flowchart showing the processing procedure in the first operation by the task control unit 12 included in the arithmetic device 10 of the present embodiment. In the following description, for simplicity, the DMA request generation unit 122 is assumed to determine the timing for outputting the DMA read access instruction for the target task to the memory control unit 13. The data transfer start order for the target task is assumed to be 4.

Each time the task control unit 12 outputs a task stored in the task queue 121 to a processing operation unit 11, the DMA request generation unit 122 executes the process shown in FIG. 4 for determining the timing at which to output the DMA read access instruction to the memory control unit 13. First, when the task control unit 12 outputs the first task, task 6 at “#0”, stored in the task queue 121 to the processing operation unit 11a, the DMA request generation unit 122 clears the task number i to 0 in step S1 and starts the determination process from the task now at “#0” in the task queue 121 (the second task, task 3 at “#1”, in FIG. 2). In FIG. 4, QUEUE-MAX is the maximum task number.

In the determination process, the DMA request generation unit 122 checks whether the task at “#0” (the second task, task 3 at “#1”, in FIG. 2) is a target task that uses data stored in the external storage unit 20 (step S11). If it is not a target task (“NO” in step S11), 1 is added to the task number i in step S1, giving i = 1, and the check is repeated for the task now at “#1” in the task queue 121 (the third task, task 4 at “#2”, in FIG. 2).

The second task, task 3 at “#1”, through the eighth task, task 6 at “#7”, in FIG. 2 are not target tasks that use data stored in the external storage unit 20, so the check in step S11 repeatedly yields “NO”. When the task number i = 7, the task now at “#7” (the ninth task, task 0 at “#8”, in FIG. 2) is a target task, so the check in step S11 yields “YES”.

When the check in step S11 yields “YES”, the DMA request generation unit 122 acquires the data transfer start order = 4 for the target task (step S12).

Next, the DMA request generation unit 122 checks whether the acquired data transfer start order = 4 equals the task number i, that is, whether the target task is at position 4 (“#4”) (step S13). If the target task is not at “#4” (“NO” in step S13), 1 is added to the task number i in step S1 and the check is performed for the next task stored in the task queue 121 (the tenth task, task 1 at “#9”, in FIG. 2).

If, in step S13, the target task is at “#4” (“YES” in step S13), the DMA request generation unit 122 outputs to the memory control unit 13 a DMA read access instruction for acquiring in advance, from the external storage unit 20, the data used by the task at “#4” (the ninth task, task 0 at “#8”, in FIG. 2) (step S14). In accordance with the DMA read access instruction received from the DMA request generation unit 122, the memory control unit 13 reads the data for task 0 at “#8” from the external storage unit 20 and stores it in the data storage unit 14d, which corresponds to the processing operation unit 11d that will execute task 0 at “#8”.

Thereafter, in the same way, the determination process shown in FIG. 4 is executed each time the task control unit 12 outputs a task stored in the task queue 121 to a processing operation unit 11. As a result, after it has finished reading the data for task 0 at “#8” from the external storage unit 20, the memory control unit 13 reads the data for task 1 at “#9” from the external storage unit 20 in accordance with the DMA read access instruction received from the DMA request generation unit 122, and stores it in the data storage unit 14b, which corresponds to the processing operation unit 11b that will execute task 1 at “#9”.

In this way, in the first operation of the task control unit 12, each time a task stored in the task queue 121 is output to a processing operation unit 11, the position in the output order of each target task that uses data stored in the external storage unit 20 is checked. The data used in executing the target task is thereby stored in the data storage unit 14 in advance, before the processing operation unit 11 assigned the target task executes it. The arithmetic device 10 can thus prevent cache misses on the data used by each processing operation unit 11.
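The FIG. 4 procedure can be sketched as a scan that runs after every dispatch. This is an illustrative model only, not the patented implementation: the function `check_queue`, the tuple layout, and the use of the original FIG. 2 labels to identify tasks are all assumptions made for readability.

```python
from collections import deque

START_ORDER = 4  # data transfer start order from formula (3)

def check_queue(queue, start_order=START_ORDER):
    """One pass of the FIG. 4 loop; returns labels of tasks needing a DMA read now."""
    requests = []
    for i, (label, is_target) in enumerate(queue):  # i: current task number
        if not is_target:           # step S11: "NO"
            continue
        if i == start_order:        # steps S12 and S13
            requests.append(label)  # step S14: issue the DMA read access instruction
    return requests

# Queue entries are (original FIG. 2 label, uses external data?).
# Only "#8" and "#9" are target tasks, as stated in the text.
queue = deque((f"#{n}", n in (8, 9)) for n in range(10))

issued = []
while queue:
    queue.popleft()                 # dispatch the current "#0" task
    issued.append(check_queue(queue))

print(issued[:6])  # [[], [], [], ['#8'], ['#9'], []]
```

The read for “#8” is requested after the fourth dispatch and the read for “#9” after the fifth, matching the sequence described for FIG. 3.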

As described above, when the data used in executing the target task is to be stored in the corresponding data storage unit 14 in advance, it must be checked whether that data storage unit 14 holds earlier data needed for arithmetic processing, and if it does, the currently stored data must be saved (written) to the external storage unit 20. For this reason it is preferable, for example, to provide between step S11 and step S12 of the flowchart in FIG. 4 a step that checks whether the data storage unit 14 holds earlier data needed for arithmetic processing, and to execute step S12 and the subsequent steps when this step confirms that no such data is stored. When this step confirms that such data is stored, step S12 and the subsequent steps are executed after the data currently stored in the data storage unit 14 has been saved (written) to the external storage unit 20.
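The additional check between steps S11 and S12 amounts to a conditional write-back before the read. The sketch below shows only the ordering of the two operations; the function name and the string labels are hypothetical.

```python
def plan_transfers(store_holds_needed_data):
    """Order of DMA operations for one target task's data storage unit."""
    operations = []
    if store_holds_needed_data:
        # Added check between S11 and S12: save the current contents first.
        operations.append("write back to external storage")
    # Step S12 onward: fetch the target task's data.
    operations.append("read target-task data from external storage")
    return operations

print(plan_transfers(False))  # ['read target-task data from external storage']
print(plan_transfers(True))
```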

<Second operation>
Next, the operation of the arithmetic device 10, in particular the second operation of the task control unit 12, is described. In the second operation, the task control unit 12 outputs to the memory control unit 13 a DMA read access instruction for acquiring (reading) in advance by DMA the data stored in the external storage unit 20, on the basis of the execution time of the preceding tasks that the task control unit 12 predicts from the tasks stored in the task queue 121. In the description of the second operation as well, the schematic configuration of the task control unit 12 is the same as that shown in FIG. 2, and the tasks stored in the task control unit 12 are assumed to be those shown in FIG. 2.

FIG. 5 is a flowchart showing the processing procedure in the second operation by the task control unit 12 included in the arithmetic device 10 of the present embodiment. In the following description as well, for simplicity, the DMA request generation unit 122 is assumed to determine the timing for outputting the DMA read access instruction for the target task to the memory control unit 13. The timing for outputting the DMA read access instruction for the target task to the memory control unit 13 (hereinafter, "data transfer start timing") is assumed to equal the number of transfer cycles for accessing the external storage unit 20, that is, 100 cycles.

The task control unit 12 may store the data transfer start timing = 100 in advance. Alternatively, for example, the data transfer start timing may be stored in a control unit (not shown) that controls the arithmetic device 10, and that control unit may output the stored data transfer start timing to the task control unit 12.

Each time the task control unit 12 outputs a task stored in the task queue 121 to a processing operation unit 11, the DMA request generation unit 122 executes the process shown in FIG. 5 for determining the timing at which to output the DMA read access instruction to the memory control unit 13. First, the DMA request generation unit 122 clears the task number i to 0 in step S2. It then starts the determination process from the first task, task 6 at “#0”, stored in the task queue 121.

In the determination process, the DMA request generation unit 122 checks whether the first task, task 6 at “#0”, is a target task that uses data stored in the external storage unit 20 (step S21). If it is not (“NO” in step S21), 1 is added to the task number i in step S2, giving i = 1, and the check is repeated for the second task, task 3 at “#1”, stored in the task queue 121.

The first task, task 6 at “#0”, through the eighth task, task 6 at “#7”, in FIG. 2 are not target tasks that use data stored in the external storage unit 20, so the check in step S21 repeatedly yields “NO”. When the task number i = 8, the ninth task, task 0 at “#8”, is a target task, so the check in step S21 yields “YES”.

When the check in step S21 yields “YES”, the DMA request generation unit 122 calculates how many preceding tasks would be assigned to each processing operation unit 11 if the preceding tasks executed before the target task (the first task, task 6 at “#0”, through the eighth task, task 6 at “#7”) were each assigned to a processing operation unit 11; that is, it calculates the number of preceding tasks per processing operation unit 11, NUM-OF-MIN (step S22). For example, in the state of the task queue 121 shown in FIG. 2, NUM-OF-MIN is 2. The DMA request generation unit 122 also clears the values of MIN[NUM-OF-MIN], the minimum numbers of execution cycles per processing operation unit 11 for executing the assigned preceding tasks, to the “maximum value” (0xFF in FIG. 5).
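Step S22 can be illustrated with the FIG. 2 numbers: eight preceding tasks shared among four processing operation units gives two per unit. The sketch below assumes an even split rounded down, which the text does not spell out, and models MIN[NUM-OF-MIN] as a list with one slot per preceding task on a unit; all variable names are illustrative.

```python
NUM_CORES = 4
MAX_VALUE = 0xFF  # initial "maximum value" used in FIG. 5

target_index = 8                         # "#8" task 0 is the first target task
num_preceding = target_index             # tasks "#0" to "#7" precede it
num_of_min = num_preceding // NUM_CORES  # assumption: even split, rounded down

# One slot per preceding task that a processing operation unit would run.
min_cycles = [MAX_VALUE] * num_of_min

print(num_of_min)  # 2
print(min_cycles)  # [255, 255]
```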

Next, in step S23, the DMA request generation unit 122 clears the preceding-task count k per processing operation unit 11 to "0" and determines the minimum number of execution cycles expected when NUM-OF-MIN tasks are assigned to each processing operation unit 11. To do so, the DMA request generation unit 122 starts the process of updating the values of MIN[NUM-OF-MIN], the minimum numbers of execution cycles per processing operation unit 11, to values that reflect the execution cycle counts of the preceding tasks.

In this update process, the DMA request generation unit 122 first clears the preceding task number j to "0" in step S24. It then checks whether the number of execution cycles of task 6 at "#0", the first preceding task, is smaller than the current value of MIN[k] (step S25).

Here, because MIN[NUM-OF-MIN] was set to the "maximum value" in step S22, the check in step S25 yields "YES", and the number of execution cycles of task 6 at "#0" (= 80) is stored in MIN[0] (step S26). If the check in step S25 yields "NO", the preceding task number j is incremented by 1 in step S24 to give j = 1, and step S25 is repeated to check whether the number of execution cycles of task 3 at "#1", the second entry stored in the task queue 121, is smaller than MIN[0].

In the same way, the loop of step S24 leaves the smallest number of execution cycles among the preceding tasks in MIN[0]. When the loop of step S24 has been completed for all preceding tasks, the DMA request generation unit 122 increments the preceding-task count k by 1 in step S23 to give k = 1, and then updates MIN[1] by running the loop of step S24 again.

In the second and subsequent updates of MIN[NUM-OF-MIN], a preceding task that has already been adopted is excluded, so that the execution cycle count of a preceding task used in an earlier update is not used again. For example, in the state of the task queue 121 shown in FIG. 2, the number of execution cycles of task 7 at "#5" (= 70) becomes the value of MIN[0], so the update of MIN[1] does not check the execution cycle count of task 7 at "#5". As a result, in the state of the task queue 121 shown in FIG. 2, the number of execution cycles of task 7 at "#6" (= 70) becomes the value of MIN[1].

Through the loop of step S23, the values of MIN[NUM-OF-MIN] are updated in turn to the smallest execution cycle counts among the preceding tasks.
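The nested loops of steps S23 to S26 can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name and the `used` bookkeeping are illustrative, and of the cycle counts only those of "#0" (80), "#5" (70) and "#6" (70) come from the description; the others are made up for the example.

```python
from typing import List

MAX_CYCLES = 0xFF  # "maximum value" used to initialize MIN[k] (0xFF in FIG. 5)

def smallest_cycle_counts(cycles: List[int], num_of_min: int) -> List[int]:
    """Pick the num_of_min smallest cycle counts, using each task at most once."""
    used = [False] * len(cycles)
    mins = [MAX_CYCLES] * num_of_min           # step S22
    for k in range(num_of_min):                # loop of step S23
        best = None
        for j, c in enumerate(cycles):         # loop of step S24
            if not used[j] and c < mins[k]:    # step S25
                mins[k] = c                    # step S26
                best = j
        if best is not None:
            used[best] = True                  # an adopted task is not reused
    return mins

# Cycle counts for #0..#7. Only #0, #5 and #6 are from the text;
# the remaining values are hypothetical.
preceding = [80, 90, 100, 120, 110, 70, 70, 80]
mins = smallest_cycle_counts(preceding, 2)
```

With these inputs MIN[0] is taken from "#5" and MIN[1] from "#6", so both entries are 70.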

Next, the DMA request generation unit 122 calculates the total value MIN-SUM by adding up all the values of MIN[k] (step S27). It then determines, on the basis of MIN-SUM, whether the present time is the data transfer start timing (step S28). In step S28, the DMA request generation unit 122 determines that the present time is the data transfer start timing ("YES" in step S28) when, for example, the value of the data transfer start timing is smaller than MIN-SUM (data transfer start timing < MIN-SUM) and larger than the average number of execution cycles of the preceding tasks executed by one processing operation unit 11 (data transfer start timing > MIN-SUM − MIN-SUM / NUM-OF-MIN).

For example, in the state of the task queue 121 shown in FIG. 2, NUM-OF-MIN = 2, MIN[0] = 70, and MIN[1] = 70, so MIN-SUM = 140. Accordingly, the DMA request generation unit 122 determines that the present time is the data transfer start timing when the data transfer start timing satisfies 140 > data transfer start timing > 140 − 140 / 2 (= 70).
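The determination of steps S27 and S28 can be written out as a short Python sketch. The function name and the use of a plain integer for the data transfer start timing are assumptions for illustration; the window itself follows the inequality stated above.

```python
def is_transfer_start(timing: int, min_sum: int, num_of_min: int) -> bool:
    """Step S28: MIN-SUM > timing > MIN-SUM - MIN-SUM / NUM-OF-MIN."""
    lower = min_sum - min_sum / num_of_min
    return lower < timing < min_sum

min_sum = sum([70, 70])                    # step S27 for the FIG. 2 example
inside = is_transfer_start(100, min_sum, 2)
at_lower_bound = is_transfer_start(70, min_sum, 2)
at_upper_bound = is_transfer_start(140, min_sum, 2)
```

Both bounds are strict, so in the FIG. 2 example a timing of exactly 70 or 140 does not satisfy the condition.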

If it is determined in step S28 that the present time is not the data transfer start timing ("NO" in step S28), the task number i is incremented by 1 in step S2, and the check is performed for the next task stored in the task queue 121 (task 1 at "#9", the tenth entry shown in FIG. 2).

If it is determined in step S28 that the present time is the data transfer start timing ("YES" in step S28), the DMA request generation unit 122 outputs to the memory control unit 13 a DMA read access instruction for acquiring in advance, from the external storage unit 20, the data used by task 0 at "#8", the ninth entry (step S29). In response to the DMA read access instruction input from the DMA request generation unit 122, the memory control unit 13 then reads the data corresponding to task 0 at "#8" from the external storage unit 20 and stores it in the data storage unit 14d corresponding to the processing operation unit 11d that executes task 0 at "#8", for example with the same timing of task distribution and data transfer as in the first operation shown in FIG. 3.

Thereafter, in the same way, the process shown in FIG. 5 for determining the timing for outputting the DMA read access instruction to the memory control unit 13 is executed each time the task control unit 12 outputs a task stored in the task queue 121 to a processing operation unit 11. Accordingly, with the same timing of task distribution and data transfer as in the first operation shown in FIG. 3, once the data corresponding to task 0 at "#8" has been read from the external storage unit 20, the memory control unit 13 reads the data corresponding to task 1 at "#9" from the external storage unit 20 in response to the DMA read access instruction input from the DMA request generation unit 122, and stores it in the data storage unit 14b corresponding to the processing operation unit 11b that executes task 1 at "#9".

In this way, in the second operation of the task control unit 12, each time a task stored in the task queue 121 is output to a processing operation unit 11, the execution time of the preceding tasks is predicted, and the timing for outputting to the memory control unit 13 the DMA read access instruction for acquiring the data used by the target task is determined from the predicted execution time and the data transfer start timing. As a result, in the second operation as well, the data used when the target task is executed can be stored in the data storage unit 14 in advance, before the processing operation unit 11 to which the target task is assigned executes it. The arithmetic device 10 can thereby prevent cache misses on the data used by each processing operation unit 11.

Note that when the data used for executing the target task is stored in advance in the corresponding data storage unit 14, it is necessary to check whether that data storage unit 14 still holds earlier data that is needed for arithmetic processing and, if it does, to save (write) the currently stored data to the external storage unit 20. For this reason, it is desirable, for example, to provide a step between step S28 and step S29 of the flowchart shown in FIG. 5 that checks whether the data storage unit 14 holds earlier data needed for arithmetic processing, and to execute step S29 when this step confirms that no such data is stored. If this step confirms that such data is stored, the data currently held in the data storage unit 14 is saved (written) to the external storage unit 20 before step S29 is executed. For this reason, it is desirable that the data transfer start timing determined in step S28 leave a margin relative to the number of transfer cycles the memory control unit 13 needs to access the external storage unit 20 by DMA, for example by doubling the number of transfer cycles for accessing the external storage unit 20.
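The additional check between step S28 and step S29 can be illustrated with a small Python sketch. Everything here is hypothetical scaffolding: the dictionary standing in for the data storage units 14, the string labels, and the log of operations are assumptions chosen only to show the order "save first, then read".

```python
from typing import Dict, List, Optional

def prefetch(store: Dict[str, Optional[str]], unit: str,
             new_data: str, log: List[str]) -> None:
    """Save earlier data that is still needed, then issue the DMA read."""
    held = store.get(unit)
    if held is not None:
        # Earlier data is still needed: write it to external storage first.
        log.append(f"write {held} to external storage")
    # Step S29: read the target task's data into the data storage unit.
    log.append(f"read {new_data} into data storage {unit}")
    store[unit] = new_data

ops: List[str] = []
storage: Dict[str, Optional[str]] = {"14d": "earlier data", "14b": None}
prefetch(storage, "14d", "data for #8", ops)
prefetch(storage, "14b", "data for #9", ops)
```

Because the write precedes the read for unit 14d, that transfer takes roughly twice as long, which is why the description suggests leaving a margin in the data transfer start timing.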

<Third operation>
Next, the operation of the arithmetic device 10, in particular the third operation of the task control unit 12, is described. In the third operation, a DMA read access instruction for acquiring (reading) in advance, by DMA, the data stored in the external storage unit 20 is output to the memory control unit 13 on the basis of the priority of each task stored in the task queue 121.

FIG. 6 is a diagram illustrating the schematic configuration of the task control unit 12 provided in the arithmetic device 10 of the present embodiment, together with another example of the tasks stored in the task control unit 12. In the description of the third operation, too, the schematic configuration of the task control unit 12 is the same as that shown in FIG. 2. However, as shown in FIG. 6, the tasks stored in the task control unit 12 are assumed to differ from those shown in FIG. 2.

For the tasks shown in FIG. 6, as for those shown in FIG. 2, the number following "#" (sharp) in the task queue 121 is the task number, which indicates the order in which each task was stored in the task queue 121. In the third operation, the larger the number following "task", the higher the priority; that is, the priorities are task 9 > task 8 > task 7 > task 6 > task 5 > task 4 > task 3 > task 2 > task 1 > task 0.

FIG. 7 is a flowchart showing the processing procedure of the third operation performed by the task control unit 12 provided in the arithmetic device 10 of the present embodiment. In the following description, it is assumed that the tasks at "#0" to "#7" are not target tasks and that the reordering of their output to the processing operation units 11 according to priority has already been completed. The description covers the case where the order in which task 0 at "#8" and task 1 at "#9" are output to the processing operation units 11 is rearranged according to priority.

Each time the task control unit 12 outputs a task stored in the task queue 121 to a processing operation unit 11, the DMA request generation unit 122 executes the process shown in FIG. 7, which rearranges, according to priority, the order in which tasks are output to the processing operation units 11. First, in step S3, the DMA request generation unit 122 clears the highest priority PRI-MAX to "-1" and the task number of the highest-priority task, PRI-MAX-IDX, to "0".

Next, in step S31, the DMA request generation unit 122 sets the task number i to "8" and starts the reordering process from task 0 at "#8", the ninth entry stored in the task queue 121.

In the reordering process, the DMA request generation unit 122 checks whether task 0 at "#8" has already been output to a processing operation unit 11 (step S32). If it has ("YES" in step S32), the task number i is incremented by 1 in step S31, that is, i = 9, and the check is repeated for task 1 at "#9", the tenth entry stored in the task queue 121.

If the check in step S32 shows that task 0 at "#8" has not yet been output to a processing operation unit 11 ("NO" in step S32), the DMA request generation unit 122 checks in step S33 whether the priority of task 0 at "#8" is higher than the highest priority PRI-MAX. If it is not higher, that is, if the priority of task 0 at "#8" is lower than PRI-MAX ("NO" in step S33), the task number i is incremented by 1 in step S31, and the check is repeated for task 1 at "#9", the tenth entry stored in the task queue 121.

If the check in step S33 shows that the priority of task 0 at "#8" is higher than PRI-MAX ("YES" in step S33), the DMA request generation unit 122 sets PRI-MAX to the priority value of task 0 at "#8" in step S34 and sets the highest-priority task number PRI-MAX-IDX to "#8". It then increments the task number i by 1 in step S31 and starts the reordering process for task 1 at "#9", the tenth entry stored in the task queue 121.

When the loop of step S31 is complete, that is, when the reordering process has been completed for all tasks stored in the task queue 121, the DMA request generation unit 122 finally outputs to the memory control unit 13 a DMA read access instruction for acquiring in advance, from the external storage unit 20, the data used by the task whose task number is PRI-MAX-IDX (step S35).

As described above, in the third operation a larger number following "task" means a higher priority, so the priorities are task 1 at "#9" > task 0 at "#8". Accordingly, when the loop of step S31 is complete, that is, when the reordering process for task 1 at "#9" is complete, PRI-MAX holds the priority of task 1 at "#9" and PRI-MAX-IDX is "#9". The task control unit 12 therefore outputs the DMA read access instruction for task 1 at "#9" (the tenth entry) to the memory control unit 13 before the DMA read access instruction for task 0 at "#8" (the ninth entry).
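The selection performed by steps S31 to S35 amounts to a linear scan for the highest-priority task that has not yet been output. The Python sketch below is illustrative only: the function name, the `issued` flags, and the `None` return for an empty scan are assumptions, and the priorities of "#1" to "#3" are made up because the description does not state them.

```python
from typing import List, Optional

def pick_highest_priority(priorities: List[int], issued: List[bool],
                          start: int) -> Optional[int]:
    """Return the task number whose data should be fetched first."""
    pri_max = -1        # PRI-MAX cleared to -1
    pri_max_idx = 0     # PRI-MAX-IDX cleared to 0
    for i in range(start, len(priorities)):   # loop of step S31
        if issued[i]:                          # step S32: already output
            continue
        if priorities[i] > pri_max:            # step S33
            pri_max = priorities[i]            # step S34
            pri_max_idx = i
    # Assumption: report None when no candidate was found.
    return pri_max_idx if pri_max >= 0 else None

# Priority equals the number following "task". Entries for #1..#3
# are hypothetical; #0..#7 have already been output.
priorities = [7, 6, 5, 5, 4, 3, 3, 0, 0, 1]
issued = [True] * 8 + [False, False]
first_fetch = pick_highest_priority(priorities, issued, start=8)
```

Starting from i = 8, the scan selects task number 9, so the DMA read for task 1 at "#9" is requested before the one for task 0 at "#8".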

FIG. 8 is a timing chart showing the timing of task distribution and data transfer in the third operation performed by the task control unit 12 provided in the arithmetic device 10 of the present embodiment. The timing chart in FIG. 8 shows the case where the task control unit 12 sequentially outputs task 7 at "#0" through task 0 at "#7" to the four processing operation units 11a to 11d provided in the arithmetic device 10, and then outputs task 1 at "#9" to the processing operation unit 11d before task 0 at "#8".

More specifically, task 4 at "#4" is output to the processing operation unit 11a, task 3 at "#5" to the processing operation unit 11b, task 3 at "#6" to the processing operation unit 11c, and task 0 at "#7" to the processing operation unit 11d. Then task 1 at "#9", which has a higher priority than task 0 at "#8" (the next entry stored in the task queue 121), is output to the processing operation unit 11d, the first of the processing operation units 11a to 11d to complete its assigned task. After that, task 0 at "#8", which has a lower priority than task 1 at "#9", is output to the processing operation unit 11b, which is then the earliest to complete its assigned task.

Even in this case, through the third operation, the task control unit 12 outputs the DMA read access instruction for the higher-priority task 1 at "#9" to the memory control unit 13 before the DMA read access instruction for the lower-priority task 0 at "#8". Accordingly, in response to the DMA read access instruction input from the task control unit 12, the memory control unit 13 reads the data corresponding to task 1 at "#9" from the external storage unit 20 and stores it in the data storage unit 14d corresponding to the processing operation unit 11d that executes task 1 at "#9". After that, in response to the DMA read access instruction input from the task control unit 12, the memory control unit 13 reads the data corresponding to task 0 at "#8" from the external storage unit 20 and stores it in the data storage unit 14b corresponding to the processing operation unit 11b that executes task 0 at "#8".

In this way, in the third operation of the task control unit 12, the priority of the target tasks in the task queue 121 is checked each time a task stored in the task queue 121 is output to a processing operation unit 11, so that the data used when a target task is executed can be stored in the data storage unit 14 in advance, before the processing operation unit 11 to which the target task is assigned executes it.

In addition, in the third operation of the task control unit 12, the order of the tasks output to the processing operation units 11 can be rearranged according to the priority of each task.

An example of rearranging the order of the tasks output to the processing operation unit 11 according to the priority of each task is now described. FIGS. 9 to 11 are diagrams illustrating an example in which the order of the tasks output to the processing operation unit 11 is changed in the third operation performed by the task control unit 12 provided in the arithmetic device 10 of the present embodiment. FIG. 9 shows the priority relationships among the tasks in this example. FIG. 10 shows the tasks stored in the task queue 121 before the order is changed and the timing at which the processing operation unit 11 executes each task. FIG. 11 shows the tasks stored in the task queue 121 after the order is changed and the timing at which the processing operation unit 11 executes each task. To simplify the description, the following assumes that the arithmetic device 10 has only one processing operation unit 11.

First, the priority relationships among the tasks in this example are described with reference to FIG. 9. In this example, when the processing operation unit 11 executes a task, lower-level tasks related to that task are generated, as shown in FIG. 9. More specifically, executing task 0 generates the lower-level tasks 0-0 and 0-1, and executing task 0-0 in turn generates the lower-level tasks 0-0-0 and 0-0-1. Likewise, executing task 1 generates the lower-level task 1-0.

In this example, as shown in FIG. 9, the priority relationships are as follows: the larger the number following "task", the lower the priority; within the same series, a higher-level task has a lower priority; and among tasks at the same level, the larger the number at that level, the lower the priority. More specifically, the priorities of task 0 and task 1 are task 0 > task 1; those of task 0, task 0-0, and task 0-0-0 are task 0-0-0 > task 0-0 > task 0; and those of task 0-0 and task 0-1 are task 0-0 > task 0-1. That is, the priorities of the tasks shown in FIG. 9 are task 0-0-0 > task 0-0-1 > task 0-0 > task 0-1 > task 1-0 > task 0 > task 1.
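One way to reproduce the overall order listed above is to rank deeper tasks first and break ties by the smaller numbers. The Python sketch below uses that reading; the key function and the string encoding of task names are assumptions made for illustration, and the rule is inferred from the listed order rather than stated in this form in the description.

```python
from typing import List, Tuple

def priority_key(name: str) -> Tuple[int, List[int]]:
    """Sort key: deeper levels first, then smaller numbers first."""
    path = [int(part) for part in name.split("-")]
    return (-len(path), path)

tasks = ["0", "1", "0-0", "0-1", "1-0", "0-0-0", "0-0-1"]
by_priority = sorted(tasks, key=priority_key)   # highest priority first
```

Sorting the seven tasks of FIG. 9 with this key yields 0-0-0, 0-0-1, 0-0, 0-1, 1-0, 0, 1, which is the order given in the text.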

When tasks related as shown in FIG. 9 are stored in the task queue 121 in the order in which they are generated, then, as shown in FIG. 10(a) for example, task 0 is stored at "#0", task 1 at "#1", task 0-0 at "#2", task 1-0 at "#3", task 0-0-0 at "#4", task 0-0-1 at "#5", and task 0-1 at "#6". If a single processing operation unit 11 executes the tasks stored in the task queue 121 in this order, then, as shown in FIG. 10(b), all tasks in the task 1 series are complete when task 1-0 at "#3" finishes, and all tasks in the task 0 series are complete only later, when task 0-1 at "#6" finishes. The series therefore do not complete in the order given by the priority relationship between task 0 and task 1 described above (task 0 > task 1).

In the arithmetic device 10, however, the third operation of the task control unit 12 checks the priorities of the tasks stored in the task queue 121 and rearranges the order of the tasks output to the processing operation unit 11 according to those priorities, so that the task 0 series and the task 1 series can be completed in priority order.

More specifically, after tasks related as shown in FIG. 9 have been stored in the task queue 121 in the order in which they are generated, the order in which they are output to the processing operation unit 11 is rearranged. As a result, as shown in FIG. 11(a) for example, the tasks are output to the processing operation unit 11 in the same order as if task 0 were stored at "#0", task 1 at "#1", task 0-0 at "#2", task 0-0-0 at "#3", task 0-0-1 at "#4", task 0-1 at "#5", and task 1-0 at "#6" of the task queue 121. When a single processing operation unit 11 executes the tasks in this order, then, as shown in FIG. 11(b), all tasks in the task 0 series are complete when task 0-1 at "#5" finishes, and all tasks in the task 1 series are complete when task 1-0 at "#6" finishes. The series thus complete in the order given by the priority relationship between task 0 and task 1 described above (task 0 > task 1).
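The difference between the two queue orders can be checked with a short Python sketch that records the position at which the last task of each series finishes. The function name and the string encoding of task names are assumptions for illustration; the two orders are those described for FIG. 10(a) and FIG. 11(a), executed by a single processing operation unit.

```python
from typing import Dict, List

def series_completion(order: List[str]) -> Dict[str, int]:
    """Map each series (leading number) to the position of its last task."""
    last: Dict[str, int] = {}
    for pos, name in enumerate(order):
        last[name.split("-")[0]] = pos
    return last

fig10_order = ["0", "1", "0-0", "1-0", "0-0-0", "0-0-1", "0-1"]
fig11_order = ["0", "1", "0-0", "0-0-0", "0-0-1", "0-1", "1-0"]
before = series_completion(fig10_order)
after = series_completion(fig11_order)
```

Before reordering, the task 1 series finishes at "#3" and the task 0 series at "#6"; after reordering, the task 0 series finishes at "#5" and the task 1 series at "#6".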

In this way, by rearranging the order of the tasks output to the processing operation unit 11 according to the priority of each task, the task control unit 12 can complete the execution of the tasks in priority order.

As described above, in the third operation of the task control unit 12, the priority of the target tasks in the task queue 121 is checked each time a task stored in the task queue 121 is output to the processing operation unit 11. This allows the data used when executing a target task to be stored in the data storage unit 14 in advance, at a timing earlier than the processing operation unit 11 to which the target task is assigned executes it. As a result, the arithmetic device 10 can rearrange the order in which tasks are output to the processing operation unit 11 according to the priority of each task, while also preventing cache misses for the data used by each processing operation unit 11.

Note that when the data used to execute a target task is stored in advance in the corresponding data storage unit 14, it is necessary to check whether that data storage unit 14 still holds earlier data that is needed for arithmetic processing and, if it does, to save (write) the currently stored data to the external storage unit 20. For this reason, it is desirable, for example, to provide a step between the loop of step S31 and step S35 in the flowchart shown in FIG. 7 that checks whether earlier data needed for arithmetic processing is stored in the data storage unit 14, and to execute step S35 when this step confirms that no such data is stored. When this step confirms that such data is stored, the data currently stored in the data storage unit 14 is saved (written) to the external storage unit 20 before step S35 is executed. Accordingly, it is desirable that the timing at which the DMA read access instruction is output to the memory control unit 13 in step S35 be given a margin, for example by doubling the number of transfer cycles assumed for access to the external storage unit 20. The concept of the timing for outputting the DMA read access instruction to the memory control unit 13 is the same as in the first and second operations, and a detailed description is therefore omitted.
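The effect of such a margin can be sketched as follows. This is only an illustration under stated assumptions: a single processing operation unit runs the queued tasks back to back, the cycle count of every queue slot is known, and the function name `dma_issue_slot` and the parameter `margin_factor` are hypothetical.

```python
def dma_issue_slot(exec_cycles, target_slot, transfer_cycles, margin_factor=2):
    """Pick the latest queue slot at which the DMA read can be issued so that
    the cycles spent executing the tasks from that slot up to (not including)
    the target task cover margin_factor * transfer_cycles.

    Assumptions for illustration: one processing operation unit executes the
    slots in order, and exec_cycles[i] is the known cycle count of slot i.
    Returns 0 if even issuing at the first slot does not cover the budget.
    """
    budget = margin_factor * transfer_cycles
    covered = 0
    for slot in range(target_slot - 1, -1, -1):
        covered += exec_cycles[slot]
        if covered >= budget:
            return slot
    return 0


cycles = [100, 100, 100, 100, 100]
no_margin = dma_issue_slot(cycles, target_slot=4, transfer_cycles=100, margin_factor=1)
with_margin = dma_issue_slot(cycles, target_slot=4, transfer_cycles=100)
```

Doubling the assumed transfer cycles moves the issue point one slot earlier in this example, which leaves room for a preceding save to the external storage unit.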

Note also that if, when a high-priority task is to be output to the processing operation unit 11, the advance storage in the data storage unit 14 of the data used by that task has not yet finished, a task that has a lower priority but does not use data stored in the external storage unit 20 can, for example, be output to the processing operation unit 11 ahead of it. This allows the output of the high-priority task to the processing operation unit 11 to be delayed until the storage of the data in the data storage unit 14 has finished.
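A minimal sketch of this deferral, assuming that a larger priority value means a higher priority and that each task carries flags for whether it needs external data and whether that data has already been staged. The `Task` fields and the function `pick_next` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int             # assumption: larger value = higher priority
    uses_external: bool       # True if the task needs data from external storage
    data_ready: bool = False  # True once the data is in the data storage unit


def pick_next(tasks):
    """Return the task to output next, or None if there are no tasks.

    The highest-priority task is chosen unless its data is still being
    transferred; in that case the highest-priority task that needs no
    external data is output first.
    """
    if not tasks:
        return None
    by_priority = sorted(tasks, key=lambda t: t.priority, reverse=True)
    top = by_priority[0]
    if not top.uses_external or top.data_ready:
        return top
    for task in by_priority[1:]:
        if not task.uses_external:
            return task
    return top  # nothing else can run in the meantime; the caller has to wait


high = Task("high", priority=5, uses_external=True)
low = Task("low", priority=1, uses_external=False)
first = pick_next([high, low]).name
high.data_ready = True
second = pick_next([high, low]).name
```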

<Fourth operation>
Next, the operation of the arithmetic device 10, in particular the fourth operation of the task control unit 12, will be described. FIG. 12 is a flowchart showing the processing procedure of the fourth operation performed by the task control unit 12 included in the arithmetic device 10 of the present embodiment. Like the third operation, the fourth operation is based on the priority of each task stored in the task queue 121. Before outputting the DMA read access instruction for acquiring (reading) in advance, by DMA, the data stored in the external storage unit 20, it checks whether earlier data needed for arithmetic processing is stored in the data storage unit 14. If such data is stored, the fourth operation outputs to the memory control unit 13 an access instruction (hereinafter, “DMA write access instruction”) for saving (writing) the data currently stored in the data storage unit 14 to the external storage unit 20 by DMA.

In other words, the fourth operation adds, to the processing procedure of the third operation of the task control unit 12 shown in FIG. 7, a step that checks whether earlier data needed for arithmetic processing is stored in the data storage unit 14. Therefore, in the description of the fourth operation, procedures that are the same as in the third operation shown in FIG. 7 are given the same step numbers and their description is omitted; only the procedures that differ from the third operation are described.

Note that in the processing procedure of the fourth operation of the task control unit 12 shown in FIG. 12, the task number i is cleared to “0” in the loop of step S31. However, even if the value to which the task number i is cleared in the loop of step S31 is different, the processing within the loop can be considered in the same way, so a detailed description is likewise omitted.

Each time the task control unit 12 outputs a task stored in the task queue 121 to the processing operation unit 11, the DMA request generation unit 122 executes the process shown in FIG. 12, which rearranges the order in which tasks are output to the processing operation unit 11 according to priority. In this process, as in the processing procedure of the third operation of the task control unit 12 shown in FIG. 7, the DMA request generation unit 122 initializes the highest priority PRI-MAX to “-1” and the highest-priority task number PRI-MAX-IDX to “0” in step S3.

Next, as in the processing procedure of the third operation of the task control unit 12 shown in FIG. 7, the DMA request generation unit 122 rearranges the order of output to the processing operation unit 11 in the loop of step S31, which contains steps S32 to S34, and sets the highest priority PRI-MAX and the highest-priority task number PRI-MAX-IDX to the values corresponding to the task with the highest priority.

When the loop of step S31 is complete, that is, when the rearrangement of the output order has been completed for all tasks stored in the task queue 121, the DMA request generation unit 122 checks whether the data storage unit 14 that is to store the data acquired in advance from the external storage unit 20 for the task with the highest-priority task number PRI-MAX-IDX is empty (step S4). In other words, it checks whether earlier data needed for arithmetic processing is stored in the data storage unit 14 corresponding to the processing operation unit 11 that executes the highest-priority task.

If step S4 confirms that the data storage unit 14 is empty (“YES” in step S4), then, as in the processing procedure of the third operation of the task control unit 12 shown in FIG. 7, a DMA read access instruction for acquiring in advance from the external storage unit 20 the data used by the task with the highest-priority task number PRI-MAX-IDX is output to the memory control unit 13 in step S35.

If step S4 confirms that the data storage unit 14 is not empty (“NO” in step S4), a data saving process is performed in step S5, in which the data currently stored in the data storage unit 14 used by the processing operation unit 11 corresponding to the task with the highest-priority task number PRI-MAX-IDX is saved in advance to the external storage unit 20.

When the data saving process of step S5 for the data currently stored in the data storage unit 14 is complete, the DMA request generation unit 122 outputs to the memory control unit 13, in step S35, a DMA read access instruction for acquiring in advance from the external storage unit 20 the data used by the task with the highest-priority task number PRI-MAX-IDX, as in the processing procedure of the third operation of the task control unit 12 shown in FIG. 7.
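The flow of FIG. 12 can be sketched roughly as follows. Steps S32 to S34 are not detailed in this part of the description, so the loop body below is an assumption: it skips tasks that have already been output and keeps the largest priority value seen. The queue layout (dictionaries with `priority` and `issued` keys) and the three callbacks are hypothetical stand-ins for the data storage unit check, the data saving process, and the DMA read access instruction.

```python
def fourth_operation(queue, storage_is_empty, save_data, issue_dma_read):
    """Sketch of the fourth operation for one pass over the task queue.

    queue: list of dicts with keys "priority" (int, larger = higher) and
    "issued" (True if already output to a processing operation unit).
    """
    pri_max, pri_max_idx = -1, 0                 # step S3
    for i, task in enumerate(queue):             # loop of step S31
        if task["issued"]:                       # assumed skip of output tasks
            continue
        if task["priority"] > pri_max:
            pri_max, pri_max_idx = task["priority"], i
    if not storage_is_empty(pri_max_idx):        # step S4
        save_data(pri_max_idx)                   # step S5: data saving process
    issue_dma_read(pri_max_idx)                  # step S35
    return pri_max_idx


log = []
queue = [
    {"priority": 1, "issued": False},
    {"priority": 7, "issued": True},
    {"priority": 4, "issued": False},
]
idx = fourth_operation(
    queue,
    storage_is_empty=lambda i: False,
    save_data=lambda i: log.append(("save", i)),
    issue_dma_read=lambda i: log.append(("read", i)),
)
```

Because the storage is reported as occupied here, the save is logged before the read, which is the ordering the fourth operation is meant to guarantee.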

The data saving process in step S5 will now be described. FIG. 13 is a flowchart showing the processing procedure by which the task control unit 12 included in the arithmetic device 10 of the present embodiment saves data stored in the data storage unit 14. In the data saving process, based on the priority of each task stored in the task queue 121, the task control unit 12 outputs to the memory control unit 13 a DMA write access instruction for saving (writing) by DMA, to the external storage unit 20, the data stored in the data storage unit 14 corresponding to the processing operation unit 11 that executes the lowest-priority task.

The DMA request generation unit 122 executes the data saving process shown in FIG. 13 when step S4 of the process shown in FIG. 12, which rearranges the order in which tasks are output to the processing operation unit 11 according to priority, confirms that the data storage unit 14 is not empty (“NO” in step S4). First, in step S51, the DMA request generation unit 122 initializes the lowest priority PRI-MIN to the maximum value (0xFF in FIG. 13) and the lowest-priority task number PRI-MIN-IDX to “0”.

Next, in step S52, the DMA request generation unit 122 clears the task number i to “0” and, starting from the task in “#0” of the task queue 121, begins the process of selecting the lowest-priority task among the tasks to be output to the processing operation unit 11.

In this selection process, the DMA request generation unit 122 checks whether the task in “#0” has already been output to the processing operation unit 11 (step S53). If it has (“YES” in step S53), 1 is added to the task number i in step S52, that is, the task number becomes i = 1, and the check is repeated for the second task, in “#1” of the task queue 121.

If the check in step S53 finds that the task in “#0” has not yet been output to the processing operation unit 11 (“NO” in step S53), the DMA request generation unit 122 checks in step S54 whether the priority of the task in “#0” is lower than the lowest priority PRI-MIN. If the priority of the task in “#0” is not lower than the lowest priority PRI-MIN, that is, if it is higher than PRI-MIN (“NO” in step S54), 1 is added to the task number i in step S52 and the check is repeated for the second task, in “#1” of the task queue 121.

If the check in step S54 finds that the priority of the task in “#0” is lower than the lowest priority PRI-MIN (“YES” in step S54), the DMA request generation unit 122 sets the lowest priority PRI-MIN to the priority value of the task in “#0” in step S55, and sets the lowest-priority task number PRI-MIN-IDX to “#0”. The DMA request generation unit 122 then adds 1 to the task number i in step S52 and starts the lowest-priority selection process for the second task, in “#1” of the task queue 121.

When the loop of step S52 is complete, that is, when the selection of the lowest-priority task has been completed for all tasks stored in the task queue 121, the DMA request generation unit 122 finally outputs to the memory control unit 13 a DMA write access instruction for saving in advance, to the external storage unit 20, the data in the data storage unit 14 used by the task with the lowest-priority task number PRI-MIN-IDX (step S56).

In this way, the data saving process checks the priority of those tasks stored in the task queue 121 that have not yet been executed by the processing operation unit 11, and outputs to the memory control unit 13 a DMA write access instruction for saving in advance, to the external storage unit 20, the data held in the data storage unit 14 corresponding to the lowest-priority task. If, in step S4 of the process shown in FIG. 12 that rearranges the order in which tasks are output to the processing operation unit 11 according to priority, there are multiple tasks for which the data storage unit 14 is confirmed not to be empty, data is saved to the external storage unit 20 sequentially, starting from the data storage unit 14 corresponding to the lowest-priority task stored in the task queue 121.
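Steps S51 to S56 amount to a linear scan for the minimum priority among tasks that have not yet been output. The sketch below follows the step numbers of FIG. 13; the queue layout (dictionaries with `priority` and `issued` keys) and the function name are hypothetical, and a larger value is taken to mean a higher priority, in line with PRI-MIN starting at the maximum value.

```python
PRI_LIMIT = 0xFF  # initial value of PRI-MIN given for FIG. 13


def select_lowest_priority(queue):
    """Return the queue index of the lowest-priority task not yet output."""
    pri_min, pri_min_idx = PRI_LIMIT, 0                  # step S51
    for i, task in enumerate(queue):                     # loop of step S52
        if task["issued"]:                               # step S53: already output
            continue
        if task["priority"] < pri_min:                   # step S54
            pri_min, pri_min_idx = task["priority"], i   # step S55
    return pri_min_idx  # index targeted by the DMA write in step S56


queue = [
    {"priority": 3, "issued": False},
    {"priority": 0, "issued": True},   # skipped: already output
    {"priority": 2, "issued": False},
    {"priority": 9, "issued": False},
]
victim = select_lowest_priority(queue)
```

Note that, as in the flowchart, the index stays at its initial value 0 when every task has already been output; a real implementation would probably want to treat that case separately.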

As described above, in the fourth operation of the task control unit 12, the priority of the tasks in the task queue 121 is checked each time a task stored in the task queue 121 is output to the processing operation unit 11. It is further checked whether data needed for arithmetic processing is stored in the data storage unit 14 corresponding to the processing operation unit 11 that executes the highest-priority target task. If such data is stored, the data currently stored in the data storage unit 14 is saved (written) to the external storage unit 20, and the DMA read access instruction for acquiring the data used by the highest-priority target task is then output to the memory control unit 13. As a result, in the fourth operation of the task control unit 12 as well, the data used when executing a target task can be stored in the data storage unit 14 in advance, at a timing earlier than the processing operation unit 11 to which the target task is assigned executes it. The arithmetic device 10 can therefore rearrange the order in which tasks are output to the processing operation unit 11 according to the priority of each task, while also preventing cache misses for the data used by each processing operation unit 11.

As described above, according to the mode for carrying out the present invention, in an arithmetic device in which a plurality of processing operation units (processors) perform processing in cooperation, the data used when executing a task is acquired from the external storage unit and stored in the data storage unit in advance, at a timing earlier than the processing operation unit actually executes the task that uses the data. This makes it possible to hide the time taken to acquire data from the external storage unit within the period in which each processing operation unit is executing other tasks. Consequently, the mode for carrying out the present invention can provide an arithmetic device capable of preventing cache misses for the data used when each processing operation unit executes a task.

In addition, according to the mode for carrying out the present invention, when earlier data needed for arithmetic processing is stored in the data storage unit that is to store data acquired from the external storage unit, that data is saved to the external storage unit in advance. This makes it possible to hide the time taken to save data to the external storage unit within the period in which each processing operation unit is executing other tasks. Consequently, a sufficient period can be secured for acquiring from the external storage unit the data used when each processing operation unit executes a task.

For these reasons, the mode for carrying out the present invention can shorten the processing time of a system that includes the arithmetic device and thereby improve the performance of that system.

Although the present embodiment has been described for a configuration in which the external storage unit 20 is connected to the arithmetic device 10, the connection between the two is not limited to the direct connection shown in the mode for carrying out the present invention. For example, the concept of the present invention can be applied in the same way to a configuration in which the external storage unit 20 is a server on a network and the arithmetic device 10 and the server are connected via the network. In this case, for example, the memory control unit 13 can be regarded as reading (receiving) data from the server and writing (transmitting) data to the server via a communication unit.

In the present embodiment, it has also been assumed that the information used to determine the timing of access to the external storage unit 20, such as the execution time (number of cycles) of each task, is known to the task control unit 12 in advance. However, the task control unit 12 is not limited to the mode for carrying out the present invention. For example, it may be configured to calculate this information, such as the execution time (number of cycles) of a task, from parameters such as the size of the program contained in a task stored in the task queue 121, the amount of data, or the number of loop iterations.
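One way such a calculation might look is sketched below. The description does not give a formula, so the linear model, both per-unit coefficients, and the function name `estimate_cycles` are illustrative assumptions only.

```python
def estimate_cycles(program_size, data_amount, loop_count,
                    cycles_per_instruction=1, cycles_per_data_unit=1):
    """Rough execution-time estimate in cycles for one task.

    Assumed model (not from the description): each loop iteration costs one
    pass over the program plus a fixed cost per unit of data processed.
    """
    per_iteration = (program_size * cycles_per_instruction
                     + data_amount * cycles_per_data_unit)
    return loop_count * per_iteration


estimate = estimate_cycles(program_size=10, data_amount=20, loop_count=3)
```

An estimate of this kind could then stand in for the known cycle counts when deciding how early to output a DMA read access instruction.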

The processing function of each processing operation unit 11 has not been described in the present embodiment. However, when the arithmetic device 10 is provided in an imaging system such as an imaging apparatus, for example, each processing operation unit 11 can be regarded as having a processing function for performing image processing in the imaging system.

Likewise, the processing functions of the individual processing operation units 11 provided in the arithmetic device 10 have not been described in the present embodiment. However, when the arithmetic device 10 is an image processing device provided in an imaging system such as an imaging apparatus, for example, the processing functions of the processing operation units 11 can be regarded as functions capable of performing the various kinds of image processing in the imaging system, such as YC conversion, noise removal, distortion correction, defect correction, and image compression. Further, when the arithmetic device 10 is a processing unit that performs part of the image processing in an imaging system, such as an image recognition unit within an image processing device provided in the imaging system, the processing functions of the processing operation units 11 can be regarded as functions for performing the various processes of that processing unit.

The embodiment of the present invention has been described above with reference to the drawings. However, the specific configuration is not limited to this embodiment, and various modifications that do not depart from the gist of the present invention are also included.

DESCRIPTION OF SYMBOLS
10 ... Arithmetic device
11, 11a, 11b, 11n ... Processing operation unit
12 ... Task control unit
121 ... Task queue (task control unit)
122 ... DMA request generation unit (task control unit)
13 ... Memory control unit
14, 14a, 14b, 14n ... Data storage unit
20 ... External storage unit

Claims

An arithmetic device comprising:
a plurality of processing operation units, each having a processing function for performing arithmetic processing according to an input task and outputting, as a task, information on the arithmetic processing to be executed next;
a data storage unit that stores data used when each processing operation unit executes the arithmetic processing corresponding to the task, or data resulting from executing the arithmetic processing corresponding to the task;
a memory control unit that reads data used when executing the arithmetic processing corresponding to the task from a connected external storage unit and stores it in the data storage unit, or writes the data resulting from executing the arithmetic processing corresponding to the task, stored in the data storage unit, to the connected external storage unit; and
a task control unit that includes a task queue for sequentially storing the tasks, outputs a task stored in the task queue to any one of the plurality of processing operation units, and outputs to the memory control unit an access instruction instructing access to the external storage unit, based on the timing at which the processing operation unit executes the arithmetic processing corresponding to each of the tasks stored in the task queue.

The arithmetic device according to claim 1, wherein the task control unit, each time a task stored in the task queue is output to the processing operation unit, checks the timing at which the processing operation unit will execute the arithmetic processing corresponding to each task stored in the task queue and, based on the checked timing, outputs the access instruction corresponding to each task so that the access to the external storage unit corresponding to that task is completed by the time the task is output to the processing operation unit.

The arithmetic device according to claim 2, wherein the task control unit selects, from among the tasks stored in the task queue, a target task, which is a task that performs arithmetic processing using data stored in the external storage unit; checks, based on the order in which the target task is output to the processing operation unit, the timing at which the processing operation unit will execute the arithmetic processing corresponding to the target task; and outputs the access instruction corresponding to the target task so that the access to the external storage unit for reading the data used by the target task from the external storage unit is completed by the time the target task is output to the processing operation unit.

The arithmetic device according to claim 3, wherein the task control unit outputs the access instruction corresponding to the target task when the order in which the target task stored in the task queue is to be output to the processing operation unit reaches a data transfer start order determined in advance based on the number of processing operation units provided in the arithmetic device, an execution time assumed to give the shortest processing time when executing the arithmetic processing corresponding to the tasks stored in the task queue, and a transfer time required when the memory control unit accesses the external storage unit to transfer data.

The arithmetic device according to claim 2, wherein the task control unit checks the timing at which the processing operation unit will execute the arithmetic processing corresponding to a target task, which is a task among those stored in the task queue that performs arithmetic processing using data stored in the external storage unit, based on the respective execution times for executing the arithmetic processing corresponding to preceding tasks, which are tasks stored in the task queue before the target task that perform arithmetic processing not using data stored in the external storage unit; and outputs the access instruction corresponding to the target task so that the access to the external storage unit for reading the data used by the target task from the external storage unit is completed by the time the target task is output to the processing operation unit.

The arithmetic device according to claim 5, wherein the task control unit, assuming that the preceding tasks are output to the respective processing operation units, calculates the minimum execution time required when the processing operation units execute the arithmetic processing corresponding to the input preceding tasks, and outputs the access instruction corresponding to the target task based on the calculated minimum execution time and a transfer time required when the memory control unit accesses the external storage unit to transfer data.

The arithmetic device according to claim 2, wherein the task control unit rearranges the order in which target tasks are output to the processing operation unit based on the priority of each target task, a target task being a task among those stored in the task queue that performs arithmetic processing using data stored in the external storage unit; and outputs, starting from the target task with the highest priority, the access instruction corresponding to each target task so that the access to the external storage unit for reading the data used by that target task from the external storage unit is completed by the time the target task is output to the processing operation unit.

The arithmetic device according to any one of claims 3 to 7, wherein, when data used for the arithmetic processing corresponding to a task is stored in the data storage unit that is to store the data read from the external storage unit in response to the access instruction corresponding to the target task, the task control unit outputs, before outputting the access instruction corresponding to the target task, an access instruction for writing the data stored in the data storage unit to the external storage unit to save it.

The arithmetic device according to claim 8, wherein the task control unit, based on the priorities of those tasks stored in the task queue that perform arithmetic processing using data stored in the data storage unit, outputs the access instruction for writing to the external storage unit, and thereby saving, the data stored in the data storage unit that holds the data used by the task with the lowest priority.