JP2019215804A

JP2019215804A - Multicore microcomputer and parallelizing method

Info

Publication number: JP2019215804A
Application number: JP2018113736A
Authority: JP
Inventors: 憲一峰田; Kenichi Mineda
Original assignee: Denso Corp
Current assignee: Denso Corp
Priority date: 2018-06-14
Filing date: 2018-06-14
Publication date: 2019-12-19
Anticipated expiration: 2038-06-14
Also published as: DE102019207629A1; JP7073933B2

Abstract

To provide a multicore microcomputer that can increase the frequency with which processing units are executed in parallel by a plurality of cores, while preventing mutual interference, even when the parallelism of a parallel program is low. SOLUTION: In each of cores 31c, 31d of a multicore microcomputer 31, among the processing units of the tasks that are candidates for execution, a processing unit whose scheduled data is accessible at the time of execution is determined as the processing unit to be executed. In addition, a disabling process is executed to prevent that data from being accessed by a processing unit executed in the other core. When, as a result of the disabling process executed in one core, a processing unit in the other core is determined to be unable to access its scheduled data, another processing unit in the other core whose scheduled data is accessible is determined as the processing unit to be executed. SELECTED DRAWING: Figure 6

Description

The present invention relates to a multi-core microcomputer that executes a parallel program generated from a single program, and to a parallelization method that generates a parallel program for a multi-core microcomputer from a single program for a single-core microcomputer.

Conventionally, the parallelizing compilation method disclosed in Patent Document 1 is known as an example of a parallelization method that generates a parallel program for a multi-core microcomputer from a single program for a single-core microcomputer.

In this parallelizing compilation method, lexical analysis and syntax analysis are performed on the source code of a single program to expand it into an intermediate language, and this intermediate language is used to analyze the dependencies among a plurality of macro tasks (processing units), perform optimization, and so on. In the conventional parallelizing compilation method, a parallel program is then generated by allocating the macro tasks to cores and scheduling them based on the dependencies among the macro tasks and the execution time of each macro task.

Patent Document 1: JP-A-2015-1807

However, when the degree of parallelism of a parallel program generated from a single program is low, that is, when the parallel program contains few processing units that can be executed in parallel by a plurality of cores, the processing capability of the multi-core microcomputer cannot be fully utilized.

The present invention has been made in view of the above points. Its object is to provide a multi-core microcomputer and a parallelization method that can increase the frequency with which processing units are executed in parallel by a plurality of cores, while preventing mutual interference, even when the degree of parallelism of the generated parallel program is low.

In order to achieve the above object, a multi-core microcomputer according to the present invention executes a parallel program (31a1') for a multi-core microcomputer having a plurality of cores (31c, 31d), the parallel program being generated from a single program for a single-core microcomputer having one core, wherein
the parallel program is one in which, for each task included in the single program and consisting of a plurality of processing units, the allocation of the plurality of processing units to the plurality of cores and their execution order have been determined based on the dependencies among the plurality of processing units,
each of the plurality of cores of the multi-core microcomputer has a selection unit (S110 to S200, S310 to S400, S510 to S610, S670) that selects a processing unit to be executed from among the processing units that belong to the plurality of tasks and are allocated to its own core,
the selection unit includes:
a determination unit (S140, S340, S540) that determines, for a processing unit of each task, whether the data scheduled to be used when that processing unit is executed is accessible; and
a determination-time processing unit (S150 to S160, S350 to S360, S550 to S560, S670) that determines a processing unit for which the determination unit has determined that the data scheduled to be used is accessible as the processing unit to be executed, and that executes a prohibition process (S150, S350, S550, S670) for prohibiting processing units executed in other cores from accessing that data, and
each of the plurality of cores is configured to execute the instructions included in the processing unit determined by the determination-time processing unit.

Thus, according to the multi-core microcomputer of the present invention, in each core, among the plurality of processing units of the tasks allocated to it, a processing unit that can access the data it is scheduled to use during execution is determined as the processing unit to be executed. Furthermore, a prohibition process is executed to prohibit that data from being accessed by processing units executed in other cores. This prevents the processing units executed in the plurality of cores from interfering with each other.

When, as a result of the prohibition process executed in one core, a processing unit of another core is determined to be unable to access the data it is scheduled to use, a different processing unit in that other core whose scheduled data is accessible is determined as the processing unit to be executed. Therefore, when executing a processing unit in another core could cause mutual interference, that core executes a different processing unit that carries no such risk. As a result, even when the degree of parallelism of the generated parallel program is low, the frequency with which processing units are executed in parallel by the plurality of cores can be increased.

Further, the parallelization method of the present invention generates, from a single program for a single-core microcomputer having one core, a parallel program (31a1) for a multi-core microcomputer having a plurality of cores (31c, 31d), the method comprising:
a core allocation and execution order determination procedure (10a to 10d) that determines, for each task included in the single program and consisting of a plurality of processing units, the allocation of the plurality of processing units to the plurality of cores and their execution order based on the dependencies among the plurality of processing units; and
a parallel program generation procedure (10e, 10f) that generates the parallel program so that the plurality of processing units are executed by the plurality of cores of the multi-core microcomputer in accordance with the allocation to the plurality of cores and the execution order determined in the core allocation and execution order determination procedure, and that adds a start point instruction to the start point and an end point instruction to the end point of each processing unit of the parallel program, wherein
the start point instruction and the end point instruction cause the multi-core microcomputer to arbitrate data access so that instructions included in processing units executed by the plurality of cores at the same time do not access the same data, and cause a core for which data access is prohibited to execute a processing unit of a task other than the task of the processing unit concerned.

Therefore, the parallelization method of the present invention can generate a parallel program that, in a multi-core microcomputer, increases the frequency with which processing units are executed in parallel by the plurality of cores while preventing mutual interference between the processing units executed in those cores.

The reference numbers in parentheses above merely show one example of the correspondence with specific configurations in the embodiments described later, in order to facilitate understanding of the present disclosure, and are not intended to limit the scope of the invention in any way.

Technical features recited in the claims other than those described above will become apparent from the description of the embodiments below and the accompanying drawings.

FIG. 1 is a block diagram showing a schematic configuration of a computer as an automatic parallelization tool in the embodiment.
FIG. 2 is a block diagram showing the functions of the computer as an automatic parallelization tool in the embodiment.
FIG. 3 is a diagram showing an example in which each task consists of a plurality of processing units and those processing units are allocated to the cores.
FIG. 4 is a diagram showing an example of the configuration of an in-vehicle device incorporating a multi-core microcomputer.
FIG. 5 is a block diagram conceptually showing the configuration inside each core of the multi-core microcomputer.
FIG. 6 is a flowchart for explaining the processing operation in each core of the multi-core microcomputer when executing the parallel program in the first embodiment.
FIG. 7 is a block diagram conceptually showing the configuration inside each core of the multi-core microcomputer in the second embodiment.
FIG. 8 is a flowchart for explaining the processing operation in each core of the multi-core microcomputer when executing the parallel program in the second embodiment.
FIG. 9 is a block diagram conceptually showing the configuration inside each core of the multi-core microcomputer in the third embodiment.
FIG. 10 is a flowchart for explaining the processing operation in each core of the multi-core microcomputer when executing the parallel program in the third embodiment.

(First embodiment)
Hereinafter, a first embodiment for carrying out the invention will be described with reference to the drawings. This embodiment describes an example in which a computer 10, serving as a parallelization tool, generates a parallel program 31a1, parallelized for a multi-core microcomputer 31 having two or more cores 31c and 31d, from a single program for a single-core microcomputer having one core. The number of cores of the multi-core microcomputer 31 may be three or more.

The background to generating the parallel program 31a1 from a single program in this way is that program size tends to grow year by year as control becomes more sophisticated, whereas there is a limit to how far the performance of a single-core microcomputer can be improved. For example, even if one tries to improve processing capability by raising the operating frequency of a single-core microcomputer, there is a limit to how far the frequency can be raised, and raising it increases heat generation and power consumption. For this reason, it is considered effective to use the multi-core microcomputer 31, which improves processing capability by increasing the number of cores.

If, in doing so, program developers had to allocate processing appropriately to each core and also schedule it so that the multi-core capability is used to the fullest, the program development load would increase. Automatically generating the parallel program 31a1 from a single program is technically significant as a way to reduce this development load. Automatic generation also makes it possible to make effective use of existing software assets developed for a single processor.

First, the configuration of the computer 10 will be described with reference to FIG. 1. The computer 10 corresponds to a parallelization tool that executes the parallelization method, and generates the parallel program 31a1 from a single program. In the present embodiment, the computer 10 is configured to generate the parallel program 31a1 written in C from a single program written in C. Accordingly, the parallel program 31a1' that is stored in a ROM (code ROM) 31a of the multi-core microcomputer 31 described later and executed by the multi-core microcomputer 31 is, for example as shown in FIG. 2, the result of further compiling the parallel program 31a1 with a compiler 20 into binary code.

However, the present invention is not limited to this. The single program may be written in a programming language other than C. The parallel program 31a1 may be written, for example, in the intermediate language used when analyzing the single program. Alternatively, the computer 10 may generate both a parallel program written in C and a parallel program written in the intermediate language. Furthermore, the computer 10 may incorporate the function of the compiler 20 and directly generate the binary-code parallel program 31a1'.

As shown in FIG. 1, the computer 10 includes a display 11, an HDD 12, a CPU 13, a ROM 14, a RAM 15, an input device 16, a reading unit 17, and the like. The computer 10 can read the content stored in a storage medium 18 with the reading unit 17. As shown in FIG. 1, the storage medium 18 stores, for example, an automatic parallelizing compiler 1. The computer 10 and the storage medium 18 are the same as the personal computer 100 and the storage medium 180 described in JP-A-2015-1807; refer to JP-A-2015-1807 for details.

The automatic parallelizing compiler 1 is software that causes the computer 10 to execute the procedures for generating the parallel program 31a1. The automatic parallelizing compiler 1 thus enables the computer 10 to execute the parallelization method. In other words, the automatic parallelizing compiler 1 is a program that contains the parallelization method. By executing the automatic parallelizing compiler 1, the computer 10 acts as a parallelization tool and generates the parallel program 31a1.

Next, with reference to FIG. 2, the functions and processing procedures that the computer 10, as a parallelization tool, has for generating the parallel program 31a1 from a single program will be described. FIG. 2 shows the functions and processing procedures of the computer 10 as functional blocks. As shown in FIG. 2, the computer 10 functions as a lexical analysis unit 10a, a syntax/semantic analysis unit 10b, a dependency analysis unit 10c, a core allocation and scheduling unit 10d, a synchronization point instruction addition unit 10e, and a code generation unit 10f.

In the present embodiment, as shown in FIG. 2, the computer 10 does not analyze the entire single program for controlling the controlled device at once to generate the parallel program 31a1. Instead, it takes as input a single program that has been divided into independent processing functions (tasks) and generates the parallel program 31a1 from it. Because the parallel program 31a1 generated according to the present embodiment can run faster when executed on the multi-core microcomputer 31, the controlled device is preferably one that requires fast processing, for example an engine or electric motor mounted on a vehicle. In this case, as shown in FIG. 4, the multi-core microcomputer 31 is embodied as an in-vehicle device 30 such as an engine control device, a motor control device, or a hybrid control device mounted on the vehicle.

A single program divided by task includes a plurality of processing units, and executing those processing units realizes the processing function intended for that task. The processing units thus cooperate to realize the intended processing function and include, for example, a later processing unit that refers to variable data processed by an earlier processing unit, and a later processing unit that is executed as a result of a conditional branch in an earlier processing unit.

Here, a processing unit means a core arrangement unit, which is the minimum unit of allocation to a core, or a function. A core arrangement unit can also be called a processing block, a macro task, or simply a processing unit. The relationship between a core arrangement unit and a function is core arrangement unit ≧ function. That is, a function may be a core arrangement unit itself, or may be a parent function or sub-function contained in a core arrangement unit.

The lexical analysis unit 10a and the syntax/semantic analysis unit 10b perform lexical analysis and syntax and semantic analysis on the source code of the single program written in C, and expand it into an intermediate language. The intermediate language produced by the lexical analysis unit 10a and the syntax/semantic analysis unit 10b contains general-purpose instructions. The lexical analysis unit 10a and the syntax/semantic analysis unit 10b correspond to FE3 in JP-A-2015-1807; refer to JP-A-2015-1807 for details.

The dependency analysis unit 10c analyzes the dependencies among the processing units contained in the single program expanded into the intermediate language, and extracts the processing units that can be executed in parallel. Dependencies include data dependencies, such as a later processing unit referring to variable data updated by an earlier processing unit, and control dependencies, such as a later processing unit being the target of a conditional branch in an earlier processing unit. Processing units that have such dependencies must be executed in an order that respects them. In the present embodiment, as described above, the target of parallelization is a single program divided by task. The processing units contained in a single program divided by task have data dependencies and control dependencies.
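The dependency analysis described above can be pictured with a small sketch. This is not the analysis performed by the dependency analysis unit 10c, whose internals are not given here. It assumes each processing unit is summarized by hypothetical read and write sets, treats write-after-read and write-after-write conflicts as data dependencies in addition to the read-after-write case named in the text, and models a control dependency with a hypothetical `branch_of` field. The variables `x`, `y`, and `z` are invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Unit:
    """Hypothetical summary of one processing unit."""
    name: str
    reads: FrozenSet[str] = frozenset()
    writes: FrozenSet[str] = frozenset()
    branch_of: Optional[str] = None  # unit whose conditional branch selects this one


def depends_on(later: Unit, earlier: Unit) -> bool:
    """True if `later` must run after `earlier`."""
    flow = bool(earlier.writes & later.reads)     # later reads what earlier updated
    anti = bool(earlier.reads & later.writes)     # assumption: also kept in order
    output = bool(earlier.writes & later.writes)  # assumption: also kept in order
    control = later.branch_of == earlier.name     # later is a branch target of earlier
    return flow or anti or output or control


def dependency_edges(units: List[Unit]) -> Set[Tuple[str, str]]:
    """Direct (earlier, later) dependencies; `units` is in single-program order."""
    edges = set()
    for i, later in enumerate(units):
        for earlier in units[:i]:
            if depends_on(later, earlier):
                edges.add((earlier.name, later.name))
    return edges


def parallel_pairs(units: List[Unit]) -> List[Tuple[str, str]]:
    """Pairs of units with no direct or transitive dependency between them."""
    edges = dependency_edges(units)
    ancestors = {}
    for unit in units:  # program order is a valid topological order
        preds = {a for (a, b) in edges if b == unit.name}
        closure = set(preds)
        for p in preds:
            closure |= ancestors[p]
        ancestors[unit.name] = closure
    pairs = []
    for i, later in enumerate(units):
        for earlier in units[:i]:
            if earlier.name not in ancestors[later.name]:
                pairs.append((earlier.name, later.name))
    return pairs


# Invented example: A1 produces x; A2 and A3 each consume x independently.
UNITS = [
    Unit("A1", writes=frozenset({"x"})),
    Unit("A2", reads=frozenset({"x"}), writes=frozenset({"y"})),
    Unit("A3", reads=frozenset({"x"}), writes=frozenset({"z"})),
]
```

With this invented input, `parallel_pairs(UNITS)` reports only A2 and A3 as a parallel-executable pair, which matches the kind of result the allocation step consumes.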

The core allocation and scheduling unit 10d allocates the plurality of processing units to the two or more cores 31c and 31d based on the result of the analysis by the dependency analysis unit 10c. In doing so, the core allocation and scheduling unit 10d allocates the processing units so that, for example, processing units that can be executed in parallel are executed concurrently on the two or more cores 31c and 31d.

FIG. 3 shows an example in which each task consists of a plurality of processing units and those processing units are allocated to the cores 31c and 31d. In the example shown in FIG. 3, the single program contains three tasks: task A, task B, and task C. Although FIG. 3 shows only three tasks for the sake of explanation, an actual single program may contain more tasks.

As shown in FIG. 3, for the processing units A1 to A3 of task A, the core allocation and scheduling unit 10d allocates the processing units A1 and A2 to the first core 31c, and allocates the processing unit A3, which can be executed in parallel with the processing unit A2, to the second core 31d. For the processing units B1 and B2 of task B, it allocates both B1 and B2 to the second core 31d. For the processing units C1 to C4 of task C, it allocates the processing units C1 and C2 to the first core 31c, and allocates the processing units C3 and C4, which can be executed in parallel with them, to the second core 31d.

The core allocation and scheduling unit 10d then schedules the processing units A1 to A3, B1 to B2, and C1 to C4 allocated to the first core 31c and the second core 31d. Specifically, for each task, the core allocation and scheduling unit 10d determines the execution schedule (execution order) of the processing units A1 to A3, B1 to B2, and C1 to C4 allocated to the first core 31c and the second core 31d, based on the dependencies among the processing units of that task. The dependency analysis unit 10c and the core allocation and scheduling unit 10d correspond to MP5 in JP-A-2015-1807; refer to JP-A-2015-1807 for details.
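The allocation of FIG. 3 as described in the text can be written down as plain data. The sketch below only groups the units by core and task in their listed order; the dictionary layout, the function name `per_core_queues`, and the use of the reference signs `31c` and `31d` as keys are illustrative assumptions, and it does not reproduce how the core allocation and scheduling unit 10d actually chooses cores or orders units.

```python
from typing import Dict, List

# Allocation stated in the text for FIG. 3 (processing unit -> core).
ASSIGNMENT: Dict[str, str] = {
    "A1": "31c", "A2": "31c", "A3": "31d",
    "B1": "31d", "B2": "31d",
    "C1": "31c", "C2": "31c", "C3": "31d", "C4": "31d",
}

# Processing units of each task, listed in the order used in the text.
TASK_UNITS: Dict[str, List[str]] = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2"],
    "C": ["C1", "C2", "C3", "C4"],
}


def per_core_queues(assignment: Dict[str, str],
                    task_units: Dict[str, List[str]]) -> Dict[str, Dict[str, List[str]]]:
    """Group units by core, then by task, preserving the listed order."""
    queues: Dict[str, Dict[str, List[str]]] = {}
    for task, units in task_units.items():
        for unit in units:
            core = assignment[unit]
            queues.setdefault(core, {}).setdefault(task, []).append(unit)
    return queues


QUEUES = per_core_queues(ASSIGNMENT, TASK_UNITS)
```

In this grouping the second core 31d holds units from all three tasks, which is what later allows it to switch to a unit of another task when one unit's data is unavailable.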

The synchronization point instruction addition unit 10e adds a start point instruction to the start point, and an end point instruction to the end point, of each of the processing units of each task that make up the parallel program scheduled by the core allocation and scheduling unit 10d. The synchronization point instruction addition unit 10e holds the programs to be added as the start point instruction and the end point instruction, and adds them to the start point and end point of each processing unit in a form that corresponds to that processing unit.

In addition to indicating the start point of the corresponding processing unit, the start point instruction contains the information and instructions described below. First, when instructions in the corresponding processing unit access (read, update, etc.) data stored in memory (the RAM 31b, registers, etc.), the start point instruction contains access data information indicating the data to be accessed. The synchronization point instruction addition unit 10e can obtain information on the data to be accessed from the syntax/semantic analysis unit 10b.

As a result, when the multi-core microcomputer 31 starts executing a processing unit contained in the parallel program 31a1' (before the substantive processing begins), it can determine in advance, from the start point instruction, which data that processing unit is going to access. As described later, this access data information is temporarily stored in access data tables 43a and 43b in an access data control unit 43 provided in each of the cores 31c and 31d. When a processing unit does not access data stored in memory, the start point instruction either contains no access data information or contains access data information indicating that there is no data to be accessed.
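One way to picture a processing unit bracketed by start and end point instructions is the sketch below. The record types `StartPoint` and `EndPoint`, the helper names, and the use of strings as stand-ins for a unit's instructions are all hypothetical; the actual form of the generated instructions and of the access data tables 43a and 43b is not specified in this passage.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class StartPoint:
    """Hypothetical start point marker carrying access data information."""
    unit: str
    access_data: Tuple[str, ...] = ()  # empty tuple: the unit accesses no memory data


@dataclass(frozen=True)
class EndPoint:
    """Hypothetical end point marker."""
    unit: str


def add_sync_points(unit: str, body: List[str],
                    access_data: Iterable[str] = ()) -> list:
    """Wrap a unit's instructions (strings here) with start and end markers."""
    return [StartPoint(unit, tuple(access_data)), *body, EndPoint(unit)]


def record_access_data(table: Dict[str, Tuple[str, ...]],
                       start: StartPoint) -> Dict[str, Tuple[str, ...]]:
    """Store the access data information before the unit's body runs."""
    table[start.unit] = start.access_data
    return table


# Invented example: unit A2 reads x and updates y.
PROGRAM_A2 = add_sync_points("A2", ["y = f(x)"], ["x", "y"])
ACCESS_TABLE = record_access_data({}, PROGRAM_A2[0])
```

Because the access data information travels with the start marker, the table can be filled before any instruction of the unit body executes, which is the property the paragraph above relies on.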

The start point instruction also contains an instruction for determining whether the data indicated by the access data information is accessible. Based on this instruction, the access data control unit 43 of each of the cores 31c and 31d determines whether the data indicated by the access data information is accessible. More specifically, in response to this instruction the access data control unit 43 checks the state of the prohibition flag corresponding to the data to be accessed. A prohibition flag is provided individually for each item of data accessed by the processing units of the tasks of the parallel program 31a1', and is stored in a predetermined storage area of a memory that each of the cores 31c and 31d can read and write. For example, when the prohibition flag is in the set state, the access data control unit 43 determines that the corresponding data is not accessible. When the prohibition flag is in the reset state, the access data control unit 43 determines that the corresponding data is accessible.

The start point instruction also contains an instruction that causes the access data control unit 43 to lock the data when the access data control unit 43 has determined that the data indicated by the access data information is accessible. Based on this lock instruction, when the access data control unit 43 determines that the prohibition flag is in the reset state and the corresponding data is accessible, it changes the referenced prohibition flag to the set state and thereby locks the data. As a result, once the instructions of the processing unit that follow the start point instruction have begun, access to that data by processing units executed in other cores can be prohibited while those instructions are being executed.
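The check-then-lock behaviour of the prohibition flags can be sketched as follows. The class and method names are hypothetical, and two points are assumptions rather than statements from the text: the sketch uses a `threading.Lock` to make the check and the set happen as one step, and it assumes that the end point instruction resets the flags so that the data is released.

```python
import threading
from typing import Dict, Iterable


class ProhibitionFlags:
    """Hypothetical shared table of per-data prohibition flags."""

    def __init__(self, data_names: Iterable[str]) -> None:
        # False = reset state (accessible), True = set state (access prohibited).
        self._flags: Dict[str, bool] = {name: False for name in data_names}
        self._guard = threading.Lock()  # assumption: makes check-and-set atomic

    def is_accessible(self, names: Iterable[str]) -> bool:
        """Check only: True if every named flag is in the reset state."""
        with self._guard:
            return not any(self._flags[n] for n in names)

    def try_lock(self, names: Iterable[str]) -> bool:
        """Set all named flags if, and only if, all are currently reset."""
        names = list(names)
        with self._guard:
            if any(self._flags[n] for n in names):
                return False  # at least one item is locked by another unit
            for n in names:
                self._flags[n] = True
            return True

    def release(self, names: Iterable[str]) -> None:
        """Reset the named flags (assumed to happen at the end point)."""
        with self._guard:
            for n in names:
                self._flags[n] = False
```

A unit on one core would call `try_lock` with its access data at its start point and `release` at its end point; a unit on another core that needs any of the same data sees `try_lock` return `False` in between. The all-or-nothing locking of several items at once is a design choice of this sketch, not something the text prescribes.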

The start point instruction further includes an instruction for directing a switch to a processing unit of another task when the access data control unit 43 has determined that the data indicated by the access data information is inaccessible. Based on this instruction, when the access data control unit 43 determines that the data is inaccessible, the process switching unit 44 switches the processing unit to be executed to a processing unit of another task. As a result, a core 31c, 31d for which the data was determined to be inaccessible does not wait for the data to be released; instead, it can execute the processing unit to which it has switched. This makes it possible to increase how often processing units are executed on each of the cores 31c and 31d. Note that the newly selected processing unit is itself subject to a fresh determination by the access data control unit 43 of whether the data it is scheduled to access is accessible, and whether it can be executed is decided accordingly.

In more detail, a processing priority is defined for each task of which any processing unit has been allocated to a core 31c, 31d. The process switching unit 44 initially sets the processing unit to be executed to a processing unit of the task having the highest processing priority. If the access data control unit 43 determines, for that processing unit, that the data scheduled to be accessed is inaccessible, the process switching unit 44 switches the processing unit to be executed to a processing unit belonging to the task having the next highest processing priority.

As for the processing priority of each task, the start point instruction may include information indicating it. In that case, the processing priority information included in the start point instruction is loaded into the priority table 44c of the process switching unit 44. Based on the processing priority information of each task loaded into the priority table 44c, the process switching unit 44 can determine which task has the highest processing priority, which task has the next highest, and so on. Alternatively, the instruction queues 42a and 42b, which hold the instructions included in the processing units of the tasks, may be associated with the tasks one-to-one in advance, and the priority table 44c may define the processing priority of each of the instruction queues 42a and 42b.
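The priority-ordered selection performed with the priority table 44c can be illustrated with a short sketch. The function name `next_candidate`, the dictionary representation of the table, the convention that a larger number means a higher priority, and the extra task `task_M` are assumptions made for illustration only.

```python
def next_candidate(priority_table, current=None):
    """Return the task to try next, or None if `current` has the lowest priority.

    priority_table maps a task name to its priority
    (larger number = higher priority; an assumed convention).
    """
    ordered = sorted(priority_table, key=priority_table.get, reverse=True)
    if current is None:
        return ordered[0]                 # initial choice: highest priority
    idx = ordered.index(current)
    if idx + 1 < len(ordered):
        return ordered[idx + 1]           # next highest priority
    return None                           # nothing lower is left


table = {"task_L": 1, "task_H": 5, "task_M": 3}
first = next_candidate(table)             # chosen initially
second = next_candidate(table, first)     # chosen if `first` is blocked
last = next_candidate(table, "task_L")    # no lower-priority task exists
```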

The end point instruction, in addition to indicating the end point of the corresponding processing unit, includes the instruction described below. Specifically, the end point instruction includes an instruction that causes the access data control unit 43 to release the lock on data that was locked by the lock instruction included in the start point instruction. This is because, once all instructions of the processing unit have finished, there is no longer any need to prohibit access to that data by processing units executed on the other core. Based on the unlock instruction included in the end point instruction, the access data control unit 43 changes the prohibition flag from the set state to the reset state.

The code generation unit 10f generates program code corresponding to the parallel program 31a1 such that the processing units of each task are executed in accordance with the allocation to the cores 31c and 31d and the execution order determined by the core allocation and scheduling unit 10d, and such that the start point instruction and end point instruction described above are executed at the start point and end point of each processing unit. The computer 10 outputs the program code generated by the code generation unit 10f as the parallel program 31a1.
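As a rough illustration of what such generated code might look like, the sketch below brackets each processing unit with a start point marker and an end point marker. The types `StartPoint` and `EndPoint`, the function `emit_task`, and the string instructions are hypothetical; the specification does not define an instruction encoding.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StartPoint:
    access_data: Tuple[str, ...]  # data the processing unit will access
    priority: int                 # optional in the text; carried here


@dataclass(frozen=True)
class EndPoint:
    pass


def emit_task(units, priority):
    """Bracket each processing unit with start and end point markers.

    units: list of (access_data, body_instructions) in execution order.
    """
    stream = []
    for access_data, body in units:
        stream.append(StartPoint(tuple(access_data), priority))
        stream.extend(body)
        stream.append(EndPoint())
    return stream


# Two processing units: one that touches "rpm", one that touches no data.
stream = emit_task([(["rpm"], ["load rpm", "add 1"]), ([], ["nop"])], priority=5)
```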

Next, the configuration of the in-vehicle device 30 will be described. As shown in FIG. 4, the in-vehicle device 30 includes a multicore microcomputer 31, a communication unit 32, a sensor unit 33, an input/output port 34, and the like. The multicore microcomputer 31 includes a ROM 31a, a RAM 31b, a first core 31c, a second core 31d, and the like. The in-vehicle device 30 can be applied to, for example, an engine control device or a hybrid control device mounted on an automobile. An example in which the in-vehicle device 30 is applied as an engine control device is described below.

The first core 31c and the second core 31d perform, for example, engine control by executing the parallel program 31a1' stored in the ROM 31a, which serves as a code ROM. Specifically, they control the fuel injection amount, ignition timing, intake air amount, and the like by driving the actuators that are the devices to be controlled. The ROM 31a also stores constant data and the like used in engine control. The RAM 31b temporarily stores variable data and the like, and is accessed as needed by the cores 31c and 31d when the multicore microcomputer 31 executes the parallel program 31a1'. The communication unit 32 communicates with other ECUs connected via an in-vehicle LAN or the like. The sensor unit 33 includes various sensors for detecting the state of the engine. The input/output port 34 transmits and receives various signals for controlling the engine.

FIG. 5 is a block diagram conceptually showing the internal configuration of each of the cores 31c and 31d. Because the cores 31c and 31d have the same configuration, FIG. 5 shows the internal configuration of the first core 31c only. To simplify the explanation, FIG. 5 is described using an example in which two tasks, task_H with a relatively high processing priority and task_L with a relatively low processing priority, are the execution task candidates, that is, the tasks for which it is determined whether or not to execute them.

As shown in FIG. 5, each of the cores 31c and 31d has instruction cache memories 40a and 40b that store the instruction codes included in the processing units of the tasks (task_H, task_L) read from the ROM 31a. The instruction codes of each processing unit stored in the instruction cache memories 40a and 40b are read and decoded by the instruction fetch/decode unit 41 and transferred to the respective instruction queues 42a and 42b.

Although FIG. 5 depicts the instruction codes of the processing units of task_H and task_L as being stored in separate code ROMs 31a, the code ROM 31a need not be physically separated. For example, a plurality of tasks may be stored in a common code ROM 31a. In that case, each of the cores 31c and 31d can read the instruction codes of the processing units of each task from the common ROM 31a by having as many program counters as there are execution task candidates. As for the instruction queues 42a and 42b, on the other hand, it is preferable to provide a plurality of instruction queues corresponding to the number of execution task candidates, because this allows processing units to be switched immediately.

However, each of the cores 31c and 31d need not have as many instruction queues as there are tasks allocated to that core. As long as the instructions of the processing units of the plurality of tasks serving as execution task candidates are held in the plurality of instruction queues, processing units can in most cases be switched without difficulty, and the cores 31c and 31d can be kept from entering a waiting state. For example, the tasks to serve as execution task candidates can be chosen by selecting the required number of tasks, in descending order of processing priority, from among the tasks for which no processing order relative to other tasks is defined. Then, when all processing units of one of the execution task candidates have finished executing, that task may be replaced with the task having the highest processing priority among the tasks not yet processed.
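The selection and replacement of execution task candidates described above can be sketched as follows. The function names, the dictionary of priorities, and the `slots` parameter (standing in for the number of instruction queues) are assumptions for illustration; the specification does not prescribe a data structure.

```python
def initial_candidates(tasks, slots):
    """Pick the `slots` highest-priority tasks as execution task candidates.

    tasks: dict of task name -> priority, limited to tasks for which no
    processing order relative to other tasks is defined.
    Returns (candidates, pending), both in descending priority order.
    """
    ordered = sorted(tasks, key=tasks.get, reverse=True)
    return ordered[:slots], ordered[slots:]


def replace_finished(candidates, pending, finished):
    """Swap a finished candidate for the highest-priority pending task."""
    candidates = [t for t in candidates if t != finished]
    if pending:
        candidates.append(pending[0])
        pending = pending[1:]
    return candidates, pending


tasks = {"A": 4, "B": 9, "C": 1, "D": 6}
cands, pending = initial_candidates(tasks, slots=2)       # two instruction queues
cands2, pending2 = replace_finished(cands, pending, "B")  # task B has finished
```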

As shown in FIG. 5, each of the cores 31c and 31d has an access data control unit 43. The access data control unit 43 has an access data table 43a, 43b for each execution task candidate. As described above, the access data tables 43a and 43b store the access data information included in the start point instruction. This allows the access data control unit 43 to know, before a processing unit of an execution task candidate is executed, which data that processing unit will access. The access data control unit 43 then determines whether the data indicated by the access data information is accessible by referring to the state of the corresponding prohibition flag. If it determines that the data is accessible, the access data control unit 43 changes the prohibition flag corresponding to that data to the set state and locks the data. This completes the preparation for executing the instructions of the processing unit. If, on the other hand, it determines that the data is inaccessible, the access data control unit 43 passes that determination result to the process switching unit 44.

As described above, the process switching unit 44 includes the priority table 44c, which stores information indicating the processing priority of each of the execution task candidates. Based on that information, the process switching unit 44 initially sets the processing unit of the task having the highest processing priority as the processing unit to be executed. Specifically, of the registers 44a and 44b, which respectively hold the instructions of the processing units of the execution task candidates, the process switching unit 44 transfers to the arithmetic section 45 the instruction held in the register corresponding to the task having the highest processing priority. However, when the process switching unit 44 receives from the access data control unit 43 a determination result indicating that the processing unit of the task having the highest processing priority cannot access its data, it refers to the priority table 44c and sets the processing unit of the task having the next highest processing priority as the processing unit to be executed. In other words, the process switching unit 44 switches the processing unit to be executed from the processing unit of the task having the highest processing priority to that of the task having the next highest processing priority. At this time, of the registers 44a and 44b, the process switching unit 44 transfers to the arithmetic section 45 the instruction held in the register corresponding to the task having the next highest processing priority.

The arithmetic section 45 has an arithmetic unit 45a and a register 45b. An instruction transferred from the process switching unit 44 is first stored in the register 45b, and the arithmetic unit 45a then executes the processing corresponding to the instruction stored in the register 45b. In this way, when a processing unit cannot access its data, each of the cores 31c and 31d switches the processing unit to be executed, which makes it possible to increase how often processing units are executed on each of the cores 31c and 31d.

The data cache memory 46 holds data that the processing unit to be executed will access, read from the memory in advance. When execution of the processing unit updates that data, the updated data is first stored in the data cache memory 46, and the data in the memory is subsequently overwritten with the updated data.

In a single program for an embedded system such as the in-vehicle device 30 described above, the processing order among tasks is often not defined. Therefore, when execution of a processing unit on a core 31c, 31d would be suspended to wait for data access, a processing unit of a task whose processing order is not defined can be executed instead, which increases the frequency of parallel execution on the cores 31c and 31d. As a result, the execution speed of the generated parallel program 31a1' can be increased.

Accordingly, in the present embodiment, the parallel program 31a1' and the multicore microcomputer 31 are configured as described above, which makes it possible to increase the frequency of parallel execution of processing units on the cores 31c and 31d while preventing mutual interference between the processing units executed on the cores 31c and 31d.

The operation of the multicore microcomputer 31 when executing the parallel program 31a1' is described below with reference to the flowchart of FIG. 6, in order to explain in more detail how the frequency of parallel execution of processing units on the cores 31c and 31d can be increased while preventing mutual interference between the processing units executed on the cores 31c and 31d. The flowchart of FIG. 6 shows the processing executed by the first core 31c, but the same processing is executed by the other core.

In step S100 of FIG. 6, the instructions included in the processing units are loaded for each execution task candidate and enqueued in the corresponding instruction queue 42a, 42b. In step S110, the task having the highest processing priority among the execution task candidates is selected, and an instruction in the instruction queue 42a, 42b corresponding to the selected task is referenced. In step S120, it is determined whether the referenced instruction is a start point instruction. If it is, the processing proceeds to step S130, where the access data information included in the start point instruction is acquired and stored in the access data table 43a. Then, in step S140, it is determined whether the data indicated by the access data information is accessible, that is, whether the prohibition flag corresponding to that data is in the reset state. If the processing unit does not access data stored in the memory, the determination result in step S140 is always "Yes".

If it is determined in step S140 that the data indicated by the access data information is accessible, the processing proceeds to step S150. In step S150, the prohibition flag corresponding to the data determined to be accessible is changed to the set state. The core is now ready to execute the instructions of the processing unit that follow the start point instruction. Accordingly, in step S160, the next instruction in the processing unit of the same task is referenced, and the processing returns to step S120. This time, step S120 determines that the instruction is not a start point instruction, and the processing proceeds to step S210. Furthermore, because the referenced instruction belongs to the substantive processing of the processing unit, step S210 determines that it is not an end point instruction either, and the processing proceeds to step S250.

In step S250, the referenced instruction is executed. That is, the instruction is transferred from the process switching unit 44 to the arithmetic section 45, and the arithmetic unit 45a executes the processing corresponding to the transferred instruction. Then, in step S260, the next instruction in the processing unit of the same task is referenced, and the processing returns to step S120. This is repeated until the referenced instruction is the end point instruction, that is, until the end point of the processing unit is reached. When the referenced instruction is the end point instruction, the determination result in step S210 is "Yes". In that case, the processing proceeds to step S220, where the data in the memory is updated with the output data computed by the arithmetic section 45. If the processing unit being executed does not rewrite data in the memory, this step is skipped.

In the subsequent step S230, the prohibition flag corresponding to the data that was being accessed is changed from the set state to the reset state. This releases the lock on the data, so that a processing unit executed on the other core 31d can access it. Then, in step S240, an instruction of the next processing unit of the same task is referenced, and the processing returns to step S120. In this way, as long as the data is accessible, the processing units of the task having the highest processing priority are executed in order.

If the data to be used during execution of a processing unit is inaccessible, the determination result in step S140 is "No". In that case, the processing proceeds to step S170, where it is determined whether the task currently selected for execution is the task having the lowest processing priority among the execution task candidates. If the determination result in step S170 is "No", the execution task candidates include a task with a lower processing priority than the currently selected task. The processing therefore proceeds to step S180, where the task having the next highest processing priority is selected for execution and an instruction held in the instruction queue 42b corresponding to that task is referenced, and then returns to step S120. When such a task switch takes place, the referenced instruction is the start point instruction of a processing unit of the task having the next highest processing priority. The determination result in step S120 is therefore "Yes", and, as described above, whether the newly selected processing unit can be executed is determined afresh according to whether the data it is scheduled to access is accessible.

If, on the other hand, step S170 determines that the currently selected task is the task having the lowest processing priority among the execution task candidates, there is no task with a lower processing priority among the execution task candidates. Accordingly, when the determination result in step S170 is "Yes", the processing proceeds to step S190 and waits for a fixed time. Then, in step S200, an instruction of a processing unit of the task having the highest processing priority among the execution task candidates is referenced. In other words, when a switch of processing units becomes necessary while the task having the lowest processing priority is selected, the core waits for a fixed time and then switches to a processing unit of the task having the highest processing priority. This makes it possible once again to determine, starting from the processing unit of the task having the highest processing priority, whether each processing unit can be executed according to whether its data is accessible, and then to execute the instructions of an executable processing unit.
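The flow of FIG. 6 for a single core can be approximated by the following simulation sketch. It is a software model under several assumptions, not the described hardware: the instruction tuples, the `on_wait` callback (standing in for whatever the other core does during the fixed wait of step S190), the `max_waits` safety limit, and the handling of a task that has run out of instructions are all introduced here for illustration.

```python
def run_core(tasks, flags, on_wait, max_waits=10):
    """Simulate one core following the FIG. 6 flow and return an event trace.

    tasks   : instruction lists, index 0 = highest priority; each instruction
              is ("START", [data...]), ("OP", label) or ("END", None)
    flags   : dict data -> bool, True meaning the prohibition flag is set
    on_wait : callback invoked during the wait of step S190 (assumption)
    """
    pc = [0] * len(tasks)          # next instruction per task (S100 queues)
    held = {}                      # data locked by the unit in progress
    trace, cur, waits = [], 0, 0   # S110: begin with the highest priority

    def done(i):
        return pc[i] >= len(tasks[i])

    while not all(done(i) for i in range(len(tasks))):
        if done(cur):
            # Not covered by the flowchart: fall back to the highest-priority
            # task that still has instructions (assumption).
            cur = min(i for i in range(len(tasks)) if not done(i))
            continue
        kind, arg = tasks[cur][pc[cur]]
        if kind == "START":                                   # S120: Yes
            if any(flags[d] for d in arg):                    # S140: No
                lower = [i for i in range(cur + 1, len(tasks)) if not done(i)]
                if lower:                                     # S170: No
                    cur = lower[0]                            # S180
                    continue
                if waits == max_waits:                        # safety stop
                    break
                waits += 1                                    # S190
                trace.append(("wait",))
                on_wait(flags)
                cur = 0                                       # S200
                continue
            for d in arg:                                     # S150
                flags[d] = True
            held[cur] = arg
        elif kind == "END":                                   # S210: Yes
            for d in held.pop(cur):                           # S230
                flags[d] = False
            trace.append(("end", cur))
        else:                                                 # S250
            trace.append(("exec", cur, arg))
        pc[cur] += 1                                          # S160/S240/S260
    return trace


flags = {"x": True, "y": False}        # "x" is currently held by the other core
tasks = [
    [("START", ["x"]), ("OP", "H1"), ("END", None)],   # task_H
    [("START", ["y"]), ("OP", "L1"), ("END", None)],   # task_L
]
trace = run_core(tasks, flags, on_wait=lambda f: f.update(x=False))
```

In this example task_H is blocked on "x", so the core runs task_L first, waits once, and then runs task_H after the other core has released "x".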

As described above, according to the first embodiment of the present invention, using prohibition flags provided individually for each item of data accessed by the processing units of the tasks of the parallel program 31a1' makes it possible to determine whether a processing unit of a task that a core 31c, 31d is about to execute can access the data it will use during execution. Consequently, when the data is determined to be inaccessible, the core can switch to an executable processing unit rather than wait until the data becomes accessible. Therefore, even when the degree of parallelism of the generated parallel program 31a1' is low, the frequency of parallel execution of processing units on the plurality of cores 31c and 31d can be increased.

(Second Embodiment)
Next, a second embodiment of the present invention will be described. In the first embodiment described above, mutual interference between the processing units executed on the cores 31c and 31d is prevented using prohibition flags provided individually for each item of data accessed by the processing units of the tasks of the parallel program 31a1'. However, the technique for prohibiting processing units executed on another core from accessing data accessed by a processing unit being executed on one core is not limited to that of the first embodiment. The second embodiment describes an example of a different technique.

FIG. 7 shows the configuration of the multicore microcomputer 31 in the second embodiment. As shown in FIG. 7, the multicore microcomputer 31 of this embodiment includes a memory protection unit (MPU) 47 that manages access to the memory. The MPU 47 operates so as to permit only a single core 31c, 31d to access the same data. That is, when a core 31c, 31d requests access to data that is not being accessed by either core 31c, 31d, the MPU 47 grants the request. When, however, a core 31c, 31d requests access to data that is being accessed by the other core 31c, 31d, the MPU 47 rejects the request.

As shown in FIG. 7, data access requests to the MPU 47 are output from the access data control unit 43 of each of the cores 31c and 31d. In response to an access request, the MPU 47 returns to the access data control unit 43 either a permission notification granting access or a rejection notification denying it. In step S340 of FIG. 8, the access data control unit 43 of each of the cores 31c and 31d determines, based on the permission or rejection notification from the MPU 47, whether the data stored in the memory can be accessed.

When the access data control unit 43 receives a permission notification from the MPU 47, its core reads the data for execution of the processing unit and stores it in the data cache memory 46. In addition, in step S350 of the flowchart of FIG. 8, the access data control unit 43 of that core requests the MPU 47 to lock the data that was read. In response to this lock request, the MPU 47 prohibits access to that data from the other core.

When all instructions belonging to the substantive processing of the processing unit have finished and the end point of the processing unit is reached, the determination result in step S410 of the flowchart of FIG. 8 is "Yes", and steps S420 to S440 are executed. In step S430, the access data control unit 43 requests the MPU 47 to unlock the locked data. In response to this unlock request, the MPU 47 makes the data available for access from the other core as well.
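The request, lock, and unlock exchange between an access data control unit 43 and the MPU 47 can be modeled with a small sketch. The class `MPU`, its method names, and the representation of cores and data as strings are hypothetical; a real memory protection unit is configured through hardware registers, and the specification does not describe a software interface.

```python
class MPU:
    """Toy model of the arbitration performed by the MPU 47 (names assumed)."""

    def __init__(self):
        self._owner = {}   # data id -> core currently holding the lock

    def request_access(self, core, data_id):
        """Return True for a permission notification, False for a rejection (S340)."""
        owner = self._owner.get(data_id)
        return owner is None or owner == core

    def lock(self, core, data_id):
        """Lock data for `core` after permission has been granted (S350)."""
        if not self.request_access(core, data_id):
            raise PermissionError(f"{data_id} is held by {self._owner[data_id]}")
        self._owner[data_id] = core

    def unlock(self, core, data_id):
        """Release data at the end point of the processing unit (S430)."""
        if self._owner.get(data_id) == core:
            del self._owner[data_id]


mpu = MPU()
granted_c = mpu.request_access("31c", "rpm")        # nobody holds "rpm" yet
mpu.lock("31c", "rpm")
granted_d = mpu.request_access("31d", "rpm")        # rejected while 31c holds it
mpu.unlock("31c", "rpm")
granted_d_after = mpu.request_access("31d", "rpm")  # permitted after the unlock
```

Because the permission check and the lock are separate calls in this sketch, a real arbiter would need to make them atomic; the model ignores that timing window.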

The rest of the configuration of the multi-core microcomputer 31 and the remaining processing in the flowchart of FIG. 8 are the same as in the first embodiment described above, and their description is therefore omitted.

With the second embodiment described above, as with the first embodiment, it is possible to determine whether the processing unit of the task that each of the cores 31c and 31d is about to execute can access the data it will use during execution. Therefore, when it is determined that the data cannot be accessed, the core can switch to an executable processing unit instead of waiting until the data becomes accessible. As a result, even if the degree of parallelism of the generated parallel program 31a1′ is low, the frequency with which processing units are executed in parallel on the plurality of cores 31c and 31d can be increased.
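The switch-instead-of-wait behaviour can be sketched as follows. This is an illustrative model only: the function name `select_processing_unit`, the tuple layout, the task names, and the convention that a smaller number means a higher processing priority are assumptions made for the example, not details taken from the specification.

```python
def select_processing_unit(tasks, locked):
    """Return the first task, in priority order, whose data is accessible.

    tasks  : list of (priority, task_name, needed_data) tuples, where a
             smaller priority number means a higher processing priority
             (an assumed convention) and needed_data is a set of names.
    locked : set of data names currently locked by another core.
    """
    for _priority, name, needed in sorted(tasks, key=lambda t: t[0]):
        if needed.isdisjoint(locked):
            return name  # accessible: execute this processing unit
        # Not accessible: fall through to the next-highest priority task
        # instead of waiting for the lock to be released.
    return None  # no processing unit is executable at the moment


tasks = [
    (1, "task_A", {"x"}),
    (2, "task_B", {"y"}),
    (3, "task_C", {"x", "z"}),
]

no_contention = select_processing_unit(tasks, locked=set())
x_locked = select_processing_unit(tasks, locked={"x"})
all_blocked = select_processing_unit(tasks, locked={"x", "y"})
```

With `x` locked by the other core, the core runs `task_B` rather than idling on `task_A`, which is the effect the paragraph above attributes to both embodiments.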

(Third Embodiment)
Next, a third embodiment of the present invention will be described. In the first and second embodiments described above, mutual interference between processing units executed on a plurality of cores is prevented by prohibiting other cores from accessing data that one core is accessing.

In the present embodiment, in addition to the configuration of the first or second embodiment described above, switching to a processing unit of another task is prohibited in the other cores when, for example, one core accesses data that is particularly important for control in order to update it. This reliably prevents the important data from being inadvertently affected by a processing unit of another task that the other core would otherwise have switched to. In addition, between processing units within the same task, waiting (synchronization) processing can be added according to the dependencies by the conventional parallelizing compilation method, for example by the synchronization point instruction adding unit. This reliably prevents the important data from being inadvertently affected by processing units within the same task. For details, refer to JP-A-2015-1807.

In the multi-core microcomputer 31 according to the present embodiment, as shown in FIG. 9, the access data control units 43 of the cores 31c and 31d are configured to communicate with each other. When, following the start point instruction, a start instruction of an exclusive section that prohibits the other core from switching to a processing unit of another task is set (step S660 of the flowchart of FIG. 10: Yes), the access data control unit 43 of each of the cores 31c and 31d outputs a preemption prohibition request, which is a request to prohibit switching to a processing unit of another task, to the access data control unit of the other core (step S670 of the flowchart of FIG. 10). Although the flowchart of FIG. 10 assumes operation with the configuration of the first embodiment, the configuration of the present embodiment may also be added to the configuration of the second embodiment.

Even if the control processing of the task being executed cannot access the data it is scheduled to access (step S540 of the flowchart of FIG. 10: No), the access data control unit of the other core that has received this preemption prohibition request returns directly to the processing of step S120 in response to the request, as shown in step S570 of the flowchart of FIG. 10, and thereby prohibits switching to a processing unit of another task.

An exclusive section end instruction is set immediately before the end point instruction so as to pair with the exclusive section start instruction. When the substantive processing of the processing unit ends and the exclusive section end instruction is referenced, the determination result in the flowchart of FIG. 10 becomes "Yes". In this case, the processing proceeds to step S720, and the preemption prohibition request being output to the other core is stopped.
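The interaction between the exclusive section and the preemption prohibition request can be sketched as below. The class `Core`, its attributes, and its method names are hypothetical, and the two-core pairing via a `peer` reference is an assumption made to keep the example small; the specification describes hardware access data control units that exchange the request.

```python
class Core:
    """Toy model of one core and its access data control unit."""

    def __init__(self, name):
        self.name = name
        self.peer = None                    # the other core (set after creation)
        self.preemption_prohibited = False  # raised by the peer's request

    def enter_exclusive_section(self):
        """Exclusive section start: send the prohibition request (cf. S670)."""
        self.peer.preemption_prohibited = True

    def leave_exclusive_section(self):
        """Exclusive section end: stop the prohibition request (cf. S720)."""
        self.peer.preemption_prohibited = False

    def on_data_inaccessible(self, current_task, next_task):
        """Choose what to run when the current task's data is inaccessible."""
        if self.preemption_prohibited:
            return current_task  # stay on the same task and retry (cf. S570)
        return next_task         # otherwise switch to another task's unit


first, second = Core("31c"), Core("31d")
first.peer, second.peer = second, first

first.enter_exclusive_section()
during = second.on_data_inaccessible("task_A", "task_B")

first.leave_exclusive_section()
after = second.on_data_inaccessible("task_A", "task_B")
```

While the first core is inside its exclusive section, the second core keeps retrying `task_A`; once the request is stopped, it is free to switch to `task_B` again.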

Preferred embodiments of the present invention have been described above. However, the present invention is in no way limited to these embodiments, and various modifications are possible without departing from the gist of the present invention.

For example, in the first and second embodiments described above, it is first determined, for the processing unit of the task having the highest processing priority, whether the data to be accessed by the instructions of that processing unit is accessible, and, if it is not, the processing unit to be executed is switched to the processing unit of the task having the next highest processing priority. However, it is also possible, for example, to first determine, for the processing unit of each task that is an execution task candidate, whether the data to be accessed is accessible, and then to determine, as the execution target, the processing unit of the task having the highest processing priority among the tasks whose data is accessible.
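The alternative ordering described here (check accessibility for every candidate first, then pick by priority) can be sketched as follows. The function name `select_among_accessible`, the tuple layout, and the convention that a smaller number means a higher processing priority are assumptions for illustration only.

```python
def select_among_accessible(tasks, locked):
    """Check every candidate first, then pick the highest-priority one.

    tasks  : list of (priority, task_name, needed_data) tuples, where a
             smaller priority number means a higher processing priority
             (an assumed convention) and needed_data is a set of names.
    locked : set of data names currently locked by another core.
    """
    # Step 1: determine accessibility for all execution task candidates.
    accessible = [t for t in tasks if t[2].isdisjoint(locked)]
    if not accessible:
        return None
    # Step 2: among the accessible tasks, choose the highest priority.
    return min(accessible, key=lambda t: t[0])[1]


tasks = [
    (3, "task_C", {"z"}),
    (1, "task_A", {"x"}),
    (2, "task_B", {"y"}),
]

picked = select_among_accessible(tasks, locked={"x"})
nothing = select_among_accessible(tasks, locked={"x", "y", "z"})
```

In a static snapshot like this one, the result matches what a priority-ordered scan would select; the two variants differ in when the accessibility checks are performed, not in which processing unit wins.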

In the embodiments described above, an example in which the multi-core microcomputer 31 is applied to the in-vehicle device 30 has been described, but the applications of the multi-core microcomputer 31 are not limited to this.

1: automatic parallelizing compiler, 10: computer, 10a: lexical analysis unit, 10b: semantic analysis unit, 10c: dependency analysis unit, 10d: core allocation and scheduling unit, 10e: synchronization point instruction adding unit, 10f: code generation unit, 11: display, 12: HDD, 13: CPU, 14: ROM, 15: RAM, 16: input device, 17: reading unit, 18: storage medium, 20: compiler, 30: in-vehicle device, 31: multi-core microcomputer, 31a: ROM, 31a1: parallel program, 31b: RAM, 31c: first core, 31d: second core, 32: communication unit, 33: sensor unit, 34: input/output port, 40a, 40b: instruction cache memory, 41: instruction fetch/decode unit, 42a, 42b: instruction queue, 43: access data control unit, 43a, 43b: access data table, 44: process switching unit, 44a, 44b: register, 44c: priority table, 45: arithmetic processing unit, 45a: arithmetic unit, 45b: register, 46: data cache memory, 47: MPU

Claims

A multi-core microcomputer (31) for executing a parallel program (31a1′) for a multi-core microcomputer having a plurality of cores (31c, 31d), the parallel program being generated from a single program for a single-core microcomputer having one core, wherein
in the parallel program, for each task that is included in the single program and consists of a plurality of processing units, the allocation of the plurality of processing units to the plurality of cores and their execution order have been determined based on the dependencies among the plurality of processing units,
each of the plurality of cores of the multi-core microcomputer has a selection unit (S110 to S200, S310 to S400, S510 to S610, S670) that selects a processing unit to be executed from among the processing units belonging to the plurality of tasks allocated to its own core,
the selection unit includes:
a determination unit (S140, S340, S540) that determines, for the processing unit of each task, whether the data to be used can be accessed when the processing unit is executed; and
a determination-time processing unit (S150 to S160, S350 to S360, S550 to S560, S670) that determines, as the processing unit to be executed, the processing unit for which the determination unit has determined that the data to be used is accessible, and that executes a prohibition process (S150, S350, S550, S670) for prohibiting access to the data by a processing unit executed in another core, and
each of the plurality of cores executes the instructions included in the processing unit determined by the determination-time processing unit.

The selecting unit includes a specifying unit (S110, S310, S510) for specifying a processing unit belonging to a task having a relatively highest processing priority based on the processing priority of each of the plurality of tasks.
The determining unit, for the processing unit specified by the specifying unit, determines whether or not the data to be used is accessible,
When the determining unit determines that the data to be used is inaccessible, the selecting unit switches the processing unit to be determined by the determining unit to a processing unit belonging to a task having the next highest processing priority. The multi-core microcomputer according to claim 1, further comprising a switching unit (S170 to S200, S370 to S400, S580 to S610).

3. The multi-core microcomputer according to claim 2, wherein, when the processing unit for which the determination unit has determined that the data is inaccessible belongs to the task having the lowest processing priority, the switching unit switches the processing unit to be judged by the determination unit to a processing unit belonging to the task having the highest processing priority.

The multi-core microcomputer according to claim 3, wherein, when switching the processing unit to be judged by the determination unit from a processing unit belonging to the task having the lowest processing priority to a processing unit belonging to the task having the highest processing priority, the switching unit performs the switch after a predetermined waiting time has elapsed.

A plurality of instruction queues (42a, 42b) for storing at least instructions included in each processing unit of the plurality of tasks allocated to the own core;
An arithmetic processing unit (45) for executing arithmetic processing according to the instruction given from the instruction queue;
Instructions included in each processing unit belonging to the plurality of tasks are stored in the plurality of instruction queues,
The multi-core microcomputer according to claim 2, wherein the switching unit switches an instruction queue serving as a source of instructions to the arithmetic processing unit.

Each processing unit of the parallel program includes a start point instruction indicating a start point thereof,
6. The multi-core microcomputer according to claim 1, wherein the start point instruction includes access data information indicating data to be used when the data is accessed by an instruction included in a corresponding processing unit.

7. The multi-core microcomputer according to claim 6, wherein the determination unit determines whether or not the data to be used can be accessed when the processing unit is executed, in accordance with the start point instruction.

Each processing unit of the parallel program includes an end point instruction indicating its end point,
8. The multi-core microcomputer according to claim 6, wherein the prohibition processing by the determination-time processing unit is released in response to the end point instruction.

The determination-time processing unit performs, as the prohibition process (S150, S550), a process of setting a flag indicating that the data to be used is in use, thereby locking the data,
9. The multi-core microcomputer according to claim 1, wherein the determination unit determines whether the data is accessible based on whether the flag corresponding to the data to be used is set and the data is thereby locked.

A memory accessible by the plurality of cores;
A memory management unit (47) for managing access to the memory;
The data used by the plurality of cores when executing the respective processing units of the assigned tasks is stored in the memory,
The determination-time processing unit performs, as the prohibition process (S350), a process of instructing the memory management unit to lock the data to be used,
9. The multi-core microcomputer according to claim 1, wherein the determination unit determines whether the data to be used is accessible based on whether the data is locked by the memory management unit.

The multi-core microcomputer according to claim 9, wherein the determination-time processing unit further outputs, as the prohibition process (S 670), an instruction to prohibit the other core from switching to a processing unit of another task.

The multi-core microcomputer is applied to a vehicle-mounted device for controlling a vehicle-mounted device mounted on a vehicle, and controls the vehicle-mounted device by executing the parallel program. Multi-core microcomputer.

A parallelization method for generating a parallel program (31a1) for a multi-core microcomputer having a plurality of cores (31c, 31d) from a single program for a single-core microcomputer having one core, the method comprising:
a core allocation and execution order determination procedure (10a to 10d) for determining, for each task that is included in the single program and consists of a plurality of processing units, the allocation of the plurality of processing units to the plurality of cores and their execution order, based on the dependencies among the plurality of processing units; and
a parallel program generation procedure (10e, 10f) for generating the parallel program so that the plurality of processing units are executed by the plurality of cores of the multi-core microcomputer in accordance with the allocation to the plurality of cores and the execution order determined in the core allocation and execution order determination procedure, and for adding a start point instruction to the start point, and an end point instruction to the end point, of each processing unit of the parallel program,
wherein the start point instruction and the end point instruction cause the multi-core microcomputer to arbitrate data access so that instructions included in processing units executed simultaneously by the plurality of cores do not access the same data, and cause a core for which data access is prohibited to execute a processing unit of another task instead of the processing unit of the corresponding task.

14. The parallelization method according to claim 13, wherein when the data is accessed by an instruction included in a corresponding processing unit, the start point instruction includes access data information indicating data to be accessed.

15. The parallelization method according to claim 14, wherein the start point instruction includes an instruction for determining whether or not the data indicated by the access data information is accessible.

16. The parallelization method according to claim 15, wherein the start point instruction includes an instruction to switch to a processing unit of another task when it is determined that the data indicated by the access data information cannot be accessed.

17. The parallelization method according to claim 15, wherein the start point instruction includes an instruction to lock the data when it is determined that the data indicated by the access data information is accessible.

18. The parallelization method according to claim 17, wherein the end point instruction includes an instruction for unlocking the data.

19. The parallelization method according to claim 17 or 18, wherein the parallel program generation procedure further adds, after the start point instruction that includes the instruction to lock the data, a prohibition instruction for prohibiting another core from switching to a processing unit of another task.

20. The parallelization method according to claim 19, wherein the parallel program generation procedure adds a release instruction for releasing the prohibited instruction immediately before the end point instruction.