JP5408330B2 - Multi-core processor system, thread control method, and thread control program

Info

Publication number: JP5408330B2
Application number: JP2012501562A
Authority: JP
Inventors: 浩一郎山下; 宏真山内; 清志宮▲崎▼
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2010-02-23
Filing date: 2010-02-23
Publication date: 2014-02-05
Anticipated expiration: 2030-02-23
Also published as: US20120304183A1; US9311142B2; WO2011104823A1; US20160179429A1; JPWO2011104823A1

Description

The present invention relates to a multi-core processor system that controls threads, a thread control method, and a thread control program.

Conventionally, multiprogramming techniques exist for running a plurality of programs on a single CPU (Central Processing Unit). Specifically, an OS (Operating System) has a function of dividing the processing time of the CPU, and by assigning processes or threads to the divided time slices, the CPU runs a plurality of processes or threads concurrently. Here, a process is a unit of program execution.

A technique has also been disclosed in which, for a process that must finish within a fixed time, the OS gives the threads of that process more CPU time than other processes, so that the process is handled preferentially and completes within the fixed time. Further, as a technique for switching processes effectively, a technique has been disclosed in which the OS acquires the number of executed instructions for each process and runs the process with the larger number of executed instructions first (see, for example, Patent Document 1 below). With this technique, the process that has the most data held in the cache memory is executed, which improves overall throughput.

Multi-core processor system techniques, in which a plurality of CPUs are mounted in a computer system, have also been disclosed. With these, in the multiprogramming technique described above, the OS can assign a plurality of programs to a plurality of processes. As one configuration, a multi-core processor system with a distributed structure has been disclosed, in which each CPU holds a dedicated memory and accesses a shared memory only when other data is needed. A multi-core processor system with a centralized shared structure has also been disclosed, in which each CPU holds only a cache memory and the necessary data is stored in a shared memory.

Patent Document 1: Japanese Laid-Open Patent Publication No. H9-330237 (JP-A-9-330237)

However, in a multi-core processor system, contention occurs when a plurality of CPUs access the shared memory simultaneously. When contention occurs, a CPU can no longer finish processing within the normal processing time, so real-time processing, which must finish within a fixed time, cannot be performed. Real-time processing refers to processing that must finish by a time determined in advance at design, and to processing for which, in an interrupt operation, the allowable interval from the occurrence of the interrupt event to the start of the interrupt handler body is specified.

Contention is also caused by hardware. Therefore, even if the technique of Patent Document 1 is applied to a multi-core processor system, a CPU may still cause contention through the process that has the most data held in the cache memory, and the contention is not resolved.

When the distributed system described above is used, contention occurs less frequently, but a memory must be provided for each CPU, which increases cost and power consumption. For this reason, in embedded environments where cost and power consumption are constrained, multi-core processor systems based on the centralized shared system are commonly used. In such systems, however, a plurality of CPUs often access the shared memory at the same time, so contention occurs frequently.

To solve the problems of the conventional techniques described above, an object of the present invention is to provide a multi-core processor system, a thread control method, and a thread control program that can guarantee real-time processing in a multi-core processor system.

To solve the problems described above and achieve the object, the disclosed multi-core processor system detects, among a plurality of cores, first cores having the highest execution priority; identifies, among the detected first cores, a second core that has caused access contention for the memory; and controls all cores other than the first cores and the second core so that they execute a thread that does not access the memory during at least part of the period of the access contention.
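The control flow summarized above can be illustrated with a minimal Python sketch. The `Core` record, its field names, the numeric priority encoding, and the function `cores_to_throttle` are hypothetical and introduced only for illustration; the source describes the behavior, not this code.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    priority: int        # assumed encoding: larger value = higher execution priority
    in_contention: bool  # True if the core is suffering shared-memory access contention


def cores_to_throttle(cores):
    """Return IDs of cores that should run a thread that does not access the memory."""
    if not cores:
        return []
    top = max(c.priority for c in cores)
    first = [c for c in cores if c.priority == top]   # first cores (highest priority)
    second = [c for c in first if c.in_contention]    # second cores (in contention)
    if not second:
        return []                                     # nothing to protect right now
    excluded = {c.core_id for c in first}             # second cores are a subset of first
    return [c.core_id for c in cores if c.core_id not in excluded]
```

With one high-priority core in contention and two ordinary cores, the two ordinary cores would be selected; with no high-priority core in contention, the list is empty.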

According to the multi-core processor system, thread control method, and thread control program, real-time processing can be guaranteed on a core that is performing real-time processing and is experiencing contention.

FIG. 1 is a block diagram showing the hardware configuration of the multi-core processor system according to the embodiment.
FIG. 2 is a block diagram showing part of the hardware configuration and the software configuration of each CPU of the multi-core processor system 100.
FIG. 3 is a block diagram showing the functional configuration of the multi-core processor system 100.
FIG. 4 is an explanatory diagram showing a contention state.
FIG. 5 is an explanatory diagram showing a state in which contention has been resolved.
FIG. 6 is an explanatory diagram showing the performance ratio of the multi-core processor system 100 to which the present embodiment is applied.
FIG. 7 is an explanatory diagram showing an example of the contents stored in the priority table 303-1.
FIG. 8 is a flowchart showing message transmission processing by the hypervisor.
FIG. 9 is a flowchart showing message reception processing by the hypervisor.

Preferred embodiments of the multi-core processor system, thread control method, and thread control program according to the present invention are described in detail below with reference to the accompanying drawings.

(Hardware configuration of multi-core processor system)
FIG. 1 is a block diagram showing the hardware configuration of the multi-core processor system according to the embodiment. A multi-core processor system is a computer system that includes a processor equipped with a plurality of cores. As long as a plurality of cores are provided, the system may be a single processor with multiple cores or a group of single-core processors arranged in parallel. In the present embodiment, to simplify the description, a group of CPUs that are single-core processors arranged in parallel is used as the example.

The multi-core processor system 100 includes CPUs 101, in which a plurality of CPUs are mounted, a ROM (Read-Only Memory) 102, a RAM (Random Access Memory) 103, and a flash ROM 104. The multi-core processor system 100 also includes a display 105 and an I/F (Interface) 106 as input/output devices for the user and other equipment. These components are connected to one another by a bus 108. The hardware configuration of the present embodiment is one to which the centralized shared system is applied.

Here, the CPUs 101 govern overall control of the multi-core processor system. The term CPUs 101 refers collectively to all of the single-core processors connected in parallel. Details are described later with reference to FIG. 2. The ROM 102 stores programs such as a boot program. The RAM 103 is used as a work area for the CPUs 101.

The flash ROM 104 is a rewritable, non-volatile semiconductor memory that retains its data even when the power is turned off. The flash ROM 104 stores software programs and data. An HDD (hard disk drive), which is a magnetic disk, may be used for storage instead of the flash ROM 104, but using the flash ROM 104 makes the system more resistant to vibration than a mechanically operated HDD. For example, even if a device built from the multi-core processor system 100 is subjected to strong vibration, using the flash ROM 104 lowers the likelihood of data loss.

The display 105 displays a cursor, icons, and tool boxes, as well as data such as documents, images, and function information. A TFT liquid crystal display, for example, can be used as the display 105. The display 105 may also accept input through a touch panel.

The I/F 106 is connected through a communication line to a network 107 such as a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet, and is connected to other devices via the network 107. The I/F 106 manages the interface between the network 107 and the inside of the system, and controls the input and output of data to and from external devices. A modem or a LAN adapter, for example, can be used as the I/F 106.

FIG. 2 is a block diagram showing part of the hardware configuration and the software configuration of each CPU of the multi-core processor system 100. The hardware of the multi-core processor system 100 consists of the CPUs 101 and a shared memory 203. The CPUs 101 consist of a plurality of CPUs: CPU 201-1, CPU 201-2, ..., CPU 201-n.

The CPU 201-1, CPU 201-2, ..., CPU 201-n hold a cache memory 202-1, a cache memory 202-2, ..., and a cache memory 202-n, respectively. Each CPU is connected to the shared memory 203 by the bus 108. The following description focuses on the CPU 201-1 and the CPU 201-2.

As the software configuration of the multi-core processor system 100, the CPU 201-1 executes a hypervisor 204-1 and an OS 205-1. Under the control of the hypervisor 204-1, the CPU 201-1 executes a monitoring library 206-1. Under the control of the OS 205-1, the CPU 201-1 executes real-time software 207. Likewise, the CPU 201-2 executes a hypervisor 204-2 and an OS 205-2. Under the control of the hypervisor 204-2, the CPU 201-2 executes a monitoring library 206-2, and under the control of the OS 205-2, it executes software 208.

When the CPU 201-1 executes the real-time software 207, data can be accessed in two ways: through an access path 209 or through an access path 210. Similarly, when the CPU 201-2 executes the software 208, data can be accessed through an access path 211 or through an access path 212. The hypervisor 204-1, the hypervisor 204-2, and the hypervisors running on the other CPUs perform inter-hypervisor communication 213.

The CPU 201-1, CPU 201-2, ..., CPU 201-n govern control of the multi-core processor system 100. They may operate as SMP (Symmetric Multi-Processing), in which processing is assigned symmetrically and uniformly, or as ASMP (Asymmetric Multi-Processing), in which the CPU responsible for each kind of processing is determined in advance. As an example of ASMP, the multi-core processor system 100 may assign real-time processing, which should be handled preferentially, to the CPU 201-1.

The shared memory 203 is a storage area accessible from the CPU 201-1, CPU 201-2, ..., CPU 201-n. Specifically, the storage area is, for example, the ROM 102, the RAM 103, or the flash ROM 104. For example, when the CPU 201-1 requests the display 105 to show image data, it accesses the VRAM (Video RAM) included in the RAM 103 and writes the image data to the VRAM. Accordingly, an access by the CPU 201-1 to the display 105 is also counted as an access to the shared memory 203.

The same applies when, for example, the CPU 201-1 accesses the I/F 106. If the I/F 106 is a LAN adapter, for instance, the CPU either accesses a buffer in the LAN adapter or accesses the RAM 103, from which the data is then transferred to the LAN adapter. In either case, from the viewpoint of the CPU 201-1 and the CPU 201-2 this is an access to shared memory, so an access by the CPU 201-1 or the CPU 201-2 to the I/F 106 is also counted as an access to the shared memory 203. Similarly, when the CPU 201-1 accesses the I/F 106, it accesses a shared storage area prepared by the device driver that controls the I/F 106, and as a result it accesses the shared memory 203.

The hypervisor 204-1 and the hypervisor 204-2 are programs that run on the CPU 201-1 and the CPU 201-2, respectively. A hypervisor sits between the OS and the CPU. It monitors the OS, resets the OS when it hangs, and switches to a power-saving setting when the OS is not executing any thread. In addition, through their respective hypervisors, the CPU 201-1 and the CPU 201-2 execute the monitoring library 206-1 and the monitoring library 206-2, which monitor contention and are a feature of the present embodiment.

The OS 205-1 and the OS 205-2 are programs that run on the CPU 201-1 and the CPU 201-2, respectively, on top of the hypervisor 204-1 and the hypervisor 204-2. For example, the OS 205-1 has a thread scheduler that assigns the real-time software 207 to the CPU 201-1 for execution.

The monitoring library 206-1 and the monitoring library 206-2 are programs that run on the hypervisor 204-1 and the hypervisor 204-2, respectively, and monitor whether contention due to access conflicts on the shared memory 203 is occurring. If monitoring shows that contention is occurring, the monitoring library sends information between hypervisors to notify the other hypervisors that contention has occurred.

The real-time software 207 is a program assigned to the CPU 201-1 by the OS 205-1. Communication packet processing is a specific example of real-time software. It must complete within a time determined by the protocol and therefore requires real-time processing. The software 208 is a program assigned to the CPU 201-2 by the OS 205-2 and does not require real-time processing. As described above, the present embodiment assumes a state in which the CPU 201-1 is executing software that requires a guarantee of real-time processing.

The access path 209 is the path by which the CPU 201-1 accesses the cache memory 202-1. The access path 210 is the path by which the CPU 201-1 accesses the shared memory 203. The two differ as follows: if the data that the real-time software 207 wants to access is in the cache memory 202-1, the access path 209 is used; otherwise, the access path 210 is used. The same holds for the access paths 211 and 212. The access path 211 is the path by which the CPU 201-2 accesses the cache memory 202-2, and the access path 212 is the path by which the CPU 201-2 accesses the shared memory 203.

The inter-hypervisor communication 213 is communication for sending and receiving messages between hypervisors. Specifically, for example, when the CPU 201-1 enters a contention state while executing the real-time software 207, the hypervisor 204-1 broadcasts a message to all hypervisors, including the hypervisor 204-2.
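As a rough illustration of the broadcast described above, the following Python sketch models each hypervisor as an object with an inbox. The `Hypervisor` class, the message fields, and the choice not to deliver the message back to the sender are assumptions made for this sketch and are not specified in the source.

```python
from collections import deque


class Hypervisor:
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.inbox = deque()  # messages received from other hypervisors


def broadcast_contention(sender, hypervisors):
    """Deliver a contention notice from `sender` to every other hypervisor.

    Returns the number of hypervisors that received the message.
    """
    message = {"type": "contention", "from": sender.cpu_id}  # hypothetical format
    delivered = 0
    for hv in hypervisors:
        if hv is not sender:  # assumption: the sender does not notify itself
            hv.inbox.append(message)
            delivered += 1
    return delivered
```

A receiving hypervisor would then take the message from its inbox and decide how to control its own core.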

(Functional configuration of multi-core processor system)
Next, the functional configuration of the multi-core processor system 100 is described. FIG. 3 is a block diagram showing the functional configuration of the multi-core processor system 100. The multi-core processor system 100 includes a priority detection unit 305, an issued instruction efficiency calculation unit 306, a contention detection unit 307, a specifying unit 308, and a control unit 311. Specifically, the functions serving as the control unit (the priority detection unit 305 through the control unit 311) are realized when the CPUs 101 execute a program stored in a storage device such as the ROM 102, the RAM 103, or the flash ROM 104 shown in FIG. 1.

The CPU 201-1, CPU 201-2, ..., CPU 201-n execute a hypervisor and OS/software. Among the regions separated by dash-dot lines, the priority detection unit 305 through the inter-hypervisor message transmission unit 309, shown in a region 301-1, are realized by the CPU 201-1 executing them as part of the functions of the hypervisor 204-1. Similarly, the inter-hypervisor message reception unit 310 and the control unit 311, shown in a region 301-2, are realized by the CPU 201-2 executing them as part of the functions of the hypervisor 204-2.

Although not shown, the hypervisors executed by cores other than the CPU 201-1 also have the functions of the priority detection unit 305 through the inter-hypervisor message transmission unit 309. Similarly, the hypervisors executed by cores other than the CPU 201-2 also have the functions of the inter-hypervisor message reception unit 310 and the control unit 311. The priority detection unit 305 through the inter-hypervisor message transmission unit 309 correspond to the monitoring library 206-1. Similarly, the inter-hypervisor message reception unit 310 and the control unit 311 correspond to the monitoring library 206-2.

The OS scheduler monitoring unit 304-1, the real-time software 207, and software 312, shown in a region 302-1, are realized by the CPU 201-1 executing them as part of the functions of the OS 205-1. The priority table 303-1 is a table that can be accessed from the hypervisor 204-1 or the OS 205-1.

The OS scheduler monitoring unit 304-2, the software 208, software 313, a nice value setting unit 314, and a dummy thread activation unit 315, shown in a region 302-2, are realized by the CPU 201-2 executing them as part of the functions of the OS 205-2. The priority table 303-2 is a table that can be accessed from the hypervisor 204-2 or the OS 205-2.

The priority table 303-1 and the priority table 303-2 are tables that manage the processing executed in the multi-core processor system 100 in association with the priority of that processing. The contents of the priority table 303-1 are described in detail later with reference to FIG. 7.

The OS scheduler monitoring unit 304-1 and the OS scheduler monitoring unit 304-2 have a function of monitoring the software assigned to the CPU 201-1 and the CPU 201-2. As a specific example, assume that the real-time software 207 is assigned to the CPU 201-1 and requests access to a shared resource in the RAM 103 or the flash ROM 104.

If, at this time, other software has already declared that it is using the shared resource, the OS scheduler monitoring unit 304-1 places the real-time software 207 in a wait state. It then puts other software that was in a ready state, for example the software 312, into the running state and assigns it to the CPU 201-1.

As another specific example, when the real-time software 207 has been assigned to the CPU 201-1 for a predetermined period or longer, the OS scheduler monitoring unit 304-1 also assigns other software to the CPU 201-1. Switching the software assigned to a CPU in this way is called dispatching.

When starting new software, the OS scheduler monitoring unit 304-1 starts it as a thread, which is the unit of software execution. Each thread has a stack area, register information including a program counter, and so on. Each time it dispatches, the OS scheduler monitoring unit 304-1 saves the register information and other state of the currently running software to the shared memory 203, obtains the register information and other state of the next software from the shared memory 203, and loads it into the CPU registers.
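The save-and-restore step performed at each dispatch can be sketched as follows. This is a simplified Python model, not the OS implementation: the register set is represented as a dictionary, `saved_contexts` stands in for the area of the shared memory 203 that holds saved register information, and all names are hypothetical.

```python
def dispatch(cpu_registers, current_tid, next_tid, saved_contexts):
    """Switch the CPU from thread `current_tid` to thread `next_tid`.

    cpu_registers:  dict modelling the live CPU registers (modified in place)
    saved_contexts: dict of thread id -> saved register dict (models shared memory)
    """
    # Save the running thread's register information.
    saved_contexts[current_tid] = dict(cpu_registers)
    # Load the next thread's register information into the CPU.
    cpu_registers.clear()
    cpu_registers.update(saved_contexts[next_tid])
    return next_tid
```

Because both the save and the restore touch the saved-context area, a dispatch is itself a source of shared-memory accesses in this model.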

The multi-core processor system 100 may also form a single process from a set of threads. Threads share a common memory space, whereas processes have independent memory spaces and cannot directly access one another's memory. The present embodiment is described in terms of threads, but threads may be replaced with processes.

When assigning other software to the CPU 201-1, if there are multiple candidate pieces of software, the OS scheduler monitoring unit 304-1 may make the assignment based on the priority table 303-1. Alternatively, the OS scheduler monitoring unit 304-1 may use the time at which each piece of software was last assigned and assign the software with the oldest assignment time.
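The two selection policies mentioned above can be sketched in a few lines of Python. The function name `pick_next`, the candidate tuple layout, and the rank encoding of the priority table (smaller rank means higher priority) are assumptions made for illustration only.

```python
def pick_next(candidates, priority_table=None):
    """Choose the next software to assign to the CPU.

    candidates:     list of (name, last_assigned_time) tuples
    priority_table: optional dict of name -> rank, smaller rank = higher priority
                    (assumed encoding); if omitted, the oldest assignment wins
    """
    if not candidates:
        return None
    if priority_table is not None:
        # Policy 1: follow the priority table.
        return min(candidates, key=lambda c: priority_table[c[0]])[0]
    # Policy 2: pick the software whose assignment time is oldest.
    return min(candidates, key=lambda c: c[1])[0]
```

Either policy picks a single candidate; which one applies is a design choice left open by the description.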

The priority detection unit 305 has a function of detecting, among the plurality of cores, a first core having the highest execution priority. The cores here correspond to the CPU 201-1, CPU 201-2, ..., CPU 201-n that make up the CPUs 101. If the multi-core processor system 100 is ASMP and there is a CPU to which real-time processing is assigned, the first core may be detected according to which CPU it is.

Specifically, for example, the CPU 201-1 obtains the priority of the currently assigned software from the priority table 303-1, and if the priority is "real time", the CPU 201-1 is treated as a first core having the highest execution priority. Information on the detected first core is stored in a storage area such as the cache memory 202-1 or a general-purpose register of the CPU 201-1.
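A minimal Python sketch of this lookup follows. The table layout (software name mapped to a priority string) is an assumption, since the actual contents of the priority table are given in FIG. 7 and not in this passage; the entry names and the "normal" priority used in the usage note are hypothetical.

```python
REALTIME = "real time"  # priority value named in the description


def is_first_core(current_software, priority_table):
    """Return True if the software now assigned to this core has 'real time' priority."""
    return priority_table.get(current_software) == REALTIME
```

For example, with a table such as `{"packet_processing": "real time", "ui": "normal"}`, a core running `packet_processing` would be treated as a first core, and a core running `ui` would not.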

The issued instruction efficiency calculation unit 306 has a function of calculating, for each core, the issued instruction efficiency based on the number of instructions the core has issued and the number of cycles of the core. The number of issued instructions is the number of instructions the CPU issued within a predetermined time. It is stored in an issued instruction counter I, a special register of the CPU, and the hypervisor obtains the value of the issued instruction counter I by switching to supervisor mode. The number of cycles is the number of clock cycles input to the CPU within a predetermined time, and it is stored in a clock counter C, a register of the CPU. The issued instruction efficiency is the number of clocks taken per instruction and is calculated as C/I. The issued instruction efficiency calculation unit 306 may instead calculate the issued instruction efficiency as I/C, in which case the threshold τ described later is also inverted before the comparison.

Specifically, for example, the CPU 201-1 calculates C/I each time the hypervisor 204-1 is activated. Because the hypervisor runs once every several tens of microseconds to several milliseconds, it reads the issued instruction counter I and the clock counter C for that interval and calculates C/I. The calculated issued instruction efficiency is stored in a storage area such as the cache memory 202-1 or a general-purpose register of the CPU 201-1.

The contention detection unit 307 has a function of detecting access contention based on the issued instruction efficiency calculated by the issued instruction efficiency calculation unit 306 and a predetermined threshold. The predetermined threshold is a value that can be set from the specification and is denoted τ. How τ is set is described later with reference to FIG. 4. Specifically, for example, when the issued instruction efficiency C/I is greater than the threshold τ, the CPU 201-1 detects that contention due to competing accesses to the shared memory 203 has occurred. The detection result is stored in a storage area such as the cache memory 202-1 or a general-purpose register of the CPU 201-1.
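The calculation and comparison performed by units 306 and 307 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `CounterSample` type and both function names are hypothetical, and the sketch assumes that the counters I and C are free-running and that the hypervisor takes the difference between two successive readings (counter wraparound is ignored).

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical snapshot of the two per-CPU counters read by the hypervisor."""
    issued_instructions: int  # issued instruction counter I
    clock_cycles: int         # clock counter C


def issued_instruction_efficiency(prev: CounterSample, curr: CounterSample) -> float:
    """Return C/I (clocks per instruction) for the interval between two samples."""
    delta_i = curr.issued_instructions - prev.issued_instructions
    delta_c = curr.clock_cycles - prev.clock_cycles
    if delta_i <= 0:
        # Assumption: no instruction issued in the interval is treated as worst case.
        return float("inf")
    return delta_c / delta_i


def contention_detected(efficiency: float, tau: float) -> bool:
    """Contention is reported when C/I is strictly greater than the threshold tau."""
    return efficiency > tau
```

With a threshold of 1000, an interval in which 100 instructions took 200,000 clocks gives C/I = 2000 and would be reported as contention.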

The specifying unit 308 has a function of identifying, among the first cores detected by the priority detection unit 305, a second core that has caused access contention on the shared memory 203. The specifying unit 308 may also identify a third core that has caused access contention on the shared memory 203 and competes with the second core. To detect the occurrence of contention due to access conflicts, the CPU 201-1 may use the contention detection unit 307.

Specifically, for example, the CPU 201-1 detects, among the CPUs found to be “real-time” by the priority detection unit 305, a CPU on which contention is occurring and identifies it as the second core. The CPU 201-1 may also detect, among the plurality of CPUs, a CPU on which contention is occurring and identify it as the third core. Information on the identified second core or third core is stored in a storage area such as the cache memory 202-1 or a general-purpose register of the CPU 201-1.

The inter-hypervisor message transmission unit 309 has a function of broadcasting a message to the other hypervisors. Specifically, for example, the hypervisor 204-1, which is running real-time processing and has detected contention, broadcasts a message over the bus 108 to the hypervisor 204-2 and the other hypervisors. The content of the transmitted message may be stored in a storage area such as the cache memory 202-1 or a general-purpose register of the CPU 201-1.

The inter-hypervisor message reception unit 310 has a function of receiving a message transmitted by another hypervisor. Specifically, for example, the hypervisor 204-2 receives a message from the hypervisor 204-1, which is running real-time processing and has detected contention. The content of the received message is stored in a storage area such as the cache memory 202-2 or a general-purpose register of the CPU 201-2.

The control unit 311 has a function of controlling a third core, that is, a core other than the first core and the second core identified by the specifying unit 308, so that it executes a thread that does not access the shared memory 203. The control unit 311 may instead apply this control to every core other than the second core. When the specifying unit 308 has identified a third core, the control unit 311 may control that identified third core so that it executes a thread that does not access the shared memory 203.

The thread that does not access the shared memory 203 is executed for a predetermined period within the period in which access contention occurs. The predetermined period is the time slice value held by the OS 205-2. Alternatively, the predetermined period may be obtained by time-dividing the contention period between the thread that had been assigned to the third core and the thread that does not access the shared memory 203.

Specifically, for example, because the identified second core performs inter-hypervisor communication, the CPU that receives the message becomes the third core and controls the OS scheduler monitoring unit 304-2 so that a thread that does not access the shared memory 203 is executed. As the concrete control, the control unit 311 causes the OS scheduler monitoring unit 304-2 to execute either the nice value setting unit 314 or the dummy thread activation unit 315.

The nice value setting unit 314 has a function of setting the nice value of the software currently being executed. The nice value is the value set by the nice command defined in POSIX (Portable Operating System Interface for UNIX (registered trademark)). By changing this value with the nice command, the OS 205-2 controls the execution priority of the software.

Specifically, for example, raising the nice value of software that does not require real-time processing lowers its priority. As one example of how the nice command may be implemented, the OS 205-2 adds the nice value to the time at which the software's allocation ended. The OS scheduler monitoring unit 304-2 may then select the software with the smallest resulting value as the dispatch target.

As a result, the larger the nice value of a piece of software, the lower its priority. Therefore, even if the OS 205-2 does not conform to the POSIX specification and has no nice command, the nice value setting unit 314 may be realized by adding the processing described above. The value that is set is stored in a storage area such as the RAM 103 or the flash ROM 104.
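The dispatch rule described above (add the nice value to the allocation end time, then pick the smallest sum) can be sketched as follows. The `Task` type, its field names, and `pick_next` are hypothetical names introduced only for illustration, and time is represented as a plain integer tick count, which is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Task:
    name: str
    allocation_end_time: int  # tick at which the task's last allocation ended
    nice: int = 0             # larger value means lower priority


def pick_next(tasks: Sequence[Task]) -> Task:
    """Dispatch the task whose (allocation end time + nice value) is smallest."""
    return min(tasks, key=lambda t: t.allocation_end_time + t.nice)
```

For example, a rendering task that ended at tick 10 is dispatched ahead of a task that ended at tick 12, but once its nice value is raised to 5 its key becomes 15 and the other task is chosen first.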

The dummy thread activation unit 315 has a function of generating a thread that does not access the shared memory 203. Specifically, for example, the CPU 201-2 starts a thread that executes nop, an instruction that makes the CPU do nothing, for a fixed time. The nice value setting unit 314 and the dummy thread activation unit 315 are executed by the OS scheduler monitoring unit 304-2 during at least part of the period in which contention due to access conflicts occurs.

FIG. 4 is an explanatory diagram showing a contention state. The CPU 201-1 is executing the hypervisor 204-1 and the real-time software 207, and the CPU 201-2 is executing the hypervisor 204-2 and the software 208. Depending on the software being executed, each CPU accesses either its cache memory or the shared memory 203.

The hypervisors are activated periodically, at intervals of several tens of microseconds to several milliseconds. FIG. 4 divides the time shown into time 401, time 402, and time 403 according to which memory is being accessed. During time 401 and time 403, the real-time software 207 on the CPU 201-1 and the software 208 on the CPU 201-2 do not access the shared memory 203 simultaneously, so no contention state arises.

During time 402, however, the CPU 201-1 and the CPU 201-2 access the shared memory 203 simultaneously, so a contention state due to access conflicts on the shared memory 203 arises. In the contention state, a memory access by the CPU 201-1 takes several hundred cycles, which delays the processing of the real-time software 207. As a result, the CPU 201-1 may fail to finish processing by the time required of the real-time software 207, and real-time processing can no longer be guaranteed.

Next, the guarantee of real-time processing is described. Real-time processing must return a response within a fixed time, denoted Δ [seconds]. Let the clock frequency of the CPU 201-1, which performs the real-time processing, be clk [1/second]. The number of clock counts available to the CPU 201-1 during the time Δ is then Δ·clk. If the CPU 201-1 is in a contention state and cannot execute one instruction within Δ·clk counts, real-time processing cannot be guaranteed.

The number of clocks per instruction is obtained by calculating C/I from the issued instruction counter I and the clock counter C over a fixed time. The threshold τ is defined as τ = Δ·clk. When C/I is less than or equal to τ, the CPU 201-1 can guarantee real-time processing; when C/I is greater than τ, it cannot.

Because Δ and clk can be determined when the specification is drawn up, the threshold τ can be determined at that time as well. Specifically, for example, when Δ = 2 [microseconds] and clk = 500 [MHz], τ = 1000. Normally, an access to the shared memory 203 costs the CPU 201-1 several tens of counts. When contention occurs due to competing accesses to the shared memory 203, however, an access costs several tens to several hundreds of counts, and the operating efficiency of the CPU 201-1 may fall to as little as 30% of its peak.
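The threshold arithmetic can be checked with a short sketch. Both function names are hypothetical. Expressing Δ in microseconds and clk in MHz is a convenience chosen here so that the product is an exact integer cycle count, and rewriting C/I ≤ τ as C ≤ τ·I to avoid a division is likewise an illustrative choice, not something the description prescribes.

```python
def threshold_tau(delta_us: int, clk_mhz: int) -> int:
    """tau = delta * clk; microseconds times megahertz yields a cycle count."""
    return delta_us * clk_mhz


def realtime_guaranteed(clock_count: int, issued_count: int, tau: int) -> bool:
    """C/I <= tau, evaluated as C <= tau * I to stay in integer arithmetic."""
    return clock_count <= tau * issued_count
```

`threshold_tau(2, 500)` reproduces the value τ = 1000 from the example above, and the conditions stated for FIG. 6 (Δ = 1 millisecond, clk = 600 MHz) correspond to `threshold_tau(1000, 600)`, that is, 600,000.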

FIG. 5 is an explanatory diagram showing a state in which contention has been eliminated. As in FIG. 4, the CPU 201-1 is executing the hypervisor 204-1 and the real-time software 207, and the CPU 201-2 is executing the hypervisor 204-2 and the software 208. In FIG. 4, the CPU 201-1 and the CPU 201-2 accessed the shared memory 203 simultaneously during time 402, and contention due to access conflicts occurred.

In FIG. 5, however, the CPU 201-2 alternates between the software 208 and a dummy thread during time 402, which eliminates the contention due to access conflicts. As a result, the CPU 201-1 can finish processing by the time required of the real-time software 207, and real-time processing can be guaranteed.

FIG. 6 is an explanatory diagram showing the performance ratio of the multi-core processor system 100 to which the present embodiment is applied. The horizontal axis is the number of buffer stages configured on the bus 108, and the vertical axis is the performance ratio relative to the conventional example with one buffer stage. Performance equal to that of the conventional example with one buffer stage is plotted at 1.00 on the vertical axis. Curve 601 plots, for each number of buffer stages, the performance ratio of the multi-core processor system 100 according to the present embodiment relative to the conventional example, with Δ = 1 [millisecond] and clk = 600 [MHz]. Similarly, curve 602 plots, for each number of buffer stages, the performance ratio of the conventional multi-core processor system relative to the same reference.

Taking the conventional example with one buffer stage as the reference, the curves 601 and 602 lie in region 603 where the performance ratio is better than the reference and in region 604 where it is worse. In region 603 the multi-core processor system 100 can guarantee real-time processing; in region 604 it cannot. In a multi-core processor system, increasing the number of bus buffer stages improves the efficiency of bus use but makes real-time processing harder to guarantee.

The curve 602 for the conventional example lies in region 604 when the number of buffer stages is five or more. The conventional multi-core processor system therefore cannot guarantee real-time processing once the number of buffer stages reaches five. The curve 601 for the present embodiment stays in region 603 until the number of buffer stages reaches 13. The multi-core processor system 100 according to the present embodiment can therefore guarantee real-time processing with up to 13 buffer stages.

FIG. 7 is an explanatory diagram showing an example of the contents stored in the priority table 303-1. The priority table 303-1 consists of a process name field and an execution priority field. The priority table 303-2 holds the same kind of data. The process name field describes the specific processing. In practice, a program describing the processing resides in the ROM 102, the RAM 103, or the flash ROM 104, and the CPU 201-1 loads the program and executes it as a thread. The execution priority field holds the priority with which the corresponding process is executed.

For example, the “communication packet reception” process times out unless the packet is processed within a fixed time, so real-time processing must be guaranteed for it. Its execution priority field is therefore “real-time”. The “drawing rendering” process is ordinary processing for which real-time processing need not be guaranteed, so its execution priority field is “normal”. Similarly, real-time processing must be guaranteed for the “UI input” process when the specification fixes the response time to the user. The “dictionary look-ahead search” process does not require real-time processing to be guaranteed.
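A priority table of this shape, together with the first-core check that the priority detection unit 305 performs against it, might be represented as below. This is only a sketch: FIG. 7 itself is not reproduced here, so the dictionary layout, the English process names, and the “normal” entry for the dictionary look-ahead search are assumptions, as is the function name `is_first_core`.

```python
# Hypothetical in-memory form of priority table 303-1:
# process name field -> execution priority field.
PRIORITY_TABLE = {
    "communication packet reception": "real-time",
    "drawing rendering": "normal",
    "UI input": "real-time",                   # when the spec fixes the response time
    "dictionary look-ahead search": "normal",  # assumed; only "not real-time" is stated
}


def is_first_core(assigned_process: str, table=PRIORITY_TABLE) -> bool:
    """A CPU counts as a first core when its assigned process is "real-time"."""
    return table.get(assigned_process) == "real-time"
```

A process that is missing from the table is treated as not real-time in this sketch; how the actual system handles that case is not described.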

As an example of a state in which contention occurs, assume that the multi-core processor system 100 is performing Web browsing. In this state, the CPU 201-1 is executing the communication packet reception process and the CPU 201-2 is executing the drawing rendering process. The drawing rendering process makes many memory accesses and is likely to compete with the communication packet reception process for access to the shared memory 203.

When the present embodiment is applied and contention due to access conflicts occurs as in the state described above, the CPU 201-2 has the OS 205-2 raise the nice value of the drawing rendering process. The OS 205-2 then allocates the drawing rendering process, whose nice value has been raised, to the CPU 201-2 more sparsely. As a result, the multi-core processor system 100 can avoid contention due to access conflicts and guarantee real-time processing of the communication packet reception process.

As another contention state, assume that the multi-core processor system 100 is accepting character input from the user. In this state, the CPU 201-1 is executing the UI input process and the CPU 201-2 is executing the dictionary look-ahead search process. The dictionary look-ahead search process makes many I/O accesses and is likely to compete with the UI input process for access to the shared memory 203.

When the present embodiment is applied and contention due to access conflicts occurs as in the state described above, the CPU 201-2 has the OS 205-2 raise the nice value of the dictionary look-ahead search process. The OS 205-2 then allocates the dictionary look-ahead search process, whose nice value has been raised, to the CPU 201-2 more sparsely. As a result, the multi-core processor system 100 can avoid contention due to access conflicts and guarantee real-time processing of the UI input process.

FIG. 8 is a flowchart of the message transmission process performed by a hypervisor. The message transmission process runs each time the hypervisor is activated. The CPU 201-1 checks whether real-time software is running (step S801). If real-time software is running (step S801: Yes), the CPU 201-1 reads the issued instruction counter I (step S802) and then the clock counter C (step S803). The CPU 201-1 next calculates the C/I value, which serves as the criterion for whether contention is in progress (step S804), and compares the C/I value with the threshold τ (step S805).

When the C/I value is greater than the threshold τ (step S805: Yes), contention is in progress and the CPU 201-1 generates a nice value raise message (step S806). A CPU that receives this message raises the nice value of its currently running software. Because the raised nice value lowers that software's priority, the software is executed more sparsely.

After generating the message, the CPU 201-1 broadcasts it to the other hypervisors (step S807). After the transmission, the CPU 201-1 executes normal hypervisor processing (step S810) and ends the process. When the C/I value is less than or equal to the threshold τ (step S805: No), contention is not in progress, so the CPU 201-1 performs step S810 and ends the process.

When real-time software is not running (step S801: No), the CPU 201-1 checks whether the nice value of the running software is the initial value (step S808). If it is not the initial value (step S808: No), the CPU 201-1 sets the nice value of the running software back to the initial value (step S809) and proceeds to step S810. If the nice value is the initial value (step S808: Yes), the CPU 201-1 proceeds directly to step S810.

A nice value that differs from the initial value indicates that the process the CPU 201-1 had been running was a cause of contention, and in step S809 the CPU 201-1 restores the priority that had been lowered to avoid contention. In many cases, contention can be resolved by suspending the process that causes it for the minimum unit of time at which the OS scheduler can switch processes. If contention is frequently not resolved within that minimum unit of time, the CPU 201-1 may, after step S808: No, calculate the C/I value, compare it with the threshold τ, and execute step S809 only after confirming that the contention has been resolved.
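The flow of FIG. 8 (steps S801 to S810) can be condensed into the following sketch. It is an illustration under stated assumptions rather than the actual hypervisor code: `CpuState`, `INITIAL_NICE`, the `"raise-nice"` message string, and the `outbox` list that stands in for the broadcast over the bus are all hypothetical, and the counters are taken to hold the values for the current interval.

```python
from dataclasses import dataclass, field
from typing import List

INITIAL_NICE = 0  # assumed initial nice value


@dataclass
class CpuState:
    realtime_running: bool
    issued_count: int   # issued instruction counter I for the interval
    clock_count: int    # clock counter C for the interval
    nice: int = INITIAL_NICE
    outbox: List[str] = field(default_factory=list)  # stands in for the bus broadcast


def message_transmission_process(cpu: CpuState, tau: float) -> None:
    """Run once per hypervisor activation."""
    if cpu.realtime_running:                                   # S801
        i = cpu.issued_count                                   # S802
        c = cpu.clock_count                                    # S803
        cpi = c / i if i > 0 else float("inf")                 # S804
        if cpi > tau:                                          # S805
            cpu.outbox.append("raise-nice")                    # S806 and S807
    elif cpu.nice != INITIAL_NICE:                             # S808
        cpu.nice = INITIAL_NICE                                # S809
    # S810: normal hypervisor processing would follow in every branch.
```

The optional recheck mentioned above (confirming with C/I that contention has ended before step S809) is left out to keep the sketch close to the basic flowchart.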

FIG. 9 is a flowchart of the message reception process performed by a hypervisor. The CPU 201-2 receives an inter-hypervisor message (step S901). In the present embodiment, it receives the message transmitted by the CPU 201-1. Next, the CPU 201-2 checks whether it is itself the CPU that broadcast the message (step S902).

If it did broadcast the message (step S902: Yes), the CPU is running real-time processing and is in contention, so it performs no thread control and ends the process. If it did not broadcast the message (step S902: No), it may be a cause of the contention, so the CPU 201-2 executes processing that does not access the shared memory 203.

For example, the CPU 201-2 instructs the OS 205-2 to raise the nice value of the currently running software (step S903). If the OS has no nice value function, the CPU 201-2 may instead instruct the OS 205-2 to start a dummy thread.

In addition, at step S902: No, when real-time software is not running, the CPU 201-2 may calculate the C/I value, compare it with the threshold τ, and perform thread control only if it is in contention. This adds the cost of the C/I comparison, but limits the control to CPUs on which contention is actually occurring.
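The receiving side of FIG. 9 can be sketched in the same style. The function name, its parameters, and the returned tuple are hypothetical. Raising the nice value by exactly 1 per message is an assumption of this sketch, and the optional C/I recheck after step S902: No is omitted.

```python
from typing import Tuple


def message_reception_process(
    sender_id: int,
    my_id: int,
    nice: int,
    os_supports_nice: bool = True,
) -> Tuple[int, bool]:
    """Return (new nice value, whether a dummy thread should be started)."""
    # S901: a broadcast message from sender_id has been received.
    if sender_id == my_id:          # S902: Yes, this CPU sent the message itself
        return nice, False          # no thread control
    if os_supports_nice:
        return nice + 1, False      # S903: raise the nice value (step of 1 assumed)
    return nice, True               # fallback: start a dummy thread instead
```

Returning the decision instead of calling into an OS keeps the sketch self-contained; in the described system the hypervisor would pass the instruction on to the OS 205-2.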

In the present embodiment, the CPU 201-1 checks for contention in the message transmission process after classifying the priority into two levels, real-time or not, but the priority may instead be divided into three or more levels for the contention check.

As an example of the processing in that case, the CPU 201-1 allows the execution priority field of the priority table 303-1 to take three or more values. For example, the execution priority of the “UI input” process is set to “high priority”, which lies between “real-time” and “normal”, and the execution priority of the “dictionary look-ahead search” process is set to “low priority”, which lies below “normal”. In the message transmission process, the check in step S801, “Is real-time software running?”, is replaced with “Is software with a priority other than low priority running?”. In addition, in step S806 the priority of the currently running software is attached to the message.

In the message reception process, a new condition is then added between step S902: No and step S903: “Is the priority of the received message higher than the priority of the currently running software?”. If the answer is Yes, the executing CPU performs step S903; if it is No, the CPU ends the process without performing step S903.

With this processing in place, a CPU executing, for example, the high-priority UI input process described with FIG. 7 takes the Yes branch at step S801 and broadcasts a message to the other cores in step S807. When a CPU executing the drawing rendering process, which has normal priority, receives that message, the answer to “Is the priority of the received message higher than the priority of the currently running software?” is Yes, so the CPU performs step S903 and adjusts the nice value.

If the message is received by a CPU executing the communication packet reception process, which requires real-time processing, the answer to “Is the priority of the received message higher than the priority of the currently running software?” is No, so the nice value is not adjusted. In this way, the CPU performing the message transmission process checks for contention with the priority divided into three or more levels, and the CPU performing the message reception process adds a priority comparison, so that only CPUs running lower-priority processes have their processing made sparse. The multi-core processor system 100 can thereby handle higher-priority processing first.

When step S903 is executed with three or more execution priority levels as described above, the amount by which the nice value is raised may be set based on the priority of the received message and the priority of the currently running software. For example, if the priority of the received message is real-time and the priority of the currently running software is normal, the priorities are two levels apart, so the nice value may be raised by 2. By having the hypervisor 204-2 control the nice value in steps like this, the OS 205-2 executes lower-priority processes more sparsely, and the real-time processing assigned to the CPU 201-1 can be handled first.
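The graded variant can be sketched as a single function that combines the added reception condition with the stepwise increase. The numeric levels in `PRIORITY_LEVELS` and the function name are hypothetical; the ordering (low, normal, high, real-time) follows the four priorities named in the text, and treating the increase as exactly the level difference is an assumption taken from the "two levels apart, raise by 2" example.

```python
# Hypothetical numeric encoding of the four priority names used in the text.
PRIORITY_LEVELS = {"low": 0, "normal": 1, "high": 2, "real-time": 3}


def nice_increment(message_priority: str, local_priority: str) -> int:
    """Amount to raise the local nice value by; 0 means step S903 is skipped."""
    gap = PRIORITY_LEVELS[message_priority] - PRIORITY_LEVELS[local_priority]
    # The message must have strictly higher priority than the local software.
    return gap if gap > 0 else 0
```

A real-time message arriving at a CPU running normal-priority software yields 2, matching the example above, while a high-priority message arriving at a CPU running real-time software yields 0 and leaves the nice value untouched.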

Even when there are only two execution priority levels, processing may be added that raises the nice value by two or more steps. A specific example is the case where the CPU 201-2 has raised the nice value by 1 in the message reception processing and then receives another message before the nice value has been returned to its initial value. This means that the contention persists even though the nice value was raised, so the CPU 201-2 raises the nice value by a further 1, which makes the contention easier to resolve.
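The escalation described above can be illustrated with a small simulation. The class name `NiceThrottle`, its methods, and the clamp at 19 (the upper end of the conventional Unix nice range) are assumptions for the sketch; the embodiment does not specify an upper bound.

```python
class NiceThrottle:
    """Simulates the nice value applied to a throttled core."""

    NICE_MAX = 19  # assumed cap: top of the conventional Unix nice range

    def __init__(self, initial_nice: int = 0) -> None:
        self.initial_nice = initial_nice
        self.nice = initial_nice

    def on_contention_message(self) -> int:
        # Every message that arrives before the reset raises nice by 1,
        # so repeated messages throttle the core progressively harder.
        self.nice = min(self.nice + 1, self.NICE_MAX)
        return self.nice

    def reset(self) -> None:
        # Restore the initial value once the throttling period is over.
        self.nice = self.initial_nice


throttle = NiceThrottle()
throttle.on_contention_message()  # first message: nice becomes 1
throttle.on_contention_message()  # contention persists: nice becomes 2
throttle.reset()                  # back to the initial value
```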

As described above, the multi-core processor system, the thread control method, and the thread control program identify a CPU that is performing real-time processing and is in contention. All CPUs other than the CPU performing real-time processing and the identified CPU are then controlled so as to execute threads that do not access the shared memory. The multi-core processor system can thereby guarantee real-time processing.

The multi-core processor system may also cause all CPUs other than the identified CPU to execute threads that do not access the shared memory. In this case, when the identified CPU issues a control request, it does not search for the CPU it is competing with but sends the request to every CPU other than itself. Because no search processing is performed, the processing can be simplified.

Alternatively, the multi-core processor system may identify, among the plurality of CPUs, the CPU that is causing the contention and control only that CPU so that it executes a thread that does not access the shared memory. The multi-core processor system can thereby apply thread control only to the CPU that caused the contention and let the CPUs that are not causing contention continue their normal processing.
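The two ways of choosing which CPUs receive a control request can be contrasted in a short sketch. The function names and the representation of CPUs as strings are hypothetical; how the set of contending CPUs is obtained is outside the scope of this example.

```python
from typing import Iterable, List, Set


def broadcast_targets(all_cpus: Iterable[str], sender: str) -> List[str]:
    """Every CPU except the sender: no search for the competitor is needed."""
    return [cpu for cpu in all_cpus if cpu != sender]


def contending_targets(all_cpus: Iterable[str], sender: str,
                       in_contention: Set[str]) -> List[str]:
    """Only CPUs known to be in contention; the others keep running normally."""
    return [cpu for cpu in all_cpus if cpu != sender and cpu in in_contention]


cpus = ["cpu0", "cpu1", "cpu2", "cpu3"]
print(broadcast_targets(cpus, "cpu0"))             # ['cpu1', 'cpu2', 'cpu3']
print(contending_targets(cpus, "cpu0", {"cpu2"}))  # ['cpu2']
```

The broadcast variant trades some unnecessary throttling for simpler request handling, while the targeted variant requires the contending CPU to be known in advance.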

Within the period in which the contention occurs, the multi-core processor system may also allocate, by time division, the CPU under thread control between the thread that had been assigned to that CPU and a thread that does not access the memory. The multi-core processor system can thereby resolve the contention while still processing the thread that had been assigned to the controlled CPU.
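One way to picture this time division is as a list of alternating slices within the contention period. The sketch below is an assumption-laden illustration: the function name, the fixed slice length, the strict alternation, and the choice to start with the non-memory-accessing ("dummy") thread are not taken from the embodiment.

```python
from typing import List, Tuple


def time_division_schedule(period_ms: int, slice_ms: int) -> List[Tuple[int, int, str]]:
    """Split a contention period into alternating slices.

    Returns (start_ms, end_ms, thread) tuples in which "dummy" denotes a
    thread that does not access memory and "original" denotes the thread
    that had been assigned to the controlled CPU.
    """
    if slice_ms <= 0:
        raise ValueError("slice_ms must be positive")
    schedule = []
    start = 0
    use_dummy = True  # assumed: throttle first, then resume the original thread
    while start < period_ms:
        end = min(start + slice_ms, period_ms)
        schedule.append((start, end, "dummy" if use_dummy else "original"))
        use_dummy = not use_dummy
        start = end
    return schedule


print(time_division_schedule(10, 4))
# [(0, 4, 'dummy'), (4, 8, 'original'), (8, 10, 'dummy')]
```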

The multi-core processor system may also calculate, for each CPU, an issued instruction efficiency based on the number of instructions issued by the CPU and the number of cycles of the CPU, and detect contention based on the calculated issued instruction efficiency and a predetermined threshold τ. The multi-core processor system can thereby detect contention caused by competing accesses and guarantee real-time processing.
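A minimal sketch of this detection follows. It uses the definition given in the claims, in which the issued instruction efficiency is the number of cycles divided by the number of issued instructions (cycles per instruction). The direction of the comparison (contention when the efficiency exceeds τ), the handling of a zero instruction count, the function names, and the sample counter values are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def issued_instruction_efficiency(cycles: int, issued_instructions: int) -> float:
    """Cycles taken per issued instruction (higher means more stalling)."""
    if issued_instructions <= 0:
        # Assumption: a core that issued nothing is treated as fully stalled.
        return float("inf")
    return cycles / issued_instructions


def detect_contention(counters: Dict[str, Tuple[int, int]], tau: float) -> List[str]:
    """Return the CPUs whose cycles-per-instruction exceeds the threshold tau.

    `counters` maps a CPU name to (cycles, issued_instructions) sampled
    over the same interval.
    """
    return [cpu for cpu, (cycles, issued) in counters.items()
            if issued_instruction_efficiency(cycles, issued) > tau]


samples = {"cpu0": (1000, 800), "cpu1": (1000, 100)}  # hypothetical counter values
print(detect_contention(samples, tau=4.0))  # ['cpu1']
```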

The multi-core processor system may also detect the core whose assigned thread has the highest execution priority. By determining in advance which thread requires a real-time guarantee, the multi-core processor system can resolve contention and guarantee real-time processing regardless of which CPU that thread is assigned to.

The thread control method described in the present embodiment can be realized by executing a program prepared in advance on a computer. The thread control program is recorded on a computer-readable recording medium such as a hard disk, a flexible disk, a CD-ROM, an MO, or a DVD, and is executed by being read from the recording medium by the computer. The thread control program may also be distributed via a network such as the Internet.

201-1 CPU
201-2 CPU
301-1 Area
301-2 Area
302-1 Area
302-2 Area
303-1 Priority table
303-2 Priority table
304-1 OS scheduler monitoring unit
304-2 OS scheduler monitoring unit
305 Priority detection unit
306 Issued instruction efficiency calculation unit
307 Contention detection unit
308 Identification unit
309 Inter-hypervisor message transmission unit
310 Inter-hypervisor message reception unit
311 Control unit
312 Software
313 Software
314 Nice value setting unit
315 Dummy thread activation unit

Claims

A multi-core processor system including a plurality of cores and a memory accessible from the plurality of cores, the multi-core processor system comprising:
detecting means for detecting, from among the plurality of cores, a first core having the highest execution priority;
calculating means for calculating, for each core, an issued instruction efficiency representing the number of cycles taken per instruction, by dividing the number of cycles of the core by the number of instructions issued by the core;
contention detection means for detecting, for each core, access contention for the memory based on a result of comparing the issued instruction efficiency calculated by the calculating means with a predetermined threshold;
specifying means for specifying, from among the first cores detected by the detecting means, a second core for which the access contention has been detected by the contention detection means; and
control means for controlling a third core, among the plurality of cores, other than the first core and the second core specified by the specifying means, so that the third core executes a thread that does not access the memory for a predetermined period within the period in which the access contention occurs.

The multi-core processor system according to claim 1, wherein
the control means controls a third core, among the plurality of cores, other than the second core specified by the specifying means, so that the third core executes a thread that does not access the memory for a predetermined period within the period in which the access contention occurs.

The multi-core processor system according to claim 1, wherein
the specifying means specifies, from among the plurality of cores, a third core that causes access contention for the memory and competes with the second core, and
the control means controls the third core specified by the specifying means so that the third core executes a thread that does not access the memory for a predetermined period within the period in which the access contention occurs.

The multi-core processor system according to any one of claims 1 to 3, wherein the predetermined period is a period obtained by time-dividing the period in which the access contention occurs between the thread assigned to the third core and the thread that does not access the memory.

The multi-core processor system according to any one of claims 1 to 4, wherein
the detecting means detects, from among the plurality of cores, a first core whose assigned thread has the highest execution priority.

The multi-core processor system according to claim 1, wherein
the contention detection means detects, for each core, access contention for the memory based on a result of comparing the issued instruction efficiency calculated by the calculating means with a predetermined threshold obtained as the product of a time interval within which real-time processing can be guaranteed and the number of clock cycles per unit time of the core.

A thread control method in which a core of a multi-core processor system including a plurality of cores, a memory accessible from the plurality of cores, detecting means, calculating means, contention detection means, specifying means, and control means executes:
a detecting step of detecting, by the detecting means, a first core having the highest execution priority from among the plurality of cores;
a calculating step of calculating, for each core, an issued instruction efficiency representing the number of cycles taken per instruction, by dividing the number of cycles of the core by the number of instructions issued by the core;
a contention detection step of detecting, by the contention detection means, for each core, access contention for the memory based on a result of comparing the issued instruction efficiency calculated in the calculating step with a predetermined threshold;
a specifying step of specifying, by the specifying means, from among the first cores detected in the detecting step, a second core for which the access contention has been detected in the contention detection step; and
a control instruction step of instructing, by the control means, control such that a third core, among the plurality of cores, other than the first core and the second core specified in the specifying step, executes a thread that does not access the memory for a predetermined period within the period in which the access contention occurs.

A thread control program that causes a core of a multi-core processor system including a plurality of cores and a memory accessible from the plurality of cores to function as:
detecting means for detecting, from among the plurality of cores, a first core having the highest execution priority;
calculating means for calculating, for each core, an issued instruction efficiency representing the number of cycles taken per instruction, by dividing the number of cycles of the core by the number of instructions issued by the core;
contention detection means for detecting, for each core, access contention for the memory based on a result of comparing the issued instruction efficiency calculated by the calculating means with a predetermined threshold;
specifying means for specifying, from among the first cores detected by the detecting means, a second core for which the access contention has been detected by the contention detection means; and
control instruction means for instructing control such that a third core, among the plurality of cores, other than the first core and the second core specified by the specifying means, executes a thread that does not access the memory for a predetermined period within the period in which the access contention occurs.