JP4523921B2

JP4523921B2 - Computer resource dynamic controller

Info

Publication number: JP4523921B2
Application number: JP2006047704A
Authority: JP
Inventors: 和宏村山; 治之大谷
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2006-02-24
Filing date: 2006-02-24
Publication date: 2010-08-11
Anticipated expiration: 2026-02-24
Also published as: JP2007226587A

Description

The present invention relates to an apparatus and a system that prevent frequent deadline misses caused by processing not finishing in time, in a distributed system in which a plurality of computers form a network and cooperatively process a plurality of processes in parallel.

As Conventional Example 1, the conventional computer resource dynamic control method of Patent Document 1 includes means for predicting the next processing time of a process in each repetition cycle from the processing time, the content and size of the data to be processed, and the previous processing time. It holds task execution order plans defined for each range of predicted processing time and, based on the predicted next processing time, determines the processing order for the next cycle according to the predetermined execution order plan, thereby preventing deadline misses.

As Conventional Example 2, Patent Document 2 describes a method in which the amount of processing is divided into units (for example, 8 units for one process and 5 units for another) and units of work are assigned to each processor on the basis of these subdivided units. It therefore has process management means for managing the number of units of each process, processor management means for managing the processors that execute them, and allocated resource amount determination means.
Patent Document 1: JP-A-4-171538
Patent Document 2: JP-A-6-28323

In Conventional Example 1, processes cannot be migrated (moved to another computer). Even if another computer has free resources, the only options are to change or degrade the processing content, so the other computer resources in the system cannot be used efficiently, or deadline misses continue on the computer (processor) assigned to the overloaded process.
Conventional Example 1 also predicts the next processing time mainly from the size and values of the data, but processing time is not determined by data size. Moreover, each process receives a variety of data whose content is often unknown until just before the process receives it, so it cannot be predicted in advance. For these reasons, the method of Conventional Example 1 cannot predict the processing time of the next cycle.
Conventional Example 2 likewise only divides the amount of processing into units by size and manages those units. It does not assign computers according to the content or nature of a process, nor does it vary the amount of processing according to the state of each computer, so it too cannot prevent deadline misses from continuing to occur.

The present invention has been made to solve the above problems. Its object is to prevent deadline misses from continuing by migrating a process to a computer with sufficient free memory and CPU time when a deadline miss occurs in any of the processes constituting the system.
A further object is to predict the future processing time of a process from its past processing times and, when a future deadline miss is predicted, to migrate the process so that the deadline miss is prevented before it occurs.

The computer resource dynamic control device according to the present invention is connected to a system in which a plurality of computers that execute target processes are connected and cooperatively process a plurality of those processes, and comprises:
a computer load amount collection unit that monitors the resource usage of each computer, collects usage information on those resources, and detects resource shortages;
a processing time collection unit that monitors and collects the processing time of each process and detects deadline misses, that is, processing that exceeds the deadline time defining the processing time limit of the process;
a system configuration information table memory that stores, together with a process identifier, the processing time of each process handled by the system, the amount of computer resources the process uses, and its deadline time, and that stores, together with a computer identifier, the amount of resources owned by each computer connected to the system;
a migration process determination unit that, on being notified that a deadline miss has been detected, refers to the system configuration information table memory and recommends reassigning the process or a related process to a computer other than the current one; and
a resource allocation unit that, on receiving the reassignment recommendation, refers to the processing time of the process and the resources owned by the computers in the system configuration information table memory, extracts the process recommended for reassignment and a candidate computer such that the processing time of the process falls within the deadline time, has the extracted candidate computer execute the extracted process, and at the same time updates the system configuration information table memory.

According to the present invention, deadline misses of processes are detected, and the computer resources required by each process and the resources held by each computer are managed. A process whose deadline misses reach or exceed a specified count is reassigned to another computer that is lightly loaded or has spare resources, which prevents deadline misses from continuing.
In addition, when the cause of the deadline misses is a computer failure, all processes that were running on the failed computer can be reassigned to other computers so that processing continues, yielding a highly available system.

Embodiment 1.
FIG. 1 shows the configuration of the computer resource dynamic control system according to the present embodiment. In this system, a plurality of computers 100 are connected by a network 102 in a distributed computing environment. The components the system requires are shared among the computers, and the system as a whole prevents frequent deadline misses caused by processing not finishing in time. In the figure, each computer 100 executes processes 101 either cooperatively or independently, and executes new processes according to resource allocation.
The hardware configuration of a computer is described first. FIG. 2 shows the configuration when a computer 100 serves as the computer resource dynamic control device 100a. In these figures, the computer 100 or computer resource dynamic control device 100a consists of a processor (CPU) 11, a memory 12 that stores programs and processes 101, a processing time monitoring unit 1, a processing time collection unit 2, a migration process determination unit 3, a computer load monitoring unit 4, a computer load amount collection unit 5, a resource allocation unit 6, a system configuration information management table memory 7, a processing time history table memory 8, a computer load history table memory 9, a computer state table memory 10, an input device 13, an output device 14, a communication interface 15, and the processes 101 running on each computer.
FIG. 1 is meant to show the overall configuration, so the detailed connections within each computer are omitted, but each computer 100 has the same configuration as shown in detail in FIG. 2. For example, for the rightmost computer only the processing time monitoring unit 1, the computer load monitoring unit 4, the resource allocation unit 6, and the process 101 written in the memory 12 (not shown) are drawn as boxes, but these are connected to the processor 11 and memory 12 of FIG. 2 via an internal bus 16. Components such as the resource allocation unit 6 obtain their functions by loading a program with the functions described below into memory and having the processor 11 read and execute it.
The process 101 is read by a CD reader or other recording medium reader, which is a kind of input device 13, or is received from another computer via a communication line or the like through the communication interface 15. As shown in FIG. 2, it is actually loaded into and stored in the memory 12. The process 101 handled in this embodiment may be an application, such as a radar application, whose processing load varies greatly with the content of the data rather than its size.

The function of each component of the computer resource dynamic control device 100a or the computer resource dynamic control system in the present embodiment is described below.
FIG. 3 illustrates the function of the processing time monitoring unit 1, which monitors the data processing time of processes. When each processor 11 performing distributed processing executes its assigned process 101, a step is provided that reports to the processing time monitoring unit 1 the time at which processing started after the data was received (hereinafter "processing start time") and the time at which processing completed (hereinafter "processing end time"), together with a step that reports the time the CPU (processor) was actually used for computation between the processing start time and the processing end time (hereinafter "CPU time"). The steps involved are explained in the description of operation below. The processing time monitoring unit 1 also notifies the processing time collection unit 2 of the processing start time, processing end time, and CPU time as needed.
One processing time monitoring unit 1 may be provided per computer 100, per process 101, or for the entire system. FIG. 1 shows, as one example, the configuration with one unit per computer 100.

FIGS. 4, 5, and 6 illustrate the main functions of the processing time collection unit 2 in the present embodiment. As shown in FIG. 4(a), the processing time collection unit 2 receives the processing start time, processing end time, and CPU time from the processing time monitoring unit 1, separates these data by process, and keeps a fixed number of the most recent entries in the processing time history table memory 8. FIG. 4(b) shows an example of the data held in the processing time history table memory 8.
As shown in FIG. 5(a), it also checks how many deadline misses appear in the retained history and, if the number has reached or exceeded the allowable count, notifies the migration process determination unit 3 that the specified number of deadline misses has occurred.
Further, as shown in FIG. 6, when the migration process determination unit 3 requests the history of processing start times, processing end times, and CPU times of a given process, it sends the data for that process held in the processing time history table memory 8. The values the processing time collection unit 2 needs in order to perform these functions, namely the number of past entries to keep in the processing time history table memory 8, the deadline time that defines the processing time limit of each process, and the allowable number of deadline misses (processing that exceeds the deadline time), are recorded in the system configuration information management table memory 7.
One processing time collection unit 2 may be provided per process, per computer, or for the entire system.

FIG. 7 illustrates the function of the migration process determination unit 3 in the present embodiment. When it receives notification from the processing time collection unit 2 that deadline misses have occurred the specified number of times or more, the migration process determination unit 3 queries the processing time collection unit 2, the computer load amount collection unit 5, and other units, and investigates variations in the processing times of other processes, the state of each computer (normal or abnormal), and the utilization of hardware such as CPU and memory. In this way it identifies the processes whose hardware resources should be reallocated in order to eliminate the deadline misses.
Only one migration process determination unit 3 is needed in the system.

FIG. 8 illustrates the function of the computer load monitoring unit 4 in the present embodiment. One computer load monitoring unit 4 is provided on each computer 100. It periodically acquires the memory usage and CPU usage rate of its computer 100 and reports them to the computer load amount collection unit 5. These periodic reports also serve to tell the computer load amount collection unit 5 that the computer 100 is operating normally.

FIGS. 9, 10, 11, and 12 illustrate the main functions of the computer load amount collection unit 5 in the present embodiment. As shown in FIG. 9, the computer load amount collection unit 5 receives notifications from the computer load monitoring unit 4 and stores the free memory, the utilization of the CPU 11, and the time at which the information was received, classified by computer, in the computer load history table memory 9.
When the migration process determination unit 3 or the resource allocation unit 6 requests information on the utilization of the CPU 11, the amount of free memory, or whether a computer is normal or abnormal, it retrieves the corresponding data from the computer load history table memory 9 and sends it, as shown in FIG. 10.
When data stops arriving from a computer load monitoring unit 4, it rewrites the entry for that computer in the computer state table memory 10 to "abnormal", as shown in FIG. 11(b), and at the same time notifies the migration process determination unit 3 that a failure has occurred.
It also refers to the data in the computer load history table memory 9 and notifies the migration process determination unit 3 when a computer is running low on free memory or free CPU resources. As shown in FIG. 12, when the migration process determination unit 3 or the resource allocation unit 6 requests a list of normal computers 100, it refers to the computer state table memory 10 and returns a list of the identifiers of the computers 100 that are operating normally.

FIG. 13 illustrates the function of the resource allocation unit 6 in the present embodiment. The resource allocation unit 6 assigns a process that the migration process determination unit 3 has selected for migration to a different computer 100. When assigning a different computer to the process, the resource allocation unit 6 queries the computer load amount collection unit 5 about the load on the CPU 11 and the usage of the memory 12 of each computer 100, and obtains from the system configuration information management table memory 7 the CPU time and memory the process 101 requires. It then starts the process 101 on a computer 100 that is unlikely to run short of memory or cause deadline misses. After the process 101 has started, the process that was running on the original computer is stopped.
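The candidate selection performed by the resource allocation unit 6 can be illustrated with a minimal Python sketch. The names `ProcessRequirement`, `HostLoad`, and `choose_host` are hypothetical. The specific criteria used here (estimating the required CPU share as average CPU time divided by deadline time, and preferring the least loaded candidate) are assumptions made for illustration, since the text only states that a computer unlikely to cause memory shortage or deadline misses is chosen.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProcessRequirement:
    name: str
    average_cpu_time: float  # CPU seconds needed per processing cycle
    deadline: float          # deadline time in seconds
    memory: int              # memory consumed by the process
    host: str                # computer currently running the process


@dataclass
class HostLoad:
    name: str
    cpu_usage: float         # CPU usage rate in percent
    free_memory: int
    normal: bool = True      # state from the computer state table


def choose_host(proc: ProcessRequirement,
                hosts: List[HostLoad]) -> Optional[HostLoad]:
    """Pick a migration target, or None if no computer qualifies."""
    # Assumption: the CPU share (percent) needed to finish within the deadline.
    required_cpu = 100.0 * proc.average_cpu_time / proc.deadline
    candidates = [
        h for h in hosts
        if h.normal
        and h.name != proc.host                      # must be a different computer
        and h.free_memory >= proc.memory             # avoid memory shortage
        and (100.0 - h.cpu_usage) >= required_cpu    # avoid deadline misses
    ]
    if not candidates:
        return None
    # Assumption: prefer the computer with the lowest CPU usage rate.
    return min(candidates, key=lambda h: h.cpu_usage)
```

Starting the process on the chosen computer and stopping it on the original one are outside the scope of this sketch.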

FIG. 14 shows an example of the data structure of the system configuration information management table memory 7 in the present embodiment. As software information, the table holds the following for each process 101: the "process name"; the "number of history entries stored in the processing time history table", that is, how many processing start time, processing end time, and CPU time records are kept in order to count recent deadline misses; the "average CPU time" of the process; the "deadline time"; the "allowable number of deadline misses in the history"; the "computer in operation"; the "process identifier"; and computer resource usage such as "memory consumption". As hardware information, it records the following for each computer: the "host name" identifying the computer 100 in the system, owned resources such as the "installed memory", and the "number of history entries held in the computer load history table".
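As an illustration only, the fields listed for FIG. 14 could be modeled as follows in Python. The class and attribute names, the units, and the `reassign` helper are hypothetical; the patent lists the fields but does not prescribe a representation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ProcessEntry:
    """Software information held per process."""
    process_name: str
    process_id: int
    history_length: int           # entries kept in the processing time history table
    average_cpu_time: float       # seconds (unit is an assumption)
    deadline_time: float          # seconds (unit is an assumption)
    allowed_deadline_misses: int  # allowable misses within the history
    running_host: str             # "computer in operation"
    memory_consumption: int       # bytes (unit is an assumption)


@dataclass
class HostEntry:
    """Hardware information held per computer."""
    host_name: str
    installed_memory: int         # bytes (unit is an assumption)
    load_history_length: int      # entries kept in the computer load history table


@dataclass
class SystemConfiguration:
    processes: Dict[str, ProcessEntry] = field(default_factory=dict)
    hosts: Dict[str, HostEntry] = field(default_factory=dict)

    def reassign(self, process_name: str, new_host: str) -> None:
        """Record that a process now runs on a different known computer."""
        if new_host not in self.hosts:
            raise KeyError(f"unknown host: {new_host}")
        self.processes[process_name].running_host = new_host
```

Keeping the per-process and per-computer parts in separate mappings mirrors the split between software information and hardware information in the table.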

FIG. 15 shows an example of the data structure of the processing time history table memory 8 in the present embodiment. One table exists per process, and each entry stores a "processing start time", "processing end time", and "CPU time". When the processing time monitoring unit 1 sends these values to the processing time collection unit 2, the processing time collection unit 2 writes them into the table. The number of entries is determined by the system configuration information management table. Once every entry has been written, data is deleted starting from the oldest. When deadline misses reach or exceed the specified count, all the data in the table for the reallocated process is deleted after the resource allocation completes.

FIG. 16 shows an example of the data structure of the computer load history table memory 9 in the present embodiment. One table exists per computer, and each entry stores a "data acquisition time", "CPU usage rate", and "free memory amount". When the computer load monitoring unit 4 sends these values to the computer load amount collection unit 5, the computer load amount collection unit 5 writes them into the table. The number of entries is determined by the system configuration information management table. Once every entry has been written, data is deleted starting from the oldest.
FIG. 17 shows an example of the data structure of the computer state table memory 10 in the present embodiment. The table has one entry per computer that stores "normal" or "abnormal". If the computer load amount collection unit 5 does not receive the "data acquisition time", "CPU usage rate", and "free memory amount" data from a computer load monitoring unit 4 within a fixed time, it judges that an abnormality has occurred in that computer and marks the corresponding entry "abnormal".
As shown in FIG. 2, the computer resource dynamic control device 100a may contain all the necessary components. Alternatively, as in the example of FIG. 1, several computers may each hold some of the necessary components, such as the processing time collection and prediction unit 2, so that the group of computers as a whole constitutes the computer resource dynamic control system.

Next, the operation of each component of the computer resource dynamic control device or computer resource dynamic control system in the present embodiment is described.
FIG. 18 shows an example of the operation flow of the processing time monitoring unit 1. The description also refers to the functional diagram of FIG. 3. In step S1 (the word "step" is omitted below), when the processor 11 starts processing a process 101, the processing time monitoring unit 1 acquires the processing start time. In S2, when the processor 11 finishes that processing, the unit acquires the processing end time and the CPU time consumed between S1 and S2. In S3, it notifies the processing time collection unit 2 of the processing start time, processing end time, and CPU time. This is the concrete operation corresponding to the function of the processing time monitoring unit 1 shown in FIG. 3.
The processing time monitoring unit 1 repeats S1 to S3 for as long as the processor 11 executes the process 101. If the operating system is UNIX (registered trademark), for example, the CPU time of each process can be obtained with the times() system call, the getrusage() system call, the ps command, or similar.
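A minimal sketch of one pass through S1 to S3, written in Python, may make the measurement concrete. It uses `os.times()`, the Python standard library wrapper around the times() call mentioned above. The names `ProcessingRecord`, `monitor_once`, `work`, and `report` are hypothetical, and measuring the calling process itself (rather than a separately monitored process) is a simplifying assumption.

```python
import os
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProcessingRecord:
    start_time: float  # processing start time (wall clock, seconds)
    end_time: float    # processing end time (wall clock, seconds)
    cpu_time: float    # CPU seconds consumed between start and end


def monitor_once(work: Callable[[], None],
                 report: Callable[[ProcessingRecord], None]) -> ProcessingRecord:
    """Run one processing cycle and report its timing (S1 to S3)."""
    start_time = time.time()   # S1: acquire the processing start time
    cpu_before = os.times()
    work()                     # the processing performed for the process
    cpu_after = os.times()
    end_time = time.time()     # S2: acquire the processing end time
    # S2: CPU time is user plus system time consumed during the cycle.
    cpu_time = ((cpu_after.user - cpu_before.user)
                + (cpu_after.system - cpu_before.system))
    record = ProcessingRecord(start_time, end_time, cpu_time)
    report(record)             # S3: notify the processing time collection unit
    return record
```

Calling `monitor_once` in a loop for as long as the process runs corresponds to repeating S1 to S3.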

FIG. 19 shows an example of the operation flow of the processing time collection unit 2. The description also refers to the functional diagrams of FIGS. 4 to 6. In S11, the unit receives from the processing time monitoring unit 1 the processing start time, processing end time, and CPU time for one execution of a process, and in S12 it writes them into the current entry of the processing time history table memory 8 for that process. In S13, for each process, it determines from the processing start and end times written in S12 whether a deadline miss has occurred. This is done by subtracting the processing start time from the processing end time to obtain the time taken, and comparing that time with the deadline time of the process recorded in the system configuration information management table memory 7. If a deadline miss has occurred, then in S14 the unit examines all the data held in the processing time history table memory 8 to see whether deadline misses have occurred the specified number of times or more. The specified allowable number of deadline misses is held in the system configuration information management table memory 7. If the specified number has been reached or exceeded, then in S15 the unit notifies the migration process determination unit 3 and deletes all data written in the processing time history table memory 8. If no deadline miss has occurred in S13, the flow proceeds to S16, where the position of the next entry to be written in the processing time history table memory 8 is advanced by one. Once the processing time history table memory 8 has no free entries left, S12 overwrites the data starting from the oldest.
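The S11 to S16 flow can be sketched in Python as follows. The class name `ProcessingTimeCollector` and the `notify` callback (standing in for the notification to the migration process determination unit 3) are hypothetical. Using a bounded `collections.deque` for the history table is an implementation choice made here, not something the text specifies.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Entry = Tuple[float, float, float]  # (start time, end time, CPU time)


class ProcessingTimeCollector:
    """Processing time history of one process with deadline-miss counting."""

    def __init__(self, history_size: int, deadline: float,
                 allowed_misses: int, notify: Callable[[int], None]) -> None:
        # A deque with maxlen discards the oldest entry once the table is full.
        self.history: Deque[Entry] = deque(maxlen=history_size)
        self.deadline = deadline
        self.allowed_misses = allowed_misses
        self.notify = notify

    def _missed(self, entry: Entry) -> bool:
        start, end, _cpu = entry
        # A miss is an elapsed time (end minus start) beyond the deadline time.
        return (end - start) > self.deadline

    def record(self, start: float, end: float, cpu_time: float) -> bool:
        """Store one measurement; return True if a notification was sent."""
        entry = (start, end, cpu_time)
        self.history.append(entry)                  # S11, S12 (and S16 implicitly)
        if not self._missed(entry):                 # S13
            return False
        misses = sum(1 for e in self.history if self._missed(e))  # S14
        if misses >= self.allowed_misses:
            self.notify(misses)                     # S15: notify, then clear
            self.history.clear()
            return True
        return False
```

For example, with `history_size=5`, `deadline=1.0`, and `allowed_misses=2`, the second measurement that takes longer than one second within the retained history triggers the notification and empties the history.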

FIG. 20 shows an example of the operation flow of the computer load monitoring unit 4 in the present embodiment. The description also refers to the functional diagram of FIG. 8. In S1001, the computer load monitoring unit 4 acquires the time at which processing by the processor 11 starts. In S1002, it acquires the computer load and memory usage. In S1003, it notifies the computer load amount collection unit 5 of the acquired current time, the CPU usage rate, and the memory usage. If the operating system is UNIX (registered trademark), for example, the CPU usage rate can be obtained with the mpstat or top command and the memory usage with the vmstat command.

図２１は、本実施の形態における計算機負荷量収集部５の動作フローの一例を示す図である。図９ないし図１２の機能説明図も参照しながら動作を説明する。Ｓ１０１１で一定時間待ち、Ｓ１０１２で計算機負荷監視部４からデータが届いたらそれを受け取る。そしてＳ１０１３で、計算機状態テーブル・メモリ１０中の、データを送信した計算機の欄を「正常」にする。そしてＳ１０１４で、計算機負荷履歴テーブル・メモリ９の該当するエントリにデータを書き込む。計算機負荷履歴テーブルは、図９（ｂ）に示すように計算機毎にエントリが分かれているので、通知を送信した計算機のエントリに書き込むことになる。Ｓ１０１５で、書き込んだデータがリソース不足を示しているかどうかを調べる。リソース不足とは、例えばＣＰＵ使用率が１００％になったか、使用メモリ量が搭載メモリ量を超えた場合などが考えられる。また、ＣＰＵ使用率、使用メモリ量の上限をユーザによって設定しておき、それを超えたらリソース不足としてもよい。リソース不足が発生していたら、Ｓ１０１６にて、移行プロセス決定部３にリソース不足が発生した旨と、その計算機名を通知する。Ｓ１０１７にて、次にデータを書き込むために、計算機負荷履歴テーブル・メモリ９のエントリを１つ先に進める。計算機負荷履歴テーブル・メモリ９の空きエントリがなくなった場合には、それ以降、最も古いデータから順に上書きしていく。
なお、Ｓ１０１１にて一定時間待っても、計算機負荷監視部４からデータが届かなかった場合には、Ｓ１０１８にて計算機状態テーブル・メモリ１０の、データを送信してこなかった計算機負荷監視部４が監視する計算機の欄を「異常」に変更し、Ｓ１０１９にて移行プロセス決定部３に通知する。Ｓ１０１１〜Ｓ１０１９の手順は１つの計算機１００に対しての計算機負荷量収集の処理を示したものであり、これを全ての計算機１００に対して行う。 FIG. 21 is a diagram illustrating an example of an operation flow of the computer load amount collection unit 5 in the present embodiment. The operation will be described with reference to the function explanatory diagrams of FIGS. In step S1011, the process waits for a certain period of time. In S1013, the column of the computer that transmitted the data in the computer state table memory 10 is set to “normal”. In step S1014, data is written to the corresponding entry in the computer load history table memory 9. In the computer load history table, as shown in FIG. 9 (b), entries are divided for each computer, so the entry is written in the entry of the computer that sent the notification. In S1015, it is checked whether the written data indicates a resource shortage. The resource shortage may be, for example, a case where the CPU usage rate reaches 100% or the amount of used memory exceeds the amount of installed memory. Further, the upper limit of the CPU usage rate and the used memory amount may be set by the user, and if the upper limit is exceeded, the resource may be insufficient. If a resource shortage has occurred, in S1016, the migration process determination unit 3 is notified of the shortage of resources and the computer name. In S1017, in order to write data next, the entry in the computer load history table / memory 9 is advanced by one. When there are no more empty entries in the computer load history table memory 9, the oldest data is overwritten in order thereafter.
If no data arrives from the computer load monitoring unit 4 even after waiting for the fixed time in S1011, then in S1018 the column of the computer state table memory 10 for the computer monitored by that silent computer load monitoring unit 4 is changed to “abnormal”, and in S1019 the migration process determination unit 3 is notified. Steps S1011 to S1019 show the load collection processing for one computer 100; the same processing is performed for every computer 100.
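One pass of S1011 to S1019 for a single computer might look like the sketch below. The queue used as the channel from the computer load monitoring unit 4, the tuple layout of a report, the function name, and the default thresholds are all assumptions made for illustration.

```python
import queue
from collections import deque


def collect_once(inbox, timeout_s, state, history, computer, notify,
                 cpu_limit=100.0, mem_installed_mb=None):
    """One pass of S1011-S1019 for one computer (names are hypothetical)."""
    try:
        # S1011/S1012: wait a fixed time for a report from the monitoring unit
        time_ms, cpu_pct, mem_mb = inbox.get(timeout=timeout_s)
    except queue.Empty:
        state[computer] = "abnormal"           # S1018
        notify(("failure", computer))          # S1019
        return
    state[computer] = "normal"                 # S1013
    # S1014/S1017: a bounded deque overwrites the oldest entry when full
    history[computer].append((time_ms, cpu_pct, mem_mb))
    # S1015: shortage test; the thresholds here are assumed example values
    short = cpu_pct >= cpu_limit or (
        mem_installed_mb is not None and mem_mb > mem_installed_mb)
    if short:
        notify(("shortage", computer))         # S1016
```

Calling this function in a loop over every computer 100 corresponds to the statement that the procedure is performed for all computers.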

図２２は、デッドラインミスが発生したことが処理時間収集部２より通知された場合における移行プロセス決定部３の動作フローの一例を示した図である。図において、Ｓ１０１で図５に示すように、処理時間収集部２よりデッドラインミスが規定回数以上発生した旨を伝える通知が届く。通知とともに、例えばデッドラインミスが発生したプロセス名も届く。Ｓ１０１でデッドラインミスが発生したプロセスの名前が届くと、Ｓ１０２で、システム構成情報管理テーブル・メモリ７よりデッドラインミスが発生したプロセスを処理している計算機を識別する。そしてＳ１０３で、その計算機上で動作している他のプロセスを識別してその名を取得する。Ｓ１０４で、計算機負荷量収集部５に対し、その計算機が正常に動作しているかどうかを問い合わせる。そしてＳ１０５にて、Ｓ１０３で識別した計算機が正常動作していなければ、計算機が停止したことがデッドラインミス発生の原因であることがわかる。その場合は、Ｓ１０３で検出した全てのプロセス１０１を他の計算機に移行させるために、Ｓ１０６にてこれらのプロセス名をリソース再割付の対象として資源割当部６に通知する。 FIG. 22 is a diagram illustrating an example of an operation flow of the migration process determination unit 3 when the processing time collection unit 2 notifies it that a deadline miss has occurred. In the figure, in S101, as shown in FIG. 5, a notification arrives from the processing time collection unit 2 reporting that deadline misses have occurred at least the specified number of times. The notification also carries, for example, the name of the process that missed its deadline. When that process name arrives in S101, S102 uses the system configuration information management table memory 7 to identify the computer processing that process. In S103, the other processes running on that computer are identified and their names are acquired. In S104, the computer load amount collection unit 5 is asked whether that computer is operating normally. In S105, if the computer identified in S103 is not operating normally, the cause of the deadline miss is understood to be that the computer has stopped. In that case, so that all the processes 101 detected in S103 can be migrated to other computers, S106 notifies the resource allocation unit 6 of these process names as targets of resource reallocation.

なおＳ１０５で計算機が正常に動作していた場合には、Ｓ１０７において、Ｓ１０３で見つけたプロセスのうち、ＣＰＵ時間が増えているプロセスを調査する。ＣＰＵ時間が増えているかどうかは、システム構成情報管理テーブルが各プロセスの平均ＣＰＵ時間を保持しているので、システム構成情報管理テーブルを見て平均ＣＰＵ時間を取得し、現在のＣＰＵ時間と比較することで判る。そして、デッドラインミスが発生したプロセスと、Ｓ１０７で見つけ出したプロセスをリソース再割当の候補として、これらのうちのどれかに対してリソース割当を行うように資源割当部６に通知する。 If the computer is operating normally in S105, then S107 examines which of the processes found in S103 have an increasing CPU time. Because the system configuration information management table holds the average CPU time of each process, an increase can be detected by reading the average CPU time from the table and comparing it with the current CPU time. The process that missed its deadline and the processes found in S107 are then treated as candidates for resource reallocation, and the resource allocation unit 6 is notified to perform resource allocation for one of them.
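The decision made in S101 to S107 can be sketched as a single function. The dictionaries standing in for the system configuration information management table, the `is_normal` callback standing in for the query to the computer load amount collection unit 5, and the choice to include the missed process itself among the targets when the computer has stopped are assumptions for illustration.

```python
def decide_on_deadline_miss(missed, placement, is_normal, avg_cpu_ms, cur_cpu_ms):
    """Return (kind, processes) for the resource allocation unit.

    'targets'    -> every listed process must be moved (computer stopped)
    'candidates' -> moving any one of the listed processes is enough
    All parameter names are hypothetical.
    """
    computer = placement[missed]                                   # S102
    others = [p for p, c in placement.items()
              if c == computer and p != missed]                    # S103
    if not is_normal(computer):                                    # S104/S105
        # Assumption: the missed process is moved along with the others
        return "targets", [missed] + others                        # S106
    # S107: processes whose current CPU time exceeds their recorded average
    growing = [p for p in others if cur_cpu_ms[p] > avg_cpu_ms[p]]
    return "candidates", [missed] + growing
```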

図２３は、計算機負荷収集手段より通知が届いた場合に移行プロセス決定部３が行う動作フローの一例を示す図である。図７の機能説明図も参照しながら動作を説明する。計算機負荷量収集部５より負荷量が多いと通知が届くのは、計算機故障かメモリ容量が不足の場合があると考えられる。Ｓ１１１にて計算機負荷量収集部５よりこうした通知が届くと、Ｓ１１２にて、システム構成情報管理テーブル・メモリ７を参照することにより、異常が発生した計算機１００上で処理されているプロセス１０１を識別する。そしてＳ１１３にて、計算機負荷量収集部５よりの通知の内容が計算機故障かメモリ不足かを調べる。通知の内容が計算機故障であった場合にはＳ１１４にて、Ｓ１１２で調べた全プロセスをリソース再割当ての対象として、資源割当部６に通知する。また、通知の内容がメモリ不足であった場合にはＳ１１２で、識別して判明した全プロセス１０１をリソース再割当ての候補として、これらのプロセスのいずれかを他の計算機１００に移行させるよう、資源割当部６に通知する。 FIG. 23 is a diagram illustrating an example of the operation flow performed by the migration process determination unit 3 when a notification arrives from the computer load collection means. The operation will be described with reference also to the function explanatory diagram of FIG. 7. A notification of heavy load from the computer load amount collection unit 5 is considered to arrive in the case of a computer failure or insufficient memory capacity. When such a notification arrives from the computer load amount collection unit 5 in S111, S112 refers to the system configuration information management table memory 7 to identify the processes 101 being processed on the computer 100 where the abnormality occurred. In S113, it is checked whether the notification reports a computer failure or a memory shortage. If it reports a computer failure, S114 notifies the resource allocation unit 6 of all the processes found in S112 as targets of resource reallocation. If it reports a memory shortage, all the processes 101 identified in S112 are treated as candidates for resource reallocation, and the resource allocation unit 6 is notified to migrate one of these processes to another computer 100.
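The branch in S111 to S114 reduces to a small dispatch on the kind of notification. In this sketch the string labels for the two notification kinds and the dictionary mapping process names to computers are hypothetical.

```python
def decide_on_load_notification(kind, computer, placement):
    """Sketch of S111-S114; 'kind' labels are assumed, not from the source."""
    procs = [p for p, c in placement.items() if c == computer]   # S112
    if kind == "failure":                                        # S113
        return "targets", procs       # S114: every process must be moved
    if kind == "memory_shortage":
        return "candidates", procs    # moving one of them is enough
    raise ValueError(f"unknown notification kind: {kind}")
```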

図２４は、資源割当部６の動作フローの一例を示した図である。図１３の機能説明図も参照しながら動作を説明する。Ｓ１２１で、移行プロセス決定部３から、リソース再割当の候補またはリソース再割当の対象となるプロセスのリストが届く。リソース再割当の候補のリストが届く場合は、デッドラインミスが発生してメモリ不足が原因である場合であり、この場合には届いたリストに示すプロセスのいずれか少なくとも１つについて対処を行えばよい。リソース再割当の対象リストが届く場合は、計算機故障が発生した場合であり、この場合には届いた全てのプロセスに対してリソース再割当てを行う必要がある。
Ｓ１２２で、Ｓ１２１で届いたリストに記載されているプロセス名を例えばＣＰＵ時間の長い順にテーブルに保持する。Ｓ１２４で、Ｓ１２２で作成したプロセスのテーブルから、プロセス名を１つ取り出す。Ｓ１２６で、正常に動作する計算機１００のリストを計算機負荷量収集部５より取得し、例えば負荷の大きい順にテーブルに保持する。そして、Ｓ１２７以下を、Ｓ１２６のテーブルに示されている全計算機について、または、割当先計算機が見つかるまで繰り返す。
Ｓ１２７で、計算機１００のリストが示されたテーブル中、最も負荷の大きい計算機を１つ取り出す。Ｓ１２８で、Ｓ１２４で取り出したプロセス１０１がＳ１２７で選択した計算機１００上で動作できるかどうかを確認する。動作できるかどうか確認する方法としては、例えば、単純に計算機のＣＰＵ使用率のみで判断する方法や、各プロセスの処理周期とＣＰＵ時間をもとに処理のタイミングを実際に求め、デッドラインミスが発生せずにスケジューリングできるかどうかを分析する方法が考えられる。 FIG. 24 is a diagram illustrating an example of an operation flow of the resource allocation unit 6. The operation will be described with reference also to the function explanatory diagram of FIG. 13. In S121, a list of resource reallocation candidates or of processes targeted for resource reallocation arrives from the migration process determination unit 3. A candidate list arrives when a deadline miss has occurred and a memory shortage is the cause; in that case it suffices to handle at least one of the processes in the received list. A target list arrives when a computer failure has occurred; in that case resources must be reallocated for every process in the list.
In S122, the process names in the list received in S121 are held in a table, for example in descending order of CPU time. In S124, one process name is taken from the process table created in S122. In S126, a list of normally operating computers 100 is acquired from the computer load amount collection unit 5 and held in a table, for example in descending order of load. S127 and the subsequent steps are then repeated for all the computers in the table of S126, or until an allocation destination computer is found.
In S127, the computer with the highest load is taken from the table listing the computers 100. In S128, it is checked whether the process 101 taken in S124 can run on the computer 100 selected in S127. Possible ways to check this include simply judging from the CPU usage rate of the computer alone, or actually working out the processing timing from the processing period and CPU time of each process and analyzing whether scheduling is possible without a deadline miss.
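The simpler of the two checks in S128 can be illustrated with a utilization sum over periodic processes. This is a stand-in chosen for illustration, not the analysis the description has in mind: the bound of 1.0 and the (CPU time, period) tuple layout are assumptions, and a real timing analysis of the actual schedule could be stricter.

```python
def utilization(tasks):
    """Sum of cpu_ms / period_ms over (cpu_ms, period_ms) pairs."""
    return sum(cpu_ms / period_ms for cpu_ms, period_ms in tasks)


def can_host(existing, newcomer, bound=1.0):
    """True if adding newcomer keeps total utilization within the assumed bound."""
    return utilization(existing + [newcomer]) <= bound
```

For instance, a computer already running processes that need 10 ms every 50 ms and 20 ms every 100 ms has a utilization of 0.4, so under this check it can take a process needing 25 ms every 50 ms but not one needing 35 ms every 50 ms.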

Ｓ１２９にて資源割当部６が、Ｓ１２４でテーブルより取り出したプロセス１０１がＳ１２７で選んだ計算機１００上で動作可能であると判断した場合には、次にＳ１３０で、Ｓ１２６で取り出した計算機の空きメモリ量を調査し、Ｓ１２４で取り出したプロセスが使用するメモリ量以上空いているかどうかを確認する。十分なメモリが空いていればＳ１３２にて、Ｓ１２７で選択した計算機をＳ１２４で取り出したプロセスの移行先として決定する。そしてＳ１３３にて、Ｓ１３２で決定した計算機によってＳ１２４で取り出したプロセスを起動して処理させ、これまで動作していた計算機によるプロセスの処理を停止させる。
Ｓ１３４にて、システム構成情報管理テーブル・メモリ７の、Ｓ１２４にてテーブルより取り出したプロセスと、プロセスの割当先となった計算機（つまりＳ１２７で取り出した計算機）とＳ１２４にてテーブルより取り出したプロセスの現在のＣＰＵ時間に関する情報を書き換えて、Ｓ１２３に戻り、Ｓ１２２にて作成したテーブルの次のエントリに示されるプロセスの処理を行う。
なお、移行プロセス決定部３から「リソース再割当の候補プロセス群」としてプロセスのリストが与えられた場合には、通知されたすべてのプロセス１０１に対してリソース再割当を行う必要はなく、１つ以上のプロセスの再割当が決定すれば、Ｓ１２２にて作成したテーブルに示される他のプロセスの割当処理を行わずに終了してもよい。 If the resource allocation unit 6 determines in S129 that the process 101 taken from the table in S124 can run on the computer 100 selected in S127, then in S130 it examines the free memory of the computer taken in S126 and checks whether at least as much memory is free as the process taken in S124 uses. If enough memory is free, S132 decides on the computer selected in S127 as the migration destination of the process taken in S124. In S133, the process taken in S124 is started and run on the computer decided in S132, and its processing on the computer where it had been running is stopped.
In S134, the information in the system configuration information management table memory 7 on the process taken from the table in S124, the computer to which it was allocated (that is, the computer taken in S127), and the current CPU time of that process is rewritten. The flow then returns to S123 and handles the process in the next entry of the table created in S122.
When the migration process determination unit 3 supplies the list of processes as a “group of candidate processes for resource reallocation”, resources need not be reallocated for all of the notified processes 101. Once the reallocation of one or more processes has been decided, the flow may end without performing allocation for the other processes in the table created in S122.

Ｓ１２９において、Ｓ１２７で選択した計算機１００がプロセス１０１を処理するとデッドラインミスを発生させると判断した場合には、Ｓ１２６に戻り、次に負荷の大きい計算機を選択してＳ１２７以降を行う。
またＳ１３１で、Ｓ１２７で選択した計算機がＳ１２４で選択したプロセスを処理するために必要な空きメモリ量を持たない場合には、Ｓ１２６に戻り、テーブルの次のエントリにある計算機について、Ｓ１２７以降を実行する。Ｓ１２６において、全ての計算機についてＳ１２７を実行し、かつ割当先が見つからなかった場合には、そのプロセスのリソース割当を断念してＳ１３５に進み、Ｓ１２４で選んだプロセスを停止し、Ｓ１３６で、Ｓ１２２にて作成したテーブルから停止させたプロセス名を削除する。
なお、ここで示した再割当の方法はあくまで一例であり、Ｓ１２４〜Ｓ１３６の動作を応用した例は多数考えられるが、その他の例はここでは割愛する。 If it is determined in S129 that the computer 100 selected in S127 causes a deadline miss when the process 101 is processed, the process returns to S126, and the computer having the next largest load is selected and the processes from S127 are performed.
In S131, if the computer selected in S127 does not have the free memory needed to run the process selected in S124, the flow returns to S126 and executes S127 and the subsequent steps for the computer in the next entry of the table. If S127 has been executed for every computer in S126 and no allocation destination has been found, resource allocation for that process is abandoned and the flow proceeds to S135, where the process selected in S124 is stopped; in S136, the name of the stopped process is deleted from the table created in S122.
Note that the reallocation method shown here is merely an example, and there are many examples in which the operations of S124 to S136 are applied, but other examples are omitted here.
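As one possible reading of S121 to S136, the loop over processes and computers can be sketched as below. The data classes, the CPU-usage-only fit test in place of S128, and the in-place update of the computer records are assumptions for illustration; the description itself notes that many variations of this flow are possible.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    cpu_ms: float     # CPU time per period
    period_ms: float
    mem_mb: float


@dataclass
class Computer:
    name: str
    cpu_pct: float      # current CPU usage rate
    free_mem_mb: float


def reallocate(procs, computers, must_move_all):
    """Return {process name: computer name, or None if the process was stopped}."""
    result = {}
    ordered = sorted(procs, key=lambda p: p.cpu_ms, reverse=True)           # S122
    for proc in ordered:                                                    # S124
        ranked = sorted(computers, key=lambda c: c.cpu_pct, reverse=True)   # S126
        need_pct = 100.0 * proc.cpu_ms / proc.period_ms
        dest = None
        for comp in ranked:                                                 # S127
            if comp.cpu_pct + need_pct > 100.0:                             # S128/S129
                continue
            if comp.free_mem_mb < proc.mem_mb:                              # S130/S131
                continue
            dest = comp                                                     # S132
            break
        if dest is None:
            result[proc.name] = None                                        # S135/S136
            continue
        dest.cpu_pct += need_pct          # S133/S134: record the new placement
        dest.free_mem_mb -= proc.mem_mb
        result[proc.name] = dest.name
        if not must_move_all:
            break                         # candidate list: one move is enough
    return result
```

Passing `must_move_all=True` corresponds to a target list after a computer failure, and `False` to a candidate list, where the flow may end after the first successful reallocation.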

図２５は、これまでに説明した各構成要素による図１８から図２４で示される動作をもとに、計算機リソース動的制御装置１００ａが、または構成要素を分散した計算機リソース動的制御システムが連携して、動作する、デッドラインミスの継続を防ぐ総合動作フローを示した図である。
処理時間収集部２または計算機負荷量収集部５がＳ１４１にて、デッドラインミスの発生、または計算機の異常、ＣＰＵ、メモリ・リソース不足を検出して、移行プロセス決定部３に通知する。この場合の処理時間収集部２の動作はＳ１１〜Ｓ１６、計算機負荷量収集部５の動作はＳ１０１１〜Ｓ１０１８に示すとおりである。その後に移行プロセス決定部３はＳ１４２で、上記の通知の内容を調査する。通知の内容が計算機の故障である場合は、Ｓ１４４にて、故障した計算機上で動作していた全てのプロセスを対象として、資源割当部６がＳ１２１〜Ｓ１３６の手順によりリソースの割当を行う。計算機の故障でない場合は、Ｓ１４５にて、移行プロセス決定部３が、デッドラインミス発生の原因となったプロセス、リソース不足発生の原因となったプロセス、リソース再割当を行うプロセスを見つけ出す。そしてそれらのプロセスに対してＳ１４６にて、資源割当部６がＳ１２１〜Ｓ１３６の手順でリソース割当を行う。リソース割当終了後、Ｓ１４７にて資源割当部６は、システム構成情報管理テーブル・メモリ７の、新規にリソース割当を行ったプロセスの識別子と、そのプロセスを処理する割当先計算機、処理に要するＣＰＵ時間などに関する情報を新しいものに書き換える。 In FIG. 25, the computer resource dynamic control device 100a or the computer resource dynamic control system in which the components are distributed is linked based on the operations shown in FIGS. FIG. 5 is a diagram showing an overall operation flow that operates and prevents the continuation of a deadline miss.
In S141, the processing time collection unit 2 or the computer load amount collection unit 5 detects a deadline miss, a computer abnormality, or a shortage of CPU or memory resources, and notifies the migration process determination unit 3. The operation of the processing time collection unit 2 in this case is as shown in S11 to S16, and that of the computer load amount collection unit 5 as shown in S1011 to S1018. The migration process determination unit 3 then examines the content of the notification in S142. If the notification reports a computer failure, then in S144 the resource allocation unit 6 allocates resources by the procedure of S121 to S136 for all the processes that were running on the failed computer. Otherwise, in S145 the migration process determination unit 3 finds the process that caused the deadline miss, the process that caused the resource shortage, and the processes for which resources are to be reallocated. In S146, the resource allocation unit 6 allocates resources for those processes by the procedure of S121 to S136. After resource allocation ends, in S147 the resource allocation unit 6 rewrites, in the system configuration information management table memory 7, the identifier of each newly allocated process, the computer allocated to run it, the CPU time required for its processing, and related information.

即ち本実施の形態は、複数台の計算機がネットワークに接続され、各計算機上で複数のプロセスが連携または独立して動作し、各プロセスは一定の時間周期で処理を行い、制限時間内に処理を完了することが求められているような分散リアルタイムシステムにおいて、プロセスの処理時間を監視する処理時間監視手段と計算機のリソース使用状況を監視する計算機負荷監視手段とを持ち、各プロセスの処理時間に応じてプロセスに与えるリソースを変更することによってプロセスのデッドラインミスの発生継続を防止するシステムであって、以下の構成を付加することにより十分な空きリソースを持つ他計算機へのプロセスマイグレーションを可能とし、プロセスマイグレーション後における全プロセスのデッドラインミスの再発防止を可能とすることを特徴とする計算機リソース動的制御方式を示している。そして、
１１）システムを構成するプロセス一覧、各プロセスの平均処理時間、消費メモリ量、ＣＰＵ時間、各プロセスの１回あたりの処理における制限時間、システムを構成する計算機一覧、計算機が搭載するメモリ量を保持するシステム構成情報管理テーブル。
１２）システム上で動作する全プロセスの処理時間、ＣＰＵ時間の変動の履歴を保持するとともに、プロセスのデッドラインミスの発生を検出する処理時間収集手段。
１３）システム上で動作する全計算機のリソースの使用状況の履歴を保持し、過去の計算機リソースの使用状況を他の手段に通知するとともに、リソース不足の発生を検出する計算機負荷収集手段。
１４）デッドラインミス、ＣＰＵリソース不足、デッドラインミスなどの障害が発生した場合に、リアルタイム処理を継続するためにリソースの再割り当てを行うプロセスを決定する移行プロセス決定手段。
１５）移行プロセス決定手段が決定したプロセスに対して、システム構成情報管理テーブルより必要なリソース量を取得し、必要な空きリソースを持つ計算機上にプロセスマイグレーションを行う資源割付手段、を備えた。 In other words, in this embodiment, a plurality of computers are connected to a network, and a plurality of processes operate in cooperation or independently on each computer, and each process performs processing at a fixed time period and performs processing within a time limit. In a distributed real-time system that is required to complete the process, it has a processing time monitoring means for monitoring the processing time of the process and a computer load monitoring means for monitoring the resource usage status of the computer. In response to the change of the resources given to the process, the system prevents the continuation of process deadline mistakes. By adding the following configuration, process migration to other computers with sufficient free resources is possible. , Prevents the recurrence of deadline mistakes in all processes after process migration It shows a computer resource dynamic control method, characterized by. And
11) A system configuration information management table that holds the list of processes constituting the system, the average processing time, memory consumption, and CPU time of each process, the time limit for each execution of each process, the list of computers constituting the system, and the amount of memory installed in each computer.
12) Processing time collection means that holds the history of variation in the processing time and CPU time of all processes running on the system and detects the occurrence of process deadline misses.
13) Computer load collection means that holds the history of resource usage of all computers operating in the system, notifies other means of past computer resource usage, and detects the occurrence of resource shortages.
14) Migration process determination means that, when a failure such as a deadline miss, CPU resource shortage, or deadline miss occurs, determines the processes for which resources are to be reallocated so that real-time processing can continue.
15) Resource allocation means that, for the processes determined by the migration process determination means, acquires the required resource amounts from the system configuration information management table and migrates the processes to computers with the necessary free resources.

言い換えると、本実施の形態の計算機リソース動的制御装置は、対象となるプロセスを処理する計算機を複数台接続して、複数の上記プロセスを連携処理するシステム、に接続して、
各上記計算機のリソースの使用状態を監視して計算機のこのリソースの使用状態情報を収集し、このリソースの不足を検出する計算機負荷量収集部と、
上記プロセスの処理時間を監視してプロセスの処理時間を収集し、プロセスの処理限界時間を規定したデッドライン時間を超す処理となるデッドラインミスの発生を検出する処理時間収集部と、
システムで処理される上記プロセスの処理時間とこのプロセスの処理で使用する計算機のリソースの使用量とデッドライン時間とをプロセスの識別子と共に記憶し、システムに接続される上記計算機の所有リソース量を計算機の識別子と共に記憶するシステム構成情報テーブル・メモリと、
上記デッドラインミスの発生の通知により上記システム構成情報テーブル・メモリを参照して上記プロセスの処理を現行の計算機とは異なる計算機に再割当を勧告する移行プロセス決定部と、
上記再割当の勧告通知を受けて上記システム構成情報テーブル・メモリの上記プロセスの処理時間と上記計算機の所有リソース量とを参照して上記プロセスの処理時間が上記デッドライン時間内となるようプロセスと候補計算機とを抽出し、この抽出したプロセスをこの抽出した候補計算機に割当て実行させると同時に上記システム構成情報テーブル・メモリを更新する資源割当部と、を備えた。
このように、プロセスのデッドラインミスを検出し、規定回数以上デッドラインミスが発生したプロセス１０１の処理を他の負荷の軽い計算機１００に割り付けるか、デッドラインミスが発生した原因となるプロセスを計算機リソースに余裕のある他の計算機に割当ることにより、デッドラインミスの継続を防ぐことが可能となる。
また、原因が計算機故障の場合には、故障した計算機上で動作していた全てのプロセスを他の計算機に再割当することにより、処理の継続が可能となる。 In other words, the computer resource dynamic control device according to the present embodiment connects a plurality of computers that process the target process, and connects to a system that processes the plurality of processes in cooperation.
A computer load amount collection unit that monitors the resource usage state of each of the computers, collects this usage state information, and detects a shortage of the resource;
A processing time collection unit that monitors and collects the processing time of the processes and detects the occurrence of a deadline miss, that is, processing that exceeds the deadline time defining the processing time limit of a process;
A system configuration information table memory that stores, together with the process identifier, the processing time of each process handled by the system, the amount of computer resources used in processing it, and its deadline time, and that stores, together with the computer identifier, the amount of resources owned by each computer connected to the system;
A migration process determination unit that, on notification of a deadline miss, refers to the system configuration information table memory and recommends reallocating the processing of the process to a computer other than the current one; and
A resource allocation unit that, on receiving the reallocation recommendation, refers to the process processing times and the computer-owned resource amounts in the system configuration information table memory, extracts a process and a candidate computer such that the processing time of the process falls within the deadline time, has the extracted candidate computer run the extracted process, and at the same time updates the system configuration information table memory.
In this way, process deadline misses are detected, and either the processing of a process 101 that has missed its deadline at least the specified number of times is allocated to another lightly loaded computer 100, or the process causing the deadline misses is allocated to another computer with spare computer resources. This makes it possible to prevent deadline misses from continuing.
When the cause is a computer failure, processing can continue by reallocating all the processes that were running on the failed computer to other computers.

実施の形態２．
実施の形態１では、計算機がプロセスを処理する実績に基づいてデッドラインミスが生じたプロセスを負荷の軽い計算機に移す場合を説明したが、本実施の形態では学習により将来の予測をしてデッドラインミスが発生しそうであると、プロセスを他の計算機に移して、更にデッドラインミスの発生確率を減らす装置、システムを説明する。
図２６は、本実施の形態における計算機リソース動的制御システムの構成の例を示す図である。本実施の形態における計算機リソース動的制御システムは、実施の形態１における構成のほか、デッドラインミス発生時刻予測部２１、計算機負荷予測部２２、ＣＰＵ時間予測部２３、メモリ使用量予測部２４を備えている。その他の構成要素の、処理時間監視部１、計算機負荷監視部４、システム構成情報管理テーブル・メモリ７ないし計算機状態テーブル・メモリ１０の機能と動作は、実施の形態１と同一であり、説明を省略する。
同様に図２７は、本実施の形態における計算機リソース動的制御装置１００ｂの構成を示す図である。即ち図２６に示す分散配置でシステムとして動的制御することに代えて、特定の計算機が全ての構成要素を備えて、制御装置となってもよい。
また、処理時間収集部がデッドラインミス発生時刻予測部２１を含んで、処理時間収集・予測部２ｂとなる構成とし、計算機負荷量収集部が計算機負荷予測部２２を含んで、計算機負荷量収集・予測部５ｂとなる構成としてもよい。 Embodiment 2.
Embodiment 1 described the case where, based on the computers' actual processing record, a process that has missed its deadline is moved to a lightly loaded computer. This embodiment describes an apparatus and system that predict the future by learning and, when a deadline miss appears likely, move the process to another computer, further reducing the probability of deadline misses.
FIG. 26 is a diagram showing an example of the configuration of the computer resource dynamic control system in the present embodiment. In addition to the configuration of Embodiment 1, this system includes a deadline miss occurrence time prediction unit 21, a computer load prediction unit 22, a CPU time prediction unit 23, and a memory usage prediction unit 24. The functions and operations of the other components, namely the processing time monitoring unit 1, the computer load monitoring unit 4, and the system configuration information management table memory 7 through the computer state table memory 10, are the same as in Embodiment 1, and their description is omitted.
Similarly, FIG. 27 is a diagram showing the configuration of the computer resource dynamic control device 100b in the present embodiment. That is, instead of performing dynamic control as a system with the distributed arrangement shown in FIG. 26, a specific computer may include all the components and serve as the control device.
Further, the processing time collection unit may include the deadline miss occurrence time prediction unit 21 to form a processing time collection/prediction unit 2b, and the computer load amount collection unit may include the computer load prediction unit 22 to form a computer load amount collection/prediction unit 5b.

以下、本実施の形態における移行プロセス決定部３ｂ、資源割当部６ｂ、デッドラインミス発生時刻予測部２１または処理時間収集・予測部２ｂ、計算機負荷予測部２２または計算機負荷量収集・予測部５ｂ、ＣＰＵ時間予測部２３、メモリ使用量予測部２４の機能について説明する。
本実施の形態における移行プロセス決定部３ｂは、処理時間収集部２から、デッドラインミス発生の通知を受け取る機能に加え、デッドラインミス発生時刻予測部２１からも通知を受け取る機能を持つ点が実施の形態１と異なる。また、デッドラインミス発生時刻予測部２１からデッドラインミス発生の通知を受け取った場合には、計算機負荷予測部２２やメモリ使用量予測部２４やＣＰＵ時間予測部２３にも問い合わせることによってデッドラインミス発生時の計算機負荷、メモリ使用量、ＣＰＵ時間（計算機のプロセッサがそのプロセスを処理する時間）の予測値を取得し、これらの予測値に基づき、将来発生すると思われるデッドラインミス発生の原因を推測する点が実施の形態１と異なる。
本実施の形態における資源割当部６ｂは、移行プロセス決定部３ｂがデッドラインミス発生時刻予測部２１からデッドラインミス通知を受け取った場合には、計算機負荷予測部２２、ＣＰＵ時間予測部２３、メモリ使用量予測部２４に問い合わせることによって、プロセスの再割当先として適切な計算機１００を選択する点が実施の形態１と異なる。 Hereinafter, the migration process determination unit 3b, the resource allocation unit 6b, the deadline miss occurrence time prediction unit 21 or the processing time collection / prediction unit 2b, the computer load prediction unit 22 or the computer load amount collection / prediction unit 5b in the present embodiment, The functions of the CPU time prediction unit 23 and the memory usage amount prediction unit 24 will be described.
The migration process determination unit 3b in the present embodiment differs from Embodiment 1 in that, in addition to receiving deadline miss notifications from the processing time collection unit 2, it also receives notifications from the deadline miss occurrence time prediction unit 21. It also differs in that, on receiving a deadline miss notification from the deadline miss occurrence time prediction unit 21, it queries the computer load prediction unit 22, the memory usage prediction unit 24, and the CPU time prediction unit 23 for the predicted computer load, memory usage, and CPU time (the time the computer's processor spends processing the process) at the time of the miss, and infers from these predictions the cause of the deadline miss expected to occur in the future.
The resource allocation unit 6b in the present embodiment differs from Embodiment 1 in that, when the migration process determination unit 3b has received a deadline miss notification from the deadline miss occurrence time prediction unit 21, it selects a computer 100 suitable as the reallocation destination by querying the computer load prediction unit 22, the CPU time prediction unit 23, and the memory usage prediction unit 24.

図２８は、本実施の形態におけるデッドラインミス発生時刻予測部２１の機能を示した図である。デッドラインミス発生時刻予測部２１は、処理時間履歴テーブル・メモリ８に示される処理開始時刻、処理終了時刻から時刻と処理時間の変動の関係を統計学などにより求め、近似式で表現する機能を持つ。また、求めた近似式を利用してデッドラインミス発生時刻を予測し、処理時間収集・予測部２ｂか、または直接に移行プロセス決定部３ｂに、デッドライン発生予測時刻を通知する役割を持つ。例えば、図２８の例では、５０ミリ秒周期で処理を行っており、処理を行うたびに処理時間が１０ミリ秒ずつ長くなっているケースを示している。この場合の近似式はｙ＝０．２×（ｘ−最も古いデータ内の処理開始時刻）＋１０ミリ秒（ｙは処理時間、ｘは処理開始時刻）と求めることができ、例えば処理のデッドラインが１００ミリ秒であった場合には、１０時１０分１０秒４５０ミリ秒にデッドラインミスが発生することがわかる。この場合、デッドラインミス発生時刻予測部２１は、移行プロセス決定部３ｂに「１０時１０分１０秒４５０ミリ秒にデッドラインミスが発生する」との予告を通知することになる。 FIG. 28 is a diagram illustrating the function of the deadline miss occurrence time prediction unit 21 in the present embodiment. The deadline miss occurrence time prediction unit 21 has a function of calculating the relationship between the time and the processing time from the processing start time and the processing end time indicated in the processing time history table memory 8 by using statistics or the like, and expressing it by an approximate expression. Have. Further, it has a role of predicting a deadline miss occurrence time using the obtained approximate expression and notifying the processing time collection / prediction unit 2b or the transition process determination unit 3b directly to the predicted deadline occurrence time. For example, the example of FIG. 28 shows a case where processing is performed at a cycle of 50 milliseconds and the processing time is increased by 10 milliseconds each time processing is performed. The approximate expression in this case can be obtained as y = 0.2 × (x−processing start time in the oldest data) +10 milliseconds (y is the processing time, x is the processing start time). Is 100 milliseconds, it can be seen that a deadline miss occurs at 10:10:10 seconds 450 milliseconds. In this case, the deadline miss occurrence time prediction unit 21 notifies the transition process determination unit 3b of a notice that “a deadline miss will occur at 10: 10: 10: 450 ms”.
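The prediction described for FIG. 28 can be sketched as a straight-line fit followed by solving for the time at which the fitted line reaches the deadline. The description only says the relationship is obtained "by statistics or the like", so the ordinary least-squares fit used here is an assumption, as are the function names; times are expressed in milliseconds relative to the oldest processing start time.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b (assumed fitting method)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def crossing_time(a, b, limit):
    """x at which a * x + b reaches limit, or None if the line does not rise."""
    return (limit - b) / a if a > 0 else None


# History matching the example: 50 ms cycle, processing time +10 ms per cycle
starts_ms = [0, 50, 100, 150]        # offsets from the oldest start time
durations_ms = [10, 20, 30, 40]
a, b = fit_line(starts_ms, durations_ms)
miss_offset_ms = crossing_time(a, b, limit=100)   # deadline of 100 ms
```

With this history the fit recovers a slope of about 0.2 and an intercept of about 10 ms, and the predicted miss falls about 450 ms after the oldest start time, in line with the worked example in the text.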

図２９は、本実施の形態における計算機負荷予測部２２の機能を示した図である。計算機負荷予測部２２は、計算機負荷履歴テーブル・メモリ９に示されるＣＰＵ使用率から、時刻とＣＰＵ負荷の変動の関係を統計学などの手法により求め、変動を近似式にて表現する機能を持つ。また、求めた近似式を利用して、デッドラインミス発生予測時刻における各計算機のＣＰＵ使用率を求め、移行プロセス決定部３ｂなどからのリクエストに応じてこれらの値を通知する機能を持つ。また、ＣＰＵ使用率がある上限値を超える時刻を求め、移行プロセス決定部３ｂに通知する機能を持つ。ある上限値とは、１００％でもよいし、ユーザによってあらかじめ決めておいてもよい。
例えば、図２９では、ある計算機の負荷が５０ミリ秒ごとに０．５％ずつ増加しているケースを示している。この場合のＣＰＵ使用率の近似式はｙ＝０．０１×（ｘ−最も古いデータの受信時刻）＋１０（ｙはＣＰＵ使用率、ｘは時刻）と求めることができ、例えば、移行プロセス決定部３ｂから、１０時１０分１０秒４５０ミリ秒のＣＰＵ負荷について問い合わせがあった場合には、１４．５％と回答することになる。もし、問い合わせと同じ時刻のデータが存在しなければ、テーブルに示される中で最も近い時刻データを取得し、回答することになる。また、ＣＰＵ使用率の許容可能な上限値を１００％とした場合、本式より、ＣＰＵ使用率が１００％になるのは１０時１０分２８秒になる。
計算機負荷予測部２２は、プロセスごとにあってもよいし、システムで１つでもよい。ただしシステムで１つの場合は、全てのプロセスに対して近似式を求めることになる。 FIG. 29 is a diagram showing the function of the computer load prediction unit 22 in the present embodiment. The computer load predicting unit 22 has a function of calculating the relationship between time and CPU load variation from a CPU usage rate shown in the computer load history table memory 9 by a technique such as statistics and expressing the variation with an approximate expression. . Further, it has a function of obtaining the CPU usage rate of each computer at the estimated deadline miss occurrence time by using the obtained approximate expression and notifying these values in response to a request from the migration process determining unit 3b or the like. Further, it has a function of obtaining a time when the CPU usage rate exceeds a certain upper limit value and notifying the migration process determination unit 3b. The certain upper limit value may be 100%, or may be determined in advance by the user.
For example, FIG. 29 shows a case where the load on a certain computer increases by 0.5% every 50 milliseconds. The approximate expression of the CPU usage rate in this case is y = 0.01 × (x − reception time of the oldest data) + 10 (y is the CPU usage rate, x is the time). If, for example, the migration process determination unit 3b asks about the CPU load at 10:10:10.450, the answer is 14.5%. If no data exists for the queried time, the data for the closest time in the table is acquired and returned. Also, if the allowable upper limit of the CPU usage rate is 100%, this expression gives 10:10:28 as the time at which the CPU usage rate reaches 100%.
The computer load prediction unit 22 may be provided per process, or there may be one for the whole system. In the latter case, however, it obtains approximate expressions for all processes.
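The two kinds of answer mentioned for the computer load prediction unit 22, a value from the approximate expression and a fallback to the closest recorded sample, can be sketched as follows. The function names and the tie-breaking rule (the earlier sample wins when two are equally close) are assumptions; times are in milliseconds relative to the reception time of the oldest data.

```python
import bisect


def predict_cpu(slope_pct_per_ms, base_pct, oldest_ms, query_ms):
    """Evaluate y = slope * (x - oldest) + base at the queried time."""
    return slope_pct_per_ms * (query_ms - oldest_ms) + base_pct


def nearest_sample(times_ms, values, query_ms):
    """Value whose timestamp is closest to query_ms (times_ms must be sorted)."""
    i = bisect.bisect_left(times_ms, query_ms)
    if i == 0:
        return values[0]
    if i == len(times_ms):
        return values[-1]
    before, after = times_ms[i - 1], times_ms[i]
    # Assumed rule: on a tie, prefer the earlier sample
    return values[i] if after - query_ms < query_ms - before else values[i - 1]
```

Evaluating `predict_cpu(0.01, 10, 0, 450)` reproduces the 14.5% figure quoted for a query 450 ms after the oldest data.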

FIG. 30 is a diagram showing the function of the CPU time prediction unit 23 in the present embodiment. The CPU time prediction unit 23 derives the relationship between time and CPU time variation from the CPU time of each process recorded in the processing time history table memory 8, using a statistical or similar technique, and expresses it as an approximate expression. In response to a request from the migration process determination unit 3b or the like, it uses that expression to obtain and report the CPU time of each process at the predicted deadline miss occurrence time.
For example, FIG. 30 shows a case where the CPU time on a certain computer increases by 10 milliseconds every 50 milliseconds. In this case, the approximate expression for the CPU time is y = 0.2 × (x − processing start time in the oldest received data) + 9, where y is the CPU time in milliseconds, x is the processing start time, and the time difference is measured in milliseconds. If, for example, the migration process determination unit 3b inquires about the CPU time at 10:10:10.450, the CPU time prediction unit 23 answers 99 milliseconds.
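The 99 millisecond answer can be checked by direct substitution, assuming (as that answer implies) that the processing start time in the oldest received data, written $x_0$ here, is 10:10:10.000, so that the queried time lies 450 ms after it:

```latex
y = 0.2 \times (x - x_0) + 9 = 0.2 \times 450 + 9 = 99\ \text{ms}
```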

FIG. 31 is a diagram showing the function of the memory usage prediction unit 24 in the present embodiment. The memory usage prediction unit 24 derives the variation of memory usage over time from the memory usage recorded in the computer load history table memory 9, using a statistical or similar technique, and expresses the variation as an approximate expression. Based on that expression, it obtains the time at which a memory shortage will occur and notifies the migration process determination unit 3b. It also uses the expression to obtain the memory usage of each computer at the predicted deadline miss occurrence time and reports it in response to a request from the migration process determination unit 3b or the like. The amount of free memory below which memory is considered insufficient may be given to the system by the user, or memory may be considered insufficient when the free memory reaches 0. To keep the description simple, memory is considered insufficient here when the free memory reaches 0.
For example, FIG. 31 shows a case where the memory usage of a certain computer increases by 2 MB every 50 milliseconds. In this case, the approximate expression for the memory usage is y = 0.04 × (x − reception time of the oldest data) + 100, where y is the memory usage in MB, x is the time, and the time difference is measured in milliseconds. If the computer has 180 MB of installed memory, a memory shortage will occur after 2 seconds, and this is reported to the processing time collection and prediction unit 2b or directly to the migration process determination unit 3b. If the migration process determination unit 3b or the like inquires about the memory usage at 10:10:10.450, the answer is 118 MB.
There may be one memory usage prediction unit 24 per computer or one for the whole system, but approximate expressions are obtained for all computers.
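The two figures quoted for FIG. 31 (shortage after 2 seconds, 118 MB at the 450 ms query) follow from the linear expression by solving it for the installed memory and by evaluating it at the query time. A minimal sketch, with hypothetical function and constant names and with elapsed time counted in milliseconds from the oldest sample:

```python
def time_until_shortage_ms(slope_mb_per_ms, intercept_mb, limit_mb):
    """Solve slope * x + intercept = limit for x (ms after the oldest sample).

    Returns None when usage is not growing, since the limit is then never
    reached under a linear model.
    """
    if slope_mb_per_ms <= 0:
        return None
    return (limit_mb - intercept_mb) / slope_mb_per_ms


def predicted_usage_mb(slope_mb_per_ms, intercept_mb, elapsed_ms):
    """Evaluate the linear approximation at a given elapsed time."""
    return slope_mb_per_ms * elapsed_ms + intercept_mb


SLOPE = 0.04          # 2 MB per 50 ms, as in the example
INTERCEPT = 100.0     # MB in use at the oldest sample
INSTALLED_MB = 180.0  # memory shortage is declared when free memory reaches 0

shortage_ms = time_until_shortage_ms(SLOPE, INTERCEPT, INSTALLED_MB)
usage_at_450 = predicted_usage_mb(SLOPE, INTERCEPT, 450)
print(f"shortage after {shortage_ms:.0f} ms, usage at 450 ms = {usage_at_450:.0f} MB")
```

A user-defined threshold, which the text also allows, would simply replace `INSTALLED_MB` with the chosen upper limit.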

FIG. 32 is a diagram showing an example of the operation flow of the deadline miss occurrence time prediction unit 21 in the present embodiment. The operation is described with reference also to the function explanatory diagram of FIG. 28. In S21, the deadline miss occurrence time prediction unit 21 refers to all entries in the processing time history table memory 8 and obtains the start time and processing time of each entry. The processing time is the processing end time minus the processing start time. In S22, the relationship between processing start time and processing time is expressed as an approximate expression using a statistical technique. The least squares method, for example, can be used to obtain the approximate expression. The order of the expression may be fixed in advance, or an upper limit may be set and the order raised step by step from first order until an expression that approximates the data accurately is found. In S23, the deadline miss occurrence time is obtained from that expression. For example, if the expression obtained in S22 is $y = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + b$ (y: processing time, x: time), the deadline miss occurrence time is the value of x obtained when the deadline time is substituted for y.
In S24, it is determined whether a deadline miss will occur in the future. This can be judged, for example, by whether the time obtained in S23 is before or after the current time. If it is judged that a deadline miss will occur in the future, then in S25 the name of the process expected to miss its deadline is reported to the migration process determination unit 3b, either via the processing time collection and prediction unit or directly. This operation may be performed asynchronously with the processing of the process, or each time the process finishes its processing.
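Steps S22 to S24 can be sketched as below. This is an illustration under several assumptions that are not in the source: time is counted in cycle indices rather than clock time, the history is synthetic, the "accurate enough" criterion is a maximum residual below a tolerance, and the crossing with the deadline is found by bisection on an interval where the fitted curve is assumed to cross the deadline only once. All function names are hypothetical.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...], lowest order first.
    """
    m = degree + 1
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(m)] for i in range(m)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def fit_with_rising_degree(xs, ys, max_degree, tolerance):
    """S22: raise the order from 1 until the worst residual is within tolerance."""
    for degree in range(1, max_degree + 1):
        coeffs = polyfit(xs, ys, degree)
        worst = max(abs(evaluate(coeffs, x) - y) for x, y in zip(xs, ys))
        if worst <= tolerance:
            break
    return degree, coeffs


def first_crossing(coeffs, limit, lo, hi, iterations=60):
    """S23: bisection for the x in [lo, hi] where the fitted curve reaches limit."""
    if evaluate(coeffs, lo) >= limit:
        return lo
    if evaluate(coeffs, hi) < limit:
        return None  # the deadline is not reached within the search horizon
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) >= limit:
            hi = mid
        else:
            lo = mid
    return hi


# Synthetic history: processing time (ms) per cycle, growing quadratically.
xs = list(range(8))
ys = [20 + 2 * x + 0.5 * x * x for x in xs]
deadline_ms = 100.0

degree, coeffs = fit_with_rising_degree(xs, ys, max_degree=3, tolerance=1e-6)
now = xs[-1]
miss_at = first_crossing(coeffs, deadline_ms, lo=now, hi=100.0)
will_miss = miss_at is not None and miss_at > now  # S24
print(degree, miss_at, will_miss)
```

On this data the first-order fit is rejected and the second-order fit is accepted; the predicted miss falls a few cycles after the last sample, so S24 would judge that a miss lies in the future.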

FIG. 33 is a diagram showing an example of the operation flow of the computer load prediction unit 22 in the present embodiment. The operation is described with reference also to the function explanatory diagram of FIG. 29. In S201, all entries in the computer load history table memory 9 are referred to, and the load acquisition time of each entry and the computer load at that time are obtained. In S202, the relationship between computer load and load acquisition time is expressed as an approximate expression using a statistical technique. As with the deadline miss occurrence time prediction means, the least squares method, for example, can be used. Likewise, the order of the expression may be fixed in advance, or an upper limit may be set and the order raised step by step, taking the expression that best follows the variation of the data as the approximate expression.
In S203, the time at which the CPU usage rate reaches 100% or a predetermined upper limit is obtained from the expression found in S202. If the CPU usage rate is predicted to reach the upper limit, then in S205 the time at which it will do so and the name of the computer are reported. The processing of S201 to S205 may be performed in synchronization with the notification of computer load information from the computer load amount collection unit 5, or asynchronously.

FIG. 34 is a diagram showing an example of the operation flow of the CPU time prediction unit 23 in the present embodiment. The operation is described with reference also to the function explanatory diagram of FIG. 30. In S211, all entries in the processing time history table memory are referred to, and the processing start time and CPU time of each entry are obtained. In S212, the relationship between CPU time acquisition time and CPU time is expressed as an approximate expression using a statistical technique. The least squares method described above, for example, can be used, and the order of the expression is handled in the same way as above. This operation may be performed either synchronously or asynchronously with the processing of the process.
FIG. 35 is a diagram showing an example of the operation flow of the memory usage prediction unit 24 in the present embodiment. In S221, all entries in the computer load history table memory 9 are referred to, and the load acquisition time of each entry and the memory usage at that time are obtained. In S222, the relationship between load acquisition time and memory usage is expressed by an approximate expression obtained in the same way as described above. This processing, too, may be performed in synchronization with the notification to the computer load amount collection unit 5, or asynchronously. In S223, the time at which a memory shortage will occur is obtained. For example, if the expression obtained in S222 is $y = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + b$ (y: memory usage, x: time), it suffices to find the value of x when, for example, the installed memory of the computer or an upper memory limit set by the user is substituted for y. If the time obtained is later than the current time, a memory shortage is predicted, and in S225 the time at which the shortage will occur and the name of the computer are reported to the migration process determination unit 3b.

FIG. 36 is a diagram showing an example of the operation flow of the migration process determination unit 3b in the present embodiment when a prediction notification of a deadline miss arrives from the deadline miss occurrence time prediction unit 21.
In S231, a notification arrives from the deadline miss occurrence time prediction unit 21 stating that a deadline miss will occur at a certain time. The notification may be accompanied by, for example, the name of the process in which the deadline miss will occur. When that process name arrives in S231, the name of the computer running the process is obtained in S232 by referring to the system configuration information management table memory 7. In S233, the names of the other processes 101 running on that computer 100 are obtained. In S234, the processes found in S233 are examined to identify those whose CPU time will have increased at the predicted deadline miss occurrence time. A process whose CPU time has increased is one whose processing load has actually risen. Because the system configuration information management table holds the average CPU time of each process, the judgment can be made by obtaining the average CPU time from that table, querying the CPU time prediction unit 23 for the CPU time of the process at the deadline miss occurrence time, and comparing the two values. In S235, the process predicted to miss its deadline and the processes found in S234 are reported to the resource allocation unit 6b as candidate processes for resource reallocation, so that at least one of them is migrated.
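The candidate selection of S232 to S235 amounts to a lookup followed by a comparison. The sketch below assumes, purely for illustration, that the system configuration information management table is represented by two dictionaries (process to computer, process to average CPU time) and that the answers of the CPU time prediction unit 23 are available as a third dictionary; the process and computer names are made up.

```python
def select_candidates(missing_process, placement, average_cpu_ms, predicted_cpu_ms):
    """Return candidate processes for resource reallocation.

    placement:        process name -> computer name (stand-in for table memory 7)
    average_cpu_ms:   process name -> average CPU time held in the table
    predicted_cpu_ms: process name -> CPU time predicted at the deadline miss time
    """
    # S232: computer running the process that is expected to miss its deadline.
    host = placement[missing_process]
    # S233: other processes on the same computer.
    others = [p for p, h in placement.items() if h == host and p != missing_process]
    # S234: keep those whose predicted CPU time exceeds their average.
    growing = [p for p in others if predicted_cpu_ms[p] > average_cpu_ms[p]]
    # S235: the missing process plus the growing ones are the candidates.
    return [missing_process] + growing


placement = {"nav": "pc1", "radar": "pc1", "logger": "pc1", "ui": "pc2"}
average_cpu_ms = {"nav": 20, "radar": 30, "logger": 5, "ui": 10}
predicted_cpu_ms = {"nav": 45, "radar": 60, "logger": 5, "ui": 40}

candidates = select_candidates("nav", placement, average_cpu_ms, predicted_cpu_ms)
print(candidates)
```

Here `logger` is excluded because its CPU time is not growing, and `ui` is excluded because it runs on a different computer.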

FIG. 37 is a diagram showing an example of another operation flow of the migration process determination unit 3b in the present embodiment, namely the flow when a prediction notification arrives from the memory usage prediction unit 24 or the computer load prediction unit 22.
When a prediction notification arrives from the memory usage prediction unit 24 or the computer load prediction unit 22 in S241, the system configuration information management table memory 7 is referred to in S242 to obtain the names of the processes 101 running on the computer 100 where the memory shortage will occur. The processes found in S242 are made candidates for resource reallocation and are reported to the resource allocation unit 6b.
The operation of the migration process determination unit 3b when it receives a notification from the processing time collection and prediction unit 2b or the computer load amount collection unit 5 is the same as in Embodiment 1.

FIG. 38 is a diagram showing an example of the operation flow of the resource allocation unit 6b in the present embodiment.
When the list of resource reallocation candidates, that is, the processes subject to resource reallocation, arrives from the migration process determination unit 3b in S251, the process names in the list are stored in a table in S252, ordered, for example, from the longest to the shortest CPU time used at the time of the deadline miss. The CPU time at the time of the deadline miss is obtained by querying the CPU time prediction unit 23.
In S254, one process name is taken from the table of process names created in S252. In S255, a list of the computers 100 that are operating normally is obtained from the computer load amount collection unit 5 and stored in a table ordered, for example, from the highest to the lowest load at the time of the deadline miss. The computer load at the deadline miss occurrence time is obtained by querying the computer load prediction unit 22.
In S257, one computer, for example the one with the highest CPU usage rate, is taken from the table of computers. In S258, it is checked whether the process 101 taken out in S254 can run on the computer 100 selected in S257. Possible ways of checking include judging simply from the CPU usage rate of the computer, or analyzing whether the processes can be scheduled without a deadline miss by actually examining, from the period of each process, which process occupies the CPU at which time.

If it is judged in S259 that the process taken from the table in S254 can run on the computer 100 selected in S257, then in S260 the amount of free memory on the computer 100 taken out in S257 at the deadline miss occurrence time is examined, and it is checked whether the free memory is at least as large as the amount of memory used by the process 101 taken out in S254. The free memory at the deadline miss occurrence time can be found by querying the memory usage prediction unit 24 for the memory usage at that time, looking up the installed memory of the computer taken out in S256 in the system configuration information management table memory 7, and subtracting the former value from the latter.
If there is enough free memory, then in S262 the computer 100 selected in S257 is decided on as the reallocation destination for the process 101 taken out in S254. In S263, the process 101 taken out in S254 is started on the computer 100 decided on in S261, and the process that was running on the original computer is stopped. In S264, the entries in the system configuration information management table memory 7 for the process taken from the table in S254, the computer 100 to which the process was allocated (the computer taken out in S256), and the CPU time at the time of the deadline miss of the process 101 taken from the table in S254 are rewritten. Processing then returns to S253, and S253 to S264 are carried out for the process in the next entry of the table created in S252.

When the dynamic resource control device or system is operating in response to a notification that originated from the deadline miss occurrence time prediction unit 21 or the memory usage prediction unit 24 and was passed on by the migration process determination unit 3b, that is, when a list of processes 101 has been given as a group of candidate processes for resource reallocation and it suffices to reallocate at least one of the listed processes, processing may end without reallocating all of the notified processes.
If it is judged in S259 that running the process 101 on the computer 100 selected in S257 would cause a deadline miss, processing returns to S256, and S257 onward is executed for all the other computers 100 in the table created in S255.
Likewise, if the computer 100 selected in S257 is found in S261 not to have enough free memory, processing returns to S256, and S257 onward is executed for all the computers in the table created in S255. If, in S256, every computer has been checked and the process is still unallocated, it is judged that no computer can accept it and resource allocation is abandoned. The process itself is stopped in S266, it is deleted from the table in S265, and S253 onward is carried out for the next process in the table.
The reallocation method shown here is only one example. Many variations applying the operations of S254 to S264 are conceivable, but they are omitted here.
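The double loop of FIG. 38 (processes in the outer loop, computers in the inner loop) can be sketched as follows. This is a simplified illustration rather than the flowchart itself: it uses the simple CPU-usage-rate check mentioned for S258, the per-process CPU share `cpu_pct` and all class and field names are assumptions, the predicted values that the text obtains from the prediction units are passed in directly, and starting and stopping processes is reduced to recording the placement.

```python
from dataclasses import dataclass


@dataclass
class Computer:
    name: str
    predicted_cpu_pct: float   # CPU usage predicted at the deadline miss time
    installed_mb: float        # installed memory from the configuration table
    predicted_used_mb: float   # memory usage predicted at the deadline miss time


@dataclass
class Process:
    name: str
    predicted_cpu_ms: float    # CPU time predicted at the deadline miss time
    cpu_pct: float             # assumed CPU share the process adds to a computer
    memory_mb: float           # memory the process uses


def reallocate(processes, computers, cpu_limit_pct=100.0):
    """Return process name -> destination computer name (None if none fits)."""
    placement = {}
    # S252: longest predicted CPU time first.
    for proc in sorted(processes, key=lambda p: p.predicted_cpu_ms, reverse=True):
        # S255: computers ordered by predicted load, highest first, as in the text.
        for comp in sorted(computers, key=lambda c: c.predicted_cpu_pct, reverse=True):
            # S258/S259: simple CPU-usage-rate check.
            fits_cpu = comp.predicted_cpu_pct + proc.cpu_pct <= cpu_limit_pct
            # S260/S261: free memory = installed memory - predicted usage.
            free_mb = comp.installed_mb - comp.predicted_used_mb
            if fits_cpu and free_mb >= proc.memory_mb:
                placement[proc.name] = comp.name            # S262
                comp.predicted_cpu_pct += proc.cpu_pct      # bookkeeping, cf. S264
                comp.predicted_used_mb += proc.memory_mb
                break
        else:
            placement[proc.name] = None  # no computer can accept the process
    return placement


computers = [
    Computer("A", predicted_cpu_pct=90, installed_mb=512, predicted_used_mb=500),
    Computer("B", predicted_cpu_pct=60, installed_mb=512, predicted_used_mb=200),
    Computer("C", predicted_cpu_pct=30, installed_mb=256, predicted_used_mb=100),
]
processes = [
    Process("p1", predicted_cpu_ms=80, cpu_pct=20, memory_mb=64),
    Process("p2", predicted_cpu_ms=40, cpu_pct=50, memory_mb=200),
    Process("p3", predicted_cpu_ms=10, cpu_pct=5, memory_mb=10),
]

result = reallocate(processes, computers)
print(result)
```

In this example `p2` stays unallocated because the only computer with enough CPU headroom lacks the memory, which corresponds to the branch where resource allocation is abandoned for that process.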

FIG. 39 is a diagram showing an outline of the operation flow of the computer resource dynamic control device or system in the present embodiment when the components described above operate in cooperation.
In the figure, in S271 the processing time collection unit 2, the computer load amount collection unit 5, the deadline miss occurrence time prediction unit 21, and the memory usage prediction unit 24 notify the migration process determination unit 3b of a deadline miss, a predicted deadline miss, a computer abnormality, a CPU or memory resource shortage, or a predicted CPU or memory resource shortage. The occurrence of a deadline miss is detected by the procedure of S11 to S16. A predicted deadline miss is detected by the procedure of S21 to S25. Computer abnormalities and CPU or memory resource shortages are detected by the procedure of steps S1011 to S1016. A CPU resource shortage is predicted by the procedure of steps S201 to S205, and a memory resource shortage by the procedure of steps S221 to S225.
Then, in S272, the migration process determination unit 3b examines the content of the notification. If the notification reports a computer failure, then in S274 the resource allocation unit 6b allocates resources, by the procedure of S251 to S265, for all the processes 101 that were running on the failed computer 100.

If the notification does not report a computer failure, then in S275 the migration process determination unit 3b identifies the process considered to be the cause of the deadline miss or of the resource shortage, and in S276 the resource allocation unit 6b allocates resources by the procedure of S251 to S265. After resource allocation is complete, in S277 the resource allocation unit 6b rewrites, in the system configuration information management table memory 7, the identifier of the newly allocated process, the information on the destination computer 100 on which the process now runs, and related entries.

That is, the present embodiment describes a computer resource dynamic control method for a distributed real-time system in which a plurality of computers are connected to a network, a plurality of processes run on each computer either cooperatively or independently, and each process performs its processing at a fixed period and is required to complete it within a time limit. The system has processing time monitoring means for monitoring the processing time of each process and computer load monitoring means for monitoring the resource usage of each computer, and prevents deadline misses by changing the resources given to a process according to its processing time. By adding the following elements, the method migrates a process to another computer with free resources before a deadline miss occurs, thereby preventing the miss. Namely, it comprises:
21) A system configuration information management table that holds the average processing time, memory consumption, and CPU time of each process, the time limit for one round of processing of each process, and a list of the computers constituting the system.
22) Processing time collection means that holds the history of variation in the processing time and CPU time of each process running on the system and detects the occurrence of a process deadline miss.
23) Processing time prediction means that learns the variation in process processing time from the history of each process and predicts the occurrence and time of future deadline misses.
24) Computer load collection means that holds the history of resource usage of all computers operating in the system, reports past computer resource usage to other means, and detects the occurrence of a resource shortage.
25) Memory usage prediction means that learns the future variation of memory usage from the variation in the memory usage of the computers, detects future memory shortages, and reports the memory usage at the predicted time of a process deadline miss or at the time of a CPU resource shortage.
26) CPU usage rate prediction means that learns the future variation of CPU usage from the variation in the CPU usage of the computers, detects future CPU resource shortages, and reports the CPU usage at the time of a memory shortage or at the predicted time of a deadline miss.
27) Migration process determination means that determines the processes whose resources should be reallocated when a failure such as a deadline miss, a CPU resource shortage, or a memory resource shortage occurs.
28) Resource allocation means that obtains from the system configuration information management table the amount of resources required by the process determined by the migration process determination means, and migrates the process to a computer that has the required free resources.

In other words, the computer resource dynamic control apparatus of the present embodiment comprises a computer time prediction unit that predicts the usage time of a processor based on variation in the processor usage of a computer and predicts the time at which a deadline miss will occur on the current processor, and
a memory usage prediction unit that predicts the required amount of memory based on variation in the memory usage of the computer.
The computer load amount collection unit includes a computer load amount collection and prediction unit that, in addition to collecting usage state information on computer resources, predicts the required amounts of resources and computing capacity.
The processing time collection unit includes a processing time collection and prediction unit that, in addition to collecting the processing time of processes, predicts the time at which a deadline miss will occur on the current computer.
On receiving notification of the predicted required amounts of resources and computing capacity, or of the predicted occurrence of a deadline miss, the migration process determination unit recommends reallocation of processes based on the prediction.
The resource allocation unit, based on the predicted deadline miss occurrence time and the predicted required amount of memory, allocates the recommended and extracted process to an extracted candidate computer and has it executed there.
Thus, in addition to what Embodiment 1 provides, the occurrence of a deadline miss is predicted by learning from, for example, variation in process processing time, and resources can be reallocated on the basis of that prediction, so that hard real-time processing as well as soft real-time processing can be achieved for the system. Furthermore, because the CPU load and memory usage at the time of a deadline miss are also predicted when allocating resources, processes can be allocated to a computer 100 on which deadline misses are less likely, and the recurrence of a deadline miss after reallocation can be prevented with high probability.

Embodiment 3.
This embodiment describes a configuration and operation in which, when a deadline miss is predicted, the process is migrated to another computer early enough to reliably prevent the deadline miss from occurring.
The configuration of the computer resource dynamic control system or apparatus in the present embodiment is the same as in the second embodiment. The functions and operations of the processing time monitoring unit 1, processing time collection/prediction unit 2b, computer load monitoring unit 4, computer load amount collection unit 5, processing time history table/memory 8, computer load history table/memory 9, computer state table/memory 10, and CPU time prediction unit 23 are also the same as in the second embodiment.
FIG. 40 shows an example of the data structure of the system configuration information management table/memory 7c in the present embodiment. In addition to the information held in the system configuration information management table/memory 7 of the second embodiment, the table/memory 7c holds a "reallocation time" as information about the resource dynamic control system and a "process startup time" as information about the software. The reallocation time is the time from when a notification such as a deadline miss, a predicted deadline miss, a predicted memory shortage, or a computer failure reaches the migration process determination unit 3c until the resource allocation unit 6c completes resource reallocation. This value may be the worst value observed in the past or, if the allocation completion time does not vary much, a past average. The process startup time is the time from when the dynamic control system starts the process 101 until the process begins processing. It includes, for example, the time to load the process into memory and any process-specific initialization.

FIG. 41 shows the operation flow of the deadline miss occurrence time prediction unit 21c in the present embodiment. In S31, steps S21 to S24 are executed. If S24 determines that a deadline miss will occur in the future, S32 acquires the current time and computes the time remaining until the predicted deadline miss. S33 then computes the sum of the time required for resource reallocation, the time required for process startup, and the processing cycle time of one execution of the process. S34 compares the time obtained in S32 with the time obtained in S33. If the S32 time is shorter, that is, if the time remaining until the deadline miss is shorter than the sum of the reallocation time, the process startup time, and the cycle time, S35 notifies the migration process determination unit 3c of the name of the process for which a deadline miss is predicted and the current time. If instead the S32 time is longer, that is, if resource reallocation started immediately after S34 is estimated to finish more than one cycle before the time at which the deadline miss would occur, resource reallocation is not performed and monitoring of the processing time of the process continues.
The processing of S31 to S35 may be performed in synchronization with the end of each execution of the process or asynchronously with it. In this example, the decision to start resource allocation depends on whether allocation would finish one cycle before the deadline miss occurs, but the margin need not be one cycle and may be set arbitrarily by the user.
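The comparison made in S32 to S34 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the parameter names, and the use of seconds as the unit are assumptions, and `margin_cycles` models the remark that the one-cycle margin may be chosen by the user.

```python
def should_start_reallocation(now: float,
                              predicted_miss_at: float,
                              realloc_time: float,
                              startup_time: float,
                              cycle_time: float,
                              margin_cycles: int = 1) -> bool:
    """Return True when the predictor should notify the migration
    process determination unit (hypothetical helper; times in seconds)."""
    # S32: time remaining until the predicted deadline miss
    remaining = predicted_miss_at - now
    # S33: reallocation time + process startup time + cycle margin
    lead_time = realloc_time + startup_time + margin_cycles * cycle_time
    # S34: notify only when the remaining time has dropped below the lead time
    return remaining < lead_time
```

With a reallocation time of 0.6 s, a startup time of 0.3 s, and a 0.2 s cycle, the call returns `True` once less than about 1.1 s remains before the predicted miss; until then the unit simply keeps monitoring.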

FIG. 42 shows the operation flow of the memory usage prediction unit 24c in the present embodiment.
In S301, steps S221 to S224 are executed. If S224 determines that a memory shortage will occur in the future, S302 acquires the current time and computes the time remaining until the memory shortage. S303 computes the sum of the time required for resource reallocation, the time required for process startup, and the processing cycle time of the process. S304 compares the time obtained in S302 with the time obtained in S303. If the S302 time is shorter, that is, if the time remaining until the memory shortage is shorter than that sum, S305 notifies the migration process determination unit 3c of the time at which the memory shortage will occur, the computer concerned, and the current time. If instead the S302 time is longer, that is, if resource reallocation performed immediately would finish more than one cycle before the memory shortage occurs, monitoring of memory usage continues. Here too, the decision to start resource allocation is based on whether allocation would finish one cycle before the deadline miss occurs, but the margin need not be one cycle and may be set arbitrarily by the user.

FIG. 43 shows the operation flow of the computer load prediction unit 22c in the present embodiment.
In S311, steps S201 to S204 are executed. If S204 determines that the CPU usage rate will reach or exceed a given upper limit, S312 acquires the current time and computes the time remaining until the CPU usage rate reaches the upper limit. S313 computes the sum of the time required for resource reallocation, the time required for process startup, and the processing cycle time of the process. S314 compares the time obtained in S312 with the time obtained in S313. If the S312 time, that is, the time remaining until the CPU usage rate reaches the upper limit, is shorter than the sum of the reallocation time, the process startup time, and the interval of the periodic processing, S315 notifies the migration process determination unit 3c of the time at which the CPU usage rate will reach the upper limit, the computer concerned, and the current time. If instead the S312 time is longer, that is, if resource reallocation performed now would finish at least one cycle before the CPU resource shortage occurs, nothing is done and monitoring of the CPU usage rate continues.
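The text does not say here how the remaining time until the upper limit is estimated; that belongs to the prediction steps of the second embodiment. Purely as an illustration, the sketch below assumes a straight-line trend between the oldest and newest usage samples. The function name, the sample format, and the linear model are all assumptions rather than the method of the specification.

```python
from typing import Optional, Sequence, Tuple


def time_until_limit(samples: Sequence[Tuple[float, float]],
                     limit: float) -> Optional[float]:
    """Estimate seconds from the newest sample until usage reaches `limit`.

    `samples` holds (timestamp_seconds, usage_percent) pairs, oldest first.
    Returns None when usage is not rising, so no crossing is predicted.
    """
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if u1 >= limit:
        return 0.0                     # already at or above the limit
    if t1 <= t0 or u1 <= u0:
        return None                    # flat or falling trend
    slope = (u1 - u0) / (t1 - t0)      # percent per second (assumed linear)
    return (limit - u1) / slope
```

The returned value plays the role of the S312 remaining time and would then be compared with the S313 sum.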

FIG. 44 shows the operation flow of the migration process determination unit 3c when a deadline miss prediction notification arrives from the deadline miss occurrence time prediction unit 21c in the present embodiment.
In S321, the unit receives from the deadline miss occurrence time prediction unit 21c the process 101 for which a deadline miss is predicted and the current time acquired in S32. In S322, it executes steps S232 to S234 of the second embodiment. In S323, it takes the processes found in S234 together with the process for which a deadline miss is predicted as resource reallocation candidates, and sends this candidate process group and the time received from the deadline miss occurrence time prediction unit 21c to the resource allocation unit 6c.
FIG. 45 shows the operation flow of the migration process determination unit 3c when a notification arrives from the memory usage prediction unit 24c or the computer load prediction unit 22c in the present embodiment.
In S331, a notification arrives from the memory usage prediction unit 24c or the computer load prediction unit 22c. It contains the computer on which the abnormality occurs and the current time acquired by the memory usage prediction unit 24c in S302 or by the computer load prediction unit 22c in S312. In S332, the unit refers to the system configuration information management table/memory 7c and obtains the processes running on the computer reported in S331. In S333, it treats the processes found in S332 as resource reallocation targets or candidates and notifies the resource allocation unit 6c of these processes and the time received in S331.
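The handling of a memory or CPU shortage notification (FIG. 45) amounts to a lookup followed by a forward. The sketch below is one hedged way to picture it; the `ShortageNotice` record, the `placement` mapping standing in for the system configuration information management table, and the function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ShortageNotice:
    """Hypothetical payload of the notification received in S331."""
    computer: str        # computer on which the shortage is predicted
    shortage_at: float   # predicted time of the shortage
    notified_at: float   # current time acquired by the predicting unit


def select_candidates(notice: ShortageNotice,
                      placement: Dict[str, str]) -> Tuple[List[str], float]:
    """Return the processes to hand to the resource allocation unit and
    the timestamp to forward with them (`placement`: process -> computer)."""
    # S332: processes running on the reported computer
    candidates = sorted(p for p, host in placement.items()
                        if host == notice.computer)
    # S333: forward the candidates together with the received time
    return candidates, notice.notified_at
```

Forwarding `notified_at` unchanged is what later lets the resource allocation unit measure how long the whole reallocation took.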

FIG. 46 shows the operation flow of the resource allocation unit 6c in the present embodiment.
In S341, the unit receives from the migration process determination unit 3c the list of processes that are resource allocation targets or candidates, together with the time acquired by the deadline miss occurrence time prediction unit 21c, the computer load prediction unit 22c, or the memory usage prediction unit 24c. In S342, steps S252 to S263 are executed. When S253 determines that the table of S252 is empty or that no further action is needed, processing proceeds to S343, which acquires the current time. S344 obtains the time taken for this resource allocation by subtracting the time received in S341 from the time obtained in S343. S345 reads the resource allocation time registered in the system configuration information management table/memory 7c. The time taken this round, obtained in S344, is then compared with the registered value. If this round took longer, processing proceeds to S347 and the resource allocation time in the system configuration information management table/memory 7c is overwritten with the value obtained in S344.
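Steps S343 to S347 keep a running worst case of the reallocation time. A minimal sketch, assuming the table is modelled as a dictionary with a hypothetical `"reallocation_time"` key and that times are in seconds:

```python
from typing import Dict


def update_reallocation_time(table: Dict[str, float],
                             notified_at: float,
                             finished_at: float) -> float:
    """Record the worst reallocation time seen so far and return it."""
    # S344: time this reallocation took, measured from the original notification
    elapsed = finished_at - notified_at
    # S345: value currently registered in the table
    registered = table["reallocation_time"]
    # S347: overwrite only when this round was slower than the registered value
    if elapsed > registered:
        table["reallocation_time"] = elapsed
    return table["reallocation_time"]
```

Because the stored value only ever grows, later predictions use an increasingly conservative lead time. A system that prefers the past average mentioned for FIG. 40 would store the samples instead.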

FIG. 47 outlines the operation flow of the computer resource dynamic control apparatus or system in the present embodiment when the components described above operate together.
In S351, the processing time collection/prediction unit 2b, the computer load amount collection unit 5, the deadline miss occurrence time prediction unit 21c, and the memory usage prediction unit 24c notify the migration process determination unit 3c of a deadline miss, a predicted deadline miss, a computer abnormality, a CPU or memory resource shortage, or a predicted CPU or memory resource shortage. A deadline miss is detected by the procedure of S11 to S16. A predicted deadline miss is detected by the procedure of S31 to S35. Computer abnormalities and CPU or memory resource shortages are detected by the procedure of S1011 to S1016. A CPU resource shortage is predicted by the procedure of S311 to S315, and a memory resource shortage by the procedure of S301 to S305.
In S352, the migration process determination unit 3c examines the content of the notification. If it reports a computer failure, then in S355 the resource allocation unit 6c allocates resources, by the procedure of S341 to S347, for all processes that were running on the failed computer.
If it is not a computer failure, then in S355 the unit identifies candidates for the process that caused the deadline miss or the resource shortage, and in S356 the resource allocation unit 6c performs resource allocation by the procedure of S341 to S347. After allocation finishes, in S357 the resource allocation unit 6c updates the system configuration information management table/memory 7c with the identifier of each newly allocated process, information about the destination computer on which it runs, and so on.

As described above, the system configuration information management table/memory 7c holds the worst-case time required for resource allocation, and resource allocation is started ahead of the predicted time of a deadline miss, memory shortage, or CPU resource shortage by the sum of the resource allocation time, the process startup time, and the processing cycle time. Resource reallocation therefore finishes at least one cycle before the deadline miss would occur, and hard real-time processing of the process can continue.

That is, the present embodiment shows a computer resource dynamic control method in which process migration is completed just before a deadline miss would occur, by adding elements with the following features to the computer resource dynamic control system of the preceding embodiment:
31) A system configuration information management table that, in addition to the contents of the table in the preceding embodiment, holds for each process the time from detection of a deadline miss, a predicted deadline miss, a memory shortage, a predicted memory shortage, or a computer failure until process migration completes, and the time from completion of process migration until the process begins processing.
32) Processing time prediction means that extends that of the preceding embodiment: it learns the variation in the processing time of each process from its history, predicts whether and when a deadline miss will occur, and notifies the migration process determination means of the predicted deadline miss ahead of that time by the time needed to complete process migration and have the process begin processing.
33) Memory usage prediction means that extends that of the preceding embodiment: it predicts future memory usage from the variation in the memory usage of each computer, reports the memory usage expected at the time of a deadline miss or CPU resource shortage, predicts when a memory shortage will occur, and notifies the migration process determination means of the predicted memory shortage ahead of that time by the time needed to complete process migration and have the process begin processing.
34) CPU usage rate prediction means that extends that of the preceding embodiment: it predicts future CPU usage from the variation in the CPU usage of each computer, reports the CPU usage expected at the time of a deadline miss or memory shortage, predicts when a CPU resource shortage will occur, and notifies the migration process determination means of the predicted CPU resource shortage ahead of that time by the time needed to complete process migration and have the process begin processing.

In other words, in the computer resource dynamic control apparatus of the present embodiment, in addition to the features of the apparatus of the preceding embodiment, the system configuration information table/memory stores a reallocation time required to reallocate a process to a candidate computer and a process startup time required to start the process on that candidate computer, and
the resource allocation unit assigns the extracted process to the extracted candidate computer and has it executed there ahead of time by at least the sum of the reallocation time and the process startup time.
Compared with the second embodiment, the behavior of application processing time, memory usage, and CPU usage rate can be observed until just before a deadline miss would occur, so resource allocation can be based on more accurate system behavior. As a result, if the behavior changes at the last moment and it turns out that no deadline miss, memory shortage, or CPU resource shortage will occur, the processing overhead that resource allocation adds is reduced compared with the second embodiment. In addition, because the worst-case resource allocation time is held in the system configuration information table and continually updated, and resource allocation is started ahead of time by that worst-case value, the allocation is more likely to complete before a deadline miss occurs.

Embodiment 4.
This embodiment describes a configuration and operation in which, when resources run short, processes of high importance are given priority.
The components of the computer resource dynamic control apparatus or system in the present embodiment are the same as in the first embodiment. That is, the configurations of the processing time monitoring unit 1, processing time collection/prediction unit 2b, migration process determination unit 3, computer load monitoring unit 4, computer load amount collection unit 5, processing time history table/memory 8, computer load history table/memory 9, and computer state table/memory 10 are the same as in the first embodiment.
FIG. 48 shows the configuration of the system configuration information management table/memory 7d in the present embodiment. In addition to the information held in the system configuration information management table/memory 7 of the first embodiment, the table/memory 7d holds the "importance" of each process. The importance indicates the order in which processes are stopped to free resources when the computer resources are no longer sufficient to run every process in the system. Processes are stopped in the order given by their importance, so a process of higher importance keeps running even when resources run short, while a process of lower importance is stopped when the computer resources of the system as a whole become insufficient.

FIG. 49 shows an example of the operation flow of the resource allocation unit 6d in the present embodiment.
In S401, the unit receives from the migration process determination unit 3 the list of processes 101 that are resource allocation candidates or targets. In S402, it sorts the received list in descending order of importance and holds it in a table. In S404, it takes the name of the most important process from the head of the S402 table. In S405, it builds a table of the names of the normally operating computers 100 in descending order of load. In S406, it takes the computer name at the head of the S405 table. In S407, it examines the CPU usage and related status of the computer 100 taken in S406 and checks whether the process 101 selected in S404 can run on that computer without a deadline miss. If so, in S409 it examines the free memory of the computer 100 taken in S406 and checks whether the process 101 selected in S404 can run there. If so, in S411 it decides on the computer 100 selected in S406 as the migration destination of the process 101 selected in S404. In S412, it starts that process 101 on the computer 100 decided in S411 and stops the instance that was running on the original computer. In S413, it updates the assigned computer and process, and the current processing time of the process, in the system configuration information management table.

If S408 determines that running the process there would cause a deadline miss, S415 selects the computer 100 in the next entry of the table created in S405. If there is such an entry, not all computers have been examined yet, so processing returns to S406 and the steps from S406 onward are performed for the computer selected in S415.
If S415 finds no next entry, all computers 100 have already been examined and the process cannot run on any of them, which means there are no free computer resources. In this case processing proceeds to S417, which checks whether there is a process of lower importance than the process 101 of S404. If there is, processing proceeds to S419, which stops the least important of the running processes. In S420, if the stopped process appears in the table created in S402, its entry is deleted. Stopping a process of low importance frees computer resources, so processing returns to S405, the computers are sorted by load again, and resources to which the process 101 can be allocated are searched for.
If S417 finds no process of lower importance than the one currently being examined, S418 stops that process itself, which ends the handling of that process. When process reallocation completes in S413 or the process is stopped in S418, S414 deletes the head entry, that is, the process name selected in S404, from the S402 table, and the steps from S404 onward are repeated for the process in the next entry of the table.
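The loop of S401 to S420 can be pictured with the following sketch. It is a simplified illustration under stated assumptions, not the implementation of the specification: the `Proc` and `Host` records, integer CPU percentages, a larger `importance` value meaning a more important process, and "load" being approximated by the smallest free CPU share are all choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Proc:
    name: str
    importance: int   # assumption: larger value = more important
    cpu: int          # CPU share needed, in percent
    mem: int          # memory needed, in MB


@dataclass
class Host:
    name: str
    cpu_free: int
    mem_free: int
    running: List[Proc] = field(default_factory=list)


def allocate(pending: List[Proc], hosts: List[Host]) -> Dict[str, Optional[str]]:
    """Map each handled process to its new host, or to None if it was stopped."""
    result: Dict[str, Optional[str]] = {}
    queue = sorted(pending, key=lambda p: p.importance, reverse=True)   # S402
    while queue:
        proc = queue.pop(0)                                             # S404, S414
        while True:
            # S405: computers ordered by load (least free CPU first, as a proxy)
            ordered = sorted(hosts, key=lambda h: h.cpu_free)
            # S406 to S409, S415: first computer with enough CPU and memory
            target = next((h for h in ordered
                           if h.cpu_free >= proc.cpu and h.mem_free >= proc.mem),
                          None)
            if target is not None:                                      # S411 to S413
                target.cpu_free -= proc.cpu
                target.mem_free -= proc.mem
                target.running.append(proc)
                result[proc.name] = target.name
                break
            # S417: is any running process less important than this one?
            victims = [(p, h) for h in hosts for p in h.running
                       if p.importance < proc.importance]
            if not victims:
                result[proc.name] = None                                # S418
                break
            victim, host = min(victims, key=lambda ph: ph[0].importance)  # S419
            host.running.remove(victim)
            host.cpu_free += victim.cpu
            host.mem_free += victim.mem
            result[victim.name] = None
            queue = [p for p in queue if p.name != victim.name]         # S420
    return result
```

Because the queue is processed in descending order of importance and only strictly less important processes are stopped, a process that has already been placed is never evicted by one handled later.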

FIG. 50 outlines the operation flow of the computer resource dynamic control apparatus or system in the present embodiment when all components operate together.
In S421, the processing time collection/prediction unit 2b or the computer load amount collection unit 5 notifies the migration process determination unit 3 of a deadline miss or a computer abnormality. The processing time collection/prediction unit 2b operates as shown in S11 to S16, and the computer load amount collection unit 5 as shown in S1011 to S1018. In S422, the migration process determination unit 3 investigates the cause of the deadline miss. In S423, it checks whether the cause is a computer failure. If so, in S424 the resource allocation unit 6 allocates resources, by the procedure of S401 to S420, for all processes 101 that were running on the failed computer 100. If no computer has free resources that can be allocated, processes 101 are stopped in ascending order of importance to free resources, and resource allocation is attempted again.
If it is not a computer failure, in S425 the migration process determination unit 3 identifies the process that caused the deadline miss, the process that caused the resource shortage, and the processes to be reallocated. In S426, the resource allocation unit 6 allocates resources, by the procedure of S401 to S420, for the process that caused the deadline miss or the process thought to have caused the resource shortage. If no computer 100 has free resources that can be allocated, processes are stopped in ascending order of importance to free resources, and resource allocation is attempted again. In S427, after resource allocation finishes, the resource allocation unit 6 updates the system configuration information management table/memory 7d with the identifier of each newly allocated process 101, the destination computer 100 on which it runs, the CPU time its processing requires, and related information.

即ち本実施の形態は、実施の形態１の計算機リソース動的制御システムに、以下の特徴を加えることにより、全てのプロセスが動作するだけのリソースがシステムに存在しない場合に、重要な処理を優先してリアルタイム処理を継続することを可能とする計算機リソース動的制御方式を示している。即ち、
４１）実施の形態１のシステム構成テーブルに加え、全てのプロセスを動作させるだけのシステムリソースが足りなくなった場合に優先動作させるプロセスの順位を保持するシステム構成情報管理テーブル。
４２）実施の形態１の資源割付手段において、移行プロセス決定手段が決定したプロセスが動作できるための空きリソースが無い場合に、システム構成情報管理テーブルが持つプロセスの動作順位をもとに、順位の低いプロセスを停止させることにより計算機に空きリソースを作成し、その計算機上にプロセスマイグレーションを行う資源割付手段、を備える。 That is, this embodiment gives priority to important processing when the system does not have enough resources to operate all processes by adding the following features to the computer resource dynamic control system of the first embodiment. This shows a computer resource dynamic control system that enables real-time processing to be continued. That is,
41) In addition to the system configuration table of the first embodiment, a system configuration information management table that holds the order of processes to be preferentially operated when there are not enough system resources to operate all processes.
42) A resource allocation means which, in the resource allocation means of the first embodiment, when there are no free resources on which the process determined by the migration process determination means can operate, stops low-ranked processes based on the process operation order held in the system configuration information management table, thereby creating free resources on a computer, and performs process migration onto that computer.

言い換えると、本実施の形態における計算機リソース動的制御装置は実施の形態１の装置に加えて、システム構成情報テーブル・メモリは、プロセスの処理優先順位を記憶し、
資源割当部は、現在の計算機ではデッドラインミスの発生が予測されると、上記システム構成情報テーブル・メモリを参照して、プロセスの処理優先順位が高いプロセスを候補計算機に割当て実行させることを特徴とする。
このように、システム内で独立した複数の処理が動作している場合に、計算機故障などにより全てのプロセス１０１を動作させるだけのシステムリソースがなくなってしまった場合においても、プロセスに重要度を与え、重要性の低い処理は停止させることによってシステムリソースに空きを作り出し、重要度のより高い処理を優先させて動作継続させることが可能になる。 In other words, the computer resource dynamic control device in the present embodiment, in addition to the device in the first embodiment, the system configuration information table memory stores the processing priority of the process,
The resource allocation unit is characterized in that, when a deadline miss is predicted to occur on the current computer, it refers to the system configuration information table memory and assigns processes with high processing priority to a candidate computer for execution.
As described above, when a plurality of independent processes are operating in the system and the system resources needed to operate all the processes 101 are lost, for example because of a computer failure, importance is assigned to the processes and less important processes are stopped to free system resources, so that the more important processes can be given priority and continue operating.
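
The eviction behaviour summarized above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `Proc` and `place_with_eviction`, the single integer resource figure, the convention that a larger `importance` value means a more important process, and the rule that a process is never stopped to make room for a less important one are all hypothetical. The description itself states only that processes are stopped in ascending order of importance until resources can be allocated.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    need: int        # resource the process requires (assumed: percent of one CPU)
    importance: int  # assumed convention: larger value = more important


def place_with_eviction(target, running, capacity):
    """Try to fit `target` on one computer, stopping less important processes.

    Returns (placed, stopped_names). `running` is the list of processes
    currently on the computer; it is not modified.
    """
    free = capacity - sum(p.need for p in running)
    stopped = []
    # Walk the running processes from least important upward.
    for victim in sorted(running, key=lambda p: p.importance):
        if free >= target.need:
            break
        # Assumption: never stop a process that matters as much as the target.
        if victim.importance >= target.importance:
            break
        free += victim.need
        stopped.append(victim.name)
    if free >= target.need:
        return True, stopped
    return False, []  # give up without stopping anything
```

With a logging process (need 30, importance 1) and a control process (need 50, importance 9) on a computer of capacity 100, placing a process that needs 40 with importance 5 stops only the logging process.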

実施の形態５．
本実施の形態においては、各計算機１００の性能にばらつきがある場合に、その性能差を考慮して割当を行える装置、システムを説明する。
本実施の形態における動的リソース制御装置またはシステムの構成要素は、実施の形態１と同一である。本実施の形態における、即ち処理時間監視部１、処理時間収集・予測部２ｂ、移行プロセス決定部３、計算機負荷監視部４、計算機負荷量収集部５、処理時間履歴テーブル・メモリ８ないし計算機状態テーブル・メモリ１０の構成は、実施の形態１と同一である。
本実施の形態におけるシステム構成情報管理テーブル・メモリ７ｅが持つ情報の例を図５１に示す。システム構成情報管理テーブル・メモリ７ｅには、実施の形態１におけるシステム構成情報管理テーブル・メモリ７に示す情報のほか、プロセス１０１としてのアプリケーションごとに「動作可能計算機名」、計算機１００ごとに「性能値」が示されている。性能値としては、例えばＳＰＥＣＩｎｔのような、計算機の性能値を示すベンチマークプログラムのデータが示されることが考えられる。 Embodiment 5.
In the present embodiment, a description will be given of an apparatus and a system that can perform allocation in consideration of the performance difference when the performance of each computer 100 varies.
The components of the dynamic resource control apparatus or system in the present embodiment are the same as those in the first embodiment. That is, the configurations of the processing time monitoring unit 1, the processing time collection / prediction unit 2b, the migration process determination unit 3, the computer load monitoring unit 4, the computer load amount collection unit 5, and the processing time history table memory 8 through the computer state table memory 10 are the same as in the first embodiment.
FIG. 51 shows an example of the information held in the system configuration information management table memory 7e in the present embodiment. In addition to the information shown for the system configuration information management table memory 7 in the first embodiment, the system configuration information management table memory 7e holds an "operable computer name" for each application running as a process 101 and a "performance value" for each computer 100. The performance value may be, for example, the result of a benchmark program that indicates computer performance, such as SPECint.
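
As an illustration of the kind of information described for the table memory 7e, the following Python sketch shows one possible in-memory layout. It does not reproduce FIG. 51: the dictionary layout, the key names, the process and computer names, and all numbers are hypothetical.

```python
# Hypothetical contents of the table; every name and number is illustrative.
SYSTEM_CONFIG = {
    "processes": {
        "tracker": {"running_on": "host1", "avg_cpu_time_ms": 34.0,
                    "operable_on": ["host1", "host2", "host3"]},
        "logger": {"running_on": "host2", "avg_cpu_time_ms": 5.0,
                   "operable_on": ["host2"]},
    },
    "computers": {
        "host1": {"performance": 100},
        "host2": {"performance": 85},
        "host3": {"performance": 200},
    },
}


def operable_computers(config, process_name):
    """Return (computer, performance value) pairs for one process."""
    names = config["processes"][process_name]["operable_on"]
    return [(n, config["computers"][n]["performance"]) for n in names]
```

Keeping the operable computer list per process and the performance value per computer mirrors the two additions the text describes, so a lookup for one process yields exactly the candidates the resource allocation unit would need to examine.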

本実施の形態における資源割当部６ｅの動作フローの例を図５２に示す。
図において、Ｓ５０１で、移行プロセス決定部３から、リソース割当の候補、リソース割当対象となるプロセスのリストが届くと、Ｓ５０２にて、プロセス１０１のリストを例えばＣＰＵ時間の多い順に並べ、テーブルに保持する。Ｓ５０２にて作成したテーブルが空であれば、処理を終了する。
空でなければＳ５０４において、Ｓ５０２にて作成したテーブルからプロセス１０１の名を１つ取り出す。そして、Ｓ５０４のプロセス１０１が動作可能で、かつ正常に動作している計算機１００を、例えば計算機負荷の大きい順に並べ替え、テーブルに保持する。各プロセスが動作可能な計算機の一覧は、システム構成情報管理テーブル・メモリ７ｅに示されているので、それを参照して計算機の識別子を取得する。そしてＳ５０７以下を、Ｓ５０５にて作成したテーブルにある全計算機について、または、リソース割当先となる計算機１００が決定するまで行う。Ｓ５０７にて、Ｓ５０５にて作成したテーブルから計算機を１つ選ぶ。そしてＳ５０８にて、Ｓ５０７の計算機１００の性能値と、現在動作中の計算機の性能値を、システム構成情報管理テーブル・メモリ７ｅを参照して取得する。そしてＳ５０９にて、Ｓ５０８にて求めた値より、Ｓ５０４にて選択したプロセス１０１の現在のＣＰＵ時間を補正する。例えば、現在動作中の計算機の性能値が１００、Ｓ５０７で選択した計算機の性能値が８５であった場合には、補正後のＣＰＵ時間は１００／８５倍になる。Ｓ５１０にて、この補正後の値をもとに、Ｓ５０４で選択したプロセスが、Ｓ５０７で選択した計算機上でデッドラインミスを発生させることなく動作できるかどうかを確認する。 FIG. 52 shows an example of the operation flow of the resource allocation unit 6e in the present embodiment.
In the figure, when a list of resource allocation candidates, that is, the processes to be subjected to resource allocation, arrives from the migration process determination unit 3 in S501, the list of processes 101 is sorted in S502, for example in descending order of CPU time, and held in a table. If the table created in S502 is empty, the processing ends.
If it is not empty, in S504 the name of one process 101 is taken from the table created in S502. Then the computers 100 on which the process 101 of S504 can operate and which are operating normally are sorted, for example in descending order of computer load, and held in a table (the table referred to below as the one created in S505). The list of computers on which each process can operate is held in the system configuration information management table memory 7e, so the computer identifiers are obtained by referring to it. S507 and the subsequent steps are then performed for all the computers in the table created in S505, or until the computer 100 to serve as the resource allocation destination is determined. In S507, one computer is selected from the table created in S505. In S508, the performance value of the computer 100 selected in S507 and the performance value of the computer on which the process is currently operating are obtained by referring to the system configuration information management table memory 7e. In S509, the current CPU time of the process 101 selected in S504 is corrected using the values obtained in S508. For example, when the performance value of the computer currently in use is 100 and the performance value of the computer selected in S507 is 85, the corrected CPU time is 100/85 times the current CPU time. In S510, based on this corrected value, it is checked whether the process selected in S504 can operate on the computer selected in S507 without causing a deadline miss.
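
The correction in S509 can be written as a one-line scaling. The Python sketch below assumes that CPU time is inversely proportional to the performance value, which is what the 100/85 example implies; the function name and the error handling are hypothetical.

```python
def corrected_cpu_time(cpu_time, perf_current, perf_candidate):
    """Scale a measured CPU time to a candidate computer (S509).

    Assumes CPU time is inversely proportional to the performance value,
    as implied by the 100/85 example in the text.
    """
    if perf_current <= 0 or perf_candidate <= 0:
        raise ValueError("performance values must be positive")
    return cpu_time * perf_current / perf_candidate
```

For instance, a process that needs 17 ms per cycle on a computer with performance value 100 would be estimated at 17 × 100/85 = 20 ms on a computer with performance value 85.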

動作可能であった場合にはＳ５１２に進み、Ｓ５０７で選択した計算機が、Ｓ５０４で選択した計算機が動作するだけの空きメモリを持っているかどうか確認する。十分な空きメモリ量があった場合にはＳ５１４にて、Ｓ５０７で選択した計算機をリソース割付先計算機として決定する。そしてＳ５１５にて、Ｓ５１４で決定した計算機１００上で、Ｓ５０４で選択したプロセス１０１を起動し、これまで動作していたプロセスを停止させる。そしてＳ５１６にて、システム構成管理テーブル・メモリ７ｅの、「動作中の計算機」の欄と、「平均ＣＰＵ時間」の欄を現在の値に書き換える。そしてＳ５１７にて、Ｓ５０４にて作成したテーブル中、リソース再割当を行ったプロセス名を削除し、Ｓ５０３に戻り、Ｓ５０２で作成したテーブルにある残りのプロセスについて、Ｓ５０３以降の処理を行う。
Ｓ５０６において、Ｓ５０５のテーブルに示される全計算機についてＳ５０７以下を実施した場合には、割付可能な計算機が見つからなかったということであるため、Ｓ５０４で選択したプロセスのリソース再割付を断念し、Ｓ５１８にてプロセス１０１を停止させ、Ｓ５０２で作成したテーブルから停止させたプロセス名を削除する。
Ｓ５１１、Ｓ５１３にて、Ｓ５０４で選択したプロセスが動作不可能であった場合には、Ｓ５０６に戻り、次のエントリの計算機についてＳ５０７以降の処理を行う。 If it can operate, the flow proceeds to S512, where it is checked whether the computer selected in S507 has enough free memory for the process selected in S504 to operate. If there is enough free memory, in S514 the computer selected in S507 is determined as the resource allocation destination computer. In S515, the process 101 selected in S504 is started on the computer 100 determined in S514, and the process that had been operating until then is stopped. In S516, the "computer in operation" column and the "average CPU time" column of the system configuration information management table memory 7e are rewritten to the current values. In S517, the name of the process for which resource reallocation was performed is deleted from the table created in S502, and the flow returns to S503, where the processing from S503 onward is performed for the remaining processes in the table created in S502.
If, in S506, S507 and the subsequent steps have been performed for all the computers in the table of S505, no computer to which the process can be allocated has been found. Resource reallocation for the process selected in S504 is therefore abandoned: in S518 the process 101 is stopped, and the name of the stopped process is deleted from the table created in S502.
If it is found in S511 or S513 that the process selected in S504 cannot operate, the flow returns to S506, and the processing from S507 onward is performed for the computer in the next entry.
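
The candidate search in S505 to S514 can be sketched in Python as follows. This is a simplified illustration, not the patented procedure itself. The class and function names, the units, the exclusion of the computer the process currently runs on, and in particular the deadline test in S510 (committed CPU time plus the corrected CPU time must fit within one period) are assumptions, since the text does not spell out how S510 decides.

```python
from dataclasses import dataclass, field


@dataclass
class Computer:
    name: str
    performance: float   # benchmark-style performance value
    cpu_busy: float      # CPU time already committed per period (assumed: ms)
    free_memory: int     # assumed: MB
    healthy: bool = True


@dataclass
class Process:
    name: str
    cpu_time: float      # current CPU time per period on `running_on`
    memory: int
    period: float        # assumption: the deadline equals the period
    running_on: str
    operable_on: list = field(default_factory=list)


def choose_destination(proc, computers):
    """Return the first computer passing the S510 and S512 checks, or None."""
    src = computers[proc.running_on]
    candidates = [computers[n] for n in proc.operable_on
                  if n != proc.running_on and computers[n].healthy]
    # S505: the text sorts by computer load, largest first.
    candidates.sort(key=lambda c: c.cpu_busy, reverse=True)
    for cand in candidates:                                          # S507
        scaled = proc.cpu_time * src.performance / cand.performance  # S508-S509
        if cand.cpu_busy + scaled > proc.period:   # S510 (assumed deadline test)
            continue
        if cand.free_memory < proc.memory:         # S512
            continue
        return cand.name                           # S514
    return None                                    # caller stops the process (S518)
```

Trying the most heavily loaded computers first tends to fill busy machines before lightly loaded ones, which may leave larger blocks of capacity free for later processes; the text gives this order only as an example.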

本実施の形態における、各構成要素が連携して行う動的リソース制御装置、またはシステム全体の動作フローの例を図５３に示す。
Ｓ５２１にて、処理時間収集・予測部２ｂまたは、計算機負荷量収集部５は、デッドラインミスの発生、または計算機の異常、ＣＰＵ、メモリリソース不足を検出して、移行プロセス決定部３に通知する。この場合の処理時間収集・予測部２ｂの動作はＳ１１〜Ｓ１６、計算機負荷量収集部５の動作はＳ１０１１〜Ｓ１０１８に示すとおりである。するとＳ５２２で、移行プロセス決定部３は、通知の内容を調査する。通知の内容が計算機の故障であればＳ５２４にて、故障した計算機上で動作していた全てのプロセス１０１に対して、資源割当部６ｅはＳ５０１〜Ｓ５１８の手順によりリソースの割当を行う。計算機の故障でなければＳ５２５で、移行プロセス決定部３により、デッドラインミス発生の原因と考えられるプロセス、リソース不足発生の原因と考えられるプロセスを見つけ出す。そしてＳ１４６にて、Ｓ５２５で見つけたプロセスに対して、資源割当部６ｅがＳ５０１〜Ｓ５１８の手順でリソース割当を行う。リソース割当終了後Ｓ５２７にて、資源割当部６ｅは、システム構成情報管理テーブル・メモリ７ｅの、新規にリソース割当を行ったプロセス１０１の識別子と、そのプロセスが動作する割当先の計算機１００、処理に要するＣＰＵ時間などに関する情報を書き換える。 FIG. 53 shows an example of the operation flow of the dynamic resource control apparatus or the entire system performed in cooperation with each component in this embodiment.
In S521, the processing time collection / prediction unit 2b or the computer load amount collection unit 5 detects the occurrence of a deadline miss, a computer abnormality, or a shortage of CPU or memory resources, and notifies the migration process determination unit 3. The operation of the processing time collection / prediction unit 2b in this case is as shown in S11 to S16, and the operation of the computer load amount collection unit 5 is as shown in S1011 to S1018. In S522, the migration process determination unit 3 examines the content of the notification. If the notification reports a computer failure, in S524 the resource allocation unit 6e allocates resources to all the processes 101 that were operating on the failed computer, following the procedure of S501 to S518. If it is not a computer failure, in S525 the migration process determination unit 3 finds the process considered to be the cause of the deadline miss and the process considered to be the cause of the resource shortage. In S526, the resource allocation unit 6e allocates resources to the processes found in S525, following the procedure of S501 to S518. After the resource allocation is completed, in S527 the resource allocation unit 6e rewrites, in the system configuration information management table memory 7e, the identifier of the process 101 to which resources were newly allocated, the allocation destination computer 100 on which the process operates, and information such as the CPU time required for processing.
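
The branch taken after a notification arrives can be sketched as a small dispatch function. In this Python illustration the notification format, the `placement` mapping, and the `find_culprits` callback (standing in for the analysis done by the migration process determination unit 3) are hypothetical.

```python
def processes_to_reallocate(notification, placement, find_culprits):
    """Pick the processes handed to the resource allocation unit.

    notification: dict with a "kind" key and, for failures, a "computer" key.
    placement: mapping of process name -> computer it currently runs on.
    find_culprits: callable returning the processes blamed for a deadline
    miss or resource shortage (a stand-in for unit 3's analysis).
    """
    if notification["kind"] == "computer_failure":
        failed = notification["computer"]
        # Every process that was running on the failed computer is reallocated.
        return sorted(p for p, host in placement.items() if host == failed)
    # Otherwise only the processes identified as the cause are reallocated.
    return find_culprits(notification)
```

In both branches the returned list would then be passed to the allocation procedure of S501 to S518.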

即ち本実施の形態は、実施の形態１の計算機リソース動的制御システムに、以下の特徴を加えることにより、システムを構成する計算機の性能が不均一な場合においてもデッドラインミスを発生させない計算機へのプロセスマイグレーションを可能とすることを特徴とする計算機リソース動的制御方式を示している。即ち、
５１）実施の形態１のシステム構成情報管理テーブルに加え、各プロセスが動作可能な計算機の一覧と、各計算機の性能を示す値を保持するシステム構成情報管理テーブル。
５２）実施の形態１の資源割付手段において、移行プロセス決定手段が決定したプロセスが動作可能な計算機をプロセスマイグレーション先となる計算機候補とし、システム構成情報管理テーブルが持つ計算機の性能値を用いて、この計算機候補にプロセスを移行した場合にプロセスが必要とする計算機リソースを算出することにより、十分な空きリソースを持つ計算機にプロセスマイグレーションを行う資源割付手段、を備える。 In other words, by adding the following features to the computer resource dynamic control system of the first embodiment, the present embodiment provides a computer resource dynamic control method that enables process migration to a computer on which no deadline miss occurs, even when the performance of the computers constituting the system is not uniform. That is,
51) A system configuration information management table that holds a list of computers that can operate each process and a value indicating the performance of each computer in addition to the system configuration information management table of the first embodiment.
52) In the resource allocating unit of the first embodiment, a computer capable of operating the process determined by the migration process determining unit is set as a computer candidate as a process migration destination, and the performance value of the computer included in the system configuration information management table is used. Resource allocation means is provided for performing process migration on a computer having sufficient free resources by calculating computer resources required by the process when the process is transferred to the computer candidate.

言い換えれば、本実施の形態における計算機リソース動的制御装置は、システム構成情報テーブル・メモリは、対象プロセスを処理可能な動作可能計算機の識別子と計算機の性能値とを記憶し、
資源割当部は、再割当の勧告通知を受けると、上記システム構成情報テーブル・メモリを参照して、上記動作可能計算機の識別子と計算機の性能値とを調べてプロセスを候補計算機に割当て実行させることを特徴とする。
このように、システム構成情報管理テーブル・メモリ７ｅが、各プロセス１０１が動作可能な計算機名、および、計算機の性能値を示すことにより、システム内の計算機のＯＳやＣＰＵアーキテクチャ、性能値がそれぞれ異なる場合においても、動作可能な計算機を正しく判別し、ＣＰＵ時間などを正しく補正し、デッドラインミスが発生しない計算機への割付が可能となる。 In other words, in the computer resource dynamic control device according to the present embodiment, the system configuration information table memory stores an identifier of an operable computer capable of processing the target process and a performance value of the computer,
The resource allocation unit is characterized in that, on receiving a reallocation recommendation notification, it refers to the system configuration information table memory, checks the identifiers of the operable computers and the performance values of the computers, and assigns the process to a candidate computer for execution.
As described above, because the system configuration information management table memory 7e holds the names of the computers on which each process 101 can operate and the performance value of each computer, even when the OS, CPU architecture, and performance values of the computers in the system differ from one another, it is possible to correctly determine the computers on which a process can operate, correctly adjust values such as the CPU time, and allocate the process to a computer on which no deadline miss occurs.

実施の形態６．
デッドラインミス発生予測時間が迫っていても、短時間の内に該当するプロセスを処理する計算機の移行を行って、デッドラインミスの発生を防止する構成と動作を説明する。具体的には、そうしたプロセスを移行すべき計算機を想定して起動をかけておき、待機させる。
本実施の形態におけるリソース動的制御システムの構成を図５４に示す。また一つの装置としてまとめた場合の計算機リソース動的制御装置１００ｃの構成を図５５に示す。本実施の形態における動的リソース制御システムまたは装置における構成要素は、実施の形態３の構成に加え、レプリカ管理部２５をもつ。
その他の本実施の形態における処理時間監視部１、処理時間収集部２（または処理時間収集・予測部２ｂ）、移行プロセス決定部３、計算機負荷監視部４、計算機負荷量収集部５（または計算機負荷量収集・予測部５ｂ）、資源割当部６、システム構成情報管理テーブル・メモリ７、処理時間履歴テーブル・メモリ８ないし計算機状態テーブル・メモリ１０、デッドラインミス発生時刻予測部２１、計算機負荷予測部２２、ＣＰＵ時間予測部２３、メモリ使用量予測部２４の構成は実施の形態３と同一である。 Embodiment 6.
This embodiment describes a configuration and operation that prevent a deadline miss, even when the predicted time of the deadline miss is close, by migrating the affected process to another computer within a short time. Specifically, the process is started in advance on a computer to which it is expected to be migrated and is kept on standby there.
FIG. 54 shows the configuration of the resource dynamic control system in this embodiment. FIG. 55 shows the configuration of the computer resource dynamic control device 100c when it is implemented as a single device. In addition to the configuration of the third embodiment, the dynamic resource control system or device of the present embodiment has a replica management unit 25.
The other components of this embodiment, namely the processing time monitoring unit 1, the processing time collection unit 2 (or the processing time collection / prediction unit 2b), the migration process determination unit 3, the computer load monitoring unit 4, the computer load amount collection unit 5 (or the computer load amount collection / prediction unit 5b), the resource allocation unit 6, the system configuration information management table memory 7, the processing time history table memory 8 through the computer state table memory 10, the deadline miss occurrence time prediction unit 21, the computer load prediction unit 22, the CPU time prediction unit 23, and the memory usage prediction unit 24, have the same configuration as in the third embodiment.

ここで、プロセスのレプリカとは、移行した方が良いと予想されるプロセスを新たに処理する計算機上で起動をかけて待機している、移行を前提としてコピーしたプロセスのことである。従って既に起動はしているが、移行が決まって処理すべきデータは未だ届かないなどの理由により処理待ちになっているプロセスである。また、レプリカは、メモリ資源は使用するものの、ＣＰＵリソースは使用しない、つまりレプリカのＣＰＵ時間は０であるとする。
本実施の形態におけるレプリカ管理部２５は、計算機負荷量収集部５が持つデータを参照し、十分に空きメモリがある計算機１００上でレプリカを起動する。そしてデッドラインミスなどが発生すると予測される場合において、資源割当部６がプロセスに資源再割当を行う際に、どの計算機１００でプロセスのレプリカが動作しているのかを資源割当部６に通知する役割を持つ。また、レプリカが動作する計算機上でメモリ不足発生が予測される場合には、プロセスのレプリカを停止させる役割を持つ。 Here, a replica of a process is a copy made in anticipation of migration: it is started, and kept waiting, on a computer that would newly run a process that is expected to benefit from migration. It is therefore a process that has already been started but is waiting to do work, for example because the data it should process once the migration is decided has not yet arrived. A replica uses memory resources but not CPU resources; that is, the CPU time of a replica is assumed to be zero.
The replica management unit 25 in the present embodiment refers to the data held by the computer load amount collection unit 5 and starts replicas on computers 100 that have sufficient free memory. When a deadline miss or the like is predicted and the resource allocation unit 6 reallocates resources to a process, the replica management unit 25 notifies the resource allocation unit 6 of the computers 100 on which replicas of that process are operating. It also stops replicas of processes when a memory shortage is predicted on the computer on which they operate.

本実施の形態におけるレプリカ管理部２５の動作フローの例を図５６、図５７に示す。図５６は、システム起動時などの際にレプリカを起動する場合の手順である。Ｓ６１にて、レプリカを動作させるプロセスを１つ選択する。プロセスを選択する基準としては、例えば、メモリ使用量の多い順などが考えられる。一般に、メモリ使用量の多いほうが、プロセスの起動に時間を要するため、プロセスの再割当が完了するまでに多くの時間がかかると考えられる。その他の方法としては、処理周期が短く、リソース割当に要する時間を短縮する必要があるものを選ぶ、という方法が考えられる。Ｓ６２にて、計算機負荷量収集部５より、全ての計算機１００のメモリ使用量を取得する。そしてＳ６３にて、プロセスが動作可能なだけの空きメモリ量を持つ計算機上でプロセスのレプリカを起動する。Ｓ６３にて、起動するレプリカの個数に関しては、全ての計算機上でレプリカを起動してもよいし、例えば空きメモリ量の大きい順に幾つか選択してレプリカを起動してもよいし、１つの計算機上だけで起動してもよい。
図５７は、システム運用中に、ある計算機１００のメモリが不足した場合のレプリカ管理部２５による動作を示した図である。メモリ不足発生時のメモリ使用量予測部２４の動作は別途説明する。Ｓ６０１にて、メモリ使用量予測部２４より、ある計算機のレプリカ停止要求が届くと、Ｓ６０２にて、指定された計算機１００上で動作するレプリカを停止させる。ここで、停止させるレプリカは、全て停止させてもよいし、メモリ使用量の多い順に停止させてもよいし、メモリ使用量の少ない順に停止させてもよい。そしてＳ６０３で、レプリカを停止させた計算機以外の計算機のメモリ使用量を計算機負荷量収集部５より取得し、Ｓ６０４で、停止させたレプリカが動作可能だけのメモリ空き容量を持つ計算機１００で、レプリカを再起動する。なおＳ６０４に関しては、Ｓ６０２で停止させた全レプリカについて実行する。 FIGS. 56 and 57 show examples of the operation flow of the replica management unit 25 in this embodiment. FIG. 56 shows the procedure for starting replicas, for example at system startup. In S61, one process for which a replica is to be run is selected. One possible selection criterion is descending order of memory usage: in general, a process that uses more memory takes longer to start, so its reallocation is expected to take longer to complete. Another possibility is to select processes that have a short processing cycle and therefore need the time required for resource allocation to be shortened. In S62, the memory usage of all the computers 100 is obtained from the computer load amount collection unit 5. In S63, a replica of the process is started on a computer that has enough free memory for the process to operate. As for the number of replicas started in S63, replicas may be started on all the computers, on several computers chosen in descending order of free memory, or on just one computer.
FIG. 57 shows the operation of the replica management unit 25 when a certain computer 100 runs short of memory during system operation. The operation of the memory usage prediction unit 24 when a memory shortage occurs is described separately. When a replica stop request for a certain computer arrives from the memory usage prediction unit 24 in S601, the replicas operating on the specified computer 100 are stopped in S602. Here, the replicas may all be stopped, or they may be stopped in descending or in ascending order of memory usage. In S603, the memory usage of the computers other than the one on which the replicas were stopped is obtained from the computer load amount collection unit 5, and in S604 each stopped replica is restarted on a computer 100 that has enough free memory for it to operate. S604 is executed for all the replicas stopped in S602.
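
The two flows of the replica management unit 25 (starting replicas, then stopping and restarting them elsewhere) can be sketched in Python as follows. The class name, method names, memory units, and the choice to stop all replicas on the affected computer are assumptions made for illustration; the text allows several stop orders and replica counts.

```python
class ReplicaManager:
    """Tracks which computers host an idle replica of each process."""

    def __init__(self, free_memory):
        self.free_memory = dict(free_memory)  # computer -> free memory (assumed: MB)
        self.replicas = {}                    # process -> {computer: memory held}

    def start_replicas(self, process, memory, skip=(), limit=None):
        """S61-S63: start replicas on computers with enough free memory."""
        hosts = [c for c, free in self.free_memory.items()
                 if free >= memory and c not in skip
                 and c not in self.replicas.get(process, {})]
        # One of the options in the text: prefer the largest free memory.
        hosts.sort(key=lambda c: self.free_memory[c], reverse=True)
        started = hosts[:limit]               # limit=None starts on all of them
        for c in started:
            self.free_memory[c] -= memory
            self.replicas.setdefault(process, {})[c] = memory
        return started

    def hosts_of(self, process):
        """Computers on which a replica of `process` is waiting."""
        return sorted(self.replicas.get(process, {}))

    def stop_and_restart(self, computer):
        """S601-S604: stop every replica on `computer`, restart each elsewhere."""
        moved = {}
        for process, placed in list(self.replicas.items()):
            if computer in placed:
                memory = placed.pop(computer)
                self.free_memory[computer] += memory
                new = self.start_replicas(process, memory,
                                          skip=(computer,), limit=1)
                moved[process] = new[0] if new else None
        return moved
```

`hosts_of` corresponds to the query the resource allocation unit makes when it needs to know where replicas of a process are waiting.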

本実施の形態におけるメモリ使用量予測部２４の動作フローを図５８に示す。Ｓ６１１にて、Ｓ２２１からＳ２２４までを実行する。そして、メモリ不足が予測された場合、メモリ不足が発生しそうな計算機１００上でプロセスのレプリカが動作するかどうかを調べ、動作しているのであればＳ６１３に進み、レプリカ管理部２５にレプリカ停止を依頼し、レプリカを停止させる。そしてＳ６１４にて、メモリ不足が発生する時刻を再度求める。この場合のメモリ不足が発生する時刻を求める式は、Ｓ２２２で求めた式から停止させたレプリカが使用しているメモリ使用量だけ引くことによって求められる。例えば、Ｓ２２２で求めた式がｙ＝ａｘ＋ｂ（ｙ：処理時間、ｘ：時刻）であり、停止させたレプリカが使用していたメモリ使用量の総和がｃであった場合には、式はｙ＝ａｘ＋ｂ−ｃとなる。この式のｙに計算機の搭載メモリ量を与えることにより、メモリ不足が発生する時刻が求められる。Ｓ６１５にて、Ｓ２２３またはＳ６１４で求めたメモリ不足が発生する時刻と、現在時刻から、メモリ不足が発生するまでの残り時間を求める。そして、Ｓ６１６で、その計算機で動作するプロセスのレプリカが他の計算機で動作しているかどうかを調べる。レプリカが動作しているかどうかは、レプリカ管理部２５に問い合わせることで判る。レプリカが他の計算機上で動作していれば、Ｓ６１７にて、リソース再割付所要時間と１回の周期処理時間の和を求める。レプリカが動作していなければ、レプリカ再割付所要時間とプロセス起動所要時間と１回の周期処理時間の和を求める。そして、Ｓ６１７またはＳ６１８で求めた時間と、Ｓ６１５で求めた時間を比較し、Ｓ６１５で求めた時間のほうが短ければＳ６２０にて、デッドラインミス発生時刻予測部２１に、デッドラインミスが発生するプロセスと現在時刻を通知する。 FIG. 58 shows the operation flow of the memory usage prediction unit 24 in the present embodiment. In S611, S221 to S224 are executed. If a memory shortage is predicted, it is checked whether a replica of a process is operating on the computer 100 on which the shortage is likely to occur; if so, the flow proceeds to S613, where the replica management unit 25 is asked to stop the replica and the replica is stopped. In S614, the time at which the memory shortage will occur is calculated again. The formula used in this case is obtained by subtracting the memory used by the stopped replicas from the formula obtained in S222. For example, if the formula obtained in S222 is y = ax + b (y: processing time, x: time) and the total memory that was used by the stopped replicas is c, the formula becomes y = ax + b - c. Substituting the amount of memory installed in the computer for y gives the time at which the memory shortage will occur. In S615, the time remaining until the memory shortage occurs is obtained from the shortage time obtained in S223 or S614 and the current time.
In S616, it is checked whether replicas of the processes operating on that computer are operating on other computers. This can be determined by querying the replica management unit 25. If a replica is operating on another computer, in S617 the sum of the time required for resource reallocation and the time for one cycle of processing is calculated. If no replica is operating, the sum of the time required for replica reallocation, the time required to start the process, and the time for one cycle of processing is calculated. The time obtained in S617 or S618 is then compared with the time obtained in S615, and if the time obtained in S615 is shorter, in S620 the deadline miss occurrence time prediction unit 21 is notified of the process in which a deadline miss will occur and of the current time.
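
The recalculation in S614 and the comparison in S615 to S620 can be sketched in Python as follows. The sketch treats y as memory usage, because the text substitutes the installed memory for y; the function names, parameter names, and the handling of a non-growing trend are assumptions.

```python
def shortage_time(a, b, installed_memory, freed=0.0):
    """Solve installed_memory = a*x + b - freed for x (S223 / S614).

    a, b: slope and intercept of the fitted line y = a*x + b from S222.
    freed: total memory released by stopping replicas (c in the text).
    """
    if a <= 0:
        return float("inf")  # usage is not growing, so no shortage is predicted
    return (installed_memory - b + freed) / a


def must_notify(now, a, b, installed_memory, freed,
                realloc_time, cycle_time, start_time, replica_elsewhere):
    """Return True when the remaining time is too short to migrate (S615-S620)."""
    remaining = shortage_time(a, b, installed_memory, freed) - now  # S615
    needed = realloc_time + cycle_time                              # S617
    if not replica_elsewhere:
        needed += start_time                                        # S618
    return remaining < needed                                       # S619-S620
```

With a = 2, b = 100, installed memory 1000, and c = 100, the shortage time is (1000 - 100 + 100) / 2 = 500. At time 480 the remaining 20 units exceed the 15 needed when a replica is already waiting elsewhere, but not the 25 needed when the process must first be started.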

本実施の形態における資源割当部６ｆの動作フローの例を図５９に示す。Ｓ６２１にて、移行プロセス決定部３から、リソース再割当の候補、リソース再割当対象となるプロセス１０１のリストが届くと、ＣＰＵ時間の長い順に並べ替え、テーブルに保持する。そして、Ｓ６２４以降の操作をテーブルが空になるまで実行する。
Ｓ６２４にて、Ｓ６２２のテーブルからプロセス名を１つ取り出し、Ｓ６２５にて、Ｓ６２４のプロセスのレプリカが動作する計算機１００の識別子をレプリカ管理部２５より取得し、ＣＰＵ使用率の高い順に並べ、テーブルに保持する。そして、Ｓ６２５で作成したテーブルに示される全計算機についてＳ６２７以降を実行する。まずＳ６２７にて、Ｓ６２５にて作成したテーブルから計算機名を取り出す。そして、Ｓ６２４で取り出したプロセスがＳ６２７で取り出した計算機上で動作できるかどうかを確認する。動作可能であればＳ６３１にて、ステップＳ６２７で選んだ計算機を割付先として決定し、Ｓ６３２にて、Ｓ６３１で決定した計算機上でプロセスを起動し、元の計算機で動作していたプロセスを停止させる。そしてＳ６３３にて、システム構成情報管理テーブル・メモリ７の、プロセスと計算機、現在の処理時間の情報を書き換える。そしてＳ６３４にて、Ｓ６２４で取り出したプロセスのレプリカを起動するよう要求する。そしてＳ６３６にて、Ｓ６２４で取り出したプロセスの名前をテーブルから削除し、テーブルから他のプロセスを選択して、Ｓ６２３以降を行う。またＳ６２９にて、Ｓ６２７で選んだ計算機に十分な空きＣＰＵがなかった場合には、Ｓ６２７に戻り、他の計算機についてＳ６２７〜Ｓ６２９を実行する。Ｓ６２５で作成したテーブル内の全ての計算機についてＳ６２７〜Ｓ６２９を実行し、かつ割当先が決定しなかった場合には、リソース再割当を断念し、Ｓ６３にて自プロセスを停止させ、Ｓ６３６にてＳ６２４で選択したプロセスをテーブルより削除し、他のプロセスについてＳ６２３以降を実施する。なお、レプリカが動作していない場合には、実施の形態３と同一の手順で動作を行うことになる。 FIG. 59 shows an example of the operation flow of the resource allocation unit 6f in the present embodiment. In S621, when the migration process determination unit 3 receives a resource reassignment candidate and a list of processes 101 to be reassigned, the process is rearranged in the descending order of the CPU time and stored in the table. Then, the operations after S624 are executed until the table becomes empty.
In S624, one process name is taken from the table of S622. In S625, the identifiers of the computers 100 on which replicas of the process of S624 are operating are obtained from the replica management unit 25, sorted in descending order of CPU utilization, and held in a table. S627 and the subsequent steps are then executed for all the computers in the table created in S625. First, in S627, a computer name is taken from the table created in S625. It is then checked whether the process taken in S624 can operate on the computer taken in S627. If it can, in S631 the computer selected in S627 is determined as the allocation destination, and in S632 the process is started on the computer determined in S631 and the process that was operating on the original computer is stopped. In S633, the information on the process, the computer, and the current processing time in the system configuration information management table memory 7 is rewritten. In S634, a request is made to start a replica of the process taken in S624. In S636, the name of the process taken in S624 is deleted from the table, another process is selected from the table, and S623 and the subsequent steps are performed. If it is found in S629 that the computer selected in S627 does not have enough free CPU, the flow returns to S627, and S627 to S629 are executed for another computer. If S627 to S629 have been executed for all the computers in the table created in S625 and no allocation destination has been determined, resource reallocation is abandoned: the process itself is stopped in S635, the process selected in S624 is deleted from the table in S636, and S623 and the subsequent steps are performed for the other processes. When no replica is operating, the operation follows the same procedure as in the third embodiment.
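
The selection among replica hosts in S625 to S631 can be sketched in Python as follows. The function name and the use of percentages for CPU utilization and for the CPU share the process needs are assumptions; the text says only that candidates are ordered by CPU utilization, highest first, and that a computer with enough free CPU is chosen.

```python
def choose_replica_host(replica_hosts, cpu_usage, required_cpu):
    """Return the first replica host with enough spare CPU, or None (S625-S631).

    cpu_usage maps computer name -> utilization in percent; required_cpu is
    the share the process needs, also in percent (both units are assumptions).
    """
    # S625: candidates are ordered by CPU utilization, highest first.
    for host in sorted(replica_hosts, key=lambda h: cpu_usage[h], reverse=True):
        if 100 - cpu_usage[host] >= required_cpu:  # S629
            return host                            # S631
    return None  # no candidate: the caller abandons reallocation
```

Because only computers that already hold a replica are considered, the chosen destination can begin processing without waiting for the process to start, which is the point of this embodiment.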

本実施の形態における動的リソース制御システムまたは装置の全体の動作フローを図６０に示す。
Ｓ６５１にて、処理時間収集部２、計算機負荷量収集部５、デッドラインミス発生時刻予測部２１、メモリ使用量予測部２４が、デッドラインミスの発生、デッドラインミスの発生予測、計算機の異常、ＣＰＵ・メモリリソース不足、ＣＰＵ・メモリリソース不足の予測を移行プロセス決定部３に通知する。デッドラインミス発生の検出はＳ１１〜Ｓ１６の手順で行われ、デッドラインミスの発生予測の検出はＳ３１〜Ｓ３５の手順で行われる。計算機の異常検出、ＣＰＵ・メモリリソース不足の検出はＳ１０１１〜Ｓ１０１６の手順で行われ、ＣＰＵリソース不足の予測はＳ３１〜Ｓ３５の手順で行われる。メモリリソース不足の予測は、Ｓ３０１〜Ｓ３０５の手順で行われる。するとＳ６５２で、移行プロセス決定部３は、通知の内容を調査する。通知の内容が計算機の故障であればＳ６５５にて、故障した計算機上で動作していた全てのプロセスに対して、資源割当部６ｆはＳ３４１〜Ｓ３４７の手順でリソースの割当を行う。
計算機の故障でなければＳ６５５にて、デッドラインミス発生の原因と考えられるプロセス、リソース不足を発生させた原因と考えられるプロセスに対して、資源割当部６ｆがＳ６２１〜Ｓ６３４の手順でリソース割当を行う。Ｓ６５６にて、リソース割当終了後、資源割当部６ｆは、システム構成情報管理テーブル・メモリ７の、新規にリソース割当を行ったプロセスの識別子と、そのプロセスが動作する割当先計算機に関する情報などを書き換える。 FIG. 60 shows an overall operation flow of the dynamic resource control system or apparatus in the present embodiment.
In S651, the processing time collection unit 2, the computer load amount collection unit 5, the deadline miss occurrence time prediction unit 21, and the memory usage prediction unit 24 notify the migration process determination unit 3 of the occurrence of a deadline miss, a predicted deadline miss, a computer abnormality, a shortage of CPU or memory resources, or a predicted shortage of CPU or memory resources. The occurrence of a deadline miss is detected by the procedure of S11 to S16, and a predicted deadline miss by the procedure of S31 to S35. Computer abnormalities and shortages of CPU or memory resources are detected by the procedure of S1011 to S1016, and a CPU resource shortage is predicted by the procedure of S31 to S35. A memory resource shortage is predicted by the procedure of S301 to S305. In S652, the migration process determination unit 3 examines the content of the notification. If the notification reports a computer failure, in S655 the resource allocation unit 6f allocates resources to all the processes that were operating on the failed computer, following the procedure of S341 to S347.
If it is not a computer failure, in S655 the resource allocation unit 6f allocates resources, following the procedure of S621 to S634, to the process considered to be the cause of the deadline miss and the process considered to be the cause of the resource shortage. In S656, after the resource allocation is completed, the resource allocation unit 6f rewrites, in the system configuration information management table memory 7, the identifier of the process to which resources were newly allocated and information such as the allocation destination computer on which the process operates.

即ち本実施の形態の計算機リソース動的制御システムは、実施の形態３の計算機リソース動的制御システム式に、以下の特徴を加えることにより、デッドラインミス発生までの残り時間が短い場合においてもデッドラインミスの発生を防止させることを特徴とする計算機リソース動的制御方式を示す。即ち、
６１）計算機負荷収集手段が持つデータを参照し、十分に空きリソース量がある計算機にプロセスのコピーをあらかじめ起動しておき、プロセスマイグレーションを行うプロセスのコピーが動作する計算機を資源割付手段に通知するレプリカ管理手段。
６２）実施の形態３の資源割付手段において、移行プロセス決定手段が決定したプロセスのコピーが動作している計算機をプロセスマイグレーション先となる計算機候補とし、計算機候補の中で十分な空きＣＰＵリソースを持つ計算機にプロセスマイグレーションを行う資源割付手段、を備える。 In other words, by adding the following features to the computer resource dynamic control system of the third embodiment, the computer resource dynamic control system of the present embodiment provides a computer resource dynamic control method that prevents deadline misses even when the time remaining until a deadline miss occurs is short. That is,
61) A replica management means that refers to the data held by the computer load collection means, starts copies of processes in advance on computers that have a sufficient amount of free resources, and notifies the resource allocation means of the computers on which a copy of the process to be migrated is operating.
62) In the resource allocation unit of the third embodiment, a computer on which a copy of the process determined by the migration process determination unit is operating is set as a computer candidate to be a process migration destination, and has sufficient free CPU resources among the computer candidates. Resource allocation means for performing process migration on the computer is provided.

言い換えれば、本実施の形態における計算機リソース動的制御装置は実施の形態３の装置に加えて、選択した計算機に所定のプロセスの処理起動をかけて、及びプロセスの処理終了をさせて、これらの状態を管理するレプリカ管理部を備えて、
資源割当部は、再割当の勧告通知を受けると、上記選択した計算機に上記レプリカ管理部で起動をかけていて動作可能であれば、該動作可能な計算機に上記プロセスの割当を実行させることを特徴とする。
このように、レプリカ管理部２５によって各プロセスに対して予めそのプロセスのレプリカを起動し、待機させておくことにより、リソース再割当を開始してから実際にプロセスの処理を開始するまでの所要時間を短縮することができ、デッドラインミス発生やメモリ不足発生までの残り時間が短い場合においても短時間で移行処理を完了し、システムのハードリアルタイム処理を継続することが可能になる。 In other words, in addition to the device of the third embodiment, the computer resource dynamic control device in the present embodiment includes a replica management unit that starts and terminates a predetermined process on a selected computer and manages these states.
When the resource allocation unit receives the reassignment advisory notice, the resource allocation unit causes the operable computer to execute the process allocation if the selected computer is activated by the replica management unit and is operable. Features.
In this way, the replica management unit 25 activates a replica of each process in advance and makes it stand by, so that the time required from the start of resource reallocation to the actual processing of the process. Even when the remaining time until the occurrence of a deadline miss or memory shortage is short, the migration process can be completed in a short time, and the hard real-time process of the system can be continued.
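The selection described in items 61) and 62) can be sketched as follows. This is only an illustration under stated assumptions: the `Computer` record, its field names, and the tie-breaking rule (prefer the candidate with the most free CPU) are hypothetical and are not taken from the specification, which only requires a candidate with sufficient free CPU resources.

```python
from dataclasses import dataclass


@dataclass
class Computer:
    """Hypothetical view of one computer as seen by the resource allocation unit."""
    name: str
    free_cpu: float      # fraction of CPU currently unused (0.0 to 1.0)
    replicas: frozenset  # names of processes whose standby replica runs here


def pick_migration_target(process, required_cpu, computers):
    """Return a computer that already hosts a replica of `process` and has
    at least `required_cpu` free, or None if no candidate qualifies."""
    candidates = [
        c for c in computers
        if process in c.replicas and c.free_cpu >= required_cpu
    ]
    # Assumption: among qualifying candidates, take the least loaded one.
    return max(candidates, key=lambda c: c.free_cpu, default=None)
```

Restricting the candidates to computers that already run a replica is what removes the process start-up time from the migration path; a computer with more free CPU but no replica is deliberately ignored in this sketch.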

Embodiment 7.
When a process forms part of a pipeline, it affects the processing in the stages before and after it, which imposes additional constraints. A deadline miss in such a case has wide-ranging effects, so its cause must be removed appropriately. This embodiment describes migration in such pipeline processing.
FIG. 61 shows an example of the form of processing to which the computer resource dynamic control system of this embodiment applies. As shown in the figure, a plurality of processes are chained for one input (hereinafter, pipeline processing), and one cycle of processing is completed when each process has performed its processing in turn. The system consists of a plurality of such independent pipelines. The processes 101 that make up a pipeline may run on the same computer 100 or on different computers. Where a single process completes the processing, as with the processes targeted in Embodiments 1 to 6, this embodiment treats it as pipeline processing with a pipeline length of 1.
The configuration of the resource dynamic control system in this embodiment is the same as in Embodiment 5.
The roles and configurations of the processing time monitoring unit 1, computer load monitoring unit 4, computer load amount collection unit 5, resource allocation unit 6, processing time history table memory 8, computer load history table memory 9, and computer state table memory 10 in this embodiment are the same as in Embodiment 1, and their detailed description is omitted.

FIG. 62 shows an example of the data structure of the system configuration information management table memory 7g in this embodiment. In addition to the information described in Embodiment 1, the table holds, as information on pipeline processing, the pipeline name, the processes that make up the pipeline, the pipeline deadline, and the pipeline period. In the example of FIG. 62, a pipeline named Pipeline 1 consists of Process A, Process B, and Process C, which pass data along in that order; the processing from Process A through Process C must complete within 1 second, and the interval from when Process A starts processing until it next starts processing is 200 milliseconds.
In addition to the role described in Embodiment 1, the processing time collection unit 2 in this embodiment obtains the time required for one pass through the pipeline from the processing start time of the process at the head of the pipeline and the processing end time of the process at its tail. It also refers to the processing time history of all processes held in the processing time history table memory 8 and checks how many deadline misses each pipeline in the system has caused. If deadline misses have occurred at least the allowable number of times, it notifies the migration process determination unit 3 that deadline misses in the pipeline have reached the prescribed count. A pipeline deadline miss is detected by taking the processing start time of the head process and the processing end time of the tail process, subtracting the former from the latter, and comparing the result with the "pipeline deadline" value in the system configuration information management table. This is done for every entry in the processing time history table.
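As a rough illustration of the table entry and the miss count described above, the following sketch uses the values given for FIG. 62 (Processes A, B, C; 1 second deadline; 200 millisecond period). The class name, field names, the history layout, and the use of a strict "greater than" comparison are assumptions made for the example, not details from the specification.

```python
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    """Hypothetical pipeline entry of the system configuration table."""
    name: str
    processes: list   # process names in pipeline order
    deadline_ms: int  # limit for one pass from head start to tail end
    period_ms: int    # interval between successive starts of the head process


def count_deadline_misses(cfg, history):
    """Count cycles in which tail end time minus head start time exceeds the
    pipeline deadline. `history` maps a process name to a list of
    (start_ms, end_ms) tuples, one per cycle, in cycle order."""
    head, tail = cfg.processes[0], cfg.processes[-1]
    misses = 0
    for (head_start, _), (_, tail_end) in zip(history[head], history[tail]):
        if tail_end - head_start > cfg.deadline_ms:
            misses += 1
    return misses


pipeline1 = PipelineConfig("Pipeline 1", ["A", "B", "C"], 1000, 200)
```

With a pipeline of length 1 the head and tail are the same process, so the same function reduces to the per-process check of the earlier embodiments.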

The operation flows of the processing time monitoring unit 1, the computer load monitoring unit 4, and the computer load amount collection unit 5 in this embodiment are the same as in Embodiment 1.
FIG. 63 shows the operation of the processing time collection unit 2g in this embodiment. In S71, data is received from the processing time monitoring unit 1, and in S72 it is written to the processing time history table memory 8. If the check in S73 finds that the data written in S72 is the processing completion time of the process at the tail of the pipeline, the flow proceeds to S74, which obtains the processing start time, in the same processing cycle, of the process that performed the head processing of the pipeline. S75 likewise obtains the processing completion time, in the same processing cycle, of the process that performed the tail processing. S76 computes the time for one pass through the pipeline from the results of S74 and S75 and checks whether a deadline miss has occurred. If so, the unit checks how many deadline misses are recorded in the data remaining in the processing time history table memory 8 and, if the count is at or above the prescribed number, notifies the migration process determination unit 3. In S79, the entry position in the processing time history table is advanced by one so that the data of the next cycle can be stored for each process. If S73 finds that the written data is not from the tail process, or S76 finds no deadline miss, the flow goes directly to S79.
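The flow of FIG. 63 can be approximated by an event handler such as the one below. It is a sketch only: the class, its parameters, the explicit cycle number used to pair head and tail records of the same processing cycle, and the bounded history length are all assumptions introduced for illustration.

```python
from collections import deque


class PipelineTimeCollector:
    """Hypothetical per-pipeline collector following steps S71 to S79."""

    def __init__(self, head, tail, deadline_ms, allowed_misses, notify,
                 history_len=16):
        self.head, self.tail = head, tail
        self.deadline_ms = deadline_ms
        self.allowed_misses = allowed_misses
        self.notify = notify                       # called with the miss count
        self.starts = {}                           # cycle number -> head start time
        self.elapsed = deque(maxlen=history_len)   # recent pass times

    def on_record(self, process, cycle, start_ms, end_ms):
        # S71/S72: receive the record and keep what is needed from it.
        if process == self.head:
            self.starts[cycle] = start_ms
        # S73: only a record from the tail process closes a pass.
        if process != self.tail or cycle not in self.starts:
            return
        # S74 to S76: pass time for the same cycle, then the deadline check.
        elapsed = end_ms - self.starts.pop(cycle)
        self.elapsed.append(elapsed)
        if elapsed <= self.deadline_ms:
            return
        misses = sum(t > self.deadline_ms for t in self.elapsed)
        if misses >= self.allowed_misses:
            self.notify(misses)
```

Keying the head start time by cycle matters because the pipeline period (200 ms in FIG. 62) can be shorter than the deadline (1 s), so the head process may begin a new cycle before the tail has finished the previous one.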

FIG. 64 shows the operation of the migration process determination unit 3g in this embodiment.
When a pipeline deadline miss notification arrives from the processing time collection unit 2 in S701, S702 obtains the list of names of the processes that make up the pipeline in which the deadline miss occurred. S703 then finds, among all the processes obtained in S702, those whose processing time has increased significantly. These are regarded as the processes that caused the deadline miss; there may be more than one. S704 and the subsequent steps are executed for every process found in S703. S705 selects one of the processes found in S703, S706 finds the computer on which it runs, and S707 finds the other processes running on the computer found in S706. S708 checks whether the computer found in S706 is operating normally. If it is, the candidate for resource reallocation is either the process found in S707 whose CPU time has increased the most or the process selected in S705; it suffices that one of the two can be reallocated. S710 then notifies the resource allocation unit 6. If S708 finds that the computer is not operating normally, the flow proceeds to S711, where all the processes found in S707 become targets of resource reallocation and the resource allocation unit 6 is notified.
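A minimal sketch of the candidate search in S703 to S711 is given below. The function names, the dictionary-based inputs, and in particular the growth ratio used to decide that a processing time has "increased significantly" are hypothetical; the specification does not state a threshold.

```python
def find_cause_processes(pipeline, proc_times, ratio=1.5):
    """S703: processes whose latest processing time is at least `ratio`
    times their oldest recorded one (the ratio is an assumed threshold)."""
    return [p for p in pipeline if proc_times[p][-1] >= ratio * proc_times[p][0]]


def reallocation_candidates(cause, placement, cpu_increase, healthy):
    """S705 to S711 for one cause process.

    placement    : process name -> computer name
    cpu_increase : process name -> growth in CPU time
    healthy      : computer name -> True if operating normally
    """
    host = placement[cause]                                    # S706
    others = [p for p, h in placement.items()
              if h == host and p != cause]                     # S707
    if not healthy[host]:                                      # S708 -> S711
        # The text names only the processes found in S707 here.
        return others
    if not others:
        return [cause]
    heaviest = max(others, key=lambda p: cpu_increase[p])
    # Either of these may be reallocated; one success is enough (S710).
    return [heaviest, cause]
```

Returning both the heaviest co-located process and the cause process leaves the final choice to the resource allocation step, which matches the statement that reallocating either one suffices.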

FIG. 65 shows an example of the operation flow of the dynamic resource management system or device in this embodiment, in which all the components operate in cooperation.
In S711, the processing time collection unit 2g or the computer load amount collection unit 5 notifies the migration process determination unit 3 of a pipeline deadline miss, a computer failure, a memory shortage, or the like. The processing time collection unit 2g detects deadline misses by the procedure of S71 to S79, and the computer load amount collection unit 5 detects computer failures and memory or CPU resource shortages by the procedure of S1011 to S1019.
In S712, the migration process determination unit 3g analyzes the notification. If the cause is a computer failure, the flow proceeds to S716, where the resource allocation unit 6 restarts, on other computers, the processes that were running on the failed computer, by the procedure of S121 to S136. If the cause is not a failure, in S714 the migration process determination unit 3g uses the procedure of S701 to S710 to find the process considered responsible for the deadline miss or resource shortage, and in S715 the resource allocation unit 6 reallocates resources for that process by the procedure of S121 to S136. In S717, the information in the system configuration table memory 7g concerning the computers involved in the reallocation is rewritten.
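The branching in S712 to S717 might be sketched as a single dispatch function. Everything named here is hypothetical: the notification is modeled as a dictionary, the configuration table as a process-to-computer mapping, and `find_causes` and `reallocate` are stand-ins for the procedures of S701 to S710 and S121 to S136 respectively.

```python
def handle_notification(notification, find_causes, reallocate, config_table):
    """Dispatch one notification and update the configuration table.

    notification : dict with a "cause" key, plus "computer" for failures
    find_causes  : callable(notification) -> list of process names
    reallocate   : callable(process name) -> name of the new computer
    config_table : dict mapping process name -> computer name (updated in place)
    """
    if notification["cause"] == "computer_failure":        # S712 -> S716
        failed = notification["computer"]
        targets = [p for p, host in config_table.items() if host == failed]
    else:                                                  # S714
        targets = find_causes(notification)
    for process in targets:                                # S715 or S716
        config_table[process] = reallocate(process)        # S717
    return targets
```

Passing the two procedures in as callables keeps the sketch independent of how cause detection and resource allocation are actually implemented.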

That is, the computer resource dynamic control method of this embodiment adds the following features to the method of Embodiment 1. For processing in which a plurality of processes are chained in a pipeline and one cycle completes when each process has run in turn, and for which a deadline time is given, the method migrates the process that caused a deadline miss when one occurs, and thereby prevents deadline misses from continuing in processing of this form. Specifically, it comprises:
71) A system configuration information management table that, in addition to the contents of the table of Embodiment 1, holds the group of processes that performs one periodic processing, the period, the time limit of the processing, and the processing name.
72) Processing time collection means that, in the processing time collection means of Embodiment 1, obtains the time required for the whole series of processing made up of a plurality of processes and notifies the migration process determination means when a deadline miss occurs.
73) Migration process determination means that, in the migration process determination means of Embodiment 1, on receiving a notification from the processing time collection means, examines the variation in processing time of all the processes making up the series of processing to identify the process that caused the deadline miss, and notifies the resource allocation means of that process.

In other words, in the computer resource dynamic control device of this embodiment, in addition to the device of Embodiment 1, the system configuration information table memory stores the names of the processes that make up the pipeline and the deadline time, and the migration process determination unit, on receiving a deadline miss occurrence or a predicted occurrence, refers to the system configuration information table memory, identifies the pipeline process responsible, and issues a recommendation.
Thus, according to this embodiment, when a deadline is given to pipeline processing made up of a plurality of processes, the cause of a deadline miss in the pipeline can be removed and the deadline miss resolved.
The embodiments described above may also be combined, in which case the combined effects of the features described in each embodiment are obtained.

FIG. 1 is a diagram showing the configuration of the computer resource dynamic control system in Embodiment 1 of the invention.
FIG. 2 is a diagram showing the configuration of the computer resource dynamic control device in Embodiment 1.
FIG. 3 is a diagram explaining the function of the processing time monitoring unit in Embodiment 1.
FIGS. 4 to 6 are diagrams explaining the function of the processing time collection unit in Embodiment 1.
FIG. 7 is a diagram explaining the function of the migration process determination unit in Embodiment 1.
FIG. 8 is a diagram explaining the function of the computer load monitoring unit in Embodiment 1.
FIGS. 9 to 12 are diagrams explaining the function of the computer load amount collection unit in Embodiment 1.
FIG. 13 is a diagram explaining the function of the resource allocation unit in Embodiment 1.
FIG. 14 is a diagram showing a data structure example of the system configuration information management table in Embodiment 1.
FIG. 15 is a diagram showing a data structure example of the processing time history table in Embodiment 1.
FIG. 16 is a diagram showing a data structure example of the computer load history table in Embodiment 1.
FIG. 17 is a diagram showing a data structure example of the computer state table in Embodiment 1.
FIG. 18 is a diagram showing the operation flow of the processing time monitoring unit in Embodiment 1.
FIG. 19 is a diagram showing the operation flow of the processing time collection unit in Embodiment 1.
FIG. 20 is a diagram showing the operation flow of the computer load monitoring unit in Embodiment 1.
FIG. 21 is a diagram showing the operation flow of the computer load amount collection unit in Embodiment 1.
FIGS. 22 and 23 are diagrams showing the operation flow of the migration process determination unit in Embodiment 1.
FIG. 24 is a diagram showing the operation flow of the resource allocation unit in Embodiment 1.
FIG. 25 is a diagram showing the overall operation flow of the computer resource dynamic control device and system in Embodiment 1.
FIG. 26 is a diagram showing the configuration of the computer resource dynamic control system in Embodiment 2 of the invention.
FIG. 27 is a diagram showing the configuration of the computer resource dynamic control device in Embodiment 2.
FIG. 28 is a diagram explaining the function of the deadline miss occurrence time prediction unit in Embodiment 2.
FIG. 29 is a diagram explaining the function of the computer load prediction unit in Embodiment 2.
FIG. 30 is a diagram explaining the function of the CPU time prediction unit in Embodiment 2.
FIG. 31 is a diagram explaining the function of the memory usage prediction unit in Embodiment 2.
FIG. 32 is a diagram showing the operation flow of the deadline miss occurrence time prediction unit in Embodiment 2.
FIG. 33 is a diagram showing the operation flow of the computer load prediction unit in Embodiment 2.
FIG. 34 is a diagram showing the operation flow of the CPU time prediction unit in Embodiment 2.
FIG. 35 is a diagram showing the operation flow of the memory usage prediction unit in Embodiment 2.
FIGS. 36 and 37 are diagrams showing the operation flow of the migration process determination unit in Embodiment 2.
FIG. 38 is a diagram showing the operation flow of the resource allocation unit in Embodiment 2.
FIG. 39 is a diagram showing the overall operation flow of the computer resource dynamic control device and system in Embodiment 2.
FIG. 40 is a diagram showing a data structure example of the system configuration information management table in Embodiment 3 of the invention.
FIG. 41 is a diagram showing the operation flow of the deadline miss occurrence time prediction unit in Embodiment 3.
FIG. 42 is a diagram showing the operation flow of the memory usage prediction unit in Embodiment 3.
FIG. 43 is a diagram showing the operation flow of the computer load prediction unit in Embodiment 3.
FIGS. 44 and 45 are diagrams showing the operation flow of the migration process determination unit in Embodiment 3.
FIG. 46 is a diagram showing the operation flow of the resource allocation unit in Embodiment 3.
FIG. 47 is a diagram showing the overall operation flow of the computer resource dynamic control device and system in Embodiment 3.
FIG. 48 is a diagram showing a data structure example of the system configuration information management table in Embodiment 4 of the invention.
FIG. 49 is a diagram showing the operation flow of the resource allocation unit in Embodiment 4.
FIG. 50 is a diagram showing the overall operation flow of the computer resource dynamic control device and system in Embodiment 4.
FIG. 51 is a diagram showing a data structure example of the system configuration information management table in Embodiment 5 of the invention.
FIG. 52 is a diagram showing the operation flow of the resource allocation unit in Embodiment 5.
FIG. 53 is a diagram showing the overall operation flow of the computer resource dynamic control device and system in Embodiment 5.
FIG. 54 is a diagram showing the configuration of the computer resource dynamic control system in Embodiment 6 of the invention.
FIG. 55 is a diagram showing the configuration of the computer resource dynamic control device in Embodiment 6.
FIGS. 56 and 57 are diagrams showing the operation flow of the replica management unit in Embodiment 6.
FIG. 58 is a diagram showing the operation flow of the memory usage prediction unit in Embodiment 6.
FIG. 59 is a diagram showing the operation flow of the resource allocation unit in Embodiment 6.
FIG. 60 is a diagram showing the overall operation flow of the computer resource dynamic control device and system in Embodiment 6.
FIG. 61 is a diagram showing an example of the processes targeted by the computer resource dynamic control system in Embodiment 7 of the invention.
FIG. 62 is a diagram showing a data structure example of the system configuration information management table in Embodiment 7.
FIG. 63 is a diagram showing the operation flow of the processing time collection unit in Embodiment 7.
FIG. 64 is a diagram showing the operation flow of the migration process determination unit in Embodiment 7.
FIG. 65 is a diagram showing the overall operation flow of the computer resource dynamic control device and system in Embodiment 7.

Explanation of symbols

1 processing time monitoring unit, 2 processing time collection unit, 2b processing time collection and prediction unit, 3 migration process determination unit, 4 computer load monitoring unit, 5 computer load amount collection unit, 5b computer load amount collection and prediction unit, 6 resource allocation unit, 7 system configuration information management table memory, 8 processing time history table memory, 9 computer load history table memory, 10 computer state table memory, 11 processor (CPU), 12 memory, 13 input device, 14 output device, 15 communication interface, 16 internal bus, 21 deadline miss occurrence time prediction unit, 22 computer load prediction unit, 23 CPU time prediction unit, 24 memory usage prediction unit, 25 replica management unit, 100 computer, 100a, 100b, 100c computer resource dynamic control device, 101 process.

Claims

A computer resource dynamic control device to which a plurality of computers each having a processor and a memory are connected, and which performs control so that a process repeatedly executed at a predetermined cycle by the processor of a predetermined computer among the plurality of computers is executed by another computer of the plurality of computers other than the predetermined computer, the device comprising:
a system configuration information table memory that stores in advance a deadline time representing the limit of the processing time from the start to the end of execution of the process, an allowable number of times that the processing time is permitted to exceed the deadline time when the process is repeatedly executed at the predetermined cycle, and a consumed memory amount representing the amount of memory consumed when the process is executed;
a processing time history table memory that stores, as processing time history information, one or more sets of a processing start time at which execution of the process was started and a processing end time at which execution of the process was ended;
a processing time collection unit that receives from the predetermined computer the processing start time and the processing end time output when the predetermined computer finishes executing the process, and stores the received processing start time and processing end time in the processing time history table memory as processing time history information,
obtains, for each of the one or more items of processing time history information stored in the processing time history table memory, a processing time by subtracting the processing start time from the processing end time of that item,
obtains the number of processing times exceeding the deadline time by comparing the processing time obtained for each item with the deadline time stored in the system configuration information table memory, and,
when the obtained number exceeds the allowable number of times stored in the system configuration information table memory and the processing time corresponding to the latest of the one or more items of processing time history information exceeds the deadline time, gives notification that the number of processing times exceeding the deadline time exceeds the allowable number of times;
a migration process determination unit that, when notified by the processing time collection unit that the number of processing times exceeding the deadline time exceeds the allowable number of times, determines that the process is to be executed by the other computer;
a computer load amount collection unit that receives from each of the plurality of computers a processor usage rate representing the usage rate of the processor of that computer and a free memory amount representing the free memory of that computer, and outputs the received processor usage rate and free memory amount of each computer; and
a resource allocation unit that, when the migration process determination unit determines that the process is to be executed by another computer, receives, from among the processor usage rates and free memory amounts of the computers output by the computer load amount collection unit, the processor usage rates and free memory amounts of the other computers,
compares, in order of the received processor usage rates of the other computers, the free memory amount of each computer with the consumed memory amount stored in the system configuration information table memory to determine the other computer, and performs control so that the process is executed on the determined other computer.

The system configuration information table / memory stores in advance a memory amount representing the capacity of the memory of the computer corresponding to each of the computers,
The computer load amount collection unit inputs a memory usage amount representing a memory usage amount of the computer together with the processor usage rate from each computer of the plurality of computers, and the processor usage rate and the memory usage amount of each inputted computer And
The computer resource dynamic control device further comprises:
A computer load history table memory that stores, as computer load history information, the processor usage rate and memory usage amount output by the computer load amount collection unit, holding one or more pieces of computer load history information for each computer;
A deadline miss occurrence time prediction unit that obtains the processing time corresponding to each of the one or more pieces of processing time history information stored in the processing time history table memory, determines, on the basis of the fluctuation of the one or more obtained processing times, whether a processing time exceeding the deadline time will occur in the future, and, when it determines that a processing time exceeding the deadline time will occur in the future, predicts and outputs, on the basis of the fluctuation of the one or more processing times, the occurrence time at which the processing time exceeding the deadline time will occur;
A computer load prediction unit that calculates and outputs the processor usage rate of each computer at the occurrence time predicted by the deadline miss occurrence time prediction unit, on the basis of the fluctuation of the processor usage rate in the one or more pieces of computer load history information for each computer stored in the computer load history table memory; and
A memory usage prediction unit that calculates and outputs the memory usage amount of each computer at the occurrence time predicted by the deadline miss occurrence time prediction unit, on the basis of the fluctuation of the memory usage amount in the one or more pieces of computer load history information for each computer stored in the computer load history table memory;
When the occurrence time is output by the deadline miss occurrence time prediction unit, the migration process determination unit determines to cause another computer to execute the process,
The resource allocation unit,
When the occurrence time is output by the deadline miss occurrence time prediction unit and the migration process determination unit determines to cause another computer to execute the process, inputs the processor usage rates of the other computers from among the processor usage rates of the plurality of computers, corresponding to the occurrence time, output by the computer load prediction unit,
And, in order of the input processor usage rates of the other computers,
(1) Acquires the memory amount of the other computer from the memory amounts corresponding to the respective computers stored in the system configuration information table memory,
(2) Inputs the memory usage amount of the other computer from among the memory usage amounts of the plurality of computers, corresponding to the occurrence time, output by the memory usage prediction unit,
(3) Obtains a free memory amount by subtracting the input memory usage amount from the acquired memory amount, compares the obtained free memory amount with the consumed memory amount stored in the system configuration information table memory, and determines whether the free memory amount is equal to or greater than the consumed memory amount,
And, when it determines that the free memory amount is equal to or greater than the consumed memory amount, performs control to cause the other computer whose free memory amount is equal to or greater than the consumed memory amount to execute the process. The computer resource dynamic control device according to claim 1.
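
Steps (1) to (3) and the predictions they rely on can be sketched in Python as follows. The claim only states that the predictions are made "based on the fluctuation" of past values, so the linear extrapolation used here is an assumption chosen for illustration, not the method of the patent. The function names, the representation of the occurrence time as a number of future cycles, and the ascending order of processor usage rate are likewise hypothetical, and the history samples are assumed to be evenly spaced at one sample per cycle.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def linear_trend(values: Sequence[float]) -> float:
    """Average change per sample (assumed model of the 'fluctuation')."""
    if len(values) < 2:
        return 0.0
    return (values[-1] - values[0]) / (len(values) - 1)


def extrapolate(values: Sequence[float], steps: int) -> float:
    """Project the latest value forward by `steps` samples."""
    return values[-1] + linear_trend(values) * steps


def predict_miss_step(processing_times: Sequence[float],
                      deadline: float) -> Optional[int]:
    """Number of future cycles until the processing time is predicted to
    exceed the deadline, or None when no miss is predicted."""
    if not processing_times:
        return None
    latest = processing_times[-1]
    if latest > deadline:
        return 0
    slope = linear_trend(processing_times)
    if slope <= 0:
        return None
    # Smallest integer k such that latest + slope * k > deadline.
    return math.floor((deadline - latest) / slope) + 1


def choose_target_predicted(
        load_history: Dict[str, List[Tuple[float, float]]],
        memory_capacity: Dict[str, float],
        current: str,
        consumed_memory: float,
        steps: int) -> Optional[str]:
    """Apply steps (1) to (3) to the predicted load of each other computer.

    load_history maps a computer name to (processor usage, memory usage)
    samples, oldest first.
    """
    predicted = []
    for name, history in load_history.items():
        if name == current:
            continue
        cpu = extrapolate([h[0] for h in history], steps)
        used = extrapolate([h[1] for h in history], steps)   # step (2)
        free = memory_capacity[name] - used                  # steps (1), (3)
        predicted.append((cpu, name, free))
    for _cpu, name, free in sorted(predicted):  # assumed ascending order
        if free >= consumed_memory:
            return name
    return None
```

A caller would first obtain `steps` from `predict_miss_step` and, when it is not `None`, pass it to `choose_target_predicted` so that the target computer is chosen from the load predicted for the occurrence time rather than from the current load.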