JP2013210683A

JP2013210683A - Autoscaling method, autoscaling program and computer node

Info

Publication number: JP2013210683A
Application number: JP2012078501A
Authority: JP
Inventors: Kiyonobu Sato; 清伸佐藤; Kazumi Shinozaki; 和美篠崎; Hiroaki Kubota; 博昭久保田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2012-03-30
Filing date: 2012-03-30
Publication date: 2013-10-10
Anticipated expiration: 2032-03-30
Also published as: JP5867238B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique that enables operation of a system with good billing efficiency. SOLUTION: A master node 110 manages the tasks executed by slave nodes 120. A schedule information storage unit 113 stores management information for managing the tasks executed by each slave node 120. A schedule processing unit 112 verifies whether any slave node 120 can finish processing the task of a processing request within its deadline. The schedule processing unit 112 also verifies whether a computer node that can finish the requested task within the deadline can be prepared by moving tasks that can still finish within their own deadlines on other computer nodes. When both verifications are negative, a scale-out processing unit 114 adds slave nodes 120 by scale-out.

Description

The present invention relates to an autoscaling method, an autoscaling program, and a computer node for controlling the number of computer nodes in a system.

In a conventional on-premises system, the amount of resources in the system is fixed, so the system cannot accept processing requests from clients beyond that amount.

In contrast, in a system built on cloud computing, which has developed in recent years, the amount of resources in the system can be changed dynamically. Therefore, even when executing the processing requested by a client would push the system's resource consumption beyond the resources it has, the request can be accepted by dynamically increasing the system's resources.

One technique for increasing or decreasing the resources of such a system is called scale-up/scale-down, which increases or decreases the amount of resources of a virtualized computer node in the system. However, the resources of a virtualized computer node can be increased only up to the limit of the physical machine's resources.

Another technique for increasing or decreasing system resources is called scale-out/scale shrink, which increases or decreases the number of computer nodes in the system. With this technique, the growth of system resources is not bounded by the resources of a physical machine. There is also a technique called autoscaling, which performs scale-out and scale shrink automatically according to the resource consumption rate of the system.

For distributed batch systems, a technique for managing batch deadlines is known: when a new batch process is requested, a batch process running on one processing server is migrated to another server, and the new batch process is executed on the server that has thereby become available.

JP 2002-342098 A

However, as described above, when autoscaling is performed according to the resource consumption rate of the system, scale-out may add unnecessary computer nodes, and an erroneous scale shrink may forcibly interrupt processing that the system is executing. In cloud computing, when usage is billed according to the number of computer nodes, unnecessary scale-out can incur unnecessary charges.

In one aspect, an object of the present invention is to provide a technique that enables operation of a system with good billing efficiency.

In one aspect, the autoscaling method applies to a system that distributes requested processing across a plurality of computer nodes. A computer node that manages the tasks executed in the system, upon accepting a processing request with a specified deadline, refers to management information stored in a storage unit, which manages the tasks executed by the computer nodes of the system, and executes a simulation of assigning the task of the processing request to a computer node. When the assignment simulation determines that no computer node can finish the requested task within the deadline, and further that no such computer node can be prepared even by moving tasks that could still finish within their own deadlines on other computer nodes, the managing computer node executes processing for adding a computer node to the system.

In one aspect, a system can be operated with good billing efficiency.

A diagram explaining a problem of a system realized on-premises.
A diagram explaining a problem of a system realized by cloud computing.
A diagram showing a configuration example of the system according to the present embodiment.
A diagram showing a functional configuration example of each node according to the present embodiment.
A diagram showing an example of a virtual machine system that realizes the nodes according to the present embodiment.
A diagram explaining an example of the task assignment simulation for a processing request performed by the schedule processing unit in the master node of the present embodiment.
A diagram explaining an example of the task assignment simulation for a processing request performed by the schedule processing unit in the master node of the present embodiment.
A diagram explaining an example of the task assignment simulation for a processing request performed by the schedule processing unit in the master node of the present embodiment.
A diagram explaining an example of the task assignment simulation for a processing request performed by the schedule processing unit in the master node of the present embodiment.
A diagram explaining an example of the task move simulation performed by the reschedule processing unit in the master node of the present embodiment.
A diagram explaining an example of the task move simulation performed by the reschedule processing unit in the master node of the present embodiment.
A diagram explaining an example in which the reschedule processing unit in the master node of the present embodiment sets a lock on a slave node.
A diagram explaining an example in which the reschedule processing unit in the master node of the present embodiment sets a lock on a slave node.
A diagram explaining an example in which the schedule processing unit in the master node of the present embodiment releases the lock on a slave node.
A flowchart of the processing performed by the master node of the present embodiment when a processing request is accepted.
A flowchart of the processing performed by the master node of the present embodiment when a processing request is accepted.
A flowchart of the reschedule processing performed by the master node of the present embodiment.

Hereinafter, the present embodiment will be described with reference to the drawings.

FIG. 1 is a diagram illustrating a problem of a system realized on-premises.

In the example shown in FIG. 1, the height of the shape representing a task indicates the task's resource consumption, and its width indicates the task's processing time. The vertical axis of the graph indicates the system's resource consumption, and the horizontal axis indicates elapsed time. t₀ to t₃ denote equally spaced times.

In an on-premises system, the amount of resources in the system is fixed. That is, the system cannot perform processing whose resource consumption exceeds the amount of resources it was given in advance. In the example shown in FIG. 1, the system resource line indicates the amount of resources the system has.

For example, in FIG. 1(A), suppose that processing requests for task #1 to task #4 arrive at the system at almost the same time t₀. The system starts processing in the order in which the requests were received, here task #1, task #2, task #3, and task #4. However, as shown in FIG. 1(A), once the processing of task #3 has started, the system's resource consumption reaches the amount of resources the system has, so the processing of task #4 cannot be started and the processing request for task #4 cannot be accepted.

The example shown in FIG. 1(B) is a system that performs scheduling. Such a system estimates the processing time of a task when it receives the processing request. Even if no resources are free when the request arrives, it schedules the requested task to run after another running task has finished.

For example, suppose that the deadline of task #4, which is received last, is time t₃. As shown in FIG. 1(B), task #3 has the earliest processing end time, t₁, so if task #4 is scheduled after task #3, task #4 ends at t₃ and meets the deadline. In this case, the system can accept the processing request for task #4.

In the example shown in FIG. 1(C), however, task #2 has the earliest processing end time, and if task #4 is scheduled after it, the processing end time of task #4 exceeds t₃ and misses the deadline. In this case, the system cannot accept the processing request for task #4. Thus, even with scheduling, an on-premises system may be unable to accept a processing request.
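The acceptance check illustrated by FIG. 1(B) and FIG. 1(C) can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes that the resources freed by the earliest-finishing task are enough for the new task, and the function name `can_accept` and the concrete end times (in units of the equally spaced times t₀ to t₃) are hypothetical.

```python
def can_accept(end_times, duration, deadline):
    """Return True if a new task, started when the earliest running
    task finishes, would itself finish by the deadline.

    Assumption: the resources released by that earliest task are
    sufficient to run the new task.
    """
    start = min(end_times)          # earliest time at which resources free up
    return start + duration <= deadline


# FIG. 1(B)-like case: task #3 ends at t1, task #4 needs two intervals,
# and its deadline is t3. The other end times are assumed values.
fig_1b = can_accept([3, 2, 1], duration=2, deadline=3)

# FIG. 1(C)-like case: the earliest end (task #2) is assumed to be t2,
# so task #4 would end after t3.
fig_1c = can_accept([3, 2, 3], duration=2, deadline=3)

print(fig_1b, fig_1c)
```

In the first case the request can be accepted with the fixed resources; in the second it cannot, which is the situation an on-premises system has no way to resolve.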

At the opposite extreme from such on-premises technology is cloud computing. The amount of resources of a system realized by cloud computing is variable.

In cloud computing, each computer node of the system is realized by, for example, a virtual machine (VM). In the following, a computer node of the system is also simply called a node.

In cloud computing, one technique for changing the amount of system resources is scale-up/scale-down. Scale-up increases the amount of resources of a node, and scale-down decreases it. With scale-up, for example, even when the task of a new processing request would miss its deadline if executed with the current resources, increasing the node's resources allows the task to be executed in time for the deadline.

However, the resources available for scale-up are capped by the resources of the physical machine that hosts the virtual-machine node, so there is a limit to scaling up a node. In addition, multiple virtual-machine nodes are usually placed on one physical machine, and when the physical machine's resources have been fully allocated to those nodes with no surplus, the performance of a particular node cannot be raised without lowering the performance of the others. For example, in a multi-tenant environment, virtual machines of multiple tenants run on one physical machine, and reducing the resources of other tenants in order to increase the resources of one tenant's virtual machine is not easy. Therefore, even with scale-up, not every requested task can be executed with its deadline guaranteed.

In cloud computing, besides scale-up/scale-down, there is also scale-out/scale shrink as a technique for changing the amount of system resources. Scale-out adds nodes to the system, and scale shrink removes nodes from it. With scale-out, it becomes possible, almost without exception, to execute all requested tasks with their deadlines guaranteed.

However, in cloud computing, service usage fees are usually pay-as-you-go, so the usage fee rises as scale-out increases the number of nodes in the system.

FIG. 2 is a diagram illustrating a problem of a system realized by cloud computing.

In the example shown in FIG. 2, the height of the shape representing a task indicates the task's resource consumption, and its width indicates the task's processing time. The vertical axis of the graph indicates the total resource consumption of the nodes constituting the system, and the horizontal axis indicates elapsed time. t₀ to t₃ denote equally spaced times. The system resource line indicates the total amount of resources held by the nodes constituting the system.

In FIG. 2(A), suppose that the system initially has three nodes, VM#1 to VM#3, and that processing requests for task #1 to task #4 arrive at the system at almost the same time t₀. The system assigns the tasks to nodes in the order in which the requests were received, here task #1, task #2, task #3, and task #4, and processing starts. However, as shown in FIG. 2(A), once task #3 has been assigned to node VM#3, the resource consumption of every node in the system is at its upper limit, and no node remains to which task #4 can be assigned.

In this case, for example, as shown in FIG. 2(A), the system performs scale-out and node VM#4 is added to the system. With node VM#4 added by scale-out, the requested task #4 can be executed. For example, in cloud computing, some IaaS (Infrastructure as a Service) vendors, which provide infrastructure such as virtual machines and networks as a service, offer a service that scales out automatically when the system's resource consumption rate reaches or exceeds a predetermined level. In cloud computing, the technique of automatically expanding and shrinking a system is also called autoscaling.

Here, suppose that the deadline of task #4 is time t₃. As shown in FIG. 2(A), task #3 has the earliest processing end time, t₁, so if task #4 is scheduled after task #3, task #4 ends at t₃ and meets the deadline.

Thus, if the tasks had been scheduled, task #4 could have been executed in time for its deadline without adding node VM#4 by scale-out. However, this judgment cannot be made merely by looking at resource consumption, as the IaaS service described above does. As noted above, in cloud computing the usage fee rises with each added node, so there is a demand not to add nodes when tasks can be executed in time for their deadlines without them.

In addition, to hold down usage fees for unnecessary nodes, scale shrink is performed on a node judged not to be in use in the system, and the node is removed from the system. For example, in cloud computing, some IaaS vendors offer a service that, when a node's CPU (Central Processing Unit) utilization falls to or below a predetermined level, judges that the node is processing no tasks and that only middleware such as the OS is running, and automatically performs scale shrink.

However, low resource consumption on a node does not necessarily mean that no requested task is being processed. For example, in FIG. 2(B), the resource consumption of node VM#3 drops sharply when task #3 finishes. Yet on node VM#3, task #4, which consumes few resources, is running in parallel with task #3. If scale shrink is performed on node VM#3 as soon as task #3 finishes, task #4 is forcibly terminated. Thus, when scale shrink is decided merely by looking at resource consumption, as in the IaaS vendor service described above, a node that is executing a low-consumption task, such as task #4 in FIG. 2(B), may be removed by mistake.

The following describes the autoscaling technique according to the present embodiment, which is intended to solve these problems.

FIG. 3 is a diagram illustrating a configuration example of a system according to the present embodiment.

In the present embodiment, the system 100 shown in FIG. 3 is assumed to be a computer system composed of virtual-machine nodes provided by the IaaS side in cloud computing. The system 100 distributes the processing requested by a client 200 across a plurality of nodes. The system 100 and the client 200 are connected by a network 300 such as a LAN (Local Area Network) or the Internet. The system 100 includes a master node 110 and slave nodes 120.

The master node 110 accepts processing requests from the client 200, assigns tasks to the slave nodes 120, and manages the tasks executed in the system 100. The master node 110 according to the present embodiment also performs autoscaling processing, such as scale-out and scale shrink of the slave nodes 120. The master node 110 may at the same time be one of the slave nodes 120.

A slave node 120 is a computer node that executes the tasks of processing requests assigned by the master node 110. A slave node 120 can process multiple tasks in parallel. Slave nodes 120 are added to or removed from the system 100 by autoscaling. For example, in FIG. 3, when slave node 120a (#a) and slave node 120b (#b) are executing tasks and a processing request arrives that those slave nodes 120 cannot handle, slave node 120c (#c) is added to the system 100. Likewise, when slave node 120c (#c) has no more tasks to execute, it is removed from the system 100.

The client 200 is a computer that has a higher-level application that requests the system 100 to execute processing provided by the system 100. In the example shown in FIG. 3, the client 200 is a device outside the system 100, but the client 200 may instead be a node inside the system 100.

FIG. 4 is a diagram illustrating a functional configuration example of each node according to the present embodiment.

The master node 110 includes a client communication unit 111, a schedule processing unit 112, a schedule information storage unit 113, a scale-out processing unit 114, a slave communication unit 115, a reschedule processing unit 116, and a scale shrink processing unit 117.

The client communication unit 111 accepts processing requests from the client 200. If the requested processing can be executed within the system 100, the client communication unit 111 passes the request to the schedule processing unit 112. If it cannot, the client communication unit 111 responds to the client 200 to that effect. Processing that cannot be executed within the system 100 includes, for example, processing of an application that the slave nodes 120 do not have, and large-scale processing that a single virtual-machine slave node 120 cannot finish by the deadline. Techniques for distributing large-scale processing across multiple nodes are outside the scope of the present embodiment and are not described here. The client communication unit 111 also notifies the client 200 of processing results received from the slave nodes 120.

When the client communication unit 111 accepts a processing request with a specified deadline, the schedule processing unit 112 refers to the management information stored in the schedule information storage unit 113 and executes a simulation of assigning the task of the processing request to a slave node 120.

The schedule information storage unit 113 is a storage unit for the management information used to manage the tasks executed by the slave nodes 120 of the system 100. The management information stored in the schedule information storage unit 113 records the schedule of tasks to be executed on all slave nodes 120.

More specifically, the schedule processing unit 112 first detects, in the current state, slave nodes 120 that can finish the task of the accepted processing request within the specified deadline. Here, it detects not only slave nodes 120 that can execute the task immediately but also slave nodes 120 that can finish within the deadline even if they execute the requested task after other tasks have finished. From the detected slave nodes 120, the schedule processing unit 112 selects the slave node 120 to which the requested task is assigned. Locked slave nodes 120 are excluded from detection. In the present embodiment, locking a slave node 120 means excluding it from the targets to which tasks of processing requests can be assigned.
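The detection step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patent's implementation: the `SlaveNode` class, its `free_at` field (the time at which the node could start the new task), and the function `detect_candidates` are hypothetical names, and each node is reduced to a single start time rather than a full task schedule.

```python
from dataclasses import dataclass


@dataclass
class SlaveNode:
    name: str
    free_at: float        # hypothetical: earliest time the node can start the new task
    locked: bool = False  # locked nodes are excluded from task assignment


def detect_candidates(nodes, now, duration, deadline):
    """Return the unlocked nodes that can finish the requested task by its deadline."""
    candidates = []
    for node in nodes:
        if node.locked:
            continue                      # locked nodes are not detection targets
        start = max(now, node.free_at)    # immediately, or after other tasks finish
        if start + duration <= deadline:
            candidates.append(node)
    return candidates


nodes = [
    SlaveNode("a", free_at=0.0),               # can start immediately
    SlaveNode("b", free_at=2.0),               # busy until t=2 but still in time
    SlaveNode("c", free_at=5.0),               # would finish too late
    SlaveNode("d", free_at=0.0, locked=True),  # excluded because it is locked
]
names = [n.name for n in detect_candidates(nodes, now=0.0, duration=3.0, deadline=5.0)]
print(names)
```

Node `b` is included even though it is busy when the request arrives, which mirrors the point that a node running other tasks can still qualify if the requested task would finish in time afterwards.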

If, in the current state, no slave node 120 can finish the requested task within the deadline, the schedule processing unit 112 detects tasks that could still finish within their own deadlines if moved to another slave node 120. The schedule processing unit 112 then determines whether moving a detected task would make available a slave node 120 that can finish the requested task within the deadline. If so, it moves the task to the other slave node 120 via the slave communication unit 115. A task may be moved, for example, by stopping it on the source slave node 120 and restarting it on the destination slave node 120, or by transferring the image data of the running task from the source slave node 120 to the destination slave node 120. In the present embodiment, tasks are moved by restarting them on the destination slave node 120. The schedule processing unit 112 selects the slave node 120 freed up by the move as the slave node 120 to which the requested task is assigned.
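A move simulation of this kind might look like the sketch below. It is only an illustration under simplifying assumptions that are not in the source: each node runs its tasks one after another (the patent's slave nodes can run tasks in parallel), a moved task restarts from the beginning on the destination, and the names `Task`, `Node`, `busy_until`, and `find_move` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    duration: float   # full run time; a moved task restarts from the beginning
    remaining: float  # run time still needed on its current node
    deadline: float


@dataclass
class Node:
    name: str
    tasks: list = field(default_factory=list)

    def busy_until(self, now):
        # Assumption: tasks on a node run serially.
        return now + sum(t.remaining for t in self.tasks)


def find_move(nodes, new_task, now):
    """Simulate moving one task; return (task, source, destination) or None."""
    for src in nodes:
        for task in src.tasks:
            # Time at which the source would be free if this task left it.
            src_free = src.busy_until(now) - task.remaining
            if src_free + new_task.duration > new_task.deadline:
                continue  # moving this task would not make enough room
            for dst in nodes:
                if dst is src:
                    continue
                # The moved task restarts on dst after dst's current work.
                if dst.busy_until(now) + task.duration <= task.deadline:
                    return task, src, dst
    return None


a = Node("a", [Task("T1", duration=4, remaining=3, deadline=10)])
b = Node("b", [Task("T2", duration=2, remaining=2, deadline=4)])
new = Task("new", duration=3, remaining=3, deadline=3)
move = find_move([a, b], new, now=0)
print(None if move is None else (move[0].name, move[1].name, move[2].name))
```

In this example neither node can take the new task directly, but moving `T1` from node `a` to node `b` still lets `T1` meet its own deadline and frees node `a` for the new task. The function only simulates; it does not change any node's task list.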

If no slave node 120 that can finish the requested task within the deadline can be prepared even by moving tasks, and there is a locked slave node 120, the schedule processing unit 112 determines whether the requested task could finish within the deadline if the lock were released. If so, the schedule processing unit 112 releases the lock on that slave node 120 and selects the unlocked slave node 120 as the slave node 120 to which the requested task is assigned.

スケジュール処理部１１２は，ここまでの処理で，処理要求のタスクを割り当てる対象のスレーブノード１２０が見つけられなかった場合，スケールアウトにより，システム１００にスレーブノード１２０を増設すると判断する。 The schedule processing unit 112 determines that the slave node 120 is to be added to the system 100 by scale-out when the slave node 120 to which the processing request task is assigned cannot be found in the processing so far.

スケールアウト処理部１１４は，スケールアウトによって，システム１００にコンピュータノードを増設する処理を行う。より具体的には，スケールアウト処理部１１４は，ＩａａＳ側で用意しているインタフェースを用いて，ＩａａＳ側の管理コンピュータにアクセスし，自システム１００へのスケールアウトを要求する。 The scale-out processing unit 114 performs processing for adding computer nodes to the system 100 by scale-out. More specifically, the scale-out processing unit 114 accesses the management computer on the IaaS side using an interface prepared on the IaaS side, and requests scale-out to the own system 100.

スレーブ通信部１１５は，スレーブノード１２０との通信を行う。例えば，スレーブ通信部１１５は，受け付けた処理要求のタスクを割り当てる対象のスレーブノード１２０に対して，該タスクの処理を依頼する。また，スレーブ通信部１１５は，スレーブノード１２０から，依頼したタスクの処理結果を応答として受け付ける。また，スレーブ通信部１１５は，タスクの移動が行われる際に，移動元のスレーブノード１２０に対して該当タスクの停止を通知し，移動先のスレーブノード１２０に対して該当タスクの実行を通知する。 The slave communication unit 115 communicates with the slave nodes 120. For example, the slave communication unit 115 requests the slave node 120 to which the task of a received processing request is assigned to process that task. The slave communication unit 115 also receives the processing result of the requested task from the slave node 120 as a response. In addition, when a task is moved, the slave communication unit 115 notifies the source slave node 120 to stop the task and notifies the destination slave node 120 to execute it.

リスケジュール処理部１１６は，例えば定期的に，スケジュール情報記憶部１１３に記憶された管理情報を参照し，実行するタスクがないスレーブノード１２０を，スケールシュリンクの対象とする。 For example, the reschedule processing unit 116 periodically refers to the management information stored in the schedule information storage unit 113 and sets the slave node 120 having no task to be executed as a target of scale shrink.

また，リスケジュール処理部１１６は，スケジュール情報記憶部１１３に記憶された管理情報を参照して，あるスレーブノード１２０で動作するタスクを，他のスレーブノード１２０に移動するシミュレーションを実行する。リスケジュール処理部１１６は，他のスレーブノード１２０に移動しても納期内に処理終了可能なタスクを検出する。 In addition, the reschedule processing unit 116 refers to the management information stored in the schedule information storage unit 113 and executes a simulation of moving a task operating on a certain slave node 120 to another slave node 120. The reschedule processing unit 116 detects a task that can be processed within the delivery date even if it moves to another slave node 120.

リスケジュール処理部１１６は，検出されたタスクを移動することで，実行するタスクがないスレーブノード１２０を用意できるかを判定する。タスクを移動することで実行するタスクがないスレーブノード１２０を用意できると判定される場合，リスケジュール処理部１１６は，該当タスクを他のスレーブノード１２０に移動する処理を，スレーブ通信部１１５を介して実行する。タスクの移動については，上述のスケジュール処理部１１２の場合と同様である。リスケジュール処理部１１６は，タスクの移動により実行するタスクがなくなったスレーブノード１２０を，スケールシュリンクの対象とする。 The reschedule processing unit 116 determines whether moving the detected tasks can make available a slave node 120 that has no task to execute. If so, the reschedule processing unit 116 moves the tasks to other slave nodes 120 via the slave communication unit 115. Tasks are moved in the same way as described above for the schedule processing unit 112. The reschedule processing unit 116 makes the slave node 120 that is left with no task to execute after the move a target of scale shrink.

また，リスケジュール処理部１１６は，検出されたタスクを移動することで，実行するタスクの数が所定数以下となるスレーブノード１２０を用意できるかを判定する。タスクの移動により実行するタスクの数が所定数以下となるスレーブノード１２０を用意できると判定される場合，リスケジュール処理部１１６は，該当タスクを他のスレーブノード１２０に移動する処理を，スレーブ通信部１１５を介して実行する。リスケジュール処理部１１６は，タスクの移動により実行するタスクの数が所定数以下となるスレーブノード１２０にロックを設定する。ロック中のスレーブノード１２０には，原則として新たな処理要求のタスクが割り当てられないので，残ったタスクの処理が終了した後で，スケールシュリンクの対象となる。 In addition, the reschedule processing unit 116 determines whether moving the detected tasks can make available a slave node 120 whose number of tasks to execute is at or below a predetermined number. If so, the reschedule processing unit 116 moves the tasks to other slave nodes 120 via the slave communication unit 115. The reschedule processing unit 116 sets a lock on the slave node 120 whose number of tasks to execute has fallen to or below the predetermined number as a result of the move. As a rule, no new requested task is assigned to a locked slave node 120, so it becomes a target of scale shrink once its remaining tasks have finished.
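The rescheduling behaviour described above (shrink idle nodes, empty a node by moving tasks, otherwise lock a node left with few tasks) can be sketched as follows. This is a simplified illustration under stated assumptions, not the embodiment's implementation: the `Task`, `Node`, `plan_moves` and `reschedule` names are hypothetical, tasks are moved by restart, and a destination's free capacity is treated as a single number rather than a schedule over time.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    usage: int      # resource units consumed while running
    duration: int   # full processing time (a restart begins from scratch)
    deadline: int   # delivery date


@dataclass
class Node:
    name: str
    capacity: int
    tasks: list = field(default_factory=list)
    locked: bool = False

    def free(self):
        return self.capacity - sum(t.usage for t in self.tasks)


def plan_moves(source, others, now):
    """Simulate moving tasks off `source`; returns [(task, destination), ...]."""
    free = {n.name: n.free() for n in others}
    moves = []
    for task in source.tasks:
        if now + task.duration > task.deadline:
            continue  # a restart would miss the delivery date
        for dest in others:
            if free[dest.name] >= task.usage:
                free[dest.name] -= task.usage
                moves.append((task, dest))
                break
    return moves


def reschedule(nodes, now, lock_threshold=1):
    """Return (nodes to scale-shrink, nodes newly locked)."""
    shrink = [n for n in nodes if not n.tasks]      # idle nodes
    excluded = {n.name for n in shrink}
    locked = []
    for source in nodes:
        if source.name in excluded or source.locked:
            continue
        others = [n for n in nodes
                  if n is not source and n.name not in excluded and not n.locked]
        moves = plan_moves(source, others, now)
        remaining = len(source.tasks) - len(moves)
        if not moves or remaining > lock_threshold:
            continue
        for task, dest in moves:                    # apply the simulated moves
            source.tasks.remove(task)
            dest.tasks.append(task)
        if remaining == 0:
            shrink.append(source)
            excluded.add(source.name)
        else:
            source.locked = True
            locked.append(source)
    return shrink, locked


# Made-up numbers loosely following the lock example (predetermined number = 1).
a = Node("a", 10, [Task("A#1", 5, 3, 3)])
b = Node("b", 10, [Task("B#1", 4, 3, 2), Task("B#2", 3, 2, 3)])
shrink, locked = reschedule([a, b], now=0)
print([n.name for n in shrink], [n.name for n in locked])
```

In this run node b keeps only B#1 and is locked, so it can be removed once B#1 finishes. The greedy first-fit choice of destination is a design shortcut of the sketch; the text does not prescribe how destinations are chosen.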

スケールシュリンク処理部１１７は，スケールシュリンクによって，対象のスレーブノード１２０をシステム１００から削除する処理を行う。より具体的には，スケールシュリンク処理部１１７は，ＩａａＳ側で用意しているインタフェースを用いて，ＩａａＳ側の管理コンピュータにアクセスし，対象スレーブノード１２０のスケールシュリンクを要求する。 The scale shrink processing unit 117 performs processing for deleting the target slave node 120 from the system 100 by scale shrink. More specifically, the scale shrink processing unit 117 accesses the management computer on the IaaS side using an interface prepared on the IaaS side, and requests scale shrink of the target slave node 120.

スレーブノード１２０は，マスタ通信部１２１，アプリケーション１２２を備える。図４に示すように，スレーブノード１２０では，複数種類のアプリケーション１２２ａ，ｂ，... を稼働することが可能である。処理要求のタスクが割り当てられた際には，該処理要求に対応するアプリケーション１２２によって，タスクの処理が実行される。 The slave node 120 includes a master communication unit 121 and applications 122. As shown in FIG. 4, the slave node 120 can run multiple types of applications 122a, 122b, and so on. When a requested task is assigned, the task is processed by the application 122 corresponding to the processing request.

マスタ通信部１２１は，マスタノード１１０との通信を行う。例えば，マスタ通信部１２１は，マスタノード１１０からタスクの実行要求や停止要求を受け付け，タスクの実行開始または実行停止の処理を行う。また，タスクの処理が終了した場合には，マスタ通信部１２１は，マスタノード１１０に処理結果の応答を返す。 The master communication unit 121 communicates with the master node 110. For example, the master communication unit 121 receives a task execution request or stop request from the master node 110, and performs task start or stop processing. When the task processing is completed, the master communication unit 121 returns a processing result response to the master node 110.

図５は，本実施の形態によるノードを実現する仮想マシンシステムの例を示す図である。 FIG. 5 is a diagram illustrating an example of a virtual machine system that implements a node according to the present embodiment.

図５に示す仮想マシンシステムは，例えば，１台の物理マシン上で１または複数台の仮想マシンが動作するシステムである。図５に示す仮想マシンシステムの例では，ＣＰＵ１１やメモリ１２などの，物理マシンであるコンピュータ１０の資源を利用して，マスタノード１１０やスレーブノード１２０を含む複数の仮想マシン３０が動作している。なお，図５に示す仮想マシンシステムでは，マスタノード１１０とスレーブノード１２０とが同じ物理マシンで実現されているが，クラウドコンピューティングの環境では，マスタノード１１０や，各スレーブノード１２０が，それぞれ異なる物理マシンで実現されることも多い。 The virtual machine system shown in FIG. 5 is, for example, a system in which one or more virtual machines run on a single physical machine. In the example of FIG. 5, a plurality of virtual machines 30, including the master node 110 and the slave nodes 120, run using the resources of the computer 10, which is a physical machine, such as the CPU 11 and the memory 12. In the virtual machine system shown in FIG. 5, the master node 110 and the slave node 120 are implemented on the same physical machine, but in a cloud computing environment the master node 110 and each slave node 120 are often implemented on different physical machines.

図５に示す仮想マシンシステムにおいて，ハイパーバイザ２０は，コンピュータ１０を利用した仮想化環境を実現するソフトウェアである。マスタノード１１０やスレーブノード１２０を含む仮想マシン３０は，ハイパーバイザ２０上に構築可能である。 In the virtual machine system shown in FIG. 5, the hypervisor 20 is software that realizes a virtual environment using the computer 10. The virtual machine 30 including the master node 110 and the slave node 120 can be constructed on the hypervisor 20.

図４に示すマスタノード１１０，スレーブノード１２０を仮想マシンで実現するコンピュータ１０は，例えば，ＣＰＵ１１，主記憶となるメモリ１２，記憶装置１３，通信装置１４，媒体読取・書込装置１５，入力装置１６，出力装置１７等のハードウェアを備える。記憶装置１３は，例えばＨＤＤ（Hard Disk Drive ）等の外部記憶装置や，補助記憶装置などである。媒体読取・書込装置１５は，例えばＣＤ−Ｒ（Compact Disc Recordable ）ドライブやＤＶＤ−Ｒ（Digital Versatile Disc Recordable ）ドライブなどである。入力装置１６は，例えばキーボード・マウスなどである。出力装置１７は，例えばディスプレイ等の表示装置などである。なお，コンピュータ１０のハードウェア構成は，必要に応じて任意の設計が可能である。 The computer 10 that realizes the master node 110 and the slave node 120 shown in FIG. 4 as virtual machines includes, for example, a CPU 11, a memory 12 serving as a main memory, a storage device 13, a communication device 14, a medium reading / writing device 15, and an input device. 16 and hardware such as an output device 17 are provided. The storage device 13 is an external storage device such as an HDD (Hard Disk Drive), an auxiliary storage device, or the like. The medium reading / writing device 15 is, for example, a CD-R (Compact Disc Recordable) drive or a DVD-R (Digital Versatile Disc Recordable) drive. The input device 16 is, for example, a keyboard / mouse. The output device 17 is a display device such as a display. Note that the hardware configuration of the computer 10 can be arbitrarily designed as necessary.

図４に示すマスタノード１１０，スレーブノード１２０および各ノードが備える機能部は，コンピュータ１０が備えるＣＰＵ１１，メモリ１２等のハードウェアと，ソフトウェアプログラムとによって実現することが可能である。コンピュータ１０が実行可能なプログラムは，例えば，記憶装置１３に記憶され，その実行時にメモリ１２に読み出され，ＣＰＵ１１により実行される。 The master node 110, the slave node 120, and the functional units included in each node illustrated in FIG. 4 can be realized by hardware such as the CPU 11 and the memory 12 included in the computer 10 and a software program. A program executable by the computer 10 is stored in, for example, the storage device 13, read into the memory 12 at the time of execution, and executed by the CPU 11.

コンピュータ１０は，可搬型記録媒体から直接プログラムを読み取り，そのプログラムに従った処理を実行することもできる。また，コンピュータ１０は，サーバコンピュータからプログラムが転送されるごとに，逐次，受け取ったプログラムに従った処理を実行することもできる。さらに，このプログラムは，コンピュータで読み取り可能な記録媒体に記録しておくことができる。 The computer 10 can also read a program directly from a portable recording medium and execute processing according to the program. In addition, each time the program is transferred from the server computer, the computer 10 can sequentially execute processing according to the received program. Furthermore, this program can be recorded on a computer-readable recording medium.

以下，本実施の形態によるマスタノード１１０による処理について，より具体的な例を用いて説明する。 Hereinafter, processing by the master node 110 according to the present embodiment will be described using a more specific example.

図６〜図９は，本実施の形態のマスタノードにおけるスケジュール処理部による処理要求のタスクの割り当てシミュレーションの例を説明する図である。 FIG. 6 to FIG. 9 are diagrams for explaining an example of processing request task allocation simulation by the schedule processing unit in the master node of this embodiment.

図６（Ａ）は，タスクの表記の例を示す。図６（Ａ）において，塗り潰しされた図形が，タスクの処理を表す。タスクの処理を示す図形の縦の長さは該タスクのリソース消費量を示し，横の長さは該タスクの処理時間を示す。また，図６（Ａ）において，白抜き図形の右端は，該タスクの納期を示す。納期は，クライアント２００に指定されたタスクの処理を終了する期限である。以下の説明において，タスクについてはすべて図６（Ａ）に示す通りの表記が行われる。 FIG. 6A shows an example of task notation. In FIG. 6A, the filled shape represents the processing of a task. The height of the shape indicates the resource consumption of the task, and its width indicates the processing time of the task. The right end of the outlined shape in FIG. 6A indicates the delivery date of the task. The delivery date is the deadline, specified by the client 200, by which processing of the task must finish. In the following description, all tasks are drawn using the notation shown in FIG. 6A.

図６（Ｂ）は，クライアント２００から時刻ｔ₀に処理要求を受け付けた時点における，システム１００が備える各スレーブノード１２０ａ，ｂが実行するタスクのスケジュールを示す。図６（Ｂ）に示すように，この時点でシステム１００が備えるスレーブノード１２０は，＃ａのスレーブノード１２０ａと＃ｂのスレーブノード１２０ｂの２つである。なお，スケジュール情報記憶部１１３には，例えば図６（Ｂ）のグラフに示すようなシステム１００が備えるスレーブノード１２０が実行するタスクのスケジュールをデータ化した管理情報が記憶されている。図６（Ｂ）において，グラフの縦軸はスレーブノードのリソース消費量を示し，横軸は時間の経過を示す。ｔ₋₁〜ｔ₃は，等間隔の時刻を示す。 FIG. 6B shows the schedule of tasks executed by the slave nodes 120a and 120b of the system 100 at the point when a processing request is received from the client 200 at time t₀. As shown in FIG. 6B, the system 100 has two slave nodes 120 at this point: slave node 120a (#a) and slave node 120b (#b). The schedule information storage unit 113 stores management information that represents as data the schedule of tasks executed by the slave nodes 120 of the system 100, for example as shown in the graph of FIG. 6B. In FIG. 6B, the vertical axis of the graph indicates the resource consumption of a slave node and the horizontal axis indicates the passage of time. t₋₁ to t₃ denote equally spaced times.

図６（Ｃ）は，時刻ｔ₀の時点でクライアント２００から受け付けた処理要求のタスクＸを示す図形である。図６（Ｃ）に示すように，処理要求のタスクＸの処理については，他のタスクと区別がつくように，ハッチングで表されている。図６（Ｃ）に示すように，処理要求のタスクＸについては，時刻ｔ₃の納期が指定されている。 FIG. 6C shows task X of the processing request received from the client 200 at time t₀. As shown in FIG. 6C, the processing of task X is drawn with hatching so that it can be distinguished from other tasks. As shown in FIG. 6C, a delivery date of time t₃ is specified for task X.

なお，タスクの処理時間やリソース消費量の推定については，過去の実績から求める方法や理論的に求める方法などで多数の周知技術が存在するので，ここでは詳細な説明を省略する。タスクの処理時間の推定は，タスクが対象とする処理によって異なる。例えば画像の処理の処理時間については，画像ファイルサイズの影響を受ける。データベースを用いた処理では，データベースへのアクセス回数の影響を受ける。また，タスクのリソース消費量については，例えば，タスクの処理を行うアプリケーション１２２ごとに値が決まっている場合にはその値を用いることができる。スレーブノード１２０にリソース消費量を問い合わせるようにしてもよい。その他，様々な周知技術の利用が可能である。 Note that there are many known techniques for estimating task processing time and resource consumption, such as a method obtained from past results or a method obtained theoretically, and a detailed description thereof will be omitted here. The estimation of task processing time differs depending on the processing targeted by the task. For example, the processing time of image processing is affected by the image file size. Processing using a database is affected by the number of accesses to the database. For example, when the resource consumption amount of a task is determined for each application 122 that performs task processing, the value can be used. The slave node 120 may be inquired about the resource consumption. In addition, various known techniques can be used.
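As one illustration of estimating processing time from past results, the sketch below fits processing time to image file size with a least-squares line through the origin. The embodiment leaves estimation to known techniques, so this is only an example of such a technique; the function names and the sample history are hypothetical.

```python
def fit_seconds_per_mb(history):
    """Least-squares slope through the origin for (size_mb, seconds) samples."""
    sxx = sum(size * size for size, _ in history)
    sxy = sum(size * secs for size, secs in history)
    if sxx == 0:
        raise ValueError("history needs at least one sample with non-zero size")
    return sxy / sxx


def estimate_seconds(history, size_mb):
    """Estimated processing time for an image of `size_mb` megabytes."""
    return fit_seconds_per_mb(history) * size_mb


# Hypothetical past measurements: (image size in MB, processing time in seconds).
history = [(10.0, 2.0), (20.0, 4.1), (40.0, 7.9)]
print(estimate_seconds(history, 30.0))
```

A database workload could be handled the same way with the number of database accesses in place of the file size.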

マスタノード１１０において，スケジュール処理部１１２は，処理要求を受け付けると，まず，現状で，処理要求のタスクをスレーブノード１２０に割り当て可能であるかを検証する。 In the master node 110, when receiving the processing request, the schedule processing unit 112 first verifies whether the task of the processing request can be assigned to the slave node 120 at present.

図７（Ａ）は，処理要求のタスクＸを，時刻ｔ₀に処理実行を開始するように，＃ａのスレーブノード１２０ａに対して割り当てした検証の例を示す。また，図７（Ｂ）は，処理要求のタスクＸを，時刻ｔ₀に処理実行を開始するように，＃ｂのスレーブノード１２０ｂに対して割り当てした検証の例を示す。いずれも，ノードのリソース消費量が，ノードのリソース限界量を超えてしまうため，検証結果はＮＧとなる。 FIG. 7A shows an example of verification in which task X of the processing request is assigned to slave node 120a (#a) so that its execution starts at time t₀. FIG. 7B shows an example of verification in which task X is assigned to slave node 120b (#b) so that its execution starts at time t₀. In both cases the resource consumption of the node would exceed the node's resource limit, so the verification result is NG.

図７（Ｃ）は，最も早く処理が終了する，＃ａのスレーブノード１２０ａが実行するタスクＡ＃１の処理終了後に，処理要求のタスクＸをスケジュールした検証の例を示す。この場合，処理要求のタスクＸの処理終了時刻が納期ｔ₃を超えてしまうため，検証結果はＮＧとなる。 FIG. 7C shows an example of verification in which task X of the processing request is scheduled after the end of task A#1, which is executed by slave node 120a (#a) and is the earliest task to finish. In this case the end time of task X would exceed the delivery date t₃, so the verification result is NG.
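The two kinds of check used in these verifications (resource limit and delivery date) can be sketched as follows. This is a minimal model under assumptions made here: time is measured in ticks with t₀ = 0, resource consumption is an integer number of units, and the `Scheduled`, `peak_usage` and `can_place` names and all numbers are hypothetical rather than taken from the figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheduled:
    name: str
    start: int      # start time in ticks (t0 = 0)
    duration: int   # processing time in ticks
    usage: int      # resource units consumed while running

    @property
    def end(self):
        return self.start + self.duration


def usage_at(tasks, t):
    """Total resource consumption of the tasks running at time t."""
    return sum(task.usage for task in tasks if task.start <= t < task.end)


def peak_usage(tasks, start, end):
    """Peak consumption in [start, end); it can only rise where a task starts."""
    points = [start] + [task.start for task in tasks if start < task.start < end]
    return max(usage_at(tasks, t) for t in points)


def can_place(tasks, capacity, usage, duration, deadline, start):
    """True if a new task fits under the resource limit and meets its delivery date."""
    finish = start + duration
    if finish > deadline:
        return False  # NG: would finish after the delivery date
    return peak_usage(tasks, start, finish) + usage <= capacity


# Made-up schedule for one node with a resource limit of 10 units.
node_a = [Scheduled("A#1", -1, 3, 5), Scheduled("A#2", 0, 3, 3)]
# Start at t0: NG because 8 + 6 units exceed the limit.
print(can_place(node_a, 10, usage=6, duration=2, deadline=3, start=0))  # False
# Start after A#1 ends at t2: NG because the task would finish at t4 > t3.
print(can_place(node_a, 10, usage=6, duration=2, deadline=3, start=2))  # False
```

Evaluating the load only at the window start and at task start times is sufficient because the combined consumption is piecewise constant and increases only when a task begins.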

次に，スケジュール処理部１１２は，他のタスクで再起動しても納期までに処理終了可能なタスクを移動する検証を行う。図８（Ａ）に示すように，＃ｂのスレーブノード１２０ｂが実行するタスクＢ＃３は，リソース消費量の面でも＃ａのスレーブノード１２０ａに移動可能であり，＃ａのスレーブノード１２０ａで再起動しても納期までに処理終了可能である。図８（Ｂ）は，図８（Ａ）に示すタスクの移動によってリソース消費量が減った＃ｂのスレーブノード１２０ｂに対して，処理要求のタスクＸを割り当てした検証の例を示す。タスクの移動を行っても，ノードのリソース消費量が，ノードのリソース限界量を超えてしまうため，検証結果はＮＧとなる。 Next, the schedule processing unit 112 verifies moving a task that can still finish by its delivery date even if restarted on another slave node. As shown in FIG. 8A, task B#3 executed by slave node 120b (#b) can be moved to slave node 120a (#a) in terms of resource consumption, and can still finish by its delivery date even if restarted on slave node 120a (#a). FIG. 8B shows an example of verification in which task X of the processing request is assigned to slave node 120b (#b), whose resource consumption has been reduced by the move shown in FIG. 8A. Even after the move, the resource consumption of the node would exceed the node's resource limit, so the verification result is NG.

他にロック中のスレーブノード１２０が存在しないので，スケジュール処理部１１２は，スケールアウト処理部１１４によるスケールアウトの処理でスレーブノード１２０を増設すると判断する。図９は，スケールアウト処理部１１４によって増設された＃ｃのスレーブノード１２０ｃに，処理要求のタスクＸが割り当てられた状態を示す。 Since there is no other slave node 120 that is locked, the schedule processing unit 112 determines to add the slave nodes 120 by the scale-out processing by the scale-out processing unit 114. FIG. 9 shows a state in which the task X of the processing request is assigned to the #c slave node 120c added by the scale-out processing unit 114.

このように，本実施の形態によるマスタノード１１０は，現状のスレーブノード１２０の状態でスケジューリングを含めた検証を行っても，さらにタスクの移動を行う検証を行っても，納期内に処理が終了するように処理要求のタスクを割り当てできない場合にのみ，スケールアウトを実行する。これにより，無駄なスケールアウトを抑制できるので，無駄な課金が発生しない，効率がよいシステム１００の運用が可能となる。 As described above, the master node 110 according to the present embodiment executes scale-out only when the requested task cannot be assigned so as to finish within the delivery date, either by verification including scheduling on the slave nodes 120 in their current state or by further verification involving task moves. This suppresses unnecessary scale-out, so the system 100 can be operated efficiently without incurring unnecessary charges.

図１０，図１１は，本実施の形態のマスタノードにおけるリスケジュール処理部によるタスクの移動シミュレーションの例を説明する図である。 10 and 11 are diagrams for explaining an example of task movement simulation by the reschedule processing unit in the master node of the present embodiment.

図１０は，時刻ｔ₀の時点での，システム１００が備える各スレーブノード１２０ａ，ｂ，ｃが実行するタスクのスケジュールを示す。図１０に示すように，この時点でシステム１００が備えるスレーブノード１２０は，＃ａのスレーブノード１２０ａと，＃ｂのスレーブノード１２０ｂと，＃ｃのスレーブノード１２０ｃの３つである。 FIG. 10 shows the schedule of tasks executed by the slave nodes 120a, 120b and 120c of the system 100 at time t₀. As shown in FIG. 10, the system 100 has three slave nodes 120 at this point: slave node 120a (#a), slave node 120b (#b) and slave node 120c (#c).

まず，マスタノード１１０において，リスケジュール処理部１１６は，定期的にスケジュール情報記憶部１１３を参照して，実行するタスクがないスレーブノード１２０があるかをチェックする。図１０に示すように，この時点では，すべてのスレーブノード１２０が処理を実行しているので，実行するタスクがないスレーブノード１２０はない。このとき，リスケジュール処理部１１６は，現状のままでスケールシュリンクできるスレーブノード１２０はないと判断する。 First, in the master node 110, the reschedule processing unit 116 periodically checks the schedule information storage unit 113 to check whether there is a slave node 120 that has no task to be executed. As shown in FIG. 10, since all the slave nodes 120 are executing the process at this time, there is no slave node 120 having no task to be executed. At this time, the reschedule processing unit 116 determines that there is no slave node 120 that can be scale-shrinked as it is.

次に，リスケジュール処理部１１６は，他のタスクで再起動しても納期までに処理終了可能なタスクを移動する検証を行う。 Next, the reschedule processing unit 116 verifies moving a task that can still finish by its delivery date even if restarted on another slave node.

図１１（Ａ）は，＃ｂのスレーブノード１２０ｂが実行するタスクＢ＃１を，＃ａのスレーブノード１２０ａで再起動する検証の例を示す。図１１（Ａ）に示すように，タスクＢ＃１を他のスレーブノード１２０で再起動すると，処理の終了時刻が納期ｔ₂を超えてしまうため，検証結果はＮＧとなる。リスケジュール処理部１１６は，タスクを移動して＃ｂのスレーブノード１２０ｂに対してスケールシュリンクを行うことはできないと判断する。 FIG. 11A shows an example of verification in which task B#1 executed by slave node 120b (#b) is restarted on slave node 120a (#a). As shown in FIG. 11A, if task B#1 is restarted on another slave node 120, its end time would exceed the delivery date t₂, so the verification result is NG. The reschedule processing unit 116 determines that slave node 120b (#b) cannot be scale-shrunk by moving tasks.

図１１（Ｂ）は，＃ｃのスレーブノード１２０ｃが実行するタスクＣ＃１を，＃ａのスレーブノード１２０ａで再起動する検証の例を示す。図１１（Ｂ）に示すように，タスクＣ＃１は，＃ａのスレーブノード１２０ａで再起動しても納期ｔ₃までに処理終了可能である。さらに，タスクＣ＃１を＃ａのスレーブノード１２０ａに移動すると，＃Ｃのスレーブノード１２０ｃが実行するタスクがなくなるため，＃Ｃのスレーブノード１２０ｃに対してスケールシュリンクを行うことが可能である。 FIG. 11B shows an example of verification in which task C#1 executed by slave node 120c (#c) is restarted on slave node 120a (#a). As shown in FIG. 11B, task C#1 can finish by the delivery date t₃ even if restarted on slave node 120a (#a). Moreover, once task C#1 is moved to slave node 120a (#a), slave node 120c (#c) has no task left to execute, so scale shrink can be performed on slave node 120c (#c).

リスケジュール処理部１１６は，スレーブ通信部１１５を介して，＃Ｃのスレーブノード１２０ｃにタスクＣ＃１の停止を指示し，＃Ａのスレーブノード１２０ａにタスクＣ＃１の実行を指示する。＃Ｃのスレーブノード１２０ｃが実行するタスクがなくなるので，リスケジュール処理部１１６は，＃Ｃのスレーブノード１２０ｃをスケールシュリンクの対象とする。スケールシュリンク処理部１１７は，＃Ｃのスレーブノード１２０ｃに対するスケールシュリンクの処理を実行する。 The reschedule processing unit 116 instructs the #C slave node 120c to stop the task C # 1 and instructs the #A slave node 120a to execute the task C # 1 via the slave communication unit 115. Since there is no task to be executed by the #C slave node 120c, the reschedule processing unit 116 sets the #C slave node 120c as the target of scale shrink. The scale shrink processing unit 117 executes scale shrink processing for the slave node 120c of #C.

このように，本実施の形態によるマスタノード１１０は，他のスレーブノード１２０にタスクを移動することで，実行するタスクがないスレーブノード１２０を積極的に作り，スケールシュリンクを実行する。これにより，無駄なスレーブノード１２０を積極的に削除することが可能となるので，無駄な課金が発生しない効率がよいシステム１００の運用が可能となる。 As described above, the master node 110 according to the present embodiment moves tasks to other slave nodes 120 to actively create slave nodes 120 that have no task to execute, and then executes scale shrink. This makes it possible to actively remove unneeded slave nodes 120, so the system 100 can be operated efficiently without incurring unnecessary charges.

図１２，図１３は，本実施の形態のマスタノードにおけるリスケジュール処理部によるスレーブノードにロック設定を行う例を説明する図である。 12 and 13 are diagrams illustrating an example in which lock setting is performed on a slave node by a reschedule processing unit in the master node of the present embodiment.

図１２は，時刻ｔ₀の時点での，システム１００が備える各スレーブノード１２０ａ，ｂが実行するタスクのスケジュールを示す。図１２に示すように，この時点でシステム１００が備えるスレーブノード１２０は，＃ａのスレーブノード１２０ａと＃ｂのスレーブノード１２０ｂの２つである。 FIG. 12 shows the schedule of tasks executed by the slave nodes 120a and 120b of the system 100 at time t₀. As shown in FIG. 12, the system 100 has two slave nodes 120 at this point: slave node 120a (#a) and slave node 120b (#b).

マスタノード１１０において，リスケジュール処理部１１６は，タスクを移動しても実行するタスクがないスレーブノード１２０が用意できない場合に，タスクを移動することで実行するタスクの数が所定数以下となるスレーブノード１２０を用意できるかを検証する。例えば，ここでは，所定数を１とする。 In the master node 110, when no slave node 120 without tasks can be made available even by moving tasks, the reschedule processing unit 116 verifies whether moving tasks can make available a slave node 120 whose number of tasks to execute is at or below a predetermined number. Here, for example, the predetermined number is 1.

図１３（Ａ）は，＃ｂのスレーブノード１２０ｂが実行するタスクＢ＃２を，＃ａのスレーブノード１２０ａで再起動する検証の例を示す。図１３（Ａ）に示すように，タスクＢ＃２は，＃ａのスレーブノード１２０ａで再起動しても納期ｔ₃までに処理終了可能である。なお，＃ｂのスレーブノード１２０ｂが実行するタスクＢ＃１を，＃ａのスレーブノード１２０ａで再起動しようとしても，処理の終了時刻が納期ｔ₂を超えてしまうため，タスクＢ＃１は移動できない。 FIG. 13A shows an example of verification in which task B#2 executed by slave node 120b (#b) is restarted on slave node 120a (#a). As shown in FIG. 13A, task B#2 can finish by the delivery date t₃ even if restarted on slave node 120a (#a). Task B#1 executed by slave node 120b (#b), by contrast, cannot be moved: if it were restarted on slave node 120a (#a), its end time would exceed the delivery date t₂.

タスクＢ＃２を＃ａのスレーブノード１２０ａに移動したとすると，図１３（Ｂ）に示すように，＃ｂのスレーブノード１２０ｂは，実行するタスクの数が所定数１以下となる。このとき，リスケジュール処理部１１６は，スレーブ通信部１１５を介して，＃Ｂのスレーブノード１２０ｂにタスクＢ＃２の停止を指示し，＃Ａのスレーブノード１２０ａにタスクＢ＃２の実行を指示する。 If the task B # 2 is moved to the slave node 120a of #a, as shown in FIG. 13B, the number of tasks executed by the slave node 120b of #b is 1 or less. At this time, the reschedule processing unit 116 instructs the slave node 120b of #B to stop task B # 2 and instructs the slave node 120a of #A to execute task B # 2 via the slave communication unit 115. To do.

また，図１３（Ｂ）に示すように，実行するタスクがＢ＃１のみとなった＃ｂのスレーブノード１２０ｂに，ロックを設定する。なお，ロックを設定したスレーブノード１２０の情報は，スケジュール情報記憶部１１３など，スケジュール処理部１１２がアクセス可能な記憶部に記録しておく。これで，＃ｂのスレーブノード１２０ｂには，原則として，新規の処理要求のタスクが割り当てされないので，残っている＃Ｂ１のタスクの終了後に，スケールシュリンク処理部１１７によって，＃ｂのスレーブノード１２０ｂを削除することができる。 In addition, as shown in FIG. 13B, a lock is set on slave node 120b (#b), whose only remaining task is B#1. Information on the locked slave node 120 is recorded in a storage unit accessible to the schedule processing unit 112, such as the schedule information storage unit 113. As a rule, no new requested task is then assigned to slave node 120b (#b), so after the remaining task B#1 finishes, the scale shrink processing unit 117 can remove slave node 120b (#b).

このように，本実施の形態によるマスタノード１１０は，他のスレーブノード１２０にタスクを移動して，実行するタスクが少ないスレーブノード１２０を積極的に作り，そのスレーブノード１２０に対する新規の処理要求のタスクの割り当てを抑制する。これにより，システム１００の環境が，実行するタスクがないスレーブノード１２０ができやすい環境となり，無駄なスレーブノード１２０を積極的に削除することが可能となるので，無駄な課金が発生しない効率がよいシステム１００の運用が可能となる。 As described above, the master node 110 according to the present embodiment moves tasks to other slave nodes 120 to actively create slave nodes 120 with few tasks to execute, and suppresses the assignment of new requested tasks to those slave nodes 120. As a result, the system 100 becomes an environment in which slave nodes 120 with no task to execute arise easily and unneeded slave nodes 120 can be actively removed, so the system 100 can be operated efficiently without incurring unnecessary charges.

図１４は，本実施の形態のマスタノードにおけるスケジュール処理部によるスレーブノードのロック解除を行う例を説明する図である。 FIG. 14 is a diagram illustrating an example in which the slave node is unlocked by the schedule processing unit in the master node according to the present embodiment.

ここで，システム１００が，図１３（Ｂ）に示す状態で，クライアント２００から，図６（Ｃ）に示すタスクＸの処理要求を受けたものとする。このとき，スケジュール処理部１１２は，処理要求のタスクＸを，図１４（Ａ）に示すようにロックされていない＃ａのスレーブノード１２０ａに割り当てしようとするが，ノードのリソース消費量がノードのリソース限界量を超えてしまうため，割り当てできない。さらに，他にタスクを移動できる相手先のスレーブノード１２０もない。 Here, it is assumed that the system 100 receives a task X processing request shown in FIG. 6C from the client 200 in the state shown in FIG. At this time, the schedule processing unit 112 tries to assign the task X of the processing request to the slave node 120a of #a which is not locked as shown in FIG. 14A, but the resource consumption of the node is Cannot allocate because the resource limit is exceeded. Further, there is no other slave node 120 that can move the task.

このとき，スケジュール処理部１１２は，ロック中の＃ｂのスレーブノード１２０ｂのロックを解除すれば，＃ｂのスレーブノード１２０ｂで処理要求のタスクＸが納期までに処理終了可能かを検証する。図１４（Ｂ）に示すように，＃ｂのスレーブノード１２０ｂで処理要求のタスクＸが納期までに処理終了可能であるので，スケジュール処理部１１２は，ロック中の＃ｂのスレーブノード１２０ｂのロックを解除し，＃ｂのスレーブノード１２０ｂに処理要求のタスクＸを割り当てる。 At this time, the schedule processing unit 112 verifies whether the task X of the processing request can be completed by the delivery date at the slave node 120b of #b if the lock of the locked slave node 120b of #b is released. As shown in FIG. 14B, since the task X of the processing request can be completed by the due date on the slave node 120b of #b, the schedule processing unit 112 locks the slave node 120b of #b being locked. And the task X of the processing request is assigned to the slave node 120b of #b.

このように，ロック中のスレーブノード１２０には，他のスレーブノード１２０に処理要求のタスクを割り当てできない場合にのみ，処理要求のタスクの割り当てが行われる。これにより，スケールシュリンクを実行しやすい環境をギリギリまで維持しながら，スケールアウトの発生を抑制できるので，無駄な課金が発生しない効率がよいシステム１００の運用が可能となる。 As described above, the processing request task is assigned to the locked slave node 120 only when the processing request task cannot be assigned to another slave node 120. As a result, the occurrence of scale-out can be suppressed while maintaining an environment in which scale shrinking is easily performed to the limit, so that it is possible to operate the system 100 with high efficiency without causing unnecessary charging.

なお，本実施の形態によるマスタノード１１０は，納期内にタスクの処理が終了することを最優先としているので，システム１００では常に最低限の処理性能が維持されている。 The master node 110 according to the present embodiment places the highest priority on the completion of task processing within the delivery date, and therefore the system 100 always maintains the minimum processing performance.

図１５，図１６は，本実施の形態のマスタノードによる処理要求受け付け時の処理フローチャートである。 FIG. 15 and FIG. 16 are processing flowcharts when a processing request is accepted by the master node of this embodiment.

マスタノード１１０において，クライアント通信部１１１は，クライアントから処理要求を受け付ける（ステップＳ１０）。ここで受け付ける処理要求には，納期が指定されている。クライアント通信部１１１は，処理要求が受け入れ可能かを判定する（ステップＳ１１）。処理要求が受け入れ可能でなければ（ステップＳ１１のＮＯ），クライアント通信部１１１は，クライアントに処理要求の受け入れが不可能である旨を応答し（ステップＳ１２），処理を終了する。 In the master node 110, the client communication unit 111 receives a processing request from the client (step S10). A delivery date is specified in the processing request accepted here. The client communication unit 111 determines whether the processing request can be accepted (step S11). If the processing request is not acceptable (NO in step S11), the client communication unit 111 responds to the client that the processing request cannot be accepted (step S12) and ends the processing.

処理要求が受け入れ可能であれば（ステップＳ１１のＹＥＳ），スケジュール処理部１１２は，スケジュール情報記憶部１１３の管理情報を参照し，現状のスレーブノード１２０に処理要求のタスクを割り当てるシミュレーションを行う（ステップＳ１３）。ここでは，例えば，図７に示すようにスケジューリングを含めた検証が行われる。 If the processing request is acceptable (YES in step S11), the schedule processing unit 112 refers to the management information in the schedule information storage unit 113, and performs a simulation of assigning the processing request task to the current slave node 120 (step). S13). Here, for example, verification including scheduling is performed as shown in FIG.

スケジュール処理部１１２は，現状のスレーブノード１２０に対する処理要求のタスクの割り当てが可能かを判定する（ステップＳ１４）。処理要求のタスクの割り当てが可能であれば（ステップＳ１４のＹＥＳ），スケジュール処理部１１２は，処理要求のタスクを割り当てる先のスレーブノード１２０を決定する（ステップＳ１５）。スレーブ通信部１１５は，決定されたスレーブノード１２０に処理要求のタスクの処理を依頼する（ステップＳ２２）。 The schedule processing unit 112 determines whether it is possible to assign a processing request task to the current slave node 120 (step S14). If the processing request task can be allocated (YES in step S14), the schedule processing unit 112 determines the slave node 120 to which the processing request task is allocated (step S15). The slave communication unit 115 requests the determined slave node 120 to process the processing request task (step S22).

処理要求のタスクの割り当てが可能でなければ（ステップＳ１４のＮＯ），スケジュール処理部１１２は，スケジュール情報記憶部１１３の管理情報を参照し，既存のタスクを移動して，処理要求のタスクを割り当てるシミュレーションを行う（ステップＳ１６）。ここでは，例えば，図８に示すように，他のスレーブノード１２０に移動して再実行しても納期内に処理を終了可能な既存のタスクがある場合に，その既存タスクを移動してリソースに空きができたスレーブノード１２０に処理要求のタスクを割り当てることができるかが検証される。 If the requested task cannot be assigned (NO in step S14), the schedule processing unit 112 refers to the management information in the schedule information storage unit 113 and simulates moving an existing task and then assigning the requested task (step S16). Here, for example, as shown in FIG. 8, if there is an existing task that can still finish within its delivery date even if moved to another slave node 120 and re-executed, it is verified whether the requested task can be assigned to the slave node 120 whose resources are freed by moving that existing task.

スケジュール処理部１１２は，既存タスクを移動して処理要求のタスクの割り当てが可能となるかを判定する（ステップＳ１７）。処理要求のタスクの割り当てが可能となれば（ステップＳ１７のＹＥＳ），スケジュール処理部１１２は，該当既存タスクの移動を実行する（ステップＳ１８）。より具体的には，スケジュール処理部１１２は，スレーブ通信部１１５を介して，既存タスクの移動元のスレーブノード１２０に該タスクの停止を依頼する。また，スケジュール処理部１１２は，スレーブ通信部１１５を介して，既存タスクの移動先のスレーブノード１２０に該タスクの実行を依頼する。スレーブ通信部１１５は，既存タスクの移動によりリソースに空きができたスレーブノード１２０に処理要求のタスクの処理を依頼する（ステップＳ２２）。 The schedule processing unit 112 determines whether or not it is possible to assign a processing request task by moving an existing task (step S17). If the processing request task can be assigned (YES in step S17), the schedule processing unit 112 moves the existing task (step S18). More specifically, the schedule processing unit 112 requests the slave node 120 that is the source of the existing task to stop the task via the slave communication unit 115. In addition, the schedule processing unit 112 requests the execution of the task to the slave node 120 to which the existing task is moved via the slave communication unit 115. The slave communication unit 115 requests the processing of the requested task to the slave node 120 whose resources have become free due to the movement of the existing task (step S22).

If the task of the processing request still cannot be assigned (NO in step S17), the schedule processing unit 112 determines whether there is a locked slave node 120 and whether releasing the lock on that slave node 120 would make it possible to assign the task of the processing request (step S19). If releasing the lock makes the assignment possible (YES in step S19), the schedule processing unit 112 releases the lock on the locked slave node 120 (step S20). The slave communication unit 115 requests the slave node 120 whose lock has been released to process the task of the processing request (step S22). This is the case shown in the example of FIG. 14.

If there is no locked slave node 120, or if the task of the processing request cannot be assigned even when the lock is released (NO in step S19), the scale-out processing unit 114 executes the scale-out process (step S21). The scale-out adds a slave node 120 to the system 100. The slave communication unit 115 requests the added slave node 120 to process the task of the processing request (step S22). This is the case shown in the example of FIG. 9.
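The order of the checks from step S14 through step S21 can be summarized in a short sketch. It is a simplified illustration under stated assumptions, not the embodiment itself: the integer `capacity`/`demand` resource model, the names `Task`, `Node`, `fits`, `move_frees_room`, and `decide`, and the choice to skip locked nodes both for direct assignment and as move sources or destinations are hypothetical. The sketch also assumes the request could meet its delivery date on an empty node.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    demand: int     # resource units (assumed model)
    duration: int   # run time when (re)started from scratch
    deadline: int   # delivery date as an absolute time

@dataclass
class Node:
    capacity: int
    tasks: list = field(default_factory=list)
    locked: bool = False

    def free(self):
        return self.capacity - sum(t.demand for t in self.tasks)

def fits(task, node, now, extra=0):
    """True if the task has room on the node and finishes by its delivery date."""
    return node.free() + extra >= task.demand and now + task.duration <= task.deadline

def move_frees_room(new, nodes, now):
    open_nodes = [n for n in nodes if not n.locked]
    for src in open_nodes:
        for t in src.tasks:
            for dst in open_nodes:
                if dst is not src and fits(t, dst, now) and fits(new, src, now, extra=t.demand):
                    return True
    return False

def decide(new, nodes, now):
    if any(not n.locked and fits(new, n, now) for n in nodes):
        return "assign"      # YES in step S14
    if move_frees_room(new, nodes, now):
        return "move"        # YES in step S17, then step S18
    if any(n.locked and fits(new, n, now) for n in nodes):
        return "unlock"      # YES in step S19, then step S20
    return "scale_out"       # NO in step S19, then step S21
```

Scale-out is the last resort in this ordering, which is what keeps the number of billed nodes low: a node is added only after direct assignment, task movement, and unlocking have all been ruled out.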

When the slave communication unit 115 receives from the slave node 120 a response containing the processing result for the processing request (step S23), the client communication unit 111 transmits that response to the client (step S24).

FIG. 17 is a flowchart of the rescheduling process performed by the master node according to the present embodiment.

In the master node 110, the reschedule processing unit 116 starts this process at a predetermined timing.

The reschedule processing unit 116 refers to the management information in the schedule information storage unit 113 and determines whether there is a slave node 120 that currently has no task to execute (step S30). If there is such a slave node 120 (YES in step S30), the scale shrink processing unit 117 executes the scale shrink process for that slave node 120 (step S31). The scale shrink removes the slave node 120 from the system 100.

The reschedule processing unit 116 refers to the management information in the schedule information storage unit 113 and simulates moving tasks (step S32). Here, for example, as shown in FIG. 11, when there is a task that can still finish within its delivery date even if it is moved to another slave node 120 and re-executed, it is verified whether moving that task can produce a slave node 120 that has no task to execute. Likewise, for example, as shown in FIG. 13, when there is such a task, it is verified whether moving it can produce a slave node 120 whose number of tasks to execute is at or below a predetermined number.

The reschedule processing unit 116 determines whether moving tasks can produce a slave node 120 that has no task to execute, that is, whether moving tasks makes scale shrink possible (step S33). If it does (YES in step S33), the reschedule processing unit 116 carries out the move of the relevant tasks (step S34). More specifically, the reschedule processing unit 116 requests, via the slave communication unit 115, the source slave node 120 of each task to stop the task. The reschedule processing unit 116 also requests, via the slave communication unit 115, the destination slave node 120 of each task to execute the task. The scale shrink processing unit 117 then executes the scale shrink process for the slave node 120 that no longer has any task to execute as a result of the move (step S35). The scale shrink removes the slave node 120 from the system 100.
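Steps S32 through S35 can be sketched as a search for a node whose tasks can all be re-homed. The sketch below is illustrative only: the `capacity`/`demand` resource model, the names `plan_to_empty` and `shrink_by_moving`, and the greedy first-fit placement are assumptions, and a first-fit search may miss placements that a more exhaustive simulation would find.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    demand: int     # resource units (assumed model)
    duration: int   # run time when restarted from scratch
    deadline: int   # delivery date as an absolute time

@dataclass
class Node:
    name: str
    capacity: int
    tasks: list = field(default_factory=list)

def plan_to_empty(src, nodes, now):
    """Return [(task, destination)] moving every task off src, or None."""
    others = [n for n in nodes if n is not src]
    room = [n.capacity - sum(t.demand for t in n.tasks) for n in others]
    plan = []
    for task in src.tasks:
        if now + task.duration > task.deadline:
            return None  # a restart would miss the delivery date
        for i, dst in enumerate(others):
            if room[i] >= task.demand:
                room[i] -= task.demand
                plan.append((task, dst))
                break
        else:
            return None  # no other node has room for this task
    return plan

def shrink_by_moving(nodes, now):
    """Empty one node if the simulation allows it and drop it from the list."""
    for src in [n for n in nodes if n.tasks]:
        plan = plan_to_empty(src, nodes, now)
        if plan is not None:
            for task, dst in plan:
                dst.tasks.append(task)  # stands in for stop at source, run at destination
            src.tasks.clear()
            nodes.remove(src)           # stands in for the scale shrink process
            return src
    return None
```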

If moving tasks does not make scale shrink possible (NO in step S33), the reschedule processing unit 116 determines whether moving tasks would leave any slave node 120 with no more than the predetermined number of tasks to execute (step S36). If no slave node 120 would be left with that few tasks even after moving tasks (NO in step S36), the reschedule processing unit 116 ends the process.

If moving tasks would leave a slave node 120 with no more than the predetermined number of tasks to execute (YES in step S36), the reschedule processing unit 116 carries out the move of the relevant tasks (step S37). More specifically, the reschedule processing unit 116 requests, via the slave communication unit 115, the source slave node 120 of each task to stop the task. The reschedule processing unit 116 also requests, via the slave communication unit 115, the destination slave node 120 of each task to execute the task. The reschedule processing unit 116 sets a lock on the slave node 120 whose number of tasks to execute has fallen to the predetermined number or less as a result of the move (step S38). If all of the remaining tasks finish without the lock being released, the locked slave node 120 becomes a target of scale shrink in a subsequent pass through steps S30 and S31.
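The lock fallback of steps S36 through S38 can be sketched as follows. This is a hedged illustration rather than the embodiment's implementation: the `capacity`/`demand` model, the name `lock_nearly_empty`, the parameter `max_left` (standing for the predetermined number), the first-fit placement, the rule that locked nodes are not used as destinations, and the requirement that at least one task actually moves are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    demand: int     # resource units (assumed model)
    duration: int   # run time when restarted from scratch
    deadline: int   # delivery date as an absolute time

@dataclass
class Node:
    name: str
    capacity: int
    tasks: list = field(default_factory=list)
    locked: bool = False

def lock_nearly_empty(nodes, now, max_left):
    """Move what can be moved off one node; lock it if few tasks remain."""
    for src in nodes:
        if src.locked or not src.tasks:
            continue
        others = [n for n in nodes if n is not src and not n.locked]
        room = [n.capacity - sum(t.demand for t in n.tasks) for n in others]
        plan = []
        for task in src.tasks:
            if now + task.duration > task.deadline:
                continue  # must stay: a restart would miss the delivery date
            for i, dst in enumerate(others):
                if room[i] >= task.demand:
                    room[i] -= task.demand
                    plan.append((task, dst))
                    break
        left = len(src.tasks) - len(plan)
        if plan and left <= max_left:
            for task, dst in plan:      # step S37: carry out the moves
                src.tasks.remove(task)
                dst.tasks.append(task)
            src.locked = True           # step S38: exclude from new assignments
            return src
    return None
```

Once the tasks left on the locked node finish, its task list becomes empty, so a later idle-node check of the kind described for steps S30 and S31 would pick it up for scale shrink.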

The present embodiment has been described above, but the present invention can naturally be modified in various ways within the scope of its gist.

Description of Symbols
100 System
110 Master node
111 Client communication unit
112 Schedule processing unit
113 Schedule information storage unit
114 Scale-out processing unit
115 Slave communication unit
116 Reschedule processing unit
117 Scale shrink processing unit
120 Slave node
121 Master communication unit
122 Application
200 Client
300 Network

Claims

An autoscaling method in a system in which requested processing is distributed to and executed by a plurality of computer nodes, wherein a computer node that manages tasks executed in the system executes a process comprising:
when a processing request with a specified delivery date is received, referring to management information, stored in a storage unit, for managing the tasks executed by the computer nodes of the system, and executing a simulation of assigning the task of the processing request to a computer node; and
adding a computer node to the system when the simulation of the assignment determines that there is no computer node that can finish the task of the processing request within the delivery date and that, even if a task that can also finish within its delivery date on another computer node is moved, a computer node that can finish the task of the processing request within the delivery date cannot be prepared in the system.

The autoscaling method according to claim 1, wherein the computer node that manages the tasks further executes a process comprising:
referring to the management information and executing a simulation of moving a task running on one computer node to another computer node;
when the simulation of the movement determines that a computer node having no task to execute can be prepared by moving a task that can also finish within its delivery date on another computer node, moving that task; and
deleting the computer node that has no task to execute from the system.

The autoscaling method according to claim 2, wherein the computer node that manages the tasks further executes a process comprising:
when the simulation of the movement determines that a computer node whose number of tasks to execute is a predetermined number or less can be prepared by moving a task that can also finish within its delivery date on another computer node, moving that task; and
setting the computer node whose number of tasks to execute has become the predetermined number or less to be excluded from the assignment targets for tasks of processing requests.

The autoscaling method according to claim 3, wherein, in the process of adding a computer node to the system, a computer node is added to the system when it is determined that, even if the setting of the computer node set to be excluded from the assignment targets for tasks of processing requests is cancelled, a computer node that can finish the task of the processing request within the delivery date cannot be prepared.

An autoscaling program that, in a system in which requested processing is distributed to and executed by a plurality of computer nodes, causes a computer node that manages tasks executed in the system to execute a process comprising:
when a processing request with a specified delivery date is received, referring to management information, stored in a storage unit, for managing the tasks executed by the computer nodes of the system, and executing a simulation of assigning the task of the processing request to a computer node; and
adding a computer node to the system when the simulation of the assignment determines that there is no computer node that can finish the task of the processing request within the delivery date and that, even if a task that can also finish within its delivery date on another computer node is moved, a computer node that can finish the task of the processing request within the delivery date cannot be prepared in the system.

A computer node that manages tasks executed in a system having a plurality of computer nodes, the computer node comprising:
a management information storage unit that stores management information for managing the tasks executed by the computer nodes of the system;
a schedule processing unit that, when a processing request with a specified delivery date is received, refers to the management information and executes a simulation of assigning the task of the processing request to a computer node; and
a scale-out processing unit that executes a process of adding a computer node to the system when the schedule processing unit determines that there is no computer node that can finish the task of the processing request within the delivery date and that, even if a task that can also finish within its delivery date on another computer node is moved, a computer node that can finish the task of the processing request within the delivery date cannot be prepared in the system.