JP5551967B2 - Cluster system, cluster system scale-out method, resource manager device, server device

Info

Publication number: JP5551967B2
Application number: JP2010119499A
Authority: JP
Inventors: 豪生西村; 道生入江; 雅志金子
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-05-25
Filing date: 2010-05-25
Publication date: 2014-07-16
Anticipated expiration: 2030-05-25
Also published as: JP2011248521A

Description

The present invention relates to a highly available cluster system that provides communication services and the like, a scale-out method for the cluster system, a resource manager device, and a server device.

In recent years, services on the Internet have diversified markedly. Because it is difficult to know the number of users of a new service in advance, it is cost-effective to start the service on a small-scale server configuration (a small start) and add servers as the number of users grows.

Also, in recent years, the common way to add server capacity is not to replace hardware with something more powerful but to add servers that perform the same processing (scale-out) and to process requests with a cluster of servers operating in parallel.

However, to scale out promptly when the number of users increases, spare servers that are not in operation must always be kept on standby. It is costly for each service provider to maintain standby servers that are idle most of the time.

For this reason, a model called IaaS (Infrastructure as a Service), in which a server management operator such as a data center manages servers collectively and hosts service providers, has become common in recent years (Non-Patent Document 1). With IaaS, individual service providers not only avoid hardware maintenance of servers but can also expect lower server facility costs, because multiple service providers share the standby servers.

Meanwhile, communication networks have been migrating to IP to reduce equipment maintenance costs, and it has become common to implement communication networks and communication services on general-purpose servers rather than on dedicated hardware such as conventional switches. Communication services are also diversifying, so there is demand to reduce equipment costs by starting new services small and scaling out the number of servers as the user base expands. To meet this demand, one conceivable configuration starts each service small and, when the users of a service increase, acquires the necessary servers from a resource pool in which surplus standby resources are aggregated and then scales out.

Non-Patent Document 1: Amazon Elastic Compute Cloud (Amazon EC2), http://aws.amazon.com/ec2/.

In the configuration described above, where each service acquires servers from a shared resource pool of surplus resources at scale-out time, the servers held in the pool are shared and it is not known in advance which software configuration they will take. They therefore generally hold no software at all. After a server is acquired at scale-out time, a software image containing the software configuration for the service in question must be transferred to that server from the repository that manages the images, and the server must then be configured and incorporated into the cluster.

A software image contains the operating system, various middleware, data, and so on, so it is large (several GB to several tens of GB) and takes a certain amount of time to transfer. When multiple services scale out at the same time, the repository that manages the images becomes a bottleneck and the scale-out time increases further. As a result, the system cannot satisfy the non-stop operation and responsiveness required of communication services, which are highly public infrastructure.

In view of the above problems, an object of the present invention is to provide a cluster system, a scale-out method for a cluster system, a resource manager device, and a server device that reduce the load of software image transfer during scale-out.

To solve the above problems, a cluster system according to the present invention comprises a plurality of servers that provide services and a resource manager that manages the plurality of servers. The resource manager has means for, when there is a scale-out request to add a server for performing a predetermined service, selecting a server in a standby state from among the plurality of servers as the server to be scaled out, and instructing a plurality of servers that are already running the predetermined service to transfer fragments of the software image of the predetermined service. The plurality of servers instructed to transfer have means for transferring the fragments of the software image of the predetermined service to the server selected for the scale-out. The server selected for the scale-out has means for receiving and storing the software image of the predetermined service transferred from the plurality of servers instructed to transfer.

A scale-out method for a cluster system according to the present invention is a method for a cluster system comprising a plurality of servers that provide services and a resource manager that manages the plurality of servers. The method includes: a step in which, when there is a scale-out request to add a server for performing a predetermined service, the resource manager selects a server in a standby state from among the plurality of servers as the server to be scaled out and instructs a plurality of servers that are already running the predetermined service to transfer fragments of the software image of the predetermined service; a step in which the plurality of servers instructed to transfer send the fragments of the software image of the predetermined service to the server to be scaled out; and a step in which the server to be scaled out receives and stores the software image of the predetermined service transferred from the plurality of servers instructed to transfer.

A resource manager device according to the present invention is a resource manager device for configuring a cluster system from a plurality of servers that provide services and a resource manager that manages the plurality of servers. It comprises: means for managing the software images required to execute services; means for monitoring the operating states of the plurality of servers; and means for, when there is a scale-out request to add a server for performing a predetermined service, selecting a server in a standby state from among the plurality of servers as the server to be scaled out and requesting a plurality of servers that are already running the predetermined service to transfer fragments of the software image of the predetermined service to the server to be scaled out.

A server device according to the present invention is a server device for configuring a cluster system from a plurality of servers that provide services and a resource manager that manages the plurality of servers. It comprises means for transferring and acquiring fragments of a software image, storage means for the software image, and storage means for individual setting information. When another server is designated as the server to be scaled out, the server device transfers, on instruction from the resource manager, fragments of the software image held in its software image storage means to that server. When the server device itself is selected as the server to be scaled out, it acquires the software image transferred from a plurality of other servers that are already running the predetermined service and stores it in its software image storage means.

According to the present invention, at scale-out time the software image used by the active servers already in operation is transferred to the server being newly scaled out, and the settings specific to that new server are then applied. In addition, the software image is transferred to the new server piece by piece from multiple servers in the cluster. The transfer load is therefore not concentrated on any particular active server, and scale-out can be performed quickly.

FIG. 1 is a block diagram showing the overall configuration of a cluster system according to a first embodiment of the present invention.
FIG. 2 is an explanatory diagram of the structure of the software image management information table.
FIG. 3 is an explanatory diagram of the system when no service is running in the cluster system according to the first embodiment.
FIG. 4 is an explanatory diagram of the server states held by the server state monitoring function unit when no service is running in the cluster system according to the first embodiment.
FIG. 5 is a sequence diagram used to explain the startup of service "A" in the cluster system according to the first embodiment.
FIG. 6 is an explanatory diagram of the server states held by the server state monitoring function unit after service "A" has been started in the cluster system according to the first embodiment.
FIG. 7 is a sequence diagram used to explain the startup of service "B" in the cluster system according to the first embodiment.
FIG. 8 is an explanatory diagram of the system after service "B" has been started in the cluster system according to the first embodiment.
FIG. 9 is an explanatory diagram of the server states held by the server state monitoring function unit after service "B" has been started in the cluster system according to the first embodiment.
FIG. 10 is a sequence diagram used to explain the scale-out of service "B" in the cluster system according to the first embodiment.
FIG. 11 is an explanatory diagram of the system after service "B" has been scaled out in the cluster system according to the first embodiment.
FIG. 12 is an explanatory diagram of the server states held by the server state monitoring function unit after service "B" has been scaled out in the cluster system according to the first embodiment.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the overall configuration of the cluster system according to the first embodiment of the present invention.

In FIG. 1, the cluster system 102 according to the first embodiment of the present invention is connected to the maintenance person terminal 101 via the maintenance network 150 and to the user terminal 140 via the service network 160.

The service provider issues maintenance commands to the cluster system 102 from the maintenance person terminal 101. Maintenance commands include building a server cluster to provide a service and expanding a cluster by scale-out.

A user uses the user terminal 140 to access services provided on the cluster system 102 via the service network 160. The cluster system 102 consists of a resource pool 120, in which computing resources are aggregated, and a resource manager 103, the functional unit that manages those resources. The two are connected by a management network 110. The resource pool 120 consists of a plurality of servers 130, which are the computing resources.

Next, the functional units of the resource manager 103 are described. The resource manager 103 consists of a management interface 201, a software image management function unit 202, a server state management function unit 203, a server activation processing function unit 204, and a server state monitoring function unit 205.

The management interface 201 is the interface to the maintenance person terminal 101. It accepts maintenance commands from the maintenance person terminal 101 and executes the appropriate processing in cooperation with the other functional units. A software image is a file containing the software configuration required to provide a service, and it is transferred to a server 130 when the service is to be provided.

The software image management function unit 202 manages the type and storage location of the software image required to execute each service. FIG. 2 shows an example of the software image management information table held by the software image management function unit 202. There is one software image per service. In the example of FIG. 2, service "A" is started using image "A" and service "B" is started using image "B". The table also shows that images "A" and "B" are stored at "/path/to/imageDir" on the file system of the resource manager 103. In this embodiment the storage location is assumed to be a disk or similar on the resource manager 103, but the actual location may be external storage or the like as long as it is accessible from the resource manager 103.
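The management information table of FIG. 2 can be pictured as a simple lookup from service to image and storage location. The following is a minimal sketch under that reading; the names `ImageEntry`, `IMAGE_TABLE`, and `image_for_service` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageEntry:
    """One row of the (hypothetical) software image management table."""
    service: str   # service name, e.g. "A"
    image: str     # image name used to start the service
    location: str  # storage location reachable from the resource manager


# Rows taken from the FIG. 2 example described in the text.
IMAGE_TABLE = {
    "A": ImageEntry(service="A", image="A", location="/path/to/imageDir"),
    "B": ImageEntry(service="B", image="B", location="/path/to/imageDir"),
}


def image_for_service(service: str) -> ImageEntry:
    """Return the image entry for a service, or raise KeyError if unknown."""
    return IMAGE_TABLE[service]
```

Because there is one image per service, a dictionary keyed by service name is enough for this illustration.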

The server state management function unit 203 manages state information for the servers 130, which are the resources in the resource pool 120, such as which service each server 130 is currently being used to provide.

When there is a server startup request, the server activation processing function unit 204 refers to the information of the server state monitoring function unit 205, decides which server 130 in the resource pool 120 to use, transfers the software image managed by the software image management function unit 202 to that server 130, and starts it. When there is a scale-out request, the server activation processing function unit 204 refers to the information of the server state monitoring function unit 205, decides which server 130 in the resource pool 120 to use for the scale-out, requests the servers 130 already running the same service to transfer fragments of the software image to the server 130 used for the scale-out, and starts that server.

The server state monitoring function unit 205 constantly monitors the state of the servers 130 in the resource pool 120 and updates the server state management function unit 203. FIGS. 4, 6, 9, and 12 show examples of the server states held by the server state monitoring function unit 205. In FIG. 4, all six servers 130 with server IDs "SV1" to "SV6" are on standby. In FIG. 6, the servers 130 with server IDs "SV1" and "SV2" are running service "A" and the other servers are on standby. In FIG. 9, the servers 130 with server IDs "SV1" and "SV2" are running service "A", the servers 130 with server IDs "SV4" and "SV6" are running service "B", and the other servers are on standby. In FIG. 12, the servers 130 with server IDs "SV1" and "SV2" are running service "A", the servers 130 with server IDs "SV4", "SV5", and "SV6" are running service "B", and the other servers are on standby.
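The progression of server states across FIGS. 4, 6, 9, and 12 can be traced with a small table keyed by server ID. This is only an illustrative sketch; the helper names (`initial_pool`, `mark_running`, `servers_in_state`) and the `"standby"` marker are assumptions, not terms from the patent.

```python
STANDBY = "standby"  # hypothetical marker for a server that runs no service


def initial_pool(n=6):
    """All servers SV1..SVn on standby (the FIG. 4 state for n = 6)."""
    return {f"SV{i}": STANDBY for i in range(1, n + 1)}


def mark_running(table, server_ids, service):
    """Record that the given standby servers now run the given service."""
    for sid in server_ids:
        if table[sid] != STANDBY:
            raise ValueError(f"{sid} is not on standby")
        table[sid] = service


def servers_in_state(table, state):
    """List server IDs whose state matches, in table order."""
    return [sid for sid, s in table.items() if s == state]


table = initial_pool()                     # FIG. 4: everything on standby
mark_running(table, ["SV1", "SV2"], "A")   # FIG. 6: service "A" started
mark_running(table, ["SV4", "SV6"], "B")   # FIG. 9: service "B" started
mark_running(table, ["SV5"], "B")          # FIG. 12: service "B" scaled out
```

After the last call, `servers_in_state(table, "B")` lists SV4, SV5, and SV6, and only SV3 remains on standby, which matches the FIG. 12 description.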

Next, the functional units of a server 130 in the resource pool 120 are described. The server 130 consists of an activation agent 301, a software image storage unit 302, an individual setting/data storage unit 303, and a calculation processing function unit 304.

The activation agent 301 handles communication between the resource manager 103 and the server 130. Specifically, it accepts a startup instruction from the server activation processing function unit 204 and acquires the software image. In response to a status inquiry from the server state monitoring function unit 205, it reports the type of service running on the server 130 or that the server is on standby. When it receives a software image transfer instruction from the server activation processing function unit 204, it transfers a fragment of the software image to another server 130.

The software image storage unit 302 is the functional unit that stores the software image acquired by the activation agent 301. A software image exists per service, but the setting information needed to start a server 130 and run the service may differ from one server 130 to another. In this system, the setting information specific to a server 130 is not written into the acquired software image. It is stored instead in the individual setting/data storage unit 303. Individual data whose output differs per server 130, such as log information, is also placed in the individual setting/data storage unit 303.

The calculation processing function unit 304 is the computing hardware of an ordinary general-purpose server 130, such as the CPU and main memory. It boots the system using the software image storage unit 302 and the individual setting/data storage unit 303 as disks and performs the service processing.

Next, the operation of the cluster system 102 according to the first embodiment of the present invention is described. FIG. 3 shows a cluster system 102 whose resource pool 120 consists of six servers 130 (server IDs "SV1" to "SV6"), in a state where no service is running. In this state, the server state information held by the server state monitoring function unit 205 shows all six servers 130 with server IDs "SV1" to "SV6" on standby, as in FIG. 4. The processing for moving from this state to one in which service "A" and service "B" are running is described below.

FIG. 5 is a sequence diagram showing the processing when the system moves from the state in which the six servers 130 (server IDs "SV1" to "SV6") are on standby to the state in which service "A" is running.

When the resource manager 103 receives from the maintenance person terminal 101, via the management interface 201, a maintenance command requesting that service "A" be started on two servers (step S101), the server activation processing function unit 204 refers to the information of the server state monitoring function unit 205 and selects two servers 130 that are on standby (step S102). Here, the servers 130 with server IDs "SV1" and "SV2" are assumed to be selected.

Once the servers 130 with server IDs "SV1" and "SV2" have been selected, the server activation processing function unit 204 of the resource manager 103 identifies, from the software image management function unit 202, image "A" as the software image required to run service "A", and transfers image "A" to the activation agents 301 of the selected servers 130 (server IDs SV1 and SV2) (steps S103a and S103b). The activation agent 301 on each selected server 130 (server IDs SV1 and SV2) saves the received image "A" in the software image storage unit 302 (steps S104a and S104b).

The server activation processing function unit 204 of the resource manager 103 then passes to the activation agents 301 on the servers 130 with server IDs "SV1" and "SV2" the individual setting information that must be changed before the server 130 starts (steps S105a and S105b). This includes network settings such as the IP address, netmask, and the IP addresses of DNS servers, and middleware and OS parameters such as buffer sizes and the number of simultaneous connections. The activation agents 301 on the servers 130 with server IDs "SV1" and "SV2" store this individual setting information in the individual setting/data storage unit 303 (steps S106a and S106b).

After that, the activation agents 301 of the servers 130 with server IDs "SV1" and "SV2" instruct the calculation processing function unit 304 to boot using the software image and the individual setting information. After booting, the calculation processing function unit 304 operates as a server 130 that provides service "A".
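The startup sequence of steps S101 to S106 and the subsequent boot can be summarized in a short in-memory simulation. It is a sketch under simplifying assumptions: the `Server` class, `start_service`, and `settings_for` are hypothetical names, the image is a plain byte string, and network transfer is replaced by assignment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Server:
    server_id: str
    image: Optional[bytes] = None                  # stands in for storage unit 302
    settings: Dict[str, str] = field(default_factory=dict)  # stands in for unit 303
    service: Optional[str] = None                  # None means standby

    def boot(self, service: str) -> None:
        """Boot from the stored image plus the individual settings."""
        if self.image is None:
            raise RuntimeError(f"{self.server_id} has no software image")
        self.service = service


def start_service(pool: List[Server], service: str, image: bytes, count: int,
                  settings_for: Callable[[str], Dict[str, str]]) -> List[Server]:
    """Pick standby servers, send the image and settings, then boot them."""
    standby = [s for s in pool if s.service is None]       # S102: select servers
    if len(standby) < count:
        raise RuntimeError("not enough standby servers")
    chosen = standby[:count]
    for s in chosen:
        s.image = image                                    # S103/S104: image stored
        s.settings = settings_for(s.server_id)             # S105/S106: settings stored
        s.boot(service)                                    # boot and provide service
    return chosen


pool = [Server(f"SV{i}") for i in range(1, 7)]
started = start_service(pool, "A", b"image-A", 2,
                        lambda sid: {"ip": f"192.0.2.{sid[2:]}"})
```

The image is identical on every chosen server while the settings differ per server, which mirrors the separation between the software image storage unit 302 and the individual setting/data storage unit 303. The addresses use the 192.0.2.0/24 documentation range and are purely illustrative.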

Once the servers 130 providing service "A" have been set up in this way, the server state information of the server state monitoring function unit 205 shows, as in FIG. 6, that of the six servers 130 in the resource pool 120 (server IDs "SV1" to "SV6"), the servers 130 with server IDs "SV1" and "SV2" are running service "A" and the other servers are on standby.

Next, when servers 130 providing service "B" are started on the servers with server IDs "SV4" and "SV6" by the same procedure, in response to a maintenance command requesting the startup of service "B", the processing shown in steps S201 to S206a and S206b of FIG. 7 is performed and the system reaches the state shown in FIG. 8. At this point the server state information of the server state monitoring function unit 205 shows, as in FIG. 9, that of the six servers 130 in the resource pool 120 (server IDs "SV1" to "SV6"), the servers 130 with server IDs "SV1" and "SV2" are running service "A", the servers 130 with server IDs "SV4" and "SV6" are running service "B", and the other servers are on standby.

Next, the procedure for scaling out service "B" from this state is described. FIG. 10 is a sequence diagram showing the processing when service "B" is scaled out.

When the resource manager 103 receives, via the management interface 201, a maintenance command requesting scale-out of service "B" (step S301), the server activation processing function unit 204 of the resource manager 103 refers to the information of the server state management function unit 203 and selects one server 130 that is on standby (step S302). Simple selection methods include picking one standby server 130 at random, or picking the first standby server found while checking the table entries in order. Alternatively, if the server activation processing function unit 204 knows the actual network topology and the physical locations of the servers 130 in the resource pool 120, it can use that information to select the server 130 judged to have the smallest communication delay to the active servers 130 (for example, one connected to the same hub or connected directly by a network cable).
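The three selection approaches mentioned for step S302 can be sketched as follows. The function names and the `delay` table are hypothetical. In particular, scoring a candidate by its worst-case delay to any active server is one possible interpretation of "smallest communication delay", not something the patent specifies.

```python
import random


def select_random(standby, rng=None):
    """Pick one standby server at random."""
    rng = rng or random.Random()
    return rng.choice(standby)


def select_first(table):
    """Pick the first standby entry found while scanning the table in order."""
    for server_id, state in table.items():
        if state == "standby":
            return server_id
    raise LookupError("no standby server available")


def select_nearest(standby, active, delay):
    """Pick the standby server whose worst delay to the active servers is smallest.

    `delay` is an assumed lookup keyed by frozenset({a, b}) that holds an
    estimated delay (or hop count) between servers a and b.
    """
    return min(
        standby,
        key=lambda s: max(delay[frozenset((s, a))] for a in active),
    )


# State corresponding to FIG. 9: SV3 and SV5 are on standby.
table = {"SV1": "A", "SV2": "A", "SV3": "standby",
         "SV4": "B", "SV5": "standby", "SV6": "B"}
# Illustrative topology: SV5 shares a hub with SV4 and SV6, SV3 does not.
delay = {
    frozenset(("SV3", "SV4")): 3, frozenset(("SV3", "SV6")): 3,
    frozenset(("SV5", "SV4")): 1, frozenset(("SV5", "SV6")): 1,
}
nearest = select_nearest(["SV3", "SV5"], ["SV4", "SV6"], delay)
```

With this made-up topology the topology-aware strategy picks SV5, whereas the first-found strategy would pick SV3.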

Here, the server 130 with server ID "SV5" is assumed to be selected from among the standby servers 130 as the server to be scaled out. The software image of image "B" must then be transferred to the server 130 with server ID "SV5". For an ordinary server startup request, the server activation processing function unit 204 of the resource manager 103 would transfer the software image required to run service "B" directly from the software image management function unit 202 to the server 130 with server ID "SV5". At scale-out time, however, the server activation processing function unit 204 does not transfer the software image itself. Instead, it requests the servers already providing service "B" to transfer fragments of image "B".

That is, in this case the server activation processing function unit 204 of the resource manager 103 instructs the activation agent 301 on each server 130 that already holds image “B” (server IDs SV4 and SV6) to transfer its fragment of image “B” (steps S303a and S303b).

The servers 130 holding image “B” (server IDs SV4 and SV6) transfer the specified fragments of image “B” to the server 130 with server ID “SV5” (steps S304a and S304b). The amount each server 130 transfers may be divided equally or weighted. With an equal division, if image “B” is 1 GB in size, the server 130 with server ID “SV4” transfers the first 500 MB of image “B” (from the beginning to the midpoint), and the server 130 with server ID “SV6” transfers the remaining 500 MB (from the midpoint to the end), each to the server 130 with server ID “SV5”.
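
The equal division can be sketched as a computation of contiguous ranges over the image. This is a minimal sketch under stated assumptions: the function name `split_equally` is hypothetical, sizes are plain integers (here MB, with 1 GB taken as 1000 MB to match the 500 MB figures in the text), and ranges are half-open `(start, end)` pairs.

```python
from typing import Dict, List, Tuple


def split_equally(image_size: int, sources: List[str]) -> Dict[str, Tuple[int, int]]:
    """Assign each source server a contiguous half-open range [start, end)
    of the image so that the ranges cover the image exactly once.

    When the size is not divisible by the number of sources, the first
    few sources each take one extra unit.
    """
    if not sources:
        raise ValueError("at least one source server is required")
    base, extra = divmod(image_size, len(sources))
    ranges: Dict[str, Tuple[int, int]] = {}
    start = 0
    for i, server_id in enumerate(sources):
        length = base + (1 if i < extra else 0)
        ranges[server_id] = (start, start + length)
        start += length
    return ranges


# Image "B" of 1000 MB split between the two servers already running service "B".
fragments = split_equally(1000, ["SV4", "SV6"])
```

For the example in the text, “SV4” is assigned the range from 0 to 500 and “SV6” the range from 500 to 1000.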

With weighting, two approaches are conceivable: using static information about each server, or dynamically taking each server's processing load into account. The former assumes that the server activation processing function unit 204 knows in advance the average processing load and idle resources (CPU time, network bandwidth, and so on) of each server; in the latter, the server activation processing function unit 204 learns each server's current processing load and idle resources from reports by the activation agent 301. If, for example, the server 130 with server ID “SV4” has three times the idle resources of the server 130 with server ID “SV6” and image “B” is 1 GB in size, the server 130 with server ID “SV4” transfers the first 750 MB and the server 130 with server ID “SV6” transfers the remaining 250 MB, as fragments, to the server 130 with server ID “SV5”.
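
The weighted division can be sketched in the same way, with each server's share proportional to its idle resources. This is an illustration only: the function name `split_weighted` is hypothetical, the weights are assumed to be positive integers (whether they come from static averages or from agent reports makes no difference to the arithmetic), and 1 GB is again taken as 1000 MB.

```python
from typing import Dict, Tuple


def split_weighted(image_size: int, weights: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Assign each source server a contiguous half-open range [start, end)
    whose length is proportional to its weight (for example, idle resources).

    Range boundaries are computed from cumulative weights, and the last
    range always ends at image_size, so rounding never leaves a gap.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and positive")
    total = sum(weights.values())
    ranges: Dict[str, Tuple[int, int]] = {}
    start = 0
    cumulative = 0
    items = list(weights.items())
    for i, (server_id, weight) in enumerate(items):
        cumulative += weight
        is_last = i == len(items) - 1
        end = image_size if is_last else image_size * cumulative // total
        ranges[server_id] = (start, end)
        start = end
    return ranges


# SV4 has three times the idle resources of SV6; image "B" is 1000 MB.
fragments = split_weighted(1000, {"SV4": 3, "SV6": 1})
```

With the 3:1 weights from the text, “SV4” is assigned the range from 0 to 750 and “SV6” the range from 750 to 1000.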

The activation agent 301 of the server 130 with server ID “SV5” saves the received image “B” in the software image storage unit 302 (step S305). The server activation processing function unit 204 of the resource manager 103 then passes the individual setting information that must be changed before the server 130 starts to the activation agent 301 on the server 130 with server ID “SV5” (step S306). This information includes network settings such as the IP address, the netmask, and the IP addresses of the DNS server and similar servers, as well as middleware and OS parameters such as buffer sizes and the number of simultaneous connections. The activation agent 301 on the server 130 with server ID “SV5” stores the individual setting information in the individual setting / data storage unit 303 (step S307).
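
On the receiving side, steps S305 to S307 amount to reassembling fragments that may arrive in any order and then recording the per-server settings. The sketch below is an assumption-laden illustration: the class name, its methods, the in-memory buffers standing in for storage units 302 and 303, and the setting keys are all hypothetical, and the text does not describe how completeness of the image is checked.

```python
from typing import Dict, List, Tuple


class ActivationAgentSketch:
    """Hypothetical receiving-side agent on the server being scaled out."""

    def __init__(self, image_size: int) -> None:
        self.image = bytearray(image_size)         # stands in for storage unit 302
        self.received: List[Tuple[int, int]] = []  # half-open ranges received so far
        self.settings: Dict[str, str] = {}         # stands in for storage unit 303

    def receive_fragment(self, offset: int, data: bytes) -> None:
        """Write one fragment at its offset within the image (step S305)."""
        end = offset + len(data)
        if offset < 0 or end > len(self.image):
            raise ValueError("fragment lies outside the image")
        self.image[offset:end] = data
        self.received.append((offset, end))

    def image_complete(self) -> bool:
        """True once the received ranges cover the whole image without gaps."""
        position = 0
        for start, end in sorted(self.received):
            if start > position:
                return False  # gap before this fragment
            position = max(position, end)
        return position == len(self.image)

    def store_settings(self, settings: Dict[str, str]) -> None:
        """Record individual setting information (steps S306 and S307)."""
        self.settings.update(settings)

    def ready_to_boot(self) -> bool:
        return self.image_complete() and bool(self.settings)


# Tiny stand-in image: the second half arrives before the first half.
image = bytes(range(8))
agent = ActivationAgentSketch(len(image))
agent.receive_fragment(4, image[4:])
agent.receive_fragment(0, image[:4])
agent.store_settings({"ip_address": "192.0.2.15", "netmask": "255.255.255.0"})
```

Because each fragment carries its own offset, the two source servers can transfer in parallel without coordinating with each other.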

The activation agent 301 of the server 130 with server ID “SV5” then instructs the calculation processing function unit 304 to boot using the software image and the individual setting information. After booting, the calculation processing function unit 304 operates as a server 130 providing service “B”.

At scale-out, the processing of steps S301 to S307 is performed, resulting in the state shown in FIG. 11. As shown in FIG. 12, the server state information of the server state monitoring function unit 205 at this point indicates that, of the six servers 130 in the resource pool 120 (server IDs “SV1” to “SV6”), the servers 130 with server IDs “SV1” and “SV2” are running service “A”, the servers 130 with server IDs “SV4”, “SV5”, and “SV6” are running service “B”, and the other servers are on standby.
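
The resulting server state information can be sketched as a simple table update. The dictionary layout, the function names, and the pre-scale-out contents are assumptions made for illustration; only the final grouping is taken from the text, with “SV3” being the one ID not listed as running a service.

```python
from typing import Dict, List

STANDBY = "standby"


def mark_in_service(state: Dict[str, str], server_id: str, service: str) -> Dict[str, str]:
    """Return a copy of the table with a standby server marked as running `service`."""
    if state.get(server_id) != STANDBY:
        raise ValueError(f"{server_id} is not on standby")
    updated = dict(state)
    updated[server_id] = service
    return updated


def group_by_state(state: Dict[str, str]) -> Dict[str, List[str]]:
    """Group server IDs by their current state, keeping table order."""
    groups: Dict[str, List[str]] = {}
    for server_id, st in state.items():
        groups.setdefault(st, []).append(server_id)
    return groups


# Assumed table before scale-out, then the update once SV5 has booted.
before = {
    "SV1": "A", "SV2": "A", "SV3": STANDBY,
    "SV4": "B", "SV5": STANDBY, "SV6": "B",
}
after = mark_in_service(before, "SV5", "B")
summary = group_by_state(after)
```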

As described above, because no software image transfer load is placed on the resource manager 103 during scale-out, multiple services can be scaled out simultaneously and at high speed.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

101: Maintenance terminal
102: Cluster system
103: Resource manager
110: Management network
120: Resource pool
130: Server
140: User terminal
150: Maintenance network
160: Service network
201: Management interface
202: Software image management function unit
203: Server state management function unit
204: Server activation processing function unit
205: Server state monitoring function unit
301: Activation agent
302: Software image storage unit
303: Data storage unit
304: Calculation processing function unit

Claims

1. A cluster system comprising a plurality of servers that provide services and a resource manager that manages the plurality of servers, wherein:
the resource manager has means for, when there is a scale-out request to add a server that performs a predetermined service, selecting a server in a standby state from among the plurality of servers as the server to be scaled out, and instructing a plurality of servers, among the plurality of servers, that are already operating the predetermined service to transfer fragments of the software image of the predetermined service;
each of the plurality of servers instructed to perform the transfer has means for transferring a fragment of the software image of the predetermined service to the server selected as the server to be scaled out; and
the server selected as the server to be scaled out has means for receiving and storing the software image of the predetermined service transferred from the plurality of servers instructed to perform the transfer.

2. The cluster system according to claim 1, wherein each of the plurality of servers instructed to perform the transfer transfers a fragment of the software image of the predetermined service to the server to be scaled out with an equal transfer amount.

3. The cluster system according to claim 1, wherein each of the plurality of servers instructed to perform the transfer transfers a fragment of the software image of the predetermined service to the server to be scaled out with a weighted transfer amount.

4. The cluster system according to claim 3, wherein the weighting is performed according to the processing load and idle resources of each server.

5. The cluster system according to claim 4, wherein the information on the processing load and idle resources of the servers is managed statically by the resource manager based on the average processing load and idle resources of each server.

6. The cluster system according to claim 4, wherein the information on the processing load and idle resources of the servers is managed dynamically by the resource manager by acquiring the information from each server.

7. A scale-out method for a cluster system comprising a plurality of servers that provide services and a resource manager that manages the plurality of servers, the method comprising:
a step in which, when there is a scale-out request to add a server that performs a predetermined service, the resource manager selects a server in a standby state from among the plurality of servers as the server to be scaled out, and instructs a plurality of servers, among the plurality of servers, that are already operating the predetermined service to transfer fragments of the software image of the predetermined service;
a step in which the plurality of servers instructed to perform the transfer transfer fragments of the software image of the predetermined service to the server to be scaled out; and
a step in which the server to be scaled out receives and stores the software image of the predetermined service transferred from the plurality of servers instructed to perform the transfer.

8. A resource manager device for configuring a cluster system from a plurality of servers that provide services and a resource manager that manages the plurality of servers, the resource manager device comprising:
means for managing software images required for service execution;
means for monitoring the operating states of the plurality of servers; and
means for, when there is a scale-out request to add a server that performs a predetermined service, selecting a server in a standby state from among the plurality of servers as the server to be scaled out, and requesting a plurality of servers, among the plurality of servers, that are already operating the predetermined service to transfer fragments of the software image of the predetermined service to the server to be scaled out.

9. A server device for configuring a cluster system from a plurality of servers that provide services and a resource manager that manages the plurality of servers, the server device comprising:
means for transferring and acquiring fragments of a software image;
software image storage means; and
storage means for individual setting information, wherein:
when another server is set as the server to be scaled out, the server device transfers, in accordance with an instruction from the resource manager, a fragment of the software image stored in the software image storage means to the server to be scaled out; and
when the server device itself is selected as the server to be scaled out, the server device acquires the software image transferred from a plurality of other servers already operating a predetermined service and stores it in the software image storage means.