JPWO2004084085A1 - Load balancing system by inter-site cooperation - Google Patents

Load balancing system by inter-site cooperation

Info

Publication number
JPWO2004084085A1
JPWO2004084085A1 (application number JP2004569568A)
Authority
JP
Japan
Prior art keywords
server
step
load
service
spare
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2004569568A
Other languages
Japanese (ja)
Inventor
励 河合
哲 土屋
泰廣 國生
Original Assignee
富士通株式会社 (Fujitsu Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 富士通株式会社
Priority to PCT/JP2003/003273 (published as WO2004084085A1)
Publication of JPWO2004084085A1
Application status: Granted


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10: Arrangements in which an application is distributed across nodes in the network
    • H04L67/1002: Arrangements for accessing one among a plurality of replicated servers, e.g. load balancing

Abstract

In a system comprising a front-stage center 12-1, which directly receives requests from the client 10 via the network 11, and a back-stage center 12-2, which receives requests from the client 10 through the front-stage center 12-1, each center includes spare servers 17-1 and 17-2. Initially, the front-stage center 12-1 provides services using its normal servers. When the system control device 16-1 detects that the load on a server has increased, it allocates a server for the overloaded service from the spare server pool 17-1, which is shared by service 1 and service 2. If the load still cannot be handled, the system control device 16-1 instructs the system control device 16-2 of the back-stage center 12-2 to provide the service. The back-stage center 12-2 likewise draws on its spare servers 17-2 when its normal servers cannot handle the load.

Description

  The present invention relates to a load distribution system by inter-site cooperation.

With the explosive spread of the Internet, the server and network resources required on the service-provider side have become enormous. The amount of requests from users, however, varies greatly with time and circumstances: if resources are provisioned for peak demand, wasteful surplus must be maintained during normal operation, whereas if sufficient resources are unavailable at peak times, service quality degrades and users are inconvenienced. Moreover, as the number of users grows it becomes difficult to estimate an upper bound on the resources needed, so a mechanism for allocating resources on demand becomes necessary. At the same time, excess resources increase management costs, so a mechanism for putting idle resources to effective use is also required.
FIG. 1 is an example of a conventional load distribution system.
In the configuration of FIG. 1, the client 10 accesses the data center 12 via the network 11 and receives a service. A plurality of servers 14 are connected to the load balancer 13.
When a single server cannot handle the processing, a plurality of servers are installed as shown in FIG. 1, and the load balancer 13 placed in front of them distributes the load across the servers to improve service quality. However, adding servers 14, configuring the servers and the load balancer 13, and changing settings are typically manual operations, and keeping enough servers on hand to match the maximum load is expensive.
Patent Document 1 describes a method for adding servers and distributing requests from users, but it requires a server-selection mechanism on the user side and is therefore unsuitable for services aimed at an unspecified public. It also has the problem that management information beyond the requests themselves must be exchanged.
The method of Patent Document 2 applies only to the distribution of static information, and cannot be applied when different information is returned for each user request, as in service provision.
Patent Document 3 likewise assumes static information, and does not consider the case where the load on a file server or the like becomes excessive.
Patent Document 1: Japanese Patent Laid-Open No. 9-106381
Patent Document 2: Japanese Patent Laid-Open No. 9-179820
Patent Document 3: Japanese Patent Laid-Open No. 2002-259354

An object of the present invention is to provide a load distribution system that distributes a load for providing a service and can flexibly cope with a change in a request from a user.
The method of the present invention is a load-balancing method for an apparatus having a plurality of servers that provide services to clients via a network. It comprises a step of providing, in the initial state, a plurality of spare servers on which no service is configured, alongside the servers that provide the normal services; and a control step that, when the load on a server providing a normal service increases, configures the application for that service on a spare server so that it shares the load with the servers providing the normal service.
According to the present invention, a plurality of spare servers are provided in an apparatus such as a data center in addition to the servers that provide normal services. When the load on a server providing a normal service increases, the application is installed on a spare server so that it can provide the service, and the load of providing the service is shared.
In another aspect of the present invention, apparatuses equipped with spare servers are connected via a network and control one another so as to lend spare servers. Even when one data center cannot muster the processing capability to absorb a temporary load, interruption of service under heavy load can be avoided by having a plurality of apparatuses share the load over the network. This also makes it possible to reduce the number of spare servers provided in any one apparatus, eliminating the need to equip each apparatus with redundant hardware.

FIG. 1 is an example of a conventional load distribution system.
FIG. 2 is a diagram showing a basic configuration of the embodiment of the present invention.
FIG. 3 is a diagram showing a network arrangement configuration in the center in the basic configuration of FIG.
FIG. 4 is a diagram showing a first embodiment of the present invention.
FIG. 5 is a diagram showing the operation of the first exemplary embodiment of the present invention.
FIG. 6 is a diagram illustrating data for calculating the load and capacity of the server.
FIG. 7 is a diagram illustrating data for selecting a server according to the magnitude of the load.
FIG. 8 is a diagram showing the relationship between the capacity of the server to be added and the predicted load value.
FIG. 9 is a diagram showing a configuration in which a spare server is shared by a plurality of services.
FIG. 10 is a diagram showing a configuration in the case of providing a spare server between different centers.
FIG. 11 is a diagram for explaining the operation of the embodiment of the present invention.
FIG. 12 is a diagram for explaining the securing of the network bandwidth in the case of cooperating with other centers.
FIG. 13 is a diagram illustrating an application example of the embodiment of the present invention in a web server.
FIG. 14 is a diagram illustrating an application example of the embodiment of the present invention in a web service.
FIG. 15 shows an application example of the embodiment of the present invention when resources are exchanged between equal centers.
FIG. 16 is a diagram illustrating an example in which the embodiment of the present invention is applied to a front-stage center that does not have a spare server.
FIGS. 17 to 24 are flowcharts for explaining the operation of the embodiment of the present invention when there is no cooperation between the databases provided in the centers.
FIGS. 25 to 30 are flowcharts showing the flow of processing according to the embodiment of the present invention when there is database cooperation.

In the present invention, changes in the amount of requests from users are predicted, and servers in the data center or in linked data centers are dynamically added or removed accordingly, so that service quality is guaranteed while surplus servers are shared among multiple services to reduce cost.
FIG. 2 is a diagram showing a basic configuration of the embodiment of the present invention.
The client 10 accesses the Web server 15-1 via the network 11 through the load balancer 13-1 of the front-stage center 12-1. As a result of data processing in the Web server 15-1, the client 10 accesses the database server 14-1 or the file server 14-2 and receives the service. The back-stage center 12-2 has substantially the same configuration as the front-stage center 12-1: it receives requests from the client 10 via the load balancer 13-1, and its own load balancer 13-2 directs the client 10 to the Web server 15-2. The client 10 then accesses the database server 14-3 or 14-4 via the Web server 15-2 and receives the service.
Here, the front-stage center 12-1 is a center that receives user requests directly, and the back-stage center 12-2 is a center that processes user requests passed through the front-stage center 12-1. The allocation of servers between data centers is many-to-many: one data center may borrow servers from multiple data centers, and may simultaneously answer server requests from multiple data centers. The system control devices 16-1 and 16-2 aggregate the server control state and the load state from the clients, make decisions, and apply the results to the servers 14-1 to 14-4 and the load balancers 13-1 and 13-2. When server resources run short, servers 17-1 and 17-2 in the spare pool are configured with the necessary functions and added to the service, increasing its capacity.
FIG. 3 is a diagram showing a network arrangement configuration in the center in the basic configuration of FIG.
In the physical network configuration, all servers are connected directly under a single switch group 20, on which a plurality of logically independent networks (VLAN0, VLAN11, VLAN12, VLAN21) are formed. This arrangement makes it possible to automate the addition of a server at any required position.
When adding or deleting servers, server performance is derived from specifications such as CPU performance and network configuration, so that the required servers can be computed and assigned even in an environment where various types of hardware are mixed. At the same time, the traffic for each server is calculated and the network bandwidth is reserved or arbitrated.
In addition, by predicting future load from load measurements and load-fluctuation forecasts, servers are added before the load becomes excessive, guaranteeing service quality.
FIG. 4 is a diagram showing a first embodiment of the present invention.
In the figure, the same components as those in FIG. 2 are denoted by the same reference numerals and detailed description thereof is omitted.
When requests from users exceed the allocated server capacity, response times increase or responses fail altogether, which annoys users; if the load rises further in this state, it may even cause server failures. To prevent this, the system control device 16 measures the load state of the servers and, if it determines that the current number of servers is insufficient, takes a server from the spare pool 17, configures the applications, services, and data on it, updates the settings of the dependent devices and servers, and incorporates it into the service.
FIG. 5 is a diagram showing the operation of the first exemplary embodiment of the present invention.
In the figure, the same components as those in the preceding figures are denoted by the same reference numerals.
When the amount of requests from users decreases, surplus servers arise. Deleting a surplus server does not degrade service quality; rather, from the standpoint of operating cost and utilization, it is desirable to release it as a spare server and reuse it for other services. To do so, the link to the service is severed by deleting the related settings from the dependent devices, the server's own settings are then cleared, and the server is returned to the spare pool 17.
FIG. 6 is a diagram illustrating data for calculating the load and capacity of the server.
To add and remove service capability on demand, information on how much service capability each server provides is necessary. In a data center, per-unit service capability varies with the combination of servers and devices used, the applications, and the services. Arranging for uniform server hardware is practically impossible, especially when multiple data centers are linked, so service capability must be computed from device specifications such as CPU and memory. For this purpose, performance is estimated from the measured performance of a typical configuration, adjusted for differences in CPU capability and the like.
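As a rough sketch of this kind of estimation, the following scales the measured throughput of a hypothetical reference configuration by CPU and memory ratios; the reference values and the scaling rule are illustrative assumptions, not taken from the patent.

```python
# Illustrative reference box: a "typical configuration" with a measured
# throughput, against which other hardware is scaled (values assumed).
REFERENCE = {"cpu_ghz": 2.0, "mem_gb": 4, "requests_per_sec": 500}

def estimate_capacity(cpu_ghz, mem_gb):
    """Scale the reference throughput by CPU speed; let scarce memory cap it."""
    cpu_factor = cpu_ghz / REFERENCE["cpu_ghz"]
    mem_factor = min(1.0, mem_gb / REFERENCE["mem_gb"])  # memory only limits
    return REFERENCE["requests_per_sec"] * cpu_factor * mem_factor

print(estimate_capacity(3.0, 8))  # faster CPU, ample memory: 750.0
print(estimate_capacity(1.0, 2))  # half CPU, half memory: 125.0
```

In practice the scaling rule would be calibrated per application, since CPU and memory sensitivity differ between, say, a Web server and a database server.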
FIG. 7 is a diagram illustrating data for selecting a server according to the magnitude of the load.
Here, in addition to service capability, information on what kind of use each server unit is best suited for is held as a characteristic. Since the per-server performance values are not uniform, a configuration providing the required capability must be assembled by combination. Servers are therefore selected, preferring those with a high recommendation level, from the performance values and characteristics obtained in FIG. 6 until the required performance value is satisfied.
FIG. 8 is a diagram showing the relationship between the capacity of the server to be added and the predicted load value.
If the measured request amount exceeds the service capability, adding resources only after the fact cannot guarantee service quality when the load is rising rapidly. The load trend is therefore tracked, and when an increase in the request amount is expected, service capability matching the predicted request amount is added in advance to prevent degradation. Linear extrapolation, for example, can serve as the prediction method.
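A minimal sketch of such linear extrapolation (the sample format and function name are assumptions for illustration):

```python
def predict_load(history, horizon):
    """Linearly extrapolate from the two most recent (time, load) samples."""
    (t0, l0), (t1, l1) = history[-2], history[-1]
    slope = (l1 - l0) / (t1 - t0)   # load change per unit time
    return l1 + slope * horizon      # projected load `horizon` seconds ahead

# Load rising by 4 requests/s every second: 140 now, 260 expected in 30 s.
samples = [(0, 100), (10, 140)]
print(predict_load(samples, 30))  # 260.0
```

Capacity would then be provisioned against the predicted value rather than the current measurement.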
FIG. 9 is a diagram showing a configuration in which a spare server is shared by a plurality of services.
Looking at the load status of multiple services in a data center, it is extremely rare for all services to be heavily loaded at the same time; if spare resources were reserved per service, some would always sit unused. By sharing spare resources among services, the required capability can be added with fewer spare resources overall, and maintenance costs are spread across the services. Service 1 and service 2 are installed in the center 12, each with its own load balancer 13-1 or 13-2. Service 1 comprises a Web server 15-1, a database server 14-1, and a file server 14-2; service 2 is provided by a server 25. The spare pool 17 is shared by service 1 and service 2, and the system controller 16, by checking the load state, introduces additional servers from the spare pool 17 into service 1 or service 2.
FIG. 10 is a diagram showing a configuration in the case of providing a spare server between different centers.
In the figure, the same components as those in the preceding figures are denoted by the same reference numerals.
Depending on the scale of the data center 12-1, it may be physically or economically impossible to secure a sufficient spare pool 17-1 even when spares are shared across services. Even when a sufficient pool is intended, a sudden load may exceed what the spare servers within the data center can cover. In such cases, another data center 12-2 connected by a network is used as a back-stage center, and its spare servers 17-2 are used via the network.
FIG. 11 is a diagram for explaining the operation of the embodiment of the present invention.
In the figure, the same components as those in the preceding figures are denoted by the same reference numerals.
Some services require, in addition to the servers that exchange information directly with users, servers that operate in cooperation with a database. For such services, performance cannot improve unless the processing capability and load state are checked per function and servers are added to the right function. The system control device 16 therefore checks the load at each tier and, when adding or removing servers, adjusts capacity by updating the settings of the cooperating servers.
FIG. 12 is a diagram for explaining the securing of the network bandwidth in the case of cooperating with other centers.
In the figure, the same components as those in FIG. 10 are denoted by the same reference numerals.
When multiple services run simultaneously, or when cooperative processing is needed, simply adding servers does not yield sufficient processing capability unless the traffic between services and functions is also arbitrated. The bandwidth required by each part is calculated and, with the ratios taken into account, bandwidth is reserved on the network for each so that sufficient performance is obtained overall.
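The ratio-based bandwidth reservation described here can be sketched as a simple proportional split; the flow names and per-flow demand figures below are illustrative assumptions:

```python
def allocate_bandwidth(demands, total_capacity):
    """Divide a shared link among traffic flows in proportion to demand."""
    total_demand = sum(demands.values())
    return {flow: total_capacity * d / total_demand
            for flow, d in demands.items()}

# Web<->DB and Web<->file traffic sharing a 1000 Mb/s inter-center link.
shares = allocate_bandwidth({"web-db": 300, "web-file": 100}, 1000)
print(shares)  # {'web-db': 750.0, 'web-file': 250.0}
```

A real controller would push these shares to the switches as per-VLAN rate limits rather than merely computing them.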
With the above configuration, the load from users and the state of server capacity can be monitored, and necessary and sufficient resources can be allocated, from within the data center or from linked data centers, before the load exceeds server capacity, guaranteeing quality of service for user requests. Because the needed spare servers can be shared over a wide scope, the total number of servers required is reduced. Even for a service in which servers with multiple functions cooperate, servers can be added at whichever function is the bottleneck, so the service scales well. Furthermore, since the entire process can be automated, changes in the amount requested by users can be followed quickly.
FIG. 13 is a diagram illustrating an application example of the embodiment of the present invention in a web server.
In the figure, the same components as those in the preceding figures are denoted by the same reference numerals.
When the load is light, only the front center 12-1 operates. When the load increases, spare servers 17-1 in the front center 12-1 are added as Web servers 15-1. When the load increases further, a Web server group 15-2 is created in the rear center 12-2, which then takes on part of the load.
FIG. 14 is a diagram illustrating an application example of the embodiment of the present invention in a web service.
In the figure, the same components as those in the preceding figures are denoted by the same reference numerals.
In this example, the Web service consists of a Web server 15-1, a database server 14-1, and a file server 14-2 in combination. When the load is light, only the front center 12-1 operates. As the load increases, spare servers 17-1 are added one after another at the bottleneck, and when the front center 12-1 can no longer cope, cooperation with the rear center 12-2 begins. The database server 14-1 in this example keeps its data synchronized between the front center 12-1 and the rear center 12-2 even during cooperation; this is realized by creating a VLAN spanning the centers and reserving bandwidth for it.
FIG. 15 shows an application example of the embodiment of the present invention when resources are exchanged between equal centers.
When the processing capability of service 1 in center 1 cannot be covered by center 1's spare servers 30-1, center 2 is asked to cooperate, and servers in center 2 (the shaded portion and the spare servers 30-2) are used. When server capacity in center 2 is also exhausted (including the spare servers 30-2), center 3 is asked to cooperate in turn, and servers in center 3 (the shaded portion and the spare servers 30-3) are used.
FIG. 16 is a diagram illustrating an example in which the embodiment of the present invention is applied to a front-stage center that does not have a spare server.
If the system control device 16-1 determines that servers for service provision are insufficient in the front-stage center 12-1, it requests cooperation from the rear-stage center 12-2 and uses servers there. Here, the load balancers and Web servers serve service 1 and service 2, and the service 1' and service 2' servers carry out service 1 and service 2 respectively. When server capacity in the rear-stage center 12-2 also runs short, spare servers 17 are added to each service as needed; the system control device 16-2 makes the decision to add and coordinates with the front-stage center 12-1.
FIGS. 17 to 24 are flowcharts for explaining the operation of the embodiment of the present invention when there is no cooperation between the databases provided in the centers.
FIG. 17 is a flowchart showing the overall flow of the system control apparatus.
First, in step S10, the load is measured. In step S11, it is determined whether the predicted load exceeds the allocated processing capacity. If the determination in step S11 is YES, processing capacity is added in step S12, and the process proceeds to step S15, where it waits for 10 seconds; this value should be set appropriately by the designer.
If the determination in step S11 is NO, step S13 checks whether the current load is at or below half of the allocated processing capacity. If YES, processing capacity is reduced in step S14, and the process proceeds to step S15; if NO, the process proceeds directly to step S15.
After step S15, the process returns to step S10 again.
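The loop of steps S10 to S15 can be sketched as follows; the callback names and the toy measurements are assumptions made for illustration:

```python
import time

def control_loop(measure, add_capacity, reduce_capacity,
                 allocated, interval=10, iterations=3):
    """Scaling loop of FIG. 17: grow when the predicted load exceeds the
    allocation, shrink when the current load falls to half of it or less."""
    for _ in range(iterations):
        current, predicted = measure()                         # step S10
        if predicted > allocated:                              # step S11
            allocated += add_capacity(predicted - allocated)   # step S12
        elif current <= allocated / 2:                         # step S13
            allocated -= reduce_capacity(allocated - current)  # step S14
        time.sleep(interval)                                   # step S15
    return allocated

# Toy run with canned measurements and callbacks that grant exactly
# what is asked for (all names here are illustrative, not from the patent).
loads = iter([(900, 1200), (400, 400), (600, 650)])
final = control_loop(lambda: next(loads),
                     add_capacity=lambda need: need,
                     reduce_capacity=lambda excess: excess,
                     allocated=1000, interval=0, iterations=3)
print(final)  # allocation moves 1000 -> 1200 -> 400 -> 650
```

In the patent the add and reduce callbacks correspond to the subroutines of FIGS. 19 and 23, which may grant less than requested when spares run out.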
FIG. 18 is a diagram showing details of the load measurement in step S10 of FIG. 17.
In step S20, the average number of processed requests over the last 10 seconds is collected from each server in use; this 10 seconds should match the wait in step S15 of FIG. 17. In step S21, the total average is computed and appended to the measurement history. In step S22, it is determined whether there are four or more history entries. If NO, the latest entry is used as the predicted value 30 seconds ahead in step S23, and the process proceeds to step S25. If YES, a predicted value 30 seconds ahead is computed in step S24 from the latest four entries by least-squares approximation, that is, a regression line is fitted to the latest four entries and evaluated 30 seconds ahead, and the process proceeds to step S25. In step S25, the predicted value 30 seconds ahead is set. In step S26, the latest entry is set as the current value, and control returns to the flow of FIG. 17.
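A sketch of the prediction rule in steps S22 to S25, assuming samples arrive every 10 seconds so the latest four span 30 seconds (pure-Python least squares; the function name is illustrative):

```python
def predict_30s(history):
    """Predict load 30 s ahead. With fewer than 4 samples, return the
    latest value; otherwise fit a line to the last 4 by least squares."""
    if len(history) < 4:
        return history[-1]                     # step S23
    ys = history[-4:]
    xs = [0, 10, 20, 30]                       # samples taken every 10 s
    n = 4
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept + slope * 60              # 30 s past the newest sample

print(predict_30s([100]))                  # too little history: 100
print(predict_30s([100, 110, 120, 130]))   # perfectly linear trend: 160.0
```

With a steady rise of 10 requests per 10-second sample, the fit projects three more steps of growth by 30 seconds out.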
FIG. 19 is a diagram showing details of the processing-capability addition process in step S12 of FIG. 17.
In step S30, the amount of processing capacity to add is determined by subtracting the currently assigned value from the predicted value. In step S31, it is determined whether there is a spare server in the own center. If YES, an additional server in the center is selected in step S32, and step S33 checks whether the additional capacity requirement is now satisfied: if NO, the process proceeds to step S34, and if YES, to step S38. If the determination in step S31 is NO, the process proceeds directly to step S34.
In step S34, it is determined whether there is a cooperation-destination center with spare processing capability. If YES, processing capability is assigned at that center in step S36, and step S37 checks whether the additional capacity requirement is now satisfied: if NO, the process returns to step S34, and if YES, it proceeds to step S38. If the determination in step S34 is NO, the administrator is warned in step S35 that the additional capacity cannot be met, and the process proceeds to step S38. In step S38, the VLAN is configured to include the selected servers. In step S39, the application is set up on the selected servers, and the process proceeds to step S40.
In step S40, it is determined whether there is cooperation between centers. If NO, the process proceeds to step S43. If YES, the load distribution ratio for the cooperating center is determined and the allocation devices are configured in step S41, the communication bandwidth between the own center and the cooperating center is set in step S42, and the process proceeds to step S43. In step S43, the load distribution ratio of the own center is determined and the allocation devices are configured, and control returns to the flow of FIG. 17.
FIG. 20 is a flowchart showing in detail the additional-server selection process in step S32 of FIG. 19.
In step S50, it is determined whether there is a server already set up for the required application. If NO, the process proceeds to step S54. If YES, step S51 checks whether any single such server can satisfy the additional capacity requirement. If NO, the highest-performance server for the required application is selected in step S52, and the process returns to step S50. If YES, in step S53 the lowest-performance server among those that can cover the additional capacity alone is selected, and the process proceeds to step S58.
In step S54, it is determined whether there is any available server. If NO, the process proceeds to step S58. If YES, step S55 checks whether any single server can satisfy the additional capacity requirement. If NO, the highest-performance server is selected in step S56, and the process returns to step S54. If YES, in step S57 the lowest-performance server among those that can cover the additional capacity alone is selected, and the process proceeds to step S58. In step S58, the list of allocated servers is constructed, and control returns to the process of FIG. 19.
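The greedy selection of FIG. 20 can be sketched as below. This simplification treats the spare pool as a single list of (name, capacity) pairs, whereas the flowchart first prefers servers already set up for the required application:

```python
def select_servers(spares, needed):
    """Greedy pick from a spare pool: if one server covers the remaining
    need, take the weakest such server; otherwise take the strongest
    available and repeat until the need is met or the pool is empty."""
    chosen = []
    pool = sorted(spares, key=lambda s: s[1])        # ascending capacity
    while needed > 0 and pool:
        covering = [s for s in pool if s[1] >= needed]
        pick = covering[0] if covering else pool[-1]
        pool.remove(pick)
        chosen.append(pick)
        needed -= pick[1]
    return chosen

# Need 500: take the 400 first, then the smallest server that covers 100.
print(select_servers([("a", 100), ("b", 400), ("c", 250)], 500))
```

Taking the weakest server that suffices keeps the strongest spares free for later, larger shortfalls.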
FIG. 21 is a flowchart showing the flow of the cooperating-center processing-capability assignment process in step S36 of FIG. 19.
In step S60, it is determined whether the upper limit on processing capacity imposed by the available bandwidth is smaller than the desired allocation. If NO, the process proceeds to step S62. If YES, the allocation is capped at the bandwidth-imposed upper limit in step S61, and the process proceeds to step S62.
In step S62, the cooperating center is asked to select servers. In step S63, additional servers are selected at the cooperating center. In step S64, the list of assigned servers is constructed, and control returns to the process of FIG. 19.
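The cap in steps S60 and S61, limiting the requested capability to what the inter-center link can carry, might look like this; the per-request size and the unit conversion are assumptions for illustration:

```python
def cap_by_bandwidth(desired, bandwidth_mbps, per_request_kb=50):
    """Clamp the capacity requested from a partner center (requests/s)
    to the request rate the inter-center link can sustain."""
    # Mb/s -> kB/s is *1000/8; divide by per-request size for requests/s.
    link_limit = bandwidth_mbps * 1000 / (per_request_kb * 8)
    return min(desired, link_limit)

print(cap_by_bandwidth(5000, 100))  # a 100 Mb/s link sustains 250 req/s
```

Requesting more capability than the link can feed would only waste the partner center's servers, so the cap is applied before the server-selection request in step S62.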
FIG. 22 is a detailed flowchart of the application setup in step S39 of FIG. 19.
In step S70, it is determined whether there is cooperation between centers. If NO, the process proceeds to step S74. If YES, step S71 checks whether the application archive has already been transferred; if it has, the process proceeds to step S73, and if not, the archive is transferred to the cooperating center in step S72 before proceeding to step S73. In step S73, the application is installed on the additional servers at the cooperating center, and the process proceeds to step S74. In step S74, the application is installed on the own center's additional servers, and control returns to the process of FIG. 19.
FIG. 23 is a flowchart showing the processing-capacity reduction process in step S14 of FIG. 17.
In step S80, the current measured value is subtracted from the assigned value to determine the amount of capacity to reduce. In step S81, it is determined whether there is a cooperating center. If YES, servers to release are chosen at the cooperating center in step S82, and step S83 checks whether all servers at the cooperating center have been released; if YES, the process returns to step S81, and if NO, it proceeds to step S85. If the determination in step S81 is NO, servers to release are chosen in the own center in step S84, and the process proceeds to step S85.
In step S85, the load distribution ratio of the own center is determined and the allocation devices are configured. In step S86, the load distribution ratio of the cooperating center is determined and the allocation devices are configured. In step S87, the process waits for in-flight user requests to complete. In step S88, the application is deleted from the released servers. In step S89, the VLAN is reconfigured to include only the remaining servers (the network path used for cooperation is updated). In step S90, it is determined whether the cooperation is to be dissolved. If YES, the bandwidth reserved between the own center and the cooperating center is released in step S91, and control returns to the process of FIG. 17. If NO, control returns directly to the process of FIG. 17.
FIG. 24 is a flowchart showing the selection of servers to release in step S82 or step S84 of FIG. 23.
In step S100, it is determined whether there is a server that can be reused for other purposes. If NO, the process proceeds to step S103. If YES, step S101 checks whether any such server has performance no greater than the remaining reduction amount. If NO, the process proceeds to step S103. If YES, in step S102 the highest-performance server among those within the remaining reduction amount is released, and the process returns to step S100.
In step S103, it is determined whether there is a server currently in use. If the determination in step S103 is no, the process proceeds to step S106. If the determination in step S103 is yes, it is determined in step S104 whether there is a server whose performance is lower than the remaining reduction performance. If the determination in step S104 is no, the process proceeds to step S106. If the determination in step S104 is yes, the server having the highest performance among the servers whose performance is lower than the remaining reduction performance is reduced in step S105, and the process returns to step S103.
In step S106, a list of the reduced servers is generated, and the process returns to the process of FIG.
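The selection logic of FIG. 24 amounts to a greedy pass over two server pools, preferring servers that can be reused for other purposes. A sketch under assumed data structures (each server is a hypothetical `(name, performance)` tuple):

```python
def select_reduction_servers(remaining, reusable, in_use):
    """Steps S100-S106: pick servers to remove. Within each pool the
    highest-performance server whose performance is still below the
    remaining reduction amount is removed first."""
    removed = []
    for pool in (reusable, in_use):            # S100-S102, then S103-S105
        for name, perf in sorted(pool, key=lambda s: s[1], reverse=True):
            if perf < remaining:               # S101/S104 condition
                removed.append(name)
                remaining -= perf
    return removed, remaining                  # S106: reduced-server list
```

Because `remaining` only decreases, one descending pass per pool is equivalent to the flowchart's loop of repeatedly taking the strongest eligible server.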
FIG. 25 to FIG. 30 are flowcharts showing the flow of processing according to the embodiment of the present invention when there is database cooperation.
FIG. 25 is a flowchart showing the overall processing flow of the own center which makes a cooperation request.
In step S110, the load of the Web server is measured. In step S111, it is determined whether the predicted processing capacity is larger than the allocated processing capacity. If the determination in step S111 is yes, Web processing capacity is added in step S112, and the process proceeds to step S115. If the determination in step S111 is no, it is determined in step S113 whether the current processing capacity is smaller than half of the allocated processing capacity. If the determination in step S113 is no, the process proceeds to step S115. If the determination in step S113 is yes, the Web processing capacity is reduced in step S114, and the process proceeds to step S115. In step S115, the load on the in-center database is measured. In step S116, it is determined whether the predicted processing capacity is larger than the allocated processing capacity. If the determination in step S116 is yes, database processing capacity is added in step S117, and the process proceeds to step S120. If the determination in step S116 is no, it is determined in step S118 whether the current processing capacity is smaller than half of the allocated processing capacity. If the determination in step S118 is no, the process proceeds to step S120. If the determination in step S118 is yes, the database processing capacity is reduced in step S119, and the process proceeds to step S120. In step S120, the process waits for 10 seconds; this waiting time should be set appropriately by the designer. After step S120, the process returns to step S110.
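The add/reduce thresholds of this control loop can be illustrated as follows. The callback arguments, capacity units, and loop controls are hypothetical stand-ins; the patent's loop runs indefinitely with a designer-chosen wait.

```python
import time

def monitor_loop(measure, add_capacity, reduce_capacity, allocated,
                 interval=10.0, cycles=None):
    """Sketch of the S110-S120 cycle for one resource type: expand when
    the predicted load exceeds the allocation, shrink when the current
    load falls below half of it."""
    n = 0
    while cycles is None or n < cycles:
        predicted, current = measure()
        if predicted > allocated:            # S111/S116: add ahead of demand
            allocated += add_capacity(predicted - allocated)
        elif current < allocated / 2:        # S113/S118: half-idle, shrink
            allocated -= reduce_capacity(allocated - current)
        if cycles is None or n < cycles - 1:
            time.sleep(interval)             # S120: designer-chosen wait
        n += 1
    return allocated
```

The half-allocation threshold for reduction gives the loop hysteresis, so capacity is not added and removed on every small fluctuation.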
FIG. 26 is a flowchart showing the overall processing flow of the cooperation destination center.
In step S130, the database load in the center is measured. In step S131, it is determined whether the predicted processing capacity is larger than the allocated processing capacity. If the determination in step S131 is yes, database processing capacity is added in step S132, and the process proceeds to step S135. If the determination in step S131 is no, it is determined in step S133 whether the current processing capacity is smaller than half of the allocated processing capacity. If the determination in step S133 is no, the process proceeds to step S135. If the determination in step S133 is yes, the database processing capacity is reduced in step S134, and the process proceeds to step S135. In step S135, the process waits for 10 seconds and returns to step S130. The waiting time is not limited to 10 seconds and should be set appropriately by the designer.
FIG. 27 is a flowchart showing detailed processing of web load measurement or database load measurement performed at each center.
In step S140, the average number of processes over 10 seconds is collected from the servers in use. This 10 seconds should be the same value as the waiting time in step S120 in FIG. 25 and step S135 in FIG. 26. In step S141, the total average number of processes is calculated and added to the measurement history. In step S142, it is determined whether there are four or more entries in the measurement history. If the determination in step S142 is no, the latest history value is set as the predicted value 30 seconds ahead in step S143, and the process proceeds to step S145. If the determination in step S142 is yes, a predicted value 30 seconds ahead is derived from the latest four history values by least-squares approximation in step S144, and the process proceeds to step S145. This derivation method is as described in FIG.
In step S145, the predicted value 30 seconds ahead is set. In step S146, the latest history value is set as the current value, and the process returns to the process of FIG. 25 or FIG. 26.
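The least-squares extrapolation of steps S140 to S146 can be sketched as follows, assuming 10-second sampling and a 30-second prediction horizon as stated in the text:

```python
def predict_30s_ahead(history, interval=10.0, horizon=30.0):
    """Steps S142-S144: fit a least-squares line through the latest four
    10-second averages and extrapolate 30 seconds past the newest sample.
    With fewer than four samples the latest value is used as-is (S143)."""
    if len(history) < 4:                       # S142/S143
        return history[-1]
    ys = history[-4:]                          # S144: latest four samples
    xs = [i * interval for i in range(4)]      # sample times 0, 10, 20, 30
    n = 4
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept + slope * (xs[-1] + horizon)
```

For a steadily rising load the linear fit reports the trend continuing, which is what lets capacity be added before the allocation is actually exceeded.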
FIG. 28 is a detailed flow of the web processing capability addition process in step S112 of FIG.
When a cooperation center is added, the flow of FIG. 28 is executed from step S154.
First, in step S150, the currently assigned value is subtracted from the predicted value to determine the amount of processing capacity to be added. In step S151, it is determined whether there is a spare server in the center. If the determination in step S151 is no, the process proceeds to step S154. If the determination in step S151 is yes, an additional server in the center is selected in step S152. Details of this processing are as shown in FIG. 30. In step S153, it is determined whether the additional processing capacity has been satisfied. If the determination in step S153 is no, the process proceeds to step S154. If the determination in step S153 is yes, the process proceeds to step S158.
In step S154, it is determined whether there is a cooperation destination center having spare processing capacity. If the determination in step S154 is yes, processing capacity is assigned at the cooperation center in step S156. Details of this processing are as shown in FIG. In step S157, it is determined whether the additional processing capacity has been satisfied. If the determination in step S157 is no, the process returns to step S154. If the determination in step S157 is yes, the process proceeds to step S158. If the determination in step S154 is no, the administrator is warned in step S155 that the additional processing capacity cannot be satisfied, and the process proceeds to step S158.
In step S158, the VLAN is set to include the selected servers, and in step S159, an application is set on the selected servers. The application settings are as shown in FIG. In step S160, it is determined whether there is cooperation between the centers. If the determination in step S160 is yes, the load distribution ratio of the cooperation center is determined and the allocation device is set in step S161. In step S162, the communication band between the own center and the cooperation center is set, and the process proceeds to step S163.
If the determination in step S160 is no, the process proceeds directly to step S163. In step S163, the load distribution ratio of the own center is determined, the allocation device is set, and the process returns to the process of FIG. 25.
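The allocation order of steps S150 to S157 can be sketched as follows. All names are illustrative; spare servers are simply consumed smallest-first here for brevity, whereas the patent's per-server selection rule is the one detailed in FIG. 30.

```python
def add_web_capacity(predicted, assigned, own_spares, coop_centers):
    """Steps S150-S157 (sketch): determine the shortfall, take spare
    servers from the own center first, then request capacity from
    cooperation centers; flag any shortfall that remains (S155)."""
    need = predicted - assigned                 # S150: additional amount
    selected, coop_used = [], []
    for name, perf in sorted(own_spares, key=lambda s: s[1]):   # S151-S153
        if need <= 0:
            break
        selected.append(name)
        need -= perf
    for center, capacity in coop_centers:       # S154, S156, S157
        if need <= 0:
            break
        grant = min(need, capacity)
        coop_used.append((center, grant))
        need -= grant
    warning = need > 0                          # S155: warn the administrator
    return selected, coop_used, warning
```

The VLAN reconfiguration, application setup, and distribution-ratio settings of steps S158 to S163 would follow once this plan is produced.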
FIG. 29 is a detailed flow of the database processing capability addition process in step S117 in FIG. 25 and step S132 in FIG.
In step S170, the currently assigned value is subtracted from the predicted value to determine the amount of processing capacity to be added. In step S171, it is determined whether there is a spare server in the center. If the determination in step S171 is no, the Web processing capacity that the current databases can sustain is calculated in step S177, and the shortage of Web capacity is added in the cooperation center in step S178. The processing in step S178 is as shown in FIG. 28. The process then returns to the process of FIG. 25 or FIG. 26.
If the determination in step S171 is yes, an additional server in the center is selected in step S172. In step S173, it is determined whether the additional processing capacity has been satisfied. If the determination in step S173 is no, the process proceeds to step S177. If the determination in step S173 is yes, the VLAN is set to include the selected servers in step S174, the database is set on the selected servers in step S175, and the database list of the Web servers in the center is updated in step S176. The process then returns to the process of FIG. 25 or FIG. 26.
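The fallback of steps S177 and S178 can be illustrated as follows. The conversion ratio between database capacity and the Web capacity it can sustain (`db_web_ratio`) is a hypothetical parameter introduced only for this sketch; the patent does not specify how the calculation in step S177 is performed.

```python
def add_db_capacity(predicted, assigned, spare_db, db_web_ratio,
                    current_db_capacity):
    """Steps S170-S178 (sketch): if a spare database server exists, grow
    locally; otherwise derive the Web capacity the current databases can
    still sustain (S177) and hand the Web-capacity shortage over to a
    cooperation center instead (S178)."""
    need = predicted - assigned
    if spare_db:                                # S171-S176: grow locally
        return {"action": "add_local", "need": need}
    sustainable_web = current_db_capacity * db_web_ratio
    return {"action": "add_web_at_coop",        # shortage handled remotely
            "shortage_web": max(0.0, need * db_web_ratio),
            "sustainable_web": sustainable_web}
```

The design point this captures is that when the database tier cannot be scaled locally, the excess load is diverted at the Web tier to a cooperation center rather than overloading the local database.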
FIG. 30 is a flowchart showing details of an additional server selection process common to the Web server and the database.
In step S180, it is determined whether there is a server provided with the required application. If the determination in step S180 is yes, it is determined in step S181 whether any single server provided with the required application can satisfy the additional processing capacity amount. If the determination in step S181 is no, the server with the maximum performance among the servers for the required application is selected in step S182, and the process returns to step S180. If the determination in step S181 is yes, the server with the lowest performance is selected in step S183 from among the servers that can each satisfy the additional processing capacity amount by one unit, and the process proceeds to step S188.
If the determination in step S180 is no, it is determined in step S184 whether there is an available server. If the determination in step S184 is yes, it is determined in step S185 whether any single available server can satisfy the additional processing capacity amount. If the determination in step S185 is no, the available server with the maximum performance is selected in step S186, and the process returns to step S184. If the determination in step S185 is yes, the server with the lowest performance is selected in step S187 from among the servers that can each satisfy the additional processing capacity amount by one unit, and the process proceeds to step S188. If the determination in step S184 is no, the process proceeds directly to step S188.
In step S188, the list of allocated servers is constructed, and the process returns to the process of FIG. 28 or FIG. 29.
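The selection rule of FIG. 30 can be sketched as a two-pool greedy choice; the `(name, performance)` tuples and function name are illustrative:

```python
def select_additional_servers(need, app_servers, free_servers):
    """Steps S180-S188: prefer servers already carrying the required
    application; within each pool take the weakest single server that can
    cover the whole remaining amount, otherwise the strongest available
    one, and repeat until the demand is met or the pools are exhausted."""
    chosen = []
    for pool in (list(app_servers), list(free_servers)):
        while need > 0 and pool:
            big_enough = [s for s in pool if s[1] >= need]
            if big_enough:                       # S183/S187: weakest sufficient
                pick = min(big_enough, key=lambda s: s[1])
            else:                                # S182/S186: strongest available
                pick = max(pool, key=lambda s: s[1])
            pool.remove(pick)
            chosen.append(pick[0])
            need -= pick[1]
    return chosen, max(0, need)                  # S188: allocated-server list
```

Taking the weakest server that still covers the whole remainder avoids tying up a high-performance spare when a small one suffices, which is the stated preference of claims 7 and 8.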

  According to the present invention, service quality can be ensured without securing a full complement of spare servers for each service and each data center, because spare servers are allocated dynamically only when they are needed. Even a small data center can guarantee service quality against a sudden concentration of load by cooperating with other data centers. Furthermore, sharing the spare servers reduces capital investment while making more effective use of the equipment.

Claims (14)

  1. A load balancing method for a device including a plurality of servers that provide services to clients via a network, the method comprising the steps of:
    providing a plurality of spare servers, on which no service is set in the initial state, for sharing the load of servers providing a normal service; and
    a control step of, in anticipation of an increase in the load of a server providing the normal service, setting an application for the service on a spare server so that the spare server becomes a server for providing the service and shares the load with the server providing the normal service.
  2. The method according to claim 1, wherein, when a plurality of the devices are connected via a network and one device cannot support the load, a server of another device that is normally used to provide a required service is provided for the one device.
  3. The method according to claim 2, wherein the other device has a spare server, and provides the spare server when the server provided for the one device cannot support the load.
  4. The method according to claim 2, wherein when a load is shared among the plurality of devices, a communication band is secured between the plurality of devices.
  5. The method according to claim 1, wherein the control step determines whether to use a spare server for providing a service by predicting the magnitude of the load after a predetermined time from the number of requests processed by the servers in the past.
  6. The method according to claim 1, wherein, when spare servers are used for a specific service, they are used in order starting from the spare server most suitable for providing the specific service, based on the hardware characteristics of the spare servers.
  7. The method according to claim 1, wherein, when a spare server is used for a specific service, a spare server that can replenish the required processing capacity by a single unit is used preferentially.
  8. The method according to claim 7, wherein, among the spare servers that can replenish the required processing capacity by a single unit, the spare server having the lowest performance is used preferentially.
  9. The method according to claim 2, wherein, when a spare server is used for a specific service, if there is no spare server capable of replenishing the required processing capacity by a single unit, the spare server having the maximum performance is used.
  10. The method according to claim 1, wherein, when the load becomes low enough to be supported without spare servers, the control step deletes the application for providing the service from a spare server that was used to provide the service whose load has decreased, and stops the use of that spare server.
  11. The method according to claim 10, wherein when the use of the spare server is stopped, the use is stopped in consideration of the hardware characteristics of the spare server.
  12. The method according to claim 11, wherein, when the use of spare servers is stopped, the use of the spare server having the maximum performance is stopped, within a range in which the load of the specific service can still be supported by the remaining servers and spare servers.
  13. A device having a plurality of servers for providing services to clients via a network, comprising:
    a plurality of spare servers, on which no service is set in the initial state, for sharing the load of servers providing a normal service; and
    a control means for, in anticipation of an increase in the load of a server providing the normal service, setting an application for the service on a spare server so that the spare server becomes a server for providing the service and shares the load with the server providing the normal service.
  14. A program for causing a computer to implement a load balancing method for a device including a plurality of servers that provide services to clients via a network, the method comprising the steps of:
    providing a plurality of spare servers, on which no service is set in the initial state, for sharing the load of servers providing a normal service; and
    a control step of, in anticipation of an increase in the load of a server providing the normal service, setting an application for the service on a spare server so that the spare server becomes a server for providing the service and shares the load with the server providing the normal service.
JP2004569568A 2003-03-18 2003-03-18 Load balancing system by inter-site cooperation Granted JPWO2004084085A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2003/003273 WO2004084085A1 (en) 2003-03-18 2003-03-18 Load distributing system by intersite cooperation

Publications (1)

Publication Number Publication Date
JPWO2004084085A1 true JPWO2004084085A1 (en) 2006-06-22

Family

ID=33018146

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2004569568A Granted JPWO2004084085A1 (en) 2003-03-18 2003-03-18 Load balancing system by inter-site cooperation

Country Status (2)

Country Link
JP (1) JPWO2004084085A1 (en)
WO (1) WO2004084085A1 (en)

Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4558740B2 (en) * 2004-10-20 2010-10-06 富士通株式会社 Application management program, application management method, and application management apparatus
JP4874807B2 (en) * 2004-10-20 2012-02-15 富士通株式会社 Server management program, server management method, and server management apparatus
JP2006259793A (en) * 2005-03-15 2006-09-28 Hitachi Ltd Shared resource management method, and its implementation information processing system
JP4650203B2 (en) * 2005-10-20 2011-03-16 株式会社日立製作所 Information system and management computer
JP4377369B2 (en) 2005-11-09 2009-12-02 株式会社日立製作所 Resource allocation arbitration device and resource allocation arbitration method
US7675854B2 (en) 2006-02-21 2010-03-09 A10 Networks, Inc. System and method for an adaptive TCP SYN cookie with time validation
JP2007264794A (en) * 2006-03-27 2007-10-11 Fujitsu Ltd Parallel distributed processing program and system
WO2008007435A1 (en) * 2006-07-13 2008-01-17 Fujitsu Limited Resource management program, resource management method, and resource management device
US8584199B1 (en) 2006-10-17 2013-11-12 A10 Networks, Inc. System and method to apply a packet routing policy to an application session
US8312507B2 (en) 2006-10-17 2012-11-13 A10 Networks, Inc. System and method to apply network traffic policy to an application session
JP5315128B2 (en) * 2009-05-25 2013-10-16 株式会社日立製作所 Process request destination management apparatus, process request destination management program, and process request destination management method
US9960967B2 (en) 2009-10-21 2018-05-01 A10 Networks, Inc. Determining an application delivery server based on geo-location information
JP2011138202A (en) * 2009-12-25 2011-07-14 Fujitsu Ltd Server device, server load distribution device, server load distribution method, and program
US9215275B2 (en) 2010-09-30 2015-12-15 A10 Networks, Inc. System and method to balance servers based on server load status
US9609052B2 (en) 2010-12-02 2017-03-28 A10 Networks, Inc. Distributing application traffic to servers based on dynamic service response time
US8897154B2 (en) 2011-10-24 2014-11-25 A10 Networks, Inc. Combining stateless and stateful server load balancing
US9386088B2 (en) 2011-11-29 2016-07-05 A10 Networks, Inc. Accelerating service processing using fast path TCP
US9094364B2 (en) 2011-12-23 2015-07-28 A10 Networks, Inc. Methods to manage services over a service gateway
US10044582B2 (en) 2012-01-28 2018-08-07 A10 Networks, Inc. Generating secure name records
US8782221B2 (en) 2012-07-05 2014-07-15 A10 Networks, Inc. Method to allocate buffer for TCP proxy session based on dynamic network conditions
US10021174B2 (en) 2012-09-25 2018-07-10 A10 Networks, Inc. Distributing service sessions
CN108027805A (en) 2012-09-25 2018-05-11 A10网络股份有限公司 Load distribution in data network
US10002141B2 (en) 2012-09-25 2018-06-19 A10 Networks, Inc. Distributed database in software driven networks
US9843484B2 (en) 2012-09-25 2017-12-12 A10 Networks, Inc. Graceful scaling in software driven networks
US9338225B2 (en) 2012-12-06 2016-05-10 A10 Networks, Inc. Forwarding policies on a virtual service network
US9531846B2 (en) 2013-01-23 2016-12-27 A10 Networks, Inc. Reducing buffer usage for TCP proxy session based on delayed acknowledgement
US9900252B2 (en) 2013-03-08 2018-02-20 A10 Networks, Inc. Application delivery controller and global server load balancer
WO2014144837A1 (en) 2013-03-15 2014-09-18 A10 Networks, Inc. Processing data packets using a policy based network path
US10027761B2 (en) 2013-05-03 2018-07-17 A10 Networks, Inc. Facilitating a secure 3 party network session by a network device
US10038693B2 (en) 2013-05-03 2018-07-31 A10 Networks, Inc. Facilitating secure network traffic by an application delivery controller
US10230770B2 (en) 2013-12-02 2019-03-12 A10 Networks, Inc. Network proxy layer for policy-based application proxies
US9942152B2 (en) 2014-03-25 2018-04-10 A10 Networks, Inc. Forwarding data packets using a service-based forwarding policy
US9942162B2 (en) 2014-03-31 2018-04-10 A10 Networks, Inc. Active application response delay time
US9906422B2 (en) 2014-05-16 2018-02-27 A10 Networks, Inc. Distributed system to determine a server's health
US9986061B2 (en) 2014-06-03 2018-05-29 A10 Networks, Inc. Programming a data network device using user defined scripts
US10129122B2 (en) 2014-06-03 2018-11-13 A10 Networks, Inc. User defined objects for network devices
US9992229B2 (en) 2014-06-03 2018-06-05 A10 Networks, Inc. Programming a data network device using user defined scripts with licenses
JP2016045505A (en) * 2014-08-19 2016-04-04 日本電信電話株式会社 Service providing system and service providing method
US10243791B2 (en) 2015-08-13 2019-03-26 A10 Networks, Inc. Automated adjustment of subscriber policies
WO2018097058A1 (en) * 2016-11-22 2018-05-31 日本電気株式会社 Analysis node, method for managing resources, and program recording medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3754481B2 (en) * 1996-02-02 2006-03-15 富士通株式会社 Compound computer system
JP2000298637A (en) * 1999-04-15 2000-10-24 Nec Software Kyushu Ltd System and method for load distribution and recording medium
JP2001007844A (en) * 1999-06-24 2001-01-12 Canon Inc Network status server, information distribution system, and its control method and storage medium storing its control program
JP2002163241A (en) * 2000-11-29 2002-06-07 Ntt Data Corp Client server system
JP2002259354A (en) * 2001-03-01 2002-09-13 Hitachi Ltd Network system and load distributing method

Also Published As

Publication number Publication date
WO2004084085A1 (en) 2004-09-30


Legal Events

Date Code Title Description
A131 Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131); effective date: 2006-11-21
A521 Written amendment (JAPANESE INTERMEDIATE CODE: A523); effective date: 2007-01-18
A02 Decision of refusal (JAPANESE INTERMEDIATE CODE: A02); effective date: 2007-05-22