JP2010072733A

JP2010072733A - Server management device, server management method and program

Info

Publication number: JP2010072733A
Application number: JP2008236845A
Authority: JP
Inventors: Airi Saito; 愛理齋藤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2008-09-16
Filing date: 2008-09-16
Publication date: 2010-04-02

Abstract

PROBLEM TO BE SOLVED: To provide a server management device, a server management method, and a program that suppress power consumption in a data center by efficiently limiting the temperature rise of servers in a machine room of the data center and thereby reducing the operation amount of the air conditioning facility. SOLUTION: A failure detection means 23 determines whether any physical server has reached an abnormal temperature, and a treatment execution means 24 moves a virtual machine running on the physical server with the abnormal temperature to another physical server. In doing so, the treatment execution means 24 selects the destination physical server based on the heat generation characteristics of each physical server. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a server management apparatus, a server management method, and a server management program for managing a plurality of virtual machines operated on a plurality of physical servers.

Data centers that manage multiple physical servers make use of provisioning technology, which manages the physical servers as a single server resource through virtualization and dynamically partitions that resource to run multiple virtual machines.

This provisioning technology monitors the load on each physical server. It places virtual machines so that the load is consolidated onto as few physical servers as possible, which increases the number of surplus servers, and when it detects a physical server with an abnormal load it starts a surplus server and relocates virtual machines to level the load. This realizes a system environment that can respond quickly and flexibly to requests from clients.

Meanwhile, as interest in addressing environmental issues grows worldwide, data centers that manage multiple physical servers are also required to save power.

In a typical data center, to prevent system outages caused by overheating of physical servers, temperature sensors are attached to each rack in the machine room to monitor temperature, and the temperature of each physical server is controlled in real time in conjunction with the air conditioning equipment, so that the equipment installed in the machine room is cooled efficiently.

With such air conditioning control in place, the air conditioning equipment that cools the equipment accounts for a larger share of the power consumed in a data center than the equipment itself, such as the physical servers. To save power in the data center, a technology has therefore been sought that suppresses heat generation by physical servers and other equipment and so reduces the operation amount of the air conditioning equipment.

As a related technology, a form of provisioning technology is known that takes operational measures against thermal abnormalities on individual physical servers. This technology sets a temperature threshold for each physical server. When it detects a physical server whose temperature exceeds the threshold, it relocates the virtual machines, stops the physical server that reached the abnormal temperature, and automatically runs the system in degraded operation, thereby suppressing heat generation by the physical server.

Other related technologies are disclosed in Patent Documents 1 to 3. Patent Document 1 discloses a system that simulates the flow and temperature distribution inside an industrial furnace to support furnace design. Patent Document 2 discloses an apparatus having a temperature measuring device, a means for photographing the ceiling at the place where the temperature was measured, and a data processing means that creates an indoor temperature distribution map from the ceiling image and the measured temperature values. Patent Document 3 discloses a system that builds a knowledge base used for diagnosing power plants and the like and, when the outlet temperature of the plant becomes abnormal, infers and presents a countermeasure according to the knowledge base (temperature control rules and work instruction rules).

If the technologies disclosed in Patent Documents 1 to 3 are applied to temperature management in a data center, a temperature distribution map of the machine room can be created and, when an abnormal temperature is detected, a countermeasure for the affected equipment can be determined according to rules. In that it determines an operational measure for the physical server on which the temperature abnormality was detected, this is the same as the provisioning technology described above.

JP-A-11-250117; JP 2001-142920 A; Japanese Patent Publication No. 7-38239

However, the related technologies described above are meant for monitoring physical servers for signs of failure. They can avoid failures caused by abnormal temperature in a physical server and can temporarily eliminate heat accumulation, but they do not take the temperature of each physical server in the machine room, that is, the temperature distribution in the machine room, into account when taking operational measures. Appropriate measures that prevent temperature rise at the scale of the machine room are therefore not guaranteed. For example, degraded operation may shift load to another physical server, which then immediately heats up to an abnormal temperature. As a result, some physical server is always at an abnormal temperature, and the operation amount of the air conditioning equipment cannot be reduced.

Accordingly, an object of the present invention is to remedy the problems of the related art described above and to provide a server management apparatus, a server management method, and a server management program that effectively suppress the temperature rise of servers in a machine room, reduce the operation amount of the air conditioning equipment, and reduce the power consumption of the machine room as a whole.

To achieve the above object, the server management apparatus of the present invention comprises: a communication unit connected by a line to a plurality of physical servers on which a plurality of virtual machines are implemented; a real environment data collecting means that collects the operating status of each virtual machine from each physical server at preset time intervals; a server temperature collecting means that collects the temperature of each physical server at preset time intervals from temperature sensors installed in correspondence with the physical servers; a failure detection means that individually compares the collected temperature of each physical server with a threshold set in advance for that physical server and determines whether there is an abnormal temperature server, that is, a physical server whose temperature is equal to or higher than its threshold; and a treatment execution means that, when an abnormal temperature server is determined to exist, moves a virtual machine running on that server to another physical server. The treatment execution means selects at least one of the virtual machines running on the abnormal temperature server as a migration target machine, identifies, based on preset heat generation characteristics specific to each physical server, a physical server whose temperature is estimated not to exceed its threshold even if it accepts the migration target machine, and determines that physical server as the destination of the migration target machine.

The server management method of the present invention collects the operating status of each virtual machine at preset time intervals from a plurality of physical servers on which a plurality of virtual machines are implemented, and collects the temperature of each physical server at preset time intervals from temperature sensors installed in advance in correspondence with the physical servers. It individually compares the collected temperature of each physical server with a threshold set in advance for that physical server and determines whether there is an abnormal temperature server, that is, a physical server whose temperature is equal to or higher than its threshold. When there is an abnormal temperature server, the method selects at least one of the virtual machines running on it as a migration target machine to be moved to another physical server, identifies, based on predetermined heat generation characteristics specific to each physical server, a physical server whose temperature is estimated not to exceed its threshold even if it accepts the migration target machine, and moves the migration target machine to the identified physical server.

Furthermore, the server management program of the present invention causes a computer to execute: a real environment data collection function that collects the operating status of each virtual machine at preset time intervals from a plurality of physical servers on which a plurality of virtual machines are implemented; a server temperature collection function that collects the temperature of each physical server at preset time intervals from temperature sensors installed in correspondence with the physical servers; a failure detection function that individually compares the collected temperature of each physical server with a threshold set in advance for that physical server and determines whether there is an abnormal temperature server, that is, a physical server whose temperature is equal to or higher than its threshold; and a treatment execution function that, when there is an abnormal temperature server, selects at least one of the virtual machines running on it as a migration target machine to be moved to another physical server, identifies, based on predetermined heat generation characteristics specific to each physical server, a physical server whose temperature is estimated not to exceed its threshold even if it accepts the migration target machine, and moves the migration target machine to the identified physical server.

With the configuration described above, when a physical server that has reached an abnormal temperature is detected, a virtual machine running on that server is moved to another physical server. Moreover, the destination is selected based on the heat generation characteristics of each physical server, so that the virtual machine is moved to a physical server with headroom below its upper temperature limit. The temperature rise of the physical servers installed in the machine room can thus be suppressed efficiently; as a result, the power consumed by the air conditioning equipment in the machine room is reduced and power saving in the data center is achieved.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a diagram showing the network configuration of the data center of this embodiment. As shown in FIG. 1, a plurality of physical servers A1 to An are connected to a server management apparatus C, and a plurality of virtual machines B1 to Bm are built and operated on the physical servers A1 to An.

FIG. 2 is a functional block diagram showing the configuration of the server management apparatus C of this embodiment. As shown in FIG. 2, the server management apparatus C comprises a communication unit 1 connected by a line to the physical servers A1 to An on which the virtual machines B1 to Bm are implemented, a data processing device 2 that controls the placement of virtual machines based on information from the physical servers A1 to An, a storage device 3 that stores various kinds of information, an input unit 4 for entering information according to user operations, and a display unit 5 including a display.

The communication unit 1 is an interface that connects to the physical servers A1 to An installed in the machine room and sends and receives data.

The data processing device 2 includes: a real environment data collecting means 21 that collects, from each physical server, real environment data such as the operating status of the virtual machines B1 to Bm built on the physical servers A1 to An in the machine room; a server temperature monitoring means 22 that collects the temperature of each of the physical servers A1 to An at preset time intervals from temperature sensors installed in advance in correspondence with the physical servers; a failure detection means 23 that individually compares the collected temperature of each physical server with a threshold set in advance for that server and determines whether any physical server has a temperature equal to or higher than its threshold; and a treatment execution means 24 that detects a physical server with a temperature at or above its threshold as an abnormal temperature server, identifies the virtual machines running on that server from the real environment data, and relocates those virtual machines to other physical servers according to a preset policy.

The storage device 3 includes a system configuration information storage unit 31 that stores information about the system environment, such as the real environment data, and a policy storage unit 32 that stores in advance, as policies, the measures to be taken when a failure occurs in a managed physical server.

The data processing device 2 further includes: a policy information editing means 25 that edits the policies stored in the policy storage unit 32, or creates and stores new ones, according to instructions from the input unit 4; a server-specific information receiving means 26 that receives server-specific information, such as the position of each of the physical servers A1 to An in the machine room and the heat generation characteristics of each server, from the input unit 4 and stores it in the system configuration information storage unit 31; and a temperature distribution map creating means 27 that creates a temperature distribution map of the machine room based on the temperatures of the physical servers A1 to An collected by the server temperature monitoring means 22 and their positions in the machine room, and displays it on the display unit 5.

The real environment data collecting means 21 has a function of collecting, at regular or random time intervals, real environment data from the physical servers A1 to An installed in the machine room. This data includes the operating status of the virtual machines, such as the CPU usage of the virtual machines B1 to Bm. Any communication method known in the prior art may be used to collect the information.

The server temperature monitoring means 22 has a function of collecting the temperatures of the physical servers A1 to An at regular or random time intervals from the temperature sensors installed in correspondence with the physical servers in the machine room, and of storing them in the system configuration information storage unit 31. The server temperature monitoring means 22 may be configured, for example, to acquire the temperature measured by a temperature sensor built into the CPU of each of the physical servers A1 to An, or to acquire the temperature measured by a temperature sensor provided on a rack in the machine room.

The failure detection means 23 has a function of comparing the temperatures of the physical servers A1 to An collected by the server temperature monitoring means 22 with the threshold set in advance for each physical server and detecting any physical server whose temperature exceeds its threshold. The failure detection means 23 treats a physical server whose temperature exceeds its threshold as an abnormal temperature server and notifies the treatment execution means 24 of its identification information. The threshold set for each of the physical servers A1 to An is stored in the policy storage unit 32 as part of the policy.
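The per-server comparison performed by the failure detection means 23 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name, the dictionary layout, and the sample values are hypothetical. The comparison uses "at or above", following the wording used for the failure detection means in the summary of the invention; a strict "exceeds" comparison would be an equally plausible reading of the paragraph above.

```python
def find_abnormal_servers(temperatures, thresholds):
    """Return the IDs of servers whose temperature is at or above their own threshold.

    temperatures: latest temperature per server ID (hypothetical layout).
    thresholds: per-server threshold, as would be held as part of the policy.
    """
    return [
        server_id
        for server_id, temp in temperatures.items()
        if temp >= thresholds[server_id]
    ]


# Hypothetical sample values in degrees Celsius.
latest = {"A1": 71.5, "A2": 48.0, "A3": 60.0}
limits = {"A1": 70.0, "A2": 65.0, "A3": 60.0}
abnormal = find_abnormal_servers(latest, limits)  # A1 and A3 are at or above their limits
```

Keeping the thresholds in a separate mapping mirrors the description, where they belong to the policy rather than to the collected temperature data.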

On receiving the identification information of an abnormal temperature server from the failure detection means 23, the treatment execution means 24 operates according to the policy stored in the policy storage unit 32 and changes the placement of virtual machines. There are two main ways to change the placement. One is to shut down the virtual machine to be moved, move its configuration files between the local disks of the physical servers, and start the virtual machine at the destination. The other is to place the virtual machine's configuration files on shared storage so that the virtual machine can be moved to another physical server without being stopped.

The policy stored in the policy storage unit 32 in this embodiment specifies the measures for moving a virtual machine running on an abnormal temperature server to another server. An example of such a policy is described below.

By operating according to the policy, the treatment execution means 24 of this embodiment, for example, takes the virtual machine with the highest CPU usage among the virtual machines running on the abnormal temperature server as the migration target machine, takes the heat generation characteristics of each of the other physical servers into account to select a physical server whose temperature is estimated not to reach its threshold even if it accepts the migration target machine, and moves the migration target machine to the selected physical server.
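The first step of this example policy, choosing the virtual machine with the highest CPU usage as the one to move, can be sketched as below. The function name and the mapping from virtual machine ID to CPU usage are assumptions made for illustration; the description does not specify a data layout.

```python
def select_migration_target(vm_cpu_usage):
    """Pick the VM with the highest CPU usage on the abnormal temperature server.

    vm_cpu_usage: CPU usage in percent per VM ID (hypothetical layout).
    Returns None when no VM is running on the server.
    """
    if not vm_cpu_usage:
        return None
    return max(vm_cpu_usage, key=vm_cpu_usage.get)


# Hypothetical usage figures for the VMs on one overheated server.
target = select_migration_target({"B1": 35.0, "B2": 60.0, "B3": 5.0})
```

Here `target` is the ID of the busiest virtual machine, which is the one most likely to relieve the overheated server when moved.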

In this embodiment, the system configuration information storage unit 31 stores two heat generation characteristics for each of the physical servers A1 to An: the rate of temperature rise relative to CPU usage, and a temperature rise degree, which expresses as a level how easily the temperature rises because of where the server is installed. An example policy therefore calculates, for each of the physical servers A1 to An, the estimated temperature if it were to accept the migration target machine, based on the server's rate of temperature rise relative to CPU usage, the CPU usage of the migration target machine, and the server's latest temperature. It then determines a physical server whose estimated temperature is below its threshold as the destination of the migration target machine.
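The description names the three inputs to the estimated temperature but does not give a formula. One simple reading, shown here purely as an assumption, is a linear model in which the added CPU load raises the temperature in proportion to the server's temperature rise rate. The function name, the units (degrees Celsius per percentage point of CPU usage), and the sample numbers are all hypothetical.

```python
def estimate_temperature(latest_temp, rise_rate, vm_cpu_usage):
    """Estimate a server's temperature after it accepts the migration target machine.

    latest_temp:  most recently collected temperature of the candidate server (deg C).
    rise_rate:    temperature rise per percentage point of CPU usage (assumed unit).
    vm_cpu_usage: CPU usage of the migration target machine in percent.
    Assumes a linear relation; the source does not state the actual model.
    """
    return latest_temp + rise_rate * vm_cpu_usage


# Hypothetical candidate: 50 deg C now, 0.25 deg C per percent, VM uses 20 percent.
estimated = estimate_temperature(50.0, 0.25, 20.0)
is_acceptable = estimated < 60.0  # compare with the candidate's own threshold
```

A real deployment would also need to convert CPU usage between servers of different capacity, which this sketch ignores.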

Here, the temperature rise degree is the degree to which a physical server's temperature tends to rise depending on its installation position. For example, a physical server installed near a cold air outlet does not heat up easily, so its temperature rise degree is set to a low level.

When more than one physical server has an estimated temperature below its threshold after accepting the migration target machine, the policy may be set so that the physical server with the largest difference between its estimated temperature and its threshold becomes the destination. Further, when more than one physical server has that largest difference, the policy may be set so that the physical server with the lowest temperature rise degree level becomes the destination.
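The two tie-breaking rules above can be sketched as a single sort key: prefer the largest margin between threshold and estimated temperature, then the lowest temperature rise degree level. The `Candidate` class, its field names, and the sample values are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    server_id: str
    estimated_temp: float  # estimated temperature after accepting the VM
    threshold: float       # per-server temperature threshold
    rise_level: int        # temperature rise degree level (lower heats up less easily)


def choose_destination(candidates: List[Candidate]) -> Optional[str]:
    """Return the ID of the preferred destination, or None if no server qualifies."""
    # Only servers whose estimated temperature stays below their threshold qualify.
    eligible = [c for c in candidates if c.estimated_temp < c.threshold]
    if not eligible:
        return None
    # Largest margin first; among equal margins, the lowest rise level wins.
    best = min(
        eligible,
        key=lambda c: (-(c.threshold - c.estimated_temp), c.rise_level),
    )
    return best.server_id


servers = [
    Candidate("A2", estimated_temp=45.0, threshold=60.0, rise_level=2),
    Candidate("A3", estimated_temp=50.0, threshold=65.0, rise_level=1),
    Candidate("A4", estimated_temp=63.0, threshold=60.0, rise_level=1),
]
destination = choose_destination(servers)
```

In this sample, A2 and A3 both have a margin of 15 degrees, so the lower rise level decides in favor of A3; A4 is excluded because its estimate is above its threshold.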

When calculating the estimated temperature for each of the physical servers A1 to An, the latest temperature of some physical server may not have been acquired. For such a server, the policy may be set so that the latest temperature is estimated from the temperature at the previous acquisition, the time elapsed since then, the thermal conductivity of the physical server, the latest temperature of an adjacent physical server, and the contact area with that adjacent server. The thermal conductivity of each of the physical servers A1 to An and its contact area with adjacent physical servers are stored in the system configuration information storage unit 31 in advance.

The policy information editing means 25 has a function of editing policies already stored in the policy storage unit 32 and of creating and storing new ones. The policy is intended to carry out server relocation that efficiently prevents temperature rise across the whole machine room by using the server-specific heat generation characteristics stored in the system configuration information storage unit 31 as weights. Editing the policy makes it possible, for example, to change the temperature thresholds.

The server-specific information receiving means 26 receives server-specific information, such as the positions of the physical servers A1 to An in the machine room and their heat generation characteristics, from the input unit 4 and stores it in the system configuration information storage unit 31. The position information of the physical servers A1 to An indicates where each server is physically located. It includes the identification information of the machine room in which the server is installed and values that specify the server's position within that machine room. In this embodiment, the position is expressed as coordinates on X, Y, and Z axes whose origin is one corner of the machine room.

The heat generation characteristics of the physical servers A1 to An include the rate of temperature rise relative to CPU usage and the temperature rise degree, which expresses as a level how easily the temperature rises because of where the server is installed.

The temperature distribution map creating means 27 creates a temperature distribution map of the machine room based on the temperatures of the physical servers A1 to An and their positions in the machine room, and outputs it to the display unit 5 as a three-dimensional graphical user interface. As shown in FIG. 3, this interface displays the temperature distribution of the physical servers A1 to An in the machine room on the screen, drawing each server's temperature, color-coded by value, at the screen position calculated from the server's position information. Displaying the map in this way lets the administrator easily grasp the temperature distribution in the machine room. Because the system administrator can see the temperature distribution on the screen, the apparatus may also be configured so that the administrator can manually operate the server management apparatus C to select physical servers where high temperatures are concentrated and redistribute work from them to physical servers with low temperatures.
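The description says temperatures are color-coded by value but does not define the color scale. As one possible illustration, the sketch below maps a temperature linearly onto a blue-to-red gradient. The function name, the default range of 20 to 80 degrees Celsius, and the RGB encoding are assumptions, not details taken from the embodiment.

```python
def temperature_to_rgb(temp, cold=20.0, hot=80.0):
    """Map a temperature to an (R, G, B) tuple on a blue-to-red gradient.

    Temperatures at or below `cold` are pure blue and those at or above `hot`
    are pure red. The range and the gradient are hypothetical choices.
    """
    ratio = (temp - cold) / (hot - cold)
    ratio = max(0.0, min(1.0, ratio))  # clamp to the displayable range
    red = round(255 * ratio)
    blue = round(255 * (1.0 - ratio))
    return (red, 0, blue)


coolest = temperature_to_rgb(20.0)   # pure blue
hottest = temperature_to_rgb(95.0)   # clamped to pure red
```

A drawing routine would pair each color with the screen position derived from the server's X, Y, and Z coordinates; that projection depends on the display toolkit and is left out here.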

The system configuration information storage unit 31 stores, as real environment data, the CPU usage of the virtual machines B1 to Bm and, for example, the identification information, model information, settings, device specifications, operating state, and operating system type of each of the physical servers A1 to An, the software distributed during operation, and the logical definition information used when servers are grouped and managed. The system configuration information storage unit 31 also stores, as server-specific information, the position information of the physical servers A1 to An and information on the heat generation characteristics of each server, such as the rate of temperature rise relative to CPU usage and the temperature rise degree, which expresses as a level how easily the temperature rises because of where the server is installed.

As described above, the server management apparatus C of this embodiment holds, as internal information, the heat generation characteristics of the physical servers A1 to An in addition to their temperatures. It comprises a policy storage unit 32 that stores a policy set using this information and an action execution means 24 that, in accordance with the policy, moves a virtual machine running on a physical server with an abnormal temperature to another physical server. The temperature of the machine room can therefore be controlled efficiently, which in turn reduces the amount of operation of the air conditioning equipment and suppresses power consumption.

Next, the operation of the server management apparatus of this embodiment will be described. The following description of the operation also serves as an embodiment of the server management method of the present invention.

FIG. 4 is a flowchart showing an example of the operation of the data processing device 2 in this embodiment. As shown in FIG. 4, in the data processing device 2, the real environment data collection means 21 collects, via the communication unit 1, real environment data including the operating status of the virtual machines B1 to Bm from the physical servers A1 to An at fixed or random time intervals and outputs it to the system configuration information storage unit 31. At the same time, the server temperature monitoring means 22 collects the temperatures of the physical servers A1 to An, at similar time intervals, from temperature sensors installed in advance for the respective physical servers and outputs them to the system configuration information storage unit 31 (step s101 in FIG. 4).

The temperature distribution diagram creating means 27 creates a temperature distribution diagram of the machine room and outputs it to the display unit 5 (step s102 in FIG. 4). The diagram is based on the position information of each of the physical servers A1 to An in the machine room, which was sent in advance from the server-specific information receiving means 26 to the system configuration information storage unit 31 and stored there, and on the latest temperatures of the physical servers A1 to An collected by the server temperature monitoring means 22.

Next, the failure detection means 23 compares the temperature of each of the physical servers A1 to An collected by the server temperature monitoring means 22 individually with the threshold set in advance for that physical server (step s103 in FIG. 4) and determines whether any physical server has a temperature equal to or higher than its threshold (step s104 in FIG. 4).
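Steps s103 and s104 amount to a per-server comparison, sketched here under the assumption that temperatures and thresholds are held in dictionaries keyed by a server identifier. The function name and data layout are illustrative, not taken from the specification.

```python
from typing import Dict, List


def find_temperature_abnormal_servers(temperatures_c: Dict[str, float],
                                      thresholds_c: Dict[str, float]) -> List[str]:
    """Return servers whose temperature is at or above their own threshold."""
    return [
        server_id
        for server_id, temp in temperatures_c.items()
        if temp >= thresholds_c[server_id]  # "equal to or higher than"
    ]


abnormal = find_temperature_abnormal_servers(
    {"A1": 45.0, "A2": 30.0, "A3": 50.0},
    {"A1": 45.0, "A2": 40.0, "A3": 55.0},
)
```

In this sample only A1 is flagged, because its temperature equals its threshold, while A3 is hotter but still under its own, higher threshold.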

If there is a physical server whose temperature is equal to or higher than its threshold, the action execution means 24 detects that physical server as a temperature abnormal server and, in accordance with the policy stored in the policy storage unit 32, executes an action to relocate a virtual machine running on the temperature abnormal server to another physical server.

An example of how the action execution means 24 operates under the policy is as follows. First, it searches the information in the system configuration information storage unit 31 to identify the virtual machines running on the temperature abnormal server and selects the one with the highest CPU usage rate as the movement target machine (step s105 in FIG. 4).
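Step s105 can be sketched as a filter followed by a maximum. The mapping from virtual machine to host and the function name are assumptions for illustration. How ties in CPU usage are broken is not stated in the specification; this sketch simply keeps the first maximum encountered.

```python
from typing import Dict, Optional


def select_movement_target(vm_host: Dict[str, str],
                           vm_cpu_percent: Dict[str, float],
                           abnormal_server: str) -> Optional[str]:
    """Pick the VM with the highest CPU usage on the abnormal server."""
    candidates = [vm for vm, host in vm_host.items() if host == abnormal_server]
    if not candidates:
        return None  # nothing is running on that server
    return max(candidates, key=lambda vm: vm_cpu_percent[vm])


target = select_movement_target(
    {"B1": "A1", "B2": "A1", "B3": "A2"},
    {"B1": 15.0, "B2": 40.0, "B3": 90.0},
    "A1",
)
```

Here B2 is selected: B3 has higher usage but runs on a different server.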

Next, from the temperature rise rate relative to the CPU usage rate of each of the physical servers A1 to An and the CPU usage rate of the movement target machine, it calculates the temperature rise that each physical server would undergo if it accepted the movement target machine. It adds each temperature rise to the latest temperature of the corresponding physical server to obtain that server's estimated temperature after accepting the movement target machine (step s106 in FIG. 4). For a physical server whose latest temperature could not be acquired, for example because it is powered off, the latest temperature may be estimated from the temperature at the previous acquisition, the time elapsed since then, the thermal conductivity of the physical server, the latest temperature of an adjacent physical server, and the contact area with that adjacent physical server.
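A minimal sketch of step s106 follows. It assumes that the temperature rise is the product of the per-server rise rate and the CPU usage rate of the movement target machine; the specification names the two inputs but not the exact formula. Excluding the source server from the candidates is also an assumption, as are the function and parameter names. The fallback estimation for servers without a recent reading is not covered.

```python
from typing import Dict


def estimate_temperatures(latest_temp_c: Dict[str, float],
                          rise_per_cpu_percent: Dict[str, float],
                          target_cpu_percent: float,
                          source_server: str) -> Dict[str, float]:
    """Estimated temperature of each candidate after accepting the VM."""
    return {
        server_id: latest + rise_per_cpu_percent[server_id] * target_cpu_percent
        for server_id, latest in latest_temp_c.items()
        if server_id != source_server  # assumed: the source is not a candidate
    }


estimates = estimate_temperatures(
    {"A1": 50.0, "A2": 30.0, "A3": 35.0},
    {"A1": 0.5, "A2": 0.5, "A3": 0.25},
    target_cpu_percent=20.0,
    source_server="A1",
)
```

With these sample values, A2 rises by 10 °C and A3 by 5 °C, so both end at an estimated 40 °C.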

It then determines whether each estimated temperature is below the threshold (step s107 in FIG. 4). If no physical server with an estimated temperature below the threshold is found, it outputs a notification indicating the occurrence of a temperature abnormal server to the display unit 5 (step s108 in FIG. 4). If such a physical server is found, it requests the physical server side to relocate the movement target machine with that physical server as the destination (step s109 in FIG. 4). If more than one physical server has an estimated temperature below the threshold, the destination is the one with the lowest temperature rise degree, the level expressing how easily the temperature rises because of each server's installation position.
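Steps s107 to s109 can be sketched as a filter and a tie-break. The sketch assumes that each estimated temperature is compared with the candidate server's own threshold and that a lower level means the temperature rises less easily; both are readings of the text rather than stated details. Returning `None` stands in for the notification of step s108.

```python
from typing import Dict, Optional


def choose_destination(estimated_temp_c: Dict[str, float],
                       thresholds_c: Dict[str, float],
                       temp_rise_level: Dict[str, int]) -> Optional[str]:
    """Pick a destination whose estimated temperature stays below its threshold."""
    candidates = [
        server_id
        for server_id, estimate in estimated_temp_c.items()
        if estimate < thresholds_c[server_id]
    ]
    if not candidates:
        return None  # the caller would notify the administrator (step s108)
    # Among several candidates, prefer the lowest temperature rise degree.
    return min(candidates, key=lambda server_id: temp_rise_level[server_id])


destination = choose_destination(
    {"A2": 40.0, "A3": 40.0, "A4": 48.0},
    {"A2": 45.0, "A3": 45.0, "A4": 45.0},
    {"A2": 3, "A3": 1, "A4": 0},
)
```

A4 has the lowest level but is excluded because its estimate is not below the threshold, so A3 is chosen over A2.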

Because the server management apparatus C of this embodiment operates in this way, a virtual machine running on a physical server that has reached an abnormal temperature can be moved to another physical server with temperature headroom, and the temperature rise of the physical servers A1 to An can be suppressed efficiently.

The functions of the data processing device 2 in this embodiment, that is, of the real environment data collection means 21, the server temperature monitoring means 22, the failure detection means 23, the action execution means 24, the policy editing means 25, the server-specific information receiving means 26, and the temperature distribution diagram creating means 27, may be implemented as a program and executed by a computer.

As described above, the server management apparatus C of this embodiment holds the positions of the physical servers A1 to An in the machine room as internal information and uses that position information to output the collected temperatures of the physical servers A1 to An to the graphical user interface as a three-dimensional distribution diagram. The temperature distribution diagram of the machine room can therefore be displayed on the screen, and the system administrator can easily grasp the temperature distribution in the machine room.

In addition, the server management apparatus C of this embodiment holds, as internal information, the cooling capacity (heat generation characteristics) of the physical servers A1 to An in addition to their temperatures, and places virtual machines according to a policy that uses this information. This allows a placement well suited to preventing a temperature rise across the whole machine room, which in turn reduces the amount of operation of the air conditioning equipment and saves power in the data center.

FIG. 1 is a block diagram showing the network configuration of an embodiment of the present invention.
FIG. 2 is a functional block diagram showing the configuration of the server management apparatus of the embodiment disclosed in FIG. 1.
FIG. 3 is a diagram showing an example of the temperature distribution diagram displayed on the display unit of the server management apparatus in the embodiment disclosed in FIG. 1.
FIG. 4 is a flowchart showing the operation of the server management apparatus of the embodiment disclosed in FIG. 1.

Explanation of symbols

A1 to An: Physical servers
B1 to Bm: Virtual machines
C: Server management apparatus
1: Communication unit
2: Data processing device
3: Storage device
4: Input unit
5: Display unit
21: Real environment data collection means
22: Server temperature monitoring means
23: Failure detection means
24: Action execution means
25: Policy editing means
26: Server-specific information receiving means
27: Temperature distribution diagram creating means
31: System configuration information storage unit
32: Policy storage unit

Claims

A server management apparatus comprising:
a communication unit for connecting to a plurality of physical servers on which a plurality of virtual machines are mounted;
real environment data collection means for collecting the operating status of each virtual machine from each physical server at preset time intervals;
server temperature collection means for collecting the temperature of each physical server at preset time intervals from temperature sensors installed for the respective physical servers;
failure detection means for individually comparing the collected temperature of each physical server with a threshold set in advance for that physical server and determining whether there is a temperature abnormal server, that is, a physical server having a temperature equal to or higher than the threshold; and
action execution means for moving a virtual machine running on the temperature abnormal server to another physical server when it is determined that a temperature abnormal server exists,
wherein the action execution means has a function of identifying, based on heat generation characteristics specific to each physical server that are set in advance, a physical server whose temperature is estimated not to exceed the threshold even if it accepts a movement target machine, which is a virtual machine moved from the temperature abnormal server, and determining that physical server as the movement destination of the movement target machine.

In the server management apparatus according to claim 1,
The action execution means has a function of calculating the estimated temperature of each physical server on the assumption that it has accepted the movement target machine, based on the temperature rise rate relative to the CPU usage rate of each physical server, which is predetermined as the heat generation characteristic of each physical server, the CPU usage rate of the movement target machine acquired by the real environment data collection means, and the latest temperature of each physical server acquired by the server temperature collection means, and of determining a physical server whose calculated estimated temperature is below the threshold as the movement destination of the movement target machine.

In the server management device according to claim 2,
The action execution means has a function of selecting, from among the physical servers whose estimated temperature on the assumption that they have accepted the movement target machine is below the threshold, the physical server with the largest difference between the estimated temperature and the threshold.

In the server management apparatus according to claim 3,
The action execution means has a function of setting, as the movement destination of the movement target machine, the physical server that, among the physical servers with the largest difference between the threshold and the estimated temperature on the assumption that they have accepted the movement target machine, has the lowest temperature rise degree, which is predetermined as a heat generation characteristic of each physical server and expresses how easily the temperature rises because of the installation position.

In the server management apparatus according to claim 4,
When a plurality of virtual machines are running on the temperature abnormal server, the action execution means selects the virtual machine with the highest CPU usage rate among them as the movement target machine.

In the server management apparatus according to claim 5,
When calculating the estimated temperature of each physical server on the assumption that it has accepted the movement target machine, the action execution means estimates, for a physical server whose latest temperature could not be acquired, the latest temperature based on the temperature at the previous acquisition, the time elapsed since the previous acquisition, the thermal conductivity of the physical server, the latest temperature of an adjacent physical server, and the contact area with the adjacent physical server.

The operating status of each virtual machine is collected at preset time intervals from a plurality of physical servers on which a plurality of virtual machines are mounted, and the temperature of each physical server is collected at preset time intervals from temperature sensors installed in advance for the respective physical servers,
The temperature of each collected physical server is individually compared with a threshold set in advance for each physical server to determine whether there is a temperature abnormal server that is a physical server having a temperature equal to or higher than the threshold,
When there is the temperature abnormal server, at least one of the virtual machines operating on the temperature abnormal server is selected as a movement target machine to be moved to another physical server,
Based on the heat generation characteristics specific to each physical server determined in advance, even if the movement target machine is accepted, a physical server whose temperature is estimated not to exceed the threshold is specified,
A server management method, wherein the migration target machine is moved to the identified physical server.

In the server management method according to claim 7,
In selecting a physical server whose temperature is estimated not to exceed the threshold even if it accepts the movement target machine,
the estimated temperature of each physical server on the assumption that it has accepted the movement target machine is calculated based on the temperature rise rate relative to the CPU usage rate of each physical server, which is predetermined as the heat generation characteristic of each physical server, the CPU usage rate of the movement target machine included in the operating status of the virtual machines, and the latest temperature of each physical server, and a physical server whose calculated estimated temperature is below the threshold is selected.

In the server management method according to claim 8,
When a plurality of physical servers whose estimated temperature on the assumption that they have accepted the movement target machine is below the threshold are detected, the physical server with the largest difference between the estimated temperature and the threshold is selected.

In the server management method according to claim 9,
When a plurality of physical servers with the largest difference between the threshold and the estimated temperature on the assumption that they have accepted the movement target machine are detected, the physical server with the lowest temperature rise degree is selected, the temperature rise degree being predetermined as a heat generation characteristic of each physical server and expressing how easily the temperature rises because of the installation position of the physical server.

In the server management method according to claim 10,
A server management method, wherein when there are a plurality of virtual machines operating on the temperature abnormal server, a virtual machine having the highest CPU usage rate among the plurality of virtual machines is specified as a movement target machine.

The server management method according to claim 11,
When calculating the estimated temperature when assuming that the machine to be moved of each physical server has been accepted,
For physical servers for which the latest temperature has not been acquired, the temperature at the previous acquisition, the elapsed time since the previous acquisition, the thermal conductivity of the physical server, the latest temperature of the adjacent physical server, and this adjacent physical server A server management method, wherein the latest temperature is estimated based on a contact area with the server.

A server management program for causing a computer to execute:
a real environment data collection function for collecting the operating status of each virtual machine at preset time intervals from a plurality of physical servers on which a plurality of virtual machines are mounted;
a server temperature collection function for collecting the temperature of each physical server at preset time intervals from temperature sensors installed for the respective physical servers;
a failure detection function for individually comparing the collected temperature of each physical server with a threshold set in advance for that physical server and determining whether there is a temperature abnormal server, that is, a physical server having a temperature equal to or higher than the threshold; and
an action execution function for selecting, when a temperature abnormal server exists, at least one of the virtual machines running on the temperature abnormal server as a movement target machine to be moved to another physical server, identifying, based on predetermined heat generation characteristics specific to each physical server, a physical server whose temperature is estimated not to exceed the threshold even if it accepts the movement target machine, and moving the movement target machine to the identified physical server.

In the server management program according to claim 13,
The action execution function is a function of calculating the estimated temperature of each physical server on the assumption that it has accepted the movement target machine, based on the temperature rise rate relative to the CPU usage rate of each physical server, which is predetermined as the heat generation characteristic of each physical server, the CPU usage rate of the movement target machine, and the latest temperature of each physical server collected by the server temperature collection function, and of determining a physical server whose calculated estimated temperature is below the threshold as the movement destination of the movement target machine.

In the server management program according to claim 14,
The action execution function is a function of determining, when a plurality of physical servers whose estimated temperature on the assumption that they have accepted the movement target machine is below the threshold are detected, the physical server with the largest difference between the estimated temperature and the threshold as the movement destination of the movement target machine.

In the server management program according to claim 15,
The action execution function is a function of determining, when there are a plurality of physical servers with the largest difference between the threshold and the estimated temperature on the assumption that they have accepted the movement target machine, the physical server with the lowest temperature rise degree as the movement destination of the movement target machine, the temperature rise degree being predetermined as a heat generation characteristic of each physical server and expressing how easily the temperature rises because of the installation position of the physical server.

In the server management program according to claim 16,
The action execution function is a function of selecting, when a plurality of virtual machines are running on the temperature abnormal server, the virtual machine with the highest CPU usage rate among them as the movement target machine.

In the server management program according to claim 17,
The action execution function is a function of estimating, when calculating the estimated temperature of each physical server on the assumption that it has accepted the movement target machine, the latest temperature of a physical server whose latest temperature could not be acquired, based on the temperature at the previous acquisition, the time elapsed since the previous acquisition, the thermal conductivity of the physical server, the latest temperature of an adjacent physical server, and the contact area with the adjacent physical server.