WO2011074284A1 - Migration method for virtual machine, virtual machine system, and storage medium containing program - Google Patents

Migration method for virtual machine, virtual machine system, and storage medium containing program

Info

Publication number
WO2011074284A1
Authority
WO
WIPO (PCT)
Prior art keywords
computer
virtual
physical
management
virtual machine
Prior art date
Application number
PCT/JP2010/063273
Other languages
French (fr)
Japanese (ja)
Inventor
齋藤 浩
高本 良史
正芳 北村
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Publication of WO2011074284A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023Failover techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2035Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant without idle spare hardware
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2043Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant where the redundant components share a common memory address space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/485Task life-cycle, e.g. stopping, restarting, resuming execution
    • G06F9/4856Task life-cycle, e.g. stopping, restarting, resuming execution resumption being on a different machine, e.g. task migration, virtual machine migration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/4557Distribution of virtual machine instances; Migration and load balancing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/815Virtual

Definitions

  • the present invention relates to a method and system for improving the reliability of a virtual server.
  • the server virtualization technology is a technology capable of operating a plurality of virtual servers on a single physical server.
  • Physical servers have resources such as processors and memory.
  • Server virtualization technology divides physical resources and assigns each of the divided resources to a different virtual server on a single physical server, so that multiple virtual servers run at the same time.
  • the need for server virtualization technology has increased due to improved processor performance and lower costs for resources such as memory.
  • the problem to be solved by the present invention is to achieve high reliability at low cost even in a virtualized server environment. In particular, it is to reduce the number of alternate (standby) servers for high reliability.
  • When a server fails and operation is switched to a replacement server, it is necessary to know exactly which virtual servers were running. Unlike a physical server, a virtual server can be added or removed relatively easily as long as the physical server has sufficient resources such as processor and memory.
  • the above-described conventional example has a problem that a failure cannot be recovered when resources are not sufficient. For example, when a serious failure occurs in a virtual server to which a resource with a CPU of 3 GHz and a memory of 4 GB is allocated, there must be a physical server having an equivalent resource.
  • The present invention has been made in view of the above problems, and its purpose is to enable a virtual server to be taken over even if the takeover destination physical server has fewer resources than the takeover source physical server.
  • a typical example of the invention disclosed in this specification is as follows. That is, a virtual computer system including a plurality of physical computers having a virtualization unit that includes a processor and a memory to construct a plurality of virtual computers, and a management computer that includes the processor and the memory and is connected to the physical computer via a network.
  • FIG. 1 is a block diagram of a virtual machine system to which the present invention is applied.
  • the center of control in this embodiment is the management server 101.
  • The management server 101 includes a failure recovery unit 102, a virtual server definition information management unit 103, a failure sign management unit 104, a virtual server recovery method selection unit 105, a virtual server recovery unit 106, a virtual server management table 107, a server management table 108, a virtual server definition information table 118, and a virtual server return unit 119.
  • the management server 101 manages the network switch 120, the physical servers 111-1 to 111-n, the server virtualization mechanism 110, the storage switch 112, the virtual server 109, and the disk array device 116.
  • The server virtualization mechanism 110 has a function of making the physical server 111 appear as a plurality of virtual servers 109, and can consolidate a plurality of servers onto the single physical server 111.
  • the server virtualization mechanism 110 can be configured by a hypervisor, a VMM (Virtual Machine Manager), or the like.
  • The physical servers 111-1 to 111-n are collectively referred to as the physical server 111.
  • the disk array device 116 is connected to the physical server 111 via the storage switch 112.
  • the storage switch 112 that connects the disk array device 116 and the physical server 111 constitutes a SAN (Storage Area Network).
  • the network switch 120 that connects the management server 101 and the physical server 111 constitutes the network 207 shown in FIG.
  • the disk array device 116 has a virtual server image storage disk 114 storing a program executed by the virtual server 109 and a definition information storage disk 115 storing computer resource allocation information of the virtual server 109.
  • a highly reliable system is configured by moving a virtual server affected by the failure or the failure sign to another physical server.
  • the failure recovery unit 102 controls recovery of the physical server 111 and recovery of the virtual server 109 when the occurrence of a failure of the physical server 111 or a failure sign is detected.
  • the virtual server definition information management unit 103 performs processing for improving reliability by moving the virtual server 109 even when there is not enough free resources in the physical server 111 that is the takeover destination of the virtual server 109.
  • the failure sign management unit 104 checks the occurrence of a failure or a sign of failure for each physical server 111 managed by the management server 101.
  • The virtual server recovery method selection unit 105 performs processing to select the recovery method for the virtual server 109 based on the detection result of the failure or failure sign of the physical server 111, the priority of the virtual server 109 affected by the failure, and the availability of computer resources of the physical server 111.
  • the virtual server recovery unit 106 executes recovery of the virtual server 109 based on the execution result of the virtual server recovery method selection unit 105.
  • the virtual server management table 107 stores detailed information regarding the server virtualization mechanism 110.
  • the server management table 108 manages resources of the physical server 111.
  • the virtual server definition information table 118 stores the priority for each virtual server 109, the position of the original (original) allocation information to which the resources of the physical server 111 are allocated, and information regarding the takeover of the virtual server.
  • the virtual server return unit 119 executes a process for returning the virtual server 109 moved to the standby physical server 111 to the original active physical server 111.
  • The physical servers 111-1 and 111-2 can constitute the active system, and the physical server 111-n can constitute the standby system.
  • FIG. 2 is a block diagram showing the configuration of the management server 101 in the present invention.
  • The management server 101 includes a memory 202, a processor 203, an FCA (Fibre Channel Adapter) 204, a NIC (Network Interface Card) 205, and a BMC (Baseboard Management Controller) 206.
  • the processor 203 executes various programs stored in the memory 202.
  • the FCA 204 is connected to an external storage (for example, the disk array device 116).
  • the NIC 205 and the BMC 206 are connected to the network 207.
  • the NIC 205 communicates with other servers via the network 207 mainly in response to requests from various programs on the memory 202.
  • the BMC 206 is used to detect a failure of the management server 101 and communicate with other servers via the network 207.
  • the NIC 205 and the BMC 206 are connected to the same network 207, but may be connected to different networks.
  • one FCA 204 and one NIC 205 are provided, but a plurality of FCAs 204 and NICs 205 may exist.
  • On the memory 202, the failure recovery unit 102, the virtual server definition information management unit 103, the failure sign management unit 104, the virtual server recovery method selection unit 105, the virtual server recovery unit 106, the virtual server management table 107, the server management table 108, and the virtual server definition information table 118 are stored as programs.
  • Each program stored in the memory 202 is executed by the processor 203.
  • Each of the above programs is stored in a storage (for example, the disk array device 116) as a storage medium, and loaded into the memory 202 as necessary.
  • FIG. 3 is a block diagram showing a detailed configuration of the physical server 111 on which the server virtualization mechanism 110 to be managed by the management server 101 is operating.
  • the physical server 111 includes a memory 301, a processor 303, an FCA (Fibre Channel Adapter) 304, an NIC (Network Interface Card) 305, and a BMC (Baseboard Management Controller) 306.
  • the processor 303 executes various programs stored in the memory 301.
  • the FCA 304 is connected to the disk array device 116 via the storage switch 112 shown in FIG.
  • the NIC 305 and the BMC 306 are connected to the network 207.
  • the NIC 305 mainly communicates with other servers in response to requests from various programs on the memory 301.
  • the BMC 306 is used to detect a failure of the physical server 111 and communicate with the management server 101 and other servers via the network 207.
  • the NIC 305 and the BMC 306 are connected to the same network, but may be connected to different networks.
  • one FCA 304 and one NIC 305 are provided, but a plurality of FCAs 304 and NICs 305 may exist.
  • a plurality of virtual servers 109-1 to 109-m can be constructed by operating the server virtualization mechanism 110 on the memory 301.
  • The virtual servers 109-1 to 109-m are collectively referred to as the virtual server 109.
  • Each virtual server 109 can operate an OS (Operating System) 302 independently.
  • a plurality of virtual servers 109 can be constructed on the server virtualization mechanism 110.
  • The virtual server 109 reads and executes a predetermined virtual server OS image 308 in the virtual server image storage disk 114 set in advance in the disk array device 116, and thereby an independent virtual server 109 is constructed. Also, the definition information storage disk 115 of the disk array device 116 stores virtual server definition information 309, which holds the definition information of each virtual server 109.
  • the virtual server definition information 309 of the disk array device 116 is shared from a plurality of physical servers 111 via the storage switch 112, and can be set so that it can be referenced from any physical server 111.
  • With the virtual server OS image 308 and the virtual server definition information 309 provided for each of the plurality of virtual servers 109, completely different OSs and applications can be operated for each virtual server 109 on the single physical server 111.
  • A control I/F (Interface) 311 is an interface for controlling the server virtualization mechanism 110 from the external management server 101 or the like via the network 207.
  • The management server 101 can create or delete the virtual server 109 on the physical server 111 via the control I/F 311. For this reason, a network address is set in the control I/F 311 of the virtualization mechanism 110 of each physical server 111.
  • FIG. 4 is a block diagram showing an outline of the operation of the virtual computer system of the present invention.
  • the management server 101 is connected to a plurality of physical servers 111 to be managed via a network 207, and can transfer failure information, failure sign information, control information, and the like of each physical server 111.
  • When the management server 101 detects the occurrence of a failure or a failure sign of the physical server 111 (1), it performs control to move the virtual server 109 affected by the failure or the failure sign to another physical server 111.
  • The definition information 309 of the virtual server is updated according to the actual operating status of the virtual server and the status of free computer resources.
  • The updated virtual server definition information 309 makes it possible to recover the virtual server 109 from a failure or a failure sign by using the physical server 111-n, which has fewer computer resources than the physical server 111-1 used before the migration (3).
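The idea of updating the definition so that a smaller destination can take over may be easier to follow with a small sketch. The following is a minimal illustration only, assuming a hypothetical `Resources` type and a hypothetical `shrink_definition` helper that reduces the allocation to the actually used amount; the specification does not prescribe this exact policy. The 3 GHz / 4 GB figures reuse the example given earlier in the description, while the usage and free-resource values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for processor and memory amounts."""
    cpu_ghz: float
    memory_gb: float


def shrink_definition(allocated: Resources, actual: Resources,
                      free: Resources) -> Optional[Resources]:
    """Return a reduced allocation that fits the destination, or None.

    Assumption: the new allocation equals the actual usage, capped by the
    original allocation so the virtual server never receives more than before.
    """
    wanted = Resources(min(allocated.cpu_ghz, actual.cpu_ghz),
                       min(allocated.memory_gb, actual.memory_gb))
    if wanted.cpu_ghz <= free.cpu_ghz and wanted.memory_gb <= free.memory_gb:
        return wanted
    return None  # even the reduced definition does not fit this destination


# Originally 3 GHz / 4 GB allocated, but only 1 GHz / 2 GB actually used.
updated = shrink_definition(Resources(3.0, 4.0), Resources(1.0, 2.0),
                            Resources(2.0, 2.0))
```

In this sketch the standby server offers only 2 GHz / 2 GB, which is less than the original allocation, yet the takeover is still possible because the definition is rewritten from actual usage.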
  • FIG. 5 is an explanatory diagram showing details of the server management table 108.
  • the server management table 108 stores detailed information regarding the physical server 111.
  • the physical server identifier 501 stores an identifier for specifying the physical server 111.
  • the startup disk 502 indicates the location (for example, path) of the startup disk of the physical server 111.
  • the server identifier 503 indicates a unique identifier (for example, World Wide Name: WWN) that the FCA 304 connected to the disk array device 116 has.
  • the server mode 504 indicates the operating state of the physical server 111 and stores information for determining whether or not the server virtualization mechanism 110 is operating.
  • the processor / memory 505 stores processor information and memory capacity of the physical server 111.
  • the processor information includes the processor clock speed and the number of cores.
  • the network identifier 506 stores information for identifying the NIC 205 that the physical server 111 has. When there are a plurality of NICs 205 in one physical server 111, a plurality of identifiers are stored.
  • the network port 507 stores the port number of the network switch 120 to which the NIC 205 is connected. This is stored for setting the VLAN of the network switch when maintaining the network security of the physical server 111.
  • the disk 508 stores the identifier of the disk in the disk array device 116 of the physical server 111.
  • The disk identifier is, for example, a LUN (Logical Unit Number). LUN 10 is listed for a plurality of physical servers 111 (servers 1 to 4 in the figure), which means that it can be shared among those physical servers 111.
  • the virtualization mechanism identifier 509 stores an identifier that identifies the server virtualization mechanism 110 when the server virtualization mechanism 110 is operating on the physical server 111. This virtualization mechanism identifier 509 is associated with a server virtualization mechanism management table (virtual server management table 107) described later.
  • the server status 510 indicates the status and role of the physical server 111, and stores, for example, information indicating whether it is an active system or a standby system. In the embodiment of the present invention, it is used when performing a process of switching to a standby system when a failure occurs in any of the active systems.
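One row of the server management table 108 could be modelled roughly as follows. This is a sketch under stated assumptions: the class name, field names, field types, and the `standby_candidates` helper are all hypothetical, and only the column meanings (501 to 510) come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PhysicalServerRow:
    """Hypothetical row of the server management table 108."""
    physical_server_id: str       # 501: identifier of the physical server
    startup_disk: str             # 502: location (path) of the startup disk
    server_identifier: str        # 503: unique identifier such as a WWN
    server_mode: str              # 504: whether the virtualization mechanism runs
    cpu_ghz: float                # 505: processor clock speed
    cores: int                    # 505: number of cores
    memory_gb: float              # 505: memory capacity
    network_identifiers: List[str] = field(default_factory=list)  # 506
    network_ports: List[int] = field(default_factory=list)        # 507
    disks: List[str] = field(default_factory=list)                # 508: e.g. LUNs
    virtualization_mechanism_id: Optional[str] = None             # 509
    server_status: str = "active"                                 # 510


def standby_candidates(table: List[PhysicalServerRow],
                       cpu_ghz: float, memory_gb: float) -> List[str]:
    """Return ids of standby servers whose capacity covers the request."""
    return [row.physical_server_id for row in table
            if row.server_status == "standby"
            and row.cpu_ghz >= cpu_ghz and row.memory_gb >= memory_gb]


# Illustrative values only; they are not taken from FIG. 5.
table = [
    PhysicalServerRow("server1", "LUN1", "WWN1", "virtualized", 3.0, 4, 8.0),
    PhysicalServerRow("server4", "LUN4", "WWN4", "virtualized", 2.0, 2, 4.0,
                      server_status="standby"),
]
candidates = standby_candidates(table, cpu_ghz=1.0, memory_gb=2.0)
```

Filtering on the server status 510 mirrors how the table is used when switching from an active system to a standby system.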
  • FIG. 6 is an explanatory diagram showing details of the virtual server management table 107.
  • the virtual server management table 107 stores detailed information regarding the server virtualization mechanism 110 of each physical server 111.
  • the virtualization mechanism identifier 601 stores information for identifying the server virtualization mechanism 110 for each physical server 111 managed by the management server 101.
  • the control I / F address 602 stores a network address serving as access information to the control I / F 311 that controls the server virtualization mechanism 110 from the outside.
  • the virtual server identifier 603 stores a unique identifier for each virtual server 109.
  • the virtual server OS image 604 stores the OS image used by the virtual server 109 and the location (path) of the virtual server OS image 308.
  • the processor / memory allocation amount 605 indicates a computer resource amount allocated to the virtual server.
  • the computer resource amount includes, for example, the processor clock speed and the memory capacity allocated to the virtual server 109.
  • the state 606 stores information indicating whether or not the virtual server 109 is currently operating.
  • the processor / memory actual usage 607 stores the usage rate and memory capacity of the processor actually used by the virtual server.
  • The processor / memory actual usage 607 can be acquired by, for example, providing the server virtualization mechanism 110 or the OS running on the virtual server 109 with means for periodically collecting the actual usage (performance information) of the allocated resources. As the processor / memory actual usage 607, a method of storing an average usage (or usage rate) per unit time can be considered.
  • the average usage rate of the processor 303 is represented by “GHz” in terms of the clock speed, and the average usage amount of the memory 301 is represented by “GB”.
  • the average usage rate of the processor 303 is an actual usage rate for the resource (virtual processor) of the processor 303 allocated to the virtual server 109.
  • the network assignment 608 stores a NIC identifier of the virtual server and assignment information between the NIC 305 of the physical server 111 corresponding to the NIC identifier.
  • the disk 609 stores the location of the virtual server OS image 308 assigned to the virtual server 109 and the image file for data storage.
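The difference between the allocation amount 605 and the actual usage 607 is what later allows a smaller takeover destination. The sketch below sums both quantities for the running virtual servers of one virtualization mechanism. The class, its field names, and the `totals` function are hypothetical, and the numbers are illustrative rather than taken from FIG. 6.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class VirtualServerRow:
    """Hypothetical row of the virtual server management table 107."""
    virtualization_mechanism_id: str  # 601
    virtual_server_id: str            # 603
    allocated_cpu_ghz: float          # 605: allocated processor amount
    allocated_memory_gb: float        # 605: allocated memory amount
    running: bool                     # 606: whether the virtual server operates
    used_cpu_ghz: float               # 607: average processor usage in GHz
    used_memory_gb: float             # 607: average memory usage in GB


def totals(rows: Iterable[VirtualServerRow], mechanism_id: str
           ) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Return ((allocated GHz, allocated GB), (used GHz, used GB))."""
    active = [r for r in rows
              if r.virtualization_mechanism_id == mechanism_id and r.running]
    allocated = (sum(r.allocated_cpu_ghz for r in active),
                 sum(r.allocated_memory_gb for r in active))
    used = (sum(r.used_cpu_ghz for r in active),
            sum(r.used_memory_gb for r in active))
    return allocated, used


rows = [
    VirtualServerRow("vmm1", "vs1", 2.0, 4.0, True, 1.0, 2.0),
    VirtualServerRow("vmm1", "vs2", 1.0, 2.0, True, 0.5, 1.0),
    VirtualServerRow("vmm1", "vs3", 1.0, 2.0, False, 0.0, 0.0),
]
allocated, used = totals(rows, "vmm1")
```

Stopped virtual servers are excluded here on the assumption that only running ones need to be taken over.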
  • FIG. 7 shows the configuration of the virtual server definition information table 118.
  • the virtual server identifier 701 stores a unique identifier for each virtual server 109.
  • The virtual server priority 702 stores the importance of the virtual server 109 as a numerical value. A priority of “1” indicates the highest importance, and the importance decreases as the numerical value increases.
  • the numerical value of the virtual server priority 702 is input according to the priority of a task executed on the virtual server when the administrator generates a virtual server from the management server 101 or the like.
  • the original definition information 703 stores the location where the definition information of the original of the virtual server 109 (a state initially assigned to the physical server 111) is stored.
  • The present invention is characterized in that reliability is improved by moving a virtual server to another physical server 111 upon detecting the occurrence of a failure or a failure sign, but the definition information may be changed depending on the state of the computer resources at the destination at that time.
  • the definition information of the original virtual server 109 is required when returning the virtual server 109 to the original physical server 111 after removing the cause of the occurrence of the failure or the failure sign.
  • the original definition information 703 can be referred to and the configuration of the original virtual server 109 can be always restored.
  • the moving definition information 704 stores definition information of the virtual server 109 that is updated when the virtual server 109 is moved.
  • the movement date and time 705 stores the date and time when the virtual server 109 was moved.
  • the movement definition information 704 and the movement date / time 705 may store a history. By leaving a history of changes in allocated resources that occur during migration, the location of the physical server 111 where the failure has occurred can be easily identified. This facilitates failure analysis. Further, the movement definition information 704 and the movement date and time 705 may be added each time the virtual server 109 moves when the virtual server 109 has moved a plurality of times. For example, after detecting a failure sign and moving the virtual server 109, it is possible to detect the failure sign again at the destination and move the virtual server 109 further.
  • In this case the virtual server 109 moves a plurality of times, and the movement definition information 704 and the movement date / time 705 are added each time it moves.
  • When returning the virtual server 109, it may be moved back by tracing the migration order in reverse.
  • Alternatively, the virtual server 109 may be moved based on the original definition information 703. Where to return it can be decided flexibly in accordance with the purpose of moving the virtual server 109, such as repairing the physical server 111 or moving to a physical server 111 with higher performance.
  • Keeping the movement history of the virtual server 109 over a plurality of moves increases the options for returning to or moving between physical servers 111, so that the virtual server 109 can be operated more flexibly.
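The two ways of returning a virtual server (stepping back through the movement history, or jumping straight to the original definition) can be sketched as follows. The class, its method names, and the file names are hypothetical; only the columns 701 to 705 come from the description.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class DefinitionEntry:
    """Hypothetical row of the virtual server definition information table 118."""
    virtual_server_id: str      # 701
    priority: int               # 702: 1 is the most important
    original_definition: str    # 703: location of the original definition
    # 704 and 705: definition used after each move and when the move happened
    moves: List[Tuple[str, datetime]] = field(default_factory=list)

    def record_move(self, definition: str, moved_at: datetime) -> None:
        """Append one movement record; history is kept, not overwritten."""
        self.moves.append((definition, moved_at))

    def return_path(self, step_by_step: bool) -> List[str]:
        """Definitions to apply, in order, when returning the virtual server."""
        if not step_by_step:
            return [self.original_definition]
        # The last move describes the current location, so skip it and walk
        # the earlier moves backwards before reaching the original definition.
        earlier = [definition for definition, _ in self.moves[:-1]]
        return list(reversed(earlier)) + [self.original_definition]


entry = DefinitionEntry("vs1", 1, "original.def")
entry.record_move("move1.def", datetime(2010, 1, 1, 9, 0))
entry.record_move("move2.def", datetime(2010, 1, 2, 9, 0))
```

Calling `entry.return_path(True)` walks back through the earlier destination before the original, while `entry.return_path(False)` returns directly to the original definition.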
  • FIG. 8 is an explanatory diagram showing the contents of the virtual server definition information 309.
  • the virtual server name 801 stores the name of the virtual server 109.
  • The allocation resource 802 stores information for generating the virtual server 109, such as the processor allocation amount, the memory allocation amount, the network allocation information, the location where the virtual server OS image 308 is stored, and the location where the data disk image is stored.
  • the priority 803 stores the same content as the virtual server priority 702 stored in the virtual server definition information table 118.
  • The movement history 804 stores movement history information of the virtual server 109.
  • the movement history 804 can be used as information for determining a movement destination when the virtual server 109 moves. For example, when the movement histories 804 of a plurality of definition information are aggregated and the movement frequency is high due to a failure sign for a specific virtual server 109, it means that there is a high risk of lowering the service level of the virtual server 109. Therefore, the movement history 804 can also be used as analysis information when the physical server 111 with higher reliability is selected and the virtual server 109 is moved.
  • Definition information 805 indicates the storage location of the original definition information of the virtual server 109.
  • the definition information used at the destination may be stored as a history each time the virtual server 109 moves.
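Aggregating the movement histories 804 to find virtual servers that are moved often because of failure signs could look like the sketch below. The history format (a list of reason strings per virtual server), the reason labels, and the threshold are assumptions made for illustration; the description does not define them.

```python
from typing import Dict, List


def frequently_moved(histories: Dict[str, List[str]], threshold: int) -> List[str]:
    """Return names of virtual servers moved at least `threshold` times
    because of a failure sign, sorted by name."""
    result = []
    for name, reasons in histories.items():
        # Count only moves whose (hypothetical) reason label is a failure sign.
        sign_moves = sum(1 for reason in reasons if reason == "failure_sign")
        if sign_moves >= threshold:
            result.append(name)
    return sorted(result)


histories = {
    "vs1": ["failure_sign", "failure_sign", "maintenance"],
    "vs2": ["failure_sign"],
}
at_risk = frequently_moved(histories, threshold=2)
```

Virtual servers returned by such a check would be candidates for placement on a physical server 111 with higher reliability.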
  • FIG. 9 is a flowchart showing an outline of processing performed in the failure recovery unit 102. This flowchart is executed by the management server 101 at a predetermined cycle.
  • the failure recovery unit 102 calls the failure sign management unit 104.
  • the failure sign management unit 104 checks for occurrence of a failure or a failure sign for each physical server 111 that is the management target of the management server 101, as will be described later. That is, the failure sign management unit 104 inquires the BMC 203 and the failure sign detection unit 310 of each physical server 111 about the operation information, and acquires the operation information for each server 111.
  • In step 902, the failure sign management unit 104 determines the occurrence of a failure or the presence or absence of a failure sign from the operation information of each server 111.
  • When the failure sign management unit 104 detects the occurrence of a failure or a failure sign, it identifies the physical server 111 and notifies the failure recovery unit 102, and the process proceeds to step 903.
  • Otherwise, the process is terminated.
  • In step 903, the failure recovery unit 102 determines whether or not the virtualization mechanism is being executed on the physical server 111 in which a failure or a failure sign is detected. If the server virtualization mechanism 110 is being executed on the target physical server 111, the failure recovery unit 102 calls the virtual server recovery method selection unit 105 in step 904.
  • In step 904, the virtual server recovery method selection unit 105 performs processing to select the recovery method for the virtual server 109 based on the detection result of the failure or the failure sign, the priority of the affected virtual server 109, and the availability of the computer resources of the physical server 111. The processing for selecting this recovery method will be described later.
  • the failure recovery unit 102 calls the virtual server recovery unit 106.
  • the virtual server recovery unit 106 executes recovery of the virtual server 109 based on the execution result of the virtual server recovery method selection unit 105 as described later.
  • In step 907, the result of the recovery process is notified to the administrator. This notification is performed by displaying the result on a display device (not shown) of the management server 101.
  • If it is determined in step 903 that the failure or failure sign comes from a physical server 111 on which the server virtualization mechanism 110 is not executed, the physical server 111 is recovered in step 905.
  • the recovery of the physical server 111 that is not executing the server virtualization mechanism 110 is the same as that in the conventional example, and thus will not be described in detail in this embodiment.
  • When the failure recovery unit 102 detects the occurrence of a failure or a failure sign in the physical server 111 executing the server virtualization mechanism 110, it reallocates computer resources in accordance with the amount or ratio of computer resources actually used by the virtual server 109, sets computer resources smaller than those initially allocated, and has the standby physical server 111 take over the virtual server 109.
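The control flow of FIG. 9 can be summarised in a short sketch. All function and parameter names below are hypothetical stand-ins for the units described above, passed in as callables so the sketch stays self-contained. Notifying the administrator on both branches is an assumption, since the description only states that step 907 reports the result of the recovery process.

```python
from typing import Callable, Optional


def failure_recovery_cycle(
    detect: Callable[[], Optional[str]],
    runs_virtualization: Callable[[str], bool],
    select_method: Callable[[str], str],
    recover_virtual: Callable[[str, str], str],
    recover_physical: Callable[[str], str],
    notify: Callable[[str], None],
) -> Optional[str]:
    """One periodic pass of the failure recovery unit (hypothetical sketch)."""
    server = detect()                     # failure sign management unit 104
    if server is None:
        return None                       # nothing detected: end this cycle
    if runs_virtualization(server):       # step 903
        method = select_method(server)    # step 904: selection unit 105
        result = recover_virtual(server, method)  # recovery unit 106
    else:
        result = recover_physical(server)  # step 905: conventional recovery
    notify(result)                        # step 907 (assumed for both branches)
    return result


messages = []
outcome = failure_recovery_cycle(
    detect=lambda: "server1",
    runs_virtualization=lambda s: True,
    select_method=lambda s: "move_with_reduced_resources",
    recover_virtual=lambda s, m: f"{s}: {m}",
    recover_physical=lambda s: f"{s}: physical recovery",
    notify=messages.append,
)
```

Injecting the units as callables keeps the flow itself testable without any real hardware or management interfaces.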
  • FIG. 10 is a flowchart illustrating an example of processing performed by the failure sign management unit 104. This processing corresponds to the processing in step 901 in FIG.
  • The failure sign management unit 104 checks whether a failure or a failure sign has occurred in the managed physical servers 111.
  • First, the failure sign management unit 104 selects a target physical server 111.
  • The failure sign management unit 104 then accesses the BMC 206 of the target physical server 111 and checks whether a hardware failure or failure sign has occurred.
  • The BMC 206 of each physical server 111 can monitor the hardware state.
  • For example, it monitors the state of the processor 303, the memory 301, the temperature, the fans, and the power supply.
  • As an example of a failure sign and a failure, when the BMC 206 detects some fault in the processor 303 and the fault is resolved after a number of retries, it is detected as a failure sign; when the retries do not recover from the fault, it is detected as a failure. The same applies to other components.
  • For temperature, a failure sign may be detected when a predetermined threshold is exceeded, and a failure may be detected if the temperature remains above the threshold.
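  • The distinction between a failure sign and a failure described above can be sketched as follows. This is only an illustration and not the interface of the BMC 206: the `Health` enum, both function names, and the `sustained_limit_s` parameter are assumptions introduced here.

```python
from enum import Enum


class Health(Enum):
    OK = "ok"
    SIGN = "failure sign"
    FAILURE = "failure"


def classify_retry(error_seen: bool, recovered_by_retry: bool) -> Health:
    """Classify a component error by whether retries cleared it."""
    if not error_seen:
        return Health.OK
    # An error that retries resolve is only a sign; one that persists is a failure.
    return Health.SIGN if recovered_by_retry else Health.FAILURE


def classify_temperature(temp_c: float, threshold_c: float,
                         seconds_above: float, sustained_limit_s: float) -> Health:
    """Classify a temperature reading against a threshold (all values hypothetical)."""
    if temp_c <= threshold_c:
        return Health.OK
    # Exceeding the threshold is a sign; staying above it long enough is a failure.
    return Health.FAILURE if seconds_above >= sustained_limit_s else Health.SIGN
```

  • How long the temperature must remain above the threshold before it counts as a failure is not stated in the text, so it is left as a parameter.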
  • An importance level can be assigned to each failure sign according to the impact the failure would have if it occurred.
  • For example, a failure sign of the processor 303 has high importance, because a stopped processor 303 is highly likely to bring the entire system to a halt, whereas a failure sign of the memory 301 has low importance as long as the bit errors are within the correctable range.
  • Even for the processor 303, the importance may be lowered when a plurality of processors 303 are installed. Assigning importance to failure signs in this way makes it possible to widen the choice of recovery means.
  • In step 1003, it is determined whether a hardware failure or failure sign has been detected. If one has been detected, the process proceeds to step 1006 and the detection result is reported to the failure recovery unit 102.
  • Otherwise, the failure sign detection unit 310 is called in step 1004 to detect a software-level failure or failure sign that cannot be detected at the hardware level. The detection result of the failure sign detection unit 310 is reported to the failure sign management unit 104 in step 1006.
  • In step 1007, the failure sign management unit 104 determines whether all managed physical servers 111 have been inspected. If an uninspected physical server 111 remains, the processing is repeated from step 1001.
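  • The inspection loop of FIG. 10 can be sketched as below. The callbacks `check_hardware`, `check_software`, and `report` are hypothetical stand-ins for the BMC 206 query, the failure sign detection unit 310, and the reporting in step 1006; none of them is an interface defined in this document.

```python
from typing import Callable, Iterable, Optional


def inspect_servers(
    servers: Iterable[str],
    check_hardware: Callable[[str], Optional[str]],
    check_software: Callable[[str], Optional[str]],
    report: Callable[[str, str], None],
) -> None:
    """Inspect every managed server, falling back to a software-level check."""
    for server in servers:                    # repeated until all servers are inspected (step 1007)
        finding = check_hardware(server)      # hardware-level check via the BMC
        if finding is None:                   # step 1003: nothing found in hardware
            finding = check_software(server)  # step 1004: software-level detection
        if finding is not None:
            report(server, finding)           # step 1006: report the detection result
```

  • A callback returns `None` when nothing was detected, so the software-level check runs only for servers whose hardware check is clean.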
  • FIG. 11 is a flowchart illustrating an example of processing performed by the virtual server recovery method selection unit 105. This process corresponds to step 904 in FIG. 9.
  • The virtual server recovery method selection unit 105 determines how to recover the virtual server 109 when a failure or failure sign is detected in the physical server 111.
  • First, the failure or failure sign is analyzed. For example, for a failure sign of the processor 303, the core number of the processor 303 in which the sign was detected and the importance of the sign are analyzed.
  • Next, the virtual server recovery method selection unit 105 investigates the range affected by the failure and identifies the affected virtual servers 109.
  • The virtual server recovery method selection unit 105 then determines whether the detected failure or failure sign makes it highly likely that the virtual server 109 will stop. The virtual server 109 is highly likely to stop when, for example, the temperature of the processor 303 of the physical server 111 exceeds a predetermined value or the cooling fan of the processor 303 stops; in other words, when the physical server 111 is currently operating but is expected to stop in the future.
  • In step 1105, the virtual server recovery method selection unit 105 checks the priority of the virtual server 109 affected by the failure. The priority can be obtained by looking up the virtual server priority 702 in the virtual server definition information table 118.
  • In step 1105, the virtual server recovery method selection unit 105 searches for free resources on the other physical servers 111 and determines migration destination candidates for the virtual server 109 that is likely to stop. If the migration destination physical server 111 has sufficient free resources, the virtual server recovery method selection unit 105 selects the virtual server migration method as the recovery method of the virtual server 109 in step 1112.
  • If the free resources are not sufficient, the virtual server recovery method selection unit 105 selects the definition change method in step 1113. If the virtual server 109 is unlikely to stop, or if only a low-priority virtual server 109 is affected, the situation is reported to the administrator in step 1110 or step 1111. These reports display the virtual server 109 affected by the failure or failure sign on a display device (not shown) of the management server 101. This processing widens the range of failures from which the virtual server 109 can be recovered, and allows the virtual server 109 to be recovered with higher reliability than before.
  • The process selected in step 1112 or step 1113 is described in detail below in the processing of the virtual server recovery unit 106.
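  • The selection logic of FIG. 11 can be summarized in a short sketch. The boolean inputs and the single scalar resource value are simplifying assumptions; the text does not specify how the likelihood of a stop or the priority is quantified.

```python
from enum import Enum


class Recovery(Enum):
    MIGRATE = "virtual server migration"      # step 1112
    CHANGE_DEFINITION = "definition change"   # step 1113
    REPORT_ONLY = "report to administrator"   # step 1110 or step 1111


def select_recovery(stop_likely: bool, high_priority: bool,
                    required: int, free_on_candidate: int) -> Recovery:
    """Choose a recovery method for one affected virtual server."""
    # An unlikely stop, or a low-priority virtual server, is only reported.
    if not stop_likely or not high_priority:
        return Recovery.REPORT_ONLY
    # Enough free resources at the destination: migrate with the original definition.
    if free_on_candidate >= required:
        return Recovery.MIGRATE
    # Otherwise the definition must be reduced before the virtual server can move.
    return Recovery.CHANGE_DEFINITION
```

  • For example, a high-priority virtual server that is likely to stop and needs 4 units while the best candidate offers only 2 would be handled by the definition change method.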
  • FIG. 12 is a flowchart showing an outline of processing performed by the virtual server recovery unit 106. This process corresponds to step 906 in FIG. 9.
  • The virtual server recovery unit 106 executes the recovery process of the virtual server 109 according to the recovery method (virtual server migration or definition change) determined by the virtual server recovery method selection unit 105.
  • First, the virtual server recovery unit 106 determines whether the recovery method determined by the virtual server recovery method selection unit 105 is the definition change method. If it is not, that is, if there are sufficient free resources for the migration, the virtual server recovery unit 106 extracts the required resources from the definition information of the virtual server 109 in step 1202. This is done by referring to the original definition information 703 in the virtual server definition information table 118 and reading the contents of the definition file. In other words, the computer resources that were allocated on the physical server 111 where the failure or failure sign occurred are allocated unchanged on a new physical server 111, and the virtual server 109 is moved there.
  • In step 1203, the virtual server recovery unit 106 searches the server management table 108 for a physical server 111 whose free resources are equal to or greater than those of the physical server 111 in which the failure or failure sign occurred.
  • In step 1204, the virtual server recovery unit 106 allocates the computer resources of the original definition information 703 on the physical server 111 found by the search and moves the virtual server 109.
  • If the recovery method is the definition change method, the virtual server recovery unit 106 calls the virtual server definition information management unit 103 in step 1207.
  • The called virtual server definition information management unit 103 updates the virtual server definition information 309 by reducing the amount of computer resources allocated to the migration target virtual server 109, as described later.
  • In step 1208, the virtual server 109 is moved to the destination using the virtual server definition information 309 changed by the virtual server definition information management unit 103.
  • Moving the virtual server 109 means relocating it onto the destination physical server 111.
  • For the move, resources for executing the virtual server 109 are temporarily secured at both the migration source and the migration destination. The memory 301, the CPU 303, and the I/O (FCA 304, NIC 305) are secured at the migration destination, and then the memory contents and I/O state of the migration source virtual server 109 are copied to the migration destination virtual server 109. The migration destination may have fewer computer resources.
  • If processor resources are reduced, the processing performance of the virtual server 109 decreases, but no special processing is required for the migration.
  • If the migration destination physical server 111 has a smaller memory capacity, several methods are conceivable. One is to make it appear to the OS 302 as if the same capacity as at the migration source were available.
  • This can be realized by not copying, when the memory 301 is copied, the unused areas that do not affect the operation of the OS 302 or the cache information of the OS 302 and applications, and by having a program such as a driver running on the OS 302 appear to have reserved the missing memory area. Note that the above processing is not necessary when, instead of being moved while operating, the virtual server 109 is shut down once and started at the migration destination.
  • The virtual server definition information management unit 103 performs processing to improve reliability by moving the virtual server 109 even when the physical server 111 does not have enough free resources. Specifically, in order to move the virtual server 109 wherever possible, that is, to protect the virtual server 109, it changes the definition information based on free resources, priorities, and the actual usage of computer resources so that the migration becomes possible.
  • FIG. 13 is a flowchart showing an outline of processing performed by the virtual server definition information management unit 103. This process corresponds to step 1207 in FIG. 12.
  • In step 1301, the virtual server definition information management unit 103 looks up the actual usage of computer resources by the migration target virtual server 109 in the virtual server management table 107.
  • Specifically, the virtual server definition information management unit 103 refers to the processor/memory actual usage 607 stored in the virtual server management table 107 to obtain the actual usage of computer resources.
  • In step 1302, the virtual server definition information management unit 103 searches the virtual server management table 107, selects, for example, the physical server 111 with the most unused resources, and acquires its unused resource information.
  • In step 1303, the virtual server definition information management unit 103 compares the unused resource information acquired in step 1302 with the actual usage of computer resources acquired in step 1301, and determines whether the unused resources are larger than the actual usage. The purpose is to select, when only part of the computer resources allocated to the virtual server 109 is actually in use, a physical server 111 that has the minimum resources needed to maintain the current service level, such as performance. When the unused resources are larger than the actual usage, that is, when a physical server 111 holding the minimum computer resources capable of executing the migration target virtual server 109 is found, the process proceeds to step 1307, where the virtual server definition information management unit 103 copies the virtual server definition information 309.
  • In step 1308, the virtual server definition information management unit 103 changes the virtual server definition information 309 copied in step 1307 according to the information on the unused resources of the physical server 111 found in step 1302.
  • As a result, the virtual server definition information 309 used for the migration allows the virtual server 109 to run with fewer computer resources than the original virtual server definition information 309.
  • Next, the virtual server definition information management unit 103 reflects the virtual server definition information 309 changed in step 1308 in the virtual server definition information table 118. That is, it stores the copied virtual server definition information 309, changed according to the unused resource information, on the definition information storage disk 115 of the disk array device 116. It then adds the storage location (path) of that definition information to the movement definition information 704 in the virtual server definition information table 118 and stores the current date and time in the movement date 705. This generates a movement history for the migrated virtual server 109.
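  • Steps 1301 to 1303 and 1307 to 1308, together with the movement history, can be sketched as follows. The dictionary keys `cpu_mhz` and `memory_mb`, the rule for ranking servers, and the shape of a history entry are assumptions made for illustration; the sketch also caps the reduced definition at the original allocation, which the text does not state explicitly.

```python
import copy
from datetime import datetime
from typing import Dict, List, Optional

Resources = Dict[str, int]


def plan_reduced_migration(definition: Resources, actual_usage: Resources,
                           unused_by_server: Dict[str, Resources],
                           history: List[dict],
                           now: Optional[datetime] = None) -> Optional[str]:
    """Pick a destination and record a reduced definition, or return None."""
    if not unused_by_server:
        return None
    # Step 1302: choose the server with the most unused resources
    # (ranking by memory, then CPU, is an assumption of this sketch).
    server = max(unused_by_server,
                 key=lambda s: (unused_by_server[s]["memory_mb"],
                                unused_by_server[s]["cpu_mhz"]))
    unused = unused_by_server[server]
    # Step 1303: the unused resources must be larger than the actual usage.
    if any(unused[k] <= actual_usage[k] for k in actual_usage):
        return None
    # Steps 1307-1308: copy the definition and shrink the copy to fit the
    # destination's unused resources, never exceeding the original allocation.
    moved = copy.deepcopy(definition)
    for k in actual_usage:
        moved[k] = min(definition[k], unused[k])
    # Record the reduced definition and the date as a movement history entry.
    history.append({"server": server, "definition": moved,
                    "moved_at": now or datetime.now()})
    return server
```

  • The original definition is left untouched so that it remains available when the virtual server is later returned.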
  • If it is determined in step 1303 that no unused resources larger than the actual usage were found, the process proceeds to step 1304.
  • In step 1304, since no physical server 111 with the minimum required free resources exists, the virtual server definition information management unit 103 selects, for example, the physical server 111 with the most unused resources.
  • In step 1306, allocated resources are taken from the virtual server 109 selected in step 1305 and given to the higher-priority virtual server 109. That is, the virtual server definition information management unit 103 reduces the resources allocated to a virtual server 109 with a lower priority than the migration target virtual server 109, and adds the freed amount to the unused resources of the physical server 111. It then allocates the unused resources, increased by the amount taken from the low-priority virtual server 109, to the migration target virtual server 109. The process then proceeds to step 1307 described above.
  • The amount of computer resources taken from the low-priority virtual server 109 is determined by referring to the virtual server management table 107, and ranges from the processor/memory allocation 605 down to the processor/memory actual usage 607 of the virtual server 109 being stripped.
  • The virtual server definition information management unit 103 updates the virtual server definition information 309 of the stripped virtual server 109 with the reduced computer resources.
  • If the computer resources taken from one virtual server 109 are not enough to run the migration target virtual server 109, the virtual server definition information management unit 103 takes computer resources from a plurality of virtual servers 109 whose priority is lower than that of the migration target virtual server 109.
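  • Taking resources from lower-priority virtual servers can be sketched as below. The record fields, the single scalar resource, the convention that a larger number means higher priority, and the choice to strip the lowest priority first are all assumptions of this illustration.

```python
from typing import Dict, List


def strip_low_priority(vms: List[dict], target_priority: int, needed: int) -> Dict[str, int]:
    """Free `needed` units by shrinking lower-priority virtual servers.

    Each record has "name", "priority", "allocated", and "actual". A virtual
    server is never reduced below its actual usage. Returns {name: amount taken}
    and raises ValueError, changing nothing, if the need cannot be met.
    """
    plan: Dict[str, int] = {}
    remaining = needed
    # Consider only virtual servers with lower priority, lowest priority first.
    candidates = sorted((v for v in vms if v["priority"] < target_priority),
                        key=lambda v: v["priority"])
    for vm in candidates:
        if remaining <= 0:
            break
        spare = vm["allocated"] - vm["actual"]  # allocation minus actual usage
        take = min(spare, remaining)
        if take > 0:
            plan[vm["name"]] = take
            remaining -= take
    if remaining > 0:
        raise ValueError("not enough spare resources on lower-priority virtual servers")
    # Apply the plan only once it is known to be sufficient.
    for vm in vms:
        vm["allocated"] -= plan.get(vm["name"], 0)
    return plan
```

  • When one virtual server cannot supply the whole amount, the loop continues to the next lower-priority virtual server, which mirrors the case of stripping from a plurality of virtual servers.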
  • The above processing makes it possible to recover the virtual server 109 affected by a failure or failure sign in stages, depending on the state of free resources.
  • In particular, by reducing the computer resources allocated to the migration target virtual server 109 to those actually used and allocating the unused resources of the migration destination physical server 111 to it, the minimum necessary computer resources can be secured on the migration destination physical server 111, guaranteeing the operation of the virtual server 109.
  • In step 1303 above, it is determined whether the unused resources are larger than the actual usage; however, the determination may instead be whether the unused resources are equal to or greater than the actual usage.
  • FIG. 14 is a flowchart showing an outline of processing performed by the virtual server return unit 119. This process is executed when an administrator or the like instructs it to start from a console (not shown) of the management server 101.
  • In step 1401, the virtual server return unit 119 selects the virtual server 109 to be returned.
  • The administrator may explicitly select the virtual server 109 to be returned, or a virtual server 109 that was moved to a standby physical server 111 may be selected automatically, triggered by an event indicating that the physical server 111 has been restored to a normal state, for example by replacement.
  • In step 1402, the virtual server return unit 119 determines whether the virtual server 109 selected in step 1401 is in the moved state. This can be determined by checking whether definition information is recorded in the movement definition information 704 of the virtual server definition information table 118. If the virtual server 109 is in the moved state, the process proceeds to step 1403. If the virtual server 109 is in the original state, the process proceeds to step 1403.
  • In step 1403, the virtual server return unit 119 extracts the movement definition information 704 of the virtual server 109 selected in step 1401 from the virtual server definition information table 118.
  • In step 1404, if the extracted movement definition information 704 shows that the virtual server 109 has been moved a plurality of times, the virtual server return unit 119 selects the physical server 111 to move to. Whether the virtual server 109 has been moved a plurality of times can be determined from whether a plurality of movement histories are recorded in the movement definition information 704.
  • Several methods are conceivable for selecting the destination physical server 111. If the physical server 111 in which the failure sign was detected has been repaired by replacing it with a new physical server 111, the replacement physical server 111 may be selected.
  • Alternatively, a new physical server 111 different from those the virtual server 109 has been moved to may be selected. In either case, high reliability can be maintained by selecting, as far as possible, a physical server 111 in which no failure or failure sign has occurred.
  • Next, the virtual server return unit 119 acquires the definition information of the virtual server 109 to be used at the destination from the movement definition information 704 in the virtual server definition information table 118. That is, it acquires, from the movement definition information 704 extracted in step 1403, the entry corresponding to the physical server 111 selected as the destination. When moving to a physical server 111 different from any of those the virtual server 109 was moved to, the original virtual server definition information may be used.
  • The original (initially assigned) virtual server definition information 309 can be acquired by referring to the virtual server definition information 309 described in the original definition information 703 of the virtual server 109 in the virtual server definition information table 118.
  • In step 1406, the virtual server return unit 119 searches for free resources on the destination physical server 111.
  • The same computer resources of the physical server 111 as before the virtual server 109 was moved cannot necessarily be secured, so this step is needed for confirmation.
  • In step 1407, it is determined whether the free resources acquired by the virtual server return unit 119 in step 1406 satisfy the virtual server definition information 309 of the virtual server 109 acquired in step 1405. If they do not, the administrator is notified in step 1409 that resources are insufficient. Alternatively, the processing of the virtual server definition information management unit 103 may be executed to change the virtual server definition information 309 of the virtual server 109 within a range that still satisfies the required performance, and the move may be continued.
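  • The check in step 1407 and its two outcomes can be sketched as follows. The `move` and `notify_admin` callbacks and the resource keys are hypothetical; the alternative of reducing the definition and continuing is not shown.

```python
from typing import Callable, Dict


def try_return(free: Dict[str, int], definition: Dict[str, int],
               move: Callable[[Dict[str, int]], None],
               notify_admin: Callable[[str], None]) -> bool:
    """Move the virtual server back only if the destination satisfies its definition."""
    # Step 1407: every resource in the definition must fit within the free resources.
    short = [k for k, need in definition.items() if free.get(k, 0) < need]
    if short:
        # Step 1409: report the shortage instead of moving.
        notify_admin("insufficient resources: " + ", ".join(sorted(short)))
        return False
    # Resources are sufficient, so the move itself can be carried out.
    move(definition)
    return True
```

  • A resource missing from `free` is treated as zero, so an incomplete report from the destination is handled as a shortage rather than as success.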
  • In step 1408, the virtual server 109 is moved using the virtual server definition information 309 to be used at the destination.
  • In this way, the virtual server 109 can be restored to the computer resources it had before the move without degrading the service level, such as performance.
  • In particular, by referring to the original definition information 703, the virtual server 109 can be returned to the active physical server 111 very easily.
  • The present invention can also be applied to triggers other than a failure or failure sign.
  • For example, the present invention may be used to move the virtual server 109 in order to distribute load. That is, when the load (for example, the processor usage rate) of a physical server 111 exceeds a predetermined threshold, the virtual server 109 can be moved to another physical server 111 whose load is below the predetermined threshold.
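  • The load-based trigger can be sketched as below. Choosing the least loaded server among the eligible ones is an assumption; the text only requires that the destination's load be below the threshold.

```python
from typing import Dict, Optional


def pick_rebalance_target(loads: Dict[str, float], source: str,
                          threshold: float) -> Optional[str]:
    """Return a destination server for `source`, or None if no move applies."""
    # A move is triggered only when the source exceeds the threshold.
    if loads[source] <= threshold:
        return None
    # Eligible destinations are the other servers whose load is below the threshold.
    candidates = {s: load for s, load in loads.items()
                  if s != source and load < threshold}
    if not candidates:
        return None
    # Assumption: prefer the least loaded eligible server.
    return min(candidates, key=candidates.get)
```

  • With processor usage rates of 0.9, 0.5, and 0.3 and a threshold of 0.8, a virtual server on the first physical server would be moved to the third.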
  • Likewise, by using the present invention to move lightly loaded virtual servers 109 and consolidate them on specific physical servers 111, the power consumption of the entire system can be reduced.
  • In this way, the virtual server 109 can be moved between physical servers 111 for purposes such as load distribution and power consumption optimization.
  • As described above, the present invention can be applied to a virtual computer system that operates a plurality of virtual servers on a plurality of physical servers.

Abstract

Disclosed is a virtual machine system which has a plurality of physical computers having virtualization units which configure a plurality of virtual machines, and a management computer which is connected via a network. The management computer allocates the computer resources of a first physical computer from among the plurality of physical computers and operates the virtual machines. Information about the computer resources allocated to the virtual machines is retained as virtual machine definition information, and the computer resources of the first physical computer actually used by the virtual machines are acquired as an actual usage amount. When predetermined conditions arise, a physical computer which can maintain at least the same amount of computer resources as the actual usage amount is selected from among the plurality of physical computers as a second physical computer to be used as a migration destination for the virtual machines. Then the definition information is updated to the actual usage amount and the virtual machines are migrated to the selected second physical computer.

Description

Virtual computer migration method, virtual computer system, and storage medium storing program
 The present invention relates to a method and system for improving the reliability of a virtual server.
 As the number of servers in corporate computer systems and data centers has grown, operation management costs have also risen. One way to address this problem is server virtualization technology, which allows a plurality of virtual servers to run on a single physical server. A physical server has resources such as processors and memory; server virtualization divides these physical resources and assigns each portion to a different virtual server, so that a plurality of virtual servers run simultaneously on a single physical server. Improved processor performance and the falling cost of resources such as memory have increased the demand for server virtualization technology.
 At the same time, the demand for highly reliable systems is also growing, because the increasing dependence of enterprise systems on computers has increased the damage caused by system outages. A common technique for making a system highly reliable is to prepare a standby server in addition to the active server and to switch to the standby server when a failure occurs in the active server.
 Given these two demands, server virtualization and high reliability, it is natural that a demand arises for technology that maintains high reliability in a virtualized server environment (see, for example, Patent Document 1). However, the two technologies have conflicting characteristics. For example, when a plurality of virtual servers are built on one physical server, a failure of that physical server stops all the virtual servers running on it at once. In a system built from a plurality of independent servers, the failure of a single physical server affects only a small range, but virtualization technology, which can consolidate a plurality of servers onto a single physical server, enlarges the range affected by a failure. Reliability therefore tends to be lower in a virtualized environment.
Patent Document 1: JP 2001-216171 A
 The problem to be solved by the present invention is to achieve high reliability at low cost even in a virtualized server environment, and in particular to reduce the number of replacement (standby) servers needed for high reliability.
 When a server fails and operation is switched to a replacement server, the virtual servers that were running must be known accurately. Unlike physical servers, virtual servers can be added or removed relatively easily as long as the physical server has spare resources such as processors and memory. In the conventional example described above, however, a failure cannot be recovered from when there are no spare resources. For example, when a serious failure occurs in a virtual server that has been allocated a 3 GHz CPU and 4 GB of memory, a physical server with equivalent resources must exist.
 The present invention has been made in view of the above problem, and its object is to make it possible to take over a virtual server even when the takeover destination physical server has fewer resources than the takeover source physical server.
 A typical example of the invention disclosed in this specification is as follows. It is a method for migrating a virtual computer in a virtual computer system that has a plurality of physical computers, each including a processor and memory and having a virtualization unit that constructs a plurality of virtual computers, and a management computer that includes a processor and memory and is connected to the physical computers via a network. The method includes: a virtual computer operating step in which the management computer allocates computer resources of a first physical computer among the plurality of physical computers to the virtual computer and operates the virtual computer; a holding step in which the management computer holds information on the computer resources allocated to the virtual computer as definition information of the virtual computer; a resource usage acquisition step in which the management computer acquires, as an actual usage amount, the computer resources actually used by the virtual computer on the first physical computer; a determination step in which the management computer determines whether a predetermined condition is satisfied; a selection step in which, when the predetermined condition is satisfied, the management computer selects, from among the plurality of physical computers, a physical computer capable of securing computer resources equal to or greater than the actual usage amount as a second physical computer to serve as the migration destination of the virtual computer; and a migration step in which the management computer updates the definition information to the actual usage amount and migrates the virtual computer to the selected second physical computer.
 Therefore, according to the representative embodiment of the present invention, a virtual computer can be migrated, or recovered from a failure, with fewer computer resources than on the physical computer before the migration.
FIG. 1 is a block diagram of the virtual computer system according to the embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the management server according to the embodiment of the present invention.
FIG. 3 is a block diagram showing the detailed configuration of a physical server, managed by the management server, on which the server virtualization mechanism is running.
FIG. 4 is a block diagram showing an outline of the operation of the virtual computer system according to the embodiment of the present invention.
FIG. 5 is an explanatory diagram showing details of the server management table according to the embodiment of the present invention.
FIG. 6 is an explanatory diagram showing details of the virtual server management table 107 according to the embodiment of the present invention.
FIG. 7 is an explanatory diagram showing the configuration of the virtual server definition information table 118 according to the embodiment of the present invention.
FIG. 8 is an explanatory diagram showing the contents of the definition information of a virtual server according to the embodiment of the present invention.
FIG. 9 is a flowchart showing an example of processing performed by the failure recovery unit according to the embodiment of the present invention.
FIG. 10 is a flowchart showing an example of processing performed by the failure sign management unit according to the embodiment of the present invention.
FIG. 11 is a flowchart showing an example of processing performed by the virtual server recovery method selection unit according to the embodiment of the present invention.
FIG. 12 is a flowchart showing an outline of processing performed by the virtual server recovery unit according to the embodiment of the present invention.
FIG. 13 is a flowchart showing an outline of processing performed by the virtual server definition information management unit according to the embodiment of the present invention.
FIG. 14 is a flowchart showing an outline of processing performed by the virtual server return unit according to the embodiment of the present invention.
 以下、本発明の一実施形態を添付図面に基づいて説明する。 Hereinafter, an embodiment of the present invention will be described with reference to the accompanying drawings.
 図1は、本発明を適用する仮想計算機システムのブロック図である。本実施形態における制御の中心は、管理サーバ101である。管理サーバ101は、障害回復部102、仮想サーバ定義情報管理部103、障害予兆管理部104、仮想サーバ回復方法選択部105、仮想サーバ回復部106、仮想サーバ管理テーブル107、サーバ管理テーブル108、仮想サーバ定義情報テーブル118、及び仮想サーバ復帰部119を有する。 FIG. 1 is a block diagram of a virtual machine system to which the present invention is applied. The center of control in this embodiment is the management server 101. The management server 101 includes a failure recovery unit 102, a virtual server definition information management unit 103, a failure sign management unit 104, a virtual server recovery method selection unit 105, a virtual server recovery unit 106, a virtual server management table 107, a server management table 108, a virtual server A server definition information table 118 and a virtual server return unit 119 are included.
The management server 101 manages the network switch 120, the physical servers 111-1 to 111-n, the server virtualization mechanism 110, the storage switch 112, the virtual servers 109, and the disk array device 116. The server virtualization mechanism 110 makes a physical server 111 appear as a plurality of virtual servers 109, so that multiple servers can be consolidated onto a single physical server 111. The server virtualization mechanism 110 can be implemented as a hypervisor, a VMM (Virtual Machine Manager), or the like. Hereinafter, the physical servers 111-1 to 111-n are collectively referred to as the physical server 111.
The disk array device 116 is connected to the physical server 111 via the storage switch 112. The storage switch 112 that connects the disk array device 116 and the physical server 111 constitutes a SAN (Storage Area Network). The network switch 120 that connects the management server 101 and the physical server 111 constitutes the network 207 shown in FIG. 2.
The disk array device 116 has a virtual server image storage disk 114, which stores the programs executed by the virtual servers 109, and a definition information storage disk 115, which stores the computer resource allocation information of the virtual servers 109. In this embodiment, when a failure or a failure sign is detected in any of the physical servers 111, the virtual servers affected by the failure or failure sign are moved to another physical server, which yields a highly reliable system.
The functional elements of the management server 101 are outlined below.
As described later, the failure recovery unit 102 controls the recovery of the physical server 111 and the recovery of the virtual server 109 when a failure or a failure sign is detected in the physical server 111.
The virtual server definition information management unit 103 performs the processing that allows the virtual server 109 to be moved, and reliability thereby improved, even when the physical server 111 that is to take over the virtual server 109 does not have sufficient free resources.
The failure sign management unit 104 checks each physical server 111 managed by the management server 101 for the occurrence of a failure or a failure sign.
The virtual server recovery method selection unit 105 selects the recovery method for the virtual server 109 based on the detection result of the failure or failure sign in the physical server 111, the priority of the virtual server 109 affected by the failure, and the free computer resources of the physical servers 111.
The virtual server recovery unit 106 recovers the virtual server 109 based on the result produced by the virtual server recovery method selection unit 105.
The virtual server management table 107 stores detailed information about the server virtualization mechanism 110. The server management table 108 manages the resources of the physical server 111. The virtual server definition information table 118 stores, for each virtual server 109, its priority, the location of the original allocation information (the resources of the physical server 111 as first allocated), and information about the takeover of the virtual server.
The virtual server return unit 119 returns a virtual server 109 that has been moved to a standby physical server 111 to its original active physical server 111. In FIG. 1, for example, the physical servers 111-1 and 111-2 can constitute the active system and the physical server 111-n can constitute the standby system.
FIG. 2 is a block diagram showing the configuration of the management server 101 in the present invention. The management server 101 includes a memory 202, a processor 203, an FCA (Fibre Channel Adapter) 204, a NIC (Network Interface Card) 205, and a BMC (Baseboard Management Controller) 206.
The processor 203 executes the various programs stored in the memory 202. The FCA 204 is connected to external storage (for example, the disk array device 116). The NIC 205 and the BMC 206 are connected to the network 207. The NIC 205 communicates with other servers via the network 207, mainly in response to requests from the programs in the memory 202. The BMC 206 detects failures and the like in the management server 101 and is used to communicate with other servers via the network 207. In this embodiment the NIC 205 and the BMC 206 are connected to the same network 207, but they may be connected to different networks. One FCA 204 and one NIC 205 are shown, but there may be more than one of each.
The memory 202 stores, as programs, the failure recovery unit 102, the virtual server definition information management unit 103, the failure sign management unit 104, the virtual server recovery method selection unit 105, the virtual server recovery unit 106, the virtual server management table 107, the server management table 108, and the virtual server definition information table 118. The processor 203 executes each program stored in the memory 202. Each of these programs is stored in storage serving as a storage medium (for example, the disk array device 116) and is loaded into the memory 202 as needed.
FIG. 3 is a block diagram showing the detailed configuration of a physical server 111 on which the server virtualization mechanism 110 managed by the management server 101 is running. The physical server 111 includes a memory 301, a processor 303, an FCA (Fibre Channel Adapter) 304, a NIC (Network Interface Card) 305, and a BMC (Baseboard Management Controller) 306.
The processor 303 executes the various programs stored in the memory 301. The FCA 304 is connected to the disk array device 116 via the storage switch 112 shown in FIG. 1. The NIC 305 and the BMC 306 are connected to the network 207. The NIC 305 communicates with other servers, mainly in response to requests from the programs in the memory 301. The BMC 306 detects failures and the like in the physical server 111 and is used to communicate with the management server 101 and other servers via the network 207. In this embodiment the NIC 305 and the BMC 306 are connected to the same network, but they may be connected to different networks. One FCA 304 and one NIC 305 are shown, but there may be more than one of each.
When the server virtualization mechanism 110 runs in the memory 301, a plurality of virtual servers 109-1 to 109-m can be constructed. The virtual servers 109-1 to 109-m are collectively referred to as the virtual server 109. Each virtual server 109 can run its own OS (Operating System) 302 independently. When the processor 303 executes the server virtualization mechanism 110, a plurality of virtual servers 109 can be constructed on the server virtualization mechanism 110.
Each independent virtual server 109 is constructed when the server virtualization mechanism 110 reads and executes a predetermined virtual server OS image 308 from the virtual server image storage disk 114 set up in advance in the disk array device 116. The definition information storage disk 115 of the disk array device 116 stores the virtual server definition information 309, which holds the definition information of each virtual server 109.
The virtual server definition information 309 in the disk array device 116 is shared by a plurality of physical servers 111 via the storage switch 112 and can be set up so that any physical server 111 can reference it. Providing a virtual server OS image 308 and virtual server definition information 309 for each virtual server 109 allows entirely different OSs and applications to run on a single physical server 111, one per virtual server 109.
The control I/F (Interface) 311 is an interface for controlling the server virtualization mechanism 110 from the external management server 101 or the like via the network 207. Through this control I/F 311, the management server 101 can create and delete virtual servers 109 on the physical server 111. For this purpose, a network address is assigned to the control I/F 311 of the server virtualization mechanism 110 of each physical server 111.
FIG. 4 is a block diagram showing an outline of the operation of the virtual machine system of the present invention. The management server 101 is connected via the network 207 to the plurality of physical servers 111 it manages, and failure information, failure sign information, control information, and the like of each physical server 111 can be transferred between them.
In the present invention, when the management server 101 (1) detects a failure or a failure sign in a physical server 111, it moves the virtual servers 109 affected by that failure or failure sign to another physical server 111. Depending on how much of each physical server's computer resources are free, the destination physical server 111 may not have enough computer resources, and the move may not be possible. In such a case, the present invention (2) updates the virtual server definition information 309 according to the actual operating status of the virtual server and the free computer resources. With the updated virtual server definition information 309, (3) the virtual server 109 can be recovered from the failure or failure sign on a physical server 111-n that has fewer computer resources than the physical server 111-1 it ran on before the move.
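The update in (2) can be illustrated with a minimal sketch. It assumes that resources are reduced to a processor amount in GHz and a memory amount in GB, and that "updating according to actual operating status" means shrinking the allocation to the measured usage; the names `Resources`, `fits`, and `definition_for_move` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    cpu_ghz: float  # processor amount, expressed as clock speed
    mem_gb: float   # memory amount


def fits(need: Resources, free: Resources) -> bool:
    """True when the free resources of the destination cover the need."""
    return need.cpu_ghz <= free.cpu_ghz and need.mem_gb <= free.mem_gb


def definition_for_move(original: Resources,
                        actual_use: Resources,
                        free: Resources) -> Optional[Resources]:
    """Pick the allocation to use on the destination physical server.

    Assumed policy: keep the original allocation when it fits; otherwise
    shrink to the actual usage; return None when even that does not fit.
    """
    if fits(original, free):
        return original
    shrunk = Resources(min(original.cpu_ghz, actual_use.cpu_ghz),
                       min(original.mem_gb, actual_use.mem_gb))
    return shrunk if fits(shrunk, free) else None
```

For instance, a virtual server defined with 2.0 GHz and 4 GB that actually uses 0.5 GHz and 1 GB could be moved to a standby server with only 1.0 GHz and 2 GB free by using the shrunk allocation.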
FIG. 5 is an explanatory diagram showing details of the server management table 108. The server management table 108 stores detailed information about the physical servers 111. The physical server identifier 501 stores an identifier that specifies the physical server 111. The boot disk 502 indicates the location (for example, the path) of the boot disk of the physical server 111. The server identifier 503 indicates the unique identifier (for example, a World Wide Name, WWN) of the FCA 304 connected to the disk array device 116. The server mode 504 indicates the operating state of the physical server 111 and stores information for determining whether the server virtualization mechanism 110 is running. The processor/memory 505 stores the processor information and memory capacity of the physical server 111. The processor information consists of the processor clock speed, the number of cores, and the like. The network identifier 506 stores information that identifies the NIC 305 of the physical server 111. When one physical server 111 has a plurality of NICs 305, a plurality of identifiers are stored. The network port 507 stores the port number of the network switch 120 to which the NIC 305 is connected. It is stored so that the VLAN of the network switch can be configured when maintaining the network security of the physical server 111. The disk 508 stores the identifiers of the disks in the disk array device 116 that belong to the physical server 111. A disk identifier is, for example, a LUN (Logical Unit Number). In FIG. 5, LUN10 is listed for a plurality of physical servers 111 (servers 1 to 4 in the figure), which indicates that it can be shared by those physical servers 111. The virtualization mechanism identifier 509 stores an identifier that specifies the server virtualization mechanism 110 when the server virtualization mechanism 110 is running on the physical server 111. This virtualization mechanism identifier 509 is associated with the management table of the server virtualization mechanism (the virtual server management table 107) described later. The server state 510 indicates the state and role of the physical server 111 and stores, for example, information indicating whether it is an active system or a standby system. In the embodiment of the present invention, it is used when switching to a standby system after a failure occurs in one of the active systems.
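As an illustration of the columns above, one row of the server management table 108 might be modelled as follows. This is only a sketch: the field names, the string values such as "standby", and the helper functions are assumptions chosen for readability, not part of the specification.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class ServerRow:
    physical_server_id: str            # 501
    boot_disk: str                     # 502 (location of the boot disk)
    server_id_wwn: str                 # 503 (e.g. WWN of the FCA)
    server_mode: str                   # 504 (assumed: "virtualized" / "basic")
    cpu_ghz: float                     # 505 processor clock speed
    cores: int                         # 505 number of cores
    mem_gb: float                      # 505 memory capacity
    network_ids: List[str]             # 506 one entry per NIC
    network_ports: List[int]           # 507 switch port numbers
    disks: List[str]                   # 508 e.g. LUN identifiers
    virtualization_id: Optional[str]   # 509 None when no hypervisor runs
    server_state: str                  # 510 (assumed: "active" / "standby")


def standby_servers(table: List[ServerRow]) -> List[ServerRow]:
    """Rows whose server state marks them as standby systems."""
    return [row for row in table if row.server_state == "standby"]


def shared_disks(table: List[ServerRow]) -> Set[str]:
    """Disk identifiers listed for more than one physical server."""
    counts = Counter(d for row in table for d in set(row.disks))
    return {disk for disk, n in counts.items() if n > 1}
```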
FIG. 6 is an explanatory diagram showing details of the virtual server management table 107.
The virtual server management table 107 stores detailed information about the server virtualization mechanism 110 of each physical server 111. The virtualization mechanism identifier 601 stores information that identifies the server virtualization mechanism 110 of each physical server 111 managed by the management server 101. The control I/F address 602 stores the network address used to access the control I/F 311, through which the server virtualization mechanism 110 is controlled from outside. The virtual server identifier 603 stores an identifier that is unique to each virtual server 109.
The virtual server OS image 604 stores the location (path) of the virtual server OS image 308, indicating which OS image the virtual server 109 was booted from. The processor/memory allocation 605 indicates the amount of computer resources allocated to the virtual server.
The amount of computer resources includes, for example, the processor clock speed and the memory capacity allocated to the virtual server 109. The state 606 stores information indicating whether the virtual server 109 is currently running. The processor/memory actual usage 607 stores the processor utilization and the memory capacity that the virtual server is actually using. The processor/memory actual usage 607 can be obtained by providing a means of periodically collecting, from the server virtualization mechanism 110 or from the OS running on the virtual server 109, the actual usage (performance information) of the allocated resources. One possible method is to store the average usage (or utilization) per unit time as the processor/memory actual usage 607. In the illustrated example, the average utilization of the processor 303 is expressed as a clock speed in GHz, and the average usage of the memory 301 is expressed in GB. The average utilization of the processor 303 is the actual utilization of the processor 303 resources (virtual processors) allocated to the virtual server 109.
The network allocation 608 stores the NIC identifier of the virtual server together with the information that maps it to the corresponding NIC 305 of the physical server 111. The disk 609 stores the locations of the virtual server OS image 308 and the data storage image files allocated to the virtual server 109.
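The relation between the allocation 605 and the actual usage 607 can be sketched as follows. The record layout and field names are assumptions made for illustration; the network allocation 608 and disk 609 columns are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VirtualServerRow:
    virtualization_id: str    # 601
    control_if_address: str   # 602
    virtual_server_id: str    # 603
    os_image: str             # 604 (path of the OS image)
    alloc_cpu_ghz: float      # 605 processor allocation
    alloc_mem_gb: float       # 605 memory allocation
    running: bool             # 606
    used_cpu_ghz: float       # 607 average processor usage
    used_mem_gb: float        # 607 average memory usage


def unused_allocation(row: VirtualServerRow) -> Tuple[float, float]:
    """Allocated but unused (processor GHz, memory GB) of one virtual server."""
    return (row.alloc_cpu_ghz - row.used_cpu_ghz,
            row.alloc_mem_gb - row.used_mem_gb)


def running_on(table: List[VirtualServerRow],
               virtualization_id: str) -> List[VirtualServerRow]:
    """Running virtual servers hosted by one server virtualization mechanism."""
    return [r for r in table
            if r.virtualization_id == virtualization_id and r.running]
```

The gap returned by `unused_allocation` is the amount by which an allocation could, in principle, be reduced when a destination server is short of resources.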
FIG. 7 shows the configuration of the virtual server definition information table 118. The virtual server identifier 701 stores an identifier that is unique to each virtual server 109. The virtual server priority 702 stores the importance of the virtual server 109 as a numerical value. A priority of "1" is the most important, and importance decreases as the value increases. The administrator enters the value of the virtual server priority 702 when creating a virtual server from the management server 101 or the like, according to the priority of the workload that will run on that virtual server.
The original definition information 703 stores the location of the original definition information of the virtual server 109, that is, its definition as first allocated on a physical server 111. A feature of the present invention is that reliability is improved by moving a virtual server to another physical server 111 when a failure or a failure sign is detected, and in doing so the definition information may be changed depending on the computer resources of the destination. The original definition information is kept because it is needed when the virtual server 109 is returned to its original physical server 111 after the cause of the failure or failure sign has been removed. As a result, even if the definition information of the virtual server 109 is updated during a move, the original configuration of the virtual server 109 can always be restored by referring to the original definition information 703.
The migration definition information 704 stores the definition information of the virtual server 109 as updated when the virtual server 109 was moved. The migration date and time 705 stores the date and time at which the virtual server 109 was moved. The migration definition information 704 and the migration date and time 705 may be stored as a history. Keeping a history of the changes made to the allocated resources during moves makes it easier to identify, for example, which physical server 111 failed, which in turn simplifies failure analysis. When the virtual server 109 has moved more than once, an entry may be added to the migration definition information 704 and the migration date and time 705 on every move. For example, after a failure sign is detected and the virtual server 109 is moved, a failure sign may be detected again at the destination and the virtual server 109 moved further. In such a case the virtual server 109 moves multiple times, and an entry is added to the migration definition information 704 and the migration date and time 705 each time. Later, when the virtual server 109 is to be returned to its original physical server 111, for example after the physical server 111 in which the failure sign was detected has been repaired, it may be returned by retracing the moves in reverse order, or it may be moved based on the original definition information 703 regardless of the order of the moves. Where to return the virtual server can be decided flexibly according to the purpose of the move, such as the repair of the physical server 111 or a wish to move to a higher-performance physical server 111.
Keeping a history of multiple moves of the virtual server 109 in this way increases the options for returning it to, or moving it between, physical servers 111, so the virtual server 109 can be operated more flexibly.
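The history handling described for FIG. 7 can be sketched as follows. The entry layout, the file paths, and the `step_back` option are assumptions for illustration; the specification states only that the return may either retrace the moves or use the original definition information 703.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class DefinitionEntry:
    virtual_server_id: str     # 701
    priority: int              # 702 (1 is the most important)
    original_definition: str   # 703 (location of the original definition)
    # Each move appends (704 migration definition location, 705 date and time).
    moves: List[Tuple[str, datetime]] = field(default_factory=list)


def record_move(entry: DefinitionEntry, definition: str, when: datetime) -> None:
    """Append one move to the history instead of overwriting the last one."""
    entry.moves.append((definition, when))


def definition_for_return(entry: DefinitionEntry, step_back: bool = False) -> str:
    """Choose the definition to use when returning the virtual server.

    step_back=True retraces one move (the definition used before the latest
    move); otherwise the original definition is used regardless of history.
    """
    if step_back and len(entry.moves) >= 2:
        return entry.moves[-2][0]
    return entry.original_definition
```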
FIG. 8 is an explanatory diagram showing the contents of the virtual server definition information 309. The virtual server name 801 stores the name of the virtual server 109. The allocated resources 802 stores the information needed to create the virtual server 109, such as the processor allocation, the memory allocation, the network allocation information, the location where the virtual server OS image 308 is stored, and the location where the data disk image is stored.
The priority 803 stores the same content as the virtual server priority 702 stored in the virtual server definition information table 118. The migration history 804 stores the migration history information of the virtual server 109. The migration history 804 can be used as information for deciding the destination when the virtual server 109 is moved. For example, if the migration histories 804 of a plurality of definition information entries are aggregated and a particular virtual server 109 turns out to have been moved frequently because of failure signs, that virtual server 109 is at high risk of a degraded service level. The migration history 804 can therefore also be used as analysis information for selecting a more reliable physical server 111 as the destination of the virtual server 109.
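The aggregation mentioned above can be sketched as follows. The format of the migration history 804 is not given in the text, so this sketch assumes it can be flattened into (virtual server identifier, reason) pairs; the reason string "failure_sign" and the function name are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def frequent_movers(history: Iterable[Tuple[str, str]],
                    threshold: int) -> List[str]:
    """Virtual servers moved at least `threshold` times due to failure signs."""
    counts = Counter(vm for vm, reason in history if reason == "failure_sign")
    return sorted(vm for vm, n in counts.items() if n >= threshold)
```

Virtual servers returned by such a query would be candidates for placement on a physical server judged more reliable.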
The definition information 805 indicates where the original definition information of the virtual server 109 is stored. In addition to the original definition information, when the virtual server 109 has moved more than once, the definition information used at each destination may be stored as a history every time the virtual server 109 moves.
FIG. 9 is a flowchart showing an outline of the processing performed by the failure recovery unit 102. The management server 101 executes this flowchart at a predetermined interval.
In step 901, the failure recovery unit 102 calls the failure sign management unit 104. As described later, the failure sign management unit 104 checks each physical server 111 managed by the management server 101 for a failure or a failure sign. Specifically, the failure sign management unit 104 queries the BMC 306 and the failure sign detection unit 310 of each physical server 111 for operation information and obtains the operation information of each physical server 111.
In step 902, using the result of step 901, the failure sign management unit 104 determines from the operation information of each physical server 111 whether a failure or a failure sign has occurred. When it detects a failure or a failure sign, the failure sign management unit 104 identifies the physical server 111 concerned, notifies the failure recovery unit 102, and the processing proceeds to step 903. When there is neither a failure nor a failure sign, the processing ends.
In step 903, the failure recovery unit 102 determines whether a virtualization mechanism is running on the physical server 111 in which the failure or failure sign was detected. If the server virtualization mechanism 110 is running on that physical server 111, the failure recovery unit 102 calls the virtual server recovery method selection unit 105 in step 904.
In step 904, the virtual server recovery method selection unit 105 selects the recovery method for the virtual server 109 based on the detection result of the failure or failure sign, the priority of the affected virtual server 109, and the free computer resources of the physical servers 111. The selection of the recovery method is described later.
In step 906, the failure recovery unit 102 calls the virtual server recovery unit 106. The virtual server recovery unit 106 recovers the virtual server 109 based on the result produced by the virtual server recovery method selection unit 105, as described later.
In step 907, the administrator is notified of the result of the recovery processing. The notification is made by displaying the result on, for example, a display device (not shown) of the management server 101.
If step 903 finds that the failure or failure sign comes from a physical server 111 on which the server virtualization mechanism 110 is not running, the physical server 111 is recovered in step 905. Recovery of a physical server 111 that is not running the server virtualization mechanism 110 is the same as in the conventional example and is not described in detail in this embodiment.
Through the above processing, when the failure recovery unit 102 detects a failure or a failure sign in a physical server 111 running the server virtualization mechanism 110, it reallocates resources according to the amount or ratio of computer resources that the virtual server 109 is actually using, sets fewer computer resources than were allocated at first, and has a standby physical server 111 take over the virtual server 109.
As a result, even when the standby physical server 111 has fewer computer resources than the active physical server 111, multiple virtual servers 109 can be reliably taken over by the standby physical server 111. The standby system can therefore be built from physical servers 111 with fewer computer resources than the active system, which reduces the introduction and operating costs of a virtual computer system that includes multiple physical servers 111.
FIG. 10 is a flowchart showing an example of the processing performed by the failure sign management unit 104. This processing corresponds to step 901 in FIG. 9.
The failure sign management unit 104 checks whether a failure or a failure sign has occurred in any of the managed physical servers 111. In step 1001, the failure sign management unit 104 selects a target physical server 111. In step 1002, the failure sign management unit 104 accesses the BMC 206 of the target physical server 111 and checks whether a hardware failure or failure sign has occurred.
The BMC 206 of each physical server 111 can monitor the state of the hardware, for example the processor 303, the memory 301, the temperature, the fans, and the power supply. A failure and a failure sign are distinguished as follows. For example, if the BMC 206 detects some fault in the processor 303 and the fault clears after several retries, a failure sign is detected; if the retries do not clear the fault, a failure is detected. The same applies to other components. Similarly, when monitoring the temperature of the processor 303, the memory 301, or the chipset (not shown), a failure sign may be detected if the temperature exceeds a predetermined threshold but falls again after a certain period, and a failure may be detected if the temperature is still above the threshold after a certain time has elapsed. In addition, a failure sign can be given an importance level according to the impact that the corresponding failure would have. For example, a failure sign in the processor 303 has high importance because a stopped processor 303 is very likely to bring down the entire system, whereas a failure sign in the memory 301 has low importance as long as the bit errors remain correctable. Even for a failure sign in the processor 303, the importance may be lowered when multiple processors 303 are installed. Assigning importance levels to failure signs in this way makes it possible to control the recovery with a wider choice of recovery means.
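The distinction between a failure and a failure sign described above can be illustrated with a minimal Python sketch. The names `Status`, `classify_retry`, and `classify_temperature`, and the use of a fixed number of follow-up readings as the "certain period", are assumptions made for illustration; the source does not specify how the BMC 206 implements these checks.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "normal"
    FAILURE_SIGN = "failure sign"
    FAILURE = "failure"


def classify_retry(fault_detected: bool, cleared_by_retry: bool) -> Status:
    """A fault that clears after retries is a sign; one that persists is a failure."""
    if not fault_detected:
        return Status.NORMAL
    return Status.FAILURE_SIGN if cleared_by_retry else Status.FAILURE


def classify_temperature(readings, threshold, grace):
    """Classify the first excursion above `threshold` in chronological `readings`.

    Assumption: the "certain period" is modeled as `grace` follow-up readings.
    If all of them stay above the threshold the result is FAILURE; if the
    temperature drops, or the grace window is not yet complete, it is FAILURE_SIGN.
    """
    for i, value in enumerate(readings):
        if value > threshold:
            window = readings[i + 1 : i + 1 + grace]
            if len(window) == grace and all(r > threshold for r in window):
                return Status.FAILURE
            return Status.FAILURE_SIGN
    return Status.NORMAL
```

In this sketch only the first excursion above the threshold is examined; a real monitor would presumably evaluate readings continuously.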
Next, step 1003 determines whether a hardware failure or failure sign was detected. If so, the process proceeds to step 1006, where the detection result is reported to the failure recovery unit 102.
If no hardware failure or failure sign was detected, the failure sign detection unit 310 is called in step 1004 to detect software-level failures and failure signs that cannot be detected at the hardware level. The detection result of the failure sign detection unit 310 is reported to the failure sign management unit 104 in step 1006.
In step 1007, the failure sign management unit 104 determines whether all managed physical servers 111 have been checked. If any physical server 111 remains unchecked, the processing is repeated from step 1001.
Through the above processing, steps 1002 and 1004 make it possible to detect failures and failure signs of the physical server 111 at a wide range of levels, from hardware to software.
FIG. 11 is a flowchart showing an example of the processing performed by the virtual server recovery method selection unit 105. This processing corresponds to step 904 in FIG. 9.
When a failure or failure sign is detected in a physical server 111, the virtual server recovery method selection unit 105 decides how to recover the virtual server 109. First, in step 1101, the failure or failure sign is analyzed. For a failure sign in the processor 303, for example, the analysis covers the core number of the processor 303 in which the sign was detected and the importance of the sign.
In step 1102, the virtual server recovery method selection unit 105 investigates the scope of the failure and identifies the affected virtual servers 109. The affected virtual servers 109 can be identified using the server management table 108 and the virtual server management table 107. For example, if a failure sign of the processor 303 is detected in the physical server 111-1, the virtual server management table 107 in FIG. 6 shows that the virtual servers with virtual server identifier 603 = "VM1" and "VM3" are affected by the failure sign.
In step 1103, the virtual server recovery method selection unit 105 determines whether the detected failure or failure sign is likely to cause the virtual server 109 to stop. The virtual server 109 is considered likely to stop when the physical server 111 enters a predetermined operating state, that is, a state in which it is currently running but is expected to stop, for example when the temperature of the processor 303 of the physical server 111 exceeds a predetermined value or the cooling fan of the processor 303 stops.
If the virtual server is likely to stop, the virtual server recovery method selection unit 105 checks, in step 1105, the priority of each virtual server 109 affected by the failure. The priority can be obtained by looking up the virtual server priority 702 in the virtual server definition information table 118.
If the priority of the virtual server 109 is high, that is, if stopping the virtual server 109 would have a large impact on the system or on business operations, the virtual server recovery method selection unit 105 searches, in step 1105, for free resources on other physical servers 111 and determines candidate migration destinations for the virtual server 109 that is likely to stop. If a destination physical server 111 has sufficient free resources, the virtual server recovery method selection unit 105 selects the virtual server migration method as the recovery method for the virtual server 109 in step 1112.
If no physical server 111 with sufficient free resources is found as a destination, the virtual server recovery method selection unit 105 selects the definition change method in step 1113. If the virtual server 109 is unlikely to stop, or if only low-priority virtual servers 109 are affected, the situation is reported to the administrator in step 1110 or step 1111. These reports indicate, on a display device (not shown) of the management server 101 or the like, which virtual servers 109 are affected by the failure or failure sign. This processing widens the range of failures from which the virtual server 109 can be recovered and makes failure recovery of the virtual server 109 more reliable than before.
Through the above processing, either virtual server migration or definition change is selected in step 1112 or step 1113 for each virtual server 109 affected by a failure or failure sign of the physical server 111. The selected processing is described in detail below as part of the processing of the virtual server recovery unit 106.
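The decision flow of FIG. 11 can be summarized in a short Python sketch. The function name, the boolean inputs, and the returned strings are hypothetical labels chosen for illustration; the source describes the decisions only as flowchart steps.

```python
def select_recovery_method(likely_to_stop: bool,
                           high_priority: bool,
                           destination_has_enough_free: bool) -> str:
    """Return the action chosen for one affected virtual server."""
    if not likely_to_stop:
        # The virtual server is unlikely to stop: only report to the administrator.
        return "report"
    if not high_priority:
        # Only a low-priority virtual server is affected: report as well.
        return "report"
    if destination_has_enough_free:
        # Step 1112: move the virtual server with its original allocation.
        return "migrate"
    # Step 1113: shrink the definition so the virtual server fits somewhere.
    return "change_definition"
```

For example, a high-priority virtual server on a host whose fan has stopped, with no destination offering enough free resources, yields `"change_definition"`.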
FIG. 12 is a flowchart outlining the processing performed by the virtual server recovery unit 106. This processing corresponds to step 906 in FIG. 9.
The virtual server recovery unit 106 executes the recovery process for the virtual server 109 according to the recovery method (virtual server migration or definition change) determined by the virtual server recovery method selection unit 105.
In step 1201, the virtual server recovery unit 106 determines whether the recovery method determined by the virtual server recovery method selection unit 105 is the definition change method. If it is not, that is, if sufficient free resources exist for the migration, the virtual server recovery unit 106 extracts the required resources from the definition information of the virtual server 109 in step 1202. This is done by referring to the original definition information 703 in the virtual server definition information table 118 and reading the contents of the definition file. In other words, the virtual server 109 is moved to a new physical server 111 with the same computer resource allocation that it had on the physical server 111 where the failure or failure sign occurred.
In step 1203, the virtual server recovery unit 106 searches the server management table 108 for a physical server 111 whose free resources are equal to or greater than those of the physical server 111 where the failure or failure sign occurred.
In step 1204, the virtual server recovery unit 106 allocates the computer resources given in the original definition information 703 on the physical server 111 found by the search and moves the virtual server 109 to it.
If the definition change method was selected as the recovery method for the virtual server 109, the virtual server recovery unit 106 calls the virtual server definition information management unit 103 in step 1207.
As described later, the virtual server definition information management unit 103 then reduces the computer resource allocation of the virtual server 109 to be moved and updates the virtual server definition information 309.
In step 1208, the virtual server 109 is moved using the modified virtual server definition information 309 that the virtual server definition information management unit 103 prepared for use at the destination. Moving the virtual server 109 means relocating it onto the destination physical server 111. When the virtual server 109 is moved while it remains running, resources for executing the virtual server 109 are temporarily reserved on both the source and the destination. The destination reserves the memory 301, the processor 303, and the I/O (FCA 304, NIC 305), and the memory contents and I/O state of the source virtual server 109 are then copied to the destination virtual server 109. The destination may have fewer computer resources than the source. If, for example, the allocation of the processor 303 is smaller, the processing performance of the virtual server 109 decreases, but no special handling is needed for the migration itself. If the destination physical server 111 has less memory, several methods are conceivable. One is to make it appear to the OS 302 that the destination has the same capacity as the source. This can be achieved by not copying, when the memory 301 is copied, unused areas and OS 302 or application cache data whose absence does not affect the operation of the OS 302, and by having a program such as a driver running on the OS 302 appear to have reserved the missing memory area. Note that this handling is not needed when the virtual server 109 is shut down once and then started at the destination, rather than moved while it remains running.
The virtual server definition information management unit 103 performs processing that improves reliability by moving the virtual server 109 even when no physical server 111 has sufficient free resources. Specifically, to make it possible to move, and thereby protect, the virtual server 109 wherever possible, it modifies the definition information based on the free resources, the priorities, and the actual usage of computer resources so that the migration becomes feasible.
FIG. 13 is a flowchart outlining the processing performed by the virtual server definition information management unit 103. This processing corresponds to step 1207 in FIG. 12.
In step 1301, the virtual server definition information management unit 103 looks up the actual computer resource usage of the virtual server 109 to be moved in the virtual server management table 107. It obtains this value by referring to the processor/memory actual usage 607 stored in the virtual server management table 107.
In step 1302, the virtual server definition information management unit 103 searches the virtual server management table 107, selects, for example, the physical server 111 with the largest amount of unused resources, and obtains its unused resource information.
In step 1303, the virtual server definition information management unit 103 compares the unused resource information obtained in step 1302 with the actual computer resource usage obtained in step 1301 and determines whether the unused resources exceed the actual usage. When the virtual server 109 is running on only part of the computer resources allocated to it, this amounts to selecting a physical server 111 that has the minimum resources needed to maintain the current service level, such as performance. If the unused resources exceed the actual usage, that is, if a physical server 111 holding the minimum computer resources needed to run the virtual server 109 to be moved is found, the process proceeds to step 1307, where the virtual server definition information management unit 103 copies the virtual server definition information 309.
In step 1308, the virtual server definition information management unit 103 modifies the copy of the virtual server definition information 309 made in step 1307 by applying to it the unused resource information of the physical server 111 found in step 1302. With the modified virtual server definition information 309, the virtual server 109 to be moved can run with fewer computer resources than specified in the original virtual server definition information 309.
In step 1309, the virtual server definition information management unit 103 reflects the virtual server definition information 309 modified in step 1308 in the virtual server definition information table 118. That is, it stores the copied virtual server definition information 309, modified according to the unused resource information, on the definition information storage disk 115 of the disk array device 116. It then adds the storage location (path) of the copied and modified virtual server definition information 309 to the migration definition information 704 in the virtual server definition information table 118 and stores the current date and time in the migration date and time 705. This creates a migration history for the virtual server 109 being moved.
If step 1303 finds no unused resources larger than the actual usage, the process proceeds to step 1304.
In step 1304, since no physical server 111 with the minimum required free resources exists, the virtual server definition information management unit 103 selects, for example, the physical server 111 with the most unused resources.
In step 1305, the virtual server definition information management unit 103 searches the virtual server definition information table 118 and selects a virtual server 109 whose priority (= virtual server priority 702) is lower than that of the virtual server 109 to be moved. The physical server 111 that hosts this lower-priority virtual server 109 is selected as the migration destination.
In step 1306, allocated resources are taken away from the virtual server 109 selected in step 1305 and made available to the higher-priority virtual server 109. That is, the virtual server definition information management unit 103 reduces the resources allocated to a virtual server 109 whose priority is lower than that of the virtual server 109 to be moved and adds the freed resources to the unused resources of the physical server 111. It then allocates the unused resources, increased by the amount taken from the low-priority virtual server 109, to the virtual server 109 to be moved, and the process proceeds to step 1307 described above. The amount of computer resources taken from the low-priority virtual server 109 is obtained from the virtual server management table 107 as the processor/memory allocation 605 of that virtual server 109 minus its processor/memory actual usage 607. The virtual server definition information management unit 103 updates the virtual server definition information 309 of the virtual server 109 from which resources were taken to reflect its reduced computer resources.
If the computer resources taken from a single virtual server 109 are not enough to run the virtual server 109 to be moved, the virtual server definition information management unit 103 takes computer resources from multiple virtual servers 109 whose priority is lower than that of the virtual server 109 to be moved.
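The reclamation in steps 1305 and 1306 can be sketched in Python as follows. The `VM` record, the convention that a larger `priority` number means higher priority, the use of a single scalar resource, and the order in which lower-priority virtual servers are visited (lowest priority first) are all assumptions for illustration. The reclaimed amount per virtual server follows the text: allocation minus actual usage.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    priority: int   # assumption: larger number = higher priority
    allocated: int  # currently allocated amount of one resource
    used: int       # actual usage of that resource


def reclaim(target_priority, needed, free, vms):
    """Free resources on a host until `needed` is covered or no donors remain.

    Returns (free_after, {vm_name: amount_taken}). Each donor is cut back to
    its actual usage, i.e. it gives up (allocated - used).
    """
    taken = {}
    for vm in sorted(vms, key=lambda v: v.priority):
        if free >= needed:
            break  # enough unused resources are now available
        if vm.priority >= target_priority:
            continue  # never take resources from equal or higher priority
        surplus = vm.allocated - vm.used
        if surplus <= 0:
            continue  # nothing to take from this virtual server
        vm.allocated = vm.used  # update the donor's definition
        free += surplus
        taken[vm.name] = surplus
    return free, taken
```

If the returned `free_after` is still below `needed`, the host cannot accommodate the virtual server even after reclamation; the source does not describe what happens in that case.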
The above processing makes it possible to recover, in stages, the virtual servers 109 affected by a detected failure or failure sign, depending on the state of the free resources. That is, by reducing the computer resources allocated to the virtual server 109 to be moved down to the amount it actually uses, and by allocating the unused resources of the destination physical server 111 to it, the minimum necessary computer resources are secured on the destination physical server 111 and continued operation of the virtual server 109 is ensured.
Although step 1303 above determines whether the unused resources are greater than the actual usage, it may instead determine whether the unused resources are equal to or greater than the actual usage.
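The check in step 1303 and the definition change in steps 1307 and 1308 might look like the following Python sketch. Representing a definition as a dictionary of resource amounts, the key names, and capping each resource at the smaller of the original allocation and the destination's unused amount are assumptions; the source only states that the unused resource information is applied to a copy of the definition.

```python
import copy


def shrink_definition(original, actual_usage, unused, allow_equal=False):
    """Return a reduced copy of `original` that fits `unused`, or None.

    `allow_equal=True` switches the step 1303 comparison from
    "greater than" to "equal to or greater than".
    """
    def enough(key):
        if allow_equal:
            return unused[key] >= actual_usage[key]
        return unused[key] > actual_usage[key]

    if not all(enough(key) for key in actual_usage):
        return None  # corresponds to proceeding to step 1304

    new_def = copy.deepcopy(original)  # step 1307: copy, leave the original intact
    for key in actual_usage:
        # step 1308 (assumed rule): never exceed the destination's unused amount
        new_def[key] = min(original[key], unused[key])
    return new_def
```

Keeping the original dictionary untouched mirrors the way the original definition information 703 is retained alongside the migration definition information 704.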
FIG. 14 is a flowchart outlining the processing performed by the virtual server return unit 119. This processing is executed, for example, when an administrator issues a start command from a console (not shown) of the management server 101.
In step 1401, the virtual server return unit 119 selects the virtual server 109 to be returned. The administrator may explicitly select the virtual server 109 to be returned, or a virtual server 109 that was moved to a standby physical server 111 may be selected automatically when an event is received indicating that the physical server 111 has been restored to a normal state, for example by replacement.
In step 1402, the virtual server return unit 119 determines whether the virtual server 109 selected in step 1401 is in a migrated state. This can be determined by checking whether definition information is recorded in the migration definition information 704 of the virtual server definition information table 118. If the virtual server 109 is in a migrated state, the process proceeds to step 1403; if it is in its original state, the process proceeds to step 1403.
In step 1403, the virtual server return unit 119 extracts the migration definition information 704 of the virtual server 109 selected in step 1401 from the virtual server definition information table 118.
In step 1404, when the extracted migration definition information 704 shows that the virtual server 109 has been moved multiple times, the virtual server return unit 119 selects the destination physical server 111. The virtual server return unit 119 can determine whether multiple migrations have taken place from whether the migration definition information 704 contains multiple migration history entries. Several methods are conceivable for selecting the destination physical server 111. If the physical server 111 in which the failure sign was detected has been repaired by replacing it with a new physical server 111, the replacement physical server 111 may be selected. If none of the physical servers 111 that the virtual server 109 has been moved across has been repaired, a new physical server 111 different from those may be selected. In that case, reliability is better preserved by selecting, as far as possible, a physical server 111 in which no failure or failure sign has occurred.
In step 1405, the virtual server return unit 119 obtains the definition information of the virtual server 109 to be used at the destination from the migration definition information 704 in the virtual server definition information table 118. Specifically, from the migration definition information 704 extracted in step 1403, it obtains the entry that corresponds to the physical server 111 selected as the destination. If the virtual server 109 is to be moved to a physical server 111 other than those it has previously been moved across, the original virtual server definition information may be used. The original (initially assigned) virtual server definition information 309 can be obtained by referring to the virtual server definition information 309 recorded in the original definition information 703 of that virtual server 109 in the virtual server definition information table 118.
In step 1406, the virtual server return unit 119 looks up the free resources of the destination physical server 111. In general, it is desirable to reserve computer resources according to the same definition information 309 under which the virtual server 109 originally ran. However, in an operation where virtual servers 109 are moved repeatedly, the same computer resources of the physical server 111 as before the migration may not be available, so this step is needed as a check.
In step 1407, the virtual server return unit 119 determines whether the free resources obtained in step 1406 satisfy the contents of the virtual server definition information 309 of the virtual server 109 obtained in step 1405. If they do not, the administrator is notified in step 1409 that resources are insufficient. Alternatively, the processing of the virtual server definition information management unit 103 may be executed to modify the virtual server definition information 309 of the virtual server 109 within a range that still meets the performance requirements, and the migration may then continue.
In step 1408, the virtual server 109 is moved using the virtual server definition information 309 to be used at the destination.
 Through the above processing, the virtual server 109 can be returned to the computer resources it used before the move without lowering service levels such as performance. In particular, when a failure or a sign of failure occurs in the active physical server 111, the virtual server 109 is moved to the standby physical server 111, and the active physical server 111 is later restored, referring to the original definition information 703 makes it very easy to return the virtual server 109 to the active physical server 111.
 The above embodiment describes a method of moving the virtual server 109 when a failure or a sign of failure is detected, but the present invention can also be applied with other triggers. For example, in an environment where a plurality of physical servers 111 operate, when the load on the processors 303 becomes concentrated on a specific physical server 111, the present invention may be used to move the virtual server 109 and distribute the load. That is, when the load of a physical server 111 (for example, its processor utilization) exceeds a predetermined threshold, the virtual server 109 can be moved to another physical server 111 whose load is below the predetermined threshold.
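The threshold rule for load distribution can be sketched as below. The function name `pick_destination` and the choice of the least-loaded candidate are assumptions made for illustration; the text only requires that the destination's load be below the threshold.

```python
from typing import Dict, Optional


def pick_destination(loads: Dict[str, float], source: str,
                     threshold: float) -> Optional[str]:
    """Return another physical server whose load is below the threshold, or None.

    `loads` maps a physical server name to its load, e.g. processor utilization
    in the range 0.0 to 1.0.
    """
    if loads[source] <= threshold:
        return None  # the source is not overloaded, so no move is triggered
    candidates = {name: load for name, load in loads.items()
                  if name != source and load < threshold}
    if not candidates:
        return None  # every other server is at or above the threshold
    # Assumption: prefer the least-loaded candidate.
    return min(candidates, key=candidates.get)
```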
 In addition, to reduce the power consumption of the physical servers 111, virtual servers 109 with a low load (below a predetermined threshold) can be moved to and consolidated on a specific physical server 111 using the present invention. Physical servers 111 on which no virtual server 109 is running can then be powered off, which reduces the power consumption of the whole system.
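The consolidation idea can be sketched as follows. The `consolidate` function, the single scalar `capacity` of the target server, and the lightest-first ordering are hypothetical simplifications, not details taken from the specification.

```python
from typing import Dict, List, Tuple


def consolidate(placement: Dict[str, str], vm_load: Dict[str, float],
                target: str, threshold: float, capacity: float
                ) -> Tuple[Dict[str, str], List[str]]:
    """Move low-load virtual servers onto `target`.

    Returns the new placement (virtual server -> physical server) and the
    physical servers left empty, which are candidates for power-off.
    """
    new_placement = dict(placement)
    used = sum(vm_load[vm] for vm, host in placement.items() if host == target)
    # Assumption: consider the lightest virtual servers first.
    for vm in sorted(placement, key=lambda v: vm_load[v]):
        if placement[vm] == target or vm_load[vm] >= threshold:
            continue  # already on the target, or not a low-load virtual server
        if used + vm_load[vm] > capacity:
            break  # heavier virtual servers would not fit either
        new_placement[vm] = target
        used += vm_load[vm]
    empty = sorted(set(placement.values()) - set(new_placement.values()))
    return new_placement, empty
```

With three virtual servers on three physical servers and enough capacity on the target, both low-load virtual servers move and the two vacated servers are reported as empty.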
 Thus, the present invention can be used not only to move the virtual server 109 in response to a failure or a sign of failure but also, in an environment where a plurality of physical servers 111 operate, to move the virtual server 109 between physical servers 111 when a predetermined condition on load, power, or another indicator is satisfied, for the purpose of distributing load or optimizing power consumption.
 The present invention has been described in detail above with reference to the accompanying drawings. However, the present invention is not limited to these specific configurations and includes various modifications and equivalent configurations within the spirit of the appended claims.
 As described above, the present invention can be applied to a virtual machine system in which a plurality of virtual servers run on a plurality of physical servers.

Claims (15)

  1.  A method for moving a virtual machine in a virtual machine system comprising a plurality of physical computers, each having a processor, a memory, and a virtualization unit that constructs a plurality of virtual machines, and a management computer having a processor and a memory and connected to the physical computers via a network, the method comprising:
     a virtual machine operating step in which the management computer allocates computer resources of a first physical computer among the plurality of physical computers to the virtual machine and operates the virtual machine;
     a holding step in which the management computer holds information on the computer resources allocated to the virtual machine as definition information of the virtual machine;
     a resource usage acquisition step in which the management computer acquires, as an actual usage amount, the computer resources actually used by the virtual machine on the first physical computer;
     a determination step in which the management computer determines whether a predetermined condition is satisfied;
     a selection step in which, when the predetermined condition is satisfied, the management computer selects, as a second physical computer serving as the migration destination of the virtual machine, a physical computer among the plurality of physical computers that can secure computer resources equal to or greater than the actual usage amount; and
     a moving step in which the management computer updates the definition information to the actual usage amount and moves the virtual machine to the selected second physical computer.
  2.  The method for moving a virtual machine according to claim 1,
     further comprising a step in which the management computer detects that the physical computer has a failure or a sign of failure,
     wherein, in the determination step, the management computer determines that the predetermined condition is satisfied when the failure or the sign of failure is detected.
  3.  The method for moving a virtual machine according to claim 2,
     wherein, in the virtual machine operating step, the management computer assigns a plurality of virtual machines to a plurality of physical computers and holds, in virtual machine management information, the relationship between the computer resources allocated to each virtual machine and the physical computers, and
     the selection step includes:
     a step in which the management computer searches the virtual machine management information to identify a virtual machine affected by the failure or the sign of failure; and
     a step in which the management computer selects, as the second physical computer to which the identified virtual machine is to be moved, a physical computer that can secure computer resources equal to or greater than the actual usage amount of the identified virtual machine.
  4.  The method for moving a virtual machine according to claim 1,
     wherein, in the moving step, the management computer duplicates the definition information, changes the duplicated definition information to correspond to the actual usage amount, and moves the virtual machine to the second physical computer based on the changed definition information.
  5.  The method for moving a virtual machine according to claim 1,
     wherein, in the virtual machine operating step, the management computer assigns a plurality of virtual machines to a plurality of physical computers and holds, in virtual machine management information, the relationship between the computer resources allocated to each virtual machine and the physical computers, together with a priority set for each virtual machine, and
     the selection step includes:
     a step in which, when no physical computer can secure computer resources equal to or greater than the actual usage amount, the management computer searches the virtual machine management information and selects, as the second physical computer of the migration destination, a physical computer running a virtual machine whose priority is lower than that of the virtual machine; and
     a step in which the management computer takes away, from the virtual machine having the lower priority, computer resources corresponding to the actual usage amount.
  6.  The method for moving a virtual machine according to claim 5,
     further comprising a step in which the management computer detects that the physical computer has a failure or a sign of failure,
     wherein, in the determination step, the predetermined condition is determined to be satisfied when the failure or the sign of failure is detected, and
     when the predetermined condition is determined to be satisfied, the selection step includes:
     a step in which, when the operating state of the physical computer running the virtual machine has become a predetermined state and no physical computer can secure computer resources equal to or greater than the actual usage amount, the management computer searches the virtual machine management information and selects, as the second physical computer of the migration destination, a physical computer running a virtual machine whose priority is lower than that of the virtual machine; and
     a step of taking away, from the virtual machine having the lower priority, computer resources corresponding to the actual usage amount.
  7.  The method for moving a virtual machine according to claim 1,
     further comprising a return step in which the management computer returns each virtual machine that has been moved to the second physical computer to the first physical computer,
     wherein, in the virtual machine operating step, the management computer holds information on the computer resources first allocated to the virtual machine as original definition information of the virtual machine,
     in the moving step, the management computer duplicates the definition information, changes the duplicated definition information to correspond to the actual usage amount, moves the virtual machine to the second physical computer based on the changed definition information, and holds the changed definition information as a movement history, and
     in the return step, the management computer moves the virtual machine to the first physical computer using the original definition information of the virtual machine.
  8.  A virtual machine system comprising:
     a plurality of physical computers, each having a processor, a memory, and a virtualization unit that constructs a plurality of virtual machines; and
     a management computer having a processor and a memory and connected to the physical computers via a network,
     wherein the management computer comprises:
     virtual machine definition information that holds, as definition information of the virtual machine, information on the computer resources allocated to the virtual machine, in order to allocate computer resources of a first physical computer among the plurality of physical computers to the virtual machine and operate the virtual machine;
     virtual machine management information that holds the relationship between an actual usage amount, obtained by acquiring the computer resources actually used by the virtual machine on the first physical computer, and the computer resources allocated to the virtual machine; and
     a recovery unit that, when a predetermined condition is satisfied, selects, as a second physical computer serving as the migration destination of the virtual machine, a physical computer among the plurality of physical computers that can secure computer resources equal to or greater than the actual usage amount, updates the definition information to the actual usage amount, and moves the virtual machine to the selected second physical computer.
  9.  The virtual machine system according to claim 8,
     wherein the management computer further comprises a failure sign management unit that detects that the physical computer has a failure or a sign of failure, and
     the recovery unit determines that the predetermined condition is satisfied when the failure or the sign of failure is detected.
  10.  The virtual machine system according to claim 9,
     wherein the virtual machine management information holds the relationship between the allocated computer resources and the physical computers, in order to assign a plurality of virtual machines to a plurality of physical computers, and
     the recovery unit:
     searches the virtual machine management information to identify a virtual machine affected by the failure or the sign of failure; and
     selects, as the second physical computer to which the identified virtual machine is to be moved, a physical computer that can secure computer resources equal to or greater than the actual usage amount of the identified virtual machine.
  11.  The virtual machine system according to claim 8,
     wherein the recovery unit:
     duplicates the definition information corresponding to the virtual machine;
     changes the duplicated definition information to correspond to the actual usage amount; and
     moves the virtual machine to the second physical computer based on the changed definition information.
  12.  The virtual machine system according to claim 8,
     wherein the virtual machine management information holds the relationship between the allocated computer resources and the physical computers, together with a priority set for each virtual machine, in order to assign a plurality of virtual machines to a plurality of physical computers, and
     the recovery unit:
     when no physical computer can secure computer resources equal to or greater than the actual usage amount, searches the virtual machine management information and selects, as the second physical computer of the migration destination, a physical computer running a virtual machine whose priority is lower than that of the virtual machine; and
     takes away, from the virtual machine having the lower priority, computer resources corresponding to the actual usage amount.
  13.  The virtual machine system according to claim 12,
     wherein the management computer further comprises a failure sign management unit that detects that the physical computer has a failure or a sign of failure, and
     the recovery unit:
     determines that the predetermined condition is satisfied when the failure or the sign of failure is detected;
     when the predetermined condition is satisfied, the operating state of the physical computer running the virtual machine has become a predetermined operating state, and no physical computer can secure computer resources equal to or greater than the actual usage amount, searches the virtual machine management information and selects, as the second physical computer of the migration destination, a physical computer running a virtual machine whose priority is lower than that of the virtual machine; and
     takes away, from the virtual machine having the lower priority, computer resources corresponding to the actual usage amount.
  14.  The virtual machine system according to claim 8,
     wherein the management computer further comprises a virtual machine return unit that returns each virtual machine that has been moved to the second physical computer to the first physical computer,
     the virtual machine definition information holds information on the computer resources first allocated to the virtual machine as original definition information of the virtual machine,
     the recovery unit:
     duplicates the definition information;
     changes the duplicated definition information to correspond to the actual usage amount;
     moves the virtual machine to the second physical computer based on the changed definition information; and
     holds the changed definition information as a movement history, and
     the virtual machine return unit moves the virtual machine to the first physical computer using the original definition information of the virtual machine.
  15.  A storage medium storing a program that controls a management computer in a virtual machine system comprising a plurality of physical computers, each having a processor, a memory, and a virtualization unit that constructs a plurality of virtual machines, and the management computer, which has a processor and a memory and is connected to the physical computers via a network, the program causing the management computer to execute:
     a procedure of allocating computer resources of a first physical computer among the plurality of physical computers to the virtual machine and operating the virtual machine;
     a procedure of holding information on the computer resources allocated to the virtual machine as definition information of the virtual machine;
     a procedure of acquiring, as an actual usage amount, the computer resources actually used by the virtual machine on the first physical computer;
     a procedure of determining whether a predetermined condition is satisfied;
     a procedure of selecting, when the predetermined condition is satisfied, as a second physical computer serving as the migration destination of the virtual machine, a physical computer among the plurality of physical computers that can secure computer resources equal to or greater than the actual usage amount; and
     a procedure of updating the definition information to the actual usage amount and moving the virtual machine to the selected second physical computer.
PCT/JP2010/063273 2009-12-18 2010-08-05 Migration method for virtual machine, virtual machine system, and storage medium containing program WO2011074284A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009-288000 2009-12-18
JP2009288000A JP2011128967A (en) 2009-12-18 2009-12-18 Method for moving virtual machine, virtual machine system and program

Publications (1)

Publication Number Publication Date
WO2011074284A1 true WO2011074284A1 (en) 2011-06-23

Family

ID=44167050

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/063273 WO2011074284A1 (en) 2009-12-18 2010-08-05 Migration method for virtual machine, virtual machine system, and storage medium containing program

Country Status (2)

Country Link
JP (1) JP2011128967A (en)
WO (1) WO2011074284A1 (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5725191B2 (en) 2011-09-22 2015-05-27 富士通株式会社 Power management device, power management method, and power management program
US9183102B2 (en) * 2011-09-30 2015-11-10 Alcatel Lucent Hardware consumption architecture
WO2013099023A1 (en) * 2011-12-28 2013-07-04 富士通株式会社 Monitoring program, monitoring method and monitoring device
WO2013114829A1 (en) * 2012-02-01 2013-08-08 日本電気株式会社 Information processing system, data center, system migration method, and program
WO2013121531A1 (en) * 2012-02-15 2013-08-22 株式会社日立製作所 Virtual computer system and virtual computer fault symptom recovery method
JP5948933B2 (en) * 2012-02-17 2016-07-06 日本電気株式会社 Job continuation management apparatus, job continuation management method, and job continuation management program
JP5983746B2 (en) 2012-07-05 2016-09-06 富士通株式会社 Processing apparatus, processing system, and program
WO2014073046A1 (en) 2012-11-07 2014-05-15 富士通株式会社 Information processing device, program and virtual machine migration method
US9588798B2 (en) 2013-02-28 2017-03-07 Nec Corporation Software safe shutdown system, software safe shutdown method, and program to prevent a problem caused by a system failure
JP6191686B2 (en) * 2013-03-21 2017-09-06 富士通株式会社 Information processing apparatus, resource allocation method, and program
JP2015022385A (en) * 2013-07-17 2015-02-02 日本電気株式会社 Virtual system and method for controlling virtual system
JP6194761B2 (en) * 2013-11-07 2017-09-13 富士通株式会社 Information processing method, apparatus, and program
US10021093B2 (en) * 2014-01-10 2018-07-10 Priviti Pte Ltd System and method for communicating credentials
WO2015114791A1 (en) * 2014-01-31 2015-08-06 株式会社日立製作所 Security management device
JP6269199B2 (en) * 2014-03-13 2018-01-31 日本電気株式会社 Management server, failure recovery method, and computer program
JP6260375B2 (en) * 2014-03-17 2018-01-17 富士通株式会社 Management device, management program, and information processing system
JP6398641B2 (en) * 2014-11-19 2018-10-03 日本電気株式会社 Management device, service provision management method, and service provision management program
JP6540072B2 (en) * 2015-02-16 2019-07-10 富士通株式会社 Management device, information processing system and management program
KR101900727B1 (en) 2018-06-14 2018-09-20 김상순 Virtual server managing apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007148839A (en) * 2005-11-29 2007-06-14 Hitachi Ltd Failure recovery method
JP2008293117A (en) * 2007-05-22 2008-12-04 Hitachi Ltd Method for monitoring performance of virtual computer, and device using the method
JP2009252204A (en) * 2008-04-11 2009-10-29 Hitachi Ltd Operation management system and operation management method of computer

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPWO2013094006A1 (en) * 2011-12-19 2015-04-27 富士通株式会社 Program, information processing apparatus and method
CN102946413A (en) * 2012-10-17 2013-02-27 北京搜狐新媒体信息技术有限公司 Method and system for resource preprocessing in dispatching and deployment performing process of virtual machine
CN102946413B (en) * 2012-10-17 2015-07-08 北京搜狐新媒体信息技术有限公司 Method and system for resource preprocessing in dispatching and deployment performing process of virtual machine
JP2015108899A (en) * 2013-12-03 2015-06-11 富士通株式会社 Control program, control device, and control method
WO2015114816A1 (en) * 2014-01-31 2015-08-06 株式会社日立製作所 Management computer, and management program
US9990258B2 (en) 2014-01-31 2018-06-05 Hitachi, Ltd. Management computer and management program
US9699509B2 (en) 2014-04-22 2017-07-04 Olympus Corporation Alternate video processing on backup virtual machine due to detected abnormalities on primary virtual machine

Also Published As

Publication number Publication date
JP2011128967A (en) 2011-06-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10837315

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10837315

Country of ref document: EP

Kind code of ref document: A1