JP2008107896A - Physical resource control management system, physical resource control management method and physical resource control management program - Google Patents

Info

Publication number
JP2008107896A
Authority
JP
Japan
Prior art keywords
physical resource
physical
failure
unit
resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
JP2006287536A
Other languages
Japanese (ja)
Inventor
Shinji Kami
伸治 加美
Original Assignee
Nec Corp
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp (日本電気株式会社)
Priority to JP2006287536A
Publication of JP2008107896A
Application status: Withdrawn

Abstract


In a virtual environment, it is difficult to perform fast failover while concealing a physical resource failure from an application (process).
The system comprises a hardware space 6100, which is a set of physical resources in the system, and a software space 6500, which is a set of software programs. In the hardware space, one or more central processing units 6121 and the other physical resources are connected by a data transfer path 6131, and some or all of the physical resources have a life/death confirmation unit. The software space includes virtualization means 6550, at least one virtual resource space 6520, and a virtual device operating on the virtual resource space. The virtualization means includes resource allocation means 6551, failure management means 6552, and resource access means 6554. With this configuration, the failure management means coordinates hardware failure management and software failure management to control the resource allocation means.
[Selected Drawing] Figure 1

Description

  The present invention relates to a control management method for IT systems that use CPUs and I/O devices as physical resources and for network (NW) systems that use CPUs and line cards as physical resources, and in particular to a control management method for failure management.

  In failure management for IT and network systems, a duplex structure is often adopted to enhance fault tolerance. The duplex structure provides a standby resource in addition to the active resource, increasing redundancy so that, even if a failure occurs in the active system, operation switches to the standby resource and the service is not stopped. More generally, an N + M configuration (N: number of active systems, M: number of standby systems) can be used. Recovery from a failure is usually performed between the service and the physical resource, and the failure is concealed from the service so that the service itself need not contain any special mechanism.

  There are two types of failure management methods: software-based and hardware-based.

  Examples of software management include VMMs and software (SW) RAID, and examples of hardware management include hardware (H/W) RAID (Non-Patent Document 2).

  In recent years, virtual device configurations based on resource virtualization have become mainstream in computer environments. For example, as shown in FIG. 8, virtualization interposes a virtualization layer between a physical resource 1001 and conventional device access means 1006. The virtualization means 1003 provides a virtual resource 1002 to a plurality of virtual devices 1004, each of which includes device access means 1006, so that a plurality of different virtual devices 1004 can run on one set of resources. The virtualization means 1003 schedules access to the physical resource 1001 from the plurality of virtual devices 1004, so that each process 1005 can access the device virtually by using the device access means 1006 without any change. Through virtualization, the details of the physical resource 1001 are hidden from the virtual device 1004 (or from the guest OS installed in the virtual device 1004).

  Although there are various virtualization architectures, the XEN architecture described in Non-Patent Document 1 is shown in FIG. 9 as an example. XEN has a hypervisor 2002 as a virtualization layer for the physical resource 2001, and also has a dedicated privileged domain 2003 that holds the device drivers for device access and mediates access by the virtual devices 2004 to the physical resource 2001.

  With this concealment structure, even if a physical resource failure occurs, it is possible in principle to recover by switching the resource behind the corresponding virtual resource while concealing the failure from the guest OS. This method is described in Patent Document 1 and elsewhere.

  As shown in FIG. 10, the basic configuration of the software redundancy management system includes a physical resource 3001 belonging to the hardware space 3002, and device access means 3004, redundancy management means 3005, and a process 3006 belonging to the software space 3003. The redundancy management means 3005 can access the plurality of physical resources 3001 and the corresponding device access means 3004. For example, it presents a set of physical resources in a duplex configuration to the process 3006 as one abstract device. Because the process 3006 performs device access through the interface defined with the redundancy management means 3005 regardless of the actual state of the physical resource 3001, the process can be designed as if it accesses a single device even when the device is actually duplicated. If one of the duplicated physical resources fails, the redundancy management means 3005 can conceal the failure from the process 3006 by switching the setting to the other device access means and physical resource.
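
  As an illustration only, the role of the redundancy management means 3005 can be sketched in Python as follows. The class and method names (PhysicalResource, RedundancyManager, read) are hypothetical and do not appear in the cited configuration; the sketch assumes that a failed resource signals its failure by raising an error on access.

```python
class PhysicalResource:
    """Hypothetical stand-in for a physical resource 3001 and its access means 3004."""

    def __init__(self, name):
        self.name = name
        self.failed = False

    def read(self):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        return f"data from {self.name}"


class RedundancyManager:
    """Presents a duplexed pair to the process as one abstract device."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def read(self):
        try:
            return self.active.read()
        except IOError:
            # Software failover: swap roles, then retry on the former standby.
            self.active, self.standby = self.standby, self.active
            return self.active.read()


primary = PhysicalResource("resource-A")
backup = PhysicalResource("resource-B")
device = RedundancyManager(primary, backup)

first = device.read()      # served by resource-A
primary.failed = True      # failure of the active resource
second = device.read()     # the process still sees a single working device
```

  In this sketch the failure is noticed only when an access fails, which corresponds to detection and switching both being carried out in software.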

  The above software processing is highly flexible because it does not require dedicated hardware. However, because everything from failure detection to switching is performed by software processes, it has the disadvantage that the processing time is long.

  Another example is a hardware RAID system.

  FIG. 11 shows a schematic diagram of a RAID system as an example of a hardware redundancy management method. This system includes a disk device 4005 and a software program 4004. The disk device 4005 has physical disks 4001, such as hard disks, and a memory 4007. The software program 4004 has a process 4006 and device access means 4008. Data 4011 stored on one physical disk is always mirrored (copied) as data 4012 on another physical disk, and the details are hidden from the user (the software program 4004, such as the process 4006 and the device access means 4008). The user accesses only the single data 4013 in the memory 4007. As a result, even if the physical disk holding the data 4011 fails, the data 4012 on the other physical disk is supplied to the process, so the process is not affected by the failure. Redundancy management is performed by the controller 4002, which is dedicated hardware, and the software program 4004, such as the device access means 4008 and the process 4006, need not be aware of it. This is concealment of redundancy management at the hardware level using dedicated hardware.
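
  The mirroring behavior can be illustrated with the following Python sketch. It is a software model of what the controller 4002 is described as doing in dedicated hardware; the class MirroredDisk and its members are hypothetical names introduced only for this illustration.

```python
class MirroredDisk:
    """Software model of a two-disk mirror (hypothetical; the source uses a hardware controller)."""

    def __init__(self):
        self.disks = [{}, {}]        # two physical disks, modeled as block -> data maps
        self.alive = [True, True]

    def write(self, block, value):
        # Every write goes to all surviving disks, so the copies stay identical.
        for index, disk in enumerate(self.disks):
            if self.alive[index]:
                disk[block] = value

    def read(self, block):
        # Serve the read from the first surviving disk.
        for index, disk in enumerate(self.disks):
            if self.alive[index]:
                return disk[block]
        raise IOError("all mirrors have failed")


array = MirroredDisk()
array.write(0, b"payload")
array.alive[0] = False           # the disk holding the first copy fails
recovered = array.read(0)        # the caller is served from the mirror copy
```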

  Further, as a method for speeding up failure recovery, hardware units in a redundant configuration can autonomously monitor each other's life and death. A master (active) and a slave (standby) are designated and monitor each other; for example, when a failure occurs in the master and the slave detects it, the slave sets itself to operate as the master. Because failures are detected and switched by hardware, failure recovery can be performed at high speed. For example, as shown in FIG. 12, for a working path 5003 and a standby path 5004 in a network, data is copied at the branch point 5001 in the normal state and is always transferred through both paths, and the branch point 5002 receives and forwards the data from the working path. The life and death of the working path and the standby path are constantly confirmed by hardware, for example by periodically sending test signals. If a failure is observed on the working path, the branch point 5002 is switched by hardware so that it forwards the data from the standby path. This method enables fast failure recovery.
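
  The master/slave monitoring described above can be modeled, purely for illustration, by the following Python sketch using simulated time. The class KeepAlivePeer, the interval value, and the rule that silence longer than one interval means failure are assumptions made for this sketch and are not taken from the cited method.

```python
class KeepAlivePeer:
    """Hypothetical model of a standby unit watching its partner's keep-alive signal."""

    def __init__(self, role, interval):
        self.role = role              # "master" or "slave"
        self.interval = interval      # keep-alive transmission interval (seconds)
        self.last_heard = 0.0

    def receive_keepalive(self, now):
        self.last_heard = now

    def check(self, now):
        """Promote a slave to master when the partner has been silent for more than one interval."""
        if self.role == "slave" and now - self.last_heard > self.interval:
            self.role = "master"
            return True
        return False


slave = KeepAlivePeer(role="slave", interval=1.0)
events = []
for t in (1.0, 2.0, 3.0):
    slave.receive_keepalive(t)        # the master is alive and sends its signal
    events.append(slave.check(t))

# The master fails after t = 3.0 and sends nothing further.
events.append(slave.check(3.5))       # still within one interval
events.append(slave.check(4.5))       # silence exceeds one interval: take over
```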

Although the above hardware processing recovers quickly, it has drawbacks: dedicated hardware is required, and management, such as setting the redundant configuration, is inflexible.
Patent Document 1: US Patent Application Publication No. 2005/0246718
Non-Patent Document 1: Paul Barham et al., "Xen and the Art of Virtualization", Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, pp. 164-177, Bolton Landing, NY, USA, 2003
Non-Patent Document 2: David A. Patterson et al., "A Case for Redundant Arrays of Inexpensive Disks (RAID)", Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data, pp. 109-116

  However, with the above configurations, it is difficult in a virtual environment to perform high-speed failover while concealing a physical resource failure from an application (process) without using a dedicated hardware controller.

  The reason is that failover in current virtual environments relies on software failure management, so recovery takes time. In addition, hardware fast recovery methods and software recovery methods cannot be linked. Consequently, unless the failure is concealed from software by a dedicated hardware controller, it has been difficult to apply an autonomous hardware fast recovery method, such as the keep-alive method, to a virtual environment as is while maintaining its speed.

(Object of invention)
An object of the present invention is to provide a system that performs flexible, high-speed failover in which hardware methods and software methods cooperate, while concealing physical failures from the virtual device, the guest OS installed in the virtual device, and processes in a virtual environment, without using an expensive dedicated hardware controller.

  Another object of the present invention is to provide a system in which, even in a complicated management environment where hardware with a high-speed failure recovery function such as the keep-alive method is mixed with other ordinary hardware, the optimal redundancy can be set independently for each virtual device so as to satisfy a management policy such as the priority of each virtual device.

The physical resource control management system of the present invention includes a plurality of physical resources, a virtual device in which at least one software program operates, and a computer component having software that functions as virtualization means enabling the virtual device to share the plurality of physical resources.
The virtualization means includes resource allocation means for allocating the plurality of physical resources to the virtual device, and failure management means which, when a failure occurs in one or more of the plurality of physical resources and switching control to another physical resource is executed by hardware, controls, in cooperation with that switching control, switching to the other physical resource by software in the resource allocation means.

The physical resource control management method of the present invention is a method for a physical resource control management system that includes a plurality of physical resources, a virtual device in which at least one software program operates, and a computer component having software that functions as virtualization means enabling the virtual device to share the plurality of physical resources, the virtualization means including resource allocation means for allocating the plurality of physical resources to the virtual device. The method comprises:
a step of switching to another physical resource by hardware when a failure occurs in one or more of the plurality of physical resources; and
a step of switching to the other physical resource by software in the resource allocation means in cooperation with that switching control.

The physical resource control management program according to the present invention causes a computer to function as a virtual device in which at least one software program operates and as virtualization means enabling the virtual device to share a plurality of physical resources.
The program causes the virtualization means to function as resource allocation means for allocating the plurality of physical resources to the virtual device, and as failure management means which, when a failure occurs in one or more of the plurality of physical resources and switching control to another physical resource is executed by hardware, controls, in cooperation with that switching control, switching to the other physical resource by software in the resource allocation means.

  According to the present invention, high-speed failover can be performed in a virtual environment while concealing physical resource failures from applications and processes, without using a dedicated hardware controller. The reason is that, within the concealment structure of the virtual environment, the hardware failure recovery method and the software failure recovery method cooperate without impairing the recovery speed.

  A typical embodiment of the present invention includes a hardware space, which is a set of physical resources in a system, and a software space, which is a set of software programs that operate on the hardware space. In the hardware space, at least one central processing unit and the other physical resources are connected by a data transfer path, and some or all of the physical resources have a life/death confirmation unit. The software space includes virtualization means, at least one virtual resource space, and a virtual device that operates on the virtual resource space. The virtualization means includes resource allocation means, failure management means, and resource access means. With this configuration, the failure management means can coordinate failure management by hardware and failure management by software to control the resource allocation means.

  Embodiments of the present invention will be described below with reference to the drawings.

[First Embodiment]
First, a first embodiment of the present invention will be described in detail with reference to the drawings.

  Referring to FIG. 1, the first embodiment of the present invention includes a hardware space 6100 and a software space 6500. The hardware space 6100 includes at least a physical resource 6101 and a physical resource 6102, typified by I/O devices, and a central processing unit (central processing means) 6121, typified by a CPU, which are connected by a data transfer path (transfer means) 6131. The data transfer path is a system bus such as a PCI bus, but is not limited to this. The physical resource 6101 is provided with a life/death confirmation unit (life/death confirmation means) 6111, and the physical resource 6102 is provided with a life/death confirmation unit 6112.

  The software space 6500 includes a virtualization unit 6550, at least one virtual resource space 6520, and a virtual device 6510 operating on it. The virtualization unit 6550 includes a resource allocation unit 6551, a failure management unit 6552, a resource access unit 6553, and a resource access unit 6554. The virtual device 6510 has resource access means 6511. The virtualization unit 6550, the virtual resource space 6520, and the virtual device 6510 are programs and data stored in a semiconductor memory such as a DRAM or in a hard disk device, and are processed by the central processing unit, typified by a CPU, in the hardware space 6100.

  Next, an outline of the operation of these means will be described.

  In the hardware space 6100, the physical resource 6101 and the physical resource 6102, typified by I/O devices, and the central processing unit 6121, typified by a CPU, are connected by the data transfer path 6131, typified by a PCI bus. The physical resource 6101, set as the active system, and the physical resource 6102, set as the standby system, confirm each other's life and death by exchanging the life/death confirmation signal 6113, typified by a keep-alive signal, between the life/death confirmation unit 6111 and the life/death confirmation unit 6112. The physical resource 6101 and the physical resource 6102 constitute a redundant pair. When the life/death confirmation signal 6113 is interrupted and a failure of the partner physical resource is detected, a signal notifying the failure is transmitted to the central processing unit 6121 through the data transfer path 6131. On receiving the signal, the central processing unit 6121 interrupts its current processing and transmits a failure occurrence signal to the failure management means 6552 in the software space 6500.

  The software space 6500 consists of software programs that operate on the hardware space (physical resource space) 6100. The virtualization unit 6550 belonging to the software space 6500 provides the virtual device 6510 with a virtual resource space 6520 containing a virtual resource 6521 obtained by virtualizing the hardware space 6100. The virtual device 6510 operates on the virtual resource space 6520 as if it were operating directly on a hardware space. The virtual device 6510 has resource access means 6511, typified by a device driver, which provides resource access to the various software processes operating in the virtual device 6510. The resource access means 6511 presents the virtual resource space 6520 to those software processes and, in response to an access request for a resource in the virtual resource space 6520, actually transfers the processing to the resource allocation unit 6551 of the virtualization unit 6550.

  In general, the virtualization means 6550 can host a plurality of virtual devices 6510 in the same system. It provides each of them with a virtual resource space 6520 and mediates their physical resource accesses, which allows the plurality of virtual devices 6510 to share the hardware space 6100.

  For this purpose, the virtualization unit 6550 has the resource access unit 6553 and the resource access unit 6554, which directly access the physical resource 6101 and the physical resource 6102, respectively. The resource allocation unit 6551 controls the connection between these resource access units and the resource access unit 6511 in the virtual device 6510, whereby the hardware space (physical resource space) 6100 is shared.

  The resource allocation unit 6551 controls physical resource access from the resource access units 6511 of the plurality of virtual devices 6510 by accepting, according to settings specified in advance, connections only to the resource access units for the physical resources that each virtual device 6510 is permitted to access. The resource allocation unit 6551 also sets the redundant configuration. For example, suppose that, in order to raise the availability level of the virtual resource 6521, the physical resource 6101 and the physical resource 6102 are actually used as a duplexed pair. With the physical resource 6101 as the active system and the physical resource 6102 as the standby system, the resource allocation unit 6551 normally connects the resource access unit 6511 of the virtual device 6510 to the resource access unit 6553. When a failure occurs in the physical resource 6101, the failure can be recovered by switching the connection from the resource access unit 6553 to the resource access unit 6554. This is the software failure recovery method described above: the failure is concealed from the virtual device 6510, and the virtual device 6510 does not need any settings of its own for the redundant configuration.
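
  A minimal Python sketch of this connection control follows. The class ResourceAllocator, its method names, and the string identifiers are hypothetical and serve only to illustrate the idea of keeping a connection map per virtual resource; the embodiment does not prescribe a data structure.

```python
class ResourceAllocator:
    """Hypothetical model of the resource allocation unit 6551."""

    def __init__(self):
        self.redundancy = {}   # virtual resource -> (active access unit, standby access unit)
        self.connection = {}   # virtual resource -> access unit currently connected

    def configure(self, virtual_resource, active, standby):
        # Redundant configuration set in advance; the virtual device is not involved.
        self.redundancy[virtual_resource] = (active, standby)
        self.connection[virtual_resource] = active

    def route(self, virtual_resource):
        # Requests from the virtual device's access unit 6511 are forwarded here.
        return self.connection[virtual_resource]

    def switch_to_standby(self, virtual_resource):
        active, standby = self.redundancy[virtual_resource]
        self.connection[virtual_resource] = standby
        self.redundancy[virtual_resource] = (standby, active)


allocator = ResourceAllocator()
allocator.configure("virtual-6521", active="access-6553", standby="access-6554")

before = allocator.route("virtual-6521")
allocator.switch_to_standby("virtual-6521")   # failure of physical resource 6101
after = allocator.route("virtual-6521")
```

  The virtual device keeps addressing the same virtual resource before and after the switch, which is what conceals the failure from it.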

  Further, the virtualization unit 6550 has a failure management unit 6552. The failure management unit 6552 has a function of looking up, in a table registered in advance, the process to be performed in response to a signal from the central processing unit 6121, typified by an interrupt, and executing it. For example, on an interrupt raised when a failure occurs, it issues a command to change the connection setting described above.

  Next, the operation from failure occurrence to failure recovery in the embodiment of the present invention will be described with reference to the flowchart shown in FIG.

(Step S101)
Assume that the physical resource 6101 is set as the active resource and the physical resource 6102 is set as the standby resource. The two physical resources regularly confirm each other's life/death state by exchanging the life/death confirmation signal 6113 between the life/death confirmation units 6111 and 6112. Further, the virtual resource 6521 of the virtual device 6510 is set to 1 + 1 redundancy, and during normal operation the resource allocation unit 6551 connects the resource access unit 6553, which accesses the active physical resource 6101, to the resource access unit 6511 of the virtual device 6510.

(Step S102)
A failure represented by a power failure or the like occurs in the physical resource 6101 set in the active system, and the service cannot be continued.

(Step S103)
The life and death confirmation unit 6111 can no longer exchange the life and death confirmation signal 6113 with the life and death confirmation unit 6112 of the physical resource 6102. The life and death confirmation unit 6112 therefore detects the absence of a response within as little as one signal transmission interval, confirms the failure of the physical resource 6101, and changes the state setting of the physical resource 6102 to the active system.

(Step S104)
The physical resource 6102 transmits a failure detection notification signal represented by an interrupt signal and a state change notification signal from the standby system to the active system to the central processing unit 6121.

(Step S105)
When the central processing unit 6121 receives the signal from the physical resource 6102, it stops its current processing and notifies the failure management unit 6552 with an interrupt signal.

(Step S106)
The fault management unit 6552 searches for a corresponding process from a previously registered table for the interrupt signal.

(Step S107)
The failure management unit 6552 transmits a connection change command to the resource allocation unit 6551 according to the retrieved process, and the resource allocation unit 6551 switches the connection destination from the resource access unit 6553 to the resource access unit 6554 according to the command. This synchronizes the software with the operating state of the hardware space, in which the active system has been changed to the physical resource 6102.

(Step S108)
The resource access unit 6511 can access the physical resource 6102 and can continue the service.
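
The sequence of steps S101 to S108 can be traced with the following Python sketch. It is a simplified model under stated assumptions: the dictionaries, the signal name "failure-6101", and the function names are hypothetical, and function calls stand in for the hardware signal and the CPU interrupt.

```python
# S101: 6101 is active, 6102 is standby, and the virtual device's access unit 6511
# is connected to access unit 6553 (which reaches physical resource 6101).
hardware_state = {"6101": "active", "6102": "standby"}
connection = {"6511": "6553"}
access_unit_to_resource = {"6553": "6101", "6554": "6102"}
trace = []


def switch_connection():
    # S107: the resource allocation unit switches the connection destination.
    connection["6511"] = "6554"
    trace.append("S107")


# Table registered in advance in the failure management unit (hypothetical signal name).
process_table = {"failure-6101": switch_connection}


def failure_management(signal):
    trace.append("S106")              # S106: look up the registered process
    process_table[signal]()


def cpu_interrupt(signal):
    trace.append("S105")              # S105: the CPU suspends its work and notifies software
    failure_management(signal)


def keepalive_lost_on_standby():
    hardware_state["6101"] = "failed"     # S102: the active resource fails
    hardware_state["6102"] = "active"     # S103: the standby promotes itself in hardware
    trace.append("S103")
    trace.append("S104")                  # S104: the standby notifies the CPU
    cpu_interrupt("failure-6101")


keepalive_lost_on_standby()
serving = access_unit_to_resource[connection["6511"]]   # S108: service continues on 6102
```

After the run, the software connection points at the same resource that the hardware has made active, which is the synchronization that step S107 describes.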

  Next, the effect of this embodiment will be described.

  In the conventional software failure recovery method, when a failure occurs in a physical resource, the resource access means of the virtual device receives no response from the physical resource 6101 and detects the failure through processing such as a timeout. The connection is then changed from the active system to the standby system, the operating states of other software and hardware are changed as necessary, and the failure recovery processing completes. For this reason, the processing time from failure occurrence to recovery is generally long. As an alternative means of shortening the processing time, the resource access means of the virtualization means could constantly send life/death confirmation signals, but the CPU load increases as the failure detection time is shortened. In contrast, according to the present embodiment, life/death confirmation is performed in hardware to maximize speed, and an interrupt from the hardware lets the resource allocation unit 6551 synchronize the software side without significant delay. Therefore, it is possible to recover from a failure quickly while concealing it from the virtual device, without using a special hardware failure concealment structure.
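
  The trade-off between detection time and CPU load in software polling can be illustrated with the following Python sketch. The model is a deliberate simplification introduced here, not taken from the embodiment: it assumes the worst-case detection delay equals the polling interval multiplied by the number of missed responses required, and it uses polls per second as a stand-in for CPU load. The interval values are arbitrary examples.

```python
def polling_tradeoff(interval_s, misses_to_declare=1):
    """Simplified model: (worst-case detection delay in seconds, polls per second)."""
    worst_case_delay_s = interval_s * misses_to_declare
    polls_per_second = 1.0 / interval_s
    return worst_case_delay_s, polls_per_second


slow = polling_tradeoff(1.0)     # poll once per second
fast = polling_tradeoff(0.01)    # poll every 10 ms: faster detection, 100 times the polls
```

  Under this model, shortening the detection delay by a given factor raises the polling rate by the same factor, which is the load increase that the hardware life/death confirmation avoids.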

[Second Embodiment]
Next, a second embodiment of the present invention will be described in detail with reference to FIG.

  The second embodiment of the present invention includes a hardware space 7100 and a software space 7500.

  The hardware space 7100 includes at least an I/O device 7101, an I/O device 7102, and a CPU 7103, which are connected to each other via a data transfer path 7150. The I/O devices 7101 and 7102 confirm each other's life and death through the keep-alive signal 7201 exchanged by the life and death confirmation units 7111 and 7112. When the I/O devices 7101 and 7102 are network cards, a life and death confirmation unit for each path, which uses a life and death confirmation signal exchanged with an upper node on the network, may be mounted separately, in addition to the confirmation between the network end points performed by the life and death confirmation units 7111 and 7112 between the I/O devices 7101 and 7102.

  The software space 7500 includes virtualization means 7580 typified by an operating system and one or more virtual devices 7530. The virtualization unit 7580 includes a device driver 7501 that is an access unit to the I/O device 7101, a device driver 7502 that is an access unit to the I/O device 7102, a failure management unit 7550, a resource allocation unit 7503, and a back-end device driver 7504. Further, the failure management unit 7550 includes an interface unit 7551 that provides an access unit to the administrator 7900, a process management unit 7554, and an information storage unit 7553. The virtual device 7530 includes a process 7532 and a front-end device driver 7531.

  Next, the operation of these means will be described. The I/O devices 7101 and 7102 are input/output devices represented by NIC (network interface card) and disk, but are not necessarily limited to these. In general, a user process uses these I/O devices for purposes such as communication with an external system and data access to a disk. The CPU 7103 is a central processing unit typified by Pentium (registered trademark) and Xeon, and the data transfer path 7150 is a system bus typified by a PCI bus, but is not necessarily limited thereto. The I/O devices 7101 and 7102 form a redundant pair, and one is set as the active system and the other is set as the standby system (here, the I/O device 7101 is set as the active system). The life and death confirmation units 7111 and 7112 mutually confirm the life and death by communicating keep-alive signals 7201 at regular intervals.

  If a failure occurs in the I/O device 7101 and the life and death confirmation unit 7112 detects an abnormality in the communication of the keep-alive signal 7201, it judges that the I/O device 7101 has failed. The I/O device 7102 then changes its own operating state from the standby system to the active system and transmits a failure notification interrupt signal 7202 to the CPU 7103. When the CPU 7103 detects the interrupt signal 7202, it temporarily stops its current processing and transmits the interrupt signal 7203 to the failure management unit 7550.

  Device drivers 7501 and 7502 are software programs that provide access means to the I/O devices 7101 and 7102, respectively, and are written specifically for each device. The front-end device driver 7531 in the virtual device 7530 presents the virtual resource 7301 to the process 7532 and provides an interface similar to the one the process 7532 would use to access an ordinary physical resource. Note that the process 7532 may access the front-end device driver 7531 directly, or another process such as a kernel process may intervene. The front-end device driver 7531 receives an access request to the virtual resource 7301 and transfers it to the back-end device driver 7504 associated with it in advance. This association is realized by transferring data through access to a shared memory, as in XEN described in Non-Patent Document 1, but is not necessarily limited to this.
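
  The forwarding between the front-end and back-end device drivers can be sketched in Python as follows. This is a simplified model, not XEN's actual interface: the classes SharedRing, FrontEndDriver, and BackEndDriver are hypothetical, and two in-process queues stand in for the shared memory.

```python
from collections import deque


class SharedRing:
    """Hypothetical stand-in for the shared memory between the two driver halves."""

    def __init__(self):
        self.requests = deque()
        self.responses = deque()


class BackEndDriver:
    """Model of the back-end device driver 7504 in the virtualization means."""

    def __init__(self, ring, device_driver):
        self.ring = ring
        self.device_driver = device_driver   # callable that reaches the physical device

    def service(self):
        # Drain pending requests and hand each one to the real device driver.
        while self.ring.requests:
            request = self.ring.requests.popleft()
            self.ring.responses.append(self.device_driver(request))


class FrontEndDriver:
    """Model of the front-end device driver 7531 in the virtual device."""

    def __init__(self, ring):
        self.ring = ring

    def submit(self, request):
        self.ring.requests.append(request)

    def poll(self):
        return self.ring.responses.popleft() if self.ring.responses else None


ring = SharedRing()
front = FrontEndDriver(ring)
back = BackEndDriver(ring, device_driver=lambda req: f"7101 handled {req}")

front.submit("read block 7")
back.service()
reply = front.poll()
```

  Because the front end only ever sees the ring, the device driver behind the back end can be replaced without the virtual device noticing.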

  The back-end device driver 7504 is associated with the front-end device driver 7531 and with the device drivers 7501 and 7502, and mediates between the front-end device driver and the device drivers. A software-based failure concealment structure is formed by interposing the resource allocation means 7503 between the back-end device driver 7504 and the device drivers 7501 and 7502. During normal operation, the back-end device driver 7504 is connected to the device driver 7501, which is the access means to the active I/O device.

  The failure management means 7550 makes the hardware failure recovery method and the software failure recovery method operate in synchronization when a failure occurs in a physical resource. For this purpose, it is provided with an interface unit 7551 to an administrator 7900, an information storage unit 7553, and a process management unit 7554. Through the interface unit 7551, the administrator 7900 sets in advance the redundant configuration of the virtual resource 7301 and the processing to be performed when a failure occurs. For example, as the redundant configuration, the I/O device 7101 is set as the active system and the I/O device 7102 as the standby system, giving 1 + 1 redundancy. As the processing, a process is registered that, when a failure notification interrupt signal from the I/O device 7102 is detected, switches the connection in the resource allocation unit 7503 between the back-end device driver 7504 and the device driver 7501 to a connection between the back-end device driver 7504 and the device driver 7502. Each interrupt signal is managed by an ID, for example a value unique to each failure or to each device that transmits the interrupt signal. The processing information is managed, for example, as a table keyed by interrupt signal ID and is stored in the information storage unit 7553, typified by a memory or a disk. In response to an interrupt signal from the CPU 7103, the process management unit 7554 uses the interrupt signal ID to search the table in the information storage unit 7553 for the corresponding process and executes it (in this case, it issues a connection switching command to the resource allocation unit 7503). It is also possible to register a process for the case where no corresponding process is found, such as notifying the administrator 7900 of an error message through the interface unit 7551.
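
  The table-driven processing described above can be sketched in Python as follows. The class FailureManager, the interrupt ID strings, and the message text are hypothetical; the embodiment only states that processes are stored per interrupt signal ID and that an error notification can be registered for the case where nothing matches.

```python
class FailureManager:
    """Hypothetical model of the failure management means 7550."""

    def __init__(self):
        self.process_table = {}     # information storage unit 7553: interrupt ID -> process
        self.admin_messages = []    # messages delivered to the administrator via 7551

    def register(self, interrupt_id, process):
        """Registration made in advance by the administrator through the interface unit."""
        self.process_table[interrupt_id] = process

    def handle_interrupt(self, interrupt_id):
        """Process management unit 7554: look up the process by ID and execute it."""
        process = self.process_table.get(interrupt_id)
        if process is None:
            self.admin_messages.append(
                f"no process registered for interrupt {interrupt_id}"
            )
            return
        process()


# Connection held by the resource allocation unit 7503: back end 7504 -> device driver.
backend_connection = {"7504": "7501"}


def switch_to_7502():
    backend_connection["7504"] = "7502"


manager = FailureManager()
manager.register("irq-from-7102", switch_to_7502)   # hypothetical interrupt ID

manager.handle_interrupt("irq-from-7102")   # registered: connection is switched
manager.handle_interrupt("irq-unknown")     # not registered: administrator is notified
```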

  When the I/O device 7101, on which the life/death confirmation unit 7111 is mounted, and the I/O device 7102, on which the life/death confirmation unit 7112 is mounted, are, for example, network cards, a service interruption caused by the failure of a node on the network can be determined by monitoring values such as the hardware counters of the I/O devices 7101 and 7102 and detecting an abnormal change in a counter value. Similar failure switching is then possible by having the master notify the slave through the keep-alive signal.
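
  One possible reading of "an abnormal change in a counter value" is sketched below in Python. The embodiment does not define what counts as abnormal; this sketch assumes, for illustration only, that a received-packet counter which fails to advance over several consecutive samples indicates a failure. The function name, the threshold, and the sample values are hypothetical.

```python
def counter_stalled(samples, stall_limit=3):
    """Return True when the counter has not advanced for stall_limit consecutive samples."""
    stalled = 0
    for previous, current in zip(samples, samples[1:]):
        stalled = stalled + 1 if current == previous else 0
        if stalled >= stall_limit:
            return True
    return False


# Periodic readings of a hypothetical received-packet counter.
healthy = counter_stalled([100, 180, 250, 250, 330, 400])   # one quiet sample only
broken = counter_stalled([100, 180, 250, 250, 250, 250])    # traffic has stopped
```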

  In addition, when a failure on the network is detected by a life/death confirmation signal exchanged between the network card and a node on the network, the interrupt signal 7203 is issued in the same way, so high-speed switching is also possible in that case.

  Hereinafter, the operation from failure occurrence to failure recovery for the I/O device 7101 will be described in detail with reference to the flowchart shown in FIG. Here, as described above, it is assumed that the administrator 7900 has already performed the necessary settings and process registration.

(Step S201)
When a failure occurs in the I/O device 7101, the device driver 7501, the back-end device driver 7504, the front-end device driver 7531, and the process 7532 can no longer access the I/O device 7101, and the service stops. The failure also causes a communication abnormality in the keep-alive signal 7201.

(Step S202)
The life/death confirmation unit 7112 detects the communication abnormality in the keep-alive signal, whereby the I/O device 7102 detects the failure of the I/O device 7101 and changes its operation state from standby to active.

(Step S203)
The I/O device 7102 transmits an interrupt signal 7202 to the CPU 7103.

(Step S204)
The CPU 7103 transmits an interrupt signal 7203 to the failure management unit 7550.

(Step S205)
The process management unit 7554 in the failure management unit 7550 accesses the information storage unit 7553 using the ID (identification) of the interrupt signal and retrieves the corresponding process from the process table.

(Step S206)
The process management unit 7554 issues a command to execute the retrieved process (here, a command (trigger signal) to the resource allocation unit 7503 to change the connection of the back-end device driver 7504 to the device driver 7502).

(Step S207)
In accordance with this command, the resource allocation unit 7503 changes the connection of the back-end device driver 7504 from the device driver 7501 to the device driver 7502.

(Step S208)
When this connection is established, the back-end device driver 7504, the front-end device driver 7531, and the process 7532 can access the device driver 7502 and the I/O device 7102, which is already operating as the active system, and the service is restored.
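Steps S201 to S208 can be walked through in a small simulation. It is only a sketch under stated assumptions: the keep-alive miss limit of three, the interrupt ID string, and the class and function names are hypothetical, and real hardware would perform S202 and S203 without software involvement.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class IODevice:
    name: str
    state: str        # "active", "standby" or "failed"
    missed: int = 0   # consecutive keep-alive periods without a reply

    def check_peer(self, peer_replied: bool, limit: int = 3) -> Optional[str]:
        """S202-S203: after `limit` missed keep-alives, a standby device
        promotes itself and returns the ID of the interrupt it raises."""
        if peer_replied:
            self.missed = 0
            return None
        self.missed += 1
        if self.state == "standby" and self.missed >= limit:
            self.state = "active"
            return f"failure-notice-from-{self.name}"
        return None


def run_failover() -> Tuple[Dict[str, str], str, List[str]]:
    steps: List[str] = []
    active = IODevice("7101", "active")
    standby = IODevice("7102", "standby")
    connection = {"backend_7504": "driver_7501"}
    # Registered in advance: interrupt ID -> (back-end driver, new device driver).
    table = {"failure-notice-from-7102": ("backend_7504", "driver_7502")}

    active.state = "failed"                   # S201: failure, keep-alive stops
    steps.append("S201")
    interrupt: Optional[str] = None
    while interrupt is None:                  # S202: misses accumulate on the standby
        interrupt = standby.check_peer(peer_replied=(active.state != "failed"))
    steps += ["S202", "S203", "S204"]         # interrupt reaches the failure manager
    backend, new_driver = table[interrupt]    # S205: table lookup by interrupt ID
    steps += ["S205", "S206"]
    connection[backend] = new_driver          # S207: connection switched
    steps += ["S207", "S208"]                 # S208: service restored
    return connection, standby.state, steps
```

Calling `run_failover()` returns the final connection map, the state of the former standby device, and the ordered list of steps that were passed through.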

  Next, the effects of this embodiment will be described. The process 7532 is connected to the device drivers 7501 and 7502 and the I/O devices 7101 and 7102 through the connection between the front-end device driver 7531 and the back-end device driver 7504, so switching in the resource allocation unit 7503 completely hides failures in the device drivers and I/O devices from the process. This is the advantage of the software failure recovery method. In addition, failure detection and operation state switching use the high-speed hardware method. Because the process management unit 7554 makes the hardware method and the software method operate in cooperation and in synchronization, high-speed failure recovery is possible without requiring hardware that has any special concealment structure other than the life/death confirmation unit.

  Furthermore, in addition to the control described above, which uses availability as the metric, resource allocation control linking hardware and software can be realized in the same way by using, as the metric, statistical values for items that can be monitored in hardware. Examples of such statistical values include bandwidth measured from counter values, delay measured with a test signal, and reliability measured with a bit error test, although they are not necessarily limited to these. As a function realized by this cooperation, in the case of bandwidth measurement, for example, dynamic load balancing between the pair forming a redundant configuration can be performed in response to changing conditions.
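As an illustration of bandwidth-driven load balancing between a redundant pair, the sketch below splits traffic in proportion to the measured bandwidth of each device. The proportional rule, the function name, and the sample figures are assumptions; the specification does not prescribe a particular balancing policy.

```python
from typing import Dict


def split_load(measured_bandwidth: Dict[str, float]) -> Dict[str, float]:
    """Return the share of traffic to send through each device,
    proportional to its currently measured bandwidth."""
    total = sum(measured_bandwidth.values())
    if total <= 0:
        raise ValueError("no usable bandwidth measured")
    return {device: bandwidth / total for device, bandwidth in measured_bandwidth.items()}


# Hypothetical measurements in Mbit/s for the pair 7101 / 7102.
shares = split_load({"7101": 600.0, "7102": 200.0})
```

Re-running `split_load` whenever new counter-based measurements arrive gives the dynamic behavior described in the text.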

[Third Embodiment]
Next, a third embodiment of the present invention will be described in detail with reference to FIG.

  The third embodiment of the present invention includes a hardware space 8100 and a software space 8500.

The hardware space 8100 includes at least an I/O device 8101, an I/O device 8102, an I/O device 8103, and a CPU 8104, which are connected to one another by the communication unit 8150. The I/O devices 8102 and 8103 confirm each other's life and death through a keep-alive signal exchanged by their life/death confirmation units, as in the second embodiment. The I/O device 8101 is a normal I/O device that has no life/death confirmation unit. The I/O devices 8101, 8102, and 8103 together constitute the physical resources; the I/O device 8101, lacking a life/death confirmation unit, differs from the others in performance.
The software space 8500 includes virtualization means 8580, typified by an operating system, and at least two virtual devices 8530 and 8540. The virtualization means 8580 includes device drivers 8501 to 8503, which are access units to the I/O devices 8101 to 8103, a resource allocation unit 8506, back-end device drivers 8504 and 8505, and a failure management unit 8550. The failure management unit 8550 further includes an interface unit 8551 that provides access for the administrator 8900, a setting management unit 8552, an information storage unit 8553, a process management unit 8554, and a physical resource management unit 8555. The virtual devices 8530 and 8540 have processes 8532 and 8542 and front-end device drivers 8531 and 8541, respectively.

  The administrator 8900 can input configuration setting information 8701 and management policy information 8702 as management information to the failure management unit 8550.

  Next, the operation of these means will be described. As in the second embodiment, the I/O devices 8101 to 8103 are input/output devices typified by a NIC (network interface card) or a disk, and the CPU 8104 is a central processing unit typified by Pentium (registered trademark) or Xeon, although they are not necessarily limited to these.

  The I/O devices 8102 and 8103 form a redundant pair, in which one is set as the active system and the other as the standby system (here, the I/O device 8102 is the active system). As in the second embodiment, the I/O devices 8102 and 8103 confirm each other's life and death by exchanging keep-alive signals at regular intervals through their life/death confirmation units. The I/O device 8101 is a normal hardware resource that has no such life/death confirmation unit.

  For this reason, if a failure occurs in the I/O device 8101, the failure is detected when a process or device driver observes a response abnormality such as a timeout, and it is recovered only by the conventional software failure recovery method, which is generally time consuming. On the other hand, a failure in the I/O device 8102 or 8103 is detected by the life/death confirmation unit, so high-speed detection and operation state change can be performed by the same processing as described in the second embodiment.

  The device drivers 8501 to 8503 are software programs that provide access means to the I/O devices 8101 to 8103, respectively, and are written specifically for each device. As in the second embodiment, the front-end device drivers 8531 and 8541 in the virtual devices 8530 and 8540 present the virtual resources 8301 and 8302 to the processes 8532 and 8542, and provide an interface similar to the one the processes 8532 and 8542 would use to access normal physical resources. The front-end device drivers 8531 and 8541 are associated with the back-end device drivers 8504 and 8505, respectively, and transfer access requests for the virtual resources 8301 and 8302 to them.

  The back-end device drivers 8504 and 8505 are connected to the device drivers 8501 to 8503 under the control of the resource allocation unit 8506, forming a fault concealment structure by a software method. When a redundant configuration is adopted, each is connected to the device driver designated as the active system in that configuration.

  The failure management unit 8550 includes a setting management unit 8552 and a physical resource management unit 8555 in addition to the components of the second embodiment.

  The physical resource management unit 8555 manages information on the physical resources currently in the system and configures the physical resources as necessary. The physical resource information includes, for example, the physical resource ID (physical resource identification information), type, and performance, as well as information related to failure recovery performance such as the presence or absence of a life/death confirmation unit. For example, it records that the I/O device 8101 is a normal device, while the I/O devices 8102 and 8103 each have a life/death confirmation unit and can form a redundant configuration by a hardware method. Furthermore, when it is decided to form a redundant pair from the I/O devices 8102 and 8103, the physical resource management unit 8555 sets up life/death confirmation signal communication, typified by a keep-alive signal, between the two and sets their operation states as active system and standby system.
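The kind of record the physical resource management unit keeps, and the pairing operation it performs, might look like the following. The field names, the `performance` values, and the rule that pairing requires a life/death confirmation unit on both devices are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class PhysicalResource:
    resource_id: str
    kind: str                        # for example "nic" or "disk"
    performance: float               # assumed unit: Mbit/s
    has_alive_check: bool            # life/death confirmation unit present?
    role: Optional[str] = None       # "active" or "standby" once paired
    peer: Optional[str] = None       # ID of the keep-alive partner


class PhysicalResourceManager:
    """Hypothetical stand-in for the physical resource management unit 8555."""

    def __init__(self, resources: Iterable[PhysicalResource]) -> None:
        self.resources: Dict[str, PhysicalResource] = {r.resource_id: r for r in resources}

    def form_redundant_pair(self, active_id: str, standby_id: str) -> None:
        active = self.resources[active_id]
        standby = self.resources[standby_id]
        if not (active.has_alive_check and standby.has_alive_check):
            raise ValueError("both devices need a life/death confirmation unit")
        # Set up keep-alive partners and the operation states.
        active.role, active.peer = "active", standby_id
        standby.role, standby.peer = "standby", active_id


resource_manager = PhysicalResourceManager([
    PhysicalResource("8101", "nic", 1000.0, has_alive_check=False),
    PhysicalResource("8102", "nic", 1000.0, has_alive_check=True),
    PhysicalResource("8103", "nic", 1000.0, has_alive_check=True),
])
resource_manager.form_redundant_pair("8102", "8103")
```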

  The setting management unit 8552 holds management information determined for each virtual device in the system, such as the virtual resource configuration assigned to the virtual device, its priority, and its availability level, for example the redundant configuration and the failure recovery speed. The resource allocation configuration of a virtual device is acquired and stored when the virtual device is generated, and the availability level and the priority are determined by the configuration setting information 8701 and the management policy information 8702, respectively. The configuration setting information 8701 and the management policy information 8702 are input by the administrator 8900 through the interface unit 8551.

  The configuration setting information 8701 includes availability level information such as a redundant configuration of a virtual resource possessed by a virtual device in the system and a failure recovery speed. The management policy information 8702 has priority information describing the priority of the virtual device in the system.

  The setting management unit 8552 calculates the possible combinations of resource allocation from the configuration setting information 8701, the management policy information 8702, the virtual resource configuration information allocated to the virtual devices, and the physical resource information held by the physical resource management unit 8555. It searches for the optimum configuration, issues a command to the resource allocation unit 8506 to reflect the setting, and holds the setting information. At the same time, based on the determined redundant configuration, it stores the processing to be performed on failure for each interrupt signal ID in the table of the information storage unit 8553.

  The redundant configuration determination and setting processing from the input of the configuration setting information 8701 and the management policy information 8702 will be described in detail below with reference to the flowchart shown in FIG.

(Step S301)
The configuration setting information 8701 and the management policy information 8702 are input.

(Step S302)
The setting management unit 8552 acquires physical resource information from the physical resource management unit 8555. It then calculates, for the virtual resources of the current virtual devices, the physical resource allocation configurations that achieve the availability level described in the input configuration setting information 8701, and compares them with the acquired physical resource information.

(Step S303)
If a physical resource allocation that satisfies the requested availability level is possible, the process proceeds to step S304. If not, an error message is output to the administrator 8900 through the interface unit 8551 and the process ends.

(Step S304)
The priority information of the virtual devices is acquired from the management policy information 8702. The physical resource allocation combinations that satisfy the requested availability level are sorted according to priority, and the physical resource allocation that gives the highest availability level to the virtual device with the highest priority is determined.

(Step S305)
A command to reflect the determined combination is issued to the resource allocation unit 8506, the resource allocation is fixed, and the setting is stored.

(Step S306)
For the resource allocation configuration that has been set, the process to be performed on a failure in each physical resource is created and registered in the table of the information storage unit 8553, so that the process management unit 8554 can look it up when it receives a failure notification interrupt signal.
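Step S306 can be pictured as deriving a failure-handling table from the chosen allocation. The sketch below is an assumption-laden illustration: keying the table by the failed device, the `failure:` prefix, and the data layout are hypothetical, since the specification only says that a process is registered per interrupt signal ID.

```python
from typing import Dict, List, Tuple

# back-end driver -> (active device, standby device); layout assumed for this sketch
Allocation = Dict[str, Tuple[str, str]]


def build_failure_table(allocation: Allocation) -> Dict[str, List[Tuple[str, str]]]:
    """Return interrupt ID -> list of (back-end driver, device to switch to)."""
    table: Dict[str, List[Tuple[str, str]]] = {}
    for backend, (active, standby) in allocation.items():
        interrupt_id = f"failure:{active}"  # hypothetical ID scheme
        table.setdefault(interrupt_id, []).append((backend, standby))
    return table


failure_table = build_failure_table({
    "backend_8505": ("8102", "8103"),
    "backend_8504": ("8101", "8103"),
})
```

Storing a list per ID allows several back-end drivers to be switched by one interrupt if they share the same active device.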

  For example, in the example shown in FIG. 5, assume that the virtual device 8540 has a higher priority than the virtual device 8530 and that both virtual devices use a 1 + 1 redundant configuration (the standby system may be shared). When a failure occurs in the I/O device 8101, which has no hardware life/death confirmation unit, failure recovery generally relies on software processing such as timeouts, so the processing time is long and the availability level is low. On the other hand, when a failure occurs in the I/O device 8102, the failure can be recovered at high speed as described in the second embodiment, so the availability level is high.

  Therefore, the I/O device 8102 is set as the active system and the I/O device 8103 as the standby system for the virtual resource 8302 of the virtual device 8540, and the I/O device 8101 is set as the active system and the I/O device 8103 as the standby system for the virtual resource 8301 of the virtual device 8530. A combination in which the I/O device 8103 is the active system and the I/O device 8102 is the standby system achieves the same availability level.
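The selection in this example can be reproduced with a small exhaustive search. The sketch makes several simplifying assumptions that are not in the specification: every virtual device shares a single standby device, a pair is rated "high" (2) only when both members have a life/death confirmation unit, and candidates are compared in priority order. The names and priority numbers are hypothetical.

```python
from itertools import permutations
from typing import Dict, Optional, Tuple

DEVICES = {"8101": False, "8102": True, "8103": True}  # has life/death confirmation unit?
PRIORITY = {"8540": 2, "8530": 1}                      # larger number = higher priority


def pair_level(active: str, standby: str, devices: Dict[str, bool]) -> int:
    """2 = fast hardware-assisted recovery, 1 = software (timeout) recovery only."""
    return 2 if devices[active] and devices[standby] else 1


def choose_allocation(devices: Dict[str, bool],
                      priority: Dict[str, int]) -> Dict[str, Tuple[str, str]]:
    vms = sorted(priority, key=priority.get, reverse=True)  # highest priority first
    best: Optional[Dict[str, Tuple[str, str]]] = None
    best_score: Optional[Tuple[int, ...]] = None
    for actives in permutations(sorted(devices), len(vms)):
        spares = [d for d in sorted(devices) if d not in actives]
        for standby in spares:  # one standby shared by every virtual device
            alloc = {vm: (active, standby) for vm, active in zip(vms, actives)}
            # Tuples compare element by element, so higher-priority devices win first.
            score = tuple(pair_level(a, s, devices) for a, s in alloc.values())
            if best_score is None or score > best_score:
                best, best_score = alloc, score
    if best is None:
        raise ValueError("no allocation satisfies a 1 + 1 configuration")
    return best


chosen = choose_allocation(DEVICES, PRIORITY)
```

With these inputs the search settles on the I/O device 8102 as the active system for the virtual device 8540, the I/O device 8101 as the active system for the virtual device 8530, and the I/O device 8103 as the shared standby. The equally rated variant with 8103 active and 8102 standby is found later in the search and is not preferred because ties keep the first result.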

Next, the effects of this embodiment will be described. In a system whose hardware space (physical resource space) contains a plurality of physical resources with different failure recovery performance and availability levels, this embodiment retains the effects of the second embodiment while making it possible to automatically select the optimum failure recovery configuration according to the priority of each virtual device.
In addition, in a complex management environment that mixes hardware with a high-speed failure recovery function, such as a keep-alive method, and hardware without one, it provides the management flexibility to autonomously set the optimum redundancy that satisfies the management policy. The reason is that the setting information, management policy, resource information, and so on are managed together, the optimum setting for the current state is selected automatically, and configuration setting and failure recovery are performed accordingly.

[Fourth Embodiment]
Next, a fourth embodiment of the present invention will be described in detail with reference to FIG.
The fourth embodiment of the present invention relates to a method for generating a hardware space according to the second and third embodiments, and cooperation between the method and the software space.

  Referring to FIG. 7, the fourth embodiment includes chassis 9101 and 9201 that accommodate physical resources, a switch 9400, a hardware space 9000, and a software space 9500. The hardware space 9000 and the software space 9500 are configured by, for example, a personal computer. The software space 9500 consists of programs and data stored in a semiconductor memory such as a DRAM or in a hard disk device, and its processing is executed by a central processing unit typified by the CPU of the hardware space 6100.

  The chassis 9101 houses an I/O device 9111, an I/O device 9112, a CPU 9113, and a power source 9121 that supplies power to each device. The chassis 9201 houses an I/O device 9211, an I/O device 9212, a CPU 9213, and a power source 9221 that supplies power to each device. The switch 9400 connects the devices in the chassis 9101 and 9201 and the hardware space 9000 to one another. The hardware space 9000 includes an I/O device 9001, an I/O device 9002, an I/O device 9003, and a CPU 9004, which are resources logically partitioned and grouped by the switch 9400. The software space 9500 includes physical resource management means 9501 and at least one virtual device 9502.

  Here, the number of chassis and the number and types of physical resources such as I / O devices accommodated in each chassis are not limited to the configuration shown in FIG.

  The physical resources selected to be grouped by the switch are also an example, and the present invention is not limited to this configuration.

  Grouping here means a process in which the logical division function of the switch, for example a function typified by the VLAN function of an Ethernet (registered trademark) switch, makes the physical resources in the grouped hardware space 9000 able to communicate with one another while connections between different groups are basically separated.
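The effect of grouping can be expressed as a membership test: two resources can talk only if some group contains both. In the sketch below, the first group follows the FIG. 7 configuration described in this embodiment, while the second group and all names are assumptions added for illustration.

```python
from typing import Dict, Set


def can_communicate(groups: Dict[str, Set[str]], a: str, b: str) -> bool:
    """True when resources `a` and `b` are members of the same group."""
    return any(a in members and b in members for members in groups.values())


groups = {
    # Resources grouped into the hardware space 9000 in the FIG. 7 configuration.
    "hardware_space_9000": {"cpu_9113", "io_9111", "io_9112", "io_9211"},
    # Assumed second group holding the resources left over in chassis 9201.
    "unassigned": {"io_9212", "cpu_9213"},
}
```

The test works across chassis boundaries: `io_9111` and `io_9211` sit in different chassis but share a group, so they can communicate.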

  The switch 9400 is a network device typified by an Ethernet (registered trademark) switch, but the protocol and the physical configuration thereof are not limited thereto.

  The software space 9500 includes failure concealment means and failure management means by a software method based on virtualization, represented by the software space described in the third embodiment.

  A physical resource represented by an I / O device may or may not have a failure recovery function by a hardware method such as a life / death confirmation unit.

The physical resource management unit 9501 manages information such as the performance, availability, physical location, and grouping configuration of the physical resources belonging to the hardware space 9000.
Next, the operation of these means will be described. As shown in FIG. 7, the physical resources belonging to the hardware space 9000 provided to the software space 9500 may be selected and grouped from either or both of the chassis 9101 and the chassis 9201, depending on the setting.

In the configuration illustrated in FIG. 7, the CPU 9004, the I/O device 9001, and the I/O device 9002 are connected to the CPU 9113, the I/O device 9111, and the I/O device 9112 belonging to the chassis 9101, respectively, and each functions as one device. The I/O device 9003 is connected to the I/O device 9211 belonging to the chassis 9201 and functions as one I/O device. Here, as in the third embodiment, a 1 + 1 configuration of I/O devices (one active I/O device and one standby I/O device) is set as the redundant configuration of the virtual resources of two virtual devices having different priorities that belong to the software space 9500.
Here, assume that the characteristics such as performance and availability of all I/O devices, chassis, and power supplies are equal, and omit options that produce an equivalent effect. The I/O device 9001 and the I/O device 9002 are then completely equivalent, so there is no need to consider the case where the two are swapped. For the active system there is a choice between an I/O device of the chassis 9101 and an I/O device of the chassis 9201, but since these are assumed to be completely equivalent, it is enough to consider only one of them from the viewpoint of availability.

Therefore, in the hardware space 9000 shown in FIG.
(Option 1) Active system: I/O device 9001, standby system: I/O device 9002
(Option 2) Active system: I/O device 9001, standby system: I/O device 9003
There are thus two possible options. They differ in whether the redundant configuration is formed within the same chassis or across chassis.

  From the viewpoint of power failure, in (Option 1) both devices are connected to the I/O device 9111 and the I/O device 9112 in the same chassis, which are driven by the same power source, and therefore share the risk of failure. In (Option 2) the devices are driven by different power sources, so the availability is higher. Therefore, as described in the third embodiment, when resource allocation is determined from the redundant configuration and priority of the virtual resources of the virtual devices designated by the administrator, the physical resource information held by the physical resource management unit 9501 is referred to and the difference in availability due to the difference in grouping described above is taken into account, so that a configuration with higher availability is assigned to a virtual device with higher priority.
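The comparison between the two options can be made concrete with a simple probability model. Everything numeric here is an assumption: the sketch supposes independent failures with probability `p_power` per power source and `p_dev` per I/O device, and it treats the pair as lost only when both members are down. The mapping of devices to power sources follows the FIG. 7 configuration.

```python
from typing import Dict, List, Tuple

# Power source behind each grouped device, following the FIG. 7 configuration.
POWER_SOURCE = {"9001": "9121", "9002": "9121", "9003": "9221"}
OPTIONS = {"option 1": ("9001", "9002"), "option 2": ("9001", "9003")}


def pair_outage_probability(p_power: float, p_dev: float, shared_power: bool) -> float:
    """Probability that active and standby are both unavailable (assumed model)."""
    if shared_power:
        # One power failure takes out both; otherwise both devices must fail.
        return p_power + (1 - p_power) * p_dev ** 2
    single = p_power + (1 - p_power) * p_dev  # one device down for either reason
    return single ** 2


def rank_options(options: Dict[str, Tuple[str, str]], power_source: Dict[str, str],
                 p_power: float, p_dev: float) -> List[str]:
    """Return option names ordered from most to least available."""
    scored = []
    for name, (active, standby) in options.items():
        shared = power_source[active] == power_source[standby]
        scored.append((pair_outage_probability(p_power, p_dev, shared), name))
    return [name for _, name in sorted(scored)]


ranking = rank_options(OPTIONS, POWER_SOURCE, p_power=0.01, p_dev=0.02)
```

Under these assumed probabilities the cross-chassis pair ranks first, so it would be the one assigned to the higher-priority virtual device.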

  This example considers availability only. Other factors, such as performance typified by differences in data transfer speed due to physical location and network performance, and other constraints, have been left out of consideration. However, by adding such constraints to the management policy and the configuration setting information and including them when selecting the optimum setting, other settings are also possible; the present invention is not limited to this example.

  Next, effects of the fourth embodiment will be described. According to the fourth embodiment, while the effects of the second and third embodiments are retained, it is possible to automatically select the configuration that best suits the administrator's intention, as expressed by the management policy and the configuration setting information, in consideration of further information such as physical resource performance, typified by physical location information and shared risk information, and availability information, typified by the presence or absence of a life/death confirmation unit.

Even when the failure risk of the physical resources is not uniform, an optimum configuration can be adopted. This is because the optimum configuration search process can take the failure risk of each physical resource into account.
While typical embodiments of the present invention have been described above, various modifications and substitutions can be made to these embodiments without departing from the spirit and scope of the present invention as defined by the claims of the present application.

  The present invention is applicable to systems including a plurality of physical resources and a computer component having software for sharing the plurality of physical resources, for example, IT systems that use CPUs and I/O devices as physical resources and network (NW) systems that use CPUs and line cards as physical resources.

A diagram showing the best mode of the present invention.
A diagram of the operation flow in the best mode of the present invention.
A diagram of a configuration in which the hardware method and the software method cooperate to perform high-speed failure recovery when a failure occurs.
An operation flowchart for the case where the hardware method and the software method cooperate to perform high-speed failure recovery when a failure occurs.
A diagram of a configuration that selects the optimum resources and sets a redundant configuration for a plurality of virtual devices having different priorities.
An operation flowchart for selecting the optimum resources and setting a redundant configuration for a plurality of virtual devices having different priorities.
A block diagram of an example of the present invention in which non-uniform failure risk of the physical resources is reflected in the setting.
A diagram of virtualization by a VMM.
A diagram of the virtualization architecture in XEN.
A diagram of a fault concealment method by a software method.
A diagram of a fault concealment method using hardware by RAID.
A diagram of high-speed switching at the time of failure by a life/death confirmation device.

Explanation of symbols

6100 Hardware space
6101, 6102 Physical resource
6111, 6112 Alive check unit
6113 Alive check signal
6121 Central processing unit
6131 Data transfer path (bus)
6500 Software space
6510 Virtual device
6511 Resource access means
6520 Virtual resource space
6521 Virtual resource
6550 Virtualization means
6551 Resource assignment means
6552 Failure management means
6553, 6554 Resource access means
7100 Hardware space
7500 Software space
7101, 7102 I/O device
7103 CPU
7111, 7112 Life confirmation unit
7150 Data transfer path
7201 Keep-alive signal
7202 Failure notification interrupt signal
7203 Interrupt signal
7301 Virtual resource
7501, 7502 Device driver
7503 Resource allocation control means
7504 Back-end device driver
7530 Virtual device
7531 Front-end device driver
7532 Process
7550 Fault management means
7551 Interface means
7553 Information storage means
7554 Processing management means
7580 Virtualization means
7900 Administrator
8100 Hardware space
8150 Communication means
8101, 8102, 8103 I/O device
8104 CPU
8500 Software space
8501, 8502, 8503 Device driver
8504, 8505 Back-end device driver
8506 Resource allocation means
8531, 8541 Front-end device driver
8532, 8542 Process
8580 Virtualization means
8530, 8540 Virtual device
8550 Fault management means
8551 Interface means
8552 Setting management unit
8553 Information storage unit
8554 Processing management unit
8555 Physical resource management unit
8701 Configuration setting information
8702 Management policy information
8301, 8302 Virtual resource
8900 Administrator
9000 Hardware space
9001, 9002, 9003 I/O device
9004 CPU
9101, 9201 Chassis
9111, 9112 I/O device
9113 CPU
9211, 9212 I/O device
9213 CPU
9121, 9221 Power supply
9400 Switch
9301 Redundant pair
9302 Redundant pair
9500 Software space
9501 Physical resource management means
9502 Virtual device

Claims (14)

  1. Multiple physical resources,
    A virtual machine in which at least one software program operates, and a computer component that includes software functioning as a virtualization unit that enables the virtual apparatus to share the plurality of physical resources, and
    The virtualization means includes a resource allocation means for allocating the plurality of physical resources to the virtual device, and one or more of the plurality of physical resources have failed, and the hardware is allocated to another physical resource. Physical resource control, comprising: fault management means for controlling switching to the other physical resource by software in the resource allocation means in cooperation with the switching control when switching control is executed Management system.
  2. In the physical resource control management system according to claim 1, two or more or all of the plurality of physical resources have a life and death confirmation unit,
    The physical resource having the alive confirmation unit performs a state change by a preset operation when a failure of another physical resource is detected by the alive confirmation unit, and notifies the failure management means of the physical resource Resource control management system.
  3.   2. The physical resource control management system according to claim 1, wherein the failure management means includes an information storage means for storing information on an operation to be performed when a failure occurs, and an opportunity signal for switching to another physical resource. A physical resource control management system comprising processing management means for controlling the resource allocation means on the basis of information in the information storage means and performing failure recovery processing.
  4.   4. The physical resource control management system according to claim 3, wherein the information storage means includes a list table that associates an operation to be performed when a failure occurs with identification information determined from the trigger signal. Resource control management system.
  5. In the physical resource control management system according to claim 3, two or more or all of the plurality of physical resources have a life and death confirmation unit,
    The physical resource having the life and death confirmation unit performs a state change by a preset operation when a failure of another physical resource is detected by the life and death confirmation unit, and the trigger signal is notified by the physical resource that has performed the state change. A physical resource control management system, characterized in that the signal is a generated signal.
  6.   6. The physical resource control management system according to claim 1, wherein at least one physical resource of the plurality of physical resources is more reliable, bandwidth, and delay of physical resources than other physical resources. A physical resource control management system characterized in that any one of the performance, physical resource type, existence / non-existence confirmation function, and failure risk is different.
  7.   2. The physical resource control management system according to claim 1, wherein the failure management means includes performance including physical resource reliability, bandwidth, and delay, physical resource identification information, physical resource type, and alive confirmation in the system. A physical resource control management system comprising physical resource management means for managing physical resource information of at least one of presence / absence of function and failure risk.
  8.   8. The physical resource control management system according to claim 7, wherein the failure management unit includes physical resource information acquired from the physical resource management unit, and a redundant configuration of resources set for each virtual device input from an administrator. A setting for performing physical resource allocation calculation for the resource of the virtual device and performing setting control of the resource allocation means from configuration setting information including at least one information regarding availability level and a management policy regarding priority information for the virtual device A physical resource control management system comprising management means.
  9.   9. The physical resource control management system according to claim 8, wherein the physical resource management means has a failure probability as failure risk information, and the setting management means takes into account the failure probability and A physical resource control management system characterized by performing setting control.
  10.   2. The physical resource control management system according to claim 1, wherein the failure management unit performs setting control of the resource allocation unit based on a statistical value measured by a hardware monitoring unit mounted on the physical resource. A physical resource control management system.
  11.   11. The physical resource control management system according to claim 10, wherein the statistical value is at least one dynamic performance measurement value of bandwidth, delay, and bit error rate.
  12.   8. The physical resource control management system according to claim 7, wherein each of the plurality of physical resources is connected to a physical resource grouped through a network, and the physical resource management means manages a grouped configuration.
  13. A physical resource control management method for a physical resource control management system comprising a plurality of physical resources and a computer component having software that functions as a virtual device on which at least one software program operates and as virtualization means that enables the virtual devices to share the plurality of physical resources,
    the virtualization means including resource allocation means for allocating the plurality of physical resources to the virtual device, the method comprising:
    a step of performing, by hardware, switching control to another physical resource when a failure occurs in one or more of the plurality of physical resources; and
    a step of switching, by software in the resource allocation means, to the other physical resource in cooperation with the switching control.
  14. A physical resource control management program for causing a computer to function as a virtual device on which at least one software program operates and as virtualization means that enables the virtual device to share a plurality of physical resources,
    wherein the virtualization means includes resource allocation means for allocating the plurality of physical resources to the virtual device, and the program causes the virtualization means to function as failure management means that, when a failure has occurred in one or more of the plurality of physical resources and switching control to another physical resource is executed by hardware, controls switching to the other physical resource by software in the resource allocation means in cooperation with the switching control.
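
As an illustration only, the allocation and cooperative switching recited in claims 8, 9, 13 and 14 might be sketched in Python as follows. Nothing in this sketch is taken from the specification: the class names, the `allocate` and `on_hardware_switch` functions, the convention that a larger `priority` value marks a more important virtual device, and the exclusive (non-shared) assignment of each physical resource are all assumptions made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalResource:
    name: str
    failure_probability: float  # failure risk information (claim 9); values are hypothetical
    alive: bool = True


@dataclass
class VirtualDevice:
    name: str
    priority: int    # assumed convention: larger value = higher priority
    redundancy: int  # number of physical resources requested by the administrator
    assigned: list = field(default_factory=list)


def allocate(devices, resources):
    """Hypothetical allocation calculation: give the lowest-risk live
    resources to the highest-priority virtual devices first.
    Returns the resources left over as spares."""
    free = sorted((r for r in resources if r.alive),
                  key=lambda r: r.failure_probability)
    for dev in sorted(devices, key=lambda d: -d.priority):
        dev.assigned = free[:dev.redundancy]
        free = free[dev.redundancy:]
    return free


def on_hardware_switch(devices, failed, replacement):
    """Software-side switch performed in cooperation with a hardware switch:
    once hardware has moved from `failed` to `replacement`, update every
    virtual device mapping so the software view matches the hardware view."""
    failed.alive = False
    for dev in devices:
        dev.assigned = [replacement if r is failed else r for r in dev.assigned]
```

In this sketch, `allocate` would be called when the configuration is set, and `on_hardware_switch` would be called by whatever component learns that the hardware has already switched over, so that the virtual devices keep running against the replacement resource without seeing the failure.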

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2006287536A JP2008107896A (en) 2006-10-23 2006-10-23 Physical resource control management system, physical resource control management method and physical resource control management program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2006287536A JP2008107896A (en) 2006-10-23 2006-10-23 Physical resource control management system, physical resource control management method and physical resource control management program

Publications (1)

Publication Number Publication Date
JP2008107896A true JP2008107896A (en) 2008-05-08

Family

ID=39441225

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2006287536A Withdrawn JP2008107896A (en) 2006-10-23 2006-10-23 Physical resource control management system, physical resource control management method and physical resource control management program

Country Status (1)

Country Link
JP (1) JP2008107896A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9535632B2 (en) 2008-06-06 2017-01-03 Pivot3, Inc. Method and system for distributed raid implementation
US9465560B2 (en) 2008-06-06 2016-10-11 Pivot3, Inc. Method and system for data migration in a distributed RAID implementation
US9146695B2 (en) 2008-06-06 2015-09-29 Pivot3, Inc. Method and system for distributed RAID implementation
US8621147B2 (en) 2008-06-06 2013-12-31 Pivot3, Inc. Method and system for distributed RAID implementation
JP2010009396A (en) * 2008-06-27 2010-01-14 Toshiba Corp Computer system, and device control method for the same
JP2011527047A (en) * 2008-06-30 2011-10-20 ピボット3 Method and system for execution of applications associated with distributed RAID
US9086821B2 (en) 2008-06-30 2015-07-21 Pivot3, Inc. Method and system for execution of applications in conjunction with raid
JP2010066931A (en) * 2008-09-09 2010-03-25 Fujitsu Ltd Information processor having load balancing function
US8799895B2 (en) 2008-12-22 2014-08-05 Electronics And Telecommunications Research Institute Virtualization-based resource management apparatus and method and computing system for virtualization-based resource management
KR101070431B1 (en) 2008-12-22 2011-10-06 한국전자통신연구원 Physical System on the basis of Virtualization and Resource Management Method thereof
JP2012514803A (en) * 2009-01-07 2012-06-28 Hewlett-Packard Company Network connection manager
US8364825B2 (en) 2009-01-07 2013-01-29 Hewlett-Packard Development Company, L.P. Network connection manager
JP2012531676A (en) * 2009-06-26 2012-12-10 VMware, Inc. Virtual mobile device
JP2011254303A (en) * 2010-06-02 2011-12-15 Nippon Telegr & Teleph Corp <Ntt> Network communication system and network communication method
JP5335150B2 (en) * 2010-11-22 2013-11-06 三菱電機株式会社 Computer apparatus and program
WO2012070102A1 (en) * 2010-11-22 2012-05-31 三菱電機株式会社 Computing device and program
US8527699B2 (en) 2011-04-25 2013-09-03 Pivot3, Inc. Method and system for distributed RAID implementation
CN103699428A (en) * 2013-12-20 2014-04-02 华为技术有限公司 Method and computer device for affinity binding of interrupts of virtual network interface card
JP2016148973A (en) * 2015-02-12 2016-08-18 日本電信電話株式会社 Life-and-death monitoring device, life-and-death monitoring system, life-and-death monitoring method, and life-and-death monitoring method program

Similar Documents

Publication Publication Date Title
US7447939B1 (en) Systems and methods for performing quiescence in a storage virtualization environment
US7653833B1 (en) Terminating a non-clustered workload in response to a failure of a system with a clustered workload
US8893147B2 (en) Providing a virtualized replication and high availability environment including a replication and high availability engine
EP1686473B1 (en) Computer system, computer, storage system, and control terminal
US7287186B2 (en) Shared nothing virtual cluster
JP5282046B2 (en) Computer system and enabling method thereof
JP4544146B2 (en) Disaster recovery method
US8055933B2 (en) Dynamic updating of failover policies for increased application availability
US20090138752A1 (en) Systems and methods of high availability cluster environment failover protection
US8037344B2 (en) Method and apparatus for managing virtual ports on storage systems
US7058731B2 (en) Failover and data migration using data replication
US8498967B1 (en) Two-node high availability cluster storage solution using an intelligent initiator to avoid split brain syndrome
US7062674B2 (en) Multiple computer system and method for assigning logical computers on the same system
JP5026305B2 (en) Storage and server provisioning for visualization and geographically distributed data centers
US8909884B2 (en) Migrating virtual machines across sites
CN1554055B (en) High-availability cluster virtual server system
US6609213B1 (en) Cluster-based system and method of recovery from server failures
US6598174B1 (en) Method and apparatus for storage unit replacement in non-redundant array
US6578158B1 (en) Method and apparatus for providing a raid controller having transparent failover and failback
US7054913B1 (en) System and method for performing virtual device I/O operations
CN102325192B (en) Cloud computing implementation method and system
US6757753B1 (en) Uniform routing of storage access requests through redundant array controllers
JP5142678B2 (en) Deployment method and system
US8984330B2 (en) Fault-tolerant replication architecture
US8069368B2 (en) Failover method through disk takeover and computer system having failover function

Legal Events

Date Code Title Description
RD04 Notification of resignation of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7424

Effective date: 20080613

A300 Withdrawal of application because of no request for examination

Free format text: JAPANESE INTERMEDIATE CODE: A300

Effective date: 20100105