JP2010257209A

JP2010257209A - Bus switch, computer system, and management method for computer system

Info

Publication number: JP2010257209A
Application number: JP2009106365A
Authority: JP
Inventors: Mitsuru Sato; 充佐藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2009-04-24
Filing date: 2009-04-24
Publication date: 2010-11-11
Anticipated expiration: 2029-04-24
Also published as: JP5218252B2

Abstract

PROBLEM TO BE SOLVED: To move a virtual machine that directly controls an I/O device between servers while the virtual machine keeps running. SOLUTION: A PCI Express switch 30 serves as a bus switch that connects, in a star topology, an I/O device 40 and servers 10 and 20 on which virtual machines run. The switch incorporates a packet duplication unit 30F that duplicates a memory access request (packet) from the I/O device 40 and transmits packets with the same contents to the servers 10 and 20, an output suppression unit 30G that suppresses memory access requests to a designated address region, and a duplication exclusion unit 30H that excludes interrupt-related memory access requests from duplication. The packet duplication unit 30F, the output suppression unit 30G, and the duplication exclusion unit 30H are controlled as necessary so that the virtual machine is moved while it is running. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to a technique for moving a virtual machine between servers.

In recent years, virtualization technology has come into practical use in which a control program called a hypervisor builds multiple virtual machines on a single physical server and runs an arbitrary OS (Operating System) and applications on each virtual machine. Because a virtual machine uses memory addresses virtualized by the hypervisor, it must control input/output devices (I/O devices) through the hypervisor, and the overhead incurred on every I/O device access degrades performance. For this reason, a technique has been proposed that lets a virtual machine control an I/O device directly by providing a mechanism that converts between the memory addresses virtualized by the hypervisor and the memory addresses of the I/O device.

Japanese Unexamined Patent Application Publication No. 2008-21252 (JP 2008-21252 A)

In a virtual machine environment, power can be saved by consolidating the virtual machines onto a single physical server when utilization drops, for example at night. However, for a virtual machine that directly controls an I/O device, the hypervisor does not manage the state of the I/O device, so the virtual machine could not be moved while it was running.

In view of this problem of the prior art, an object of the present technology is to make it possible to move a virtual machine between servers while the virtual machine, which directly controls an I/O device, keeps running.

To this end, in the present technology, a bus switch that connects, in a star topology, an input/output device and a plurality of servers on which virtual machines run incorporates a duplication unit that duplicates a memory access request from the input/output device and transmits memory access requests with identical contents to the plurality of servers, a suppression unit that suppresses memory access requests to a first designated address region, and a duplication exclusion unit that excludes interrupt-related memory access requests from duplication. In addition, when an instruction to move a virtual machine is issued, the duplication unit is controlled so that memory access requests are transmitted to both the source server and the destination server. While the contents of the memory used by the virtual machine on the source server are being transferred to the destination server, the suppression unit is controlled so that memory access requests to the memory address region being transferred are suppressed, and the duplication exclusion unit is controlled so that interrupt-related memory access requests are excluded from duplication.

According to the present technology, a virtual machine that directly controls an input/output device can be moved from a source server to a destination server while it keeps running.

FIG. 1 is a diagram of an embodiment of a computer system to which the present technology is applied.
FIG. 2 is a detailed configuration diagram of the PCI Express switch.
FIG. 3 is a flowchart explaining the operation of the PCI Express switch.
FIG. 4 is a flowchart of the migration process executed by the source server.
FIG. 5 is a flowchart of the migration process executed by the source server.
FIG. 6 is a flowchart of the migration process executed by the destination server.

Hereinafter, the present technology will be described in detail with reference to the accompanying drawings.

FIG. 1 shows an embodiment of a computer system to which the present technology is applied.

Two physical servers (hereinafter "servers") 10 and 20 are connected to an I/O device 40, such as a network adapter or a hard disk controller, via a PCI Express (registered trademark) switch 30, which acts as a bus switch using a star topology. The server 10 includes a central processing unit CPU, a memory MEM, and an IO hub HUB. On the server 10, a hypervisor 12 that implements the virtualization technology builds at least one virtual machine (VM) 14 as a virtual computer. Through an IOMMU (Input/Output Memory Management Unit) 16 incorporated in the IO hub HUB, the virtual machine 14 can exchange data directly between the I/O device 40 and a VM memory 18 whose memory addresses are virtualized by the hypervisor 12. The IOMMU 16 is hardware that converts between the memory addresses of the VM memory 18 and the memory addresses of the I/O device 40. The server 20 has the same configuration as the server 10, so its description is omitted. The servers 10 and 20 are also connected, via a NIC (Network Interface Controller) 60, to a management network 50 that the hypervisors use for virtual machine control.
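The role of the IOMMU 16 can be illustrated with a toy page-granular translation table. This is only a sketch under stated assumptions: the class name `ToyIommu`, the 4 KiB page size, and the dictionary-based table are hypothetical and are not taken from the patent or from any real IOMMU interface.

```python
PAGE_SIZE = 4096  # assumed page size for this illustration


class ToyIommu:
    """Hypothetical model: maps device-visible pages to host memory pages."""

    def __init__(self):
        self.table = {}  # device page number -> host page number

    def map_page(self, device_page, host_page):
        self.table[device_page] = host_page

    def translate(self, device_addr):
        # Split the address into a page number and an offset within the page.
        page, offset = divmod(device_addr, PAGE_SIZE)
        if page not in self.table:
            # An unmapped access is rejected instead of reaching memory.
            raise PermissionError(f"no mapping for device page {page}")
        return self.table[page] * PAGE_SIZE + offset


iommu = ToyIommu()
iommu.map_page(2, 7)
host_addr = iommu.translate(2 * PAGE_SIZE + 5)
```

With such a table in place, a device write aimed at the virtual machine's view of memory lands in the host pages that back the VM memory without passing through the hypervisor.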

The hypervisors 12 and 22, which are an example of the control means, are implemented by installing on the servers 10 and 20 a computer system management program recorded on a computer-readable recording medium such as a CD-ROM.

The PCI Express switch 30 is an LSI (Large Scale Integration) device that complies with the MR-IOV (Multi-Root I/O Virtualization) standard of the PCI-SIG (Peripheral Component Interconnect Special Interest Group) and can be connected to multiple physical servers. The I/O device 40 described above also complies with the PCI-SIG MR-IOV standard and can be connected to multiple physical servers.

As shown in FIG. 2, the PCI Express switch 30 includes an input buffer 30A and an output buffer 30B for exchanging packets with the servers 10 and 20, an input buffer 30C and an output buffer 30D for exchanging packets with the I/O device 40, and a routing unit 30E. The input buffers 30A and 30C and the output buffers 30B and 30D each have a queue that temporarily stores packets being sent or received. The routing unit 30E performs routing control for packets exchanged between the server 10 or 20 and the I/O device 40, or between the two servers 10 and 20. To allow a virtual machine to be moved between physical servers while it is running, the PCI Express switch 30 further includes a packet duplication unit 30F, an output suppression unit 30G, and a duplication exclusion unit 30H.

When a memory access request from the I/O device 40, that is, a memory write request packet (hereinafter "packet"), arrives at the input buffer 30C, the packet duplication unit 30F duplicates the packet so that the same packet is transmitted to both servers 10 and 20. When a packet from the I/O device 40 arrives at the input buffer 30C, the output suppression unit 30G prevents packets addressed to a designated address region from being output from the input buffer 30C. The duplication exclusion unit 30H excludes interrupt request packets from the packets duplicated by the packet duplication unit 30F. This exclusion can be implemented, for example, by designating the address region in which packets are duplicated, by designating the address region in which packets are excluded from duplication, or by tracking configuration accesses to recognize MSI/MSI-X (Message Signaled Interrupt) settings and exclude them automatically. The packet duplication unit 30F, the output suppression unit 30G, and the duplication exclusion unit 30H are each controlled by control signals from outside the PCI Express switch 30.

The hypervisors 12 and 22 control the functions provided by the packet duplication unit 30F, the output suppression unit 30G, and the duplication exclusion unit 30H of the PCI Express switch 30 to move a virtual machine while it keeps running. These functions are used as follows. In the description below, the virtual machine 14 on the server 10 is moved to the server 20 while it is running.

(1) Duplication function provided by the packet duplication unit 30F

When the virtual machine 14 is moved while running, the virtual machine 14 and the I/O device 40 continue to operate during the move. Consequently, even while the hypervisors 12 and 22 cooperate on the migration, for example while the contents of the VM memory 18 are being transferred to the server 20, memory accesses from the I/O device 40, including writes to the memory space of the virtual machine 14, still occur. These accesses take place directly between the virtual machine 14 and the I/O device 40, and the hypervisor 12 cannot intervene in them.

Therefore, the PCI Express switch 30 is given a duplication function so that memory writes from the I/O device 40 are applied to both the source server 10 and the destination server 20. As a result, the memory contents of the source server 10 and the destination server 20 stay synchronized even for memory writes from the I/O device 40 in which the hypervisor 12 is not involved.

(2) Suppression function provided by the output suppression unit 30G

Even when the duplication function applies memory writes from the I/O device 40 to both the source server 10 and the destination server 20, the memory contents can still diverge. Suppose, for example, that a memory write from the I/O device 40 occurs while data read from the VM memory 18 of the source server 10 is in transit to the destination server 20. The write updates both the VM memory 18 of the source server 10 and the VM memory 28 of the destination server 20. When the data read earlier from the VM memory 18 then arrives at the destination server 20, it overwrites the newly written contents at the destination with old data.
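The ordering problem described above can be reproduced with a few lines of simulation. This is a minimal sketch, not the patent's implementation: the function name `migrate_chunk` and the `hold_write` flag are hypothetical, and the "held" branch simply defers the device write until the in-flight chunk has arrived.

```python
def migrate_chunk(src, dst, device_write, hold_write):
    """Simulate one chunk transfer racing with one duplicated device write."""
    snapshot = bytes(src)  # the hypervisor reads the chunk from the source memory
    if not hold_write:
        # The device write is duplicated to both servers while the chunk is in flight.
        src[:] = device_write
        dst[:] = device_write
    dst[:] = snapshot  # the chunk read earlier arrives at the destination
    if hold_write:
        # The write was held back and is applied only after the chunk arrived.
        src[:] = device_write
        dst[:] = device_write
    return bytes(src), bytes(dst)


racy = migrate_chunk(bytearray(b"old"), bytearray(3), b"new", hold_write=False)
held = migrate_chunk(bytearray(b"old"), bytearray(3), b"new", hold_write=True)
```

In the racy case the destination ends up with the stale bytes while the source holds the new ones. Deferring the write keeps both copies identical.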

Therefore, the PCI Express switch 30 is given a suppression function that holds back the output of packets that would overwrite a memory region currently being transferred from the source server 10 to the destination server 20, which prevents the memory contents of the two servers from diverging.

(3) Exclusion function provided by the duplication exclusion unit 30H

A CPU has a function that converts a memory write to a specific address into an interrupt according to the address and the data contents. While the virtual machine 14 is being moved, the duplication function sends memory writes from the I/O device 40 to both the source server 10 and the destination server 20. Interrupts, however, must not be sent to both servers. During the move, the virtual machine 14 on the source server 10 can process interrupts but the virtual machine 24 on the destination server 20 cannot, and the same interrupt could end up being handled twice, once at the source and once at the destination.

Therefore, the PCI Express switch 30 is given an exclusion function that excludes interrupts from the I/O device 40 from duplication, so that an interrupt is not processed twice by the source server 10 and the destination server 20.

FIG. 3 shows the operation that the PCI Express switch 30 performs when a packet from the I/O device 40 arrives at the input buffer 30C. The flowchart in FIG. 3 does not describe the operation of each individual unit. It describes the operation of the PCI Express switch 30 as a whole.

In step 1 (abbreviated as "S1" in the figure, and likewise below), the PCI Express switch 30 determines whether the memory region written by the packet is locked. This can be determined, for example, by comparing the packet contents with the lock information (information indicating the locked memory addresses) in the control signal input to the output suppression unit 30G. If the memory region is locked, the PCI Express switch 30 proceeds to step 2 (Yes). If it is not locked, the switch proceeds to step 5 (No).

In step 2, the PCI Express switch 30 uses the output suppression unit 30G to prevent packets from being output from the queue of the input buffer 30C. While packet output is suppressed, packets arriving at the input buffer 30C are stored in the queue in order of arrival.

In step 3, the PCI Express switch 30 determines whether the lock on the memory region written by the packet has been released. This can be determined, for example, by whether the control signal is no longer being input to the output suppression unit 30G. If the lock has been released, the PCI Express switch 30 proceeds to step 4 (Yes). If it has not, the switch waits (No).

In step 4, the PCI Express switch 30 cancels the suppression of packet output by the output suppression unit 30G.

In step 5, the PCI Express switch 30 determines whether a packet duplication instruction is in effect. This can be determined, for example, by whether a control signal is being input to the packet duplication unit 30F. If there is a duplication instruction, the PCI Express switch 30 proceeds to step 6 (Yes). If there is none, the switch proceeds to step 9 (No).

In step 6, the PCI Express switch 30 determines whether the packet is a duplication target. This can be determined by any of the following three methods. The first method duplicates only packets from the I/O device 40 that are addressed to a specific address region, that is, it designates the packets to be duplicated. The second method duplicates only packets from the I/O device 40 that are addressed outside a specific address region, that is, it designates the packets to be excluded from duplication. The third method monitors the configuration setting operations performed from the server 10 through the PCI Express switch 30, detects the memory addresses used for interrupts, and excludes packets from the I/O device 40 addressed to the detected address region from duplication. If the packet is a duplication target, the PCI Express switch 30 proceeds to step 7 (Yes). If it is not, the switch proceeds to step 9 (No).
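The three ways of deciding whether a packet is a duplication target can be combined into one address-range predicate. The sketch below is an illustration under assumptions: the class `DuplicationFilter`, its field names, and the rule that an empty allow-list means "duplicate everything not excluded" are all hypothetical. The interrupt range used in the example is only an illustrative value.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Range = Tuple[int, int]  # half-open address range [start, end)


def in_any(addr: int, ranges: List[Range]) -> bool:
    return any(start <= addr < end for start, end in ranges)


@dataclass
class DuplicationFilter:
    include: List[Range] = field(default_factory=list)    # method 1: regions to duplicate
    exclude: List[Range] = field(default_factory=list)    # method 2: regions to skip
    interrupt: List[Range] = field(default_factory=list)  # method 3: detected interrupt addresses

    def is_target(self, addr: int) -> bool:
        # Exclusions always win over the allow-list.
        if in_any(addr, self.interrupt) or in_any(addr, self.exclude):
            return False
        # Assumption: an empty allow-list means no allow-list is configured.
        return not self.include or in_any(addr, self.include)


by_interrupt = DuplicationFilter(interrupt=[(0xFEE00000, 0xFEF00000)])
by_allow_list = DuplicationFilter(include=[(0x1000, 0x2000)])
```

Here `by_interrupt.is_target(0x1000)` is true while `by_interrupt.is_target(0xFEE00010)` is false, so an ordinary data write would be duplicated and a write into the interrupt range would not.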

In step 7, the PCI Express switch 30 uses the packet duplication unit 30F to duplicate the packet that arrived at the input buffer 30C. The duplicated packet is given a destination different from that of the packet that arrived at the input buffer 30C.

In step 8, the PCI Express switch 30 sends the packet that arrived at the input buffer 30C and the packet duplicated by the packet duplication unit 30F to their respective destination servers 10 and 20.

In step 9, the PCI Express switch 30 sends the packet that arrived at the input buffer 30C to its destination server 10 or 20.

With this PCI Express switch 30, when a packet from the I/O device 40 arrives at the input buffer 30C, the switch determines whether the memory region written by the packet is locked. If the region is locked, packets are prevented from leaving the queue of the input buffer 30C until the lock is released. If a duplication instruction is in effect and the packet is a duplication target (that is, not an interrupt request packet), the packet is duplicated and each copy is sent to its own destination. If there is no duplication instruction, or if the packet is an interrupt request that is not a duplication target, the packet that arrived from the I/O device 40 is sent only to its destination.
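The overall flow of steps 1 to 9 can be summarized in a small software model. This is a sketch under assumptions, not the hardware design: `ToySwitch`, `Packet`, and the string destinations are hypothetical names, a single locked range is assumed, and a locked packet at the head of the queue blocks everything behind it so that arrival order is preserved.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Tuple

Range = Tuple[int, int]  # half-open address range [start, end)


@dataclass
class Packet:
    addr: int
    data: bytes
    dest: str


@dataclass
class ToySwitch:
    locked: Optional[Range] = None      # region currently being transferred
    duplicate_to: Optional[str] = None  # second destination while duplication is on
    interrupt_ranges: List[Range] = field(default_factory=list)
    queue: Deque[Packet] = field(default_factory=deque)
    sent: List[Packet] = field(default_factory=list)

    @staticmethod
    def _hits(addr: int, ranges: List[Range]) -> bool:
        return any(start <= addr < end for start, end in ranges)

    def on_packet(self, pkt: Packet) -> None:
        self.queue.append(pkt)
        self._drain()

    def unlock(self) -> None:
        self.locked = None  # steps 3-4: lock released, suppression lifted
        self._drain()

    def _drain(self) -> None:
        while self.queue:
            pkt = self.queue[0]
            if self.locked and self._hits(pkt.addr, [self.locked]):
                return  # steps 1-2: hold the queue while the region is locked
            self.queue.popleft()
            self.sent.append(pkt)  # step 8 or 9: send the original packet
            # Steps 5-7: duplicate unless the packet targets an interrupt address.
            if self.duplicate_to and not self._hits(pkt.addr, self.interrupt_ranges):
                self.sent.append(Packet(pkt.addr, pkt.data, self.duplicate_to))


sw = ToySwitch(locked=(0x1000, 0x2000), duplicate_to="server20",
               interrupt_ranges=[(0xF000, 0xF100)])
sw.on_packet(Packet(0x1800, b"a", "server10"))
sw.on_packet(Packet(0x3000, b"b", "server10"))
held_count = len(sw.sent)  # nothing is sent while the region is locked
sw.unlock()
sw.on_packet(Packet(0xF010, b"i", "server10"))
```

In this run the two data writes are each delivered to both servers once the lock is released, while the write into the interrupt range reaches only its original destination.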

Next, the migration process that moves a virtual machine in this computer system is described.

FIG. 4 and FIG. 5 show the migration process that the hypervisor 12 of the source server 10 executes when it receives an instruction to move a virtual machine, for example from an administrator. The move instruction includes at least destination information indicating that the virtual machine is to be moved to the server 20.

In step 11, the hypervisor 12 notifies the hypervisor 22 of the destination server 20, identified by the destination information in the move instruction, that the virtual machine migration is starting.

In step 12, the hypervisor 12 determines whether a response has been received from the destination server 20. If a response has been received, the hypervisor 12 proceeds to step 13 (Yes). If not, it waits (No).

In step 13, the hypervisor 12 instructs the PCI Express switch 30 to duplicate packets, that is, it outputs a control signal to the packet duplication unit 30F of the PCI Express switch 30.

In step 14, in order to transfer the VM memory 18 of the virtual machine 14 piece by piece to the VM memory 28 of the destination server 20, the hypervisor 12 sets a memory lock region in which data writes to the VM memory 18 are locked. The size of the memory lock region may be chosen, for example, according to the amount of data that fits in one packet. The hypervisor 12 also outputs a control signal containing lock information that identifies the memory lock region to the output suppression unit 30G of the PCI Express switch 30.

In step 15, the hypervisor 12 reads data from the memory lock region of the VM memory 18.

In step 16, the hypervisor 12 sends the data read from the VM memory 18 to the destination server 20 in packet form via the management network 50.

In step 17, the hypervisor 12 determines whether a response has been received from the destination server 20. If a response has been received, the hypervisor 12 proceeds to step 18 (Yes). If not, it waits (No).

In step 18, the hypervisor 12 determines whether the transfer of the VM memory 18 is complete. This can be determined, for example, by whether data has been transferred up to the last address of the VM memory 18. If the transfer is complete, the hypervisor 12 proceeds to step 19 (Yes). If it is not, the hypervisor returns to step 14 (No).
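The loop formed by steps 14 to 18 can be sketched as a chunked copy in which each chunk is locked before it is read. The function name and the `lock`, `unlock`, and `send` callbacks are hypothetical stand-ins for the control signal to the output suppression unit 30G and for the management-network transfer. Releasing the last lock after the loop is an assumption, since the text only states that a new lock region is set on each pass.

```python
def transfer_vm_memory(memory, chunk_size, lock, unlock, send):
    """Copy memory chunk by chunk; returns the number of chunks sent."""
    chunks = 0
    for start in range(0, len(memory), chunk_size):
        end = min(start + chunk_size, len(memory))
        lock(start, end)             # step 14: set the memory lock region
        payload = memory[start:end]  # step 15: read the locked region
        send(start, payload)         # steps 16-17: send and wait for the response
        chunks += 1                  # step 18: loop until the last address
    unlock()                         # assumption: clear the final lock region
    return chunks


log = []
destination = bytearray(10)


def fake_send(offset, payload):
    # Stand-in for the destination hypervisor writing the received data.
    destination[offset:offset + len(payload)] = payload
    log.append(("send", offset))


sent_chunks = transfer_vm_memory(
    bytes(range(10)), 4,
    lock=lambda s, e: log.append(("lock", s, e)),
    unlock=lambda: log.append(("unlock",)),
    send=fake_send,
)
```

Choosing `chunk_size` to match the payload of one packet follows the sizing hint given for the memory lock region.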

In step 19, the hypervisor 12 stops the virtual machine 14 that was running on the source server 10.

In step 20, the hypervisor 12 outputs a switching instruction to the PCI Express switch 30, instructing it to forward packets arriving at the input buffer 30C from the I/O device 40 to the destination server 20.

In step 21, the hypervisor 12 detects the overwritten regions, that is, the regions of the VM memory 18 on the source server 10 that the virtual machine 14 overwrote while the VM memory 18 was being transferred in steps 14 to 18. The overwritten regions can be identified, for example, by referring to the memory overwrite flags attached to the page table provided by the CPU of the source server 10.

In step 22, the hypervisor 12 reads data of a predetermined size from an overwritten region of the VM memory 18. The predetermined size may be chosen, for example, according to the amount of data that fits in one packet.

In step 23, the hypervisor 12 sends the data read from the VM memory 18 to the destination server 20 in packet form via the management network 50.

In step 24, the hypervisor 12 determines whether a response has been received from the destination server 20. If a response has been received, the hypervisor 12 proceeds to step 25 (Yes). If not, it waits (No).

In step 25, the hypervisor 12 determines whether the memory transfer of the overwritten regions is complete. This can be determined, for example, by whether data has been transferred up to the last address of the overwritten regions. If the transfer is complete, the hypervisor 12 proceeds to step 26 (Yes). If it is not, the hypervisor returns to step 22 (No).
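Steps 21 to 25 amount to resending only the pages that the virtual machine dirtied during the first pass. The sketch below assumes that the overwrite flags have already been collected into a set of page numbers. The names `dirty_chunks` and `resend_overwritten` and the `send` callback are hypothetical.

```python
def dirty_chunks(dirty_pages, page_size, chunk_size):
    """Yield (offset, length) pieces that together cover every dirty page."""
    for page in sorted(dirty_pages):
        base = page * page_size
        for offset in range(base, base + page_size, chunk_size):
            yield offset, min(chunk_size, base + page_size - offset)


def resend_overwritten(memory, dirty_pages, page_size, chunk_size, send):
    """Steps 22-25: read each piece, send it, and repeat until all are done."""
    count = 0
    for offset, length in dirty_chunks(dirty_pages, page_size, chunk_size):
        send(offset, memory[offset:offset + length])
        count += 1
    return count


source = bytes(range(16))
target = bytearray(16)


def apply_at_destination(offset, payload):
    # Stand-in for the destination hypervisor writing the received data.
    target[offset:offset + len(payload)] = payload


pieces = resend_overwritten(source, {3, 1}, page_size=4, chunk_size=3,
                            send=apply_at_destination)
```

Because the virtual machine has already been stopped in step 19, no further guest writes occur during this second pass.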

In step 26, the hypervisor 12 extracts the processor state of the virtual machine 14 from the source server 10. The processor state means, for example, the state of the various registers of the virtual machine 14 and the value of the program counter.

In step 27, the hypervisor 12 sends the processor state to the destination server 20 in packet form via the management network 50.

In step 28, the hypervisor 12 determines whether a response has been received from the destination server 20. If a response has been received, the hypervisor 12 proceeds to step 29 (Yes). If not, it waits (No).

In step 29, the hypervisor 12 deletes the virtual machine 14 from the migration source server 10.
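The source-side loop of steps 24 to 29 can be sketched as follows. This is a minimal illustration, not the patented implementation: the chunk size `CHUNK`, the class `FakeDestination`, and the function `finish_migration` are hypothetical names, and the management network 50 is replaced by direct method calls that always answer immediately.

```python
from dataclasses import dataclass, field
from typing import Optional

CHUNK = 4096  # hypothetical transfer unit in bytes (not specified in the text)


@dataclass
class FakeDestination:
    """Stand-in for migration destination server 20 (assumption for the sketch)."""
    memory: dict = field(default_factory=dict)
    cpu_state: Optional[dict] = None

    def send_memory(self, addr, data):
        self.memory[addr] = data
        return True  # models the response checked in step 24

    def send_cpu_state(self, state):
        self.cpu_state = dict(state)
        return True  # models the response checked in step 28


def finish_migration(vm_memory, start, end, cpu_state, dest):
    """Send the overwrite area [start, end) in chunks, then the processor state.

    Returns the number of chunks sent. Deleting the source virtual
    machine (step 29) would follow a successful return.
    """
    sent = 0
    addr = start
    while addr < end:  # step 25: finished once the last address is sent
        stop = min(addr + CHUNK, end)
        if not dest.send_memory(addr, bytes(vm_memory[addr:stop])):
            # step 24: a real hypervisor would keep waiting here
            raise RuntimeError("no response from destination")
        sent += 1
        addr = stop
    # steps 26 to 28: extract and send the processor state, await the reply
    if not dest.send_cpu_state(cpu_state):
        raise RuntimeError("no response from destination")
    return sent
```

The loop condition mirrors the completion test of step 25, which compares the transfer position with the last address of the overwrite area.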
FIG. 6 shows a migration process that is executed when the hypervisor 22 of the migration destination server 20 receives a notification to start migration of the virtual machine from the migration source server 10.

In step 31, the hypervisor 22 allocates, in the memory MEM, the VM memory 28 to be used by the virtual machine 24.
In step 32, the hypervisor 22 configures the IOMMU 26 incorporated in the IO hub HUB so that memory addresses of the VM memory 28 and memory addresses of the I/O device 40 can be translated into each other.

In step 33, the hypervisor 22 returns, via the management network 50, a response to the migration source server 10 indicating that preparation for the migration is complete.
In step 34, the hypervisor 22 determines whether it has received a packet from the migration source server 10, that is, whether it has received data of the VM memory 18 of the migration source server 10. If a packet has been received, the hypervisor 22 advances the process to step 35 (Yes); otherwise, it waits (No). To keep the process from falling into an infinite loop when no packet arrives from the migration source server 10 for a long time, the hypervisor 22 desirably runs, in parallel, a timeout process that exits the loop once a predetermined time has elapsed.

In step 35, the hypervisor 22 writes the received data to the VM memory 28.
In step 36, the hypervisor 22 outputs, to the output suppression unit 30G of the PCI Express switch 30, a control signal that releases the packet output suppression.

In step 37, the hypervisor 22 returns, via the management network 50, a response to the migration source server 10 indicating that the data write is complete.
In step 38, the hypervisor 22 determines whether it has received the processor state from the migration source server 10. If the processor state has been received, the hypervisor 22 advances the process to step 39 (Yes); otherwise, it returns the process to step 34 (No).

In step 39, the hypervisor 22 sets the processor state of the virtual machine 24 based on the received processor state. This synchronizes the processor state between the virtual machine 14 of the migration source server 10 and the virtual machine 24 of the migration destination server 20.

In step 40, the hypervisor 22 starts the virtual machine 24 on the migration destination server 20.
In step 41, the hypervisor 22 returns, via the management network 50, a response to the migration source server 10 indicating that the virtual machine 24 has been started.
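The destination-side flow of steps 31 to 41 can be sketched as a receive loop with a timeout. This is a simplified illustration under stated assumptions: the message tuples `("mem", addr, data)` and `("cpu", state)`, the function `run_destination`, and the class `StubSwitch` are hypothetical, the IOMMU setup of step 32 is omitted because it is hardware specific, and the parallel timeout process is approximated by the `timeout` argument of `queue.Queue.get`.

```python
import queue


class StubSwitch:
    """Hypothetical stand-in for the output suppression unit 30G."""

    def __init__(self):
        self.released = 0

    def release_suppression(self):
        self.released += 1


def run_destination(inbox, replies, switch, mem_size, timeout_s=1.0):
    """Receive VM memory and processor state; return (vm_memory, cpu_state)."""
    vm_memory = bytearray(mem_size)      # step 31: allocate VM memory
    replies.append("ready")              # step 33: report readiness
    while True:
        try:
            msg = inbox.get(timeout=timeout_s)  # step 34, bounded wait
        except queue.Empty:
            raise TimeoutError("no packet from source server") from None
        if msg[0] == "mem":
            _, addr, data = msg
            vm_memory[addr:addr + len(data)] = data  # step 35
            switch.release_suppression()             # step 36
            replies.append("written")                # step 37
        elif msg[0] == "cpu":                        # step 38
            cpu_state = dict(msg[1])                 # step 39
            replies.append("started")                # steps 40 and 41
            return vm_memory, cpu_state


inbox = queue.Queue()
inbox.put(("mem", 0, b"abcd"))
inbox.put(("mem", 4, b"efgh"))
inbox.put(("cpu", {"pc": 7}))
replies = []
memory, state = run_destination(inbox, replies, StubSwitch(), 8, timeout_s=0.1)
```

Raising an exception on timeout is one possible way to leave the wait loop; the text only states that the loop should be exited after a predetermined time.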

With this migration process, when the virtual machine 14 is moved from the migration source server 10 to the migration destination server 20, the migration destination server 20 allocates the area of the VM memory 28 used by the virtual machine 24 and configures the IOMMU 26 so that the virtual machine 24 can directly control the I/O device 40. In addition, the VM memory 18 that was used by the virtual machine 14 of the migration source server 10 is transferred, in divided portions, from the migration source server 10 to the migration destination server 20.

When the transfer of the VM memory 18 from the migration source server 10 to the migration destination server 20 is complete, the virtual machine 14 of the migration source server 10 is stopped, and packets arriving at the PCI Express switch 30 from the I/O device 40 begin to be forwarded to the migration destination server 20. In addition, the overwrite area of the VM memory 18, that is, the area whose data was overwritten on the migration source server 10 while the VM memory 18 was being transferred, is transferred in divided portions from the migration source server 10 to the migration destination server 20 after the transfer of the VM memory 18 is complete.

Then, when the transfer of the overwrite area from the migration source server 10 to the migration destination server 20 is complete, the processor state of the virtual machine 14 of the migration source server 10 is transmitted to the migration destination server 20. On receiving the processor state, the migration destination server 20 sets the processor state of the virtual machine 24 so as to reproduce the processor state of the virtual machine 14, and starts the virtual machine 24. Meanwhile, the migration source server 10 deletes the virtual machine 14.

Thus, through cooperation between the hypervisors 12 and 22 and the PCI Express switch 30, the virtual machine 14 that directly controls the I/O device 40 can be moved from the migration source server 10 to the migration destination server 20 while it remains in operation.

The computer system described in the above embodiment includes two servers, but it may include three or more servers.
Regarding the above embodiment, the following supplementary notes are further disclosed.

(Supplementary note 1) A bus switch that connects, in a star topology, a plurality of servers on which virtual machines run and an I/O device, the bus switch comprising: a replication unit that replicates a memory access request from the I/O device to one server and transmits memory access requests with identical content to a plurality of servers; a suppression unit that suppresses memory access requests to a first designated address area; and a replication exclusion unit that excludes, from among the memory access requests replicated by the replication unit, memory access requests related to interrupts from replication.

(Supplementary note 2) The bus switch according to supplementary note 1, wherein the suppression unit suppresses the memory access requests by accumulating memory access requests from the I/O device to one server in a queue in order of arrival.

(Supplementary note 3) The bus switch according to supplementary note 1 or 2, wherein the replication exclusion unit excludes memory access requests to a second designated address area from replication.

(Supplementary note 4) The bus switch according to supplementary note 1 or 2, wherein the replication exclusion unit excludes memory access requests to areas other than a second designated address area from replication.

(Supplementary note 5) The bus switch according to supplementary note 1 or 2, wherein the replication exclusion unit monitors configuration setting operations from the one server, detects the address area in which memory access requests related to interrupts are processed, and excludes memory access requests to the detected address area from replication.
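The interplay of the three units in supplementary notes 1 to 3 can be illustrated with a small software model. This is only a sketch under assumptions: the class `BusSwitchModel`, its attribute names, and the address values are hypothetical, the real units are hardware inside the PCI Express switch 30, and forwarding the held requests in arrival order on release is an assumed behavior that the text does not spell out.

```python
from collections import deque


class BusSwitchModel:
    """Toy model of replication, suppression, and replication exclusion."""

    def __init__(self, servers):
        self.servers = servers    # server name -> list of delivered requests
        self.replicate_to = None  # list of server names while replicating
        self.suppressed = None    # (start, end): first designated address area
        self.excluded = None      # (start, end): second designated address area
        self.queue = deque()      # held requests, kept in arrival order

    @staticmethod
    def _inside(area, addr):
        return area is not None and area[0] <= addr < area[1]

    def receive(self, target, addr, payload):
        """Handle one memory access request from the I/O device."""
        if self._inside(self.suppressed, addr):
            self.queue.append((target, addr, payload))  # suppression unit
            return
        self._forward(target, addr, payload)

    def _forward(self, target, addr, payload):
        # Replication exclusion unit: excluded requests go to the
        # original target only; all others are replicated if enabled.
        if self.replicate_to and not self._inside(self.excluded, addr):
            destinations = self.replicate_to
        else:
            destinations = [target]
        for name in destinations:
            self.servers[name].append((addr, payload))

    def release_suppression(self):
        """Lift suppression and forward held requests (assumed behavior)."""
        self.suppressed = None
        while self.queue:
            self._forward(*self.queue.popleft())


servers = {"s10": [], "s20": []}
switch = BusSwitchModel(servers)
switch.replicate_to = ["s10", "s20"]
switch.excluded = (0xFEE00000, 0xFEF00000)   # illustrative interrupt area
switch.suppressed = (0x1000, 0x2000)         # illustrative area in transfer
switch.receive("s10", 0x0500, b"a")          # replicated to both servers
switch.receive("s10", 0x1800, b"b")          # held in the queue
switch.receive("s10", 0xFEE00010, b"irq")    # delivered to s10 only
switch.release_suppression()                 # held request is now replicated
```

A deque keeps the held requests in arrival order, which matches the queueing described in supplementary note 2.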

(Supplementary note 6) A computer system comprising: a bus switch that connects, in a star topology, a plurality of servers on which virtual machines run and an I/O device; and control means for controlling the bus switch, wherein the bus switch includes a replication unit that replicates a memory access request from the I/O device to one server and transmits memory access requests with identical content to a plurality of servers, a suppression unit that suppresses memory access requests to a first designated address area, and a replication exclusion unit that excludes, from among the memory access requests replicated by the replication unit, memory access requests related to interrupts from replication, and wherein, when a virtual machine migration is instructed, the control means controls the replication unit so that the memory access requests are transmitted to the migration source server and the migration destination server, and then, while the contents of the memory used by the virtual machine of the migration source server are being transferred to the migration destination server, controls the suppression unit so that memory access requests to the memory address area being transferred are suppressed and controls the replication exclusion unit so that memory access requests related to interrupts are excluded from replication.

(Supplementary note 7) A management method for a computer system, wherein a computer that controls a bus switch, the bus switch being interposed between a plurality of servers on which virtual machines run and an I/O device and including a replication unit that replicates a memory access request from the I/O device to one server and transmits memory access requests with identical content to a plurality of servers, a suppression unit that suppresses memory access requests to a first designated address, and a replication exclusion unit that excludes, from among the memory access requests replicated by the replication unit, memory access requests related to interrupts from replication, executes: a step of controlling the replication unit, when a virtual machine migration is instructed, so that the memory access requests are transmitted to the migration source server and the migration destination server; a step of controlling the suppression unit so that, while the contents of the memory used by the virtual machine of the migration source server are being transferred to the migration destination server, memory access requests to the memory address area being transferred are suppressed; and a step of controlling the replication exclusion unit so that memory access requests related to interrupts are excluded from replication.

(Supplementary note 8) A management program for a computer system, the program causing a computer that controls a bus switch, the bus switch being interposed between a plurality of servers on which virtual machines run and an I/O device and including a replication unit that replicates a memory access request from the I/O device to one server and transmits memory access requests with identical content to a plurality of servers, a suppression unit that suppresses memory access requests to a first designated address, and a replication exclusion unit that excludes, from among the memory access requests replicated by the replication unit, memory access requests related to interrupts from replication, to carry out: a step of controlling the replication unit, when a virtual machine migration is instructed, so that the memory access requests are transmitted to the migration source server and the migration destination server; a step of controlling the suppression unit so that, while the contents of the memory used by the virtual machine of the migration source server are being transferred to the migration destination server, memory access requests to the memory address area being transferred are suppressed; and a step of controlling the replication exclusion unit so that memory access requests related to interrupts are excluded from replication.

10 Server (migration source server)
12 Hypervisor
14 Virtual machine
18 VM memory
20 Server (migration destination server)
22 Hypervisor
24 Virtual machine
28 VM memory
30 PCI Express switch
30F Packet replication unit
30G Output suppression unit
30H Replication exclusion unit
40 I/O device

Claims

A bus switch that connects, in a star topology, a plurality of servers on which virtual machines run and an I/O device, the bus switch comprising:
a replication unit that replicates a memory access request from the I/O device to one server and transmits memory access requests with identical content to a plurality of servers;
a suppression unit that suppresses memory access requests to a first designated address area; and
a replication exclusion unit that excludes, from among the memory access requests replicated by the replication unit, memory access requests related to interrupts from replication.

2. The bus switch according to claim 1, wherein the suppression unit suppresses the memory access requests by accumulating memory access requests from the I/O device to one server in a queue in order of arrival.

The bus switch according to claim 1 or 2, wherein the replication exclusion unit excludes memory access requests to a second designated address area from replication.

A computer system comprising:
a bus switch that connects, in a star topology, a plurality of servers on which virtual machines run and an I/O device; and
control means for controlling the bus switch,
wherein the bus switch includes a replication unit that replicates a memory access request from the I/O device to one server and transmits memory access requests with identical content to a plurality of servers, a suppression unit that suppresses memory access requests to a first designated address area, and a replication exclusion unit that excludes, from among the memory access requests replicated by the replication unit, memory access requests related to interrupts from replication, and
wherein, when a virtual machine migration is instructed, the control means controls the replication unit so that the memory access requests are transmitted to the migration source server and the migration destination server, and then, while the contents of the memory used by the virtual machine of the migration source server are being transferred to the migration destination server, controls the suppression unit so that memory access requests to the memory address area being transferred are suppressed and controls the replication exclusion unit so that memory access requests related to interrupts are excluded from replication.

A management method for a computer system, wherein a computer that controls a bus switch, the bus switch being interposed between a plurality of servers on which virtual machines run and an I/O device and including a replication unit that replicates a memory access request from the I/O device to one server and transmits memory access requests with identical content to a plurality of servers, a suppression unit that suppresses memory access requests to a first designated address, and a replication exclusion unit that excludes, from among the memory access requests replicated by the replication unit, memory access requests related to interrupts from replication, executes:
a step of controlling the replication unit, when a virtual machine migration is instructed, so that the memory access requests are transmitted to the migration source server and the migration destination server;
a step of controlling the suppression unit so that, while the contents of the memory used by the virtual machine of the migration source server are being transferred to the migration destination server, memory access requests to the memory address area being transferred are suppressed; and
a step of controlling the replication exclusion unit so that memory access requests related to interrupts are excluded from replication.