JP5985642B2

JP5985642B2 - Data storage system and data storage control method

Info

Publication number: JP5985642B2
Application number: JP2014527293A
Authority: JP
Inventors: デサイ、コマル; ビー．バガーニ、サティヤム
Original assignee: VMware LLC
Current assignee: VMware LLC
Priority date: 2011-08-26
Filing date: 2012-08-23
Publication date: 2016-09-06
Anticipated expiration: 2032-08-23
Also published as: AU2015243081B2; CN103748545A; AU2012300402A1; EP3125103B1; CN105975210B; AU2012300402B2; WO2013032851A1; EP2712439A1; CN105975210A; US20130054890A1; EP3125103A1; US8775774B2; US8949570B2; US20140245016A1; JP2014531068A; CN103748545B; AU2015243082A1; AU2015243081A1; AU2015243082B2; EP2712439B1

Description

As computer systems scale to the enterprise level, particularly in the context of supporting large data centers, the underlying data storage systems frequently employ a storage area network (SAN) or network attached storage (NAS). As is conventionally well understood, SAN and NAS provide a number of technical capabilities and operational benefits. These fundamentally include virtualization of data storage devices; redundancy of physical devices with transparent, fault-tolerant fail-over and fail-safe controls; geographically distributed and replicated storage; and centralized oversight and storage configuration management decoupled from client-centric computer system management.

Architecturally, the storage devices in a SAN storage system (e.g., disk arrays) are typically connected to network switches (e.g., Fibre Channel switches), which are in turn connected to servers or "hosts" that require access to the data in the storage devices. The servers, switches, and storage devices in a SAN typically communicate using the Small Computer System Interface (SCSI) protocol, which transfers data across the network at the level of disk data blocks. In contrast, a NAS device is typically a device that internally contains one or more storage drives and is connected to the hosts (or intermediate switches) through a network protocol such as Ethernet. In addition to containing storage devices, the NAS device also pre-formats its storage devices in accordance with a network-based file system, such as Network File System (NFS) or Common Internet File System (CIFS). Thus, a SAN exposes disks (referred to as LUNs and detailed further below) to the hosts, and those disks must then be formatted and mounted according to a file system utilized by the hosts. By contrast, the NAS device's network-based file system (which must be supported by the operating system of the hosts) makes the NAS device appear as a file server to the operating system of the hosts, which can then mount or map the NAS device, for example, as a network drive accessible by the operating system. It should be recognized that, with continuing innovation and release of new products by storage system vendors, clear distinctions between SAN and NAS storage systems continue to fade, and actual storage system implementations often exhibit characteristics of both, offering both file-level protocols (NAS) and block-level protocols (SAN) in the same system. For example, in an alternative NAS architecture, a NAS "head" or "gateway" device is networked to the host rather than a traditional NAS device. Such a NAS gateway device does not itself contain storage drives, but enables external storage devices to be connected to it (e.g., via a Fibre Channel interface). Such a NAS gateway device, which is perceived by the hosts in a similar fashion as a traditional NAS device, provides a capability to significantly increase the capacity of a NAS-based storage architecture (e.g., to storage capacity levels more traditionally supported by SANs) while retaining the simplicity of file-level storage access.

SCSI and other block protocol-based storage devices, such as the storage system 30 shown in FIG. 1A, utilize a storage system manager 31, which represents one or more programmed storage processors, to aggregate the storage units or drives in the storage device and present them as one or more LUNs (Logical Unit Numbers) 34, each with a uniquely identifiable number. The LUNs 34 are accessed by one or more computer systems 10 through a physical host bus adapter (HBA) 11 over a network 20 (e.g., Fibre Channel). Within computer system 10 and above HBA 11, storage access abstractions are characteristically implemented through a series of software layers, beginning with a low-level device driver layer 12 and ending with an operating system-specific file system layer 15. The device driver layer 12, which enables basic access to the LUNs 34, is typically specific to the communication protocol used by the storage system (e.g., SCSI). A data access layer 13 may be implemented above the device driver layer 12 to support multipath consolidation of the LUNs 34 visible through HBA 11, along with other data access control and management functions. A logical volume manager 14, typically implemented between the data access layer 13 and the conventional operating system file system layer 15, supports volume-oriented virtualization and management of the LUNs 34 accessible through HBA 11. Multiple LUNs 34 can be gathered and managed together as a volume under the control of the logical volume manager 14, for presentation to and use by the file system layer 15 as a single logical device.

The storage system manager 31 implements a virtualization of the physical, typically disk drive-based storage units that reside in storage system 30, referred to in FIG. 1A as spindles 32. From a logical perspective, each of these spindles 32 can be thought of as a sequential array of fixed-size extents 33. The storage system manager 31 abstracts away the complexity of targeting read and write operations to the addresses of the actual spindles and extents of the disk drives by exposing to connected computer systems, such as computer systems 10, a contiguous logical storage space divided into a set of virtual SCSI devices known as LUNs 34. Each LUN represents some capacity that is assigned for use by computer system 10 by virtue of the existence of the LUN and its presentation to computer system 10. The storage system manager 31 maintains metadata for each such LUN that includes a mapping to an ordered list of extents, in which each extent can be identified as a spindle-extent pair <spindle #, extent #> and may therefore be located on any of the various spindles 32.
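The per-LUN metadata described above (an ordered list of spindle-extent pairs) can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `ExtentRef`, `locate`, and `EXTENT_SIZE`, and the 1 MiB extent size, are assumptions chosen only to show how a logical offset in a LUN could be resolved to a spindle and extent.

```python
from dataclasses import dataclass

# Hypothetical fixed extent size; the text only says extents have a fixed size.
EXTENT_SIZE = 1 << 20  # 1 MiB


@dataclass(frozen=True)
class ExtentRef:
    """One <spindle #, extent #> pair from a LUN's ordered extent list."""
    spindle: int
    extent: int


def locate(lun_extents, byte_offset, extent_size=EXTENT_SIZE):
    """Resolve a byte offset in a LUN's contiguous logical space.

    Returns the backing ExtentRef and the offset inside that extent.
    """
    if byte_offset < 0:
        raise ValueError("offset must be non-negative")
    index, within = divmod(byte_offset, extent_size)
    if index >= len(lun_extents):
        raise IndexError("offset is beyond the capacity of the LUN")
    return lun_extents[index], within


# A LUN whose three extents live on three different spindles (illustrative data).
lun_extents = [ExtentRef(0, 7), ExtentRef(2, 1), ExtentRef(1, 4)]
ref, within = locate(lun_extents, EXTENT_SIZE + 10)
print(ref, within)
```

The host only ever sees the contiguous logical offset; the lookup above is the kind of translation the storage system manager hides from it.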

FIG. 1B is a block diagram of a conventional NAS or file-level based storage system 40 that is connected to one or more computer systems 10 via network interface cards (NIC) 11' over a network 21 (e.g., Ethernet). Storage system 40 includes a storage system manager 41, which represents one or more programmed storage processors. The storage system manager 41 implements a file system 45 on top of the physical, typically disk drive-based storage units that reside in storage system 40, referred to in FIG. 1B as spindles 42. From a logical perspective, each of these spindles can be thought of as a sequential array of fixed-size extents 43. The file system 45 abstracts away the complexity of targeting read and write operations to the addresses of the actual spindles and extents of the disk drives by exposing to connected computer systems, such as computer systems 10, a namespace comprising directories and files that may be organized into file system level volumes 44 (hereinafter referred to as "FS volumes"), which are accessed through their respective mount points.

Even with the advancements in storage systems described above, it has been widely recognized that they are not sufficiently scalable to meet the particular needs of virtualized computer systems. For example, a cluster of server machines may service as many as 10,000 virtual machines (VMs), each VM using multiple "virtual disks" and multiple "snapshots," each of which may be stored, for example, as a file on a particular LUN or FS volume. Even at a scaled-down estimate of 2 virtual disks and 2 snapshots per VM, this amounts to a total of 60,000 distinct disks for the storage system to support if VMs were directly connected to physical disks (i.e., one virtual disk or snapshot per physical disk). In addition, storage device and topology management at this scale are known to be difficult. As a result, the concept of datastores was developed, in which VMs are multiplexed onto a smaller set of physical storage entities (e.g., LUN-based VMFS clustered file systems or FS volumes), as described in the document entitled "Providing Multiple Concurrent Access to a File System" (Patent Document 1), which is incorporated by reference herein.

In conventional storage systems employing LUNs or FS volumes, workloads from multiple VMs are typically serviced by a single LUN or a single FS volume. As a result, resource demands from one VM workload will affect the service levels provided to another VM workload on the same LUN or FS volume. Efficiency measures for storage, such as latency and input/output operations (IO) per second, or IOPS, therefore vary with the number of workloads in a given LUN or FS volume and cannot be guaranteed. Consequently, storage policies for storage systems employing LUNs or FS volumes cannot be executed on a per-VM basis, and service level agreement (SLA) guarantees cannot be given on a per-VM basis. In addition, data services provided by storage system vendors, such as snapshot, replication, encryption, and deduplication, are provided at the granularity of the LUNs or FS volumes, not at the granularity of a VM's virtual disk. As a result, snapshots can be created for an entire LUN or an entire FS volume using the data services provided by storage system vendors, but a snapshot of a single virtual disk of a VM cannot be created separately from the LUN or the file system in which the virtual disk is stored.

Patent Document 1: US Pat. No. 7,849,098

One or more embodiments are directed to a storage system that is configured to isolate the workloads running therein, so that SLA guarantees can be provided per workload and the data services of the storage system can be provided per workload, without requiring a radical redesign of the storage system. In a storage system that stores virtual disks for multiple virtual machines, SLA guarantees can be provided on a per-virtual-disk basis, and the data services of the storage system can be provided on a per-virtual-disk basis.

According to embodiments of the invention, the storage system exports logical storage volumes, referred to herein as "virtual volumes," that are provisioned as storage objects on a per-workload basis out of a logical storage capacity assignment, referred to herein as a "storage container." For a VM, a virtual volume may be created for each of the virtual disks and snapshots of the VM. In one embodiment, the virtual volumes are accessed on demand by connected computer systems using standard protocols, such as SCSI and NFS, through logical endpoints for the protocol traffic, known as "protocol endpoints," that are configured in the storage system.

A method for provisioning a logical storage volume for an application running in a computer system that is connected to a storage system via input/output command (IO) paths and non-IO paths, according to an embodiment of the invention, includes: selecting a logical storage container created in the storage system; issuing a request to the storage system via a non-IO path to create the logical storage volume in the selected logical storage container; and storing a unique identifier for the logical storage volume, received from the storage system in response to the request, and associating the unique identifier with the application running in the computer system.
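The provisioning steps above can be sketched as a short Python example. This is only an illustration under stated assumptions: `FakeStorageSystem`, `create_volume`, `provision_volume`, and the use of a UUID as the unique identifier are hypothetical stand-ins, and an in-memory object takes the place of the non-IO management path.

```python
import uuid


class FakeStorageSystem:
    """In-memory stand-in for the storage system (hypothetical)."""

    def __init__(self, container_names):
        # container name -> {volume id -> size in bytes}
        self.containers = {name: {} for name in container_names}

    def create_volume(self, container, size_bytes):
        """Create a logical storage volume and return its unique identifier."""
        if container not in self.containers:
            raise KeyError(f"unknown storage container: {container}")
        volume_id = str(uuid.uuid4())  # assumption: any unique token would do
        self.containers[container][volume_id] = size_bytes
        return volume_id


def provision_volume(storage, container, size_bytes, app_name, registry):
    """Select a container, request a volume, then record id -> application."""
    volume_id = storage.create_volume(container, size_bytes)  # non-IO path request
    registry.setdefault(app_name, []).append(volume_id)       # store and associate
    return volume_id


storage = FakeStorageSystem(["SC1"])
registry = {}
vid = provision_volume(storage, "SC1", 10 * 2**30, "vm-1", registry)
print(registry)
```

Keeping the identifier in a host-side registry mirrors the final step of the method: the computer system, not the storage system, remembers which application owns which volume.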

A method for re-provisioning a logical storage volume for an application running in a computer system that is connected to a storage system via IO paths and non-IO paths, according to an embodiment of the invention, includes: issuing a request to the storage system via a non-IO path to increase the size of the logical storage volume provisioned in the selected logical storage container; receiving acknowledgement of the increase in size from the storage system; and updating a metadata file associated with the logical storage volume to indicate the increased size.
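As a rough sketch of the re-provisioning order of operations (request, acknowledgement, then metadata update), consider the following Python example. The names `ResizableArray`, `resize`, and `grow_volume`, and the dictionary standing in for the metadata file, are assumptions made for illustration only.

```python
class ResizableArray:
    """In-memory stand-in for the storage system (hypothetical)."""

    def __init__(self):
        self.sizes = {}  # volume id -> size in bytes

    def resize(self, volume_id, new_size):
        """Return True to acknowledge a size increase, False otherwise."""
        if new_size <= self.sizes[volume_id]:
            return False
        self.sizes[volume_id] = new_size
        return True


def grow_volume(array, metadata, volume_id, new_size):
    """Update host-side metadata only after the array acknowledges the increase."""
    if not array.resize(volume_id, new_size):  # request issued over the non-IO path
        raise ValueError("storage system did not acknowledge the increase")
    metadata[volume_id]["size"] = new_size     # stands in for the metadata file
    return metadata[volume_id]


array = ResizableArray()
array.sizes["vol-1"] = 10
metadata = {"vol-1": {"size": 10}}
grow_volume(array, metadata, "vol-1", 20)
print(metadata)
```

Updating the metadata last means a rejected request leaves the host's view of the volume unchanged.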

According to another embodiment of the invention, a computer system connected to a storage system via IO paths and non-IO paths includes a management interface in a non-IO path and a storage interface in an IO path. The management interface is configured to: (i) generate a request to create a logical storage volume in the storage system and receive, in response to the request, a unique identifier for the logical storage volume; and (ii) generate a request to bind the logical storage volume to a protocol endpoint configured in the storage system and receive, in response to the request, first and second identifiers. The storage interface encodes IOs issued to the logical storage volume with the first and second identifiers.
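The split between the management interface (bind, over the non-IO path) and the storage interface (encode IO, over the IO path) can be sketched in Python as follows. The passage above only says that a bind request returns "first and second identifiers"; treating the first as naming the protocol endpoint and the second as naming the volume behind it is an assumption of this sketch, as are all class and function names.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Binding:
    first_id: str   # assumption: identifies the protocol endpoint
    second_id: int  # assumption: identifies the volume behind that endpoint


class BindingArray:
    """In-memory stand-in for the storage system (hypothetical)."""

    def __init__(self, endpoint_ids):
        self.endpoint_ids = list(endpoint_ids)
        self.bindings = {}  # Binding -> volume id
        self._next_second = count(1)

    def bind(self, volume_id):
        """Bind a volume to an endpoint and return the two identifiers."""
        binding = Binding(self.endpoint_ids[0], next(self._next_second))
        self.bindings[binding] = volume_id
        return binding


def encode_io(binding, op, offset, length):
    """Storage-interface step: tag an IO with both identifiers."""
    return {"first_id": binding.first_id, "second_id": binding.second_id,
            "op": op, "offset": offset, "length": length}


array = BindingArray(["PE-A"])
binding = array.bind("vol-1")              # management interface, non-IO path
io = encode_io(binding, "read", 0, 512)    # storage interface, IO path
target = array.bindings[Binding(io["first_id"], io["second_id"])]
print(io, target)
```

Because every IO carries both identifiers, the receiving side can resolve the target volume without the host addressing the volume directly.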

Embodiments of the present invention further include a non-transitory computer-readable storage medium storing instructions that, when executed by a computer system, cause the computer system to perform one of the methods set forth above.

A block diagram of a conventional block protocol-based storage device connected to one or more computer systems over a network.
A block diagram of a conventional NAS device connected to one or more computer systems over a network.
A block diagram of a block protocol-based storage system cluster that implements virtual volumes, according to an embodiment of the invention.
A block diagram of a NAS-based storage system cluster that implements virtual volumes, according to an embodiment of the invention.
A block diagram of components of the storage system cluster of FIG. 2A or FIG. 2B for managing virtual volumes, according to an embodiment of the invention.
A flow diagram of method steps for creating a storage container.
A block diagram of an embodiment of a computer system configured to implement virtual volumes hosted on a SAN-based storage system.
A block diagram of the computer system of FIG. 5A configured for virtual volumes hosted on a NAS-based storage system.
A block diagram of another embodiment of a computer system configured to implement virtual volumes hosted on a SAN-based storage system.
A block diagram of the computer system of FIG. 5C configured for virtual volumes hosted on a NAS-based storage system.
A simplified block diagram of a computer environment that illustrates components and communication paths used to manage virtual volumes, according to an embodiment of the invention.
A flow diagram of method steps for authenticating a computer system to the storage system cluster of FIG. 2A or FIG. 2B.
A flow diagram of method steps for creating a virtual volume, according to one embodiment.
A: a flow diagram of method steps for discovering protocol endpoints that are available to a computer system. B: a flow diagram of method steps for the storage system to discover protocol endpoints to which a computer system is connected via an in-band path.
A flow diagram of method steps for issuing and executing a virtual volume bind request, according to one embodiment.
A: a flow diagram of method steps for issuing an IO to a virtual volume, according to one embodiment. B: a flow diagram of method steps for issuing an IO to a virtual volume, according to one embodiment.
A flow diagram of method steps for performing an IO at a storage system, according to one embodiment.
A flow diagram of method steps for issuing and executing a virtual volume rebind request, according to one embodiment.
A conceptual diagram of the life cycle of a virtual volume.
A flow diagram of method steps for provisioning a VM, according to an embodiment that uses the storage system of FIG. 2A.
A: a flow diagram of method steps for powering on a VM. B: a flow diagram of method steps for powering off a VM.
A flow diagram of method steps for extending the size of a vvol of a VM.
A flow diagram of method steps for moving a vvol of a VM between storage containers.
A flow diagram of method steps for cloning a VM from a template VM.
A flow diagram of method steps for provisioning a VM, according to another embodiment.
A diagram showing sample storage capability profiles and a method for creating a storage container that includes a profile selection step.
A flow diagram illustrating method steps for creating a vvol and defining a storage capability profile for the vvol.
A flow diagram illustrating method steps for creating snapshots.

FIGS. 2A and 2B are block diagrams of a storage system cluster that implements "virtual volumes," according to embodiments of the invention. The storage system cluster includes one or more storage systems, e.g., storage systems 130_1 and 130_2, which may be disk arrays, each having a plurality of data storage units (DSUs), one of which is labeled 141 in the figures, and storage system managers 131 and 132 that control various operations of storage systems 130 to enable the embodiments of the invention described herein. In one embodiment, two or more storage systems 130 may implement a distributed storage system manager 135 that controls the operations of the storage system cluster as if they were a single logical storage system. The operational domain of the distributed storage system manager 135 may span storage systems installed in the same data center or across multiple data centers. For example, in one such embodiment, the distributed storage system manager 135 may comprise storage system manager 131, which serves as a "master" manager when communicating with storage system manager 132, which serves as a "slave" manager, although it should be recognized that a variety of alternative methods of implementing a distributed storage system manager may be used. DSUs represent physical storage units, e.g., disk or flash-based storage units such as rotating disks or solid state disks. According to embodiments, the storage system cluster creates "virtual volumes" (vvols) and exposes them to connected computer systems, such as computer systems 100_1 and 100_2, as further detailed herein. Applications running in computer systems 100 (e.g., VMs accessing their virtual disks) access the vvols on demand using standard protocols, such as SCSI in the embodiment of FIG. 2A and NFS in the embodiment of FIG. 2B, through logical endpoints for the SCSI or NFS protocol traffic, known as "protocol endpoints" (PEs), that are configured in storage systems 130. The communication path for application-related data operations from computer systems 100 to storage systems 130 is referred to herein as an "in-band" path. Communication paths between host bus adapters (HBAs) of computer systems 100 and PEs configured in storage systems 130, and between network interface cards (NICs) of computer systems 100 and PEs configured in storage systems 130, are examples of in-band paths. Communication paths from computer systems 100 to storage systems 130 that are not in-band, and that are typically used to carry out management operations, are referred to herein as "out-of-band" paths. Examples of out-of-band paths, such as an Ethernet network connection between computer systems 100 and storage systems 130, are illustrated in FIG. 6 separately from the in-band paths. For simplicity, computer systems 100 are shown as directly connected to storage systems 130. However, it should be understood that they may be connected to storage systems 130 through multiple paths and one or more switches.

The distributed storage system manager 135, or a single storage system manager 131 or 132, may create vvols (e.g., upon request of a computer system 100) from logical "storage containers," which represent a logical aggregation of physical DSUs. In general, a storage container may span more than one storage system, and many storage containers may be created by a single storage system manager or a distributed storage system manager. Similarly, a single storage system may contain many storage containers. In FIGS. 2A and 2B, storage container 142_A, created by the distributed storage system manager 135, is shown as spanning storage system 130_1 and storage system 130_2, whereas storage container 142_B and storage container 142_C are shown as being contained within a single storage system (i.e., storage system 130_1 and storage system 130_2, respectively). It should be recognized that, because a storage container can span more than one storage system, a storage system administrator can provision to customers a storage capacity that exceeds the storage capacity of any one storage system. It should further be recognized that, because multiple storage containers can be created within a single storage system, the storage system administrator can provision storage to multiple customers using a single storage system.

In the embodiment of FIG. 2A, each vvol is provisioned from a block-based storage system. In the embodiment of FIG. 2B, a NAS-based storage system implements a file system 145 on top of DSUs 141, and each vvol is exposed to computer system 100 as a file object within this file system. In addition, as described in further detail below, applications running on computer system 100 access vvols for IO through PEs. For example, as illustrated by dashed lines in FIGS. 2A and 2B, vvol 151 and vvol 152 are accessible via PE 161; vvol 153 and vvol 155 are accessible via PE 162; vvol 154 is accessible via PE 163 and PE 164; and vvol 156 is accessible via PE 165. It should be recognized that vvols from multiple storage containers, such as vvol 153 in storage container 142_A and vvol 155 in storage container 142_C, may be accessible via a single PE, such as PE 162, at any given time. It should further be recognized that PEs, such as PE 166, may exist in the absence of any vvols that are accessible via them.

In the embodiment of FIG. 2A, storage system 130 implements PEs as a special type of LUN using known methods for setting up LUNs. As with LUNs, storage system 130 provides each PE with a unique identifier known as a WWN (World Wide Name). In one embodiment, when creating a PE, storage system 130 does not specify a size for the special LUN, because the PEs described herein are not actual data containers. In one such embodiment, storage system 130 may assign a zero value or a very small value as the size of a PE-related LUN so that administrators can quickly identify PEs when requesting that a storage system provide a list of LUNs (e.g., traditional data LUNs and PE-related LUNs), as further discussed below. Similarly, storage system 130 may assign a number greater than 255 as the identifying number for the LUN of a PE, to indicate in a human-friendly way that these LUNs are not data LUNs. As another way to distinguish between PEs and LUNs, a PE bit may be added to the Extended Inquiry Data VPD page (page 86h). The PE bit is set to 1 when a LUN is a PE, and to 0 when it is a regular data LUN. Computer system 100 can discover the PEs via the in-band path by issuing the SCSI command REPORT_LUNS, and can determine whether they are PEs according to the embodiments described herein or conventional data LUNs by examining the indicated PE bit. Computer system 100 may optionally inspect the LUN size and LUN number properties to further confirm whether a LUN is a PE or a conventional LUN. It should be recognized that any one of the techniques described above may be used to distinguish a PE-related LUN from a regular data LUN. In one embodiment, the PE bit technique is the only technique used to distinguish a PE-related LUN from a regular data LUN.
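
The distinction between PE-related LUNs and regular data LUNs described above can be illustrated with a minimal Python sketch. It is not part of the disclosed embodiments: `ReportedLun`, `is_pe`, `secondary_hints`, and the `small_size_limit` threshold are hypothetical names chosen for illustration, and the sketch assumes the PE bit has already been read from the inquiry data rather than showing how it is laid out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedLun:
    """Simplified, hypothetical view of one LUN returned by a LUN listing."""
    lun_number: int   # identifying number of the LUN
    size_bytes: int   # advertised size; a PE may report zero or a very small size
    pe_bit: int       # 1 if the LUN is a PE, 0 if it is a regular data LUN


def is_pe(lun: ReportedLun) -> bool:
    """Primary check: the PE bit alone decides."""
    return lun.pe_bit == 1


def secondary_hints(lun: ReportedLun, small_size_limit: int = 0) -> dict:
    """Optional confirmation using the size and LUN number properties.

    small_size_limit is an assumed threshold for "very small"; the text
    does not give a concrete value.
    """
    return {
        "small_size": lun.size_bytes <= small_size_limit,
        "high_lun_number": lun.lun_number > 255,
    }


# Example listing with one data LUN and one PE (values are made up).
luns = [
    ReportedLun(lun_number=3, size_bytes=500 * 2**30, pe_bit=0),
    ReportedLun(lun_number=300, size_bytes=0, pe_bit=1),
]
pes = [lun.lun_number for lun in luns if is_pe(lun)]
```

In this sketch the PE bit is authoritative and the size and number checks only corroborate it, which mirrors the embodiment in which the PE bit technique is the only technique used.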

In the embodiment of FIG. 2B, the PEs are created in storage system 130 using known methods for setting up mount points to FS volumes. Each PE created in the embodiment of FIG. 2B is identified uniquely by an IP address and a file system path, which together are conventionally also called a "mount point." However, unlike conventional mount points, these PEs are not associated with FS volumes. In addition, unlike the PEs of FIG. 2A, the PEs of FIG. 2B are not discoverable by computer system 100 via the in-band path unless virtual volumes are bound to a given PE. Therefore, the PEs of FIG. 2B are reported by the storage system via the out-of-band path.

FIG. 3 is a block diagram of components of the storage system cluster of FIG. 2A or FIG. 2B for managing virtual volumes, according to one embodiment. The components include software modules of storage system managers 131 and 132 executing in storage systems 130 in one embodiment, or software modules of distributed storage system manager 135 in another embodiment, namely an input/output (I/O) manager 304, a volume manager 306, a container manager 308, and a data access layer 310. In the descriptions of the embodiments herein, it should be understood that any action taken by distributed storage system manager 135 may be taken by storage system manager 131 or storage system manager 132, depending on the embodiment.

In the example of FIG. 3, distributed storage system manager 135 has created three storage containers, SC1, SC2, and SC3, from DSUs 141, each of which is shown to have spindle extents labeled P1 through Pn. In general, each storage container has a fixed physical size and is associated with specific extents of DSUs. In the example shown in FIG. 3, distributed storage system manager 135 has access to a container database 316 that stores, for each storage container, its container ID, physical layout information, and some metadata. Container database 316 is managed and updated by container manager 308, which in one embodiment is a component of distributed storage system manager 135. The container ID is a universally unique identifier that is given to the storage container when the storage container is created. The physical layout information consists of the spindle extents of DSUs 141 that are associated with the given storage container, and is stored as an ordered list of <system ID, DSU ID, extent number>. The metadata section may contain some common metadata and some storage system vendor-specific metadata. For example, the metadata section may contain the IDs of computer systems, applications, or users that are permitted to access the storage container. As another example, the metadata section contains an allocation bitmap that denotes which <system ID, DSU ID, extent number> extents of the storage container are already allocated to existing vvols and which ones are free. In one embodiment, a storage system administrator may create separate storage containers for different business units so that vvols of different business units are not provisioned from the same storage container. Other policies for segregating vvols may also be applied. For example, a storage system administrator may adopt a policy that vvols of different customers of a cloud service are to be provisioned from different storage containers. Also, vvols may be grouped and provisioned from storage containers according to their required service levels. In addition, a storage system administrator may create, delete, and otherwise manage storage containers, for example by defining the number of storage containers that can be created and setting the maximum physical size that can be set per storage container.
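
As an illustration of the container database entry described above, the following Python sketch models one record with its ordered extent list and allocation bitmap. The class and field names (`ContainerRecord`, `layout`, `allocated`, `allowed_ids`) are assumptions made for this example; the text does not prescribe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerRecord:
    """Hypothetical in-memory form of one container database entry."""
    container_id: str
    layout: list                                    # ordered (system ID, DSU ID, extent number) tuples
    metadata: dict = field(default_factory=dict)    # common and vendor-specific metadata
    allocated: list = field(default_factory=list)   # allocation bitmap, one flag per extent

    def __post_init__(self):
        # A new container starts with every extent free.
        if not self.allocated:
            self.allocated = [False] * len(self.layout)

    def allocate(self, count: int) -> list:
        """Mark `count` free extents as allocated and return them in layout order."""
        if count < 0:
            raise ValueError("count must be non-negative")
        free = [i for i, used in enumerate(self.allocated) if not used]
        if count > len(free):
            raise ValueError("not enough free extents in container")
        chosen = free[:count]
        for i in chosen:
            self.allocated[i] = True
        return [self.layout[i] for i in chosen]

    def free_extents(self) -> list:
        """Return the extents that the bitmap still marks as free."""
        return [e for e, used in zip(self.layout, self.allocated) if not used]


# Example: a container spanning two storage systems (identifiers are made up).
record = ContainerRecord(
    container_id="container-1",
    layout=[("sys1", "dsu1", 0), ("sys1", "dsu1", 1), ("sys2", "dsu7", 0)],
    metadata={"allowed_ids": ["host-a"]},
)
taken = record.allocate(2)
```

A first-fit scan over the bitmap is used here only because it is the simplest policy to show; the text does not say how free extents are chosen.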

Also, in the example of FIG. 3, distributed storage system manager 135 has provisioned (on behalf of requesting computer systems 100) multiple vvols, each from a different storage container. In general, vvols may have a fixed physical size or may be thin provisioned, and each vvol has a vvol ID, which is a universally unique identifier that is given to the vvol when the vvol is created. A vvol database 314 stores, for each vvol, its vvol ID, the container ID of the storage container in which the vvol is created, and an ordered list of <offset, length> values within that storage container that comprise the address space of the vvol. Vvol database 314 is managed and updated by volume manager 306, which in one embodiment is a component of distributed storage system manager 135. In one embodiment, vvol database 314 also stores a small amount of metadata about the vvol. This metadata is stored in vvol database 314 as a set of key-value pairs, and may be updated and queried by computer systems 100 via the out-of-band path at any time during the vvol's existence. Stored key-value pairs fall into three categories. The first category is well-known keys: the definition of certain keys (and hence the interpretation of their values) is publicly available. One example is a key that corresponds to the virtual volume type (e.g., in virtual machine embodiments, whether the vvol contains a VM's metadata or a VM's data). Another example is the App ID, which is the ID of the application that stored data in the vvol. The second category is computer system-specific keys: the computer system or its management module stores certain keys and values as the virtual volume's metadata. The third category is storage system vendor-specific keys, which allow the storage system vendor to store certain keys associated with the virtual volume's metadata. One reason for a storage system vendor to use this key-value store for its metadata is that all of these keys are readily available to storage system vendor plug-ins and other extensions via the out-of-band channel for vvols. The store operations for key-value pairs are part of virtual volume creation and other processes, and thus the store operations should be reasonably fast. Storage systems are also configured to enable searches of virtual volumes based on exact matches to values provided on specific keys.
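
The vvol database entry and its key-value metadata can be sketched in Python as follows. This is an illustrative model only: `VvolRecord`, `VvolDatabase`, and the key names `"type"` and `"app_id"` are hypothetical, and a real system would expose updates and queries over the out-of-band path rather than as local method calls.

```python
from dataclasses import dataclass, field


@dataclass
class VvolRecord:
    """Hypothetical in-memory form of one vvol database entry."""
    vvol_id: str
    container_id: str
    segments: list                                 # ordered (offset, length) pairs in the container
    metadata: dict = field(default_factory=dict)   # key-value pairs about the vvol


class VvolDatabase:
    def __init__(self):
        self._records = {}

    def add(self, record: VvolRecord) -> None:
        self._records[record.vvol_id] = record

    def set_metadata(self, vvol_id: str, key: str, value: str) -> None:
        """Update one key-value pair at any time during the vvol's existence."""
        self._records[vvol_id].metadata[key] = value

    def find(self, key: str, value: str) -> list:
        """Exact-match search: IDs of all vvols whose `key` equals `value`."""
        return sorted(
            r.vvol_id
            for r in self._records.values()
            if key in r.metadata and r.metadata[key] == value
        )


# Example with made-up identifiers and key names.
db = VvolDatabase()
db.add(VvolRecord("vvol1", "container-1", [(0, 4096)], {"type": "vm-data", "app_id": "app-7"}))
db.add(VvolRecord("vvol2", "container-1", [(4096, 4096)], {"type": "vm-metadata"}))
db.set_metadata("vvol2", "app_id", "app-7")
matches = db.find("app_id", "app-7")
```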

IO manager 304 is a software module (also, in certain embodiments, a component of distributed storage system manager 135) that maintains a connection database 312 that stores currently valid IO connection paths between PEs and vvols. In the example shown in FIG. 3, seven currently valid IO sessions are shown. Each valid session has an associated PE ID, secondary level identifier (SLLID), vvol ID, and reference count (RefCnt) indicating the number of different applications that are performing IO through this IO session. The process of establishing a valid IO session between a PE and a vvol by distributed storage system manager 135 (e.g., on request by computer system 100) is referred to herein as a "bind" process. For each bind, distributed storage system manager 135 (e.g., via IO manager 304) adds an entry to connection database 312. The process of subsequently tearing down the IO session by distributed storage system manager 135 is referred to herein as an "unbind" process. For each unbind, distributed storage system manager 135 (e.g., via IO manager 304) decrements the reference count of the IO session by one. When the reference count of an IO session is zero, distributed storage system manager 135 (e.g., via IO manager 304) may delete the entry for that IO connection path from connection database 312. As previously discussed, in one embodiment, computer systems 100 generate bind and unbind requests and transmit them via the out-of-band path to distributed storage system manager 135. Alternatively, computer systems 100 may generate unbind requests and transmit them via the in-band path by overloading existing error paths. In one embodiment, the generation number is changed to a monotonically increasing number or a randomly generated number when the reference count changes from 0 to 1 or vice versa. In another embodiment, the generation number is a randomly generated number, the RefCnt column is eliminated from connection database 312, and for each bind, even when the bind request is to a vvol that is already bound, distributed storage system manager 135 (e.g., via IO manager 304) adds an entry to connection database 312.
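
The bind and unbind bookkeeping for the reference-counted embodiment can be sketched as below. Several points are assumptions rather than statements from the text: the caller chooses the PE, a repeated bind of the same vvol on the same PE reuses the existing session and increments its count, the entry is deleted as soon as the count reaches zero (the text only says it may be deleted), and SLLIDs are produced by a per-PE counter. The generation number is omitted.

```python
class ConnectionDatabase:
    """Hypothetical in-memory model of the connection database."""

    def __init__(self):
        self._sessions = {}   # (pe_id, sllid) -> {"vvol_id": str, "ref_cnt": int}
        self._counters = {}   # pe_id -> last SLLID number handed out for that PE

    def bind(self, pe_id: str, vvol_id: str) -> str:
        """Establish (or share) an IO session and return its SLLID."""
        for (p, sllid), session in self._sessions.items():
            if p == pe_id and session["vvol_id"] == vvol_id:
                session["ref_cnt"] += 1
                return sllid
        # New session: pick an SLLID that is unique among those of this PE.
        n = self._counters.get(pe_id, 0) + 1
        self._counters[pe_id] = n
        sllid = f"S{n:04d}"
        self._sessions[(pe_id, sllid)] = {"vvol_id": vvol_id, "ref_cnt": 1}
        return sllid

    def unbind(self, pe_id: str, sllid: str) -> None:
        """Decrement the reference count; drop the entry when it reaches zero."""
        session = self._sessions[(pe_id, sllid)]
        session["ref_cnt"] -= 1
        if session["ref_cnt"] == 0:
            del self._sessions[(pe_id, sllid)]

    def lookup(self, pe_id: str, sllid: str):
        """Return the vvol ID for a (PE ID, SLLID) pair, or None if not bound."""
        session = self._sessions.get((pe_id, sllid))
        return session["vvol_id"] if session else None

    def ref_count(self, pe_id: str, sllid: str) -> int:
        session = self._sessions.get((pe_id, sllid))
        return session["ref_cnt"] if session else 0
```

For example, two applications binding `vvol1` through `PE_A` would share one session with a reference count of 2, and the session would disappear only after both have unbound.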

In the storage system cluster of FIG. 2A, IO manager 304 processes IO requests (IOs) from computer systems 100 received through the PEs using connection database 312. When an IO is received at one of the PEs, IO manager 304 parses the IO to identify the PE ID and the SLLID contained in the IO in order to determine the vvol for which the IO was intended. Then, by accessing connection database 312, IO manager 304 can retrieve the vvol ID associated with the parsed PE ID and SLLID. In FIG. 3 and subsequent figures, the PE ID is shown as PE_A, PE_B, etc. for simplicity. In one embodiment, the actual PE IDs are the WWNs of the PEs. In addition, the SLLID is shown as S0001, S0002, etc. The actual SLLIDs are generated by distributed storage system manager 135 as any number that is unique among the SLLIDs associated with a given PE ID in connection database 312. The mapping between the logical address space of the virtual volume having the vvol ID and the physical locations of DSUs 141 is carried out by volume manager 306 using vvol database 314 and by container manager 308 using container database 316. Once the physical locations of DSUs 141 have been obtained, data access layer 310 (in one embodiment, also a component of distributed storage system manager 135) performs IO on these physical locations.
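
The two-stage mapping described above, from a vvol's logical address to a container offset (vvol database) and then to a physical extent (container database), can be sketched as follows. The sketch assumes that all spindle extents have the same size and that container offsets run through the container's extent list in order; neither assumption is stated in the text, and `EXTENT_SIZE` and the function names are hypothetical.

```python
EXTENT_SIZE = 1024  # assumed uniform extent size in bytes, for illustration only


def vvol_to_container_offset(segments, logical_offset):
    """Walk the vvol's ordered (offset, length) list to find the container offset."""
    if logical_offset < 0:
        raise ValueError("negative offset")
    remaining = logical_offset
    for seg_offset, seg_length in segments:
        if remaining < seg_length:
            return seg_offset + remaining
        remaining -= seg_length
    raise ValueError("offset beyond end of vvol")


def container_to_physical(extents, container_offset, extent_size=EXTENT_SIZE):
    """Map a container offset onto the container's ordered extent list."""
    index, within = divmod(container_offset, extent_size)
    if index >= len(extents):
        raise ValueError("offset beyond end of container")
    system_id, dsu_id, extent_number = extents[index]
    return system_id, dsu_id, extent_number, within


def resolve(segments, extents, logical_offset, extent_size=EXTENT_SIZE):
    """Full translation: vvol logical offset to a location on a DSU."""
    container_offset = vvol_to_container_offset(segments, logical_offset)
    return container_to_physical(extents, container_offset, extent_size)


# Example: a vvol made of two segments inside a three-extent container.
segments = [(2048, 512), (0, 1024)]
extents = [("sys1", "dsuA", 4), ("sys1", "dsuA", 5), ("sys2", "dsuB", 0)]
location = resolve(segments, extents, 600)
```

Here logical offset 600 falls 88 bytes into the second segment, which starts at container offset 0, so it lands 88 bytes into the first extent of the container.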

In the storage system cluster of FIG. 2B, IOs are received through the PEs, and each such IO contains an NFS handle (or similar file system handle) to which the IO has been issued. In one embodiment, connection database 312 for such a system contains the IP address of the NFS interface of the storage system as the PE ID and the file system path as the SLLID. The SLLIDs are generated based on the location of the vvol in file system 145. The mapping between the logical address space of the vvol and the physical locations of DSUs 141 is carried out by volume manager 306 using vvol database 314 and by container manager 308 using container database 316. Once the physical locations of DSUs 141 have been obtained, the data access layer performs IO on these physical locations. It should be recognized that, for the storage system of FIG. 2B, container database 312 may contain an ordered list of file:<offset, length> entries in the container location entry for a given vvol (i.e., a vvol can be composed of multiple file segments that are stored in file system 145).

In one embodiment, connection database 312 is maintained in volatile memory, while vvol database 314 and container database 316 are maintained in persistent storage, such as DSUs 141. In other embodiments, all of the databases 312, 314, 316 may be maintained in persistent storage.

FIG. 4 is a flow diagram of method steps 410 for creating a storage container. In one embodiment, these steps are carried out by storage system manager 131, storage system manager 132, or distributed storage system manager 135 under the control of a storage administrator. As noted above, a storage container represents a logical aggregation of physical DSUs and may span physical DSUs from more than one storage system. At step 411, the storage administrator (via distributed storage system manager 135, etc.) sets the physical capacity of the storage container. Within a cloud or data center, this physical capacity may, for example, represent the amount of physical storage that is leased by a customer. The flexibility provided by the storage containers disclosed herein is that storage containers of different customers can be provisioned by one storage administrator from the same storage system, and that a storage container for a single customer can be provisioned from multiple storage systems, e.g., in cases where the physical capacity of any one storage device is not sufficient to meet the size requested by the customer, or in cases such as replication where the physical storage footprint of a vvol will naturally span multiple storage systems. At step 412, the storage administrator sets permission levels for accessing the storage container. In a multi-tenant data center, for example, a customer may only access the storage container that has been leased to that customer. At step 413, distributed storage system manager 135 generates a unique identifier for the storage container. Then, at step 414, distributed storage system manager 135 (e.g., via container manager 308 in one embodiment) allocates free spindle extents of DSUs 141 to the storage container in sufficient quantities to meet the physical capacity set at step 411. As noted above, in cases where the free space of any one storage system is not sufficient to meet the physical capacity, distributed storage system manager 135 may allocate spindle extents of DSUs 141 from multiple storage systems. After the partitions have been allocated, distributed storage system manager 135 (e.g., via container manager 308) updates container database 316 with the unique container ID, an ordered list of <system number, DSU ID, extent number>, and the context IDs of the computer systems that are permitted to access the storage container.
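
Steps 411 through 414 can be summarized in a short Python sketch. It is a simplification under stated assumptions: capacity is expressed as a number of extents rather than bytes, the pool of free extents is a plain list shared by all storage systems, and the function name, parameters, and dictionary keys are hypothetical.

```python
import uuid


def create_storage_container(container_db, free_extents, capacity_extents, allowed_context_ids):
    """Create a container record and return its ID.

    container_db         dict standing in for the container database
    free_extents         list of free (system number, DSU ID, extent number) tuples
    capacity_extents     physical capacity set by the administrator (step 411)
    allowed_context_ids  computer systems permitted to access it (step 412)
    """
    if capacity_extents <= 0:
        raise ValueError("capacity must be positive")
    if capacity_extents > len(free_extents):
        raise ValueError("not enough free spindle extents")

    container_id = str(uuid.uuid4())              # step 413: unique identifier

    # Step 414: take free extents until the capacity is met. Because the pool
    # may hold extents of several storage systems, the result can span them.
    layout = free_extents[:capacity_extents]
    del free_extents[:capacity_extents]

    # Record the ID, the ordered extent list, and the permitted context IDs.
    container_db[container_id] = {
        "layout": layout,
        "allowed_context_ids": list(allowed_context_ids),
    }
    return container_id


# Example: no single system has three free extents, so the container spans two.
container_db = {}
pool = [("sys1", "dsu1", 0), ("sys1", "dsu1", 1), ("sys2", "dsu9", 0), ("sys2", "dsu9", 1)]
new_id = create_storage_container(container_db, pool, 3, ["host-a"])
```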

According to embodiments described herein, storage capability profiles, e.g., SLAs or quality of service (QoS), may be configured by distributed storage system manager 135 (e.g., on behalf of requesting computer systems 100) on a per-vvol basis. Therefore, it is possible for vvols with different storage capability profiles to be part of the same storage container. In one embodiment, a system administrator defines a default storage capability profile (or a number of possible storage capability profiles) for newly created vvols at the time of creation of the storage container, and stores it in the metadata section of container database 316. If a storage capability profile is not explicitly specified for a new vvol being created inside a storage container, the new vvol inherits the default storage capability profile associated with the storage container.
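
The inheritance rule for storage capability profiles is small enough to state as a function. The profile fields shown (`"iops"`, `"replicas"`) and the `"default_profile"` metadata key are invented for this example; the text does not define what a profile contains.

```python
from typing import Optional


def effective_profile(container_metadata: dict, requested: Optional[dict] = None) -> dict:
    """Return the profile a new vvol gets: the explicit one, else the container default."""
    if requested is not None:
        return dict(requested)
    return dict(container_metadata["default_profile"])


# Example container metadata with a default profile (values are made up).
container_metadata = {"default_profile": {"iops": 1000, "replicas": 1}}
inherited = effective_profile(container_metadata)
explicit = effective_profile(container_metadata, {"iops": 5000, "replicas": 2})
```

Because the profile is resolved per vvol, the two vvols in this example would live in the same storage container with different profiles.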

FIG. 5A is a block diagram of an embodiment of a computer system configured to implement virtual volumes hosted on the storage system cluster of FIG. 2A. Computer system 101 may be constructed on a conventional, typically server-class, hardware platform 500 that includes one or more central processing units (CPUs) 501, memory 502, one or more network interface cards (NICs) 503, and one or more host bus adapters (HBAs) 504. HBA 504 enables computer system 101 to issue IOs to virtual volumes through PEs configured in storage devices 130. As further shown in FIG. 5A, operating system 508 is installed on top of hardware platform 500, and a number of applications 512_1 to 512_N are executed on top of operating system 508. Examples of operating system 508 include any of the well-known commodity operating systems, such as Microsoft Windows, Linux, and the like.

According to the embodiments described herein, each application 512 has one or more vvols associated with it and issues IOs to block device instances of the vvols created by operating system 508 pursuant to "CREATE DEVICE" calls by application 512 into operating system 508. The association between block device names and vvol IDs is maintained in block device database 533. IOs from applications 512_2 to 512_N are received by file system driver 510, which converts them to block IOs and provides the block IOs to virtual volume device driver 532. IOs from application 512_1, on the other hand, are shown to bypass file system driver 510 and are provided directly to virtual volume device driver 532, signifying that application 512_1 accesses its block device directly as a raw storage device, e.g., as a database disk, a log disk, a backup archive, and a content repository, in the manner described in U.S. Pat. No. 7,155,558, entitled "Providing Access to a Raw Data Storage Unit in a Computer System," the entire contents of which are incorporated by reference herein. When virtual volume device driver 532 receives a block IO, it accesses block device database 533 to reference a mapping between the block device name specified in the IO and the PE ID (WWN of the PE LUN) and SLLID that define the IO connection path to the vvol associated with the block device name. In the example shown herein, the block device name "archive" corresponds to a block device instance of vvol12 that was created for application 512_1, and the block device names "foo," "dbase," and "log" correspond to block device instances of vvol1, vvol16, and vvol17, respectively, that were created for one or more of applications 512_2 to 512_N. Other information stored in block device database 533 includes an active bit value for each block device, which indicates whether the block device is active, and a CIF (commands-in-flight) value. An active bit of "1" signifies that IOs can be issued to the block device. An active bit of "0" signifies that the block device is inactive and IOs cannot be issued to the block device. The CIF value indicates how many IOs are in flight, i.e., issued but not completed. In the example shown herein, block device "foo" is active and has some commands in flight. Block device "archive" is inactive and will not accept newer commands; it is, however, waiting for two commands in flight to complete. Block device "dbase" is inactive with no outstanding commands. Finally, block device "log" is active, but the application currently has no pending IOs to the device. Virtual volume device driver 532 may choose to remove such devices from its database 533 at any time.
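The lookup and gating behavior of the block device database described above can be illustrated with a minimal Python sketch. The class and field names, the placeholder PE ID and SLLID strings, and the CIF count of 3 for "foo" are assumptions made purely for illustration; the description specifies only the kinds of information stored (block device name, vvol ID, PE ID, SLLID, active bit, CIF) and the states of the four example devices.

```python
from dataclasses import dataclass


@dataclass
class BlockDeviceEntry:
    vvol_id: str
    pe_id: str    # WWN of the PE LUN (placeholder string here)
    sllid: str    # second-level LUN ID (placeholder string here)
    active: bool  # active bit: True means "1", False means "0"
    cif: int = 0  # commands in flight: issued but not yet completed


class BlockDeviceDatabase:
    """Hypothetical in-memory stand-in for block device database 533."""

    def __init__(self):
        self._entries = {}

    def add(self, name, entry):
        self._entries[name] = entry

    def issue_io(self, name):
        """Resolve a block device name to its (PE ID, SLLID) path and count the IO."""
        entry = self._entries[name]
        if not entry.active:
            raise PermissionError(f"block device {name!r} is inactive")
        entry.cif += 1
        return entry.pe_id, entry.sllid

    def complete_io(self, name):
        """Record completion of one in-flight IO."""
        entry = self._entries[name]
        if entry.cif == 0:
            raise ValueError(f"no commands in flight for {name!r}")
        entry.cif -= 1

    def cif(self, name):
        return self._entries[name].cif


# The active bits and the CIF values of "archive", "dbase", and "log" follow
# the description; the CIF of "foo" and all PE ID / SLLID strings are made up.
db = BlockDeviceDatabase()
db.add("foo", BlockDeviceEntry("vvol1", "pe-wwn-a", "sllid-1", active=True, cif=3))
db.add("archive", BlockDeviceEntry("vvol12", "pe-wwn-a", "sllid-12", active=False, cif=2))
db.add("dbase", BlockDeviceEntry("vvol16", "pe-wwn-b", "sllid-16", active=False, cif=0))
db.add("log", BlockDeviceEntry("vvol17", "pe-wwn-b", "sllid-17", active=True, cif=0))
```

In this sketch an IO to "log" succeeds and raises its CIF count, while an IO to "archive" is refused because its active bit is 0, even though two earlier commands are still counted as in flight.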

In addition to performing the mapping described above, virtual volume device driver 532 issues raw block-level IOs to data access layer 540. Data access layer 540 includes device access layer 534, which applies command queuing and scheduling policies to the raw block-level IOs, and device driver 536 for HBA 504, which formats the raw block-level IOs in a protocol-compliant format and sends them to HBA 504 for forwarding to the PEs via an in-band path. In embodiments where the SCSI protocol is used, the vvol information is encoded in the SCSI LUN data field, which is an 8-byte structure, as specified in SAM-5 (SCSI Architecture Model-5). The PE ID is encoded in the first 2 bytes, which are conventionally used for the LUN ID, and the vvol information, in particular the SLLID, is encoded in the SCSI second-level LUN ID, utilizing (a portion of) the remaining 6 bytes.
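The byte layout described above (PE ID in the first 2 bytes, SLLID in the remaining 6 bytes of the 8-byte LUN field) can be sketched as follows. This is a simplified illustration, not the SAM-5 encoding itself: it assumes both identifiers are plain unsigned integers packed big-endian, and it ignores the address-method bits and other details that SAM-5 defines. The function names are hypothetical.

```python
from typing import Tuple


def encode_lun_field(pe_id: int, sllid: int) -> bytes:
    """Pack a PE ID (2 bytes) and an SLLID (6 bytes) into an 8-byte field."""
    if not 0 <= pe_id < 1 << 16:
        raise ValueError("PE ID must fit in 2 bytes")
    if not 0 <= sllid < 1 << 48:
        raise ValueError("SLLID must fit in 6 bytes")
    return pe_id.to_bytes(2, "big") + sllid.to_bytes(6, "big")


def decode_lun_field(field: bytes) -> Tuple[int, int]:
    """Split an 8-byte field back into (PE ID, SLLID)."""
    if len(field) != 8:
        raise ValueError("LUN field must be exactly 8 bytes")
    return int.from_bytes(field[:2], "big"), int.from_bytes(field[2:], "big")
```

A round trip through `encode_lun_field` and `decode_lun_field` returns the original pair, which is the property a receiving PE would rely on to route the IO to the right vvol.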

As further shown in FIG. 5A, data access layer 540 also includes an error handling unit 542 for handling IO errors that are received through the in-band path from the storage system. In one embodiment, the IO errors received by error handling unit 542 are propagated through the PEs by I/O manager 304. Examples of IO error classes include path errors between computer system 101 and the PEs, PE errors, and vvol errors. Error handling unit 542 classifies all detected errors into these classes. When a path error to a PE is encountered and another path to the PE exists, data access layer 540 transmits the IO along a different path to the PE. When the IO error is a PE error, error handling unit 542 updates block device database 533 to indicate an error condition for each block device issuing IOs through the PE. When the IO error is a vvol error, error handling unit 542 updates block device database 533 to indicate an error condition for each block device associated with the vvol. Error handling unit 542 may also issue an alarm or system event so that further IOs to block devices having the error condition will be rejected.
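The three error classes and their handling can be summarized in a short Python sketch. The data shapes (a dictionary of block devices with `pe_id`, `vvol_id`, and `error` keys), the function name, and the returned action tuples are assumptions for illustration. The description does not say what happens when a path error occurs and no alternate path exists, so that branch is a placeholder.

```python
from enum import Enum


class IOErrorClass(Enum):
    PATH = "path"
    PE = "pe"
    VVOL = "vvol"


def handle_io_error(error_class, devices, pe_id=None, vvol_id=None, alternate_paths=()):
    """Apply the per-class handling and return a short action description.

    devices maps a block device name to a dict with "pe_id", "vvol_id",
    and "error" keys (a hypothetical stand-in for the block device database).
    """
    if error_class is IOErrorClass.PATH:
        if alternate_paths:
            # Another path to the PE exists: resend the IO along it.
            return ("resend", alternate_paths[0])
        # Not specified in the description; placeholder outcome.
        return ("no_alternate_path", None)
    if error_class is IOErrorClass.PE:
        hit = [n for n, d in devices.items() if d["pe_id"] == pe_id]
    elif error_class is IOErrorClass.VVOL:
        hit = [n for n, d in devices.items() if d["vvol_id"] == vvol_id]
    else:
        raise ValueError(f"unknown error class: {error_class!r}")
    for name in hit:
        devices[name]["error"] = True  # further IOs to these devices get rejected
    return ("marked", sorted(hit))
```

A PE error marks every block device whose IOs go through that PE, whereas a vvol error marks only the block devices tied to that one vvol.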

FIG. 5B is a block diagram of the computer system of FIG. 5A that has been configured to interface with the storage system cluster of FIG. 2B instead of the storage system cluster of FIG. 2A. In this embodiment, data access layer 540 includes an NFS client 545 and a device driver 546 for NIC 503. NFS client 545 maps the block device name to a PE ID (the IP address of the NAS storage system) and an SLLID, which is an NFS file handle corresponding to the block device. This mapping is stored in block device database 533 as shown in FIG. 5B. Note that the "Active" and "CIF" columns are still present but are not illustrated in the block device database 533 shown in FIG. 5B. As will be described below, an NFS file handle uniquely identifies a file object within the NAS storage system and may be generated during the bind process. Alternatively, in response to a request to bind the vvol, the NAS storage system returns the PE ID and the SLLID, and opening the vvol using regular in-band mechanisms (e.g., lookup or readdirplus) yields the NFS file handle. NFS client 545 also translates the raw block-level IOs received from virtual volume device driver 532 into NFS file-based IOs. Device driver 546 for NIC 503 then formats the NFS file-based IOs in a protocol-compliant format and sends them to NIC 503, along with the NFS handle, for forwarding to one of the PEs via an in-band path.

FIG. 5C is a block diagram of another embodiment of a computer system configured to implement virtual volumes. In this embodiment, computer system 102 is configured with virtualization software, shown herein as hypervisor 560. Hypervisor 560 is installed on top of hardware platform 550, which includes CPU 551, memory 552, NIC 553, and HBA 554, and supports a virtual machine execution space 570 within which multiple virtual machines (VMs) 571_1 to 571_N may be concurrently instantiated and executed. In one or more embodiments, hypervisor 560 and virtual machines 571 are implemented using the VMware vSphere® product distributed by VMware, Inc. of Palo Alto, Calif. Each virtual machine 571 implements a virtual hardware platform 573 that supports the installation of a guest operating system (OS) 572 capable of executing applications 579. Examples of guest OS 572 include any of the well-known commodity operating systems, such as Microsoft Windows, Linux, and the like. In each instance, guest OS 572 includes a native file system layer (not shown in FIG. 5C), for example, either an NTFS or an ext3FS type file system layer. These file system layers interface with virtual hardware platform 573 to access, from the perspective of guest OS 572, a data storage HBA, which in reality is virtual HBA 574 implemented by virtual hardware platform 573 that provides the appearance of disk storage support (in reality, virtual disks 575_A to 575_X) to enable execution of guest OS 572. In certain embodiments, virtual disks 575_A to 575_X may appear, from the perspective of guest OS 572, to support the SCSI standard for connecting to the virtual machine, or any other appropriate hardware connection interface standard known to those with ordinary skill in the art, including IDE, ATA, and ATAPI. Although, from the perspective of guest OS 572, file system calls initiated by such guest OS 572 to implement file system-related data transfer and control operations appear to be routed to virtual disks 575_A to 575_X for final execution, in reality such calls are processed and passed through virtual HBA 574 to adjunct virtual machine monitors (VMMs) 561_1 to 561_N, which implement the virtual system support needed to coordinate operation with hypervisor 560. In particular, HBA emulator 562 functionally enables the data transfer and control operations to be correctly handled by hypervisor 560, which ultimately passes such operations through its various layers to HBA 554, which connects to storage system 130.

According to the embodiments described herein, each VM 571 has one or more vvols associated with it and issues IOs to block device instances of the vvols created by hypervisor 560 pursuant to "CREATE DEVICE" calls by VM 571 into hypervisor 560. The association between block device names and vvol IDs is maintained in block device database 580. IOs from VMs 571_2 to 571_N are received by a SCSI virtualization layer 563, which converts them into file IOs understood by a virtual machine file system (VMFS) driver 564. VMFS driver 564 then converts the file IOs to block IOs and provides the block IOs to virtual volume device driver 565. IOs from VM 571_1, on the other hand, are shown to bypass VMFS driver 564 and are provided directly to virtual volume device driver 565, signifying that VM 571_1 accesses its block device directly as a raw storage device, e.g., as a database disk, a log disk, a backup archive, and a content repository, in the manner described in U.S. Pat. No. 7,155,558.

When virtual volume device driver 565 receives a block IO, it accesses block device database 580 to reference a mapping between the block device name specified in the IO and the PE ID and SLLID that define the IO session to the vvol associated with the block device name. In the example shown herein, the block device names "dbase" and "log" correspond to block device instances of vvol1 and vvol4, respectively, that were created for VM 571_1, and the block device names "vmdk2," "vmdkn," and "snapn" correspond to block device instances of vvol12, vvol16, and vvol17, respectively, that were created for one or more of VMs 571_2 to 571_N. Other information stored in block device database 580 includes an active bit value for each block device, which indicates whether the block device is active, and a CIF (commands-in-flight) value. An active bit of "1" signifies that IOs can be issued to the block device. An active bit of "0" signifies that the block device is inactive and IOs cannot be issued to the block device. The CIF value indicates how many IOs are in flight, i.e., issued but not completed.

In addition to performing the mapping described above, virtual volume device driver 565 issues raw block-level IOs to data access layer 566. Data access layer 566 includes device access layer 567, which applies command queuing and scheduling policies to the raw block-level IOs, and device driver 568 for HBA 554, which formats the raw block-level IOs in a protocol-compliant format and sends them to HBA 554 for forwarding to the PEs via an in-band path. In embodiments where the SCSI protocol is used, the vvol information is encoded in the SCSI LUN data field, which is an 8-byte structure, as specified in SAM-5 (SCSI Architecture Model-5). The PE ID is encoded in the first 2 bytes, which are conventionally used for the LUN ID, and the vvol information, in particular the SLLID, is encoded in the SCSI second-level LUN ID, utilizing (a portion of) the remaining 6 bytes. As further shown in FIG. 5C, data access layer 566 also includes an error handling unit 569, which functions in the same manner as error handling unit 542.

FIG. 5D is a block diagram of the computer system of FIG. 5C that has been configured to interface with the storage system cluster of FIG. 2B instead of the storage system cluster of FIG. 2A. In this embodiment, data access layer 566 includes an NFS client 585 and a device driver 586 for NIC 553. NFS client 585 maps the block device name to a PE ID (IP address) and an SLLID (NFS file handle) corresponding to the block device. This mapping is stored in block device database 580 as shown in FIG. 5D. Note that the "Active" and "CIF" columns are still present but are not illustrated in the block device database 580 shown in FIG. 5D. As will be described below, an NFS file handle uniquely identifies a file object within the NFS and, in one embodiment, is generated during the bind process. NFS client 585 also translates the raw block-level IOs received from virtual volume device driver 565 into NFS file-based IOs. Device driver 586 for NIC 553 then formats the NFS file-based IOs in a protocol-compliant format and sends them to NIC 553, along with the NFS handle, for forwarding to one of the PEs via an in-band path.

It should be recognized that the various terms, layers, and categorizations used to describe the components in FIGS. 5A-5D may be referred to differently without departing from their functionality or the spirit or scope of the invention. For example, VMM 561 may be considered a separate virtualization component between VM 571 and hypervisor 560 (which, in such a conception, may itself be considered a virtualization "kernel" component), since there exists a separate VMM for each instantiated VM. Alternatively, each VMM 561 may be considered a component of its corresponding virtual machine, since such a VMM includes the hardware emulation components for the virtual machine. In such an alternative conception, for example, the conceptual layer described as virtual hardware platform 573 may be merged with and into VMM 561 such that virtual host bus adapter 574 is removed from FIGS. 5C and 5D (i.e., since its functionality is effectuated by host bus adapter emulator 562).

FIG. 6 is a simplified block diagram of a computer environment that illustrates components and communication paths used to manage vvols according to an embodiment of the invention. As previously described, the communication path for IO protocol traffic is referred to as the in-band path and is shown in FIG. 6 as dashed line 601, which connects data access layer 540 of the computer system (through an HBA or NIC provided in the computer system) with one or more PEs configured in storage systems 130. The communication paths used to manage vvols are out-of-band paths (as previously defined, paths that are not "in-band") and are shown in FIG. 6 as solid lines 602. According to the embodiments described herein, vvols can be managed through plug-in 612 provided in management server 610 and/or plug-in 622 provided in each of computer systems 103, only one of which is shown in FIG. 6. On the storage device side, a management interface 625 is configured by storage system manager 131 and a management interface 626 is configured by storage system manager 132. In addition, a management interface 624 is configured by distributed storage system manager 135. Each management interface communicates with plug-ins 612, 622. To facilitate the issuing and handling of management commands, special application programming interfaces (APIs) have been developed. It should be recognized that, in one embodiment, both plug-ins 612, 622 are customized to communicate with storage hardware from a particular storage system vendor. Therefore, management server 610 and computer systems 103 will employ different plug-ins when communicating with storage hardware from different storage system vendors. In another embodiment, there can be a single plug-in that interacts with any vendor's management interface. This would require the storage system manager to be programmed to a well-known interface (e.g., by virtue of being published by the computer system and/or the management server).

Management server 610 is further configured with a system manager 611 for managing the computer systems. In one embodiment, the computer systems are executing virtual machines, and system manager 611 manages the virtual machines running in the computer systems. One example of system manager 611 that manages virtual machines is the vSphere® product distributed by VMware, Inc. As shown, system manager 611 communicates with a host daemon (hostd) 621 running in computer system 103 (through appropriate hardware interfaces at both management server 610 and computer system 103) to receive resource usage reports from computer system 103 and to initiate various management operations on applications running in computer system 103.

FIG. 7 is a flow diagram of method steps for authenticating a computer system to the storage system cluster of FIG. 2A or 2B using an authentication-related API. These method steps are initiated when a computer system requests authentication by transmitting its secure socket layer (SSL) certificate to the storage system. At step 710, the storage system issues a prompt for authentication credentials (e.g., username and password) to the computer system requesting authentication. Upon receipt of the authentication credentials at step 712, the storage system compares them against stored credentials at step 714. If the correct credentials are provided, the storage system stores the SSL certificate of the authenticated computer system in a key store (step 716). If incorrect credentials are provided, the storage system ignores the SSL certificate and returns an appropriate error message (step 718). After being authenticated, the computer system may invoke the APIs to issue management commands to the storage system over SSL links, and the unique context ID included in the SSL certificate is used by the storage system to enforce certain policies, such as defining which computer systems may access which storage containers. In some embodiments, the context IDs of the computer systems may be used in managing the permissions granted to them. For example, a host computer may be permitted to create a vvol but not to delete or snapshot it, or a host computer may be permitted to create a snapshot of a vvol but not to clone it. In addition, permissions may vary in accordance with the user-level privileges of the users who are logged into authenticated computer systems.
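The flow of steps 710 to 718 and the later per-context permission check can be sketched in Python as follows. The class, method, and operation names are hypothetical, the certificate is treated as an opaque string, and credentials are held in plain text only to keep the sketch short; none of these details come from the description.

```python
import hmac


class StorageSystemAuthenticator:
    """Hypothetical model of the storage-system side of FIG. 7."""

    def __init__(self, stored_credentials, permissions):
        self._stored = dict(stored_credentials)  # username -> password (illustration only)
        self._permissions = dict(permissions)    # context ID -> set of allowed operations
        self.key_store = {}                      # context ID -> SSL certificate

    def authenticate(self, context_id, certificate, username, password):
        """Steps 712-718: compare credentials, then store or ignore the certificate."""
        expected = self._stored.get(username)
        ok = expected is not None and hmac.compare_digest(
            expected.encode(), password.encode()
        )
        if ok:
            self.key_store[context_id] = certificate  # step 716
            return True
        return False  # step 718: certificate ignored, caller reports the error

    def is_permitted(self, context_id, operation):
        """Allow an operation only for authenticated contexts that hold the permission."""
        if context_id not in self.key_store:
            return False
        return operation in self._permissions.get(context_id, set())
```

With permissions such as `{"host-1": {"create_vvol"}}`, an authenticated host could create a vvol but would be refused a delete or snapshot request, mirroring the example given in the text.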

FIG. 8 is a flow diagram of method steps for creating a virtual volume using a create virtual volumes API command. In one embodiment, computer system 103 issues the create virtual volumes API command to the storage system via out-of-band path 602 when, at step 802, it receives from one of its applications a request to create a vvol having a certain size and storage capability profile, such as minimum IOPS and average latency. In response, computer system 103, at step 804, selects a storage container (from among those that computer system 103 and the requesting application are permitted to access and that have sufficient free capacity to accommodate the request) and issues the create virtual volumes API command via plug-in 622 to the storage system. The API command includes a storage container ID, the vvol size, and the storage capability profile of the vvol. In another embodiment, the API command includes a set of key-value pairs that the application requires the storage system to store with the newly created vvol. In another embodiment, management server 610 issues the create virtual volumes API command (via plug-in 612) to the storage system via out-of-band path 602.

At step 806, the storage system manager receives the request to generate the vvol via the management interface (e.g., management interface 624, 625, or 626) and accesses the selected storage container's metadata section in container database 316 to verify that the request context, comprising computer system 103 and the application, has sufficient permissions to create a vvol in the selected storage container. In one embodiment, an error message is returned to computer system 103 if the permission level is not sufficient. If the permission level is sufficient, a unique vvol ID is generated at step 810. Then, at step 812, the storage system manager scans the allocation bitmap in the metadata section of container database 316 to determine the free partitions of the selected storage container. The storage system manager allocates free partitions of the selected storage container sufficient to accommodate the requested vvol size and updates the allocation bitmap in the storage container's metadata section of container database 316. The storage system manager also updates vvol database 314 with a new vvol entry. The new vvol entry includes the vvol ID generated at step 810, an ordered list of the newly allocated storage container extents, and the metadata of the new vvol expressed as key-value pairs. Then, at step 814, the storage system manager transmits the vvol ID to computer system 103. At step 816, computer system 103 associates the vvol ID with the application that requested creation of the vvol. In one embodiment, one or more vvol descriptor files are maintained for each application, and the vvol ID is written into the vvol descriptor file maintained for the application that requested the creation of the vvol.
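Steps 806 to 814 can be sketched as a single Python function. The data shapes are assumptions: the storage container is a dictionary holding a set of permitted requesters, a fixed partition size, and a list of booleans as the allocation bitmap, and extents are represented simply as partition indices. The use of a UUID for the vvol ID and the error raised when too few partitions are free are likewise illustrative choices, not taken from the description.

```python
import uuid


def create_vvol(container, vvol_db, requester, size, metadata):
    """Check permission, allocate free partitions, and record the new vvol."""
    # Step 806: verify the request context against the container's metadata.
    if requester not in container["allowed"]:
        raise PermissionError("insufficient permission for this storage container")

    # Step 810: generate a unique vvol ID (UUID chosen for illustration).
    vvol_id = str(uuid.uuid4())

    # Step 812: scan the allocation bitmap for free partitions.
    bitmap = container["bitmap"]
    needed = -(-size // container["partition_size"])  # ceiling division
    free = [i for i, used in enumerate(bitmap) if not used]
    if len(free) < needed:
        raise RuntimeError("not enough free partitions")  # assumed behavior

    # Allocate just enough partitions and update the bitmap.
    extents = free[:needed]
    for index in extents:
        bitmap[index] = True

    # Record the new entry: ordered extent list plus key-value metadata.
    vvol_db[vvol_id] = {"extents": extents, "metadata": dict(metadata)}

    # Step 814: the vvol ID is what gets sent back to the computer system.
    return vvol_id
```

For a container with 10-unit partitions and bitmap `[True, False, False, True, False]`, a request for 25 units needs three partitions and would take indices 1, 2, and 4, leaving the container full.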

As shown in FIGS. 2A and 2B, not all vvols are connected to PEs. A vvol that is not connected to a PE is not aware of IOs issued by a corresponding application because no IO session has been established to the vvol. Before IOs can be issued to a vvol, the vvol undergoes a bind process, as a result of which the vvol is bound to a particular PE. Once a vvol is bound to a PE, IOs can be issued to the vvol until the vvol is unbound from the PE.

In one embodiment, the bind request is issued by computer system 103 to the storage system via out-of-band path 602 using a bind virtual volume API. The bind request identifies the vvol to be bound (using the vvol ID), and in response the storage system binds the vvol to a PE to which computer system 103 is connected via an in-band path. FIG. 9A is a flow diagram of method steps for the computer system to discover the PEs to which it is connected via an in-band path. PEs configured in SCSI protocol-based storage devices are discovered via an in-band path using the standard SCSI command, REPORT_LUNS. PEs configured in NFS protocol-based storage devices are discovered via an out-of-band path using an API. The method steps of FIG. 9A are carried out by the computer system for each connected storage system.

At step 910, the computer system determines whether the connected storage system is SCSI protocol-based or NFS protocol-based. If the storage system is SCSI protocol-based, the SCSI command REPORT_LUNS is issued by the computer system in-band to the storage system (step 912). Then, at step 913, the computer system examines the response from the storage system, in particular the PE bit associated with each of the returned PE IDs, to distinguish between PE-related LUNs and conventional data LUNs. If the storage system is NFS protocol-based, an API call is issued by the computer system out-of-band from plug-in 622 to the management interface (e.g., management interface 624, 625, or 626) to obtain the IDs of the available PEs (step 914). At step 916, which follows steps 913 and 914, the computer system stores the PE IDs of the PE-related LUNs returned by the storage system, or the PE IDs returned by the management interface, for use during the bind process. It should be recognized that each PE ID returned by a SCSI protocol-based storage device includes a WWN, and each PE ID returned by an NFS protocol-based storage device includes an IP address and a mount point.
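
The discovery flow of FIG. 9A can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in this specification: `LunEntry`, `discover_pes`, and the two callables passed in are hypothetical stand-ins for an already-parsed REPORT_LUNS response and for the out-of-band management API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LunEntry:
    """One entry of a hypothetical, already-parsed REPORT_LUNS response."""
    wwn: str
    pe_bit: bool  # True for a PE-related LUN, False for a conventional data LUN


def discover_pes(protocol, report_luns=None, query_management_api=None):
    """Return the PE IDs to store for later bind requests (steps 910-916)."""
    if protocol == "scsi":
        # Steps 912-913: in-band REPORT_LUNS, then keep only PE-related LUNs.
        return [entry.wwn for entry in report_luns() if entry.pe_bit]
    if protocol == "nfs":
        # Step 914: out-of-band API call; each PE ID is (IP address, mount point).
        return list(query_management_api())
    raise ValueError(f"unsupported protocol: {protocol!r}")
```

In this sketch the caller would run `discover_pes` once per connected storage system and keep the returned list for use during binding.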

FIG. 9B is a flow diagram of method steps for storage system manager 131, storage system manager 132, or distributed storage system manager 135 (hereinafter referred to as "the storage system manager") to discover the PEs to which a given computer system 103 is connected via an in-band path. Discovery of such PEs by the storage system manager enables the storage system to return, in response to a bind request from a computer system, a valid PE ID to which the requesting computer system is actually connected. At step 950, the storage system manager issues an out-of-band "Discover_Topology" API call to computer system 103 via the management interface and plug-in 622. Computer system 103 returns its system ID and a list of all PE IDs that it discovered via the flow diagram of FIG. 9A. In one embodiment, the storage system manager carries out step 950 by issuing the "Discover_Topology" API call to management server 610 via the management interface and plug-in 612. In such an embodiment, the storage system receives multiple responses, each containing a computer system ID and the associated PE IDs, one for each computer system 103 that management server 610 manages. Then, at step 952, the storage system manager processes the results from step 950. For example, the storage system manager removes from the list all PE IDs that are not under its current control. Certain PE IDs received by storage system manager 135 when issuing a Discover_Topology call may correspond to another storage system connected to the same computer system. Similarly, certain received PE IDs may correspond to older PEs that have since been deleted by the storage system administrator, and so on. At step 954, the storage system manager caches the processed results for use during subsequent bind requests. In one embodiment, the storage system manager runs the steps of FIG. 9B periodically to update its cached results with ongoing changes to the computer systems and the network topology. In another embodiment, the storage system manager runs the steps of FIG. 9B every time it receives a new vvol creation request. In yet another embodiment, the storage system manager runs the steps of FIG. 9B after running the authentication steps of FIG. 7.
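
Steps 952 and 954 amount to filtering and caching. The sketch below assumes the Discover_Topology responses arrive as `(system_id, pe_ids)` pairs and that the storage system manager knows the set of PE IDs it currently exposes; both the function name and these shapes are illustrative assumptions.

```python
def process_topology(responses, owned_pe_ids):
    """Steps 952-954: drop PE IDs not under this manager's control, then cache.

    responses: iterable of (system_id, [pe_id, ...]) pairs.
    owned_pe_ids: set of PE IDs this storage system currently exposes.
    """
    cache = {}
    for system_id, pe_ids in responses:
        # PE IDs of other storage systems, or of deleted PEs, are filtered out.
        cache[system_id] = [pe for pe in pe_ids if pe in owned_pe_ids]
    return cache
```

The returned mapping is what a later bind request would consult to find PEs that the requesting computer system can actually reach.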

FIG. 10 is a flow diagram of method steps for issuing and executing a virtual volume bind request using the bind virtual volume API. In one embodiment, computer system 103 issues the bind request to the storage system via out-of-band path 602 when one of its applications requests IO access to a block device associated with a vvol that has not yet been bound to a PE. In another embodiment, management server 610 issues the bind request in connection with certain VM management operations, including VM power-on and migration of a vvol from one storage container to another.

Continuing with the example described above, in which an application requests IO access to a block device associated with a vvol that has not yet been bound to a PE, computer system 103 at step 1002 determines the vvol ID of the vvol from block device database 533 (or 580). Then, at step 1004, computer system 103 issues a request to bind the vvol to the storage system through out-of-band path 602.

At step 1006, the storage system manager receives the request to bind the vvol via the management interface (e.g., management interface 624, 625, or 626), and then carries out step 1008, which includes selecting the PE to which the vvol is to be bound, generating an SLLID and a generation number for the selected PE, and updating connection database 312 (e.g., via IO manager 304). The PE to which the vvol is to be bound is selected according to connectivity (that is, only PEs that have an existing in-band connection to computer system 103 are available for selection) and other factors such as current IO traffic through the available PEs. In one embodiment, the storage system selects from the processed and cached list of PEs that computer system 103 sent to it in accordance with the method of FIG. 9B. SLLID generation differs between embodiments employing the storage system cluster of FIG. 2A and embodiments employing the storage system cluster of FIG. 2B. In the former case, an SLLID that is unique for the selected PE is generated. In the latter case, the file path to the file object corresponding to the vvol is generated as the SLLID. After the SLLID and the generation number have been generated for the selected PE, connection database 312 is updated to include the newly created IO session for the vvol. Then, at step 1010, the ID of the selected PE, the generated SLLID, and the generation number are returned to computer system 103. Optionally, in embodiments employing the storage system cluster of FIG. 2B, a unique NFS file handle may be generated for the file object corresponding to the vvol and returned to computer system 103 along with the ID of the selected PE, the generated SLLID, and the generation number. At step 1012, computer system 103 updates block device database 533 (or 580) to include the PE ID, the SLLID (and optionally the NFS handle), and the generation number returned from the storage system. In particular, each set of PE ID, SLLID (and optionally the NFS handle), and generation number returned from the storage system is added as a new entry to block device database 533 (or 580). It should be recognized that the generation number is used to guard against replay attacks. Therefore, in embodiments where replay attacks are not a concern, the generation number is not used.
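
The bind handling of steps 1006 to 1010 can be sketched as below. The class name, the least-loaded selection rule, and the use of simple counters for the SLLID and the generation number are assumptions made for illustration; the specification only requires that the selected PE be reachable in-band and that the SLLID be unique for that PE (FIG. 2A case).

```python
import itertools


class BindManager:
    """Hypothetical storage-side bind handler (FIG. 2A style SLLIDs)."""

    def __init__(self, reachable_pes, pe_load):
        self.reachable_pes = reachable_pes  # system_id -> [pe_id], cached topology
        self.pe_load = pe_load              # pe_id -> current IO load (assumed metric)
        self.connections = {}               # (pe_id, sllid) -> session record
        self._sllid = itertools.count(1)
        self._gen = itertools.count(1)

    def bind(self, system_id, vvol_id):
        candidates = self.reachable_pes.get(system_id, [])
        if not candidates:
            raise LookupError("no PE with an in-band connection to this host")
        # Step 1008: pick a connected PE; here, the one with the least IO load.
        pe_id = min(candidates, key=lambda pe: self.pe_load.get(pe, 0))
        sllid = next(self._sllid)  # a FIG. 2B system would use the file path instead
        gen = next(self._gen)      # generation number, used against replay attacks
        self.connections[(pe_id, sllid)] = {"vvol_id": vvol_id, "gen": gen, "refs": 1}
        return pe_id, sllid, gen   # step 1010: returned to the computer system
```

The computer system would then record the returned triple as a new entry in its block device database, as in step 1012.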

On subsequent bind requests to the same vvol, initiated by a different application that wishes to issue IOs to that vvol, the storage system manager may bind the vvol to the same or a different PE. If the vvol is bound to the same PE, the storage system manager returns the ID of that PE and the previously generated SLLID, and increments the reference count of this IO connection path stored in connection database 312. If, on the other hand, the vvol is bound to a different PE, the storage system manager generates a new SLLID, returns the ID of the different PE and the newly generated SLLID, and adds this new IO connection path to the vvol as a new entry in connection database 312.

A virtual volume unbind request may be issued using an unbind virtual volume API. An unbind request includes the PE ID and the SLLID of the IO connection path through which the vvol was previously bound. The processing of the unbind request is, however, advisory. The storage system manager is free to unbind the vvol from the PE immediately or after a delay. The unbind request is processed by updating connection database 312 to decrement the reference count of the entry containing the PE ID and the SLLID. If the reference count is decremented to zero, the entry may be deleted. Note that, in this case, the vvol continues to exist but is no longer available for IO using the given PE ID and SLLID.
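
The reference counting described for repeated bind requests and for unbind requests can be illustrated with a small in-memory stand-in for connection database 312. The class and its layout are hypothetical; in particular, deleting the entry as soon as the count reaches zero is only one permitted behavior, since unbind is advisory.

```python
class ConnectionDB:
    """Hypothetical in-memory stand-in for connection database 312."""

    def __init__(self):
        self.entries = {}      # (pe_id, sllid) -> [vvol_id, ref_count]
        self._next_sllid = 1

    def bind(self, vvol_id, pe_id):
        for (pe, sllid), entry in self.entries.items():
            if pe == pe_id and entry[0] == vvol_id:
                entry[1] += 1  # same PE: reuse the SLLID, bump the reference count
                return pe, sllid
        sllid = self._next_sllid  # different PE: new SLLID and a new entry
        self._next_sllid += 1
        self.entries[(pe_id, sllid)] = [vvol_id, 1]
        return pe_id, sllid

    def unbind(self, pe_id, sllid):
        entry = self.entries[(pe_id, sllid)]
        entry[1] -= 1
        if entry[1] == 0:
            # The entry may be deleted; a real system could also defer this.
            del self.entries[(pe_id, sllid)]
```

After the last unbind the vvol itself still exists; only the IO connection path identified by that PE ID and SLLID is gone.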

In the case of a vvol that implements a virtual disk of a VM, the reference count for this vvol is at least one. When the VM is powered off and an unbind request is issued in connection therewith, the reference count is decremented by one. If the reference count is zero, the vvol entry may be removed from connection database 312. In general, removing entries from connection database 312 is beneficial because IO manager 304 manages less data and can also recycle SLLIDs. Such benefits become significant when the total number of vvols stored by the storage system is large (e.g., on the order of millions of vvols) but the total number of vvols being actively accessed by applications is small (e.g., tens of thousands of VMs). In addition, when a vvol is not bound to any PE, the storage system has greater latitude in choosing where to store the vvol in DSUs 141. For example, the storage system can be implemented with asymmetric, hierarchical DSUs 141, where some DSUs 141 provide faster data access and others provide slower data access (e.g., to save on storage costs). In one implementation, when a vvol is not bound to any PE (which can be determined by checking the reference count of the vvol's entries in connection database 312), the storage system can migrate the vvol to a slower and/or cheaper type of physical storage. Then, once the vvol is bound to a PE, the storage system can migrate the vvol to a faster type of physical storage. It should be recognized that such migrations can be accomplished by changing one or more elements of the ordered list of container locations that make up the given vvol in vvol database 314, and by updating the corresponding extent allocation bitmap in the metadata section of container database 316.

Binding and unbinding vvols to PEs enables the storage system manager to determine vvol liveness. The storage system manager can use this information to perform storage system vendor-specific optimizations on vvols that are not serving IO (passive) and vvols that are serving IO (active). For example, the storage system manager may be configured to relocate a vvol from a low-latency (high-cost) SSD to a mid-latency (low-cost) hard drive if it remains in a passive state beyond a particular threshold of time.
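
One way such a liveness-based placement policy might look is sketched below. The function, the tier names, and the rule that an active vvol stays on the fast tier are illustrative assumptions; the specification leaves these optimizations to the storage system vendor.

```python
def choose_tier(ref_count, passive_since, now, threshold_s):
    """Return "ssd" or "hdd" for a vvol, given its binding state.

    ref_count: bindings currently recorded for the vvol.
    passive_since: time (seconds) at which the vvol last became passive.
    """
    if ref_count > 0:
        return "ssd"  # active: keep on low-latency storage (assumed policy)
    if now - passive_since > threshold_s:
        return "hdd"  # passive for longer than the threshold: demote
    return "ssd"      # passive, but not for long enough yet
```

A periodic background task could call this for each vvol and trigger a migration whenever the returned tier differs from the current one.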

FIGS. 11A and 11B are flow diagrams of method steps for issuing an IO to a virtual volume, according to one embodiment. FIG. 11A is a flow diagram of method steps 1100 for issuing an IO from an application directly to a raw block device, and FIG. 11B is a flow diagram of method steps 1120 for issuing an IO from an application through a file system driver.

Method 1100 begins at step 1102, where an application, such as application 512 shown in FIGS. 5A-5B or VM 571 shown in FIGS. 5C-5D, issues an IO to a raw block device. At step 1104, virtual volume device driver 532 or 565 generates a raw block-level IO from the IO issued by the application. At step 1106, the name of the raw block device is translated to a PE ID and SLLID by virtual volume device driver 532 or 565 (and also to an NFS handle by NFS client 545 or 585 in embodiments employing the storage device of FIG. 2B). At step 1108, data access layer 540 or 566 encodes the PE ID and SLLID (and also the NFS handle in embodiments employing the storage device of FIG. 2B) into the raw block-level IO. Then, at step 1110, the HBA/NIC issues the raw block-level IO.

For non-VM applications, such as application 512 shown in FIGS. 5A-5B, method 1120 begins at step 1121. At step 1121, the application issues an IO to a file stored on a vvol-based block device. Then, at step 1122, the file system driver, e.g., file system driver 510, generates a block-level IO from the file IO. After step 1122, steps 1126, 1128, and 1130, which are identical to steps 1106, 1108, and 1110, are carried out.

For VM applications, such as VM 571 shown in FIGS. 5C-5D, method 1120 begins at step 1123. At step 1123, the VM issues an IO to its virtual disk. Then, at step 1124, this IO is translated to a file IO, e.g., by SCSI virtualization layer 563. The file system driver, e.g., VMFS driver 564, then generates a block-level IO from the file IO at step 1125. After step 1125, steps 1126, 1128, and 1130, which are identical to steps 1106, 1108, and 1110, are carried out.

FIG. 12 is a flow diagram of method steps for performing an IO at a storage system, according to one embodiment. At step 1210, an IO issued by a computer system is received through one of the PEs configured in the storage system. The IO is parsed by IO manager 304 at step 1212. After step 1212, step 1214a is carried out by IO manager 304 if the storage system cluster is of the type shown in FIG. 2A, and step 1214b is carried out by IO manager 304 if the storage system cluster is of the type shown in FIG. 2B. At step 1214a, IO manager 304 extracts the SLLID from the parsed IO and accesses connection database 312 to determine the vvol ID corresponding to the PE ID and the extracted SLLID. At step 1214b, IO manager 304 extracts the NFS handle from the parsed IO and identifies the vvol using the PE ID and the NFS handle as the SLLID. Step 1216 is carried out after steps 1214a and 1214b. At step 1216, vvol database 314 and container database 316 are accessed by volume manager 306 and container manager 308, respectively, to obtain the physical storage locations on which the IO is to be performed. Then, at step 1218, data access layer 310 performs the IO on the physical storage locations obtained at step 1216.
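
The vvol lookup of steps 1214a and 1214b can be sketched as a keyed lookup. Representing the parsed IO as a dictionary and resolving both cluster types through the same mapping are simplifying assumptions; `resolve_vvol` and the `"2A"`/`"2B"` labels are hypothetical names.

```python
def resolve_vvol(io, cluster_type, connection_db):
    """Steps 1214a/1214b: map a parsed IO to a vvol ID.

    io: dict with "pe_id" plus "sllid" (FIG. 2A) or "nfs_handle" (FIG. 2B).
    connection_db: mapping of (pe_id, sllid) -> vvol_id.
    """
    if cluster_type == "2A":
        key = (io["pe_id"], io["sllid"])
    elif cluster_type == "2B":
        key = (io["pe_id"], io["nfs_handle"])  # the NFS handle serves as the SLLID
    else:
        raise ValueError(f"unknown cluster type: {cluster_type!r}")
    return connection_db[key]
```

The resulting vvol ID is what steps 1216 and 1218 would then translate into physical storage locations.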

In some situations, an application (application 512 or VM 571), management server 610, and/or the storage system manager may determine that a binding of a vvol to a particular PE is experiencing issues, such as when the PE becomes overloaded with too many bindings. As a way to resolve such issues, a bound vvol may be rebound by the storage system manager to a different PE, even while IO commands are being directed to the vvol. FIG. 13 is a flow diagram of method steps 1300 for issuing and executing a vvol rebind request, according to one embodiment, using a rebind API.

As shown, method 1300 begins at step 1302, where the storage system manager determines that a vvol should be bound to a second PE that is different from the first PE to which the vvol is currently bound. At step 1304, the storage system manager issues, via an out-of-band path, a request to rebind the vvol to the computer system (e.g., computer system 103) running an application that issues IOs to the vvol. At step 1306, computer system 103 receives the rebind request from the storage system manager and, in response, issues a request to bind the vvol to a new PE. At step 1308, the storage system manager receives the bind request and, in response, binds the vvol to the new PE. At step 1310, the storage system manager transmits to the computer system the ID of the new PE to which the vvol is now also bound and an SLLID for accessing the vvol, as described above in connection with FIG. 10.

At step 1312, the computer system receives the new PE ID and the SLLID from the storage system manager. In block device database 533 or 580, the active bit of the new PE connection is initially set to 1, meaning that a new IO session for the vvol via the new PE has been established. The computer system also sets the active bit of the first PE connection to 0, signifying that no more IOs can be issued to the vvol through this PE connection. It should be recognized that this PE connection should not be unbound immediately upon deactivation, because there may be IOs to the vvol through this PE connection that are in flight, that is, issued but not yet completed. Therefore, at step 1314, the computer system accesses block device database 533 or 580 to check whether all "commands in flight" (CIFs) issued to the vvol through the first PE connection have been completed, that is, whether CIF = 0. The computer system waits for the CIF count to reach zero before carrying out step 1318. In the meantime, additional IOs to the vvol are issued through the new PE, since the active bit of the new PE connection is already set to 1. When the CIF count reaches zero, step 1318 is carried out, where a request to unbind the first PE connection is issued to the storage system manager. Then, at step 1320, the storage system manager unbinds the vvol from the first PE. Also, the computer system issues all additional IOs to the vvol through the new PE at step 1324.
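
The host-side bookkeeping of steps 1312 to 1318 can be sketched as follows. `PathState`, `HostVvol`, and the `unbind_requests` list (which stands in for the out-of-band unbind call) are hypothetical; the point illustrated is that the old PE connection is deactivated at once but unbound only after its commands-in-flight count drains to zero.

```python
class PathState:
    """One PE connection entry for a vvol in the block device database."""

    def __init__(self, pe_id, sllid, active):
        self.pe_id, self.sllid, self.active, self.cif = pe_id, sllid, active, 0


class HostVvol:
    def __init__(self, old_path):
        self.paths = [old_path]
        self.unbind_requests = []  # stands in for out-of-band unbind calls

    def on_rebind(self, new_pe_id, new_sllid):
        for path in self.paths:
            path.active = False    # no new IO on the old connection
        self.paths.append(PathState(new_pe_id, new_sllid, True))
        self.reap()

    def issue_io(self):
        path = next(p for p in self.paths if p.active)
        path.cif += 1              # command is now in flight on this path
        return path

    def complete_io(self, path):
        path.cif -= 1
        self.reap()

    def reap(self):
        # Step 1318: unbind an inactive connection only once CIF is zero.
        for path in [p for p in self.paths if not p.active and p.cif == 0]:
            self.unbind_requests.append((path.pe_id, path.sllid))
            self.paths.remove(path)
```

New IOs flow through the new PE immediately after `on_rebind`, while IOs already issued on the first PE are allowed to complete before the unbind request is sent.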

FIG. 14 is a conceptual diagram of a lifecycle of a virtual volume, according to one embodiment. All commands shown in FIG. 14, namely create, snapshot, clone, bind, unbind, extend, and delete, form a vvol management command set and are accessible through plug-ins 612, 622 described above in connection with FIG. 6. As shown, when a vvol is generated as a result of any of the commands "create vvol," "snapshot vvol," or "clone vvol," the generated vvol remains in a "passive" state, in which it is not bound to a particular PE and therefore cannot receive IOs. In addition, when any of the commands "snapshot vvol," "clone vvol," or "extend vvol" is executed while the vvol is in a passive state, the original vvol and the newly created vvol (if any) remain in the passive state. As also shown, when a vvol in a passive state is bound to a PE, the vvol enters an "active" state. Conversely, when an active vvol is unbound from a PE, the vvol enters a passive state, assuming that the vvol is not bound to any additional PEs. When any of the commands "snapshot vvol," "clone vvol," "extend vvol," or "rebind vvol" is executed while the vvol is in an active state, the original vvol remains in the active state and the newly created vvol (if any) remains in the passive state.
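
The state transitions of the original vvol in FIG. 14 can be written as a small transition function. This is a sketch only: the function name and the `remaining_bindings` parameter are assumptions, and the create and delete commands are left out because the text does not describe them as transitions between the passive and active states.

```python
def next_state(state, command, remaining_bindings=0):
    """Return the original vvol's state after a management command.

    state: "passive" or "active".
    remaining_bindings: bindings left once an unbind has been processed.
    """
    if command == "bind":
        return "active"
    if command == "unbind":
        # The vvol turns passive only when no other PE binding remains.
        return "active" if remaining_bindings > 0 else "passive"
    if command in ("snapshot", "clone", "extend", "rebind"):
        if command == "rebind" and state != "active":
            raise ValueError("rebind applies only to an active vvol")
        return state  # the original keeps its state; any new vvol starts passive
    raise ValueError(f"unsupported command: {command!r}")
```

Any vvol produced by snapshot or clone would begin in the passive state regardless of the state of the original.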

As described above, a VM may have multiple virtual disks, and a separate vvol is created for each virtual disk. The VM also has metadata files that describe the configuration of the VM. The metadata files include a VM configuration file, VM log files, disk descriptor files (one for each of the virtual disks of the VM), a VM swap file, and so on. A disk descriptor file for a virtual disk contains information relating to the virtual disk, such as its vvol ID, its size, whether it is thinly provisioned, and the IDs of one or more snapshots created for the virtual disk. The VM swap file provides swap space for the VM on the storage system. In one embodiment, these VM configuration files are stored in a vvol, referred to herein as a metadata vvol.

FIG. 15 is a flow diagram of method steps for provisioning a VM, according to one embodiment. In this embodiment, management server 610, a computer system hosting the VM, e.g., computer system 102 shown in FIG. 5C (hereinafter referred to as the "host computer"), and the storage system cluster of FIG. 2A, in particular storage system manager 131, 132, or 135, are used. As shown, the storage system manager receives a request to provision a VM at step 1502. This may be a request generated when a VM administrator, using an appropriate user interface to management server 610, issues a command to management server 610 to provision a VM having a certain size and storage capability profile. In response, at step 1504, management server 610 initiates the method for creating a vvol to contain the VM's metadata (hereinafter referred to as the "metadata vvol") in the manner described above in connection with FIG. 8, pursuant to which the storage system manager at step 1508 creates the metadata vvol and returns the vvol ID of the metadata vvol to management server 610. At step 1514, management server 610 registers the vvol ID of the metadata vvol back to the computer system hosting the VM. At step 1516, the host computer initiates the method for binding the metadata vvol to a PE in the manner described above in connection with FIG. 10, pursuant to which the storage system manager at step 1518 binds the metadata vvol to a PE and returns the PE ID and the SLLID to the host computer.

At step 1522, the host computer creates a block device instance of the metadata vvol using the "CREATE DEVICE" call into the host computer's operating system. Then, at step 1524, the host computer creates a file system (e.g., VMFS) on top of the block device, in response to which a file system ID (FSID) is returned. At step 1526, the host computer mounts the file system having the returned FSID and stores the metadata of the VM in the namespace associated with this file system. Examples of the metadata include VM log files, disk descriptor files (one for each of the virtual disks for the VM), and a VM swap file.

At step 1528, the host computer initiates the method for creating a vvol for each of the virtual disks of the VM (each such vvol is referred to herein as a "data vvol") in the manner described above in conjunction with FIG. 8, pursuant to which the storage system manager at step 1530 creates the data vvol and returns the vvol ID of the data vvol to the host computer. At step 1532, the host computer stores the ID of the data vvol in the disk descriptor file for the virtual disk. The method ends with the unbinding of the metadata vvol (not shown) after data vvols have been created for all of the virtual disks of the VM.

FIG. 16A is a flow diagram of method steps for powering on a VM after the VM has been provisioned in the manner described in conjunction with FIG. 15. FIG. 16B is a flow diagram of method steps for powering off a VM after the VM has been powered on. Both methods are carried out by the host computer for the VM.

Upon receiving a VM power-on command at step 1608, the ID of the metadata vvol corresponding to the VM is retrieved at step 1610. Then, at step 1612, the metadata vvol undergoes a bind process as described above in conjunction with FIG. 10. At step 1614, the file system is mounted on the metadata vvol so that, at step 1616, the metadata files for the data vvols, in particular the disk descriptor files, can be read and the data vvol IDs obtained. At step 1618, the data vvols then undergo the bind process, one by one, as described above in conjunction with FIG. 10.

Upon receiving a VM power-off command at step 1620, the data vvols of the VM are marked as inactive in the block device database (e.g., block device database 580 of FIG. 5C), and the host computer waits for the CIFs associated with each of those data vvols to reach zero (step 1622). As the CIF associated with each data vvol reaches zero, the host computer at step 1624 requests the storage system to unbind that data vvol. After the CIFs associated with all of the data vvols have reached zero, the metadata vvol is marked as inactive in the block device database at step 1626. Then, at step 1628, when the CIF associated with the metadata vvol reaches zero, the host computer at step 1630 requests that the metadata vvol be unbound.
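By way of illustration only, the ordering constraint of FIG. 16B (data vvols are drained and unbound before the metadata vvol is marked inactive) might be sketched as follows. The `Vvol` class, the `wait_for_drain` callback, and the `active`/`bound` flags are hypothetical stand-ins for the block device database entries and storage system calls; they are not taken from the embodiments.

```python
from dataclasses import dataclass


@dataclass
class Vvol:
    vvol_id: str
    cif: int = 0         # commands in flight (CIF) still outstanding
    active: bool = True  # hypothetical flag kept in the block device database
    bound: bool = True   # whether the vvol is currently bound to a PE


def power_off(data_vvols, metadata_vvol, wait_for_drain):
    """Return the IDs of the vvols in the order their unbind was requested.

    wait_for_drain is a hypothetical callback that blocks until vvol.cif == 0.
    """
    unbound = []
    for vvol in data_vvols:          # step 1620: mark data vvols inactive
        vvol.active = False
    for vvol in data_vvols:          # steps 1622-1624: drain, then unbind
        wait_for_drain(vvol)
        assert vvol.cif == 0
        vvol.bound = False
        unbound.append(vvol.vvol_id)
    metadata_vvol.active = False     # step 1626: only after all data vvols drained
    wait_for_drain(metadata_vvol)    # step 1628
    metadata_vvol.bound = False      # step 1630
    unbound.append(metadata_vvol.vvol_id)
    return unbound


def drain_immediately(vvol):
    # Stand-in for real I/O completion: pretend every in-flight command finishes.
    vvol.cif = 0


disks = [Vvol("data-1", cif=3), Vvol("data-2")]
meta = Vvol("meta-1", cif=1)
order = power_off(disks, meta, drain_immediately)
```

In this sketch the metadata vvol is always the last entry of `order`, mirroring steps 1626 through 1630.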

FIGS. 17 and 18 are flow diagrams of method steps for reprovisioning a VM. In the examples illustrated herein, FIG. 17 is a flow diagram of method steps, executed on the host computer, for extending the size of a vvol of a VM, in particular a data vvol for a virtual disk of the VM, and FIG. 18 is a flow diagram of method steps, executed in the storage system, for moving a vvol of a VM between storage containers.

The method for extending the size of a data vvol for a VM's virtual disk begins at step 1708, where the host computer determines whether the VM is powered on. If the host computer determines at step 1708 that the VM is not powered on, it retrieves the ID of the metadata vvol corresponding to the VM at step 1710. Then, at step 1712, the host computer initiates the bind process for the metadata vvol. After the bind, at step 1714, the host computer mounts a file system on the metadata vvol and retrieves the ID of the data vvol corresponding to the virtual disk from the disk descriptor file for the virtual disk, which is a file in the file system mounted on the metadata vvol. Then, at step 1716, the host computer sends an extend-vvol API call to the storage system; the call includes the ID of the data vvol and the new size of the data vvol.

If the VM is powered on, the host computer retrieves, at step 1715, the ID of the data vvol of the VM's virtual disk that is to be extended. It should be recognized from the method of FIG. 16A that this ID can be obtained from the disk descriptor file associated with the VM's virtual disk. Then, at step 1716, the host computer sends an extend-vvol API call to the storage system; the call includes the ID of the data vvol and the new size of the data vvol.

As a result of the extend-vvol API call, the vvol database and the container database (e.g., vvol database 314 and container database 316 of FIG. 3) are updated in the storage system to reflect the increased address space of the vvol. Upon receiving acknowledgement that the extend-vvol API call has completed, the host computer at step 1718 updates the disk descriptor file for the VM's virtual disk with the new size. Then, at step 1720, the host computer determines whether the VM is powered on. If it is not, the host computer at step 1722 unmounts the file system and sends a request to the storage system to unbind the metadata vvol. If, on the other hand, the VM is powered on, the method terminates.
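As a rough sketch of the FIG. 17 flow, and under the assumption that the storage system can be modeled as an object exposing `bind`, `unbind`, and `extend_vvol` methods (hypothetical names, as is the dictionary layout used for the VM and its disk descriptor), the two branches for a powered-on and a powered-off VM might look like this. Mounting and unmounting the file system are elided.

```python
class FakeStorage:
    """Hypothetical stand-in for the storage system; records the calls it receives."""

    def __init__(self):
        self.calls = []

    def bind(self, vvol_id):
        self.calls.append(("bind", vvol_id))

    def unbind(self, vvol_id):
        self.calls.append(("unbind", vvol_id))

    def extend_vvol(self, vvol_id, new_size):
        self.calls.append(("extend", vvol_id, new_size))
        return True  # acknowledgement that the call completed


def extend_data_vvol(vm, new_size, storage):
    powered_on = vm["powered_on"]                      # step 1708
    if not powered_on:
        storage.bind(vm["metadata_vvol_id"])           # steps 1710-1712
    descriptor = vm["disk_descriptor"]                 # read via the mounted file system
    data_vvol_id = descriptor["data_vvol_id"]          # step 1714 or step 1715
    if storage.extend_vvol(data_vvol_id, new_size):    # step 1716
        descriptor["size"] = new_size                  # step 1718
    if not powered_on:                                 # steps 1720-1722
        storage.unbind(vm["metadata_vvol_id"])


storage = FakeStorage()
vm = {
    "powered_on": False,
    "metadata_vvol_id": "meta-7",
    "disk_descriptor": {"data_vvol_id": "data-7", "size": 100},
}
extend_data_vvol(vm, 200, storage)
```

For a powered-on VM the same function would issue only the `extend` call, since the metadata vvol is already bound and its file system already mounted.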

The method for moving a vvol of a VM that is currently bound to a PE from a source storage container to a destination storage container, where both the source storage container and the destination storage container are within the scope of the same storage system manager, begins at step 1810, where the container IDs of the source and destination storage containers (SC1 and SC2, respectively) and the vvol ID of the vvol to be moved are received. Then, at step 1812, the vvol database (e.g., vvol database 314 of FIG. 3) and the extent allocation bitmap of the container database (e.g., container database 316 of FIG. 3) are updated as follows. First, the storage system manager removes the vvol extents in SC1 from SC1's entry in container database 316, and then assigns these extents to SC2 by modifying SC2's entry in container database 316. In one embodiment, the storage system may compensate for the loss of storage capacity in SC1 (due to the removal of the vvol storage extents) by assigning new spindle extents to SC1, and may offset the increase in storage capacity in SC2 (due to the addition of the vvol storage extents) by removing some unused spindle extents from SC2. At step 1814, the storage system manager determines whether the currently bound PE is able to optimally service IO to the vvol's new location. One example of a case in which the current PE is unable to service IO to the vvol's new location is when the storage administrator has statically configured the storage system manager to assign different PEs to vvols from different customers and hence from different storage containers. If the current PE is unable to service IO to the vvol, the vvol, at step 1815, undergoes the rebind process (and the associated changes to a connection database, e.g., connection database 312 of FIG. 3) described above in conjunction with FIG. 13. After step 1815, step 1816 is carried out, where an acknowledgement of successful completion of the move is returned to the host computer. If, at step 1814, the storage system manager determines that the current PE is able to service IO to the vvol's new location, step 1815 is bypassed and step 1816 is performed next.
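The bookkeeping of steps 1812 through 1816 might be sketched, purely for illustration, with Python sets standing in for the extent allocation bitmaps. The dictionary layouts, the one-PE-per-container model, and the `pe_can_serve` predicate are assumptions made for this sketch; spindle-extent rebalancing and the connection database are omitted.

```python
def move_vvol(vvol_id, sc1, sc2, vvol_db, container_db, pe_can_serve):
    """Reassign a vvol's extents from container sc1 to sc2 and rebind if needed."""
    entry = vvol_db[vvol_id]
    if entry["container"] != sc1:
        raise ValueError(f"{vvol_id} is not in {sc1}")
    extents = set(entry["extents"])
    container_db[sc1]["vvol_extents"] -= extents   # step 1812: remove from SC1
    container_db[sc2]["vvol_extents"] |= extents   # step 1812: assign to SC2
    entry["container"] = sc2
    rebound = not pe_can_serve(entry["pe"], sc2)   # step 1814
    if rebound:                                    # step 1815: rebind to another PE
        entry["pe"] = container_db[sc2]["pe"]
    return {"ack": True, "rebound": rebound}       # step 1816


container_db = {
    "SC1": {"vvol_extents": {10, 11, 12}, "pe": "PE-A"},
    "SC2": {"vvol_extents": {40}, "pe": "PE-B"},
}
vvol_db = {"vvol-12": {"container": "SC1", "extents": [10, 11], "pe": "PE-A"}}


def statically_partitioned(pe, container):
    # Models the example in the text: each container is served by its own PE.
    return container_db[container]["pe"] == pe


result = move_vvol("vvol-12", "SC1", "SC2", vvol_db, container_db, statically_partitioned)
```

Because the predicate models statically partitioned PEs, the move in this example always triggers the rebind branch; a predicate that returns `True` would bypass it, as in the last sentence of the paragraph above.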

When a vvol is moved between incompatible storage containers, e.g., between storage containers created in storage devices of different manufacturers, data movement is executed between the storage containers in addition to the changes to container database 316, vvol database 314, and connection database 312. In one embodiment, the data movement techniques described in U.S. patent application Ser. No. 12/129,323, filed May 29, 2008 and entitled "Offloading Storage Operations to Storage Hardware," the entire contents of which are incorporated by reference herein, are employed.

FIG. 19 is a flow diagram of method steps, executed in the host computer and the storage system, for cloning a VM from a template VM. The method begins at step 1908, where the host computer sends a request to the storage system to create a metadata vvol for the new VM. At step 1910, the storage system creates a metadata vvol for the new VM in accordance with the method described above in conjunction with FIG. 8 and returns the new metadata vvol ID to the host computer. Then, at step 1914, a clone-vvol API call is issued from the host computer to the storage system via out-of-band path 601 for all data vvol IDs belonging to the template VM. At step 1918, the storage system manager checks whether the data vvols of the template VM and of the new VM are compatible. It should be recognized that the data vvols may not be compatible if the cloning occurs between storage containers created in storage systems of different manufacturers. If they are compatible, step 1919 is carried out. At step 1919, the storage system manager creates new data vvols by generating new data vvol IDs, updating the allocation bitmap in container database 316, and adding new vvol entries to vvol database 314, and copies the content stored in the data vvols of the template VM to the data vvols of the new VM. At step 1920, the storage system manager returns the new data vvol IDs to the host computer. Receipt of the new data vvol IDs confirms to the host computer that the cloning of the data vvols completed without error. Then, at step 1925, the host computer issues an IO to the metadata vvol of the new VM to update the metadata files, in particular the disk descriptor files, with the newly generated data vvol IDs. The IO issued by the host computer to the storage system is executed by the storage system at step 1926, as a result of which the disk descriptor files of the new VM are updated with the newly generated data vvol IDs.

If, at step 1918, the storage system manager determines that the data vvols of the template VM and of the new VM are not compatible, an error message is returned to the host computer. Upon receiving this error message, the host computer at step 1921 issues a create-vvol API call to the storage system to create new data vvols. At step 1922, the storage system manager creates new data vvols by generating new data vvol IDs, updating the allocation bitmap in container database 316, and adding new vvol entries to vvol database 314, and returns the new data vvol IDs to the host computer. At step 1923, the host computer executes data movement according to the techniques described in U.S. patent application Ser. No. 12/356,694, filed Jan. 21, 2009 and entitled "Data Mover for Computer System," the entire contents of which are incorporated by reference herein. After step 1923, steps 1925 and 1926 are carried out as described above.
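The two cloning paths of FIG. 19 (storage-side clone when the vvols are compatible, host-driven create-and-copy otherwise) might be sketched as below. The `FakeStorage` class, its method names, and the `IncompatibleVvols` exception are hypothetical; in particular, the one-line copy in the fallback branch merely stands in for the referenced data mover and is not that technique.

```python
class IncompatibleVvols(Exception):
    """Hypothetical error returned when the compatibility check of step 1918 fails."""


class FakeStorage:
    def __init__(self, compatible):
        self.compatible = compatible
        self.next_id = 100
        self.contents = {"tmpl-data-1": b"template disk"}

    def _new_id(self):
        self.next_id += 1
        return f"vvol-{self.next_id}"

    def clone_vvol(self, src_id):
        if not self.compatible:                        # step 1918
            raise IncompatibleVvols(src_id)
        new_id = self._new_id()                        # step 1919
        self.contents[new_id] = self.contents[src_id]
        return new_id                                  # step 1920

    def create_vvol(self):
        new_id = self._new_id()                        # step 1922
        self.contents[new_id] = b""
        return new_id


def clone_data_vvol(storage, src_id, descriptor):
    try:
        new_id = storage.clone_vvol(src_id)            # step 1914
        path = "storage-clone"
    except IncompatibleVvols:
        new_id = storage.create_vvol()                 # step 1921
        # Step 1923: stand-in for host-driven data movement.
        storage.contents[new_id] = storage.contents[src_id]
        path = "host-copy"
    descriptor["data_vvol_id"] = new_id                # steps 1925-1926
    return path


fast_desc, slow_desc = {}, {}
fast_path = clone_data_vvol(FakeStorage(compatible=True), "tmpl-data-1", fast_desc)
slow_path = clone_data_vvol(FakeStorage(compatible=False), "tmpl-data-1", slow_desc)
```

Either path ends with the same descriptor update, which matches the text: steps 1925 and 1926 follow both step 1920 and step 1923.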

FIG. 20 is a flow diagram of method steps for provisioning a VM, according to another embodiment. This embodiment uses management server 610, a computer system hosting the VM, e.g., computer system 102 shown in FIG. 5D (hereinafter referred to as the "host computer"), and the storage system cluster of FIG. 2B, in particular storage system manager 131, storage system manager 132, or storage system manager 135. As illustrated, a request to provision a VM is received at step 2002. This may be a request generated when a VM administrator, using an appropriate user interface to management server 610, issues a command to management server 610 to provision a VM having a certain size and storage capability profile. In response, at step 2004, management server 610 initiates the method for creating a vvol to contain the VM's metadata, in particular a metadata vvol, in the manner described above in conjunction with FIG. 8, pursuant to which the storage system manager at step 2008 creates the metadata vvol, which is a file in the NAS device, and returns the metadata vvol ID to management server 610. At step 2020, management server 610 registers the vvol ID of the metadata vvol back to the host computer. At step 2022, the host computer issues a bind request for the metadata vvol ID to the storage system, in response to which the storage system at step 2023 returns an IP address and a directory path as the PE ID and the SLLID, respectively. At step 2024, the host computer mounts the directory at the specified IP address and directory path, and stores the metadata files in the mounted directory. In an embodiment that uses NFS, NFS client 545 or 585 may resolve the given IP address and directory path into an NFS handle in order to issue NFS requests to such a directory.

At step 2026, the host computer initiates the method for creating a data vvol for each of the virtual disks of the VM in the manner described above in conjunction with FIG. 8, pursuant to which the storage system manager at step 2030 creates the data vvol and returns the vvol ID of the data vvol to the host computer. At step 2032, the host computer stores the ID of the data vvol in the disk descriptor file for the virtual disk. The method ends with the unbinding of the metadata vvol (not shown) after data vvols have been created for all of the virtual disks of the VM.

As described above in conjunction with FIG. 8, when a new vvol is created from a storage container and a storage capability profile is not explicitly specified for the new vvol, the new vvol inherits the storage capability profile associated with the storage container. The storage capability profile associated with a storage container may be selected from one of several different profiles. For example, as shown in FIG. 21, the different profiles include a production (prod) profile 2101, a development (dev) profile 2102, and a test profile 2103 (collectively referred to herein as "profiles 2100"). It should be recognized that many other profiles may be defined. As shown, each profile entry of a particular profile is of a fixed type or a variable type, and has a name and one or more values associated with it. A fixed type profile entry has a fixed number of selectable items. For example, the profile entry "Replication" may be set to TRUE or FALSE. In contrast, a variable type profile entry does not have pre-defined selections. Instead, a default value and a range of values are set for a variable type profile entry, and the user may select any value within the range. If no value is specified, the default value is used. In the example profiles 2100 shown in FIG. 21, variable type profile entries have three numbers separated by commas. The first number is the lower end of the specified range, the second number is the higher end of the specified range, and the third number is the default value. Thus, a vvol that inherits the storage capability profile defined in production profile 2101 will be replicated (Replication.Value = TRUE), and the recovery time objective (RTO) for the replication may be defined in the range of 0.1 to 24 hours, the default being 1 hour. In addition, snapshots are allowed for this vvol (Snapshot.Value = TRUE). The number of snapshots retained is in the range of 1 to 100, the default being 1, and the frequency of snapshots is in the range of once per hour to once per 24 hours, the default being once per hour. The "Snap Inherit" column indicates whether a given profile attribute (and its values) should be propagated to a derivative vvol when a given vvol is snapshotted to create a new vvol that is a derivative vvol. In the example of production profile 2101, only the first two profile entries (Replication and RTO) may be propagated to a snapshot vvol of a given vvol having production profile 2101. The values of all other attributes of the snapshot vvol will be set to the default values specified in the profile. In other words, any customizations of these other attributes on the given vvol (e.g., a non-default value for snapshot frequency) will not be propagated to the snapshot vvol, because their corresponding "Snap Inherit" column is FALSE. The profile also contains other columns, such as "Clone Inherit" (not shown) and "Replica Inherit" (not shown), that control which attribute values are propagated to clones and replicas, respectively, of a given vvol.
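One possible in-memory representation of the fixed and variable profile entries, together with the propagation rule for snapshot vvols described above, is sketched below. The class names, field names, and the choice to express snapshot frequency as an interval in hours are assumptions made for this illustration; only the ranges, defaults, and the inherit flags of the first two entries come from the production profile example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedEntry:
    name: str
    choices: tuple        # the fixed set of selectable items
    default: object
    snap_inherit: bool    # whether the value propagates to a snapshot vvol


@dataclass(frozen=True)
class VariableEntry:
    name: str
    low: float            # lower end of the allowed range
    high: float           # higher end of the allowed range
    default: float
    snap_inherit: bool


# Values follow the production profile described in the text.
PRODUCTION = [
    FixedEntry("replication", (True, False), True, snap_inherit=True),
    VariableEntry("replication_rto_hours", 0.1, 24, 1, snap_inherit=True),
    FixedEntry("snapshot", (True, False), True, snap_inherit=False),
    VariableEntry("snapshot_count", 1, 100, 1, snap_inherit=False),
    VariableEntry("snapshot_interval_hours", 1, 24, 1, snap_inherit=False),
]


def derive_snapshot_values(profile, parent_values):
    """Propagate only inheritable attributes; reset all others to profile defaults."""
    derived = {}
    for entry in profile:
        if entry.snap_inherit and entry.name in parent_values:
            derived[entry.name] = parent_values[entry.name]
        else:
            derived[entry.name] = entry.default
    return derived


parent = {
    "replication": True,
    "replication_rto_hours": 4,    # customized, inheritable
    "snapshot": True,
    "snapshot_count": 10,          # customized, not inheritable
    "snapshot_interval_hours": 6,  # customized, not inheritable
}
child = derive_snapshot_values(PRODUCTION, parent)
```

Here the snapshot vvol keeps the parent's customized RTO of 4 hours, while its snapshot count and interval fall back to the profile defaults.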

When a storage container is created according to the method of FIG. 4, the types of storage capability profiles that can be defined for vvols created from the storage container may be set. The flow diagram in FIG. 21 illustrates the method for creating a storage container shown in FIG. 4 with step 2110 inserted between steps 412 and 413. At step 2110, the storage administrator selects one or more of profiles 2100 for the storage container being created. For example, a storage container created for one customer may be associated with production profile 2101 and development profile 2102, such that a vvol of production type will inherit the storage capability profile defined in production profile 2101, with default values or customer-specified values as the case may be, and a vvol of development type will inherit the storage capability profile defined in development profile 2102, with default values or customer-specified values as the case may be.

FIG. 22 is a flow diagram that illustrates method steps executed by storage system manager 131, 132, or 135 for creating a vvol and defining a storage capability profile for the vvol. The method steps of FIG. 22, in particular steps 2210, 2212, 2218, and 2220, correspond to steps 806, 810, 812, and 814 shown in FIG. 8, respectively. In addition, the method steps of FIG. 22 include steps 2214, 2215, and 2216, which define the storage capability profile for the vvol being created.

At step 2214, the storage system manager determines whether values to be used in the storage capability profile have been specified in the request to create the vvol. If they have not, the storage system manager at step 2215 employs the storage capability profile associated with the vvol's storage container as the vvol's storage capability profile, with default values. If values to be used in the storage capability profile have been specified, the storage system manager at step 2216 employs the storage capability profile associated with the vvol's storage container as the vvol's storage capability profile, with the specified values in lieu of the default values.
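The decision of steps 2214 through 2216 amounts to overlaying any requested values on the container profile's defaults. A minimal sketch follows; the dictionary layout of the profile and the rejection of out-of-range or unknown values are assumptions of this example (the text states only that a user may select a value within the range), and `resolve_profile` is a hypothetical name.

```python
CONTAINER_PROFILE = {
    # Fixed type entries list their choices; variable type entries give a range.
    "replication": {"choices": (True, False), "default": True},
    "replication_rto_hours": {"range": (0.1, 24), "default": 1},
    "snapshot_count": {"range": (1, 100), "default": 1},
}


def resolve_profile(container_profile, requested=None):
    """Return the vvol's profile values: requested values where given, else defaults."""
    requested = requested or {}                       # step 2214: any values specified?
    unknown = set(requested) - set(container_profile)
    if unknown:
        raise KeyError(f"not in container profile: {sorted(unknown)}")
    resolved = {}
    for name, spec in container_profile.items():
        value = requested.get(name, spec["default"])  # step 2215 or step 2216
        if "range" in spec:
            low, high = spec["range"]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")
        elif value not in spec["choices"]:
            raise ValueError(f"{name}={value} is not a selectable item")
        resolved[name] = value
    return resolved


defaults_only = resolve_profile(CONTAINER_PROFILE)
customized = resolve_profile(CONTAINER_PROFILE, {"replication_rto_hours": 4})
```

Calling `resolve_profile(CONTAINER_PROFILE, {"replication_rto_hours": 48})` raises `ValueError` in this sketch, since 48 hours lies outside the 0.1 to 24 hour range.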

In one embodiment, the storage capability profile of a vvol is stored in vvol database 314 as key-value pairs. Once the storage capability profile of a vvol has been defined and stored in vvol database 314 as key-value pairs, and as long as replication- and snapshot-related attributes and values, such as those shown in the example profiles of FIG. 21, are part of this profile, the storage system is able to perform replication and snapshotting for the vvol with no further instructions issued by the host computer.

FIG. 23 is a flow diagram that illustrates method steps executed by storage system manager 131, 132, or 135 for creating a snapshot from a parent vvol. In one embodiment, a snapshot tracking data structure is employed to schedule snapshots according to the snapshot definitions in the storage capability profile of a given vvol. Upon reaching the scheduled time for a snapshot, the storage system manager at step 2310 retrieves the vvol ID from the snapshot tracking data structure. Then, at step 2312, the storage system manager generates a unique vvol ID for the snapshot. At step 2315, the storage system manager employs the storage capability profile of the parent vvol (i.e., the vvol having the vvol ID retrieved from the snapshot tracking data structure) as the snapshot vvol's storage capability profile. It should be noted that, because this is an automated, profile-driven snapshotting process driven by the storage system, the user has no opportunity to specify custom values to be used in the storage capability profile of the snapshot vvol. At step 2318, the storage system manager creates the snapshot vvol within the storage container of the parent vvol by updating the allocation bitmap in container database 316 and adding a new vvol entry for the snapshot vvol to vvol database 314. Then, at step 2320, the storage system manager updates the snapshot tracking data structure by scheduling a time for generating the next snapshot for the parent vvol. It should be recognized that the storage system manager must concurrently maintain the snapshot tracking data structures and execute the method steps of FIG. 23 for all vvols whose storage capability profile mandates scheduled snapshots.

After a snapshot is created in the manner described above, the key-value pairs stored in vvol database 314 are updated to indicate that the snapshot vvol is of type=snapshot. In embodiments where a generation number is maintained for snapshots, the generation number being either incremented each time a snapshot is taken or set equal to date+time, the generation number is also stored as a key-value pair. The parent vvol ID of the snapshot vvol is likewise stored as a key-value pair in the snapshot vvol entry. As a result, a host computer can query vvol database 314 for snapshots corresponding to a particular vvol ID. The host computer can also issue a query to the vvol database for snapshots corresponding to a particular vvol ID and a particular generation number.
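The key-value bookkeeping and the two queries described above can be illustrated with a short Python sketch. The dictionary standing in for vvol database 314, the key names (`type`, `parent_vvol_id`, `generation`), and both function names are assumptions made for illustration only; the specification does not define a schema or a query interface.

```python
# Hypothetical stand-in for vvol database 314: vvol ID -> key-value pairs.
vvol_db = {}


def record_snapshot(snapshot_id, parent_id, generation):
    """Store the key-value pairs that mark a vvol entry as a snapshot."""
    vvol_db[snapshot_id] = {
        "type": "snapshot",
        "parent_vvol_id": parent_id,
        # Either a counter incremented per snapshot or a date+time value.
        "generation": generation,
    }


def find_snapshots(parent_id, generation=None):
    """Return IDs of snapshots of parent_id, optionally limited to one generation."""
    return sorted(
        vvol_id
        for vvol_id, kv in vvol_db.items()
        if kv.get("type") == "snapshot"
        and kv.get("parent_vvol_id") == parent_id
        and (generation is None or kv.get("generation") == generation)
    )
```

Calling `find_snapshots("vvol-1")` corresponds to the query by vvol ID alone, and `find_snapshots("vvol-1", generation=2)` to the query by vvol ID and generation number.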

The various embodiments described herein may employ various computer-implemented operations involving data stored in computer systems. For example, these operations may require physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals, in which form they, or representations of them, can be stored, transferred, combined, compared, or otherwise manipulated. Further, such manipulations are often referred to in terms such as producing, identifying, determining, or comparing. Any operations described herein that form part of one or more embodiments may be useful machine operations. In addition, one or more embodiments also relate to a device or an apparatus for performing these operations. The apparatus may be specially constructed for specific required purposes, or it may be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.

The various embodiments described herein may be practiced with other computer system configurations, including handheld devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like.

One or more embodiments may be implemented as one or more computer programs, or as one or more computer program modules embodied in one or more computer-readable media. The term computer-readable medium refers to any data storage device that can store data which can thereafter be input to a computer system. Computer-readable media may be based on any existing or subsequently developed technology for embodying computer programs in a manner that enables them to be read by a computer. Examples of computer-readable media include a hard drive, network attached storage (NAS), read-only memory, random-access memory (e.g., a flash memory device), a CD (Compact Disc), CD-ROM, CD-R, or CD-RW, a DVD (Digital Versatile Disc), magnetic tape, and other optical and non-optical data storage devices. Computer-readable media can also be distributed over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion.

Although one or more embodiments have been described in some detail for clarity of understanding, it will be apparent that certain changes and modifications may be made within the scope of the claims. For example, SCSI is employed as the protocol for SAN devices and NFS is used as the protocol for NAS devices. Any alternative to the SCSI protocol may be used, such as Fibre Channel, and any alternative to the NFS protocol may be used, such as the CIFS (Common Internet File System) protocol. Accordingly, the described embodiments are to be considered illustrative rather than restrictive, and the scope of the claims is not to be limited to the details given herein but may be modified within the scope and equivalents of the claims. In the claims, elements and/or steps do not imply any particular order of operation unless explicitly stated in the claims.

In addition, while the described virtualization methods have generally assumed that virtual machines present interfaces consistent with a particular hardware system, the methods described may be used in conjunction with virtualization that does not correspond directly to any particular hardware system. Virtualization systems in accordance with the various embodiments, implemented as hosted embodiments, non-hosted embodiments, or embodiments that tend to blur the distinction between the two, are all envisioned. Furthermore, various virtualization operations may be wholly or partially implemented in hardware. For example, a hardware implementation may employ a look-up table for modification of storage access requests to secure non-disk data.

Many variations, modifications, additions, and improvements are possible, regardless of the degree of virtualization. The virtualization software can therefore include components of a host, console, or guest operating system that performs virtualization functions. Plural instances may be provided for components, operations, or structures described herein as a single instance. Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the embodiments described herein. In general, structures and functionality presented as separate components in exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the appended claim(s).

Claims

In a computer system connected to a storage system via input/output command (IO) paths and non-IO paths used for application-related data operations, a method for provisioning a logical storage volume for an application running on the computer system, the method comprising:
selecting, by the computer system, a logical storage container created in the storage system;
issuing a request from the computer system to the storage system via a non-IO path to create the logical storage volume in the selected logical storage container; and
storing, by the computer system, a unique identifier for the logical storage volume received from the storage system in response to the request, and associating the unique identifier with a virtual machine running on the computer system.

The method of claim 1, wherein the logical storage volume stores metadata files of the virtual machine, the metadata files including a disk descriptor file for a virtual disk of the virtual machine.

The method of claim 2, wherein the logical storage volume stores a data file of the virtual disk and has a size defined in the disk descriptor file for the virtual disk.

The method of claim 3, further comprising:
issuing a request to the storage system to increase the size of the virtual disk; and
updating the disk descriptor file for the virtual disk to indicate the increased size.

The method of claim 1, further comprising issuing a request to the storage system to clone the logical storage volume between the same or different logical storage containers.

The method of claim 5, further comprising:
determining whether the logical storage containers are incompatible due to being created in storage devices of different manufacturers; and
issuing a data movement command from the computer system to copy the contents of the logical storage volume to the cloned logical storage volume when the logical storage containers are determined to be incompatible.

The method of claim 1, further comprising: issuing to the storage system a request to move the logical storage volume between logical storage containers in the storage system.

The method of claim 7, further comprising:
determining whether the IO path from the computer system to the storage system will be broken as a result of the move; and
if it is determined that the IO path from the computer system to the storage system will be broken as a result of the move, issuing a request to establish a new IO path from the computer system to the storage system after the move is complete.

The method of claim 1, further comprising: determining that the logical storage container has sufficient space to accommodate the logical storage volume.

The method of claim 1, further comprising sending an authentication credential and a decryption key to the storage system,
wherein, once the computer system has been authenticated by the storage system, the request to create the logical storage volume is issued to the storage system in an encrypted form that can be decrypted using the decryption key.

The method of claim 1, wherein the storage system is a SCSI protocol based storage system.

The method of claim 1, wherein the storage system is an NFS protocol based storage system.

In a computer system connected to a storage system via input/output command (IO) paths and non-IO paths used for application-related data operations, a method for re-provisioning a logical storage volume for an application running on the computer system, the method comprising:
issuing a request from the computer system to the storage system via a non-IO path to increase the size of the logical storage volume provisioned in a selected logical storage container;
receiving, by the computer system, an acknowledgment of the increase in size from the storage system; and
updating a metadata file associated with the logical storage volume to indicate the increased size.

The method of claim 13, further comprising: determining that the logical storage container has sufficient space to accommodate the increased size of the logical storage volume.

The method of claim 13, wherein the application is a virtual machine and the updated metadata file is a disk descriptor file for the virtual disk of the virtual machine.

The method of claim 15, wherein the metadata file is stored on another logical storage volume provisioned for the virtual machine.

A computer system connected to a storage system via input/output command (IO) paths and non-IO paths used for application-related data operations, the computer system comprising:
a management interface in a non-IO path; and
a storage interface in an IO path,
wherein the management interface is configured to (i) generate a request to create a logical storage volume in a selected logical storage container of the storage system and receive a unique identifier for the logical storage volume in response to the request, and (ii) generate a request to bind the logical storage volume to a protocol endpoint configured in the storage system and receive first and second identifiers in response to the request, and
wherein the storage interface encodes IOs issued to the logical storage volume using the first and second identifiers.

The computer system of claim 17, wherein the storage interface generates IO in a SCSI compliant format.

The computer system of claim 17, wherein the storage interface generates IO in a format compliant with NFS.

The computer system of claim 17, wherein the management interface is configured to generate a request to rebind the logical storage volume and to receive new first and second identifiers in response to the request, the new first identifier identifying the new protocol endpoint to which the logical storage volume is bound, and the new second identifier uniquely identifying the logical storage volume among all logical storage volumes bound to the new protocol endpoint.