US20100241726A1 - Virtualized Data Storage Over Wide-Area Networks - Google Patents


Publication number
US20100241726A1
Authority
US
United States
Prior art keywords
storage
data
block
storage block
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/730,179
Inventor
David Tze-Si Wu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Riverbed Technology LLC
Original Assignee
Riverbed Technology LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Riverbed Technology LLC
Priority to US12/730,179
Assigned to RIVERBED TECHNOLOGY, INC. Assignment of assignors interest (see document for details). Assignors: WU, DAVID
Priority to US12/818,872 (US8504670B2)
Publication of US20100241726A1
Assigned to MORGAN STANLEY & CO. LLC. Security agreement. Assignors: OPNET TECHNOLOGIES, INC., RIVERBED TECHNOLOGY, INC.
Assigned to RIVERBED TECHNOLOGY, INC. Release of patent security interest. Assignors: MORGAN STANLEY & CO. LLC, AS COLLATERAL AGENT

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/188Virtual file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0862Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/172Caching, prefetching or hoarding of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60Details of cache memory
    • G06F2212/6024History based prefetching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0643Management of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0653Monitoring storage devices or systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • the present invention relates generally to data storage systems, and systems and methods to improve storage efficiency, compactness, performance, reliability, and compatibility.
  • a file system specifies an arrangement for storing, retrieving, and organizing data files or other types of data on data storage devices, such as hard disk devices.
  • a file system may include functionality for maintaining the physical location or address of data on a data storage device and for providing access to data files from local or remote users or applications.
  • data storage for multiple users and applications in an enterprise is implemented using a file server attached to one or more client systems and application servers via a local area network (LAN).
  • the file server allows users and applications to access data via file-based network protocols, such as NFS or SMB/CIFS.
  • a storage area network appears to file and application servers as one or more locally attached storage devices.
  • Storage area networks use protocols such as iSCSI and Fibre Channel Protocol to communicate with storage clients. These storage area network protocols are based on reading and writing blocks of data to storage devices and typically operate below the level of the file system.
  • Large organizations, such as enterprises, are often geographically spread out over many separate locations, referred to as branches.
  • an enterprise may have offices or branches in New York, San Francisco, and India.
  • Each branch location may include its own internal local area network for exchanging data within the branch.
  • the branches may be connected via a wide area network, such as the internet, for exchanging data between branches.
  • Typical branch LAN installations also require data storage for their local client systems and application servers.
  • a typical branch LAN installation may include a file server for storing data for the client systems and application servers.
  • this branch's data storage is located at the branch site and connected directly with the branch LAN.
  • each branch requires its own file server and associated data storage devices.
  • Deploying and maintaining file servers and data storage at a number of different branches is expensive and inefficient. Organizations often require on-site personnel at each branch to configure and upgrade each branch's data storage, and to manage data backups and data retention. Additionally, organizations often purchase excess storage capacity for each branch to allow for upgrades and growing data storage requirements. Because branches are serviced infrequently, due to their numbers and geographic dispersion, organizations often deploy enough data storage at each branch to allow for months or years of storage growth. However, this excess storage capacity often sits unused for months or years until it is needed, unnecessarily driving up costs.
  • FIGS. 1A-1B illustrate a virtual storage array system according to an embodiment of the invention
  • FIGS. 2A-2B illustrate a method of optimizing data reads in a virtual storage array system according to an embodiment of the invention
  • FIG. 3 illustrates a method of optimizing data writes in a virtual storage array system according to an embodiment of the invention
  • FIGS. 4A-4B illustrate data migration of a virtual storage array system according to an embodiment of the invention
  • FIG. 5 illustrates a method of creating data snapshots of a virtual storage array according to an embodiment of the invention
  • FIG. 6 illustrates an example optimized data compression and deduplication using file-system or other storage format awareness according to an embodiment of the invention
  • FIG. 7 illustrates an example virtual machine implementation of a virtual storage array interface according to an embodiment of the invention.
  • FIG. 8 illustrates an example computer system capable of implementing a virtual storage array interface according to an embodiment of the invention.
  • An embodiment of the invention uses virtual storage arrays to consolidate branch location-specific data storage at data centers connected with branch locations via wide area networks.
  • the virtual storage array appears to a storage client as a local branch data storage; however, embodiments of the invention actually store the virtual storage array data at a data center connected with the branch location via a wide-area network.
  • a branch storage client accesses the virtual storage array using storage block based protocols.
  • Embodiments of the invention overcome the bandwidth and latency limitations of the wide area network between branch locations and the data center by predicting storage blocks likely to be requested in the future by the branch storage client and prefetching and caching these predicted storage blocks at the branch location. When this prediction is successful, storage block requests from the branch storage client may be fulfilled in whole or in part from the branch location's storage block cache. As a result, the latency and bandwidth restrictions of the wide-area network are hidden from the storage client.
  • the branch location storage client uses storage block-based protocols to specify reads, writes, modifications, and/or deletions of storage blocks.
  • servers and higher-level applications typically access data in terms of files in a structured file system, relational database, or other high-level data structure.
  • Each entity in the high-level data structure such as a file or directory, or database table, node, or row, may be spread out over multiple storage blocks at various non-contiguous locations in the storage device.
  • prefetching storage blocks based solely on their locations in the storage device is unlikely to be effective in hiding wide-area network latency and bandwidth limits from storage clients.
  • An embodiment of the invention leverages an understanding of the semantics and structure of the high-level data structures associated with the storage blocks to predict which storage blocks are likely to be requested by a storage client in the near future. To do this, an embodiment of the invention determines the association between requested storage blocks and the corresponding high-level data structure entities, such as files, directories, or database elements. Once this embodiment has identified one or more of the high-level data structure entities associated with a requested storage block, this embodiment of the invention identifies additional portions of the same or other high-level data structure entities that are likely to be accessed by the storage client. This embodiment of the invention then identifies the additional storage blocks corresponding to these additional high-level data structure entities. The additional storage blocks are then prefetched and cached at the branch location.
  • a further embodiment of the invention also hides wide-area network latency and bandwidth limits from storage clients during write operations by caching write requests from storage clients at their associated branch locations. Once a write request is cached at the branch location, the write request is acknowledged as complete to the storage client, allowing the storage client to continue operations. The cached write requests are then transferred from the branch location to the data center independently of the storage clients' operations. Once a new or updated storage block is stored in the storage block cache, it may be accessed by storage clients prior to its transfer to the data storage.
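The write path described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `WriteBackCache`, the `send_to_data_center` callable, and the coalescing of repeated writes to the same block are all assumptions made for the example.

```python
from collections import OrderedDict


class WriteBackCache:
    """Hypothetical branch-side cache that acknowledges writes before WAN transfer."""

    def __init__(self, send_to_data_center):
        self._blocks = {}                 # block address -> latest data
        self._pending = OrderedDict()     # addresses awaiting transfer, oldest first
        self._send = send_to_data_center  # assumed WAN transport: callable(address, data)

    def write(self, address, data):
        self._blocks[address] = data
        self._pending[address] = True     # rewriting a pending block is coalesced
        return "ack"                      # acknowledged before any WAN traffic occurs

    def read(self, address):
        # New or updated blocks are readable before they reach the data center.
        return self._blocks.get(address)

    def flush(self):
        """Transfer pending blocks; intended to run independently of client operations."""
        sent = 0
        while self._pending:
            address, _ = self._pending.popitem(last=False)
            self._send(address, self._blocks[address])
            sent += 1
        return sent
```

In a real deployment `flush` would presumably run in the background; here it is called explicitly so the ordering of acknowledgement and transfer is easy to observe.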
  • An additional embodiment allows for snapshots of the virtual storage array data.
  • a snapshot is prepared by setting the virtual storage array interface to a quiescent state and identifying any new or updated storage blocks in the branch location's storage block cache.
  • the virtual storage array interface may then be set to an active state and resume normal operations.
  • the virtual storage array interface transfers these identified storage blocks to the data center, if they have not already been transferred as part of the normal write request process.
  • If a storage client modifies one of these identified storage blocks before it has been transferred to the data center, an embodiment of the branch virtual storage array interface makes a copy of this storage block. The modification is then applied to the copy of this storage block. Storage clients accessing this storage block will receive the modified copy of this storage block. However, the unmodified version of the storage block may be used to fulfill a subsequent snapshot request.
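The copy-on-write behavior during a snapshot can be sketched as below. The class and method names are hypothetical, and the sketch assumes `begin_snapshot` is called while the interface is quiescent; it is one possible reading of the description, not the source's implementation.

```python
class SnapshotCache:
    """Hypothetical storage block cache that preserves snapshot-time block contents."""

    def __init__(self):
        self.blocks = {}           # address -> data currently visible to storage clients
        self.dirty = set()         # new or updated blocks not yet at the data center
        self.snapshot_set = set()  # blocks identified for the pending snapshot
        self.frozen = {}           # address -> unmodified data kept for the snapshot

    def begin_snapshot(self):
        # Assumed to run in the quiescent state: record which blocks are new or updated.
        self.snapshot_set = set(self.dirty)
        self.frozen = {}

    def write(self, address, data):
        if address in self.snapshot_set and address not in self.frozen:
            # Keep the unmodified version before the first post-snapshot change.
            self.frozen[address] = self.blocks[address]
        self.blocks[address] = data
        self.dirty.add(address)

    def read(self, address):
        return self.blocks[address]   # clients always see the modified version

    def snapshot_data(self, address):
        return self.frozen.get(address, self.blocks[address])
```

For example, after writing `b"v1"` to block 1, calling `begin_snapshot()`, and then writing `b"v2"`, `read(1)` returns the new data while `snapshot_data(1)` still returns `b"v1"`.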
  • Another embodiment of the invention analyzes a selected high-level data structure entity to identify portions of the same or other high-level data structure entities that are likely to be accessed by the storage client. This embodiment of the invention then identifies the additional storage blocks corresponding to these additional high-level data structure entities. The additional storage blocks are then prefetched and cached at the branch location. This embodiment of the invention may also identify additional high-level data structure entities to analyze based on its analysis of previously selected high-level data structure entities.
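One way to picture this iterative analysis is a bounded breadth-first walk over entities. The sketch below is an assumption-laden illustration: the `references` and `blocks_of` dictionaries stand in for whatever file system or database metadata a real interface would parse, and the paths and block numbers are invented.

```python
from collections import deque


def plan_prefetch(start, references, blocks_of, max_entities=3):
    """Walk entities breadth-first from `start` and list their storage blocks."""
    queue = deque([start])
    seen = {start}
    visited = []
    while queue and len(visited) < max_entities:
        entity = queue.popleft()
        visited.append(entity)
        # Analyzing one entity may reveal further entities worth analyzing.
        for ref in references.get(entity, []):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    plan = []
    for entity in visited:
        for block in blocks_of.get(entity, []):
            if block not in plan:
                plan.append(block)
    return plan


# Hypothetical metadata: a directory that references two files.
references = {"/docs": ["/docs/a.txt", "/docs/b.txt"]}
blocks_of = {"/docs": [5], "/docs/a.txt": [40, 41], "/docs/b.txt": [90]}
prefetch_plan = plan_prefetch("/docs", references, blocks_of)
```

The `max_entities` bound is a design choice for the sketch; without some limit, following references could pull in far more data than the branch cache or the WAN can absorb.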
  • Further embodiments of the invention may identify corresponding high-level data structure entities directly from requests for storage blocks.
  • FIGS. 1A-1B illustrate virtual storage array systems according to an embodiment of the invention.
  • FIG. 1A illustrates an example system 100 including virtual storage arrays according to an embodiment of the invention.
  • the example system 100 includes two branches 105 a and 105 b, each of which has its own internal local area network (LAN), and a data center 110 , which also includes its own LAN.
  • the two branch networks 105 and the data center network 110 are connected by one or more wide area networks (WANs) 115 , such as the internet.
  • Although FIG. 1A shows two branches and one data center, embodiments of the invention can be implemented with any arbitrary number of branches and data centers.
  • Each of the branch LANs 105 may include routers, switches, and other wired or wireless network devices 107 for connecting with client systems and other devices, such as network devices 107 a and 107 b.
  • each of the branch LANs 105 may connect one or more client systems 108 , such as client system 108 a and 108 b, with one or more application servers 109 , such as 109 a and 109 b.
  • Application servers 109 provide applications and application functionality to the client systems 108 .
  • typical branch LAN installations also require data storage for client systems and application servers.
  • a prior typical branch LAN installation may include a file server for storing data for the client systems and application servers, such as database servers and e-mail servers.
  • this branch's data storage is located at the branch site and connected directly with the branch LAN.
  • the branch data storage previously could not be located at the data center, because the intervening WAN is too slow and has high latency, making storage accesses unacceptably slow for client systems and application servers.
  • An embodiment of the invention allows for storage consolidation of branch-specific data storage at data centers connected with branches via wide area networks.
  • This embodiment of the invention overcomes the bandwidth and latency limitations of the wide area network between branches and the data center.
  • an embodiment of the invention includes virtual storage arrays.
  • a virtual storage array appears to branch users, such as branch client systems and branch application servers, as a storage array connected with the branch's local area network.
  • a virtual storage array can be used for the same purposes as a local storage area network or other data storage device.
  • a virtual storage array may be used in conjunction with a file server for general-purpose data storage, in conjunction with a database server for database application storage, or in conjunction with an e-mail server for e-mail storage.
  • the virtual storage array stores its data at a data center connected with the branch via a wide area network. Multiple separate virtual storage arrays, from different branches, may store their data in the same data center and, as described below, on the same storage devices.
  • An organization can manage and control access to their data storage at a central data center, rather than at large numbers of separate branches. This increases the reliability and performance of an organization's data storage. This also reduces the personnel required at branch offices to provision, maintain, and backup data storage. It also enables organizations to implement more effective backup systems, data snapshots, and disaster recovery for their data storage. Furthermore, organizations can plan for storage growth more efficiently, by consolidating their storage expansion for multiple branches and reducing the amount of excess unused storage. Additionally, an organization can apply optimizations such as compression or data deduplication over the data from multiple branches stored at the data center, reducing the total amount of storage required by the organization.
  • virtual storage arrays are implemented at each of the branches 105 using branch virtual storage array interfaces 120 , such as branch virtual storage array interfaces 120 a and 120 b.
  • Each of the branch virtual storage array interfaces 120 may be a stand-alone computer system or network appliance, or may be built into other computer systems or network equipment as hardware and/or software.
  • any of the branch virtual storage array interfaces 120 may be implemented as a software application or other executable code running on a client system or application server.
  • each of the branch virtual storage array interfaces 120 includes one or more storage array network interfaces and supports one or more storage array network protocols to connect with client systems and/or application servers within a branch local area network.
  • storage array network interfaces suitable for use with embodiments of the invention include Ethernet, Fibre Channel, IP, and InfiniBand interfaces.
  • storage array network protocols include ATA, Fibre Channel Protocol, and SCSI.
  • Various combinations of storage array network interfaces and protocols are suitable for use with embodiments of the invention, including iSCSI, HyperSCSI, Fibre Channel over Ethernet, and iFCP.
  • an embodiment of the branch virtual storage array interface 120 can use the branch LAN's physical connections and networking equipment for communicating with client systems and application servers. In other embodiments, separate connections and networking equipment, such as Fibre Channel networking equipment, are used to connect the branch virtual storage array interface 120 with client systems 108 and/or application servers 109 .
  • one or more of the branch LANs 105 can include a file server, for example built into one of the application servers 109 , for providing a network file interface to the virtual storage array to client systems 108 and other application servers 109 .
  • the branch virtual storage array interface 120 is integrated as hardware and/or software with an application server 109 , such as a file server, database server, or e-mail server.
  • the branch virtual storage array interface 120 can include application server interfaces, such as a network file interface, for interfacing with other application servers and/or client systems.
  • a branch virtual storage array interface 120 appears to be a local storage array, having its data storage at the associated branch 105 .
  • branch virtual storage array interface 120 a appears to clients 108 a and application server 109 a as a local data storage array on branch LAN 105 a.
  • the branch virtual storage array interfaces 120 actually store and retrieve data from storage devices located on the data center LAN 110 . Because virtual storage array data accesses must travel via the WAN 115 between the data center LAN 110 and the branch LANs 105 , the virtual storage arrays are subject to the latency and bandwidth restrictions of the WAN 115 .
  • the branch virtual storage array interfaces 120 include virtual storage array caches 122 , such as virtual storage array caches 122 a and 122 b for virtual storage array interfaces 120 a and 120 b respectively, which are used to ameliorate the effects of the WAN 115 on virtual storage array performance.
  • virtual storage array data accesses, including data reads and data writes, can be optimized to minimize the effect of WAN bandwidth restrictions and latency.
  • an embodiment of the invention includes a data center virtual storage array interface 125 located on the data center LAN 110 .
  • the data center virtual storage array interface 125 communicates with one or more branch virtual storage interfaces 120 via the data center LAN 110 , the WAN 115 , and their respective branch LANs 105 .
  • Data communications between virtual storage interfaces 120 and 125 can be in any form and/or protocol used for carrying data over wired and wireless data communications networks, including TCP/IP.
  • the data center virtual storage array interface 125 translates data communications from branch virtual storage array interfaces 120 into storage accesses of a physical storage array network.
  • a data center virtual storage array interface 125 accesses a physical storage array network interface 127 , which in turn accesses physical data storage devices 129 on a storage array network.
  • Examples of data storage devices 129 include physical data storage array devices 129 a and data backup devices 129 b.
  • the data center virtual storage array interface 125 includes one or more storage array network interfaces and supports one or more storage array network protocols for directly connecting with a physical storage array network and its data storage devices 129 .
  • Examples of storage array network interfaces suitable for use with embodiments of the invention include Ethernet, Fibre Channel, IP, and InfiniBand interfaces.
  • Examples of storage array network protocols include ATA, Fibre Channel Protocol, and SCSI.
  • Various combinations of storage array network interfaces and protocols are suitable for use with embodiments of the invention, including iSCSI, HyperSCSI, Fibre Channel over Ethernet, and iFCP.
  • Embodiments of the data center virtual storage array interface 125 may connect with the physical storage array interface 127 and/or directly with the physical storage array network using the Ethernet network of the data center LAN and/or separate data communications connections, such as a Fibre Channel network.
  • branch 105 and data center LANs 110 may optionally include network optimizers 130 for improving the performance of data communications over the WAN 115 between branches and/or the data center.
  • Network optimizers 130 can improve actual and perceived WAN network performance using techniques including compressing data communications; anticipating and prefetching data; caching frequently accessed data; shaping and restricting network traffic; and optimizing usage of network protocols.
  • network optimizers 130 may be used in conjunction with virtual storage array interfaces 120 and 125 to further improve virtual storage array performance accessing data via the WAN 115 .
  • network optimizers 130 may ignore or pass-through virtual storage array data traffic, relying on the virtual storage array interfaces 120 and 125 on the branch 105 and data center LANs 110 to optimize WAN performance.
  • a data center virtual storage array interface may be connected directly between a WAN and a physical data storage array, eliminating the need for a data center LAN.
  • a branch virtual storage array interface implemented for example in the form of a software application executed by a storage client computer system, may be connected directly with a WAN, such as the internet, eliminating the need for a branch LAN.
  • FIG. 1B illustrates an example arrangement 150 of data within virtual and physical storage array networks according to an embodiment of the invention.
  • two branches 155 a and 155 b each include a branch virtual storage array interface 160 a and 160 b and associated virtual storage array cache 165 a and 165 b, respectively.
  • each of the virtual storage array caches 165 is used to store prefetched virtual storage array network data and pending virtual storage array write data for its branch's respective virtual storage array.
  • each of the branches 155 includes its own separate virtual storage array, which appears to be located within its branch LAN 155 . However, the majority of the data storage of a branch's virtual storage array is located within the data center LAN 170 on one or more physical data storage devices 175 .
  • the data center LAN 170 is connected with the branch LANs 155 via WAN 185 .
  • each branch's virtual storage array data is stored within a physical storage area network at the data center LAN 170 .
  • the physical storage area network may store virtual storage array data 180 for two or more branches.
  • physical data storage array 175 stores virtual storage array data 180 a and 180 b, which correspond with the data of the virtual storage arrays for branch 155 a and 155 b, respectively.
  • data optimizations such as data compression and data deduplication can be applied to each branch's virtual storage array data 180 separately or may be consolidated over multiple branches' virtual storage array data 180 .
  • redundant data within a single branch's virtual storage array data within the data center's physical storage array network can be compressed or deduplicated to reduce storage requirements.
  • compression or data deduplication can be applied over all of these virtual storage arrays, such that only a single copy of the redundant data needs to be stored in the physical storage area network.
  • each of the separate branch virtual storage arrays will reference this single copy of the redundant data.
  • branch 155 a's virtual storage array data 180 a can be compressed or deduplicated together with branch 155 b's virtual storage array data 180 b so that there is only a single copy of any redundant data found in both virtual storage arrays.
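Cross-branch deduplication of this kind can be illustrated with a content-addressed store. The source does not say how redundant data is detected; using a SHA-256 digest as the key is an assumption for this sketch, as are the class name `DedupStore` and the branch labels.

```python
import hashlib


class DedupStore:
    """Hypothetical data center store holding one copy of each distinct block."""

    def __init__(self):
        self.chunks = {}  # digest -> data (the single physical copy)
        self.arrays = {}  # (branch, block address) -> digest (per-branch references)

    def put(self, branch, address, data):
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)     # stored only if not already present
        self.arrays[(branch, address)] = digest

    def get(self, branch, address):
        return self.chunks[self.arrays[(branch, address)]]


store = DedupStore()
store.put("155a", 0, b"shared OS image block")
store.put("155b", 9, b"shared OS image block")   # same content from another branch
store.put("155b", 10, b"branch-specific data")
```

After these three writes the store holds two physical chunks, and both branches' virtual storage arrays reference the same copy of the shared block. The sketch omits reference counting, so it never reclaims chunks that are no longer referenced.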
  • the virtual storage array can be used to provide “cloud” storage for network-based applications.
  • An embodiment of the invention prefetches virtual storage array data to improve data read performance of the virtual storage array.
  • the branch or data center virtual storage array interface analyzes read and write accesses to a branch's virtual storage array to predict which storage blocks may be accessed in the future. The branch or data center virtual storage array interface then retrieves some or all of these predicted storage blocks and stores them in the branch's virtual storage array cache. If a storage client, such as an application server, file server, or client system, later requests access to one or more of the cached storage blocks, the branch virtual storage array interface retrieves the requested storage block from the virtual storage array cache, rather than retrieving the storage block from the physical storage devices located in the data center LAN via the WAN. This storage block prefetching hides the bandwidth and latency of the WAN from the storage client, making the virtual storage array appear as if it is a local storage device.
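The read path just described can be sketched as a cache that consults a predictor on every read. The names `BranchReadCache`, `fetch_from_data_center`, and `predict` are hypothetical, and the prefetch here is synchronous for simplicity; a real interface would presumably prefetch in the background.

```python
class BranchReadCache:
    """Hypothetical branch cache that prefetches predicted storage blocks."""

    def __init__(self, fetch_from_data_center, predict):
        self.cache = {}
        self.fetch = fetch_from_data_center  # callable(address) -> data; crosses the WAN
        self.predict = predict               # callable(address) -> addresses likely next
        self.wan_reads = 0                   # counts round trips, for illustration only

    def _fetch(self, address):
        self.wan_reads += 1
        return self.fetch(address)

    def read(self, address):
        if address not in self.cache:        # cache miss: go to the data center
            self.cache[address] = self._fetch(address)
        data = self.cache[address]
        for nxt in self.predict(address):    # prefetch blocks predicted to follow
            if nxt not in self.cache:
                self.cache[nxt] = self._fetch(nxt)
        return data


backing = {10: b"a", 11: b"b", 12: b"c"}
reader = BranchReadCache(backing.__getitem__, lambda a: {10: [11, 12]}.get(a, []))
first = reader.read(10)   # one miss plus two prefetches cross the WAN
second = reader.read(11)  # served from the branch cache
```

In this run the later reads of blocks 11 and 12 cause no additional WAN traffic, which is the effect the prefetching is meant to achieve.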
  • FIG. 2A illustrates an example 200 of a storage client 205 opening an example file “Foo.txt” and reading the first five file file system blocks or clusters of this file.
  • These file protocol reads may be performed using any file system protocol, such as CIFS, NFS, or NTFS.
  • This sequence of file protocol reads is received by a file server 210 .
  • the file server 210 translates these file protocol reads into one or more storage area network reads.
  • Each storage area network read retrieves one or more storage blocks from the virtual or physical storage area network 215 .
  • the storage area network reads may use any storage area network protocol, such as iSCSI or other protocols discussed above.
  • the sizes and boundaries of file system blocks and storage area network blocks are independent of each other; thus each file system block may correspond with a fraction of a storage area network block, a single storage area network block, or multiple storage area network blocks.
  • file system block 0 corresponds with storage area network blocks 101 and 200 .
  • File system block 1 corresponds with storage area network block 14 .
  • File system block 2 corresponds with storage area network block 25 .
  • File system block 3 corresponds with storage area network block 26 .
  • File system block 4 corresponds with storage area network block 12 .
  • the first five file system blocks of a file in a file system correspond to six non-sequential storage area network blocks.
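  • The correspondence listed above for FIG. 2A can be written out as a small lookup table. The following sketch uses only the block numbers given in the example; the function name `san_reads` is hypothetical.

```python
# File system block -> storage area network blocks, as listed for FIG. 2A.
FS_TO_SAN = {0: [101, 200], 1: [14], 2: [25], 3: [26], 4: [12]}


def san_reads(fs_blocks):
    """Translate a sequence of file system block reads into SAN block reads."""
    reads = []
    for fs_block in fs_blocks:
        reads.extend(FS_TO_SAN[fs_block])
    return reads


san_blocks = san_reads(range(5))
```

  • Reading file system blocks 0 through 4 in order yields six storage area network blocks whose addresses are not sequential.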
  • FIG. 2B illustrates a method 250 of performing reactive prefetching of storage blocks according to an embodiment of the invention.
  • Step 255 receives a storage block read request from a storage client, such as a client system or application server, at the branch location.
  • the storage block read request may be received by a branch location virtual data storage array interface.
  • the storage block read request may be received using a storage area network protocol, such as iSCSI.
  • decision block 260 determines if the requested storage block has been previously retrieved and stored in the storage block read cache at the branch location. If so, step 270 retrieves the requested storage block from the storage block read cache and returns it to the requesting storage client. In an embodiment, if the system includes a data center virtual storage array interface, then step 270 also forwards the storage block read request back to the data center virtual storage array interface for use in identifying additional storage blocks likely to be requested by the storage client in the future.
  • step 265 retrieves the requested storage block via a WAN connection from the virtual storage array data located in a physical data storage at the data center.
  • a branch location virtual storage array interface forwards the storage block read request to the data center virtual storage array interface via the WAN connection.
  • the data center virtual storage array interface then retrieves the requested storage block from the physical storage array and returns it to the branch location virtual storage array interface, which in turn provides this requested storage block to the storage client.
  • a copy of the retrieved storage block may be stored in the storage block read cache for future accesses.
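  • A minimal sketch of steps 255 to 270 follows, assuming the WAN retrieval can be represented by a simple callable. The class `BranchReadPath`, the `wan_fetches` counter, and the sample block contents are hypothetical and serve only to illustrate the cache check.

```python
class BranchReadPath:
    """Hypothetical branch-side read handling for steps 255 to 270."""

    def __init__(self, fetch_over_wan):
        self.cache = {}                       # storage block read cache
        self.fetch_over_wan = fetch_over_wan  # callable: address -> bytes
        self.wan_fetches = 0

    def read(self, address):
        if address in self.cache:             # decision block 260
            return self.cache[address]        # step 270
        data = self.fetch_over_wan(address)   # step 265
        self.wan_fetches += 1
        self.cache[address] = data            # keep a copy for future accesses
        return data


data_center = {14: b"block-14", 25: b"block-25"}
branch = BranchReadPath(data_center.__getitem__)
first = branch.read(14)
second = branch.read(14)  # served from the cache, no WAN round trip
```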
  • steps 275 to 299 prefetch additional storage blocks likely to be requested by the storage client in the near future.
  • Step 275 identifies a high-level data structure entity associated with the requested storage block.
  • high-level data structure entities include file system entities such as files, directories, and file system blocks or clusters; and database structures such as database tables, rows, and nodes.
  • Typical block storage protocols, such as iSCSI and FCP, specify block read requests using a storage block address or identifier. However, these storage block read requests do not include any identification of the associated high-level data structure entity, such as a specific file, directory, or database entity, that is associated with this storage block.
  • an embodiment of step 275 identifies the high-level data structure entity corresponding with the requested storage block.
  • a branch or data center virtual storage array interface searches a file system data structure, such as an allocation table or tree, or a database data structure, such as a B-tree, to identify one or more high-level data structure entities corresponding with the requested storage block.
  • a branch or data center virtual storage array interface preprocesses data structures to create other databases, tables, or other data structures adapted to facilitate searching for high-level data structure entities corresponding with storage blocks. These data structures mapping storage blocks to corresponding high-level data structure entities may be updated frequently or infrequently, depending upon the desired prefetching performance.
  • step 275 also determines a location or range of locations within the high-level data structure entity corresponding with the requested storage block.
  • a storage block may correspond with a specific range of addresses or offsets within a larger file.
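  • One way to picture the preprocessed data structure used by step 275 is a reverse index from storage block addresses to a file and an offset range. The sketch below assumes, purely for simplicity, that every file system block occupies exactly one 4096-byte storage block; the file names, block addresses, and function name are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed storage block size in bytes


def build_reverse_index(file_extents):
    """Map each storage block address to (file name, first offset, last offset).

    file_extents: {file name: [storage block address, ...]} in file order.
    """
    index = {}
    for name, blocks in file_extents.items():
        for position, address in enumerate(blocks):
            start = position * BLOCK_SIZE
            index[address] = (name, start, start + BLOCK_SIZE - 1)
    return index


index = build_reverse_index({"Foo.txt": [101, 200, 14], "Bar.txt": [9]})
entity = index[200]
```

  • With this index, a read of storage block 200 resolves to the second 4096-byte portion of the hypothetical file "Foo.txt".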
  • step 280 identifies additional high-level data structure entities or portions thereof that are likely to be requested by the storage client.
  • A variety of techniques for identifying additional high-level data structure entities or portions thereof for prefetching may be used by embodiments of step 280 . Some of these are described in detail in co-pending U.S. patent application Ser. No. ______[Attorney Docket Number R001420US], entitled “Virtual Data Storage System Optimizations”, filed ______, which is incorporated by reference herein for all purposes.
  • An embodiment of step 280 prefetches portions of the high-level data structure entity based on their adjacency or close proximity to the identified portion of the entity. For example, if step 275 determines that the requested storage block corresponds with a portion of a file from file offset 0 up to offset 4095, then step 280 may identify a second portion of this same file beginning with offset 4096 for prefetching. It should be noted that although these two portions are adjacent in the high-level data structure entity, their corresponding storage blocks may be non-contiguous.
  • In another example, application or protocol specific information may be used to identify storage blocks for prefetching and caching. For example, if the virtual storage array is used to store e-mail data, a branch or data center virtual storage array interface may identify an e-mail account or e-mail message ID associated with a requested storage block and then identify and prefetch storage blocks associated with the same user, with the same e-mail message ID, and/or with e-mail messages having nearby e-mail message IDs. This application or protocol specific information may be used alone or in conjunction with the above-described file system or database data.
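  • The adjacency-based selection can be sketched as follows, assuming a file is described by the ordered list of storage blocks that hold it and that each block covers 4096 bytes. The block addresses, the `lookahead` parameter, and the function name are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed bytes of file data per storage block


def adjacent_portions(file_blocks, requested_address, lookahead=2):
    """Return (file offset, storage block address) pairs that follow the request.

    file_blocks lists a file's storage block addresses in file order, so
    neighbours in the list are adjacent in the file even when their
    addresses are far apart.
    """
    position = file_blocks.index(requested_address)
    end = min(position + 1 + lookahead, len(file_blocks))
    return [(p * BLOCK_SIZE, file_blocks[p]) for p in range(position + 1, end)]


layout = [57, 3, 88, 41]  # hypothetical, non-contiguous block addresses
to_prefetch = adjacent_portions(layout, 57)
```

  • Here a request for the block holding offsets 0 to 4095 selects the portions beginning at offsets 4096 and 8192, which reside in the non-contiguous storage blocks 3 and 88.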
  • Step 280 identifies all or portions of one or more high-level data structure entities for prefetching based on the high-level data structure entity associated with the requested storage block.
  • storage clients specify data access requests in terms of storage blocks, not high-level data structure entities such as files, directories, or database entities.
  • step 285 identifies one or more storage blocks corresponding with the high-level data structure entities identified for prefetching in step 280 .
  • step 285 identifies additional storage blocks corresponding with the high-level data structure entities by accessing the data structures associated with a file system data structure, such as an allocation table or tree, or a database data structure, such as a B-tree, in a manner similar to a client system or application server requesting a high-level data structure entity.
  • step 285 accesses a separate data structure maintained by a virtual storage array interface to identify one or more storage blocks corresponding with the high-level data structure entities identified for prefetching.
  • Decision block 290 determines if any of the storage blocks identified in step 285 have already been stored in the storage block read cache located at the branch location. If not, step 295 retrieves these uncached additional storage blocks from the virtual storage array data located in a physical data storage on the data center LAN and sends them via a WAN connection to the appropriate branch LAN. Step 299 stores these additional storage blocks in the branch's virtual storage array cache for potential future access by storage clients within the branch LAN. In a further embodiment, decision block 290 and the determination of whether an additional storage block has been previously retrieved and cached may be omitted. Instead, this embodiment can send all of the identified additional storage blocks to the branch virtual storage array interface to be cached. The branch virtual storage array interface may then discard any redundant storage blocks. This embodiment can be used when WAN latency, rather than WAN bandwidth, is the overriding concern.
  • Although method 250 of FIG. 2B is described with respect to accessing files via the virtual storage array, embodiments of method 250 can also be applied to non-file based storage accesses.
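  • The two variants described above, with and without decision block 290, can be contrasted in a short sketch. Dictionaries stand in for the data center storage and the branch cache, and the function name and `check_cache` flag are hypothetical.

```python
def prefetch(candidates, branch_cache, data_center, check_cache=True):
    """Send prefetch candidates to the branch cache; return how many crossed the WAN."""
    if check_cache:
        # Decision block 290: skip blocks the branch already holds.
        to_send = [a for a in candidates if a not in branch_cache]
    else:
        # Further embodiment: send everything and let the branch discard duplicates.
        to_send = list(candidates)
    for address in to_send:                     # step 295
        if address not in branch_cache:         # redundant blocks are discarded
            branch_cache[address] = data_center[address]  # step 299
    return len(to_send)


data_center = {1: b"a", 2: b"b", 3: b"c"}
cache_checked = {2: b"b"}
sent_checked = prefetch([1, 2, 3], cache_checked, data_center)
cache_unchecked = {2: b"b"}
sent_unchecked = prefetch([1, 2, 3], cache_unchecked, data_center, check_cache=False)
```

  • Both variants leave the branch cache in the same state; they differ only in how many blocks cross the WAN and in whether a cache lookup is needed before sending.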
  • an embodiment of method 250 can be applied to access databases via the virtual storage array.
  • portions of database tables or B-tree child nodes, rather than file system blocks, are used to identify corresponding storage blocks for prefetching and caching by a branch virtual storage array interface.
  • indirect blocks of a file system may be used to identify additional storage blocks to be prefetched and cached.
  • method 250 then returns to step 255 to await receipt of further storage block requests.
  • the storage blocks added to the storage block read cache in previous iterations of method 250 may be available for fulfilling storage block read requests.
  • Method 250 may be performed by a branch virtual data storage array interface, by a data center virtual data storage array interface, or by both virtual data storage array interfaces working in concert. For example, steps 255 to 270 of method 250 may be performed by a branch location virtual storage array interface and steps 275 to 299 of method 250 may be performed by a data center virtual storage array interface. In another example, all of the steps of method 250 may be performed by a branch location virtual storage array interface.
  • FIG. 3 illustrates a method 300 of optimizing data writes in a virtual storage array system according to an embodiment of the invention.
  • An embodiment of method 300 starts with step 305 receiving a storage block write request from a storage client within the branch LAN.
  • the storage block write request may be received by a branch virtual storage interface.
  • decision block 310 determines if the virtual storage array cache is capable of accepting additional write requests or is full.
  • the virtual storage array cache may use some or all of its storage as a queue for pending virtual storage array operations.
  • step 315 stores the storage block write request, including the storage block data to be written, in the virtual storage array cache.
  • step 320 then sends a write acknowledgement to the storage client. Following the storage client's receipt of this write acknowledgement, the storage client believes its storage block write request is complete and can continue to operate normally. However, in step 325 , the virtual storage array interface will transfer the queued written storage block via the WAN to the physical storage array at the data center LAN. In an embodiment, step 325 may perform this transfer in the background and asynchronously with the operation of storage clients.
  • the virtual storage array interface intercepts the storage block access request.
  • the virtual storage array interface provides the storage client with the queued storage block.
  • the virtual storage array interface will update the queued storage block data and send a write acknowledgement to the storage client for this additional storage block access.
  • step 330 immediately transfers the storage block via the WAN to the physical storage array at the data center LAN. Following completion of this transfer, step 335 receives a write acknowledgement from the data center virtual storage array interface or the physical data storage array itself. Step 340 then sends a write acknowledgement to the storage client, allowing the storage client to resume normal operation.
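  • The two write paths of method 300 can be sketched as a bounded queue with a synchronous fallback. This is a simplified illustration under stated assumptions: the class `BranchWritePath`, the acknowledgement strings, and the callable standing in for the WAN transfer are hypothetical, and the sketch does not address ordering between queued and immediately transferred writes.

```python
from collections import deque


class BranchWritePath:
    """Hypothetical branch-side write handling for method 300."""

    def __init__(self, capacity, send_over_wan):
        self.capacity = capacity
        self.queue = deque()                # pending writes held in the cache
        self.send_over_wan = send_over_wan  # callable(address, data)

    def write(self, address, data):
        if len(self.queue) < self.capacity:      # decision block 310
            self.queue.append((address, data))   # step 315
            return "ack-queued"                  # step 320
        self.send_over_wan(address, data)        # steps 330 and 335
        return "ack-written-through"             # step 340

    def flush_one(self):
        """Step 325: transfer one queued write in the background."""
        if self.queue:
            self.send_over_wan(*self.queue.popleft())


physical_array = {}
path = BranchWritePath(capacity=1, send_over_wan=physical_array.__setitem__)
first_ack = path.write(1, b"a")   # cache has room, acknowledged at once
second_ack = path.write(2, b"b")  # cache full, transferred before acknowledging
```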
  • a virtual storage array interface may throttle storage block read and/or write requests from storage clients to prevent the virtual storage array cache from filling up under typical usage scenarios.
  • FIGS. 4A-4B illustrate data migration of a virtual storage array system according to an embodiment of the invention. Because the data storage of a branch's virtual storage array is located at a data center, rather than at the branch location, migrating data from one branch to another branch is straightforward.
  • FIG. 4A illustrates a first branch virtual storage interface 405 at a first branch 410 that provides access to a virtual storage array 415 a, having its virtual storage array data 420 stored at a data center 425 .
  • the first branch virtual storage array interface is configured to deactivate the first branch's access to the virtual storage array.
  • a second branch virtual storage array interface at the second branch is then configured to access the virtual storage array data at the data center, thus providing the second branch with access to the virtual storage array.
  • FIG. 4B illustrates an example of a second branch virtual storage interface 430 at a second branch 435 that provides access to a virtual storage array 415 b, having its virtual storage array data 420 stored at a data center 425 .
  • the first branch virtual storage array interface 405 at the first branch 410 has been configured to deactivate the first branch's access to the virtual storage array.
  • the second branch 435 has exclusive access to the virtual storage array data 420 via virtual storage array 415 b.
  • the first branch virtual storage interface upon deactivating the virtual storage array 415 a at a first branch 410 , is adapted to transfer any updated storage data in its virtual storage array cache, such as new or updated storage blocks associated with pending write operations, back to the virtual storage array data 420 in the physical data storage array 440 . This ensures that the virtual storage array data 420 maintained at the data center 425 is up to date.
  • virtual storage array data 420 does not change location when a virtual storage array 415 is migrated to a new location.
  • virtual storage arrays can be migrated frequently. For example, if an organization has a first branch in New York and a second branch in India, a virtual storage array may be migrated between these offices every work day. Because of the time differences between these two locations, the virtual storage array enables a 24-hour work cycle. During business hours in the New York branch, the New York branch will be given access to the virtual storage array. At the same time, it is late at night in India; thus this branch does not require access to the virtual storage array.
  • the New York branch virtual storage array interface deactivates its virtual storage array access and completes any remaining updates to the virtual storage array data at the data center. Then, the India branch virtual storage array interface can activate virtual storage array access for the India branch. This allows the India branch to access the virtual storage array while the New York branch is closed for the night. At the end of business hours in India, this process is reversed and the New York branch reconnects with the virtual storage array.
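  • The handoff between branches can be sketched as a deactivate-then-activate sequence in which the outgoing branch first pushes its pending writes back to the data center. The class `BranchInterface`, the `migrate` helper, and the shared dictionary standing in for the virtual storage array data are hypothetical.

```python
class BranchInterface:
    """Hypothetical branch virtual storage array interface."""

    def __init__(self, name, data_center):
        self.name = name
        self.data_center = data_center  # shared dict standing in for the array data
        self.pending = {}               # writes not yet sent over the WAN
        self.active = False

    def activate(self):
        self.active = True

    def write(self, address, data):
        if not self.active:
            raise RuntimeError(f"{self.name} has no access to the virtual storage array")
        self.pending[address] = data

    def deactivate(self):
        # Complete any remaining updates before giving up access.
        self.data_center.update(self.pending)
        self.pending.clear()
        self.active = False


def migrate(source, target):
    source.deactivate()
    target.activate()


array_data = {}
new_york = BranchInterface("New York", array_data)
india = BranchInterface("India", array_data)
new_york.activate()
new_york.write(10, b"day shift")
migrate(new_york, india)
```

  • Because the array data itself never moves, the migration in this sketch consists only of flushing the outgoing branch's pending writes and switching which interface is active.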
  • a virtual storage array interface at the branch can connect with the virtual storage array interface that is currently connected with the virtual storage array data via the WAN to provide after-hours storage clients access to the virtual storage array.
  • the virtual storage array data 420 is accessed by virtual storage array 415 b currently provided by virtual storage array interface 430 located at the second branch 435 . If a client system 445 at the first branch 410 needs to access data in the virtual storage array 415 b, the client system 445 contacts the first virtual storage array interface 405 . The first virtual storage array interface 405 then contacts the second virtual storage array interface 430 to access the virtual storage array 415 b.
  • one or more virtual machines executing virtual storage array applications, application servers, and/or other applications may migrate with a virtual storage array between two or more branches.
  • an application server such as a database application or an e-mail server and its associated data storage, implemented using a virtual storage array, may move together between branches. Because the application server is implemented within a virtual machine, this migration between branches may be seamless from the perspective of the application server.
  • FIG. 5 illustrates a method 500 of creating data snapshots of a virtual storage array according to an embodiment of the invention.
  • An embodiment of the method 500 begins in step 505 with the initiation of a virtual storage array checkpoint.
  • a virtual storage array checkpoint may be initiated automatically by a branch virtual storage interface according to a schedule or based on criteria, such as the amount of data changed since the last checkpoint.
  • a virtual storage array checkpoint may be initiated in response to a request for a virtual storage array snapshot from a system administrator or administration application.
  • step 510 sets the branch virtual storage array interface to a quiescent state. This entails completing any pending operations with storage clients (though not necessarily background operations between the branch and data center virtual storage array interfaces). While in the quiescent state, the branch virtual storage interface will not accept any new storage operations from storage clients.
  • an embodiment of the branch virtual storage array interface identifies updated storage blocks in its associated virtual storage array cache. These updated storage blocks include data that has been created or updated by storage clients but has yet to be transferred via the WAN back to the data center LAN for storage in the physical data storage array.
  • In step 515 , an embodiment of the branch virtual storage array creates a checkpoint data structure.
  • the checkpoint data structure specifies a time of checkpoint creation and the set of updated storage blocks at that moment of time.
  • step 520 reactivates the branch's virtual storage array.
  • the branch virtual storage array interface can resume servicing storage operations from storage clients.
  • the branch virtual storage array may resume transferring new or updated storage blocks via the WAN to the data center LAN for storage in the physical data storage array.
  • the virtual storage array cache may maintain a copy of an updated storage block even after a copy is transferred back to the data center LAN for storage. This allows subsequent snapshots to be created based on this data.
  • the virtual storage array interface preserves the updated storage blocks specified by the checkpoint data structure from further changes. If a storage client attempts to update a storage block that is associated with a checkpoint, an embodiment of the virtual storage array interface creates a duplicate of this storage block in the virtual storage array cache to store the updated data. This preserves the data of this storage block at the time of the checkpoint for potential future reference.
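  • A minimal sketch of a checkpoint data structure and of the preservation of checkpointed blocks follows. The class `CheckpointingCache` and its field names are hypothetical. In this sketch, Python's immutable `bytes` objects stand in for the duplicate storage block: a later write rebinds the address to new data while the checkpoint keeps referring to the earlier contents.

```python
import time


class CheckpointingCache:
    """Hypothetical cache that records checkpoints of updated storage blocks."""

    def __init__(self):
        self.updated = {}      # address -> data not yet confirmed at the data center
        self.checkpoints = []  # each: {"time": float, "blocks": {address: data}}

    def write(self, address, data):
        # The new data is stored separately; checkpointed contents are untouched.
        self.updated[address] = data

    def checkpoint(self):
        record = {"time": time.time(), "blocks": dict(self.updated)}
        self.checkpoints.append(record)
        return record


cache = CheckpointingCache()
cache.write(5, b"old")
saved = cache.checkpoint()
cache.write(5, b"new")  # update after the checkpoint
```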
  • an embodiment of the method 500 may initiate one or more additional virtual storage array checkpoints at later times or in response to criteria or conditions.
  • Embodiments of the virtual storage array interface may maintain any arbitrary number of checkpoint data structures and automatically delete outdated checkpoint data structures.
  • a branch virtual storage interface may maintain only the most recently created checkpoint data structure, or checkpoint data structures from the beginning of the most recent business day and the most recent hour.
  • a system administrator or administration application may request a snapshot of the virtual storage array data.
  • a snapshot of the virtual storage array data represents the complete set of virtual storage array data at a specific moment of time.
  • Step 525 receives a snapshot request from a system administrator or administration application.
  • an embodiment of a branch virtual storage array interface transfers a copy of the appropriate checkpoint data structure to the data center virtual storage interface. Additionally, the branch virtual storage array interface transfers a copy of any updated storage blocks specified by this checkpoint data structure.
  • the data center virtual storage array interface creates a snapshot of the data of the virtual storage array.
  • the snapshot includes a copy of all of the virtual storage array data in the physical data storage array unchanged from the time of creation of the checkpoint data structure.
  • the snapshot also includes a copy of the updated storage blocks specified by the checkpoint data structure.
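  • The composition of a snapshot from these two sources can be sketched as a dictionary merge. The sketch assumes, as a simplification, that blocks not listed in the checkpoint are unchanged in the physical storage array since the checkpoint was created; the function name, sample data, and timestamp value are arbitrary illustrations.

```python
def build_snapshot(physical_data, checkpoint):
    """Combine unchanged physical data with the checkpoint's updated blocks."""
    snapshot = dict(physical_data)         # copy of the data held at the data center
    snapshot.update(checkpoint["blocks"])  # overlay blocks updated at the branch
    return snapshot


physical = {1: b"base-1", 2: b"base-2"}
checkpoint = {"time": 1269302400.0, "blocks": {2: b"edited-2", 3: b"new-3"}}
snapshot = build_snapshot(physical, checkpoint)
```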
  • An embodiment of the data center virtual storage array interface may store the snapshot in the physical storage array or using a data backup.
  • the data center virtual storage array interface automatically sends storage operations to the physical storage array interface to create a snapshot from a checkpoint data structure. These storage operations can be carried out in the background by the data center virtual storage array interface in addition to translating virtual storage array operations from one or more branch virtual storage array interfaces into corresponding physical storage array operations.
  • storage clients can interact with virtual storage arrays in the same manner that they would interact with physical storage arrays. This includes issuing storage commands to the branch virtual storage interface using storage array network protocols such as iSCSI or Fibre Channel protocol.
  • Most storage array network protocols organize data according to storage blocks, each of which has a unique storage address or location.
  • a storage block's unique storage address may include a logical unit number (using the SCSI protocol) or other representation of a logical volume.
  • the virtual storage arrays provided by branch virtual storage interfaces allow storage clients to access storage blocks by their unique storage address within the virtual storage array.
  • one or more virtual storage arrays actually store their data within a physical storage array, for example implemented as a physical storage area network
  • an embodiment of the invention allows arbitrary mappings between the unique storage addresses of storage blocks in the virtual storage array and the corresponding unique storage addresses in one or more physical storage arrays.
  • the mapping between virtual and physical storage address may be performed by a branch virtual storage array interface and/or by a data center virtual storage array interface.
  • storage blocks in the virtual storage array may be of a different size and/or structure than the corresponding storage blocks in the physical storage array. For example, if data compression is applied to the storage data, then the physical storage array data blocks may be smaller than the storage blocks of the virtual storage array, to take advantage of data storage savings.
  • the branch and/or data center virtual storage array interfaces map one or more virtual storage array storage blocks to one or more physical storage array storage blocks.
  • a virtual storage array storage block can correspond with a fraction of a physical storage array storage block, a single physical storage array storage block, or multiple physical storage array storage blocks, as required by the configuration of the virtual and physical storage arrays.
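  • As a hedged illustration of such a mapping, the sketch below compresses each virtual storage block and records where the compressed bytes reside in a physical byte array. The class `VirtualToPhysicalMap` is hypothetical, the layout is append-only for brevity (overwritten blocks are not reclaimed), and `zlib` is used only as a readily available compressor.

```python
import zlib


class VirtualToPhysicalMap:
    """Hypothetical mapping of virtual block addresses to compressed physical extents."""

    def __init__(self):
        self.physical = bytearray()
        self.extents = {}  # virtual address -> (offset, length) in self.physical

    def write(self, virtual_address, data):
        compressed = zlib.compress(data)
        self.extents[virtual_address] = (len(self.physical), len(compressed))
        self.physical += compressed

    def read(self, virtual_address):
        offset, length = self.extents[virtual_address]
        return zlib.decompress(bytes(self.physical[offset:offset + length]))


mapping = VirtualToPhysicalMap()
mapping.write(0, b"A" * 4096)
mapping.write(1, b"B" * 4096)
```

  • Each 4096-byte virtual block in this example occupies a much smaller physical extent, so several virtual blocks may share what would be a single physical block.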
  • the branch and data center virtual storage array interfaces may reorder or regroup storage operations from storage clients to improve efficiency of data optimizations such as data compression. For example, if two storage clients are simultaneously accessing the same virtual storage array, then these storage operations will be intermixed when received by the branch virtual storage array interface.
  • An embodiment of the branch and/or data center virtual storage array interface can reorder or regroup these storage operations according to storage client, type of storage operation, data or application type, or any other attribute or criteria to improve virtual storage array performance and efficiency.
  • a virtual storage array interface can group storage operations by storage client and apply data compression to each storage client's operations separately, which is likely to provide greater data compression than compressing all storage operations together.
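  • Regrouping intermixed storage operations by storage client before compression can be sketched as follows. The function name, the client identifiers, and the payloads are hypothetical, and `zlib` stands in for whichever compressor an embodiment uses; the sketch makes no claim about the compression ratio achieved.

```python
import zlib
from collections import defaultdict


def compress_per_client(operations):
    """Group (client, payload) pairs by client and compress each group separately."""
    grouped = defaultdict(list)
    for client, payload in operations:
        grouped[client].append(payload)  # preserves each client's arrival order
    return {client: zlib.compress(b"".join(payloads))
            for client, payloads in grouped.items()}


intermixed = [
    ("client-1", b"log line 1\n"),
    ("client-2", b"\x00\x01\x02\x03"),
    ("client-1", b"log line 2\n"),
    ("client-2", b"\x04\x05\x06\x07"),
]
compressed = compress_per_client(intermixed)
```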
  • FIG. 6 illustrates an example 600 of optimized data compression and deduplication using file-system or other storage format awareness, such as database nodes, according to an embodiment of the invention.
  • incoming requests for file system blocks or clusters are regrouped and reordered based on their associated file system file and their position within their respective files.
  • unique storage labels can be assigned to storage blocks or groups of storage blocks in the virtual storage array cache. These unique storage labels can be determined arbitrarily or based on the data included in storage blocks, for example using hashes or hashes of hashes.
  • hierarchical labels may be assigned to storage blocks. A hierarchical label is associated with a sequence of one or more additional labels. Each of these additional labels is associated with either a storage block or one or more additional labels.
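  • Labels based on hashes, and hierarchical labels based on hashes of hashes, can be sketched as follows. The choice of SHA-256, the table layout, and the function names are assumptions for illustration only.

```python
import hashlib


def label(data):
    """Label a storage block by a hash of its contents."""
    return hashlib.sha256(data).hexdigest()


def hierarchical_label(child_labels, table):
    """Create a label for a sequence of labels (a hash of hashes) and record it."""
    parent = label("".join(child_labels).encode("ascii"))
    table[parent] = list(child_labels)
    return parent


def expand(lbl, table, blocks):
    """Resolve a label into the ordered storage blocks it ultimately refers to."""
    if lbl in blocks:
        return [blocks[lbl]]
    result = []
    for child in table[lbl]:
        result.extend(expand(child, table, blocks))
    return result


blocks = {label(b"a"): b"a", label(b"b"): b"b", label(b"c"): b"c"}
table = {}
pair = hierarchical_label([label(b"a"), label(b"b")], table)
root = hierarchical_label([pair, label(b"c")], table)
resolved = expand(root, table, blocks)
```

  • In this sketch a single label, `root`, stands for three storage blocks, with one of its children being another hierarchical label.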
  • Embodiments of the invention can implement virtual storage array interfaces at the branch and/or data center as standalone devices or as part of other devices, computer systems, or applications.
  • FIG. 7 illustrates an example virtual machine implementation 700 of a virtual storage array interface according to an embodiment of the invention.
  • the virtual storage array interface 705 is implemented as a software application executed by a virtual machine 710 .
  • the virtual machine 710 is located in this example within a network optimizer device 715 ; however, other embodiments of this virtual machine implementation 700 can be located within other types of network devices, including switches, routers, and storage devices and interfaces.
  • the virtual machine 710 implementing the virtual storage interface is optionally connected with an internal or external data storage device to act as a virtual storage array cache 720 .
  • the network optimizer 715 includes LAN and WAN network connections 725 and 730 for intercepting network traffic.
  • a virtual machine hardware and software interface 740 is connected with these network connections to allow the virtual machine to send and receive network communications.
  • the network optimizer also includes a network optimization module 735 for performing WAN optimization techniques on network traffic passing between the LAN and the WAN network connections 725 and 730 .
  • the network optimizer 715 or other host device may include multiple virtual machines for executing additional applications, application servers, and/or performing additional data processing functions.
  • a network optimizer device can include a first virtual machine for implementing a virtual storage array interface to a virtual storage array; a second virtual machine for implementing an application server, such as a database application; and a third virtual machine executing a data processing application, such as an anti-virus scanning application.
  • the virtual machines can communicate with each other as well as with other entities connected via the local and wide area networks.
  • FIG. 8 illustrates an example computer system capable of implementing a virtual storage array interface according to an embodiment of the invention.
  • FIG. 8 is a block diagram of a computer system 2000 , such as a personal computer or other digital device, suitable for practicing an embodiment of the invention.
  • Embodiments of computer system 2000 may include dedicated networking devices, such as wireless access points, network switches, hubs, routers, hardware firewalls, network traffic optimizers and accelerators, network attached storage devices, storage array network interfaces, and combinations thereof.
  • Computer system 2000 includes a central processing unit (CPU) 2005 for running software applications and optionally an operating system.
  • CPU 2005 may be comprised of one or more processing cores.
  • Memory 2010 stores applications and data for use by the CPU 2005 . Examples of memory 2010 include dynamic and static random access memory.
  • Storage 2015 provides non-volatile storage for applications and data and may include fixed or removable hard disk drives, flash memory devices, ROM memory, and CD-ROM, DVD-ROM, Blu-ray, HD-DVD, or other magnetic, optical, or solid state storage devices.
  • CPU 2005 may execute virtual machine software applications to create one or more virtual processors capable of executing additional software applications and optional additional operating systems.
  • Optional user input devices 2020 communicate user inputs from one or more users to the computer system 2000 , examples of which may include keyboards, mice, joysticks, digitizer tablets, touch pads, touch screens, still or video cameras, and/or microphones.
  • user input devices may be omitted and computer system 2000 may present a user interface to a user over a network, for example using a web page or network management protocol and network management software applications.
  • Computer system 2000 includes one or more network interfaces 2025 that allow computer system 2000 to communicate with other computer systems via an electronic communications network, and may include wired or wireless communication over local area networks and wide area networks such as the Internet.
  • Computer system 2000 may support a variety of networking protocols at one or more levels of abstraction.
  • Computer system 2000 may support networking protocols at one or more layers of the seven layer OSI network model.
  • An embodiment of network interface 2025 includes one or more wireless network interfaces adapted to communicate with wireless clients and with other wireless networking devices using radio waves, for example using the 802.11 family of protocols, such as 802.11a, 802.11b, 802.11g, and 802.11n.
  • An embodiment of the computer system 2000 may also include a wired networking interface, such as one or more Ethernet connections to communicate with other networking devices via local or wide-area networks.
  • the components of computer system 2000 including CPU 2005 , memory 2010 , data storage 2015 , user input devices 2020 , and network interface 2025 are connected via one or more data buses 2060 . Additionally, some or all of the components of computer system 2000 , including CPU 2005 , memory 2010 , data storage 2015 , user input devices 2020 , and network interface 2025 may be integrated together into one or more integrated circuits or integrated circuit packages. Furthermore, some or all of the components of computer system 2000 may be implemented as application specific integrated circuits (ASICS) and/or programmable logic.
  • embodiments of the invention can be used with any number of network connections and may be added to any type of network device, client or server computer, or other computing device in addition to the computer illustrated above.
  • combinations or sub-combinations of the above disclosed invention can be advantageously made.
  • the block diagrams of the architecture and flow charts are grouped for ease of understanding. However it should be understood that combinations of blocks, additions of new blocks, re-arrangement of blocks, and the like are contemplated in alternative embodiments of the present invention.

Abstract

Virtual storage arrays consolidate branch data storage at data centers connected via wide area networks. Virtual storage arrays appear to storage clients as local data storage; however, virtual storage arrays actually store data at the data center. The virtual storage arrays overcome bandwidth and latency limitations of the wide area network by predicting and prefetching storage blocks, which are then cached at the branch location. Virtual storage arrays leverage an understanding of the semantics and structure of high-level data structures associated with storage blocks to predict which storage blocks are likely to be requested by a storage client in the near future. Virtual storage arrays determine the association between requested storage blocks and corresponding high-level data structure entities to predict additional high-level data structure entities that are likely to be accessed. From this, the virtual storage array identifies the additional storage blocks for prefetching.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims priority to U.S. Provisional Patent Application No. 61/162,463, entitled “Virtualized Data Storage Over Wide-Area Networks”, filed Mar. 23, 2009; and is related to U.S. patent application Ser. No. ______ [Attorney Docket Number R001410US], entitled “Virtualized Data Storage System Architecture”, filed ______; U.S. patent application Ser. No. ______ [Attorney Docket Number R001411US], entitled “Virtualized Data Storage Cache Management”, filed ______; and U.S. patent application Ser. No. ______ [Attorney Docket Number R001420US], entitled “Virtual Data Storage System Optimizations”, filed ______; all of which are incorporated by reference herein for all purposes.
  • BACKGROUND
  • The present invention relates generally to data storage systems, and systems and methods to improve storage efficiency, compactness, performance, reliability, and compatibility. In computing, a file system specifies an arrangement for storing, retrieving, and organizing data files or other types of data on data storage devices, such as hard disk devices. A file system may include functionality for maintaining the physical location or address of data on a data storage device and for providing access to data files from local or remote users or applications.
  • Typically, data storage for multiple users and applications in an enterprise is implemented using a file server attached to one or more client systems and application servers via a local area network (LAN). The file server allows users and applications to access data via file-based network protocols, such as NFS or SMB/CIFS.
  • Many physical storage devices, such as hard disk drives, are too small, too slow, and too unreliable for enterprise storage operations. As a result, many file servers are connected with large numbers of remote data storage devices, such as disk arrays, tape libraries, and optical drive jukeboxes, via a storage area network (SAN). A storage area network appears to file and application servers as one or more locally attached storage devices. Storage area networks use protocols such as iSCSI and Fibre Channel Protocol to communicate with storage clients. These storage area network protocols are based on reading and writing blocks of data to storage devices and typically operate below the level of the file system.
  • Large organizations, such as enterprises, are often geographically spread out over many separate locations, referred to as branches. For example, an enterprise may have offices or branches in New York, San Francisco, and India. Each branch location may include its own internal local area network for exchanging data within the branch. Additionally, the branches may be connected via a wide area network, such as the internet, for exchanging data between branches.
  • Typical branch LAN installations also require data storage for their local client systems and application servers. For example, a typical branch LAN installation may include a file server for storing data for the client systems and application services. In prior systems, this branch's data storage is located at the branch site and connected directly with the branch LAN. Thus, each branch requires its own file server and associated data storage devices.
  • Deploying and maintaining file servers and data storage at a number of different branches is expensive and inefficient. Organizations often require on-site personnel at each branch to configure and upgrade each branch's data storage, and to manage data backups and data retention. Additionally, organizations often purchase excess storage capacity for each branch to allow for upgrades and growing data storage requirements. Because branches are serviced infrequently, due to their numbers and geographic dispersion, organizations often deploy enough data storage at each branch to allow for months or years of storage growth. However, this excess storage capacity often sits unused for months or years until it is needed, unnecessarily driving up costs.
  • Previously, some types of information technology infrastructure, such as application servers, from multiple branches have been consolidated to one or a small number of centralized data centers. These centralized data centers are connected with multiple branches via a wide area network, such as the internet. This consolidation of information technology infrastructure decreases costs and improves management efficiency. However, branch data storage is rarely consolidated at a remote data center, because the intervening WAN is slow and has high latency, making storage accesses unacceptably slow for client systems and application servers. Thus, organizations have previously been unable to consolidate data storage from multiple branches.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The invention will be described with reference to the drawings, in which:
  • FIGS. 1A-1B illustrate virtual storage array systems according to an embodiment of the invention;
  • FIGS. 2A-2B illustrate a method of optimizing data reads in a virtual storage array system according to an embodiment of the invention;
  • FIG. 3 illustrates a method of optimizing data writes in a virtual storage array system according to an embodiment of the invention;
  • FIGS. 4A-4B illustrate data migration of a virtual storage array system according to an embodiment of the invention;
  • FIG. 5 illustrates a method of creating data snapshots of a virtual storage array according to an embodiment of the invention;
  • FIG. 6 illustrates an example optimized data compression and deduplication using file-system or other storage format awareness according to an embodiment of the invention;
  • FIG. 7 illustrates an example virtual machine implementation of a virtual storage array interface according to an embodiment of the invention; and
  • FIG. 8 illustrates an example computer system capable of implementing a virtual storage array interface according to an embodiment of the invention.
  • SUMMARY
  • An embodiment of the invention uses virtual storage arrays to consolidate branch location-specific data storage at data centers connected with branch locations via wide area networks. The virtual storage array appears to a storage client as a local branch data storage; however, embodiments of the invention actually store the virtual storage array data at a data center connected with the branch location via a wide-area network. In embodiments of the invention, a branch storage client accesses the virtual storage array using storage block based protocols.
  • Embodiments of the invention overcome the bandwidth and latency limitations of the wide area network between branch locations and the data center by predicting storage blocks likely to be requested in the future by the branch storage client and prefetching and caching these predicted storage blocks at the branch location. When this prediction is successful, storage block requests from the branch storage client may be fulfilled in whole or in part from the branch location's storage block cache. As a result, the latency and bandwidth restrictions of the wide-area network are hidden from the storage client.
  • The branch location storage client uses storage block-based protocols to specify reads, writes, modifications, and/or deletions of storage blocks. However, servers and higher-level applications typically access data in terms of files in a structured file system, relational database, or other high-level data structure. Each entity in the high-level data structure, such as a file or directory, or database table, node, or row, may be spread out over multiple storage blocks at various non-contiguous locations in the storage device. Thus, prefetching storage blocks based solely on their locations in the storage device is unlikely to be effective in hiding wide-area network latency and bandwidth limits from storage clients.
  • An embodiment of the invention leverages an understanding of the semantics and structure of the high-level data structures associated with the storage blocks to predict which storage blocks are likely to be requested by a storage client in the near future. To do this, an embodiment of the invention determines the association between requested storage blocks and the corresponding high-level data structure entities, such as files, directories, or database elements. Once this embodiment has identified one or more of the high-level data structure entities associated with a requested storage block, this embodiment of the invention identifies additional portions of the same or other high-level data structure entities that are likely to be accessed by the storage client. This embodiment of the invention then identifies the additional storage blocks corresponding to these additional high-level data structure entities. The additional storage blocks are then prefetched and cached at the branch location.
  • A further embodiment of the invention also hides wide-area network latency and bandwidth limits from storage clients during write operations by caching write requests from storage clients at their associated branch locations. Once a write request is cached at the branch location, the write request is acknowledged as complete to the storage client, allowing the storage client to continue operations. The cached write requests are then transferred from the branch location to the data center independently of the storage clients' operations. Once a new or updated storage block is stored in the storage block cache, it may be accessed by storage clients prior to its transfer to the data storage.
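  • As an illustration only, the write path described above might be sketched as follows. This is a minimal Python sketch under stated assumptions, not the implementation of any embodiment: the class name BranchWriteCache, its methods, and the send_to_data_center callback (which stands in for the transfer over the wide-area network) are hypothetical names introduced here.

```python
from collections import OrderedDict


class BranchWriteCache:
    """Hypothetical branch-side write cache (illustrative names only)."""

    def __init__(self, send_to_data_center):
        # Assumption: a callable(block_id, data) standing in for the WAN transfer.
        self._send = send_to_data_center
        self._blocks = {}               # latest contents of each cached block
        self._pending = OrderedDict()   # block ids not yet transferred, oldest first

    def write(self, block_id, data):
        """Cache the write and acknowledge at once, without using the WAN."""
        self._blocks[block_id] = data
        self._pending[block_id] = None
        self._pending.move_to_end(block_id)
        return "ack"

    def read(self, block_id):
        """Serve a cached block, even if it has not been transferred yet."""
        return self._blocks.get(block_id)

    def flush(self):
        """Transfer pending writes to the data center; returns how many were sent."""
        sent = 0
        while self._pending:
            block_id, _ = self._pending.popitem(last=False)
            self._send(block_id, self._blocks[block_id])
            sent += 1
        return sent


data_center = {}                                   # stand-in for data center storage
cache = BranchWriteCache(data_center.__setitem__)
cache.write(7, b"v1")                              # acknowledged before any transfer
cache.flush()                                      # later, independent of the client
```

  • In this sketch, repeated writes to the same block are coalesced so that only the latest contents are transferred; that is a design choice of the example rather than something the description requires.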
  • An additional embodiment allows for snapshots of the virtual storage array data. In this embodiment, a snapshot is prepared by setting the virtual storage array interface to a quiescent state and identifying any new or updated storage blocks in the branch location's storage block cache. The virtual storage array interface may then be set to an active state and resume normal operations. Upon receiving a snapshot request, the virtual storage array interface transfers these identified storage blocks to the data center, if they have not already been transferred as part of the normal write request process. If a storage client tries to modify a storage block previously identified as new or updated prior to a snapshot request, an embodiment of the branch virtual storage array interface makes a copy of this storage block. The modification is then applied to the copy of this storage block. Storage clients accessing this storage block will receive the modified copy of this storage block. However, the unmodified version of the storage block may be used to fulfill a subsequent snapshot request.
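  • The copy-on-write behavior described above can be sketched, under simplifying assumptions, as follows. The class SnapshotCache and its attribute names are hypothetical and introduced only for illustration; the sketch is single-threaded, so the quiescent state is implicit, and it omits the normal write transfer to the data center.

```python
class SnapshotCache:
    """Hypothetical sketch of snapshot preparation with copy-on-write."""

    def __init__(self):
        self.current = {}    # block id -> data served to storage clients
        self.dirty = set()   # blocks written since the last snapshot preparation
        self.marked = set()  # blocks identified during snapshot preparation
        self.frozen = {}     # block id -> unmodified copy kept for the snapshot

    def write(self, block_id, data):
        # If the block was identified for a snapshot, preserve its
        # unmodified version once before applying the modification.
        if block_id in self.marked and block_id not in self.frozen:
            self.frozen[block_id] = self.current[block_id]
        self.current[block_id] = data
        self.dirty.add(block_id)

    def read(self, block_id):
        # Storage clients always receive the most recent version.
        return self.current[block_id]

    def prepare_snapshot(self):
        # Identify the new or updated blocks; later writes start a new dirty set.
        self.marked = set(self.dirty)
        self.frozen = {}
        self.dirty = set()

    def take_snapshot(self):
        """Return the blocks to transfer, as they were at preparation time."""
        snapshot = {}
        for block_id in self.marked:
            if block_id in self.frozen:
                snapshot[block_id] = self.frozen[block_id]
            else:
                snapshot[block_id] = self.current[block_id]
        self.marked = set()
        self.frozen = {}
        return snapshot


cache = SnapshotCache()
cache.write(1, b"a")
cache.write(2, b"b")
cache.prepare_snapshot()
cache.write(1, b"a2")            # modifies a block already identified
snapshot = cache.take_snapshot() # still holds the unmodified version of block 1
```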
  • DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
  • Another embodiment of the invention analyzes a selected high-level data structure entity to identify portions of the same or other high-level data structure entities that are likely to be accessed by the storage client. This embodiment of the invention then identifies the additional storage blocks corresponding to these additional high-level data structure entities. The additional storage blocks are then prefetched and cached at the branch location. This embodiment of the invention may also identify additional high-level data structure entities to analyze based on its analysis of previously selected high-level data structure entities.
  • Further embodiments of the invention may identify corresponding high-level data structure entities directly from requests for storage blocks.
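  • One way to picture the prediction described above is a reverse index from storage blocks to the high-level entities that own them. The following Python sketch is illustrative only: the function names, the LAYOUT data, and the policy of prefetching the next few blocks of the same file are assumptions made for this example, not details taken from the embodiments.

```python
def build_block_index(layout):
    """Invert entity -> ordered storage blocks into block -> (entity, position)."""
    index = {}
    for entity, blocks in layout.items():
        for position, block in enumerate(blocks):
            index[block] = (entity, position)
    return index


def predict_blocks(requested_block, layout, index, lookahead=3):
    """Return the storage blocks holding the next portions of the owning entity."""
    if requested_block not in index:
        return []  # block is not associated with any known entity
    entity, position = index[requested_block]
    blocks = layout[entity]
    return blocks[position + 1 : position + 1 + lookahead]


# Hypothetical layout: each file maps to its storage blocks in file order.
LAYOUT = {"report.doc": [310, 42, 97, 5], "notes.txt": [43, 44]}
INDEX = build_block_index(LAYOUT)

# A request for block 42 predicts blocks 97 and 5 (the rest of report.doc),
# not the address-adjacent block 43, which belongs to a different file.
prefetch = predict_blocks(42, LAYOUT, INDEX)
```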
  • FIGS. 1A-1B illustrate virtual storage array systems according to an embodiment of the invention. FIG. 1A illustrates an example system 100 including virtual storage arrays according to an embodiment of the invention. The example system 100 includes two branches 105 a and 105 b, each of which has its own internal local area network (LAN), and a data center 110, which also includes its own LAN. The two branch networks 105 and the data center network 110 are connected by one or more wide area networks (WANs) 115, such as the internet. Although FIG. 1A shows two branches and one data center, embodiments of the invention can be implemented with any arbitrary number of branches and data centers.
  • Each of the branch LANs 105 may include routers, switches, and other wired or wireless network devices 107 for connecting with client systems and other devices, such as network devices 107 a and 107 b. For example, each of the branch LANs 105 may connect one or more client systems 108, such as client system 108 a and 108 b, with one or more application servers 109, such as 109 a and 109 b. Application servers 109 provide applications and application functionality to the client systems 108.
  • Previously, typical branch LAN installations also required data storage for client systems and application servers. For example, a prior typical branch LAN installation may include a file server for storing data for the client systems and application servers, such as database servers and e-mail servers. In prior systems, this branch's data storage is located at the branch site and connected directly with the branch LAN. The branch data storage previously could not be located at the data center, because the intervening WAN is too slow and has high latency, making storage accesses unacceptably slow for client systems and application servers.
  • An embodiment of the invention allows for storage consolidation of branch-specific data storage at data centers connected with branches via wide area networks. This embodiment of the invention overcomes the bandwidth and latency limitations of the wide area network between branches and the data center. To this end, an embodiment of the invention includes virtual storage arrays.
  • A virtual storage array appears to branch users, such as branch client systems and branch application servers, as a storage array connected with the branch's local area network. A virtual storage array can be used for the same purposes as a local storage area network or other data storage device. For example, a virtual storage array may be used in conjunction with a file server for general-purpose data storage, in conjunction with a database server for database application storage, or in conjunction with an e-mail server for e-mail storage. However, the virtual storage array stores its data at a data center connected with the branch via a wide area network. Multiple separate virtual storage arrays, from different branches, may store their data in the same data center and, as described below, on the same storage devices.
  • Because the data storage of multiple branches is consolidated at a data center, the efficiency, reliability, cost-effectiveness, and performance of data storage is improved. An organization can manage and control access to their data storage at a central data center, rather than at large numbers of separate branches. This increases the reliability and performance of an organization's data storage. This also reduces the personnel required at branch offices to provision, maintain, and backup data storage. It also enables organizations to implement more effective backup systems, data snapshots, and disaster recovery for their data storage. Furthermore, organizations can plan for storage growth more efficiently, by consolidating their storage expansion for multiple branches and reducing the amount of excess unused storage. Additionally, an organization can apply optimizations such as compression or data deduplication over the data from multiple branches stored at the data center, reducing the total amount of storage required by the organization.
  • In an embodiment, virtual storage arrays are implemented at each of the branches 105 using branch virtual storage array interfaces 120, such as branch virtual storage array interfaces 120 a and 120 b. Any of the branch virtual storage array interfaces 120 may be a stand-alone computer system or network appliance or built into other computer systems or network equipment as hardware and/or software. In a further embodiment, any of the branch virtual storage array interfaces 120 may be implemented as a software application or other executable code running on a client system or application server.
  • In an embodiment, each of the branch virtual storage array interfaces 120 includes one or more storage array network interfaces and supports one or more storage array network protocols to connect with client systems and/or application servers within a branch local area network. Examples of storage array network interfaces suitable for use with embodiments of the invention include Ethernet, Fibre Channel, IP, and InfiniBand interfaces. Examples of storage array network protocols include ATA, Fibre Channel Protocol, and SCSI. Various combinations of storage array network interfaces and protocols are suitable for use with embodiments of the invention, including iSCSI, HyperSCSI, Fibre Channel over Ethernet, and iFCP. In cases where the storage array network interface uses Ethernet, an embodiment of the branch virtual storage array interface 120 can use the branch LAN's physical connections and networking equipment for communicating with client systems and application services. In other embodiments, separate connections and networking equipment, such as Fibre Channel networking equipment, are used to connect the branch virtual storage array interface 120 with client systems 108 and/or application servers 109.
  • In an embodiment, one or more of the branch LANs 105 can include a file server, for example built into one of the application servers 109, for providing a network file interface to the virtual storage array to client systems 108 and other application servers 109. In a further embodiment, the branch virtual storage array interface 120 is integrated as hardware and/or software with an application server 109, such as a file server, database server, or e-mail server. In this embodiment, the branch virtual storage array interface 120 can include application server interfaces, such as a network file interface, for interfacing with other application servers and/or client systems.
  • From the view of application servers 109 and client systems 108, a branch virtual storage array interface 120 appears to be a local storage array, having its data storage at the associated branch 105. For example, branch virtual storage array interface 120 a appears to clients 108 a and application server 109 a as a local data storage array on branch LAN 105 a. However, the branch virtual storage array interfaces 120 actually store and retrieve data from storage devices located on the data center LAN 110. Because virtual storage array data accesses must travel via the WAN 115 between the data center LAN 110 and the branch LANs 105, the virtual storage arrays are subject to the latency and bandwidth restrictions of the WAN 115.
  • In an embodiment, the branch virtual storage array interfaces 120 include virtual storage array caches 122, such as virtual storage array caches 122 a and 122 b for virtual storage array interfaces 120 a and 120 b respectively, which are used to ameliorate the effects of the WAN 115 on virtual storage array performance. As described in detail below, virtual storage array data accesses, including data reads and data writes, can be optimized to minimize the effect of WAN bandwidth restrictions and latency.
  • Additionally, an embodiment of the invention includes a data center virtual storage array interface 125 located on the data center LAN 110. In an embodiment, the data center virtual storage array interface 125 communicates with one or more branch virtual storage interfaces 120 via the data center LAN 110, the WAN 115, and their respective branch LANs 105. Data communications between virtual storage interfaces 120 and 125 can be in any form and/or protocol used for carrying data over wired and wireless data communications networks, including TCP/IP.
  • The data center virtual storage array interface 125 translates data communications from branch virtual storage array interfaces 120 into storage accesses of a physical storage array network. To this end, an embodiment of a data center virtual storage array interface 125 accesses a physical storage array network interface 127, which in turn accesses physical data storage devices 129 on a storage array network. Examples of data storage devices 129 include physical data storage array devices 129 a and data backup devices 129 b. In another embodiment, the data center virtual storage array interface 125 includes one or more storage array network interfaces and supports one or more storage array network protocols for directly connecting with a physical storage array network and its data storage devices 129. Examples of storage array network interfaces suitable for use with embodiments of the invention include Ethernet, Fibre Channel, IP, and InfiniBand interfaces. Examples of storage array network protocols include ATA, Fibre Channel Protocol, and SCSI. Various combinations of storage array network interfaces and protocols are suitable for use with embodiments of the invention, including iSCSI, HyperSCSI, Fibre Channel over Ethernet, and iFCP. Embodiments of the data center virtual storage array interface 125 may connect with the physical storage array interface 127 and/or directly with the physical storage array network using the Ethernet network of the data center LAN and/or separate data communications connections, such as a Fibre Channel network.
  • In a further embodiment, branch 105 and data center LANs 110 may optionally include network optimizers 130 for improving the performance of data communications over the WAN 115 between branches and/or the data center. Network optimizers 130 can improve actual and perceived WAN network performance using techniques including compressing data communications; anticipating and prefetching data; caching frequently accessed data; shaping and restricting network traffic; and optimizing usage of network protocols. In an embodiment, network optimizers 130 may be used in conjunction with virtual storage array interfaces 120 and 125 to further improve virtual storage array performance accessing data via the WAN 115. In other embodiments, network optimizers 130 may ignore or pass-through virtual storage array data traffic, relying on the virtual storage array interfaces 120 and 125 on the branch 105 and data center LANs 110 to optimize WAN performance.
  • Further embodiments of the invention may be used in different network architectures. For example, a data center virtual storage array interface may be connected directly between a WAN and a physical data storage array, eliminating the need for a data center LAN. Similarly, a branch virtual storage array interface, implemented for example in the form of a software application executed by a storage client computer system, may be connected directly with a WAN, such as the internet, eliminating the need for a branch LAN.
  • FIG. 1B illustrates an example arrangement 150 of data within virtual and physical storage array networks according to an embodiment of the invention. In this example 150, two branches 155 a and 155 b each include a branch virtual storage array interface 160 a and 160 b and associated virtual storage array cache 165 a and 165 b, respectively. As discussed in detail below, each of the virtual storage array caches 165 are used to store prefetched virtual storage array network data and pending virtual storage array write data for their branch's respective virtual storage arrays.
  • In an embodiment, each of the branches 155 includes its own separate virtual storage array, which appears to be located within its branch LAN 155. However, the majority of the data storage of a branch's virtual storage array is located within the data center LAN 170 on one or more physical data storage devices 175. The data center LAN 170 is connected with the branch LANs 155 via WAN 185. In an embodiment, each branch's virtual storage array data is stored within a physical storage area network at the data center LAN 170. The physical storage area network may store virtual storage array data 180 for two or more branches. For example, physical data storage array 175 stores virtual storage array data 180 a and 180 b, which correspond with the data of the virtual storage arrays for branch 155 a and 155 b, respectively.
  • In a further embodiment, data optimizations such as data compression and data deduplication can be applied to each branch's virtual storage array data 180 separately or may be consolidated over multiple branches' virtual storage array data 180. For example, redundant data within a single branch's virtual storage array data within the data center's physical storage array network can be compressed or deduplicated to reduce storage requirements. In a further example, if two or more branches' virtual storage arrays include the same or similar data, compression or data deduplication can be applied over all of these virtual storage arrays, such that only a single copy of the redundant data needs to be stored in the physical storage area network. In this example, each of the separate branch virtual storage arrays will reference this single copy of the redundant data. For example, branch 155 a's virtual storage array data 180 a can be compressed or deduplicated together with branch 155 b's virtual storage array data 180 b so that there is only a single copy of any redundant data found in both virtual storage arrays.
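  • The cross-branch deduplication described above can be illustrated with a content-addressed store. The description does not specify a deduplication technique, so the use of SHA-256 digests here is an assumption chosen for the example, and the class DedupStore and its method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Hypothetical data center store keeping one copy of identical blocks."""

    def __init__(self):
        self.chunks = {}   # digest -> data (the single physical copy)
        self.arrays = {}   # (branch, block id) -> digest (per-branch reference)

    def put(self, branch, block_id, data):
        digest = hashlib.sha256(data).hexdigest()
        # Store the data only if no identical block is already present.
        self.chunks.setdefault(digest, data)
        self.arrays[(branch, block_id)] = digest

    def get(self, branch, block_id):
        return self.chunks[self.arrays[(branch, block_id)]]


store = DedupStore()
store.put("155a", 0, b"shared operating system image")
store.put("155b", 9, b"shared operating system image")  # same data, other branch
store.put("155b", 10, b"branch-specific data")
# Three logical blocks across two virtual storage arrays, two physical copies.
physical_copies = len(store.chunks)
```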
  • In another embodiment, the virtual storage array can be used to provide “cloud” storage for network-based applications.
  • An embodiment of the invention prefetches virtual storage array data to improve data read performance of the virtual storage array. In an embodiment, the branch or data center virtual storage array interface analyzes read and write accesses to a branch's virtual storage array to predict which storage blocks may be accessed in the future. The branch or data center virtual storage array interface then retrieves some or all of these predicted storage blocks and stores them in the branch's virtual storage array cache. If a storage client, such as an application server, file server, or client system, later requests access to one or more of the cached storage blocks, the branch virtual storage array interface retrieves the requested storage block from the virtual storage array cache, rather than retrieving the storage block from the physical storage devices located in the data center LAN via the WAN. This storage block prefetching hides the bandwidth and latency of the WAN from the storage client, making the virtual storage array appear as if it is a local storage device.
  • One complication with storage block prefetching is that sequential data within a file system or file is not necessarily stored as contiguous storage blocks within a storage area network. Similar complications occur when accessing databases or application data, such as e-mail data. This complication is illustrated by FIG. 2A. FIG. 2A illustrates an example 200 of a storage client 205 opening an example file “Foo.txt” and reading the first five file system blocks or clusters of this file. These file protocol reads may be performed using any file system protocol, such as CIFS, NFS, or NTFS. This sequence of file protocol reads is received by a file server 210. The file server 210 translates these file protocol reads into one or more storage area network reads. Each storage area network read retrieves one or more storage blocks from the virtual or physical storage area network 215. The storage area network reads may use any storage area network protocol, such as iSCSI or other protocols discussed above. The sizes and boundaries of file system blocks and storage area network blocks are independent of each other; thus each file system block may correspond with a fraction of a storage area network block, a single storage area network block, or multiple storage area network blocks.
  • In this example, file system block 0 corresponds with storage area network blocks 101 and 200. File system block 1 corresponds with storage area network block 14. File system block 2 corresponds with storage area network block 25. File system block 3 corresponds with storage area network block 26. File system block 4 corresponds with storage area network block 12. As shown in this example, the first five file system blocks of a file in a file system correspond to six non-sequential storage area network blocks.
  • Typically, if a storage client requests the first five file system blocks of a file, one optimization would be to prefetch and cache additional file blocks in this sequence, such as the next five file system blocks. However, because the storage area network blocks corresponding with this sequence of file blocks are not sequential, storage area network interfaces, which typically only receive requests for storage area network blocks, cannot accurately identify the storage area network blocks corresponding with a predicted sequence of file blocks.
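  • The mismatch described above can be made concrete with a minimal Python sketch. The block mapping is taken from the FIG. 2A example; the function names and the naive block-level predictor are hypothetical and serve only to show why guessing at the storage block level fails.

```python
# File system block -> storage area network (SAN) blocks, from the FIG. 2A example.
FILE_BLOCK_TO_SAN_BLOCKS = {
    0: [101, 200],
    1: [14],
    2: [25],
    3: [26],
    4: [12],
}


def san_reads_for_file_blocks(file_blocks):
    """Return the SAN block reads, in order, produced by reading the given file blocks."""
    reads = []
    for fb in file_blocks:
        reads.extend(FILE_BLOCK_TO_SAN_BLOCKS[fb])
    return reads


def naive_sequential_guess(last_san_block, count):
    """Hypothetical block-level predictor: guess the next `count` SAN block numbers."""
    return [last_san_block + i for i in range(1, count + 1)]


observed = san_reads_for_file_blocks(range(5))
print(observed)                                  # [101, 200, 14, 25, 26, 12]
print(naive_sequential_guess(observed[-1], 3))   # [13, 14, 15]
```

  • The guessed blocks 13, 14, and 15 bear no relation to the blocks that would back the next file system blocks of “Foo.txt”, which is the problem that method 250 addresses.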
  • FIG. 2B illustrates a method 250 of performing reactive prefetching of storage blocks according to an embodiment of the invention. Step 255 receives a storage block read request from a storage client, such as a client system or application server, at the branch location. In an embodiment, the storage block read request may be received by a branch location virtual data storage array interface. The storage block read request may be received using a storage area network protocol, such as iSCSI.
  • In response to the receipt of the storage block read request in step 255, decision block 260 determines if the requested storage block has been previously retrieved and stored in the storage block read cache at the branch location. If so, step 270 retrieves the requested storage block from the storage block read cache and returns it to the requesting storage client. In an embodiment, if the system includes a data center virtual storage array interface, then step 270 also forwards the storage block read request back to the data center virtual storage array interface for use in identifying additional storage blocks likely to be requested by the storage client in the future.
  • If the storage block read cache at the branch location does not include the requested storage block, step 265 retrieves the requested storage block via a WAN connection from the virtual storage array data located in a physical data storage at the data center. In an embodiment, a branch location virtual storage array interface forwards the storage block read request to the data center virtual storage array interface via the WAN connection. The data center virtual storage array interface then retrieves the requested storage block from the physical storage array and returns it to the branch location virtual storage array interface, which in turn provides this requested storage block to the storage client. In a further embodiment of step 265, a copy of the retrieved storage block may be stored in the storage block read cache for future accesses.
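  • Steps 255 to 270 can be sketched as follows. This is a simplified illustration rather than the claimed implementation: the class name, the `fetch_over_wan` callable standing in for the WAN round trip to the data center, and the in-memory dictionary used as the storage block read cache are all assumptions.

```python
class BranchReadPath:
    """Hypothetical branch-side read path for storage block requests."""

    def __init__(self, fetch_over_wan):
        # fetch_over_wan(block_id) -> bytes; stands in for the data center interface.
        self.fetch_over_wan = fetch_over_wan
        self.read_cache = {}   # storage block id -> block data
        self.wan_fetches = 0   # counts WAN round trips, for illustration only

    def read(self, block_id):
        # Decision block 260: has this block already been retrieved and cached?
        if block_id in self.read_cache:
            # Step 270: serve the block from the branch read cache.
            return self.read_cache[block_id]
        # Step 265: retrieve the block via the WAN from the data center.
        data = self.fetch_over_wan(block_id)
        self.wan_fetches += 1
        # Further embodiment of step 265: keep a copy for future accesses.
        self.read_cache[block_id] = data
        return data


data_center_blocks = {101: b"first part of Foo.txt"}
path = BranchReadPath(data_center_blocks.__getitem__)
path.read(101)
path.read(101)
print(path.wan_fetches)  # 1
```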
  • During and/or following the retrieval of the requested storage block from the virtual storage array or virtual storage array cache, steps 275 to 299 prefetch additional storage blocks likely to be requested by the storage client in the near future. Step 275 identifies a high-level data structure entity associated with the requested storage block. Examples of high-level data structure entities include file system entities such as files, directories, and file system blocks or clusters; and database structures such as database tables, rows, and nodes. Typical block storage protocols, such as iSCSI and FCP, specify block read requests using a storage block address or identifier. However, these storage block read requests do not include any identification of the associated high-level data structure entity, such as a specific file, directory, or database entity, that is associated with this storage block.
  • Therefore, an embodiment of step 275 identifies the high-level data structure entity corresponding with the requested storage block. In an embodiment of step 275, a branch or data center virtual storage array interface searches a file system data structure, such as an allocation table or tree, or a database data structure, such as a B-tree, to identify one or more high-level data structure entities corresponding with the requested storage block. In a further embodiment of step 275, a branch or data center virtual storage array interface preprocesses data structures to create other databases, tables, or other data structures adapted to facilitate searching for high-level data structure entities corresponding with storage blocks. These data structures mapping storage blocks to corresponding high-level data structure entities may be updated frequently or infrequently, depending upon the desired prefetching performance.
  • In a further embodiment, step 275 also determines a location or range of locations within the high-level data structure entity corresponding with the requested storage block. For example, a storage block may correspond with a specific range of addresses or offsets within a larger file.
  • Using the identification of the high-level data structure entity and optionally the location provided by step 275, step 280 identifies additional high-level data structure entities or portions thereof that are likely to be requested by the storage client. There are a number of different techniques for identifying additional high-level data structure entities or portions thereof for prefetching that may be used by embodiments of step 280. Some of these are described in detail in co-pending U.S. patent application Ser. No. ______[Attorney Docket Number R001420US], entitled “Virtual Data Storage System Optimizations”, filed ______, which is incorporated by reference herein for all purposes.
  • One example technique used by an embodiment of step 280 is to prefetch portions of the high-level data structure entity based on their adjacency or close proximity to the identified portion of the entity. For example, if step 275 determines that the requested storage block corresponds with a portion of a file from file offset 0 up to offset 4095, then step 280 may identify a second portion of this same file beginning with offset 4096 for prefetching. It should be noted that although these two portions are adjacent in the high-level data structure entity, their corresponding storage blocks may be non-contiguous.
  • Further embodiments of the invention may use other heuristics or other techniques to select predicted file system blocks, such as knowledge of application behavior associated with a file type. For example, application or protocol specific information may be used to identify storage blocks for prefetching and caching. For example, if the virtual storage array is used to store e-mail data, a branch or data center virtual storage array interface may identify an e-mail account or e-mail message ID associated with a requested storage block and then identify and prefetch storage blocks associated with the same user, with the same e-mail message ID, and/or with e-mail messages having nearby e-mail message IDs. This application or protocol specific information may be used alone or in conjunction with the above-described file system or database data.
  • Step 280 identifies all or portions of one or more high-level data structure entities for prefetching based on the high-level data structure entity associated with the requested storage block. However, as discussed above, storage clients specify data access requests in terms of storage blocks, not high-level data structure entities such as files, directories, or database entities. Thus, step 285 identifies one or more storage blocks corresponding with the high-level data structure entities identified for prefetching in step 280. In an embodiment, step 285 identifies additional storage blocks corresponding with the high-level data structure entities by accessing the data structures associated with a file system data structure, such as an allocation table or tree, or a database data structure, such as a B-tree, in a manner similar to a client system or application server requesting a high-level data structure entity. In another embodiment, step 285 accesses a separate data structure maintained by a virtual storage array interface to identify one or more storage blocks corresponding with the high-level data structure entities identified for prefetching.
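  • Steps 275 to 285 can be illustrated with the following Python sketch, which uses the adjacency technique described above. The `FILE_LAYOUT` table is a hypothetical stand-in for file system metadata: its first five entries follow FIG. 2A, while the remaining entries and the `lookahead` parameter are invented for illustration.

```python
# Hypothetical file system metadata: file name -> SAN blocks backing each
# file system block. Entries after the fifth are invented for illustration.
FILE_LAYOUT = {
    "Foo.txt": [[101, 200], [14], [25], [26], [12], [77], [78], [301]],
}


def build_reverse_index(layout):
    """Preprocess metadata into SAN block -> (file, file block index)."""
    index = {}
    for name, blocks in layout.items():
        for fs_index, san_blocks in enumerate(blocks):
            for san in san_blocks:
                index[san] = (name, fs_index)
    return index


def predict_storage_blocks(requested_san_block, layout, index, lookahead=2):
    # Step 275: identify the high-level entity and the location within it.
    entity = index.get(requested_san_block)
    if entity is None:
        return []
    name, fs_index = entity
    # Step 280: select the adjacent portions of the same entity.
    stop = min(fs_index + 1 + lookahead, len(layout[name]))
    # Step 285: translate those portions back into storage blocks.
    predicted = []
    for i in range(fs_index + 1, stop):
        predicted.extend(layout[name][i])
    return predicted


index = build_reverse_index(FILE_LAYOUT)
print(predict_storage_blocks(12, FILE_LAYOUT, index))  # [77, 78]
```

  • Building the reverse index ahead of time corresponds to the embodiment of step 275 that preprocesses data structures to facilitate searching; a real interface would need to refresh it as the file system changes.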
  • Decision block 290 determines if any of the storage blocks identified in step 285 have already been stored in the storage block read cache located at the branch location. If not, step 295 retrieves these uncached additional storage blocks from the virtual storage array data located in a physical data storage on the data center LAN and sends them via a WAN connection to the appropriate branch LAN. Step 299 stores these additional storage blocks in the branch's virtual storage array cache for potential future access by storage clients within the branch LAN. In a further embodiment, decision block 290 and the determination of whether an additional storage block has been previously retrieved and cached may be omitted. Instead, this embodiment can send all of the identified additional storage blocks to the branch virtual storage array interface to be cached. The branch virtual storage array interface may then discard any redundant storage blocks. This embodiment can be used when WAN latency, rather than WAN bandwidth limitations, is an overriding concern.
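  • The two variants of decision block 290 can be compared in a short sketch. The function names and the `skip_cache_check` flag are hypothetical; the sketch assumes the sender knows which block identifiers the branch has cached when the check is enabled.

```python
def blocks_to_send(predicted, branch_cached_ids, skip_cache_check=False):
    """Decision block 290: choose which predicted blocks cross the WAN."""
    if skip_cache_check:
        # Latency-sensitive variant: send everything, let the branch discard.
        return list(predicted)
    return [b for b in predicted if b not in branch_cached_ids]


def store_prefetched(read_cache, blocks):
    """Step 299: cache received blocks; return how many were redundant."""
    redundant = 0
    for block_id, data in blocks.items():
        if block_id in read_cache:
            redundant += 1      # already cached, so the copy is discarded
        else:
            read_cache[block_id] = data
    return redundant


print(blocks_to_send([14, 25, 26], {25}))                         # [14, 26]
print(blocks_to_send([14, 25, 26], {25}, skip_cache_check=True))  # [14, 25, 26]
```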
  • Although the method 250 of FIG. 2B is described with respect to accessing files via the virtual storage array, embodiments of method 250 can also be applied to non-file based storage accesses. For example, an embodiment of method 250 can be applied to access databases via the virtual storage array. In this embodiment, portions of database tables or B-tree child nodes, rather than file system blocks, are used to identify corresponding storage blocks for prefetching and caching by a branch virtual storage array interface. In another example, indirect blocks of a file system may be used to identify additional storage blocks to be prefetched and cached.
  • Following step 299, method 250 proceeds to step 255 to await receipt of further storage block requests. The storage blocks added to the storage block read cache in previous iterations of method 250 may be available for fulfilling storage block read requests.
  • Method 250 may be performed by a branch virtual data storage array interface, by a data center virtual data storage array interface, or by both virtual data storage array interfaces working in concert. For example, steps 255 to 270 of method 250 may be performed by a branch location virtual storage array interface and steps 275 to 299 of method 250 may be performed by a data center virtual storage array interface. In another example, all of the steps of method 250 may be performed by a branch location virtual storage array interface.
  • Similarly, the virtual storage array cache can be used to hide latency and bandwidth limitations of the WAN during virtual storage array writes. FIG. 3 illustrates a method 300 of optimizing data writes in a virtual storage array system according to an embodiment of the invention.
  • An embodiment of method 300 starts with step 305 receiving a storage block write request from a storage client within the branch LAN. The storage block write request may be received by a branch virtual storage interface.
  • In response to the receipt of the storage block write request, decision block 310 determines if the virtual storage array cache is capable of accepting additional write requests or is full. In an embodiment, the virtual storage array cache may use some or all of its storage as a queue for pending virtual storage array operations.
  • If decision block 310 determines that the virtual storage array cache can accept an additional storage block write request, then, in an embodiment of method 300, step 315 stores the storage block write request, including the storage block data to be written, in the virtual storage array cache. In this embodiment of method 300, step 320 then sends a write acknowledgement to the storage client. Following the storage client's receipt of this write acknowledgement, the storage client believes its storage block write request is complete and can continue to operate normally. However, in step 325, the virtual storage array interface will transfer the queued written storage block via the WAN to the physical storage array at the data center LAN. In an embodiment, step 325 may perform this transfer in the background and asynchronously with the operation of storage clients.
  • While a storage block write request is queued and waiting to be transferred to the data center, a storage client may wish to access this storage block for a read or write. In this situation, the virtual storage array interface intercepts the storage block access request. In the case of a storage block read, the virtual storage array interface provides the storage client with the queued storage block. In the case of a storage block write, the virtual storage array interface will update the queued storage block data and send a write acknowledgement to the storage client for this additional storage block access.
  • Conversely, if decision block 310 determines that the virtual storage array cache cannot accept an additional storage block write request, then step 330 immediately transfers the storage block via the WAN to the physical storage array at the data center LAN. Following completion of this transfer, step 335 receives a write acknowledgement from the data center virtual storage array interface or the physical data storage array itself. Step 340 then sends a write acknowledgement to the storage client, allowing the storage client to resume normal operation.
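  • Method 300 can be sketched as a small write queue. The class below is a minimal illustration under several assumptions: the queue capacity is counted in blocks, `send_over_wan` is a hypothetical callable that returns only after the data center has acknowledged the write, and the background transfer of step 325 is modeled as an explicit `flush_one` call rather than a separate thread.

```python
from collections import OrderedDict


class BranchWritePath:
    """Hypothetical branch-side write path with a bounded queue of pending writes."""

    def __init__(self, send_over_wan, capacity):
        self.send_over_wan = send_over_wan   # callable(block_id, data)
        self.capacity = capacity             # maximum number of queued blocks
        self.pending = OrderedDict()         # block id -> data awaiting transfer

    def write(self, block_id, data):
        if block_id in self.pending:
            # A queued block is written again: update it in place and acknowledge.
            self.pending[block_id] = data
            return "ack-queued"
        if len(self.pending) < self.capacity:    # decision block 310
            self.pending[block_id] = data        # step 315
            return "ack-queued"                  # step 320
        self.send_over_wan(block_id, data)       # steps 330 and 335
        return "ack-written-through"             # step 340

    def read_pending(self, block_id):
        """Serve a read of a block that is still queued, or None if not queued."""
        return self.pending.get(block_id)

    def flush_one(self):
        """Step 325: transfer the oldest queued block to the data center."""
        if not self.pending:
            return False
        block_id, data = self.pending.popitem(last=False)
        self.send_over_wan(block_id, data)
        return True


data_center = {}
wp = BranchWritePath(data_center.__setitem__, capacity=1)
print(wp.write(1, b"a"))   # ack-queued
print(wp.write(2, b"c"))   # ack-written-through
```

  • In the queued case the storage client sees only local latency, while in the write-through case it waits for the WAN round trip, which is why a further embodiment throttles requests to keep the queue from filling.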
  • In a further embodiment, a virtual storage array interface may throttle storage block read and/or write requests from storage clients to prevent the virtual storage array cache from filling up under typical usage scenarios.
  • FIGS. 4A-4B illustrate data migration of a virtual storage array system according to an embodiment of the invention. Because the data storage of a branch's virtual storage array is located at a data center, rather than at the branch location, migrating data from one branch to another branch is straightforward. For example, FIG. 4A illustrates a first branch virtual storage interface 405 at a first branch 410 that provides access to a virtual storage array 415 a, having its virtual storage array data 420 stored at a data center 425. To migrate this example virtual storage array 415 a to a second branch, the first branch virtual storage array interface is configured to deactivate the first branch's access to the virtual storage array. A second branch virtual storage array interface at the second branch is then configured to access the virtual storage array data at the data center, thus providing the second branch with access to the virtual storage array.
  • Continuing from the example of FIG. 4A, FIG. 4B illustrates an example of a second branch virtual storage interface 430 at a second branch 435 that provides access to a virtual storage array 415 b, having its virtual storage array data 420 stored at a data center 425. In this example, the first branch virtual storage array interface 405 at the first branch 410 has been configured to deactivate the first branch's access to the virtual storage array. As a result, the second branch 435 has exclusive access to the virtual storage array data 420 via virtual storage array 415 b.
  • In a further embodiment, upon deactivating the virtual storage array 415 a at a first branch 410, the first branch virtual storage interface is adapted to transfer any updated storage data in its virtual storage array cache, such as new or updated storage blocks associated with pending write operations, back to the virtual storage array data 420 in the physical data storage array 440. This ensures that the virtual storage array data 420 maintained at the data center 425 is up to date.
  • Moreover, because the virtual storage array data 420 does not change location when a virtual storage array 415 is migrated to a new location, virtual storage arrays can be migrated frequently. For example, if an organization has a first branch in New York and a second branch in India, a virtual storage array may be migrated between these offices every work day. Because of the time differences between these two locations, the virtual storage array enables a 24-hour work cycle. During business hours in the New York branch, the New York branch will be given access to the virtual storage array. At the same time, it is late at night in India; thus this branch does not require access to the virtual storage array. When business hours are over in New York, the New York branch virtual storage array interface deactivates its virtual storage array access and completes any remaining updates to the virtual storage array data at the data center. Then, the India branch virtual storage array interface can activate virtual storage array access for the India branch. This allows the India branch to access the virtual storage array while the New York branch is closed for the night. At the end of business hours in India, this process is reversed and the New York branch reconnects with the virtual storage array.
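  • The handoff between branches can be sketched as follows. The class and function names are hypothetical, the data center is modeled as a shared dictionary, and the sketch assumes that deactivation flushes all pending writes before the other branch is activated, as in the further embodiment described above.

```python
class BranchInterface:
    """Hypothetical branch virtual storage array interface for migration."""

    def __init__(self, name, data_center):
        self.name = name
        self.data_center = data_center   # shared dict: block id -> data
        self.active = False
        self.pending_writes = {}         # updated blocks not yet at the data center

    def activate(self):
        self.active = True

    def write(self, block_id, data):
        if not self.active:
            raise RuntimeError(f"{self.name} does not currently have access")
        self.pending_writes[block_id] = data

    def deactivate(self):
        # Stop accepting operations, then bring the data center copy up to date.
        self.active = False
        self.data_center.update(self.pending_writes)
        self.pending_writes.clear()


def migrate(source, target):
    """Move access to the virtual storage array from one branch to another."""
    source.deactivate()
    target.activate()


array_data = {}
new_york = BranchInterface("New York", array_data)
india = BranchInterface("India", array_data)
new_york.activate()
new_york.write(5, b"end-of-day update")
migrate(new_york, india)
print(india.active, array_data[5])
```

  • No storage data moves between branches in this handoff; only the pending updates travel to the data center, which is what makes a daily migration practical.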
  • In some cases, there may be some storage clients in a branch operating past business hours. In an embodiment, a virtual storage array interface at the branch can connect with the virtual storage array interface that is currently connected with the virtual storage array data via the WAN to provide after-hours storage clients access to the virtual storage array. For example, in FIG. 4B, the virtual storage array data 420 is accessed by virtual storage array 415 b currently provided by virtual storage array interface 430 located at the second branch 435. If a client system 445 at the first branch 410 needs to access data in the virtual storage array 415 b, the client system 445 contacts the first virtual storage array interface 405. The first virtual storage array interface 405 then contacts the second virtual storage array interface 430 to access the virtual storage array 415 b.
  • In a further embodiment, one or more virtual machines executing virtual storage array applications, application servers, and/or other applications may migrate with a virtual storage array between two or more branches. In this embodiment, an application server, such as a database application or an e-mail server and its associated data storage, implemented using a virtual storage array, may move together between branches. Because the application server is implemented within a virtual machine, this migration between branches may be seamless from the perspective of the application server.
  • FIG. 5 illustrates a method 500 of creating data snapshots of a virtual storage array according to an embodiment of the invention. An embodiment of the method 500 begins in step 505 with the initiation of a virtual storage array checkpoint. A virtual storage array checkpoint may be initiated automatically by a branch virtual storage interface according to a schedule or based on criteria, such as the amount of data changed since the last checkpoint. In a further embodiment, a virtual storage array checkpoint may be initiated in response to a request for a virtual storage array snapshot from a system administrator or administration application.
  • To create a virtual storage array checkpoint, in an embodiment of the method 500, step 510 sets the branch virtual storage array interface to a quiescent state. This entails completing any pending operations with storage clients (though not necessarily background operations between the branch and data center virtual storage array interfaces). While in the quiescent state, the branch virtual storage interface will not accept any new storage operations from storage clients.
  • Once the branch virtual storage array interface is set to a quiescent state by step 510, in step 515, an embodiment of the branch virtual storage array interface identifies updated storage blocks in its associated virtual storage array cache. These updated storage blocks include data that has been created or updated by storage clients but has yet to be transferred via the WAN back to the data center LAN for storage in the physical data storage array.
  • Once all of the updated storage blocks have been identified, in step 515 an embodiment of the branch virtual storage array creates a checkpoint data structure. The checkpoint data structure specifies a time of checkpoint creation and the set of updated storage blocks at that moment of time. Following the creation of the checkpoint data structure, in an embodiment of the method 500, step 520 reactivates the branch's virtual storage array. Following step 520, the branch virtual storage array interface can resume servicing storage operations from storage clients. Additionally, the branch virtual storage array may resume transferring new or updated storage blocks via the WAN to the data center LAN for storage in the physical data storage array. In a further embodiment, the virtual storage array cache may maintain a copy of an updated storage block even after a copy is transferred back to the data center LAN for storage. This allows subsequent snapshots to be created based on this data.
  • In an embodiment, following the reactivation of the virtual storage array in step 520, the virtual storage array interface preserves the updated storage blocks specified by the checkpoint data structure from further changes. If a storage client attempts to update a storage block that is associated with a checkpoint, an embodiment of the virtual storage array interface creates a duplicate of this storage block in the virtual storage array cache to store the updated data. This preserves the data of this storage block at the time of the checkpoint for potential future reference.
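  • Checkpoint creation and the preservation of checkpointed blocks can be sketched as below. The class is hypothetical and omits the quiescent state of step 510. It relies on one assumption: block data is held as immutable `bytes`, so that a later write rebinds the live entry to a new object while the checkpoint keeps the old one, which plays the role of the duplicate storage block described above.

```python
import time


class CheckpointingCache:
    """Hypothetical virtual storage array cache with checkpoint support."""

    def __init__(self):
        self.updated = {}       # block id -> latest data not yet at the data center
        self.checkpoints = []   # checkpoint data structures, oldest first

    def write(self, block_id, data):
        # Rebinds the live entry; data held by earlier checkpoints is untouched.
        self.updated[block_id] = bytes(data)

    def create_checkpoint(self):
        """Step 515: record the creation time and the set of updated blocks."""
        checkpoint = {"created": time.time(), "blocks": dict(self.updated)}
        self.checkpoints.append(checkpoint)
        return checkpoint


cache = CheckpointingCache()
cache.write(7, b"before checkpoint")
cp = cache.create_checkpoint()
cache.write(7, b"after checkpoint")
print(cp["blocks"][7])     # b'before checkpoint'
print(cache.updated[7])    # b'after checkpoint'
```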
  • Optionally, an embodiment of the method 500 may initiate one or more additional virtual storage array checkpoints at later times or in response to criteria or conditions. Embodiments of the virtual storage array interface may maintain any arbitrary number of checkpoint data structures and automatically delete outdated checkpoint data structures. For example, a branch virtual storage interface may maintain only the most recently created checkpoint data structure, or checkpoint data structures from the beginning of the most recent business day and the most recent hour.
  • At some point, a system administrator or administration application may request a snapshot of the virtual storage array data. A snapshot of the virtual storage array data represents the complete set of virtual storage array data at a specific moment of time. Step 525 receives a snapshot request from a system administrator or administration application. In response to a snapshot request, in step 530, an embodiment of a branch virtual storage array interface transfers a copy of the appropriate checkpoint data structure to the data center virtual storage interface. Additionally, the branch virtual storage array interface transfers a copy of any updated storage blocks specified by this checkpoint data structure.
  • In an embodiment, the data center virtual storage array interface creates a snapshot of the data of the virtual storage array. The snapshot includes a copy of all of the virtual storage array data in the physical data storage array unchanged from the time of creation of the checkpoint data structure. The snapshot also includes a copy of the updated storage blocks specified by the checkpoint data structure. An embodiment of the data center virtual storage array interface may store the snapshot in the physical storage array or using a data backup. In an embodiment, the data center virtual storage array interface automatically sends storage operations to the physical storage array interface to create a snapshot from a checkpoint data structure. These storage operations can be carried out in the background by the data center virtual storage array interface in addition to translating virtual storage array operations from one or more branch virtual storage array interfaces into corresponding physical storage array operations.
  • As described above, storage clients can interact with virtual storage arrays in the same manner that they would interact with physical storage arrays. This includes issuing storage commands to the branch virtual storage interface using storage array network protocols such as iSCSI or Fibre Channel protocol. Most storage array network protocols organize data according to storage blocks, each of which has a unique storage address or location. A storage block's unique storage address may include a logical unit number (using the SCSI protocol) or other representation of a logical volume.
  • In an embodiment, the virtual storage arrays provided by branch virtual storage interfaces allow storage clients to access storage blocks by their unique storage address within the virtual storage array. However, because one or more virtual storage arrays actually store their data within a physical storage array, for example implemented as a physical storage area network, an embodiment of the invention allows arbitrary mappings between the unique storage addresses of storage blocks in the virtual storage array and the corresponding unique storage addresses in one or more physical storage arrays. In an embodiment, the mapping between virtual and physical storage address may be performed by a branch virtual storage array interface and/or by a data center virtual storage array interface. Furthermore, there may be multiple levels of mapping between a branch virtual storage array and the physical storage array.
  • In an embodiment, storage blocks in the virtual storage array may be of a different size and/or structure than the corresponding storage blocks in the physical storage array. For example, if data compression is applied to the storage data, then the physical storage array data blocks may be smaller than the storage blocks of the virtual storage array, to take advantage of data storage savings. In an embodiment, the branch and/or data center virtual storage array interfaces map one or more virtual storage array storage blocks to one or more physical storage array storage blocks. Thus, a virtual storage array storage block can correspond with a fraction of a physical storage array storage block, a single physical storage array storage block, or multiple physical storage array storage blocks, as required by the configuration of the virtual and physical storage arrays.
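  • The following sketch shows one way a virtual storage block could map to a fraction of a physical block or to several physical blocks when compression is applied. It is an assumption-laden illustration: the 512-byte physical block size, the append-only byte store, and the use of zlib are choices made here, not details from the specification.

```python
import os
import zlib

PHYSICAL_BLOCK_SIZE = 512   # assumed physical block size in bytes


class CompressedBlockMap:
    """Hypothetical mapping from virtual blocks to compressed physical extents."""

    def __init__(self):
        self.store = bytearray()   # physical storage, viewed as fixed-size blocks
        self.mapping = {}          # virtual block id -> (start offset, length)

    def write(self, virtual_id, data):
        compressed = zlib.compress(data)
        start = len(self.store)
        self.store.extend(compressed)
        self.mapping[virtual_id] = (start, len(compressed))

    def physical_blocks(self, virtual_id):
        """Return the physical block numbers that hold this virtual block."""
        start, length = self.mapping[virtual_id]
        first = start // PHYSICAL_BLOCK_SIZE
        last = (start + length - 1) // PHYSICAL_BLOCK_SIZE
        return list(range(first, last + 1))

    def read(self, virtual_id):
        start, length = self.mapping[virtual_id]
        return zlib.decompress(bytes(self.store[start:start + length]))


m = CompressedBlockMap()
m.write(0, bytes(4096))        # highly compressible 4 KB virtual block
m.write(1, os.urandom(4096))   # incompressible 4 KB virtual block
print(m.physical_blocks(0))    # a fraction of one physical block
print(len(m.physical_blocks(1)))
```

  • The compressible virtual block occupies part of a single physical block, while the incompressible one spans several, matching the range of correspondences described above.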
  • In a further embodiment, the branch and data center virtual storage array interfaces may reorder or regroup storage operations from storage clients to improve efficiency of data optimizations such as data compression. For example, if two storage clients are simultaneously accessing the same virtual storage array, then these storage operations will be intermixed when received by the branch virtual storage array interface. An embodiment of the branch and/or data center virtual storage array interface can reorder or regroup these storage operations according to storage client, type of storage operation, data or application type, or any other attribute or criteria to improve virtual storage array performance and efficiency. For example, a virtual storage array interface can group storage operations by storage client and apply data compression to each storage client's operations separately, which is likely to provide greater data compression than compressing all storage operations together. FIG. 6 illustrates an example 600 of optimized data compression and deduplication using file-system or other storage format awareness, such as database nodes, according to an embodiment of the invention. In the example 600, incoming requests for file system blocks or clusters are regrouped and reordered based on their associated file system file and their position within their respective files.
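  • The regrouping shown in example 600 can be approximated with a few lines of Python. The request tuple layout (file name, position within the file, payload) is hypothetical, and the sketch only demonstrates the reordering; whether grouping improves compression depends on the data.

```python
from itertools import groupby
from operator import itemgetter


def regroup(requests):
    """Order requests by file and by position within the file, then group per file.

    Each request is a (file_name, file_position, payload) tuple.
    """
    ordered = sorted(requests, key=itemgetter(0, 1))
    return {name: list(group) for name, group in groupby(ordered, key=itemgetter(0))}


intermixed = [
    ("b.db", 1, b"B1"),
    ("a.txt", 2, b"A2"),
    ("b.db", 0, b"B0"),
    ("a.txt", 0, b"A0"),
    ("a.txt", 1, b"A1"),
]
grouped = regroup(intermixed)
print([payload for _, _, payload in grouped["a.txt"]])  # [b'A0', b'A1', b'A2']
```

  • Each per-file group could then be passed to a compressor or deduplication stage as a unit.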
  • In an embodiment, unique storage labels can be assigned to storage blocks or groups of storage blocks in the virtual storage array cache. These unique storage labels can be determined arbitrarily or based on the data included in storage blocks, for example using hashes or hashes of hashes. Furthermore, hierarchical labels may be assigned to storage blocks. A hierarchical label is associated with a sequence of one or more additional labels. Each of these additional labels is associated with either a storage block or one or more additional labels. By assigning labels to storage blocks, WAN optimization techniques can be further applied to virtual storage array data traffic between the branch LAN and the data center LAN.
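  • A minimal sketch of data-derived labels and hierarchical labels follows. SHA-256 is an assumed hash function, and the `LabelStore` class and its methods are hypothetical; the specification only states that labels may be based on hashes or hashes of hashes.

```python
import hashlib


def block_label(data):
    """Label a storage block by a hash of its contents."""
    return hashlib.sha256(data).hexdigest()


def hierarchical_label(child_labels):
    """Label a sequence of labels by a hash of those labels (a hash of hashes)."""
    return hashlib.sha256("".join(child_labels).encode("ascii")).hexdigest()


class LabelStore:
    """Hypothetical store resolving labels to blocks or to further labels."""

    def __init__(self):
        self.blocks = {}   # block label -> block data
        self.groups = {}   # hierarchical label -> ordered list of child labels

    def add_group(self, blocks):
        labels = []
        for data in blocks:
            label = block_label(data)
            self.blocks[label] = data
            labels.append(label)
        top = hierarchical_label(labels)
        self.groups[top] = labels
        return top

    def expand(self, label):
        """Resolve a label, recursively, into the ordered list of block data."""
        if label in self.groups:
            result = []
            for child in self.groups[label]:
                result.extend(self.expand(child))
            return result
        return [self.blocks[label]]


store = LabelStore()
top = store.add_group([b"block one", b"block two"])
print(store.expand(top))  # [b'block one', b'block two']
```

  • If both ends of the WAN hold such a store, a sender could transmit a single hierarchical label in place of a group of blocks the receiver already has, which is the kind of WAN optimization the labels are meant to enable.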
  • Embodiments of the invention can implement virtual storage array interfaces at the branch and/or data center as standalone devices or as part of other devices, computer systems, or applications. FIG. 7 illustrates an example virtual machine implementation 700 of a virtual storage array interface according to an embodiment of the invention. In this example virtual machine implementation 700, the virtual storage array interface 705 is implemented as a software application executed by a virtual machine 710. The virtual machine 710 is located in this example within a network optimizer device 715; however, other embodiments of this virtual machine implementation 700 can be located within other types of network devices, including switches, routers, and storage devices and interfaces.
  • In an embodiment, the virtual machine 710 implementing the virtual storage interface is optionally connected with an internal or external data storage device to act as a virtual storage array cache 720.
  • In an embodiment, the network optimizer 715 includes LAN and WAN network connections 725 and 730 for intercepting network traffic. A virtual machine hardware and software interface 740 is connected with these network connections to allow the virtual machine to send and receive network communications. In this example, the network optimizer also includes a network optimization module 735 for performing WAN optimization techniques on network traffic passing between the LAN and the WAN network connections 725 and 730.
  • In a further embodiment, the network optimizer 715 or other host device may include multiple virtual machines for executing additional applications, application servers, and/or performing additional data processing functions. For example, a network optimizer device can include a first virtual machine for implementing a virtual storage array interface to a virtual storage array; a second virtual machine for implementing an application server, such as a database application; and a third virtual machine executing a data processing application, such as an anti-virus scanning application. In this example, the virtual machines can communicate with each other as well as with other entities connected via the local and wide area networks.
  • FIG. 8 illustrates an example computer system capable of implementing a virtual storage array interface according to an embodiment of the invention. FIG. 8 is a block diagram of a computer system 2000, such as a personal computer or other digital device, suitable for practicing an embodiment of the invention. Embodiments of computer system 2000 may include dedicated networking devices, such as wireless access points, network switches, hubs, routers, hardware firewalls, network traffic optimizers and accelerators, network attached storage devices, storage array network interfaces, and combinations thereof.
  • Computer system 2000 includes a central processing unit (CPU) 2005 for running software applications and optionally an operating system. CPU 2005 may comprise one or more processing cores. Memory 2010 stores applications and data for use by the CPU 2005. Examples of memory 2010 include dynamic and static random access memory. Storage 2015 provides non-volatile storage for applications and data and may include fixed or removable hard disk drives, flash memory devices, ROM, and CD-ROM, DVD-ROM, Blu-ray, HD-DVD, or other magnetic, optical, or solid state storage devices. In a further embodiment, CPU 2005 may execute virtual machine software applications to create one or more virtual processors capable of executing additional software applications and optional additional operating systems.
  • Optional user input devices 2020 communicate user inputs from one or more users to the computer system 2000, examples of which may include keyboards, mice, joysticks, digitizer tablets, touch pads, touch screens, still or video cameras, and/or microphones. In an embodiment, user input devices may be omitted and computer system 2000 may present a user interface to a user over a network, for example using a web page or network management protocol and network management software applications.
  • Computer system 2000 includes one or more network interfaces 2025 that allow computer system 2000 to communicate with other computer systems via an electronic communications network, and may include wired or wireless communication over local area networks and wide area networks such as the Internet. Computer system 2000 may support a variety of networking protocols at one or more levels of abstraction. For example, computer system 2000 may support networking protocols at one or more layers of the seven layer OSI network model. An embodiment of network interface 2025 includes one or more wireless network interfaces adapted to communicate with wireless clients and with other wireless networking devices using radio waves, for example using the 802.11 family of protocols, such as 802.11a, 802.11b, 802.11g, and 802.11n.
  • An embodiment of the computer system 2000 may also include a wired networking interface, such as one or more Ethernet connections to communicate with other networking devices via local or wide-area networks.
  • The components of computer system 2000, including CPU 2005, memory 2010, data storage 2015, user input devices 2020, and network interface 2025, are connected via one or more data buses 2060. Additionally, some or all of the components of computer system 2000, including CPU 2005, memory 2010, data storage 2015, user input devices 2020, and network interface 2025, may be integrated together into one or more integrated circuits or integrated circuit packages. Furthermore, some or all of the components of computer system 2000 may be implemented as application specific integrated circuits (ASICs) and/or programmable logic.
  • Further embodiments can be envisioned by one of ordinary skill in the art after reading the attached documents. For example, embodiments of the invention can be used with any number of network connections and may be added to any type of network device, client or server computer, or other computing device in addition to the computer illustrated above. In other embodiments, combinations or sub-combinations of the above disclosed invention can be advantageously made. The block diagrams of the architecture and flow charts are grouped for ease of understanding. However, it should be understood that combinations of blocks, additions of new blocks, re-arrangement of blocks, and the like are contemplated in alternative embodiments of the present invention.
  • The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims.

Claims (26)

1. A method of optimizing a block storage protocol read access to a block storage device via a wide area network, the method comprising:
receiving a storage request specifying at least a first storage block from a storage client, wherein the storage client is connected with a wide area network at a first network location;
identifying at least a first portion of a set of file system entities corresponding with the first storage block;
identifying at least a second portion of the set of file system entities likely to be associated with a future storage request based on the first portion of the set of file system entities;
identifying at least a second storage block corresponding with the second portion of the set of file system entities;
retrieving the second storage block from a data storage connected with the wide area network at a second network location;
communicating via the wide area network the second storage block from the data storage to a storage block cache at the first network location; and
storing the second storage block in the storage block cache.
2. The method of claim 1, wherein the first portion of the set of file system entities and the second portion of the set of file system entities include a first one of the set of file system entities.
3. The method of claim 1, wherein the first portion of the set of file system entities includes a first one of the set of file system entities and the second portion of the set of file system entities includes a second one of the set of file system entities.
4. The method of claim 1, wherein the set of file system entities includes a file system entity.
5. The method of claim 1, wherein the set of file system entities includes a directory.
6. The method of claim 1, wherein the set of file system entities includes a file system data structure.
7. The method of claim 1, wherein identifying at least the first portion of a set of file system entities corresponding with the first storage block comprises:
accessing a storage structure database including mappings from storage block locations to portions of the set of file system entities.
8. The method of claim 1, wherein identifying at least the second storage block corresponding with the second portion of the set of file system entities comprises:
accessing a data storage structure including previously determined mappings from portions of the set of file system entities to storage block locations.
9. The method of claim 1, comprising:
receiving a second storage request from the storage client;
determining if the second storage request includes a request for the second storage block;
in response to the determination that the second storage request includes the request for the second storage block, retrieving the second storage block from the storage block cache at the first network location; and
in response to the determination that the second storage request does not include the request for the second storage block, retrieving at least one additional storage block from the data storage connected with the wide area network at the second network location.
10. A method of optimizing a block storage protocol read access to a block storage device via a wide area network, the method comprising:
receiving a storage request specifying at least a first storage block from a storage client, wherein the storage client is connected with a wide area network at a first network location;
identifying at least a first portion of a set of database entities corresponding with the first storage block;
identifying at least a second portion of the set of database entities likely to be associated with a future storage request based on the first portion of the set of database entities;
identifying at least a second storage block corresponding with the second portion of the set of database entities;
retrieving the second storage block from a data storage connected with the wide area network at a second network location;
communicating via the wide area network the second storage block from the data storage to a storage block cache at the first network location; and
storing the second storage block in the storage block cache.
11. The method of claim 10, wherein the first portion of the set of database entities and the second portion of the set of database entities include a first one of the set of database entities.
12. The method of claim 10, wherein the first portion of the set of database entities includes a first one of the set of database entities and the second portion of the set of database entities includes a second one of the set of database entities.
13. The method of claim 10, wherein the set of database entities includes a table.
14. The method of claim 10, wherein the set of database entities includes a database system node.
15. The method of claim 10, wherein identifying at least the first portion of a set of database entities corresponding with the first storage block comprises:
accessing a storage structure database including mappings from storage block locations to portions of the set of database entities.
16. The method of claim 10, wherein identifying at least the second storage block corresponding with the second portion of the set of database entities comprises:
accessing a data storage structure including previously determined mappings from portions of the set of database entities to storage block locations.
17. The method of claim 10, comprising:
receiving a second storage request from the storage client;
determining if the second storage request includes a request for the second storage block;
in response to the determination that the second storage request includes the request for the second storage block, retrieving the second storage block from the storage block cache at the first network location; and
in response to the determination that the second storage request does not include the request for the second storage block, retrieving at least one additional storage block from the data storage connected with the wide area network at the second network location.
18. A method of optimizing a block storage protocol write access to a block storage device via a wide area network, the method comprising:
receiving a storage request specifying at least a first storage block from a storage client connected with a wide area network at a first network location;
determining if a storage block cache has sufficient capacity to store at least the first storage block; and
in response to the determination that the storage block cache has sufficient capacity to store at least the first storage block:
storing the first storage block in the storage block cache;
sending a storage request acknowledgement to the storage client indicating that the storage request is complete; and
following the storage request acknowledgement, communicating the first storage block via the wide area network to a data storage connected with the wide area network at a second network location, wherein the data storage is adapted to store the first storage block.
19. The method of claim 18, wherein the storage block cache is located at the first network location and is connected with the storage client via a first local network.
20. The method of claim 18, further comprising:
in response to the determination that the storage block cache does not have sufficient capacity to store at least the first storage block:
communicating the first storage block via the wide area network to a data storage connected with the wide area network at a second network location;
receiving a first storage request acknowledgement from the data storage, wherein the first storage request acknowledgement indicates that the data storage has stored the first storage block; and
following the receipt of the first storage request acknowledgement, sending a second storage request acknowledgement to the storage client indicating that the storage request is complete.
21. A method of preserving data in a data storage device, the method comprising:
setting a storage interface connected with a wide area network at a first network location to a quiescent state;
identifying a first set of storage blocks in a storage block cache connected at the first network location that has changed since its initial storage in the storage block cache;
setting the storage interface to an active state;
following the storage interface setting to the active state, transferring the first set of storage blocks via the wide area network to a second network location; and
storing a data snapshot on a data storage at the second network location, wherein the snapshot includes the first set of storage blocks.
22. The method of claim 21, wherein transferring is in response to a snapshot request received from an administration application.
23. The method of claim 21, wherein the data snapshot includes a copy of a second set of storage blocks stored by the data storage, wherein the second set of storage blocks is unchanged since the time that the storage interface is set to the quiescent state.
24. The method of claim 23, wherein the second set of storage blocks was previously stored by the data storage prior to the storage interface being set to the quiescent state.
25. The method of claim 21, comprising:
receiving, prior to transferring the first set of storage blocks, a first modification to at least a portion of the first set of storage blocks;
in response to receiving the first modification, creating a copy of at least the portion of the first set of storage blocks;
applying the first modification to the copy of at least the portion of the first set of storage blocks; and
preserving the unmodified portion of the first set of storage blocks for transfer to the second network location.
26. The method of claim 25, comprising:
receiving a storage request from a storage client at the first network location, wherein the storage request specifies at least the portion of the first set of storage blocks; and
in response to the storage request, providing the modified copy of the portion of the first set of storage blocks to the storage client.
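
The read path of claim 1 and the write path of claims 18 to 20 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `RemoteStore`, `EdgeBlockCache`, `block_to_entity`, and `entity_to_blocks` are hypothetical, the two dictionaries stand in for the mappings recited in claims 7 and 8, and the prediction rule (prefetch the other blocks of the same file system entity, as permitted by claim 2) is an assumption chosen for simplicity. Capacity is counted in blocks and no eviction policy is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteStore:
    """Stand-in for the data storage at the second network location."""
    blocks: dict = field(default_factory=dict)
    wan_reads: int = 0  # number of blocks fetched across the simulated WAN

    def read(self, block_id):
        self.wan_reads += 1
        return self.blocks[block_id]

    def write(self, block_id, data):
        self.blocks[block_id] = data
        return True  # models the data storage's acknowledgement


class EdgeBlockCache:
    """Hypothetical storage block cache at the first network location."""

    def __init__(self, remote, block_to_entity, entity_to_blocks, capacity):
        self.remote = remote
        self.block_to_entity = block_to_entity    # block -> file system entity
        self.entity_to_blocks = entity_to_blocks  # entity -> list of blocks
        self.capacity = capacity                  # maximum number of cached blocks
        self.cache = {}
        self.dirty = set()  # acknowledged locally, not yet sent over the WAN

    def _has_room(self):
        return len(self.cache) < self.capacity

    def read(self, block_id):
        if block_id in self.cache:  # served locally, no WAN round trip
            return self.cache[block_id]
        data = self.remote.read(block_id)
        if self._has_room():
            self.cache[block_id] = data
        # Prefetch (assumed rule): map the block to its entity, then pull the
        # entity's other blocks into the cache before they are requested.
        entity = self.block_to_entity.get(block_id)
        for other in self.entity_to_blocks.get(entity, []):
            if other not in self.cache and self._has_room():
                self.cache[other] = self.remote.read(other)
        return data

    def write(self, block_id, data):
        if block_id in self.cache or self._has_room():
            # Sufficient capacity: store locally and acknowledge at once;
            # the WAN transfer happens later in flush().
            self.cache[block_id] = data
            self.dirty.add(block_id)
            return "acknowledged before WAN transfer"
        # Insufficient capacity: forward first, acknowledge only after the
        # data storage has confirmed the write.
        self.remote.write(block_id, data)
        return "acknowledged after WAN transfer"

    def flush(self):
        """Send locally acknowledged blocks to the data storage."""
        for block_id in list(self.dirty):
            self.remote.write(block_id, self.cache[block_id])
        self.dirty.clear()
```

In this sketch, a first read of one block of a file also brings the file's remaining blocks into the cache, so a later read of those blocks does not cross the WAN. A write is acknowledged immediately when the cache has room and is sent to the remote store only when `flush()` runs; when the cache is full, the write goes to the remote store before it is acknowledged.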
US12/730,179 2009-03-23 2010-03-23 Virtualized Data Storage Over Wide-Area Networks Abandoned US20100241726A1 (en)

Priority Applications (2)

Application Number Publication Number Priority Date Filing Date Title
US12/730,179 US20100241726A1 (en) 2009-03-23 2010-03-23 Virtualized Data Storage Over Wide-Area Networks
US12/818,872 US8504670B2 (en) 2010-03-23 2010-06-18 Virtualized data storage applications and optimizations

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16246309P 2009-03-23 2009-03-23
US12/730,179 US20100241726A1 (en) 2009-03-23 2010-03-23 Virtualized Data Storage Over Wide-Area Networks

Related Child Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US12/818,872 Continuation-In-Part US8504670B2 (en) 2010-03-23 2010-06-18 Virtualized data storage applications and optimizations

Publications (1)

Publication Number Publication Date
US20100241726A1 true US20100241726A1 (en) 2010-09-23

Family

ID=42738538

Family Applications (5)

Application Number Status (Expiration) Publication Number Priority Date Filing Date Title
US12/730,192 Abandoned US20100241807A1 (en) 2009-03-23 2010-03-23 Virtualized data storage system cache management
US12/730,179 Abandoned US20100241726A1 (en) 2009-03-23 2010-03-23 Virtualized Data Storage Over Wide-Area Networks
US12/730,198 Active 2031-11-14 US9348842B2 (en) 2009-03-23 2010-03-23 Virtualized data storage system optimizations
US12/730,185 Active 2031-12-29 US10831721B2 (en) 2009-03-23 2010-03-23 Virtualized data storage system architecture
US16/849,888 Active 2030-07-03 US11593319B2 (en) 2009-03-23 2020-04-15 Virtualized data storage system architecture

Country Status (3)

Country Link
US (5) US20100241807A1 (en)
EP (1) EP2411918B1 (en)
WO (1) WO2010111312A2 (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100306445A1 (en) * 2009-05-28 2010-12-02 Steven Dake Mechanism for Virtual Logical Volume Management
US20110238775A1 (en) * 2010-03-23 2011-09-29 Riverbed Technology, Inc. Virtualized Data Storage Applications and Optimizations
US20120158660A1 (en) * 2010-12-15 2012-06-21 International Business Machines Corporation Method and system for deduplicating data
US20130166625A1 (en) * 2010-05-27 2013-06-27 Adobe Systems Incorporated Optimizing Caches For Media Streaming
WO2013134105A1 (en) * 2012-03-05 2013-09-12 Riverbed Technology, Inc. Virtualized data storage system architecture using prefetching agent
US20140025713A1 (en) * 2012-07-23 2014-01-23 Red Hat, Inc. Unified file and object data storage
US20140059298A1 (en) * 2012-08-24 2014-02-27 Dell Products L.P. Snapshot Coordination
US8966172B2 (en) 2011-11-15 2015-02-24 Pavilion Data Systems, Inc. Processor agnostic data storage in a PCIE based shared storage enviroment
US9069782B2 (en) 2012-10-01 2015-06-30 The Research Foundation For The State University Of New York System and method for security and privacy aware virtual machine checkpointing
US9223607B2 (en) 2012-01-17 2015-12-29 Microsoft Technology Licensing, Llc System for replicating or migrating virtual machine operations log by throttling guest write iOS based on destination throughput
US9262329B2 (en) 2012-08-24 2016-02-16 Dell Products L.P. Snapshot access
WO2016024994A1 (en) * 2014-08-15 2016-02-18 Hitachi, Ltd. Method and apparatus to virtualize remote copy pair in three data center configuration
US9372726B2 (en) 2013-01-09 2016-06-21 The Research Foundation For The State University Of New York Gang migration of virtual machines using cluster-wide deduplication
US9565269B2 (en) 2014-11-04 2017-02-07 Pavilion Data Systems, Inc. Non-volatile memory express over ethernet
US9600206B2 (en) 2012-08-01 2017-03-21 Microsoft Technology Licensing, Llc Request ordering support when switching virtual disk replication logs
US9652182B2 (en) 2012-01-31 2017-05-16 Pavilion Data Systems, Inc. Shareable virtual non-volatile storage device for a server
US9712619B2 (en) 2014-11-04 2017-07-18 Pavilion Data Systems, Inc. Virtual non-volatile memory express drive
US9733860B2 (en) 2011-07-06 2017-08-15 Microsoft Technology Licensing, Llc Combined live migration and storage migration using file shares and mirroring
CN107092677A (en) * 2010-12-29 2017-08-25 亚马逊科技公司 Receiver-side Data duplication in data system is deleted
US9767284B2 (en) 2012-09-14 2017-09-19 The Research Foundation For The State University Of New York Continuous run-time validation of program execution: a practical approach
US9767271B2 (en) 2010-07-15 2017-09-19 The Research Foundation For The State University Of New York System and method for validating program execution at run-time
US9823842B2 (en) 2014-05-12 2017-11-21 The Research Foundation For The State University Of New York Gang migration of virtual machines using cluster-wide deduplication
US10009412B1 (en) * 2017-02-09 2018-06-26 International Business Machines Corporation Distributed file transfer with high performance
US10095707B1 (en) 2014-12-19 2018-10-09 EMC IP Holding Company LLC Nearline cloud storage based on FUSE framework
US10095710B1 (en) * 2014-12-19 2018-10-09 EMC IP Holding Company LLC Presenting cloud based storage as a virtual synthetic
US10120765B1 (en) 2014-12-19 2018-11-06 EMC IP Holding Company LLC Restore process using incremental inversion
US10235463B1 (en) * 2014-12-19 2019-03-19 EMC IP Holding Company LLC Restore request and data assembly processes
US10838820B1 (en) 2014-12-19 2020-11-17 EMC IP Holding Company, LLC Application level support for selectively accessing files in cloud-based storage
US10901943B1 (en) * 2016-09-30 2021-01-26 EMC IP Holding Company LLC Multi-tier storage system with direct client access to archive storage tier
US11038809B1 (en) * 2012-06-07 2021-06-15 Open Invention Network Llc Migration of files contained on virtual storage to a cloud storage infrastructure
US11477280B1 (en) * 2017-07-26 2022-10-18 Pure Storage, Inc. Integrating cloud storage services

Families Citing this family (85)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100241807A1 (en) 2009-03-23 2010-09-23 Riverbed Technology, Inc. Virtualized data storage system cache management
US8417765B2 (en) * 2009-06-09 2013-04-09 International Business Machines Corporation Method and apparatus to enable protocol verification
US8751780B2 (en) * 2010-02-08 2014-06-10 Microsoft Corporation Fast machine booting through streaming storage
US8751738B2 (en) 2010-02-08 2014-06-10 Microsoft Corporation Background migration of virtual storage
US9135031B1 (en) * 2010-04-28 2015-09-15 Netapp, Inc. System and method for determining storage resources of a virtual machine in a virtual server environment
US10423577B2 (en) 2010-06-29 2019-09-24 International Business Machines Corporation Collections for storage artifacts of a tree structured repository established via artifact metadata
KR101845110B1 (en) * 2010-10-06 2018-04-03 가부시끼가이샤 도시바 Distributed cache coherency protocol
US10394757B2 (en) 2010-11-18 2019-08-27 Microsoft Technology Licensing, Llc Scalable chunk store for data deduplication
US8495108B2 (en) 2010-11-30 2013-07-23 International Business Machines Corporation Virtual node subpool management
JP5681465B2 (en) * 2010-12-02 2015-03-11 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation Information processing system, information processing apparatus, preparation method, program, and recording medium
US8645335B2 (en) 2010-12-16 2014-02-04 Microsoft Corporation Partial recall of deduplicated files
US8380681B2 (en) 2010-12-16 2013-02-19 Microsoft Corporation Extensible pipeline for data deduplication
US8892707B2 (en) 2011-04-13 2014-11-18 Netapp, Inc. Identification of virtual applications for backup in a cloud computing system
US9244933B2 (en) 2011-04-29 2016-01-26 International Business Machines Corporation Disk image introspection for storage systems
US9483484B1 (en) * 2011-05-05 2016-11-01 Veritas Technologies Llc Techniques for deduplicated data access statistics management
WO2012154658A2 (en) * 2011-05-06 2012-11-15 University Of North Carolina At Chapel Hill Methods, systems, and computer readable media for efficient computer forensic analysis and data access control
US8683466B2 (en) * 2011-05-24 2014-03-25 Vmware, Inc. System and method for generating a virtual desktop
US9705756B2 (en) 2011-06-02 2017-07-11 Hewlett Packard Enterprise Development Lp Network virtualization
US9342254B2 (en) * 2011-06-04 2016-05-17 Microsoft Technology Licensing, Llc Sector-based write filtering with selective file and registry exclusions
US8849777B1 (en) 2011-06-30 2014-09-30 Emc Corporation File deletion detection in key value databases for virtual backups
US8849769B1 (en) * 2011-06-30 2014-09-30 Emc Corporation Virtual machine file level recovery
US9158632B1 (en) * 2011-06-30 2015-10-13 Emc Corporation Efficient file browsing using key value databases for virtual backups
US9710338B1 (en) * 2011-06-30 2017-07-18 EMC IP Holding Company LLC Virtual machine data recovery
US9229951B1 (en) 2011-06-30 2016-01-05 Emc Corporation Key value databases for virtual backups
US9311327B1 (en) * 2011-06-30 2016-04-12 Emc Corporation Updating key value databases for virtual backups
US8949829B1 (en) 2011-06-30 2015-02-03 Emc Corporation Virtual machine disaster recovery
US8843443B1 (en) 2011-06-30 2014-09-23 Emc Corporation Efficient backup of virtual data
US8671075B1 (en) 2011-06-30 2014-03-11 Emc Corporation Change tracking indices in virtual machines
US8990171B2 (en) * 2011-09-01 2015-03-24 Microsoft Corporation Optimization of a partially deduplicated file
US20130159382A1 (en) * 2011-12-15 2013-06-20 Microsoft Corporation Generically presenting virtualized data
CN103917966B (en) * 2011-12-23 2016-08-24 英派尔科技开发有限公司 Resources optimistic utilization in cluster tool
US9984079B1 (en) * 2012-01-13 2018-05-29 Amazon Technologies, Inc. Managing data storage using storage policy specifications
US9773006B1 (en) * 2012-02-15 2017-09-26 Veritas Technologies Llc Techniques for managing non-snappable volumes
US9110595B2 (en) 2012-02-28 2015-08-18 AVG Netherlands B.V. Systems and methods for enhancing performance of software applications
US8914585B1 (en) 2012-03-31 2014-12-16 Emc Corporation System and method for obtaining control of a logical unit number
US8874799B1 (en) 2012-03-31 2014-10-28 Emc Corporation System and method for improving cache performance
US8914584B1 (en) 2012-03-31 2014-12-16 Emc Corporation System and method for improving cache performance upon detection of a LUN control event
US8554954B1 (en) * 2012-03-31 2013-10-08 Emc Corporation System and method for improving cache performance
US10606754B2 (en) * 2012-04-16 2020-03-31 International Business Machines Corporation Loading a pre-fetch cache using a logical volume mapping
US9223799B1 (en) * 2012-06-29 2015-12-29 Emc Corporation Lightweight metadata sharing protocol for location transparent file access
WO2014006656A1 (en) * 2012-07-05 2014-01-09 Hitachi, Ltd. Computer system, cache control method and computer program
US9766873B2 (en) * 2012-08-17 2017-09-19 Tripwire, Inc. Operating system patching and software update reconciliation
EP2701356B1 (en) * 2012-08-20 2017-08-02 Alcatel Lucent A method for establishing an authorized communication between a physical object and a communication device enabling a write access
US10489295B2 (en) * 2012-10-08 2019-11-26 Sandisk Technologies Llc Systems and methods for managing cache pre-fetch
US9727268B2 (en) 2013-01-08 2017-08-08 Lyve Minds, Inc. Management of storage in a storage network
US10120900B1 (en) 2013-02-25 2018-11-06 EMC IP Holding Company LLC Processing a database query using a shared metadata store
US9984083B1 (en) 2013-02-25 2018-05-29 EMC IP Holding Company LLC Pluggable storage system for parallel query engines across non-native file systems
US9298634B2 (en) * 2013-03-06 2016-03-29 Gregory RECUPERO Client spatial locality through the use of virtual request trackers
US9286225B2 (en) * 2013-03-15 2016-03-15 Saratoga Speed, Inc. Flash-based storage system including reconfigurable circuitry
US9304902B2 (en) 2013-03-15 2016-04-05 Saratoga Speed, Inc. Network storage system using flash storage
US9983992B2 (en) 2013-04-30 2018-05-29 WMware Inc. Trim support for a solid-state drive in a virtualized environment
US10902081B1 (en) 2013-05-06 2021-01-26 Veeva Systems Inc. System and method for controlling electronic communications
US9430031B2 (en) * 2013-07-29 2016-08-30 Western Digital Technologies, Inc. Power conservation based on caching
US9678678B2 (en) * 2013-12-20 2017-06-13 Lyve Minds, Inc. Storage network data retrieval
US10313236B1 (en) 2013-12-31 2019-06-04 Sanmina Corporation Method of flow based services for flash storage
US9280780B2 (en) 2014-01-27 2016-03-08 Umbel Corporation Systems and methods of generating and using a bitmap index
US9378021B2 (en) * 2014-02-14 2016-06-28 Intel Corporation Instruction and logic for run-time evaluation of multiple prefetchers
US9672180B1 (en) 2014-08-06 2017-06-06 Sanmina Corporation Cache memory management system and method
US9384147B1 (en) 2014-08-13 2016-07-05 Saratoga Speed, Inc. System and method for cache entry aging
US9715428B1 (en) 2014-09-24 2017-07-25 Sanmina Corporation System and method for cache data recovery
US9697227B2 (en) 2014-10-27 2017-07-04 Cohesity, Inc. Concurrent access and transactions in a distributed file system
US9628350B2 (en) 2014-11-05 2017-04-18 Amazon Technologies, Inc. Dynamic scaling of storage volumes for storage client file systems
JP2016115286A (en) * 2014-12-17 2016-06-23 株式会社リコー Information processing apparatus and information processing method
US11588783B2 (en) * 2015-06-10 2023-02-21 Cisco Technology, Inc. Techniques for implementing IPV6-based distributed storage space
US10129357B2 (en) 2015-08-21 2018-11-13 International Business Machines Corporation Managing data storage in distributed virtual environment
US9507628B1 (en) 2015-09-28 2016-11-29 International Business Machines Corporation Memory access request for a memory protocol
US10069896B2 (en) * 2015-11-01 2018-09-04 International Business Machines Corporation Data transfer via a data storage drive
US10067711B2 (en) 2015-11-01 2018-09-04 International Business Machines Corporation Data transfer between data storage libraries
US9607104B1 (en) 2016-04-29 2017-03-28 Umbel Corporation Systems and methods of using a bitmap index to determine bicliques
US11003658B2 (en) * 2016-11-21 2021-05-11 International Business Machines Corporation Selectively retrieving data from remote share nothing computer clusters
US10067944B2 (en) * 2017-01-09 2018-09-04 Splunk, Inc. Cache aware searching of buckets in remote storage
US10705796B1 (en) 2017-04-27 2020-07-07 Intuit Inc. Methods, systems, and computer program product for implementing real-time or near real-time classification of digital data
US10467122B1 (en) 2017-04-27 2019-11-05 Intuit Inc. Methods, systems, and computer program product for capturing and classification of real-time data and performing post-classification tasks
US10528329B1 (en) * 2017-04-27 2020-01-07 Intuit Inc. Methods, systems, and computer program product for automatic generation of software application code
US10467261B1 (en) 2017-04-27 2019-11-05 Intuit Inc. Methods, systems, and computer program product for implementing real-time classification and recommendations
EP3635582B1 (en) * 2017-06-08 2022-12-14 Hitachi Vantara LLC Fast recall for geographically distributed object data
US11526469B1 (en) * 2017-07-31 2022-12-13 EMC IP Holding Company LLC File system reorganization in the presence of inline compression
GB2569651A (en) * 2017-12-22 2019-06-26 Veea Systems Ltd Edge computing system
US11281589B2 (en) * 2018-08-30 2022-03-22 Micron Technology, Inc. Asynchronous forward caching memory systems and methods
CN109460862B (en) * 2018-10-22 2021-04-27 郑州大学 Method for solving multi-objective optimization problem based on MAB (multi-object-based) hyperheuristic algorithm
US11321114B2 (en) * 2019-07-19 2022-05-03 Vmware, Inc. Hypervisor assisted application virtualization
US11782610B2 (en) * 2020-01-30 2023-10-10 Seagate Technology Llc Write and compare only data storage
US11150840B2 (en) * 2020-02-09 2021-10-19 International Business Machines Corporation Pinning selected volumes within a heterogeneous cache
WO2021225080A1 (en) * 2020-05-08 2021-11-11 ソニーグループ株式会社 Information processing device, information processing method, and program
US11755219B1 (en) 2022-05-26 2023-09-12 International Business Machines Corporation Block access prediction for hybrid cloud storage

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020165942A1 (en) * 2001-01-29 2002-11-07 Ulrich Thomas R. Data path accelerator with variable parity, variable length, and variable extent parity groups
US20060064536A1 (en) * 2004-07-21 2006-03-23 Tinker Jeffrey L Distributed storage architecture based on block map caching and VFS stackable file system modules
US20060212539A1 (en) * 1999-06-11 2006-09-21 Microsoft Corporation Network file system
US20080140937A1 (en) * 2006-12-12 2008-06-12 Sybase, Inc. System and Methodology Providing Multiple Heterogeneous Buffer Caches
US20100257219A1 (en) * 2001-08-03 2010-10-07 Isilon Systems, Inc. Distributed file system for intelligently managing the storing and retrieval of data

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5947424A (en) * 1995-08-01 1999-09-07 Tolco, Incorporated Pipe support assembly with retaining strap
US6442565B1 (en) * 1999-08-13 2002-08-27 Hiddenmind Technology, Inc. System and method for transmitting data content in a computer network
US6718454B1 (en) * 2000-04-29 2004-04-06 Hewlett-Packard Development Company, L.P. Systems and methods for prefetch operations to reduce latency associated with memory access
WO2002035359A2 (en) * 2000-10-26 2002-05-02 Prismedia Networks, Inc. Method and system for managing distributed content and related metadata
US20020161860A1 (en) * 2001-02-28 2002-10-31 Benjamin Godlin Method and system for differential distributed data file storage, management and access
US6868439B2 (en) * 2002-04-04 2005-03-15 Hewlett-Packard Development Company, L.P. System and method for supervising use of shared storage by multiple caching servers physically connected through a switching router to said shared storage via a robust high speed connection
US7359890B1 (en) * 2002-05-08 2008-04-15 Oracle International Corporation System load based adaptive prefetch
JP4244572B2 (en) * 2002-07-04 2009-03-25 ソニー株式会社 Cache device, cache data management method, and computer program
JP4124331B2 (en) * 2002-09-17 2008-07-23 株式会社日立製作所 Virtual volume creation and management method for DBMS
US6910106B2 (en) * 2002-10-04 2005-06-21 Microsoft Corporation Methods and mechanisms for proactive memory management
JP4116413B2 (en) * 2002-12-11 2008-07-09 株式会社日立製作所 Prefetch appliance server
US7953819B2 (en) * 2003-08-22 2011-05-31 Emc Corporation Multi-protocol sharable virtual storage objects
JP2005148868A (en) * 2003-11-12 2005-06-09 Hitachi Ltd Data prefetch in storage device
US7631148B2 (en) * 2004-01-08 2009-12-08 Netapp, Inc. Adaptive file readahead based on multiple factors
US7849257B1 (en) * 2005-01-06 2010-12-07 Zhe Khi Pak Method and apparatus for storing and retrieving data
WO2006082592A1 (en) * 2005-02-04 2006-08-10 Hewlett-Packard Development Company, L.P. Data processing system and method
ATE512412T1 (en) * 2005-04-25 2011-06-15 Network Appliance Inc SYSTEM AND METHOD FOR CACHING NETWORK FILE SYSTEMS
US7386662B1 (en) * 2005-06-20 2008-06-10 Symantec Operating Corporation Coordination of caching and I/O management in a multi-layer virtualized storage environment
US7386675B2 (en) * 2005-10-21 2008-06-10 Isilon Systems, Inc. Systems and methods for using excitement values to predict future access to resources
EP2153340A4 (en) 2007-05-08 2015-10-21 Riverbed Technology Inc A hybrid segment-oriented file server and wan accelerator
US8903938B2 (en) * 2007-06-18 2014-12-02 Amazon Technologies, Inc. Providing enhanced data retrieval from remote locations
US7702857B2 (en) * 2007-08-22 2010-04-20 International Business Machines Corporation Adjusting parameters used to prefetch data from storage into cache
US9323680B1 (en) * 2007-09-28 2016-04-26 Veritas Us Ip Holdings Llc Method and apparatus for prefetching data
US20100241807A1 (en) 2009-03-23 2010-09-23 Riverbed Technology, Inc. Virtualized data storage system cache management

Cited By (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8151033B2 (en) * 2009-05-28 2012-04-03 Red Hat, Inc. Mechanism for virtual logical volume management
US20100306445A1 (en) * 2009-05-28 2010-12-02 Steven Dake Mechanism for Virtual Logical Volume Management
US20110238775A1 (en) * 2010-03-23 2011-09-29 Riverbed Technology, Inc. Virtualized Data Storage Applications and Optimizations
US8504670B2 (en) * 2010-03-23 2013-08-06 Riverbed Technology, Inc. Virtualized data storage applications and optimizations
US9253548B2 (en) * 2010-05-27 2016-02-02 Adobe Systems Incorporated Optimizing caches for media streaming
US9532114B2 (en) 2010-05-27 2016-12-27 Adobe Systems Incorporated Optimizing caches for media streaming
US20130166625A1 (en) * 2010-05-27 2013-06-27 Adobe Systems Incorporated Optimizing Caches For Media Streaming
US9767271B2 (en) 2010-07-15 2017-09-19 The Research Foundation For The State University Of New York System and method for validating program execution at run-time
US8364641B2 (en) * 2010-12-15 2013-01-29 International Business Machines Corporation Method and system for deduplicating data
US8458132B2 (en) * 2010-12-15 2013-06-04 International Business Machines Corporation Method and system for deduplicating data
US20120158660A1 (en) * 2010-12-15 2012-06-21 International Business Machines Corporation Method and system for deduplicating data
CN107092677A (en) * 2010-12-29 2017-08-25 亚马逊科技公司 Receiver-side Data duplication in data system is deleted
US9733860B2 (en) 2011-07-06 2017-08-15 Microsoft Technology Licensing, Llc Combined live migration and storage migration using file shares and mirroring
US8966172B2 (en) 2011-11-15 2015-02-24 Pavilion Data Systems, Inc. Processor agnostic data storage in a PCIE based shared storage enviroment
US9720598B2 (en) 2011-11-15 2017-08-01 Pavilion Data Systems, Inc. Storage array having multiple controllers
US9285995B2 (en) 2011-11-15 2016-03-15 Pavilion Data Systems, Inc. Processor agnostic data storage in a PCIE based shared storage environment
US9223607B2 (en) 2012-01-17 2015-12-29 Microsoft Technology Licensing, Llc System for replicating or migrating virtual machine operations log by throttling guest write iOS based on destination throughput
US9652182B2 (en) 2012-01-31 2017-05-16 Pavilion Data Systems, Inc. Shareable virtual non-volatile storage device for a server
WO2013134105A1 (en) * 2012-03-05 2013-09-12 Riverbed Technology, Inc. Virtualized data storage system architecture using prefetching agent
US11038809B1 (en) * 2012-06-07 2021-06-15 Open Invention Network Llc Migration of files contained on virtual storage to a cloud storage infrastructure
US11522806B1 (en) 2012-06-07 2022-12-06 International Business Machines Corporation Migration of files contained on virtual storage to a cloud storage infrastructure
US20140025713A1 (en) * 2012-07-23 2014-01-23 Red Hat, Inc. Unified file and object data storage
US10515058B2 (en) 2012-07-23 2019-12-24 Red Hat, Inc. Unified file and object data storage
US9971788B2 (en) * 2012-07-23 2018-05-15 Red Hat, Inc. Unified file and object data storage
US9971787B2 (en) 2012-07-23 2018-05-15 Red Hat, Inc. Unified file and object data storage
US9600206B2 (en) 2012-08-01 2017-03-21 Microsoft Technology Licensing, Llc Request ordering support when switching virtual disk replication logs
US9189396B2 (en) * 2012-08-24 2015-11-17 Dell Products L.P. Snapshot coordination
US20140059298A1 (en) * 2012-08-24 2014-02-27 Dell Products L.P. Snapshot Coordination
US9262329B2 (en) 2012-08-24 2016-02-16 Dell Products L.P. Snapshot access
US9767284B2 (en) 2012-09-14 2017-09-19 The Research Foundation For The State University Of New York Continuous run-time validation of program execution: a practical approach
US9552495B2 (en) 2012-10-01 2017-01-24 The Research Foundation For The State University Of New York System and method for security and privacy aware virtual machine checkpointing
US9069782B2 (en) 2012-10-01 2015-06-30 The Research Foundation For The State University Of New York System and method for security and privacy aware virtual machine checkpointing
US10324795B2 (en) 2012-10-01 2019-06-18 The Research Foundation for the State University o System and method for security and privacy aware virtual machine checkpointing
US9372726B2 (en) 2013-01-09 2016-06-21 The Research Foundation For The State University Of New York Gang migration of virtual machines using cluster-wide deduplication
US9823842B2 (en) 2014-05-12 2017-11-21 The Research Foundation For The State University Of New York Gang migration of virtual machines using cluster-wide deduplication
US10156986B2 (en) 2014-05-12 2018-12-18 The Research Foundation For The State University Of New York Gang migration of virtual machines using cluster-wide deduplication
WO2016024994A1 (en) * 2014-08-15 2016-02-18 Hitachi, Ltd. Method and apparatus to virtualize remote copy pair in three data center configuration
US9936024B2 (en) 2014-11-04 2018-04-03 Pavilion Data Systems, Inc. Storage sever with hot plug and unplug capabilities
US9565269B2 (en) 2014-11-04 2017-02-07 Pavilion Data Systems, Inc. Non-volatile memory express over ethernet
US10079889B1 (en) 2014-11-04 2018-09-18 Pavilion Data Systems, Inc. Remotely accessible solid state drive
US9712619B2 (en) 2014-11-04 2017-07-18 Pavilion Data Systems, Inc. Virtual non-volatile memory express drive
US10348830B1 (en) 2014-11-04 2019-07-09 Pavilion Data Systems, Inc. Virtual non-volatile memory express drive
US10095707B1 (en) 2014-12-19 2018-10-09 EMC IP Holding Company LLC Nearline cloud storage based on FUSE framework
US10838820B1 (en) 2014-12-19 2020-11-17 EMC IP Holding Company, LLC Application level support for selectively accessing files in cloud-based storage
US11068553B2 (en) * 2014-12-19 2021-07-20 EMC IP Holding Company LLC Restore request and data assembly processes
US10095710B1 (en) * 2014-12-19 2018-10-09 EMC IP Holding Company LLC Presenting cloud based storage as a virtual synthetic
US10120765B1 (en) 2014-12-19 2018-11-06 EMC IP Holding Company LLC Restore process using incremental inversion
US11003546B2 (en) 2014-12-19 2021-05-11 EMC IP Holding Company LLC Restore process using incremental inversion
US10997128B1 (en) 2014-12-19 2021-05-04 EMC IP Holding Company LLC Presenting cloud based storage as a virtual synthetic
US10235463B1 (en) * 2014-12-19 2019-03-19 EMC IP Holding Company LLC Restore request and data assembly processes
US10846270B2 (en) 2014-12-19 2020-11-24 EMC IP Holding Company LLC Nearline cloud storage based on fuse framework
US10901943B1 (en) * 2016-09-30 2021-01-26 EMC IP Holding Company LLC Multi-tier storage system with direct client access to archive storage tier
US10594771B2 (en) 2017-02-09 2020-03-17 International Business Machines Corporation Distributed file transfer with high performance
US10594772B2 (en) 2017-02-09 2020-03-17 International Business Machines Corporation Distributed file transfer with high performance
US10218774B2 (en) * 2017-02-09 2019-02-26 International Business Machines Corporation Distributed file transfer with high performance
US10225321B2 (en) * 2017-02-09 2019-03-05 International Business Machines Corporation Distributed file transfer with high performance
US10009412B1 (en) * 2017-02-09 2018-06-26 International Business Machines Corporation Distributed file transfer with high performance
US11477280B1 (en) * 2017-07-26 2022-10-18 Pure Storage, Inc. Integrating cloud storage services

Also Published As

Publication number Publication date
US10831721B2 (en) 2020-11-10
US20100241807A1 (en) 2010-09-23
EP2411918A2 (en) 2012-02-01
US11593319B2 (en) 2023-02-28
EP2411918B1 (en) 2018-07-11
WO2010111312A2 (en) 2010-09-30
US20100241673A1 (en) 2010-09-23
WO2010111312A3 (en) 2010-12-29
US9348842B2 (en) 2016-05-24
US20100241654A1 (en) 2010-09-23
US20200242088A1 (en) 2020-07-30
EP2411918A4 (en) 2013-08-21

Similar Documents

Publication Publication Date Title
US11593319B2 (en) Virtualized data storage system architecture
US8504670B2 (en) Virtualized data storage applications and optimizations
US20130232215A1 (en) Virtualized data storage system architecture using prefetching agent
US9852149B1 (en) Transferring and caching a cloud file in a distributed filesystem
US9824095B1 (en) Using overlay metadata in a cloud controller to generate incremental snapshots for a distributed filesystem
US9679040B1 (en) Performing deduplication in a distributed filesystem
US9678968B1 (en) Deleting a file from a distributed filesystem
US8799413B2 (en) Distributing data for a distributed filesystem across multiple cloud storage systems
US9678981B1 (en) Customizing data management for a distributed filesystem
US8799414B2 (en) Archiving data for a distributed filesystem
US8788628B1 (en) Pre-fetching data for a distributed filesystem
US9792298B1 (en) Managing metadata and data storage for a cloud controller in a distributed filesystem
US11294855B2 (en) Cloud-aware snapshot difference determination
US8805968B2 (en) Accessing cached data from a peer cloud controller in a distributed filesystem
US8694469B2 (en) Cloud synthetic backups
US9852150B2 (en) Avoiding client timeouts in a distributed filesystem
US8805967B2 (en) Providing disaster recovery for a distributed filesystem
US9811662B2 (en) Performing anti-virus checks for a distributed filesystem
US9613064B1 (en) Facilitating the recovery of a virtual machine using a distributed filesystem
US8504797B2 (en) Method and apparatus for managing thin provisioning volume by using file storage system
JP2022095781A (en) System and method of database tenant migration
US20130297855A1 (en) Ensuring write operation consistency using multiple storage devices
US20050071560A1 (en) Autonomic block-level hierarchical storage management for storage networks
US11023433B1 (en) Systems and methods for bi-directional replication of cloud tiered data across incompatible clusters
US20130297854A1 (en) Ensuring write operation consistency using raid storage devices

Legal Events

Date Code Title Description
AS Assignment

Owner name: RIVERBED TECHNOLOGY, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:WU, DAVID;REEL/FRAME:024472/0201

Effective date: 20100323

AS Assignment

Owner name: MORGAN STANLEY & CO. LLC, MARYLAND

Free format text: SECURITY AGREEMENT;ASSIGNORS:RIVERBED TECHNOLOGY, INC.;OPNET TECHNOLOGIES, INC.;REEL/FRAME:029646/0060

Effective date: 20121218

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: RIVERBED TECHNOLOGY, INC., CALIFORNIA

Free format text: RELEASE OF PATENT SECURITY INTEREST;ASSIGNOR:MORGAN STANLEY & CO. LLC, AS COLLATERAL AGENT;REEL/FRAME:032113/0425

Effective date: 20131220