US20200356277A1 - De-duplication of client-side data cache for virtual disks - Google Patents

De-duplication of client-side data cache for virtual disks

Info

Publication number
US20200356277A1
US20200356277A1 (Application No. US16/937,401; US202016937401A)
Authority
US
United States
Prior art keywords
data
client
block
virtual disk
storage platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/937,401
Inventor
Avinash Lakshman
Gaurav YADAV
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Commvault Systems Inc
Original Assignee
Commvault Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Commvault Systems Inc filed Critical Commvault Systems Inc
Priority to US16/937,401
Assigned to Hedvig, Inc. reassignment Hedvig, Inc. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: LAKSHMAN, AVINASH, YADAV, Gaurav
Assigned to COMMVAULT SYSTEMS, INC. reassignment COMMVAULT SYSTEMS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: Hedvig, Inc.
Publication of US20200356277A1
Assigned to JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT reassignment JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: COMMVAULT SYSTEMS, INC.
Status: Abandoned

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • G06F3/0641De-duplication techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662Virtualisation aspects
    • G06F3/0665Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0689Disk arrays, e.g. RAID, JBOD
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45583Memory management, e.g. access or allocation

Definitions

  • the present invention relates generally to local caching of data to be stored on a virtual disk within a data center. More specifically, the present invention relates to de-duplication of data stored in the local cache.
  • each computer server can now host dozens of software applications through the use of a hypervisor on each computer server and the use of virtual machines.
  • computer servers which had been underutilized could now host many different server applications, each application needing to store its data within the storage area network. Weaknesses in the storage area network were revealed by the sheer number of server applications needing to access disks within the central storage node. And, even with the use of remote storage platforms (such as “in-the-cloud” storage), problems still exist.
  • a global client-side cache within a computer server of a compute farm allows any client application, software application or virtual machine executing on that computer to make use of this client-side cache. De-duplication of blocks of data within this client-side cache then occurs globally and automatically for all applications executing upon that computer or upon others, regardless of which is the client and regardless of which virtual disk is being accessed within the storage platform. Additionally, each application may decide whether or not to enable client-side caching for each of its virtual disks.
  • the storage resources overhead associated with de-duplication metadata is minimal (<2%) compared to other prior art techniques, and the present invention keeps metadata distributed as well, which means node or disk failures do not lead to a reduction in de-duplication ratios.
  • the computing resources overhead is negligible as well: the present invention does not need any specific hardware for de-duplication, and can be run on any commodity hardware.
  • the present invention performs global de-duplication, not at the volume or disk level, which means higher de-duplication ratios across the entire storage platform.
  • the present invention performs in-line de-duplication, which means the invention only writes unique data to the storage platform.
  • Prior art offline or asynchronous de-duplication performs de-duplication in the background, and hence does not provide any real-time guarantees as to reduction in storage.
  • in-line de-duplication also increases the capacity and life of raw disks.
  • a method writes a block of data to a virtual disk on a remote storage platform.
  • a computer server receives a write request to write the block of data from the computer server to the remote storage platform, the write request includes an offset within the virtual disk and the data.
  • the server writes the block of data to a storage node of the storage platform.
  • the computer server calculates a hash value of the block of data using a hash function or similar function to produce a unique identifier for the block.
  • the computer determines whether the resulting hash value exists in a first metadata table of a block cache of the computer server.
  • the computer adds an entry in a second metadata table of the block cache that includes the virtual disk offset and the hash value as a key/value pair.
  • a later read request uses these tables to find the block of data in the cache without the need to go to the storage platform.
  • a method writes a block of data to a virtual disk on a remote storage platform.
  • a computer server receives a write request to write the block of data from the computer server to the remote storage platform, the write request includes an offset within the virtual disk and the data.
  • the server writes the block of data to a storage node of the storage platform.
  • the computer server calculates a hash value of the block of data using a hash function or similar function to produce a unique identifier for the block.
  • the computer determines whether the resulting hash value exists in a first metadata table of a block cache of the computer server.
  • the computer writes the block of data into the block cache at a block cache data offset and stores the hash value and the block cache data offset as a key/value pair in the first metadata table.
  • the computer adds an entry in a second metadata table of the block cache that includes the virtual disk offset and the hash value as a key/value pair.
  • a later read request uses these tables to find the block of data in the cache without the need to go to the storage platform.
  • a method reads a block of data from a virtual disk on a remote storage platform.
  • a computer server receives a read request to read the block of data from the remote storage platform, the read request includes an offset within the virtual disk.
  • the computer server determines whether the virtual disk offset exists as an entry in a first metadata table of a block cache of the computer server. If so, the computer retrieves a unique identifier corresponding to the virtual disk offset in the entry, and then accesses a second metadata table of the block cache and retrieves a block cache data offset using the unique identifier as a key. Finally, the computer reads the block of data from the block cache at the block cache data offset. Thus, it is not necessary to access the remote storage platform to read the block of data.
  • a method reads a block of data from a virtual disk on a remote storage platform.
  • a computer server receives a read request to read the block of data from the remote storage platform, the read request includes an offset within the virtual disk.
  • the computer server determines whether the virtual disk offset exists as an entry in a first metadata table of a block cache of the computer server. If not, the computer reads the block of data from a remote storage platform. The block is then returned to the requesting application.
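  • The four embodiments above all rely on two key/value tables kept in the client-side block cache: one mapping a block's hash value to its location in the cache, and one mapping a virtual disk offset to that hash value. The minimal Python sketch below (plain in-memory dictionaries and hypothetical names such as cache_write and first_table, not the patented implementation) illustrates how a write populates both tables and how a later read resolves a virtual disk offset back to cached data without contacting the storage platform.

```python
import hashlib

BLOCK_SIZE = 4096      # a typical block size; the hash of each block identifies it

first_table = {}       # hash value -> block cache data offset
second_table = {}      # (virtual disk name, virtual disk offset) -> hash value
cache_data = bytearray()   # stands in for the block cache's data region

def cache_write(disk, vdisk_offset, block):
    """Record one block in the client-side cache, de-duplicating by hash value."""
    digest = hashlib.md5(block).hexdigest()
    if digest not in first_table:              # unique block: store its data once
        first_table[digest] = len(cache_data)
        cache_data.extend(block)
    second_table[(disk, vdisk_offset)] = digest    # always map offset -> hash

def cache_read(disk, vdisk_offset):
    """Return the cached block for this virtual disk offset, or None on a miss."""
    digest = second_table.get((disk, vdisk_offset))
    if digest is None:
        return None                            # caller would fall back to the platform
    start = first_table[digest]
    return bytes(cache_data[start:start + BLOCK_SIZE])

# Writing identical data to two virtual disks stores the block only once,
# yet a read of either virtual disk offset is served from the local cache.
block = b"\x00" * BLOCK_SIZE
cache_write("vdisk1", 0, block)
cache_write("vdisk2", 8192, block)
assert cache_read("vdisk2", 8192) == block
assert len(cache_data) == BLOCK_SIZE           # one copy despite two writes
```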
  • FIG. 1 illustrates a data storage system having a storage platform according to one embodiment of the invention.
  • FIG. 2 is a symbolic representation of a virtual disk showing how data within the virtual disk is stored within the storage platform.
  • FIG. 3 illustrates in greater detail the computer servers in communication with the storage platform.
  • FIG. 4 illustrates one example of a block cache.
  • FIG. 5 illustrates a metadata table present within metadata used to store identifiers for blocks of data that have been stored within the block cache.
  • FIG. 6 illustrates another metadata table present within metadata used to store MD5s corresponding to virtual disk offsets.
  • FIG. 7 is a flow diagram describing one embodiment by which a virtual machine writes data to the storage platform.
  • FIG. 8 is a flow diagram describing one embodiment by which a virtual machine reads data from the storage platform.
  • FIGS. 9 and 10 illustrate a computer system suitable for implementing embodiments of the present invention.
  • FIG. 1 illustrates a data storage system 10 according to one embodiment of the invention having a storage platform 20 .
  • the storage platform 20 includes any number of computer nodes 30 - 40 .
  • Each computer node of the storage platform has a unique identifier (e.g., “A”) that uniquely identifies that computer node within the storage platform.
  • Each computer node is a computer having any number of hard drives and solid-state drives (e.g., flash drives), and in one embodiment includes about twenty disks of about 1 TB each.
  • a typical storage platform may include on the order of about 81 TB and may include any number of computer nodes.
  • One advantage is that a platform may start with as few as three nodes and then grow incrementally to as large as 1,000 nodes or more.
  • Computer nodes 30-40 are shown logically being grouped together, although they may be spread across data centers and may be in different geographic locations.
  • a management console 40 used for provisioning virtual disks within the storage platform communicates with the platform over a link 44 .
  • Any number of remotely located computer servers 50 - 52 each typically executes a hypervisor in order to host any number of virtual machines.
  • Server computers 50 - 52 form what is typically referred to as a compute farm.
  • these virtual machines may be implementing any of a variety of applications such as a database server, an e-mail server, etc., including applications from companies such as Oracle, Microsoft, etc. These applications write to and read data from the storage platform using a suitable storage protocol such as iSCSI or NFS, although each application will not be aware that data is being transferred over link 54 using a different protocol.
  • Management console 40 is any suitable computer able to communicate over an Internet connection or link 44 with storage platform 20 .
  • When an administrator wishes to manage the storage platform (e.g., provisioning a virtual disk, snapshots, revert, clone, analyze metrics, determine health of cluster, etc.), he or she uses the management console to access the storage platform and is put in communication with a management console routine executing as part of a software module on any one of the computer nodes within the platform.
  • the management console routine is typically a Web server application.
  • In order to provision a new virtual disk within storage platform 20 for a particular application running on a virtual machine, the virtual disk is first created and then attached to a particular virtual machine.
  • In order to create a virtual disk, a user uses the management console to first select the size of the virtual disk (e.g., 100 GB), and then selects the individual policies that will apply to that virtual disk. For example, the user selects a replication factor, a data center aware policy and other policies concerning whether or not to compress the data, the type of disk storage, etc.
  • Once the virtual disk has been created, it is then attached to a particular virtual machine within one of the computer servers 50-52 and the provisioning process is complete.
  • storage platform 20 is able to simulate prior art central storage nodes (such as the VMax and Clarion products from EMC, VMWare products, etc.) and the virtual machines and software applications will be unaware that they are communicating with storage platform 20 instead of a prior art central storage node.
  • the provisioning process can be completed on the order of minutes or less, rather than in four to eight weeks as was typical with prior art techniques.
  • the advantage is that one only needs to add metadata concerning a new virtual disk in order to provision the disk and have the disk ready to perform writes and reads.
  • an administrator is aware that a particular software application desires a virtual disk within the platform and is aware of the characteristics that the virtual disk should have.
  • the administrator first uses the management console to access the platform and connect with the management console Web server on any one of the computer nodes within the platform.
  • the administrator chooses the characteristics of the new virtual disk such as a name; a size; a replication factor; a residence; compressed; a replication policy; cache enabled (a quality-of-service choice); and a disk type (indicating whether the virtual disk is of a block type—the iSCSI protocol—or of a file type—the NFS protocol).
  • one of the characteristics for the virtual disk that may be chosen is whether or not the client-side cache of the local computer should be enabled for that virtual disk. Applications that do not read or write frequently may not desire the cache to be enabled (as writing to the cache can add overhead), while applications that read and write frequently may desire the cache to be enabled. Cache enablement, thus, is an optional feature that may be turned on or off for each virtual disk.
  • these characteristics are stored as “virtual disk information” metadata onto a computer node within the storage platform and may be replicated.
  • the virtual disk metadata has been stored upon metadata nodes within the platform (which might be different from the nodes where the actual data of the virtual disk will be stored).
  • the identities of the storage nodes which store this metadata for the virtual disk are also sent to the controller virtual machine for placing into a cache.
  • the virtual disk that has been created is also attached to a virtual machine of the compute farm.
  • the administrator is aware of which virtual machine on which computer of the compute farm needs the virtual disk.
  • information regarding the newly created virtual disk (i.e., name, space available, virtual disk information, etc.) is sent from the management console routine to the appropriate computer within the compute farm.
  • the information is provided to a controller virtual machine which stores the information in a cache, ready for use when the virtual machine needs to write or to read data.
  • the administrator also supplies the name of the virtual disk to the application that will use it.
  • FIG. 2 is a symbolic representation of a virtual disk 330 showing how data within the virtual disk is stored within the storage platform.
  • the virtual disk has been provisioned as a disk holding up to 50 GB, and the disk has been logically divided into segments or portions of 16 GB each. Each of these portions is termed a “container,” and may range in size from about 4 GB up to about 32 GB, although a size of 16 GB works well.
  • portions 332 - 338 are referred to as containers C 1 , C 2 , C 3 and C 4 .
  • each container of data will be stored upon a particular node or nodes within the storage platform that are chosen during the write process.
  • the replication factor is three, thus, data stored within container 332 will be stored upon the three nodes A, B and F, data stored within the second container 334 will be stored upon the three nodes B, D and E, etc. Note that this storage technique using containers is one of many possible implementations of the storage platform and is transparent to the virtual machines that are storing data.
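  • As a rough illustration of the container scheme just described (16 GB containers, a replication factor of three, nodes chosen during the write process), the sketch below maps a virtual disk offset to its container index and looks up the nodes holding that container's replicas. The node assignments mirror the FIG. 2 example; the function and variable names are hypothetical.

```python
CONTAINER_SIZE = 16 * 2**30            # 16 GB containers, as in FIG. 2

# Example placement from FIG. 2 (replication factor three); in the real platform
# the nodes for a container are chosen during the write process.
container_nodes = {
    0: ["A", "B", "F"],                # container C1
    1: ["B", "D", "E"],                # container C2
}

def container_index(vdisk_offset):
    """Which 16 GB container of the virtual disk holds this offset."""
    return vdisk_offset // CONTAINER_SIZE

def replica_nodes(vdisk_offset):
    """Storage nodes holding replicas of the container for this offset."""
    return container_nodes[container_index(vdisk_offset)]

# An offset of 20 GB falls in the second container (C2), replicated on B, D and E.
assert container_index(20 * 2**30) == 1
assert replica_nodes(20 * 2**30) == ["B", "D", "E"]
```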
  • FIG. 3 illustrates in greater detail one of the computer servers 51 in communication with storage platform 20 .
  • each computer server may host any number of virtual machines, each executing a particular software application.
  • the application may perform I/O handling using a block-based protocol such as iSCSI or a file-based protocol such as NFS, and the virtual machine communicates using this protocol.
  • Other suitable protocols may also be used by an application.
  • the actual communication protocol used between the server and platform is transparent to these virtual machines.
  • server 51 includes a hypervisor and virtual machines 182 and 186 that desire to perform I/O handling using the iSCSI protocol 187 or the NFS protocol 183 .
  • Server 51 also includes a specialized controller virtual machine (CVM) 180 that is adapted to handle communications with the virtual machines using either protocol (and others), yet communicates with the storage platform using a proprietary protocol 189 .
  • Protocol 189 may be any suitable protocol for passing data between storage platform 20 and a remote computer server 51 such as TCP.
  • the CVM may also communicate with public cloud storage using the same or different protocol 191 .
  • the CVM need not communicate any “liveness” information between itself and the computer nodes of the platform. There is no need for any CVM to track the status of nodes in the cluster. The CVM need only talk to a node in the platform, which is then able to route requests to other nodes and public storage nodes.
  • the CVM also uses a memory cache 181 on the computer server 51 .
  • In communication with computer server 51 and with CVM 180 are also any number of solid-state disks (or other similar persistent storage) 195 that will be explained in greater detail below. These disks may be used as a data cache to store data blocks that are written into storage platform 20 and then to rapidly retrieve these data blocks instead of retrieving them from the remote storage platform.
  • CVM 180 handles different protocols by simulating an entity that the protocol would expect. For example, when communicating under the iSCSI block protocol, CVM responds to an iSCSI Initiation by behaving as an iSCSI Target. In other words, when virtual machine 186 performs I/O handling, it is the iSCSI Initiator and the controller virtual machine is the iSCSI Target. When an application is using the block protocol, the CVM masquerades as the iSCSI Target, traps the iSCSI CDBs, translates this information into its own protocol, and then communicates this information to the storage platform. Thus, when the CVM presents itself as an iSCSI Target, the application may simply talk to a block device as it would do normally.
  • the CVM when communicating with an NFS client, the CVM behaves as an NFS server.
  • When virtual machine 182 performs I/O handling, the controller virtual machine is the NFS server and the NFS client (on behalf of virtual machine 182) executes either in the hypervisor of computer server 51 or in the operating system kernel of virtual machine 182.
  • the CVM masquerades as an NFS server, captures NFS packets, and then communicates this information to the storage platform using its own protocol.
  • An application is unaware that the CVM is trapping and intercepting its calls under the iSCSI or NFS protocol, or that the CVM even exists.
  • One advantage is that an application need not be changed in order to write to and read from the storage platform.
  • Use of the CVM allows an application executing upon a virtual machine to continue using the protocol it expects, yet allows these applications on the various computer servers to write data to and read data from the same storage platform 20 .
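  • The CVM's role described above is to present whatever front end the application expects (an iSCSI Target or an NFS server), trap the request, and translate it into the platform's own protocol. The sketch below is only a conceptual illustration of that translation step, with hypothetical class and method names and an assumed platform_client interface; it does not implement iSCSI or NFS.

```python
class ControllerVM:
    """Conceptual front end: normalize trapped requests, forward in one protocol."""

    def __init__(self, platform_client):
        self.platform = platform_client            # speaks the proprietary protocol 189

    def on_iscsi_write(self, target_name, lba, data, block_size=512):
        # A trapped iSCSI CDB is reduced to (virtual disk, byte offset, data).
        self._write(virtual_disk=target_name, offset=lba * block_size, data=data)

    def on_nfs_write(self, file_handle, offset, data):
        # A captured NFS WRITE is reduced to the same internal form.
        self._write(virtual_disk=file_handle, offset=offset, data=data)

    def _write(self, virtual_disk, offset, data):
        # Single internal path: the same call regardless of the client protocol.
        self.platform.write(virtual_disk, offset, data)
```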
  • Replicas of a virtual disk may be stored within public cloud storage 190 .
  • public cloud storage refers to those data centers operated by enterprises that allow the public to store data for a fee. Included within these data centers are those known as Amazon Web Services and Google Compute.
  • the write request will include an identifier for each computer node to which a replica should be written. For example, nodes may be identified by their IP address.
  • the computer node within the platform that first fields the write request from the CVM will then route the data to be written to nodes identified by their IP addresses.
  • Any replica that should be sent to the public cloud can then simply be sent to the DNS name of a particular node, and the request (and data) is then routed to the appropriate public storage cloud. Any suitable computer router within the storage platform may handle this operation.
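  • Following the routing description above, a write request carries an identifier for each replica target: storage nodes within the platform are identified by IP address, while a replica destined for public cloud storage can be addressed by a DNS name and routed onward by the platform. A small, hypothetical sketch of assembling such a target list (the gateway host name is illustrative):

```python
def replica_targets(platform_nodes, cloud_replicas=0, cloud_gateway="cloud-gw.example.com"):
    """Build the identifier list included in a write request.

    platform_nodes: IP addresses of storage nodes chosen for this container.
    cloud_replicas: how many additional replicas should go to public cloud storage,
                    each addressed via a DNS name and routed by the platform.
    (The names and the gateway host are illustrative, not taken from the patent.)
    """
    return list(platform_nodes) + [cloud_gateway] * cloud_replicas

# Two in-platform replicas plus one routed onward to the public cloud.
targets = replica_targets(["10.0.0.11", "10.0.0.23"], cloud_replicas=1)
assert targets == ["10.0.0.11", "10.0.0.23", "cloud-gw.example.com"]
```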
  • For a client machine such as computer 51, the present invention provides an apparatus and technique to efficiently cache data on the client side so that, during a read operation from a software application, it may not be necessary to access the remote storage platform 20.
  • One advantage of the present invention is that very large sizes of a data cache are supported and that blocks of data are stored efficiently.
  • the invention facilitates very large data caches because the invention de-duplicates data in the cache as well, which in turn increases the cache capacity by the factor of the de-duplication ratio.
  • FIG. 4 illustrates one example of a block cache 195 .
  • the block cache is implemented using persistent storage such as any number of hard disks, and most preferably solid-state disks are used. There may be one or more solid-state disks in the block cache. Given a particular size of the block cache (such as 1 TB), FIG. 4 indicates that approximately 10% of the block cache is used for metadata storage 410 and that the remaining portion 420 is used for data storage.
  • a block cache data offset 430 is used to indicate a particular location of a particular block of data within the block cache.
  • Although the block cache can be many disks or a single disk, the invention takes only one disk as an input; users may combine multiple disks into one logical disk using suitable software such as a Logical Volume Manager (LVM) tool.
  • FIG. 5 illustrates a metadata table 440 present within metadata 410 used to store identifiers for blocks of data that have been stored within the block cache. Metadata is stored in pairs, where column 444 indicates the MD5 (or other message digest or unique hash value from a hash function) of a particular block of data, and where column 448 indicates the offset within data 420 where that block of data has been stored.
  • FIG. 6 illustrates a metadata table 480 present within metadata 410 used to store MD5s corresponding to virtual disk offsets. This metadata is stored in pairs, where column 484 indicates a particular offset of a block of data within a particular named virtual disk, and where column 488 indicates the MD5 for the corresponding block of data.
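  • To make the layout of FIGS. 4-6 concrete, the sketch below models a block cache whose capacity is split roughly 10% metadata / 90% data, with table 440 (MD5 to block cache data offset) and table 480 (virtual disk offset to MD5) kept in the metadata region. This is an in-memory approximation with assumed method names; the actual cache lives on persistent solid-state disks and its on-disk format is not specified here.

```python
class BlockCache:
    """Simplified model of the client-side block cache of FIGS. 4-6."""

    METADATA_FRACTION = 0.10                    # ~10% of the device holds metadata

    def __init__(self, capacity_bytes):
        self.metadata_capacity = int(capacity_bytes * self.METADATA_FRACTION)
        self.data_capacity = capacity_bytes - self.metadata_capacity
        self.table_440 = {}   # MD5 digest -> block cache data offset (FIG. 5)
        self.table_480 = {}   # (virtual disk name, virtual disk offset) -> MD5 (FIG. 6)
        self.data = bytearray()                 # models data region 420
        self.next_offset = 0                    # next free block cache data offset

    def append_block(self, block):
        """Place a unique block in the data region; return its cache offset."""
        if self.next_offset + len(block) > self.data_capacity:
            raise RuntimeError("data region full (eviction not modeled here)")
        offset = self.next_offset
        self.data.extend(block)
        self.next_offset += len(block)
        return offset

# A 1 TB cache reserves about 100 GB for metadata and the rest for data blocks.
cache = BlockCache(capacity_bytes=10**12)
```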
  • FIG. 7 is a flow diagram describing one embodiment by which a virtual machine writes data to the storage platform.
  • an application on a virtual machine is writing to a virtual disk within the platform that has the client-side cache 195 enabled.
  • the CVM is aware of which virtual disks have the cache enabled and which have not because it has stored the virtual disk information into its memory cache 181 . This flow may be performed in conjunction with actually sending the data to the storage platform, before sending such data, or after sending such data.
  • In step 504, the virtual machine (on behalf of its software application) that desires to write data into the storage platform sends a write request including the data to be written to a particular virtual disk.
  • the request may originate from a virtual machine on the same computer as the CVM, or from a virtual machine on a different computer.
  • a write request may originate with any of the applications on one of computer servers 50 - 52 and may use any of a variety of storage protocols.
  • the write request typically takes the form: write (offset, size, virtual disk name).
  • the parameter “virtual disk name” is the name of the virtual disk.
  • the parameter “offset” is an offset within the virtual disk (i.e., a value from 0 up to the size of the virtual disk), and the parameter “size” is the length of the data to be written in bytes.
  • the CVM will trap or capture this write request sent by the application (in the block protocol or NFS protocol, for example).
  • In step 508, the CVM calculates the MD5 of each block within the data to be written.
  • Blocks may be of any size, although typically the size is 4 k bytes.
  • In step 512, the CVM performs a lookup in metadata 410 of the block cache 195 to determine if each MD5 exists within table 440 in order to prevent duplicates from being stored. If an MD5 exists, this indicates that that exact block of data has already been written into the client-side cache 195 (for any virtual disk accessed by that CVM) and that it will not be necessary to write that block of data again into the cache.
  • If the MD5 does not exist, this indicates that the block of data does not exist within the block cache yet and that the data block should be written to the cache. It is possible that within the data requested to be written, some blocks already exist within the block cache and some do not. It is also possible that the MD5s for certain blocks will be the same (e.g., if all of these blocks are entirely filled with zeros). For each query of table 440 with an MD5, the result returned is whether or not the MD5 exists, and if it exists, the block cache data offset 448.
  • In step 516, the CVM writes those unique blocks to the data region 420 of the block cache and returns the block cache data offset where each block was written in data 420.
  • In step 520, the CVM updates table 440 with the MD5 of each block written to the block cache and its corresponding block cache data offset, so that the block can later be found in the block cache using its MD5.
  • In step 512, if, for any block of data, its MD5 already exists in table 440, this indicates that the block of data already exists in the block cache, and control moves to step 524.
  • In step 524, table 480 is updated for every block of data in the write request. This table will be updated to include the virtual disk offset of each block along with its corresponding MD5. Knowing the offset from the write request and the block size, it is a simple matter to calculate the virtual disk offset for each block of the write request. In this fashion, the MD5s for all blocks of the write request will be available in table 480 by using the virtual disk offset for each block as a key, which will be useful when reading data from the storage platform and using this client-side cache. In addition, by performing the check in step 512, duplicate blocks of data are not written to the cache.
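  • Putting steps 504 through 524 together, a hedged sketch of the write path follows, building on the simplified BlockCache model above. The 4 KB block size and per-block MD5 follow the text; the function name and the platform.write call that forwards data to storage platform 20 are assumptions.

```python
import hashlib

BLOCK_SIZE = 4096   # 4 KB blocks (step 508)

def handle_write(cache, platform, vdisk_name, vdisk_offset, data):
    """FIG. 7: forward the write to the platform and de-duplicate it into the cache."""
    platform.write(vdisk_name, vdisk_offset, data)           # data also goes to platform 20

    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.md5(block).hexdigest()               # step 508
        if digest not in cache.table_440:                     # step 512: unique block?
            offset = cache.append_block(block)                # step 516
            cache.table_440[digest] = offset                  # step 520
        # step 524: always record virtual-disk-offset -> MD5 for later reads
        cache.table_480[(vdisk_name, vdisk_offset + i)] = digest
```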
  • FIG. 8 is a flow diagram describing one embodiment by which a virtual machine reads data from the storage platform.
  • an application on a virtual machine is reading from a virtual disk within the platform that has the client-side cache 195 enabled.
  • In step 604, the virtual machine that desires to read data from the storage platform sends a read request from a particular application to the desired virtual disk.
  • the controller virtual machine will then trap or capture the request (depending upon whether it is a block request or an NFS request) and then typically places a request into its own protocol before sending the request to the storage platform.
  • a read request may originate with any of the virtual machines on computers 50 - 52 (for example) and may use any of a variety of storage protocols.
  • the read request typically takes the form: read (offset, size, virtual disk name).
  • the parameter “virtual disk name” is the name of a virtual disk on the storage platform.
  • the parameter “offset” is an offset within the virtual disk (i.e., a value from 0 up to the size of the virtual disk), and the parameter “size” is the length of the data to be read in bytes.
  • the CVM is aware of which virtual disks have the client-side cache enabled; if the cache is enabled, before sending the read request to the storage platform, the CVM will first check its block cache 195 to determine whether any of the blocks to be read are already present within this cache. Thus, in step 608, the CVM divides up the read request into blocks; e.g., a request of size 64 k is divided up into sixteen blocks of 4 k each, each block having a corresponding offset within the named virtual disk. In this way, an offset within the named virtual disk is calculated for each block of data.
  • Step 612 then checks metadata 410 to determine whether an entry exists in table 480 for each of the calculated offsets of the named virtual disk. If an entry exists, this means that the corresponding data block has been stored in the client-side cache and the MD5 488 corresponding to that entry is returned to the CVM. Thus, in step 616 the CVM consults table 440 using the returned MD5 in order to obtain the block cache data offset for that particular block within data 420 . Once obtained, the data block is simply read from the block cache at the block cache data offset, thus obviating the need to read a data block from the remote storage platform 20 .
  • If an entry does not exist for a calculated offset, then in step 620 a read request for that particular data block is sent to the storage platform, which then returns the data block.
  • the CVM may choose to read those data blocks from the remote storage platform one at a time, or may choose to send a single, combined read request.
  • Those data blocks that do exist within the client-side cache may also be read one by one, or the CVM may issue a single read request for all of those blocks at one time.
  • In step 624, after collecting both the data blocks read from the storage platform and the data blocks read from the block cache, the CVM then returns this data corresponding to the original read request to the requesting virtual machine using the appropriate protocol, again masquerading either as a block device or as an NFS device depending upon the protocol used by the particular application.
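  • The corresponding read path of steps 604 through 624, again as a hedged sketch over the same simplified model; the platform.read call stands in for the CVM's request to storage platform 20, which in practice may batch the missing blocks into a single read request.

```python
BLOCK_SIZE = 4096   # 4 KB blocks, as in the write path

def handle_read(cache, platform, vdisk_name, vdisk_offset, size):
    """FIG. 8: serve each 4 KB block from the cache when possible, else from the platform."""
    result = bytearray()
    for i in range(0, size, BLOCK_SIZE):                      # step 608: split into blocks
        block_vdisk_offset = vdisk_offset + i
        digest = cache.table_480.get((vdisk_name, block_vdisk_offset))   # step 612
        if digest is not None:
            cache_offset = cache.table_440[digest]            # step 616: MD5 -> cache offset
            block = bytes(cache.data[cache_offset:cache_offset + BLOCK_SIZE])
        else:
            block = platform.read(vdisk_name, block_vdisk_offset, BLOCK_SIZE)   # step 620
        result.extend(block)
    return bytes(result)                                      # step 624: return to the VM
```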
  • FIGS. 9 and 10 illustrate a computer system 900 suitable for implementing embodiments of the present invention.
  • FIG. 9 shows one possible physical form of the computer system.
  • the computer system may have many physical forms including an integrated circuit, a printed circuit board, a small handheld device (such as a mobile telephone or PDA), a personal computer or a supercomputer.
  • Computer system 900 includes a monitor 902 , a display 904 , a housing 906 , a disk drive 908 , a keyboard 910 and a mouse 912 .
  • Disk 914 is a computer-readable medium used to transfer data to and from computer system 900 .
  • FIG. 10 is an example of a block diagram for computer system 900 . Attached to system bus 920 are a wide variety of subsystems.
  • Processor(s) 922 (also referred to as central processing units, or CPUs) are coupled to storage devices, including memory 924.
  • Memory 924 includes random access memory (RAM) and read-only memory (ROM).
  • a fixed disk 926 is also coupled bi-directionally to CPU 922 ; it provides additional data storage capacity and may also include any of the computer-readable media described below.
  • Fixed disk 926 may be used to store programs, data and the like and is typically a secondary mass storage medium (such as a hard disk, a solid-state drive, a hybrid drive, flash memory, etc.) that can be slower than primary storage but persists data. It will be appreciated that the information retained within fixed disk 926 , may, in appropriate cases, be incorporated in standard fashion as virtual memory in memory 924 .
  • Removable disk 914 may take the form of any of the computer-readable media described below.
  • CPU 922 is also coupled to a variety of input/output devices such as display 904 , keyboard 910 , mouse 912 and speakers 930 .
  • an input/output device may be any of: video displays, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, biometrics readers, or other computers.
  • CPU 922 optionally may be coupled to another computer or telecommunications network using network interface 940 . With such a network interface, it is contemplated that the CPU might receive information from the network, or might output information to the network in the course of performing the above-described method steps.
  • method embodiments of the present invention may execute solely upon CPU 922 or may execute over a network such as the Internet in conjunction with a remote CPU that shares a portion of the processing.
  • embodiments of the present invention further relate to computer storage products with a computer-readable medium that have computer code thereon for performing various computer-implemented operations.
  • the media and computer code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts.
  • Examples of computer-readable media include, but are not limited to: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and holographic devices; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store and execute program code, such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs) and ROM and RAM devices.
  • Examples of computer code include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter.

Abstract

A computer receives a write request including an offset within a virtual disk. The computer writes the data block to a remote platform and calculates a hash value of the data. If the hash value does not exist in a first table of a block cache of the computer, the computer adds a pair to the first table: hash value/block cache data offset. Next, the computer adds a pair in a second table of the block cache: virtual disk offset of the data/hash value. A read request uses these tables to find the data in the cache without accessing the storage platform. The read consults the second table to find the hash value corresponding to the virtual disk offset of the block. The hash value is used as a key into the first table to find the block cache data offset of the data; the data is read from the block cache at that offset.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a Continuation of U.S. patent application Ser. No. 15/156,015 filed on May 16, 2016, which is hereby incorporated by reference herein. This application is related to U.S. patent application Ser. Nos. 14/322,813, 14/322,832, 14/684,086, 14/322,850, 14/322,855, 14/322,867, 14/322,868, 14/322,871, and 14/723,380, which are all hereby incorporated by reference. This application is related to U.S. patent application Ser. No. 15/155,838 (Attorney Docket No. COMMV.495A, formerly HEDVP012), filed on May 16, 2016, which is also hereby incorporated by reference.
  • FIELD OF THE INVENTION
  • The present invention relates generally to local caching of data to be stored on a virtual disk within a data center. More specifically, the present invention relates to de-duplication of data stored in the local cache.
  • BACKGROUND OF THE INVENTION
  • In the field of data storage, enterprises have used a variety of techniques in order to store the data that their software applications use. At one point in time, each individual computer server within an enterprise running a particular software application (such as a database or e-mail application) would store data from that application in any number of attached local disks. Although this technique was relatively straightforward, it led to storage manageability problems in that the data was stored in many different places throughout the enterprise.
  • These problems led to the introduction of the storage area network in which each computer server within an enterprise communicated with a central storage computer node that included all of the storage disks. The application data that used to be stored locally at each computer server was now stored centrally on the central storage node via a fiber channel switch, for example. Although such a storage area network was easier to manage, changes in computer server architecture created new problems.
  • With the advent of virtualization, each computer server can now host dozens of software applications through the use of a hypervisor on each computer server and the use of virtual machines. Thus, computer servers which had been underutilized could now host many different server applications, each application needing to store its data within the storage area network. Weaknesses in the storage area network were revealed by the sheer number of server applications needing to access disks within the central storage node. And, even with the use of remote storage platforms (such as “in-the-cloud” storage), problems still exist.
  • For example, the sheer amount of data that applications desire to store in a remote storage platform can overwhelm a local virtual machine if it attempts to cache data to be stored remotely in the storage platform, can raise costs, and can lead to inefficiency. Attempts to remove duplicates of locally-cached data have been tried but are not optimal. Accordingly, further techniques and systems are desired to remove duplicates of data cached at a local computer.
  • SUMMARY OF THE INVENTION
  • To achieve the foregoing, and in accordance with the purpose of the present invention, techniques are disclosed that provide the advantages discussed below.
  • Use of a global client-side cache within a computer server of a compute farm allows any client application, software application or virtual machine executing on that computer to make use of this client-side cache. De-duplication of blocks of data within this client-side cache then occurs globally and automatically for all applications executing upon that computer or upon others, regardless of which is the client and regardless of which virtual disk is being accessed within the storage platform. Additionally, each application may decide whether or not to enable client-side caching for each of its virtual disks.
  • In addition, the storage resources overhead associated with de-duplication metadata is minimal (<2%) compared to other prior art techniques, and the present invention keeps metadata distributed as well, which means node or disk failures do not lead to a reduction in de-duplication ratios. And, the computing resources overhead is negligible as well: the present invention does not need any specific hardware for de-duplication, and can be run on any commodity hardware. Moreover, the present invention performs global de-duplication, not at the volume or disk level, which means higher de-duplication ratios across the entire storage platform. Finally, the present invention performs in-line de-duplication, which means the invention only writes unique data to the storage platform. Prior art offline or asynchronous de-duplication performs de-duplication in the background, and hence does not provide any real-time guarantees as to reduction in storage. Thus, in-line de-duplication also increases the capacity and life of raw disks.
  • In a first embodiment, a method writes a block of data to a virtual disk on a remote storage platform. First, a computer server receives a write request to write the block of data from the computer server to the remote storage platform, the write request includes an offset within the virtual disk and the data. The server writes the block of data to a storage node of the storage platform. After this write, or even prior, the computer server calculates a hash value of the block of data using a hash function or similar function to produce a unique identifier for the block. The computer determines whether the resulting hash value exists in a first metadata table of a block cache of the computer server. If so, the computer adds an entry in a second metadata table of the block cache that includes the virtual disk offset and the hash value as a key/value pair. A later read request uses these tables to find the block of data in the cache without the need to go to the storage platform.
  • In a second embodiment, a method writes a block of data to a virtual disk on a remote storage platform. First, a computer server receives a write request to write the block of data from the computer server to the remote storage platform, the write request includes an offset within the virtual disk and the data. The server writes the block of data to a storage node of the storage platform. After this write, or even prior, the computer server calculates a hash value of the block of data using a hash function or similar function to produce a unique identifier for the block. The computer determines whether the resulting hash value exists in a first metadata table of a block cache of the computer server. If not, the computer writes the block of data into the block cache at a block cache data offset and stores the hash value and the block cache data offset as a key/value pair in the first metadata table. Next the computer adds an entry in a second metadata table of the block cache that includes the virtual disk offset and the hash value as a key/value pair. A later read request uses these tables to find the block of data in the cache without the need to go to the storage platform.
  • In a third embodiment, a method reads a block of data from a virtual disk on a remote storage platform. First, a computer server receives a read request to read the block of data from the remote storage platform, the read request includes an offset within the virtual disk. Next, the computer server determines whether the virtual disk offset exists as an entry in a first metadata table of a block cache of the computer server. If so, the computer retrieves a unique identifier corresponding to the virtual disk offset in the entry, and then accesses a second metadata table of the block cache and retrieves a block cache data offset using the unique identifier as a key. Finally, the computer reads the block of data from the block cache at the block cache data offset. Thus, it is not necessary to access the remote storage platform to read the block of data.
  • In a fourth embodiment, a method reads a block of data from a virtual disk on a remote storage platform. First a computer server receives a read request to read the block of data from the remote storage platform, the read request includes an offset within the virtual disk. Next, the computer server determines whether the virtual disk offset exists as an entry in a first metadata table of a block cache of the computer server. If not, the computer reads the block of data from a remote storage platform. The block is then returned to the requesting application.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The invention, together with further advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings in which:
  • FIG. 1 illustrates a data storage system having a storage platform according to one embodiment of the invention.
  • FIG. 2 is a symbolic representation of a virtual disk showing how data within the virtual disk is stored within the storage platform.
  • FIG. 3 illustrates in greater detail the computer servers in communication with the storage platform.
  • FIG. 4 illustrates one example of a block cache.
  • FIG. 5 illustrates a metadata table present within metadata used to store identifiers for blocks of data that have been stored within the block cache.
  • FIG. 6 illustrates another metadata table present within metadata used to store MD5s corresponding to virtual disk offsets.
  • FIG. 7 is a flow diagram describing one embodiment by which a virtual machine writes data to the storage platform.
  • FIG. 8 is a flow diagram describing one embodiment by which a virtual machine reads data from the storage platform.
  • FIGS. 9 and 10 illustrate a computer system suitable for implementing embodiments of the present invention.
  • DETAILED DESCRIPTION OF THE INVENTION Storage System
  • FIG. 1 illustrates a data storage system 10 according to one embodiment of the invention having a storage platform 20. Included within the storage platform 20 are any number of computer nodes 30-40. Each computer node of the storage platform has a unique identifier (e.g., “A”) that uniquely identifies that computer node within the storage platform. Each computer node is a computer having any number of hard drives and solid-state drives (e.g., flash drives), and in one embodiment includes about twenty disks of about 1 TB each. A typical storage platform may include on the order of about 81 TB and may include any number of computer nodes. One advantage is that a platform may start with as few as three nodes and then grow incrementally to as large as 1,000 nodes or more.
  • Computer nodes 30-40 are shown logically being grouped together, although they may be spread across data centers and may be in different geographic locations. A management console 40 used for provisioning virtual disks within the storage platform communicates with the platform over a link 44. Any number of remotely located computer servers 50-52 each typically executes a hypervisor in order to host any number of virtual machines. Server computers 50-52 form what is typically referred to as a compute farm.
  • As shown, these virtual machines may be implementing any of a variety of applications such as a database server, an e-mail server, etc., including applications from companies such as Oracle, Microsoft, etc. These applications write to and read data from the storage platform using a suitable storage protocol such as iSCSI or NFS, although each application will not be aware that data is being transferred over link 54 using a different protocol.
  • Management console 40 is any suitable computer able to communicate over an Internet connection or link 44 with storage platform 20. When an administrator wishes to manage the storage platform (e.g., provisioning a virtual disk, snapshots, revert, clone, analyze metrics, determine health of cluster, etc.) he or she uses the management console to access the storage platform and is put in communication with a management console routine executing as part of a software module on any one of the computer nodes within the platform. The management console routine is typically a Web server application.
  • In order to provision a new virtual disk within storage platform 20 for a particular application running on a virtual machine, the virtual disk is first created and then attached to a particular virtual machine. In order to create a virtual disk, a user uses the management console to first select the size of the virtual disk (e.g., 100 GB), and then selects the individual policies that will apply to that virtual disk. For example, the user selects a replication factor, a data center aware policy and other policies concerning whether or not to compress the data, the type of disk storage, etc. Once the virtual disk has been created, it is then attached to a particular virtual machine within one of the computer servers 50-52 and the provisioning process is complete.
  • Advantageously, storage platform 20 is able to simulate prior art central storage nodes (such as the VMax and Clarion products from EMC, VMWare products, etc.) and the virtual machines and software applications will be unaware that they are communicating with storage platform 20 instead of a prior art central storage node. In addition, the provisioning process can be completed on the order of minutes or less, rather than in four to eight weeks as was typical with prior art techniques. The advantage is that one only needs to add metadata concerning a new virtual disk in order to provision the disk and have the disk ready to perform writes and reads.
  • Provision Virtual Disk
  • Typically, an administrator is aware that a particular software application desires a virtual disk within the platform and is aware of the characteristics that the virtual disk should have. The administrator first uses the management console to access the platform and connect with the management console Web server on any one of the computer nodes within the platform. The administrator chooses the characteristics of the new virtual disk such as a name; a size; a replication factor; a residence; compressed; a replication policy; cache enabled (a quality-of-service choice); and a disk type (indicating whether the virtual disk is of a block type—the iSCSI protocol—or of a file type—the NFS protocol).
  • As mentioned above, one of the characteristics for the virtual disk that may be chosen is whether or not the client-side cache of the local computer should be enabled for that virtual disk. Applications that do not read or write frequently may not desire the cache to be enabled (as writing to the cache can add overhead), while applications that read and write frequently may desire the cache to be enabled. Cache enablement, thus, is an optional feature that may be turned on or off for each virtual disk.
  • Once chosen, these characteristics are stored as “virtual disk information” metadata onto a computer node within the storage platform and may be replicated. In this fashion, the virtual disk metadata has been stored upon metadata nodes within the platform (which might be different from the nodes where the actual data of the virtual disk will be stored). In addition, the identities of the storage nodes which store this metadata for the virtual disk are also sent to the controller virtual machine for placing into a cache.
  • The virtual disk that has been created is also attached to a virtual machine of the compute farm. In this step, the administrator is aware of which virtual machine on which computer of the compute farm needs the virtual disk. Thus, information regarding the newly created virtual disk (i.e., name, space available, virtual disk information, etc.) is sent from the management console routine to the appropriate computer within the compute farm. The information is provided to a controller virtual machine which stores the information in a cache, ready for use when the virtual machine needs to write or to read data. The administrator also supplies the name of the virtual disk to the application that will use it.
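  • The "virtual disk information" that the management console stores and the CVM caches can be pictured as a small record of the characteristics listed above. The sketch below uses a Python dataclass with hypothetical field names; the actual metadata format is not described in this document.

```python
from dataclasses import dataclass

@dataclass
class VirtualDiskInfo:
    """Characteristics chosen at provisioning time and cached by the CVM."""
    name: str
    size_gb: int
    replication_factor: int
    residence: str            # e.g. which class of storage the data should live on
    compressed: bool
    replication_policy: str   # e.g. a data-center-aware placement policy
    cache_enabled: bool       # quality-of-service choice: client-side cache on or off
    disk_type: str            # "block" (iSCSI) or "file" (NFS)

# Example: a 100 GB block-type disk, replicated three ways, with the cache enabled.
vdisk = VirtualDiskInfo(
    name="appdb-disk",
    size_gb=100,
    replication_factor=3,
    residence="HDD",
    compressed=False,
    replication_policy="data-center-aware",
    cache_enabled=True,
    disk_type="block",
)
```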
  • FIG. 2 is a symbolic representation of a virtual disk 330 showing how data within the virtual disk is stored within the storage platform. As shown, the virtual disk has been provisioned as a disk holding up to 50 GB, and the disk has been logically divided into segments or portions of 16 GB each. Each of these portions is termed a “container,” and may range in size from about 4 GB up to about 32 GB, although a size of 16 GB works well. As shown, portions 332-338 are referred to as containers C1, C2, C3 and C4.
  • Similar to a traditional hard disk, as data is written to the virtual disk at a particular offset 340 (ranging from 0 up to the size of the virtual disk), the virtual disk fills up symbolically from left to right. Each container of data will be stored upon a particular node or nodes within the storage platform that are chosen during the write process. In the example of FIG. 2, the replication factor is three; thus, data stored within container 332 will be stored upon the three nodes A, B and F, data stored within the second container 334 will be stored upon the three nodes B, D and E, etc. Note that this storage technique using containers is one of many possible implementations of the storage platform and is transparent to the virtual machines that are storing data.
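  • As a rough illustration of the container scheme described above, the following sketch maps a virtual disk offset to a container index under the assumption of fixed-size containers; the node assignments are simply the example values of FIG. 2, not the output of any real placement algorithm.

```python
CONTAINER_SIZE = 16 * 2**30  # 16 GB per container, as in the example of FIG. 2

# Example placement with replication factor three, taken from FIG. 2; in practice
# the nodes are chosen by the storage platform during the write process.
container_nodes = {0: ("A", "B", "F"), 1: ("B", "D", "E")}

def container_index(virtual_disk_offset: int) -> int:
    """Map a byte offset within the virtual disk to its container number."""
    return virtual_disk_offset // CONTAINER_SIZE

offset = 20 * 2**30                    # a write 20 GB into the virtual disk
idx = container_index(offset)          # container index 1, i.e., container C2
print(idx, container_nodes.get(idx))   # prints: 1 ('B', 'D', 'E')
```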
  • Controller Virtual Machine
  • FIG. 3 illustrates in greater detail one of the computer servers 51 in communication with storage platform 20. As mentioned above, each computer server may host any number of virtual machines, each executing a particular software application. The application may perform I/O handling using a block-based protocol such as iSCSI or a file-based protocol such as NFS, and the virtual machine communicates using whichever protocol the application has chosen. Of course, other suitable protocols may also be used by an application. The actual communication protocol used between the server and the platform is transparent to these virtual machines.
  • As shown, server 51 includes a hypervisor and virtual machines 182 and 186 that desire to perform I/O handling using the iSCSI protocol 187 or the NFS protocol 183. Server 51 also includes a specialized controller virtual machine (CVM) 180 that is adapted to handle communications with the virtual machines using either protocol (and others), yet communicates with the storage platform using a proprietary protocol 189. Protocol 189 may be any suitable protocol for passing data between storage platform 20 and a remote computer server 51 such as TCP. In addition, the CVM may also communicate with public cloud storage using the same or different protocol 191. Advantageously, the CVM need not communicate any “liveness” information between itself and the computer nodes of the platform. There is no need for any CVM to track the status of nodes in the cluster. The CVM need only talk to a node in the platform, which is then able to route requests to other nodes and public storage nodes.
  • The CVM also uses a memory cache 181 on the computer server 51. In communication with computer server 51 and with CVM 180 are also any number of solid-state disks (or other similar persistent storage) 195 that will be explained in greater detail below. These disks may be used as a data cache to store data blocks that are written into storage platform 20 and then to rapidly retrieve these data blocks instead of retrieving them from the remote storage platform.
  • CVM 180 handles different protocols by simulating an entity that the protocol would expect. For example, when communicating under the iSCSI block protocol, the CVM responds to an iSCSI Initiator by behaving as an iSCSI Target. In other words, when virtual machine 186 performs I/O handling, it is the iSCSI Initiator and the controller virtual machine is the iSCSI Target. When an application is using the block protocol, the CVM masquerades as the iSCSI Target, traps the iSCSI CDBs, translates this information into its own protocol, and then communicates this information to the storage platform. Thus, when the CVM presents itself as an iSCSI Target, the application may simply talk to a block device as it would do normally.
  • Similarly, when communicating with an NFS client, the CVM behaves as an NFS server. When virtual machine 182 performs I/O handling, the controller virtual machine is the NFS server, and the NFS client (acting on behalf of virtual machine 182) executes either in the hypervisor of computer server 51 or in the operating system kernel of virtual machine 182. Thus, when an application is using the NFS protocol, the CVM masquerades as an NFS server, captures NFS packets, and then communicates this information to the storage platform using its own protocol.
  • An application is unaware that the CVM is trapping and intercepting its calls under the iSCSI or NFS protocol, or that the CVM even exists. One advantage is that an application need not be changed in order to write to and read from the storage platform. Use of the CVM allows an application executing upon a virtual machine to continue using the protocol it expects, yet allows these applications on the various computer servers to write data to and read data from the same storage platform 20.
  • Replicas of a virtual disk may be stored within public cloud storage 190. As known in the art, public cloud storage refers to those data centers operated by enterprises that allow the public to store data for a fee. Included within these data centers are those known as Amazon Web Services and Google Compute. During a write request, the write request will include an identifier for each computer node to which a replica should be written; for example, nodes may be identified by their IP addresses. Thus, the computer node within the platform that first fields the write request from the CVM will route the data to be written to the nodes identified by those IP addresses. Any replica that should be sent to the public cloud can simply be addressed to the DNS name of a particular node, and the request (and data) is then routed to the appropriate public storage cloud. Any suitable computer router within the storage platform may handle this operation.
  • Client-Side Cache
  • As mentioned above, a client machine, such as computer 51, uses a data cache 195 in order to store blocks of data that it has written to storage platform 20 so that those blocks may be retrieved more quickly when a read is performed. The present invention provides an apparatus and technique for efficiently caching data on the client side so that a read operation from a software application may not need to access the remote storage platform 20. One advantage of the present invention is that very large data caches are supported and that blocks of data are stored efficiently; because the invention de-duplicates data in the cache as well, the effective cache capacity is increased by the factor of the de-duplication ratio.
  • FIG. 4 illustrates one example of a block cache 195. Preferably, the block cache is implemented using persistent storage such as any number of hard disks, and most preferably solid-state disks are used. There may be one or more solid-state disks in the block cache. Given a particular size of the block cache (such as 1 TB), FIG. 4 indicates that approximately 10% of the block cache is used for metadata storage 410 and that the remaining portion 420 is used for data storage. A block cache data offset 430 is used to indicate a particular location of a particular block of data within the block cache. The block cache can be one disk or many disks; preferably, the invention takes only a single disk as input, but users may combine multiple disks into one logical disk using suitable software such as a Logical Volume Manager (LVM) tool.
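  • The split between the metadata region and the data region might be computed as in the sketch below; the 10% figure is the approximate proportion given above, and the variable names are illustrative only.

```python
CACHE_SIZE = 1 * 2**40      # e.g., a 1 TB solid-state block cache
METADATA_FRACTION = 0.10    # roughly 10% of the cache holds metadata 410
BLOCK_SIZE = 4096           # typical block size used elsewhere in this description

metadata_bytes = int(CACHE_SIZE * METADATA_FRACTION)  # metadata region 410
data_bytes = CACHE_SIZE - metadata_bytes              # data region 420

# A "block cache data offset" 430 is simply a byte position within region 420.
max_cached_blocks = data_bytes // BLOCK_SIZE
print(metadata_bytes, data_bytes, max_cached_blocks)
```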
  • FIG. 5 illustrates a metadata table 440 present within metadata 410 used to store identifiers for blocks of data that have been stored within the block cache. Metadata is stored in pairs, where column 444 indicates the MD5 (or other message digest or unique hash value from a hash function) of a particular block of data, and where column 448 indicates the offset within data 420 where that block of data has been stored.
  • FIG. 6 illustrates a metadata table 480 present within metadata 410 used to store MD5s corresponding to virtual disk offsets. This metadata is stored in pairs, where column 484 indicates a particular offset of a block of data within a particular named virtual disk, and where column 488 indicates the MD5 for the corresponding block of data.
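  • Conceptually, tables 440 and 480 are two key-value mappings. The following is a minimal in-memory sketch (a real implementation would keep these structures in the persistent metadata region 410, and the names here are hypothetical):

```python
import hashlib

BLOCK_SIZE = 4096

# Table 440: MD5 of a block -> block cache data offset where that block is stored.
md5_to_cache_offset: dict[str, int] = {}

# Table 480: (virtual disk name, virtual disk offset) -> MD5 of the block stored there.
vdisk_offset_to_md5: dict[tuple[str, int], str] = {}

def block_md5(block: bytes) -> str:
    """MD5 (or any other suitable message digest) of a block of data."""
    return hashlib.md5(block).hexdigest()

# Example: record that the block at offset 8192 of "vdisk-a" is cached at offset 0
# within data region 420.
block = bytes(BLOCK_SIZE)
digest = block_md5(block)
md5_to_cache_offset[digest] = 0
vdisk_offset_to_md5[("vdisk-a", 8192)] = digest
```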
  • Write Using Client-Side Cache
  • FIG. 7 is a flow diagram describing one embodiment by which a virtual machine writes data to the storage platform. In this embodiment, an application on a virtual machine is writing to a virtual disk within the platform that has the client-side cache 195 enabled. The CVM is aware of which virtual disks have the cache enabled and which have not because it has stored the virtual disk information into its memory cache 181. This flow may be performed in conjunction with actually sending the data to the storage platform, before sending such data, or after sending such data.
  • In step 504 the virtual machine (on behalf of its software application) that desires to write data into the storage platform sends a write request including the data to be written to a particular virtual disk. The request may originate from a virtual machine on the same computer as the CVM, or from a virtual machine on a different computer. As mentioned, a write request may originate with any of the applications on one of computer servers 50-52 and may use any of a variety of storage protocols. The write request typically takes the form: write (offset, size, virtual disk name). The parameter “virtual disk name” is the name of the virtual disk. The parameter “offset” is an offset within the virtual disk (i.e., a value from 0 up to the size of the virtual disk), and the parameter “size” is the length of the data to be written in bytes. As mentioned above, the CVM will trap or capture this write request sent by the application (in the block protocol or NFS protocol, for example).
  • Next, in step 508 the CVM calculates the MD5 of each block within the data to be written. Blocks may be of any size, although typically the size is 4 k bytes. After all of the message digests have been calculated (or perhaps after each one is calculated), in step 512 the CVM performs a lookup in metadata 410 of the block cache 195 to determine whether each MD5 exists within table 440, in order to prevent duplicates from being stored. If an MD5 exists, this indicates that the exact block of data has already been written into the client-side cache 195 (for any virtual disk accessed by that CVM) and that it will not be necessary to write that block of data again into the cache. If the MD5 does not exist, this indicates that the block of data does not yet exist within the block cache and that the data block should be written to the cache. It is possible that, within the data requested to be written, some blocks already exist within the block cache and some do not. It is also possible that the MD5s for certain blocks will be the same (e.g., if all of these blocks are entirely filled with zeros). For each query of table 440 with an MD5, the result returned is whether or not the MD5 exists, and if it exists, the block cache data offset 448.
  • For those blocks of data that do not already exist within the block cache, step 516 will write those unique blocks to the data region 420 of the block cache and return the block cache data offset where each block was written in data 420.
  • Next, in step 520, the metadata for those unique blocks written in step 516 is updated: the CVM updates table 440 with the MD5 of each block written to the block cache and its corresponding block cache data offset, so that the block can later be found in the block cache using its MD5.
  • In step 512 if, for any block of data, its MD5 does already exist in table 440, this indicates that the block of data does exist in the block cache, and control moves to step 524. In step 524, table 480 is updated for every block of data in the write request. This table will be updated to include the virtual disk offset of each block along with its corresponding MD5. Knowing the offset from the write request and the block size, it is a simple matter to calculate the virtual disk offset for each block of the write request. In this fashion, the MD5s for all blocks of the write request will be available in table 480 by using the virtual disk offset for each block as a key, which will be useful when reading data from the storage platform and using this client-side cache. In addition, by performing the check in step 512, duplicate blocks of data are not written to the cache.
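  • Putting steps 508-524 together, the de-duplicating write path of FIG. 7 might look like the following sketch. In-memory dictionaries stand in for metadata tables 440 and 480 and a bytearray stands in for data region 420; this is an illustration under those assumptions, not the actual implementation.

```python
import hashlib

BLOCK_SIZE = 4096
md5_to_cache_offset: dict[str, int] = {}               # table 440
vdisk_offset_to_md5: dict[tuple[str, int], str] = {}   # table 480
cache_data = bytearray()                               # stands in for data region 420

def cache_write(vdisk: str, offset: int, data: bytes) -> None:
    """Hash each block, store only unique blocks, and update both metadata tables."""
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.md5(block).hexdigest()          # step 508
        if digest not in md5_to_cache_offset:            # step 512: not yet cached
            cache_offset = len(cache_data)
            cache_data.extend(block)                     # step 516: write unique block
            md5_to_cache_offset[digest] = cache_offset   # step 520: update table 440
        # Step 524: record the virtual-disk-offset -> MD5 mapping for every block,
        # whether or not the block itself was newly written to the cache.
        vdisk_offset_to_md5[(vdisk, offset + i)] = digest

# Example: two identical 4 KB blocks are written, but only one copy is cached.
cache_write("vdisk-a", 0, b"\x01" * BLOCK_SIZE * 2)
print(len(md5_to_cache_offset), len(vdisk_offset_to_md5), len(cache_data))  # 1 2 4096
```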
  • Read Using Client-Side Cache
  • FIG. 8 is a flow diagram describing one embodiment by which a virtual machine reads data from the storage platform. In this embodiment, an application on a virtual machine is reading from a virtual disk within the platform that has the client-side cache 195 enabled.
  • In step 604 the virtual machine that desires to read data from the storage platform sends a read request from a particular application to the desired virtual disk. As explained above, the controller virtual machine will then trap or capture the request (depending upon whether it is a block request or an NFS request) and then typically places a request into its own protocol before sending the request to the storage platform.
  • As mentioned, a read request may originate with any of the virtual machines on computers 50-52 (for example) and may use any of a variety of storage protocols. The read request typically takes the form: read (offset, size, virtual disk name). The parameter “virtual disk name” is the name of a virtual disk on the storage platform. The parameter “offset” is an offset within the virtual disk (i.e., a value from 0 up to the size of the virtual disk), and the parameter “size” is the length of the data to be read in bytes.
  • The CVM is aware of which virtual disks have the client-side cache enabled, and, if so, before sending the read request to the storage platform, the CVM will first check its block cache 195 to determine whether any of the blocks to be read are already present within this cache. Thus, in step 608, the CVM divides the read request into blocks; e.g., a request of size 64 k is divided into sixteen blocks of 4 k each, each block having a corresponding offset within the named virtual disk. An offset within the named virtual disk is thereby calculated for each block of data.
  • Step 612 then checks metadata 410 to determine whether an entry exists in table 480 for each of the calculated offsets of the named virtual disk. If an entry exists, this means that the corresponding data block has been stored in the client-side cache and the MD5 488 corresponding to that entry is returned to the CVM. Thus, in step 616 the CVM consults table 440 using the returned MD5 in order to obtain the block cache data offset for that particular block within data 420. Once obtained, the data block is simply read from the block cache at the block cache data offset, thus obviating the need to read a data block from the remote storage platform 20.
  • If an entry does not exist in table 480 for any of the calculated offsets for the named virtual disk, this means that the corresponding data block has not been previously stored in the client-side cache and that the data block must be read from the remote storage platform. Accordingly, in step 620 a read request for that particular data block is sent to the storage platform which then returns the data block.
  • It is possible that within a given read request there may be some data blocks that have been stored in the client-side cache and some that have not. Thus, for those data blocks that must be read from the storage platform, the CVM may choose to read those data blocks from the remote storage platform one at a time, or may choose to send a single, combined read request. Those data blocks that do exist within the client-side cache may also be read one by one, or the CVM may issue a single read request for all of those blocks at one time.
  • In step 624, after collecting both the data blocks read from the storage platform and the data blocks read from the block cache, the CVM then returns this data corresponding to the original read request to the requesting virtual machine using the appropriate protocol, again masquerading either as a block device or as an NFS device depending upon the protocol used by the particular application.
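  • The corresponding read path of FIG. 8 might be sketched as follows. The structures from the write sketch above are repeated so that this example stands on its own, and read_from_platform is merely a placeholder for the actual read request sent to storage platform 20; for simplicity the sketch reads one block at a time, which the description above notes is only one of the possible options.

```python
BLOCK_SIZE = 4096
md5_to_cache_offset: dict[str, int] = {}               # table 440
vdisk_offset_to_md5: dict[tuple[str, int], str] = {}   # table 480
cache_data = bytearray()                               # stands in for data region 420

def read_from_platform(vdisk: str, offset: int, size: int) -> bytes:
    """Placeholder for step 620: a read request sent to the remote storage platform."""
    raise NotImplementedError("network read from storage platform 20")

def cache_read(vdisk: str, offset: int, size: int) -> bytes:
    """Serve each requested block from the client-side cache when possible."""
    result = bytearray()
    for block_offset in range(offset, offset + size, BLOCK_SIZE):          # step 608
        digest = vdisk_offset_to_md5.get((vdisk, block_offset))            # step 612
        if digest is not None:                                             # cache hit
            cache_offset = md5_to_cache_offset[digest]                     # step 616
            result += cache_data[cache_offset:cache_offset + BLOCK_SIZE]
        else:                                                              # cache miss
            result += read_from_platform(vdisk, block_offset, BLOCK_SIZE)  # step 620
    return bytes(result)  # step 624: assembled data returned to the requesting VM
```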
  • Computer System Embodiment
  • FIGS. 9 and 10 illustrate a computer system 900 suitable for implementing embodiments of the present invention. FIG. 9 shows one possible physical form of the computer system. Of course, the computer system may have many physical forms including an integrated circuit, a printed circuit board, a small handheld device (such as a mobile telephone or PDA), a personal computer or a supercomputer. Computer system 900 includes a monitor 902, a display 904, a housing 906, a disk drive 908, a keyboard 910 and a mouse 912. Disk 914 is a computer-readable medium used to transfer data to and from computer system 900.
  • FIG. 10 is an example of a block diagram for computer system 900. Attached to system bus 920 are a wide variety of subsystems. Processor(s) 922 (also referred to as central processing units, or CPUs) are coupled to storage devices including memory 924. Memory 924 includes random access memory (RAM) and read-only memory (ROM). As is well known in the art, ROM acts to transfer data and instructions uni-directionally to the CPU and RAM is used typically to transfer data and instructions in a bi-directional manner. Both of these types of memories may include any of the computer-readable media described below. A fixed disk 926 is also coupled bi-directionally to CPU 922; it provides additional data storage capacity and may also include any of the computer-readable media described below. Fixed disk 926 may be used to store programs, data and the like and is typically a secondary mass storage medium (such as a hard disk, a solid-state drive, a hybrid drive, flash memory, etc.) that can be slower than primary storage but persists data. It will be appreciated that the information retained within fixed disk 926 may, in appropriate cases, be incorporated in standard fashion as virtual memory in memory 924. Removable disk 914 may take the form of any of the computer-readable media described below.
  • CPU 922 is also coupled to a variety of input/output devices such as display 904, keyboard 910, mouse 912 and speakers 930. In general, an input/output device may be any of: video displays, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, biometrics readers, or other computers. CPU 922 optionally may be coupled to another computer or telecommunications network using network interface 940. With such a network interface, it is contemplated that the CPU might receive information from the network, or might output information to the network in the course of performing the above-described method steps. Furthermore, method embodiments of the present invention may execute solely upon CPU 922 or may execute over a network such as the Internet in conjunction with a remote CPU that shares a portion of the processing.
  • In addition, embodiments of the present invention further relate to computer storage products with a computer-readable medium that have computer code thereon for performing various computer-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts. Examples of computer-readable media include, but are not limited to: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and holographic devices; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store and execute program code, such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs) and ROM and RAM devices. Examples of computer code include machine code, such as produced by a compiler, and files containing higher-level code that are executed by a computer using an interpreter.
  • Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Therefore, the described embodiments should be taken as illustrative and not restrictive, and the invention should not be limited to the details given herein but should be defined by the following claims and their full scope of equivalents.

Claims (20)

We claim:
1. A method of using deduplicated client-side caching for a storage platform, the method comprising:
by a first computer server, intercepting a write request to write a first block of data to a first virtual disk configured on the storage platform,
wherein the write request is issued by an application executing at one of: the first computer server and one of a plurality of other computer servers that use the storage platform;
by the first computer server, based on determining that the first virtual disk is administered with client-side caching enabled:
calculating a hash value for the first block of data,
querying first metadata that tracks contents of a client-side cache maintained by the first computer server,
if the first metadata comprises the hash value, refraining by the first computer server from storing the first block of data to the client-side cache, and
if the first metadata does not comprise the hash value, (a) storing the first block of data in the client-side cache at a second offset, and (b) updating the first metadata to indicate that a block of data having the hash value is stored in the client-side cache at the second offset; and
wherein the client-side cache is not used for write requests to a virtual disk administered without client-side caching, wherein the client-side cache is configured in persistent storage of the first computer server, and wherein the persistent storage is distinct from data storage resources configured in the storage platform.
2. The method of claim 1, wherein the client-side cache provides global cache deduplication for all virtual disks in the storage platform that are configured with client-side caching enabled.
3. The method of claim 1 further comprising: if the first metadata comprises the hash value, updating second metadata that tracks virtual disk offsets within the storage platform with a key-value pair that comprises: a first offset for the first block of data within the first virtual disk and the hash value of the first block of data.
4. The method of claim 1 further comprising:
by the first computer server, intercepting a read request for the first block of data from the first offset within the first virtual disk configured on the storage platform,
wherein the read request is issued by an application executing at one of: the first computer server and one of a plurality of other computer servers that use the storage platform;
by the first computer server, based on determining that the first virtual disk is administered with client-side caching enabled:
querying the first metadata for the hash value of the first block of data,
if the first metadata lacks the hash value, issuing a read request to a storage node of the storage platform to read the first data block from the first offset on the first virtual disk, and serving the first data block received from the storage platform in response to the read request, and
if the first metadata comprises the hash value, serving a data block having the hash value from the client-side cache in response to the read request; and
wherein the client-side cache is not used for read requests from a virtual disk administered without client-side caching.
5. The method of claim 1, wherein the first offset for the first block of data within the first virtual disk is calculated by the first computer server based on an amount of data in the write request.
6. The method of claim 1, wherein the client-side cache is used for all virtual disks in the storage platform that are configured with client-side caching enabled.
7. The method of claim 1, wherein the client-side cache is used for all virtual disks in the storage platform that are configured at the application level with client-side caching enabled.
8. The method of claim 1 further comprising: writing the first block of data to the first virtual disk at the first offset.
9. The method of claim 1 further comprising: writing the first block of data to the first virtual disk at the first offset regardless of whether client-side caching is enabled for the first virtual disk.
10. A method of using deduplicated client-side caching for a storage platform, the method comprising:
by a first computer server, intercepting a write request to write a first block of data to a first virtual disk configured on the storage platform,
wherein the write request is issued by an application executing at one of: the first computer server and one of a plurality of other computer servers that use the storage platform;
by the first computer server, based on determining that the first virtual disk is administered with client-side caching enabled:
calculating a hash value for the first block of data,
querying first metadata for the hash value, wherein the first metadata tracks contents of a client-side cache configured in persistent storage at the first computer server, and wherein the persistent storage is distinct from data storage resources configured in the storage platform, and
if the first metadata comprises the hash value, refraining from writing the first block of data to the client-side cache, and updating second metadata that tracks virtual disk offsets within the storage platform with a key-value pair comprising a first offset for the first block of data within the first virtual disk and the hash value of the first block of data;
by the first computer server, intercepting a read request for the first block of data from the first offset within the first virtual disk,
wherein the read request is issued by an application executing at one of: the first computer server and one of a plurality of other computer servers that use the storage platform;
by the first computer server, based on determining that the first virtual disk is administered with client-side caching enabled:
querying the first metadata for the hash value of the first block of data, and
if the first metadata comprises the hash value, serving from the client-side cache a data block having the hash value of the first block of data in response to the read request.
11. The method of claim 10 further comprising:
if the first metadata does not comprise the hash value, by the first computer server, storing the first block of data in the client-side cache at a second offset, and updating the first metadata to indicate that a block of data having the hash value is stored in the client-side cache at the second offset.
12. The method of claim 10 further comprising:
if the second metadata lacks an entry for the first offset, by the first computer server, issuing a read request to a storage node of the storage platform to read the first block of data from the first offset on the first virtual disk, and serving the first block of data received from the storage platform in response to the read request.
13. The method of claim 10, wherein the client-side cache is not used for write requests to and read requests from a virtual disk administered without client-side caching.
14. The method of claim 10, wherein the client-side cache is used for all virtual disks in the storage platform that are configured with client-side caching enabled.
15. A method of using deduplicated client-side caching for a storage platform, the method comprising:
by a first computer server, intercepting a write request to write data to a first virtual disk configured on the storage platform,
wherein the write request is issued by an application executing at one of: the first computer server and one of a plurality of other computer servers that use the storage platform;
by the first computer server, based on determining that the first virtual disk is administered with client-side caching enabled:
calculating a hash value for a first block of the data in the write request,
querying first metadata for the hash value, wherein the first metadata tracks contents of a client-side cache configured in persistent storage at the first computer server, and
if the first metadata does not comprise the hash value, by the first computer server, storing the first block in the client-side cache at a second offset, and updating the first metadata to indicate that a block of data having the hash value is stored in the client-side cache at the second offset;
by the first computer server, intercepting a read request for data from the first offset within the first virtual disk configured on the storage platform,
wherein the read request is issued by an application executing at one of: the first computer server and one of a plurality of other computer servers that use the storage platform;
by the first computer server, based on determining that the first virtual disk is administered with client-side caching enabled:
identifying the first block in the data in the read request,
calculating the hash value for the first block,
querying the first metadata for the hash value, and
if the first metadata comprises the hash value corresponding to the second offset within the client-side cache, serving from the second offset at the client-side cache a data block having the hash value of the first block in response to the read request.
16. The method of claim 15 further comprising: if the first metadata comprises the hash value, refraining from writing the first block of data to the client-side cache.
17. The method of claim 15 further comprising:
if the first metadata comprises the hash value, refraining from writing the first block of data to the client-side cache and updating second metadata that tracks virtual disk offsets within the storage platform with a key-value pair comprising a first offset for the first block of data within the first virtual disk and the hash value of the first block of data; and
if the second metadata lacks an entry for the first offset, by the first computer server, issuing a read request to a storage node of the storage platform to read the first data block from the first offset on the first virtual disk, and serving the first data block received from the storage platform in response to the read request.
18. The method of claim 15, wherein the client-side cache provides global cache deduplication for all virtual disks in the storage platform that are configured with client-side caching enabled.
19. The method of claim 15, wherein the client-side cache is not used for write requests to and read requests from a virtual disk administered without client-side caching.
20. The method of claim 15, wherein the client-side cache is used for all virtual disks in the storage platform that are configured with client-side caching enabled.
US16/937,401 2016-05-16 2020-07-23 De-duplication of client-side data cache for virtual disks Abandoned US20200356277A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/937,401 US20200356277A1 (en) 2016-05-16 2020-07-23 De-duplication of client-side data cache for virtual disks

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15/156,015 US10795577B2 (en) 2016-05-16 2016-05-16 De-duplication of client-side data cache for virtual disks
US16/937,401 US20200356277A1 (en) 2016-05-16 2020-07-23 De-duplication of client-side data cache for virtual disks

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US15/156,015 Continuation US10795577B2 (en) 2016-05-16 2016-05-16 De-duplication of client-side data cache for virtual disks

Publications (1)

Publication Number Publication Date
US20200356277A1 true US20200356277A1 (en) 2020-11-12

Family

ID=60297069

Family Applications (2)

Application Number Title Priority Date Filing Date
US15/156,015 Active US10795577B2 (en) 2016-05-16 2016-05-16 De-duplication of client-side data cache for virtual disks
US16/937,401 Abandoned US20200356277A1 (en) 2016-05-16 2020-07-23 De-duplication of client-side data cache for virtual disks

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US15/156,015 Active US10795577B2 (en) 2016-05-16 2016-05-16 De-duplication of client-side data cache for virtual disks

Country Status (1)

Country Link
US (2) US10795577B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11314458B2 (en) 2016-05-16 2022-04-26 Commvault Systems, Inc. Global de-duplication of virtual disks in a storage platform

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10740300B1 (en) 2017-12-07 2020-08-11 Commvault Systems, Inc. Synchronization of metadata in a distributed storage system
CN108306740B (en) * 2018-01-22 2020-07-31 华中科技大学 Intel SGX state consistency protection method and system
US10848468B1 (en) 2018-03-05 2020-11-24 Commvault Systems, Inc. In-flight data encryption/decryption for a distributed storage platform
US11669400B2 (en) 2019-08-28 2023-06-06 Commvault Systems, Inc. Lightweight metadata handling for file indexing and live browse of backup copies
US11593228B2 (en) 2019-08-28 2023-02-28 Commvault Systems, Inc. Live browse cache enhacements for live browsing block-level backup copies of virtual machines and/or file systems
US20210089403A1 (en) * 2019-09-20 2021-03-25 Samsung Electronics Co., Ltd. Metadata table management scheme for database consistency
US11487468B2 (en) 2020-07-17 2022-11-01 Commvault Systems, Inc. Healing failed erasure-coded write attempts in a distributed data storage system configured with fewer storage nodes than data plus parity fragments
US11513708B2 (en) * 2020-08-25 2022-11-29 Commvault Systems, Inc. Optimized deduplication based on backup frequency in a distributed data storage system
US11789830B2 (en) 2020-09-22 2023-10-17 Commvault Systems, Inc. Anti-entropy-based metadata recovery in a strongly consistent distributed data storage system
US11647075B2 (en) 2020-09-22 2023-05-09 Commvault Systems, Inc. Commissioning and decommissioning metadata nodes in a running distributed data storage system
US11314687B2 (en) 2020-09-24 2022-04-26 Commvault Systems, Inc. Container data mover for migrating data between distributed data storage systems integrated with application orchestrators

Family Cites Families (176)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5625793A (en) 1991-04-15 1997-04-29 International Business Machines Corporation Automatic cache bypass for instructions exhibiting poor cache hit ratio
EP0541281B1 (en) 1991-11-04 1998-04-29 Commvault Systems, Inc. Incremental-computer-file backup using signatures
US5499367A (en) 1991-11-15 1996-03-12 Oracle Corporation System for database integrity with multiple logs assigned to client subsets
US5664106A (en) 1993-06-04 1997-09-02 Digital Equipment Corporation Phase-space surface representation of server computer performance in a computer network
US5642496A (en) 1993-09-23 1997-06-24 Kanfi; Arnon Method of making a backup copy of a memory over a plurality of copying sessions
US5765173A (en) 1996-01-11 1998-06-09 Connected Corporation High performance backup via selective file saving which can perform incremental backups and exclude files and uses a changed block signature list
US6418478B1 (en) 1997-10-30 2002-07-09 Commvault Systems, Inc. Pipelined high speed data transfer mechanism
US20010052015A1 (en) 1998-06-24 2001-12-13 Chueng-Hsien Lin Push-pull sevices for the internet
US6415385B1 (en) 1998-07-29 2002-07-02 Unisys Corporation Digital signaturing method and system for packaging specialized native files for open network transport and for burning onto CD-ROM
US6757705B1 (en) 1998-08-14 2004-06-29 Microsoft Corporation Method and system for client-side caching
US6425057B1 (en) 1998-08-27 2002-07-23 Hewlett-Packard Company Caching protocol method and system based on request frequency and relative storage duration
US6286084B1 (en) 1998-09-16 2001-09-04 Cisco Technology, Inc. Methods and apparatus for populating a network cache
US7035880B1 (en) 1999-07-14 2006-04-25 Commvault Systems, Inc. Modular backup and retrieval system used in conjunction with a storage area network
US7395282B1 (en) 1999-07-15 2008-07-01 Commvault Systems, Inc. Hierarchical backup and retrieval system
US7389311B1 (en) 1999-07-15 2008-06-17 Commvault Systems, Inc. Modular backup and retrieval system
US7028096B1 (en) 1999-09-14 2006-04-11 Streaming21, Inc. Method and apparatus for caching for streaming data
US6823377B1 (en) 2000-01-28 2004-11-23 International Business Machines Corporation Arrangements and methods for latency-sensitive hashing for collaborative web caching
US6658436B2 (en) 2000-01-31 2003-12-02 Commvault Systems, Inc. Logical view and access to data managed by a modular data and storage management system
US7003641B2 (en) 2000-01-31 2006-02-21 Commvault Systems, Inc. Logical view with granular access to exchange data managed by a modular data and storage management system
US6760723B2 (en) 2000-01-31 2004-07-06 Commvault Systems Inc. Storage management across multiple time zones
US6721767B2 (en) 2000-01-31 2004-04-13 Commvault Systems, Inc. Application specific rollback in a computer system
US6542972B2 (en) 2000-01-31 2003-04-01 Commvault Systems, Inc. Logical view and access to physical storage in modular data and storage management system
US6952737B1 (en) 2000-03-03 2005-10-04 Intel Corporation Method and apparatus for accessing remote storage in a distributed storage cluster architecture
US6760812B1 (en) 2000-10-05 2004-07-06 International Business Machines Corporation System and method for coordinating state between networked caches
US7315884B2 (en) 2001-04-03 2008-01-01 Hewlett-Packard Development Company, L.P. Reduction of network retrieval latency using cache and digest
US20030174648A1 (en) 2001-10-17 2003-09-18 Mea Wang Content delivery network by-pass system
US20030115346A1 (en) 2001-12-13 2003-06-19 Mchenry Stephen T. Multi-proxy network edge cache system and methods
US20030188106A1 (en) 2002-03-26 2003-10-02 At&T Corp. Cache validation using rejuvenation in a data network
US8650266B2 (en) 2002-03-26 2014-02-11 At&T Intellectual Property Ii, L.P. Cache validation using smart source selection in a data network
JP4221646B2 (en) 2002-06-26 2009-02-12 日本電気株式会社 Shared cache server
US7130970B2 (en) 2002-09-09 2006-10-31 Commvault Systems, Inc. Dynamic storage device pooling in a computer system
US7284030B2 (en) 2002-09-16 2007-10-16 Network Appliance, Inc. Apparatus and method for processing data in a network
AU2003272457A1 (en) 2002-09-16 2004-04-30 Commvault Systems, Inc. System and method for blind media support
US7171469B2 (en) 2002-09-16 2007-01-30 Network Appliance, Inc. Apparatus and method for storing data in a proxy cache in a network
WO2004025483A1 (en) 2002-09-16 2004-03-25 Commvault Systems, Inc. System and method for optimizing storage operations
US8176186B2 (en) 2002-10-30 2012-05-08 Riverbed Technology, Inc. Transaction accelerator for client-server communications systems
US8069225B2 (en) 2003-04-14 2011-11-29 Riverbed Technology, Inc. Transparent client-server transaction accelerator
US7240059B2 (en) 2002-11-14 2007-07-03 Seisint, Inc. System and method for configuring a parallel-processing database system
JP2006516341A (en) 2003-01-17 2006-06-29 タシット ネットワークス,インク. Method and system for storage caching with distributed file system
WO2004090788A2 (en) 2003-04-03 2004-10-21 Commvault Systems, Inc. System and method for dynamically performing storage operations in a computer network
US20040230753A1 (en) 2003-05-16 2004-11-18 International Business Machines Corporation Methods and apparatus for providing service differentiation in a shared storage environment
US7454569B2 (en) 2003-06-25 2008-11-18 Commvault Systems, Inc. Hierarchical system and method for performing storage operations in a computer network
US8938595B2 (en) 2003-08-05 2015-01-20 Sepaton, Inc. Emulated storage system
GB2425199B (en) 2003-11-13 2007-08-15 Commvault Systems Inc System and method for combining data streams in pipelined storage operations in a storage network
US7546324B2 (en) 2003-11-13 2009-06-09 Commvault Systems, Inc. Systems and methods for performing storage operations using network attached storage
US7440982B2 (en) 2003-11-13 2008-10-21 Commvault Systems, Inc. System and method for stored data archive verification
US7613748B2 (en) 2003-11-13 2009-11-03 Commvault Systems, Inc. Stored data reverification management system and method
CA2548542C (en) 2003-11-13 2011-08-09 Commvault Systems, Inc. System and method for performing a snapshot and for restoring data
DE10356724B3 (en) 2003-12-02 2005-06-16 Deutsches Zentrum für Luft- und Raumfahrt e.V. Method for reducing the transport volume of data in data networks
US7734820B1 (en) 2003-12-31 2010-06-08 Symantec Operating Corporation Adaptive caching for a distributed file sharing system
JP4402997B2 (en) 2004-03-26 2010-01-20 株式会社日立製作所 Storage device
US7246258B2 (en) 2004-04-28 2007-07-17 Lenovo (Singapore) Pte. Ltd. Minimizing resynchronization time after backup system failures in an appliance-based business continuance architecture
US7370163B2 (en) 2004-05-03 2008-05-06 Gemini Storage Adaptive cache engine for storage area network including systems and methods related thereto
US20060020660A1 (en) 2004-07-20 2006-01-26 Vishwa Prasad Proxy and cache architecture for document storage
US7975061B1 (en) 2004-11-05 2011-07-05 Commvault Systems, Inc. System and method for performing multistream storage operations
WO2006053050A2 (en) 2004-11-08 2006-05-18 Commvault Systems, Inc. System and method for performing auxiliary storage operations
US7574692B2 (en) 2004-11-19 2009-08-11 Adrian Herscu Method for building component-software for execution in a standards-compliant programming environment
US8296369B2 (en) 2005-09-27 2012-10-23 Research In Motion Limited Email server with proxy caching of unique identifiers
US7584338B1 (en) 2005-09-27 2009-09-01 Data Domain, Inc. Replication of deduplicated storage system
US7725671B2 (en) 2005-11-28 2010-05-25 Comm Vault Systems, Inc. System and method for providing redundant access to metadata over a network
US7617253B2 (en) 2005-12-19 2009-11-10 Commvault Systems, Inc. Destination systems and methods for performing data replication
US7617262B2 (en) 2005-12-19 2009-11-10 Commvault Systems, Inc. Systems and methods for monitoring application data in a data replication system
US7606844B2 (en) 2005-12-19 2009-10-20 Commvault Systems, Inc. System and method for performing replication copy storage operations
US7636743B2 (en) 2005-12-19 2009-12-22 Commvault Systems, Inc. Pathname translation in a data replication system
ES2582364T3 (en) 2005-12-19 2016-09-12 Commvault Systems, Inc. Systems and methods to perform data replication
US7620710B2 (en) 2005-12-19 2009-11-17 Commvault Systems, Inc. System and method for performing multi-path storage operations
US7651593B2 (en) 2005-12-19 2010-01-26 Commvault Systems, Inc. Systems and methods for performing data replication
US7543125B2 (en) 2005-12-19 2009-06-02 Commvault Systems, Inc. System and method for performing time-flexible calendric storage operations
US8301839B2 (en) 2005-12-30 2012-10-30 Citrix Systems, Inc. System and method for performing granular invalidation of cached dynamically generated objects in a data communication network
US7840618B2 (en) 2006-01-03 2010-11-23 Nec Laboratories America, Inc. Wide area networked file system
US7725655B2 (en) 2006-02-16 2010-05-25 Hewlett-Packard Development Company, L.P. Method of operating distributed storage system in which data is read from replicated caches and stored as erasure-coded data
US7761663B2 (en) 2006-02-16 2010-07-20 Hewlett-Packard Development Company, L.P. Operating a replicated cache that includes receiving confirmation that a flush operation was initiated
US8412682B2 (en) 2006-06-29 2013-04-02 Netapp, Inc. System and method for retrieving and using block fingerprints for data deduplication
US20080005509A1 (en) 2006-06-30 2008-01-03 International Business Machines Corporation Caching recovery information on a local system to expedite recovery
US7720841B2 (en) 2006-10-04 2010-05-18 International Business Machines Corporation Model-based self-optimizing distributed information management
US10296629B2 (en) 2006-10-20 2019-05-21 Oracle International Corporation Server supporting a consistent client-side cache
US7831566B2 (en) 2006-12-22 2010-11-09 Commvault Systems, Inc. Systems and methods of hierarchical storage management, such as global management of storage operations
US7769971B2 (en) 2007-03-29 2010-08-03 Data Center Technologies Replication and restoration of single-instance storage pools
US7873809B2 (en) 2007-03-29 2011-01-18 Hitachi, Ltd. Method and apparatus for de-duplication after mirror operation
EP2153340A4 (en) 2007-05-08 2015-10-21 Riverbed Technology Inc A hybrid segment-oriented file server and wan accelerator
US8078729B2 (en) 2007-08-21 2011-12-13 Ntt Docomo, Inc. Media streaming with online caching and peer-to-peer forwarding
WO2009032710A2 (en) 2007-08-29 2009-03-12 Nirvanix, Inc. Filing system and method for data files stored in a distributed communications network
US8738575B2 (en) 2007-09-17 2014-05-27 International Business Machines Corporation Data recovery in a hierarchical data storage system
US7822939B1 (en) 2007-09-25 2010-10-26 Emc Corporation Data de-duplication using thin provisioning
US8548953B2 (en) 2007-11-12 2013-10-01 F5 Networks, Inc. File deduplication using storage tiers
US8209334B1 (en) 2007-12-28 2012-06-26 Don Doerner Method to direct data to a specific one of several repositories
US8145614B1 (en) 2007-12-28 2012-03-27 Emc Corporation Selection of a data path based on the likelihood that requested information is in a cache
US8473956B2 (en) 2008-01-15 2013-06-25 Microsoft Corporation Priority based scheduling system for server
JP5089429B2 (en) * 2008-02-21 2012-12-05 キヤノン株式会社 Information processing apparatus, control method therefor, and program
US7539710B1 (en) 2008-04-11 2009-05-26 International Business Machines Corporation Method of and system for deduplicating backed up data in a client-server environment
US9395929B2 (en) 2008-04-25 2016-07-19 Netapp, Inc. Network storage server with integrated encryption, compression and deduplication capability
US8620877B2 (en) 2008-04-30 2013-12-31 International Business Machines Corporation Tunable data fingerprinting for optimizing data deduplication
US8484162B2 (en) 2008-06-24 2013-07-09 Commvault Systems, Inc. De-duplication systems and methods for application-specific data
US8176269B2 (en) 2008-06-30 2012-05-08 International Business Machines Corporation Managing metadata for data blocks used in a deduplication system
US7812621B2 (en) 2008-07-31 2010-10-12 Hiroshima University Measuring apparatus and method for measuring a surface capacitance of an insulating film
US8290915B2 (en) 2008-09-15 2012-10-16 International Business Machines Corporation Retrieval and recovery of data chunks from alternate data stores in a deduplicating system
US7814149B1 (en) 2008-09-29 2010-10-12 Symantec Operating Corporation Client side data deduplication
US8495032B2 (en) 2008-10-01 2013-07-23 International Business Machines Corporation Policy based sharing of redundant data across storage pools in a deduplicating system
US20100088296A1 (en) 2008-10-03 2010-04-08 Netapp, Inc. System and method for organizing data to facilitate data deduplication
US8401996B2 (en) 2009-03-30 2013-03-19 Commvault Systems, Inc. Storing a variable number of instances of data objects
US8205065B2 (en) 2009-03-30 2012-06-19 Exar Corporation System and method for data deduplication
US8805953B2 (en) 2009-04-03 2014-08-12 Microsoft Corporation Differential file and system restores from peers and the cloud
US8261126B2 (en) 2009-04-03 2012-09-04 Microsoft Corporation Bare metal machine recovery from the cloud
US8255365B2 (en) 2009-06-08 2012-08-28 Symantec Corporation Source classification for performing deduplication in a backup operation
US20100318759A1 (en) 2009-06-15 2010-12-16 Microsoft Corporation Distributed rdc chunk store
US8930306B1 (en) 2009-07-08 2015-01-06 Commvault Systems, Inc. Synchronized data deduplication
US8204862B1 (en) 2009-10-02 2012-06-19 Symantec Corporation Systems and methods for restoring deduplicated data
US8458144B2 (en) * 2009-10-22 2013-06-04 Oracle America, Inc. Data deduplication method using file system constructs
US8380688B2 (en) 2009-11-06 2013-02-19 International Business Machines Corporation Method and apparatus for data compression
US8504528B2 (en) 2009-11-09 2013-08-06 Ca, Inc. Duplicate backup data identification and consolidation
US20110119741A1 (en) 2009-11-18 2011-05-19 Hotchalk Inc. Method for Conditionally Obtaining Files From a Local Appliance
US8694469B2 (en) 2009-12-28 2014-04-08 Riverbed Technology, Inc. Cloud synthetic backups
US8250638B2 (en) * 2010-02-01 2012-08-21 Vmware, Inc. Maintaining the domain access of a virtual machine
US8732133B2 (en) 2010-03-16 2014-05-20 Commvault Systems, Inc. Extensible data deduplication system and method
US8468135B2 (en) 2010-04-14 2013-06-18 International Business Machines Corporation Optimizing data transmission bandwidth consumption over a wide area network
US8312471B2 (en) * 2010-04-26 2012-11-13 Vmware, Inc. File system independent content aware cache
US9852150B2 (en) * 2010-05-03 2017-12-26 Panzura, Inc. Avoiding client timeouts in a distributed filesystem
US8504526B2 (en) 2010-06-04 2013-08-06 Commvault Systems, Inc. Failover systems and methods for performing backup operations
US8548944B2 (en) 2010-07-15 2013-10-01 Delphix Corp. De-duplication based backup of file systems
US8578109B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US8577851B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Content aligned block-based deduplication
US8549350B1 (en) 2010-09-30 2013-10-01 Emc Corporation Multi-tier recovery
US8886613B2 (en) 2010-10-12 2014-11-11 Don Doerner Prioritizing data deduplication
US9116850B2 (en) 2010-12-14 2015-08-25 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US9020900B2 (en) 2010-12-14 2015-04-28 Commvault Systems, Inc. Distributed deduplicated storage system
US9021198B1 (en) 2011-01-20 2015-04-28 Commvault Systems, Inc. System and method for sharing SAN storage
US9823981B2 (en) 2011-03-11 2017-11-21 Microsoft Technology Licensing, Llc Backup and restore strategies for data deduplication
US9098325B2 (en) * 2012-02-28 2015-08-04 Hewlett-Packard Development Company, L.P. Persistent volume at an offset of a virtual block device of a storage server
US9298715B2 (en) 2012-03-07 2016-03-29 Commvault Systems, Inc. Data storage system utilizing proxy device for storage operations
US9342537B2 (en) 2012-04-23 2016-05-17 Commvault Systems, Inc. Integrated snapshot interface for a data storage system
US9218374B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Collaborative restore in a networked storage system
US8977828B2 (en) * 2012-06-21 2015-03-10 Ca, Inc. Data recovery using conversion of backup to virtual disk
US9275086B2 (en) 2012-07-20 2016-03-01 Commvault Systems, Inc. Systems and methods for database archiving
US8938481B2 (en) 2012-08-13 2015-01-20 Commvault Systems, Inc. Generic file level restore from a block-level secondary copy
US9298723B1 (en) * 2012-09-19 2016-03-29 Amazon Technologies, Inc. Deduplication architecture
US9747169B2 (en) 2012-12-21 2017-08-29 Commvault Systems, Inc. Reporting using data obtained during backup of primary storage
US9405482B2 (en) 2012-12-21 2016-08-02 Commvault Systems, Inc. Filtered reference copy of secondary storage data in a data storage system
US9633033B2 (en) 2013-01-11 2017-04-25 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9886346B2 (en) 2013-01-11 2018-02-06 Commvault Systems, Inc. Single snapshot for multiple agents
US20140201485A1 (en) 2013-01-14 2014-07-17 Commvault Systems, Inc. Pst file archiving
EP2799973B1 (en) * 2013-04-30 2017-11-22 iNuron NV A method for layered storage of enterprise data
US9342253B1 (en) * 2013-08-23 2016-05-17 Nutanix, Inc. Method and system for implementing performance tier de-duplication in a virtualization environment
US9424058B1 (en) 2013-09-23 2016-08-23 Symantec Corporation File deduplication and scan reduction in a virtualization environment
JP6025149B2 (en) 2013-11-06 2016-11-16 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation System and method for managing data
US9639426B2 (en) 2014-01-24 2017-05-02 Commvault Systems, Inc. Single snapshot for multiple applications
US9495251B2 (en) 2014-01-24 2016-11-15 Commvault Systems, Inc. Snapshot readiness checking and reporting
US9632874B2 (en) 2014-01-24 2017-04-25 Commvault Systems, Inc. Database application backup in single snapshot for multiple applications
US20150212894A1 (en) 2014-01-24 2015-07-30 Commvault Systems, Inc. Restoring application data from a single snapshot for multiple applications
US9753812B2 (en) 2014-01-24 2017-09-05 Commvault Systems, Inc. Generating mapping information for single snapshot for multiple applications
US9633056B2 (en) 2014-03-17 2017-04-25 Commvault Systems, Inc. Maintaining a deduplication database
US10380072B2 (en) 2014-03-17 2019-08-13 Commvault Systems, Inc. Managing deletions from a deduplication database
US9921773B2 (en) * 2014-06-18 2018-03-20 Citrix Systems, Inc. Range-based data deduplication using a hash table with entries replaced based on address alignment information
US9875063B2 (en) 2014-07-02 2018-01-23 Hedvig, Inc. Method for writing data to a virtual disk using a controller virtual machine and different storage and communication protocols
US9558085B2 (en) 2014-07-02 2017-01-31 Hedvig, Inc. Creating and reverting to a snapshot of a virtual disk
US9798489B2 (en) 2014-07-02 2017-10-24 Hedvig, Inc. Cloning a virtual disk in a storage platform
US9864530B2 (en) 2014-07-02 2018-01-09 Hedvig, Inc. Method for writing data to virtual disk using a controller virtual machine and different storage and communication protocols on a single storage platform
US9411534B2 (en) 2014-07-02 2016-08-09 Hedvig, Inc. Time stamp generation for virtual disks
US9424151B2 (en) * 2014-07-02 2016-08-23 Hedvig, Inc. Disk failure recovery for virtual disk with policies
US9483205B2 (en) 2014-07-02 2016-11-01 Hedvig, Inc. Writing to a storage platform including a plurality of storage clusters
US10067722B2 (en) 2014-07-02 2018-09-04 Hedvig, Inc Storage system for provisioning and storing data to a virtual disk
US20160019224A1 (en) 2014-07-18 2016-01-21 Commvault Systems, Inc. File system content archiving based on third-party application archiving rules and metadata
US10042716B2 (en) 2014-09-03 2018-08-07 Commvault Systems, Inc. Consolidated processing of storage-array commands using a forwarder media agent in conjunction with a snapshot-control media agent
US9774672B2 (en) 2014-09-03 2017-09-26 Commvault Systems, Inc. Consolidated processing of storage-array commands by a snapshot-control media agent
US9753955B2 (en) 2014-09-16 2017-09-05 Commvault Systems, Inc. Fast deduplication data verification
US9575673B2 (en) 2014-10-29 2017-02-21 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US9848046B2 (en) 2014-11-13 2017-12-19 Commvault Systems, Inc. Archiving applications in information management systems
US9448731B2 (en) 2014-11-14 2016-09-20 Commvault Systems, Inc. Unified snapshot storage management
US9648105B2 (en) 2014-11-14 2017-05-09 Commvault Systems, Inc. Unified snapshot storage management, using an enhanced storage manager and enhanced media agents
US10108687B2 (en) 2015-01-21 2018-10-23 Commvault Systems, Inc. Database protection using block-level mapping
US10339106B2 (en) 2015-04-09 2019-07-02 Commvault Systems, Inc. Highly reusable deduplication database after disaster recovery
US9639274B2 (en) 2015-04-14 2017-05-02 Commvault Systems, Inc. Efficient deduplication database validation
US20160350391A1 (en) 2015-05-26 2016-12-01 Commvault Systems, Inc. Replication using deduplicated secondary copy data
US20160350302A1 (en) 2015-05-27 2016-12-01 Hedvig, Inc. Dynamically splitting a range of a node in a distributed hash table
US9547513B2 (en) 2015-06-17 2017-01-17 VMware, Inc. Provisioning virtual desktops with stub virtual disks
US10310953B2 (en) 2015-12-30 2019-06-04 Commvault Systems, Inc. System for redirecting requests after a secondary storage computing device failure
US10846024B2 (en) 2016-05-16 2020-11-24 Commvault Systems, Inc. Global de-duplication of virtual disks in a storage platform

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11314458B2 (en) 2016-05-16 2022-04-26 Commvault Systems, Inc. Global de-duplication of virtual disks in a storage platform
US11733930B2 (en) 2016-05-16 2023-08-22 Commvault Systems, Inc. Global de-duplication of virtual disks in a storage platform

Also Published As

Publication number Publication date
US20170329530A1 (en) 2017-11-16
US10795577B2 (en) 2020-10-06

Similar Documents

Publication Publication Date Title
US20200356277A1 (en) De-duplication of client-side data cache for virtual disks
US11733930B2 (en) Global de-duplication of virtual disks in a storage platform
US20230359644A1 (en) Cloud-based replication to cloud-external systems
US11775392B2 (en) Indirect replication of a dataset
US11340672B2 (en) Persistent reservations for virtual disk using multiple targets
US11470056B2 (en) In-flight data encryption/decryption for a distributed storage platform
JP7053682B2 (en) Database tenant migration system and method
US9558085B2 (en) Creating and reverting to a snapshot of a virtual disk
US9864530B2 (en) Method for writing data to virtual disk using a controller virtual machine and different storage and communication protocols on a single storage platform
US9875063B2 (en) Method for writing data to a virtual disk using a controller virtual machine and different storage and communication protocols
US9424151B2 (en) Disk failure recovery for virtual disk with policies
US10067722B2 (en) Storage system for provisioning and storing data to a virtual disk
US8904047B1 (en) Cloud capable storage system with high performance nosql key-value pair operating environment
US20160004450A1 (en) Storage system with virtual disks
US20210344772A1 (en) Distributed database systems including callback techniques for cache of same
US20160004449A1 (en) Storage system with virtual disks
US20160004451A1 (en) Storage system with virtual disks
JP2018509695A (en) Computer program, system, and method for managing data in storage
AU2015231085A1 (en) Remote replication using mediums
US11347647B2 (en) Adaptive cache commit delay for write aggregation
US20190228078A1 (en) Methods for automated artifact storage management and devices thereof
US11971902B1 (en) Data retrieval latency management system

Legal Events

Date Code Title Description
AS Assignment

Owner name: HEDVIG, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LAKSHMAN, AVINASH;YADAV, GAURAV;REEL/FRAME:053581/0720

Effective date: 20170727

Owner name: COMMVAULT SYSTEMS, INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HEDVIG, INC.;REEL/FRAME:053567/0619

Effective date: 20200226

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

AS Assignment

Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT, ILLINOIS

Free format text: SECURITY INTEREST;ASSIGNOR:COMMVAULT SYSTEMS, INC.;REEL/FRAME:058496/0836

Effective date: 20211213

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION