US20120150818A1 - Client-side repository in a networked deduplicated storage system - Google Patents

Client-side repository in a networked deduplicated storage system

Publication number
US20120150818A1
Authority
US
United States
Prior art keywords
client
data
repository
data blocks
csr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/324,884
Inventor
Manoj Kumar Vijayan Retnamma
Deepak Raghunath Attarde
Hetalkumar N. Joshi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CommVault Systems Inc
Original Assignee
CommVault Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US42303110P priority Critical
Application filed by CommVault Systems Inc filed Critical CommVault Systems Inc
Priority to US13/324,884 priority patent/US20120150818A1/en
Assigned to COMMVAULT SYSTEMS, INC. reassignment COMMVAULT SYSTEMS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ATTARDE, DEEPAK RAGHUNATH, JOSHI, HETALKUMAR N., RETNAMMA, MANOJ KUMAR VIJAYAN
Publication of US20120150818A1 publication Critical patent/US20120150818A1/en
Assigned to BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT reassignment BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT SECURITY INTEREST Assignors: COMMVAULT SYSTEMS, INC.
Application status is Abandoned legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1456Hardware arrangements for backup
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • G06F11/1453Management of the data involved in backup or backup restore using de-duplication of the data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80Database-specific techniques

Abstract

A storage system according to certain embodiments includes a client-side repository (CSR). The CSR may communicate with a client at a higher data transfer rate than the rate used for communication between the client and secondary storage. During copy operations, for instance, some or all of the data being backed up or otherwise copied to secondary storage is stored in the CSR. During restore operations, copies of the data stored in the CSR are accessed from the CSR instead of from secondary storage, improving performance. Remaining data blocks not stored in the CSR can be restored from secondary storage.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This application claims the benefit of priority under 35 U.S.C. §119(e) of U.S. Provisional Patent Application No. 61/423,031, filed on Dec. 14, 2010, and entitled “Client-Side Repository in a Networked Deduplicated Storage System,” the disclosure of which is hereby incorporated by reference in its entirety.
  • BACKGROUND
  • Computers have become an integral part of business operations such that many banks, insurance companies, brokerage firms, financial service providers, and a variety of other businesses rely on computer networks to store, manipulate, and display information that is constantly subject to change. Oftentimes, the success or failure of an important transaction may turn on the availability of information that is both accurate and current. Accordingly, businesses worldwide recognize the commercial value of their data and seek reliable, cost-effective ways to protect the information stored on their computer networks.
  • In corporate environments, protecting information is generally part of a routine process that is performed for many computer systems within an organization. For example, a company might back up critical computing systems related to e-commerce such as databases, file servers, web servers, and so on as part of a daily, weekly, or monthly maintenance schedule. The company may also protect computing systems used by each of its employees, such as those used by an accounting department, marketing department, engineering department, and so forth.
  • As such, enterprises are generating ever-increasing volumes of data and corresponding storage requirements. Moreover, enterprise storage systems are typically distributed over one or more networks, such as where backup storage is remote from client computers. In such situations, backup storage operations place heavy demands on available network bandwidth.
  • SUMMARY
  • In response to these challenges, one technique developed by storage system providers is data deduplication. Deduplication typically involves eliminating or reducing the amount of redundant data stored and communicated within a storage system, improving storage utilization. For example, data can be divided into units of a chosen granularity (e.g., files or data blocks). As new data enters the system, the data units can be checked to see if they already exist in the storage system. If a data unit already exists, instead of storing and/or communicating a duplicate copy, the storage system stores and/or communicates a reference to the existing data unit. Thus, deduplication can improve storage utilization, reduce system traffic (e.g., over a networked storage system), or both.
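  • The block-level check described above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the fixed 4-byte block size, the use of SHA-256 as the signature, and the names `DedupStore`, `split_blocks`, and `signature` are all assumptions chosen to keep the example short.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical fixed block size, tiny so the example is readable


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Divide data into fixed-size units (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def signature(block: bytes) -> str:
    """Signature used to detect duplicate blocks (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


class DedupStore:
    """Stores each unique block once; stored data is a list of references."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        refs = []
        for block in split_blocks(data):
            sig = signature(block)
            if sig not in self.blocks:
                # Unit not seen before: store the block itself.
                self.blocks[sig] = block
            # In every case, record only a reference to the stored unit.
            refs.append(sig)
        return refs

    def get(self, refs: list[str]) -> bytes:
        """Rebuild the original data by following the references."""
        return b"".join(self.blocks[sig] for sig in refs)


store = DedupStore()
refs = store.put(b"ABCDABCDABCDWXYZ")
```

  • Here four blocks enter the store but only two distinct blocks are kept, and `store.get(refs)` reassembles the original bytes from the references.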
  • Deduplication techniques designed to reduce the demands on storage systems during backup and/or replication operations are described in greater detail in the following U.S. patent applications, each of which is incorporated by reference in its entirety. One or more embodiments of the present disclosure may be used with systems and methods disclosed therein:
      • U.S. patent application Ser. No. ______, entitled “Distributed Deduplicated Storage System,” and filed on Dec. 13, 2011;
      • U.S. patent application Ser. No. 12/982,086, entitled “Content Aligned Block-Based Deduplication,” filed Dec. 30, 2010;
      • U.S. patent application Ser. No. 12/982,100, entitled “Systems and Methods for Retaining and Using Block Signatures in Data Protection Operations,” filed Dec. 30, 2010;
      • U.S. patent application Ser. No. 12/145,347, entitled “Application-Aware and Remote Single Instance Data Management,” filed Jun. 24, 2008;
      • U.S. patent application Ser. No. 12/145,342, entitled “Application-Aware and Remote Single Instance Data Management,” filed Jun. 24, 2008; and
      • U.S. patent application Ser. No. 12/725,288, entitled “Extensible Data Deduplication System and Method,” filed Mar. 16, 2010.
  • In addition, one or more embodiments of the present disclosure may also be used with systems and methods disclosed in the following patents, each of which is hereby incorporated herein by reference in its entirety:
      • U.S. Pat. No. 7,389,311, entitled “Hierarchical Backup and Retrieval System,” issued Jun. 17, 2008;
      • U.S. Pat. No. 6,418,478, entitled “Pipelined High Speed Data Transfer Mechanism,” issued Jul. 9, 2002;
      • U.S. Pat. No. 7,035,880, entitled “Modular Backup and Retrieval System Used in Conjunction with a Storage Area Network,” issued Apr. 25, 2006;
      • U.S. Pat. No. 6,542,972, entitled “Logical View and Access to Physical Storage in Modular Data and Storage Management System,” issued Apr. 1, 2003;
      • U.S. Pat. No. 6,658,436, entitled “Logical View and Access to Data Managed by a Modular Data and Storage Management System,” issued Dec. 2, 2003;
      • U.S. Pat. No. 7,130,970, entitled “Dynamic Storage Device Pooling in a Computer System,” issued Oct. 10, 2006;
      • U.S. Pat. No. 7,246,207, entitled “System and Method for Dynamically Performing Storage Operations in a Computer Network,” issued Jul. 17, 2007;
      • U.S. Pat. No. 7,454,569, entitled “Hierarchical System and Method for Performing Storage Operations in a Computer Network,” issued Nov. 18, 2008;
      • U.S. Pat. No. 7,613,748, entitled “System and Method for Containerized Data Storage and Tracking,” issued Nov. 3, 2009; and
      • U.S. Pat. No. 7,620,710, entitled “Systems and Methods for Performing Multi-Path Storage Operations,” issued Nov. 17, 2009.
  • However, even in those systems employing deduplication, restore operations, including operations where data is restored from backup storage to a client, can place equally heavy demands on available network bandwidth and available system resources. Restore operations can also introduce significant delay due to communication latency between backup storage and the client.
  • In accordance with certain aspects of the disclosure, one technique developed to address these challenges incorporates the use of a client-side repository. A client-side repository (CSR) can be used as part of a storage system to reduce the demands on the network between a client and secondary storage, such as backup storage. For example, a CSR can be located in proximity to the client or may share a common network topology with the client whereas the client and the backup storage devices may be remote from one another or reside on differing network topologies. As just one example, the CSR and the client may communicate over a local area network (LAN), while the client and secondary storage communicate over a wide area network (WAN). Thus, the CSR can communicate more effectively (e.g., at a higher data transfer rate, more reliably, with less latency, etc.) with the client than the backup storage devices can communicate with the client.
  • During backup or other secondary storage operations (e.g., copy, replication, or snapshot operations), some or all of the data to be copied from the client can be stored in the CSR in addition to being stored in the backup storage devices. Upon restore, the CSR can restore the data stored therein to the client. This data is therefore not transmitted from the backup storage to the client. The remaining data is transmitted from the backup storage to the client in the normal fashion. In this manner, the CSR can reduce the system traffic between the client and the backup storage devices and reduce the amount of time used to restore the client.
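  • As a rough illustration of the backup and restore paths just described, the following Python sketch models the CSR and the backup storage as plain dictionaries keyed by block signature. The function names, the `keep_in_csr` selection callback, and the use of SHA-256 are hypothetical and are not specified by the disclosure.

```python
import hashlib


def sig(block: bytes) -> str:
    """Block signature (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


def backup(blocks, csr, secondary, keep_in_csr):
    """Copy every block to secondary storage; also keep selected blocks in the CSR."""
    manifest = []
    for block in blocks:
        s = sig(block)
        secondary[s] = block
        if keep_in_csr(block):
            csr[s] = block
        manifest.append(s)
    return manifest


def restore(manifest, csr, secondary):
    """Rebuild the data, preferring the CSR and counting each block's source."""
    out = []
    from_csr = 0
    from_secondary = 0
    for s in manifest:
        if s in csr:
            out.append(csr[s])
            from_csr += 1
        else:
            out.append(secondary[s])
            from_secondary += 1
    return b"".join(out), from_csr, from_secondary


csr, secondary = {}, {}
blocks = [b"aaaa", b"bbbb", b"cccc"]
# Hypothetical policy: everything except b"cccc" is also kept in the CSR.
manifest = backup(blocks, csr, secondary, keep_in_csr=lambda b: b != b"cccc")
data, n_csr, n_secondary = restore(manifest, csr, secondary)
```

  • In this run two of the three blocks are served by the CSR, so only one block would have to cross the slower link between the backup storage and the client.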
  • In certain embodiments, a method of restoring deduplicated data to a client from a destination storage system is provided. The method can include receiving one or more queries from a destination storage system inquiring as to the presence of a plurality of data blocks in a data repository of a client-side repository. The data blocks may correspond to at least a portion of data that has been previously copied from a client to the destination storage system according to a deduplication scheme. The destination storage system may be remote from the client and the client-side repository. The method can further include consulting, using one or more processors, a signature repository of the client-side repository having stored thereon signatures corresponding to the data blocks in the data repository. The consulting may be performed in response to the one or more queries and to determine which of the queried data blocks are stored in the data repository of the client-side repository. The method may further include restoring the data blocks that are stored in the data repository of the client-side repository from the data repository to the client.
  • According to some embodiments, a storage system is provided including a client-side repository comprising a data repository storing a plurality of data blocks, the data blocks corresponding to at least a portion of data that has been previously copied from an information store of a client to a destination storage system according to a deduplication scheme. The client-side repository may further include a signature repository storing signatures corresponding to the data blocks in the data repository, the data repository and the signature repository remote from the destination storage system. The storage system may further include a control module executing in one or more processors and configured to receive one or more queries inquiring as to the presence of a plurality of data blocks in the data repository. The control module may further be configured to consult the signature repository in response to the one or more queries to determine which of the queried data blocks are stored in the data block repository. The control module may additionally be configured to restore the data blocks that are stored in the data block repository from the data block repository to the information store of the client.
  • In certain embodiments, a method of restoring deduplicated data from a destination storage system to an information store associated with a client is provided. The method may include, in response to instructions to copy data from an information store associated with a client system to at least one destination storage system remote from the client system: copying at least a portion of the data from the information store to a data repository of a client-side repository as a plurality of data blocks, the client-side repository being remote from the destination storage system, wherein the data from the information store is copied to the destination storage system according to a deduplication scheme. Also in response to the instructions, the method may include populating a signature repository of the client-side repository with a plurality of deduplication signatures corresponding to the data blocks stored in the data repository of the client-side repository. During a restore operation in which the copied data is restored from the destination storage system to the client, the method may include receiving a plurality of queries inquiring as to the presence of the plurality of data blocks in the client-side repository. Also during the restore operation the method may include consulting the signature repository of the client-side repository using one or more processors and in response to the queries to determine which of the data blocks are stored in the data repository of the client-side repository. Also during the restore operation, the method may include restoring data blocks that are stored in the data repository of the client-side repository from the client-side repository to the client, the data blocks not stored in the data repository of the client-side repository being restored from the destination storage system to the client.
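  • The query-and-indication exchange in the preceding embodiments might be sketched as follows. This is a simplified, single-process Python model under stated assumptions: the class and method names are hypothetical, signatures are short strings rather than real hashes, and the client's information store is a dictionary.

```python
class ClientSideRepository:
    """Hypothetical CSR with separate data and signature repositories."""

    def __init__(self):
        self.data_repository = {}
        self.signature_repository = set()

    def populate(self, signature, block):
        """Called during a copy operation to store a block and its signature."""
        self.data_repository[signature] = block
        self.signature_repository.add(signature)

    def answer_queries(self, signatures):
        """Consult only the signature repository; return the signatures held here."""
        return {s for s in signatures if s in self.signature_repository}

    def restore(self, signatures, client):
        """Restore the named blocks from the data repository to the client."""
        for s in signatures:
            client[s] = self.data_repository[s]


def restore_from_destination(destination, csr, wanted, client):
    """Restore CSR-resident blocks locally and the rest from the destination."""
    present = csr.answer_queries(wanted)   # queries and the returned indication
    csr.restore(present, client)           # blocks the CSR holds
    missing = [s for s in wanted if s not in present]
    for s in missing:                      # blocks only the destination holds
        client[s] = destination[s]
    return missing


destination = {"s1": b"one", "s2": b"two", "s3": b"three"}
csr = ClientSideRepository()
csr.populate("s1", b"one")
csr.populate("s3", b"three")
client = {}
missing = restore_from_destination(destination, csr, ["s1", "s2", "s3"], client)
```

  • Keeping the signatures in their own structure lets the CSR answer presence queries without touching the stored blocks, which is the role the signature repository plays in the text above.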
  • In certain embodiments, a method of restoring deduplicated data to an information store associated with a client from a destination storage system is provided. The method can include sending one or more queries to a client-side repository inquiring as to the presence of a plurality of data blocks in a data repository of a client-side repository, the data blocks corresponding to at least a portion of data that has been previously copied from an information store of a client to the destination storage device according to a deduplication scheme, the destination storage device remote from the client and the client-side repository. The method can further include receiving an indication as to which of the queried data blocks are stored in the data repository of the client-side repository. The method may include restoring the data blocks that are not stored in the data repository of the client-side repository from the destination storage device to the information store of the client.
  • In yet other embodiments, a storage system is provided. The storage system can include at least one destination storage device storing data that has been previously copied from an information store of a client to the destination storage device according to a deduplication scheme. The storage system can further include a control module executing in one or more processors and configured to send one or more queries to a client-side repository inquiring as to the presence of a plurality of data blocks in a data repository of the client-side repository, the data blocks corresponding to at least a portion of the data that was copied from the information store of the client to the destination storage device, the destination storage device remote from the client and the client-side repository. The control module can further be configured to receive an indication as to which of the queried data blocks are stored in the data repository of the client-side repository. Additionally, the control module can be configured to restore the data blocks that are not stored in the data repository of the client-side repository from the destination storage device to the information store of the client.
  • In certain embodiments, a method is provided of modifying a client-side repository usable during restore operations in a deduplicated storage system, the method including monitoring the use of a client-side repository using one or more processors, the client-side repository usable during copy and restore operations. The copy operations can include storing data blocks and signatures corresponding to the data blocks in the client-side repository, the data blocks corresponding to at least a portion of data that is copied from a client system to a destination storage system according to a deduplication scheme. The restore operations may include restoring the data blocks not stored in the client-side repository from the destination storage system to the client system and restoring the data blocks stored in the client-side repository from the client-side repository to the client system. In certain embodiments, the method includes determining whether the use of the client-side repository meets a usage threshold in response to the monitoring. The method can also include, upon determining that the use of the client-side repository meets a usage threshold, tuning a client-side repository parameter.
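  • The monitor, compare, and tune sequence can be sketched in Python as below. This passage does not say which usage measure or which parameter is involved, so the sketch makes its own assumptions: usage is measured as the lookup hit ratio, the threshold is "met" when that ratio is at or below a floor, and the tuned parameter is the CSR capacity. The name `CsrMonitor` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CsrMonitor:
    capacity_blocks: int          # hypothetical tunable CSR parameter
    usage_threshold: float = 0.5  # hypothetical hit-ratio floor that triggers tuning
    hits: int = 0
    lookups: int = 0

    def record_lookup(self, hit: bool) -> None:
        """Monitoring step: note whether a requested block was found in the CSR."""
        self.lookups += 1
        if hit:
            self.hits += 1

    def hit_ratio(self) -> float:
        return self.hits / self.lookups if self.lookups else 0.0

    def tune_if_needed(self) -> bool:
        """Double the capacity when the hit ratio is at or below the threshold."""
        if self.lookups and self.hit_ratio() <= self.usage_threshold:
            self.capacity_blocks *= 2
            self.hits = 0
            self.lookups = 0
            return True
        return False


monitor = CsrMonitor(capacity_blocks=100)
for hit in (True, False, False, False):
    monitor.record_lookup(hit)
ratio_before = monitor.hit_ratio()
tuned = monitor.tune_if_needed()
```

  • Other measures (for example, restore time or bytes served from the CSR) and other parameters could be substituted without changing the overall shape of the loop.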
  • In certain embodiments, a storage system is provided having a client-side repository. The client-side repository can include a data repository storing a plurality of data blocks. The data blocks may correspond to at least a portion of data that has been previously copied from a client system to a destination storage system according to a deduplication scheme. In certain embodiments the client-side repository also includes a signature repository storing signatures corresponding to the data blocks in the data repository. The data repository and the signature repository may be remote from the destination storage system. The system may further include a control module executing in one or more processors and configured to monitor the use of the client-side repository during restore operations, wherein the restore operations include restoring the data blocks not stored in the client-side repository from the destination storage system to the client system and restoring the data blocks stored in the client-side repository from the client-side repository to the client system. The control module may further be configured to determine whether the use of the client-side repository meets a usage threshold in response to the monitoring. In addition, the control module may be configured to, upon determining that the use of the client-side repository meets a usage threshold, tune a client-side repository parameter.
  • In certain embodiments, a method of modifying a client-side repository usable during restore operations in a deduplicated storage system is provided. The method may include populating a client-side repository with a plurality of data blocks, the data blocks corresponding to at least a portion of data that is copied from a client system to a destination storage system according to a deduplication scheme. The method can further include populating the client-side repository with deduplication signatures corresponding to the data blocks that are stored in the client-side repository. The method can also include, during at least one restore operation in which the data is restored to the client system, determining which of the plurality of data blocks are stored in the client-side repository with one or more processors and at least in part based on the deduplication signatures stored in the client-side repository. During the at least one restore operation, the method can also include accessing the client-side repository to restore the data blocks that are stored in the client-side repository from the client-side repository to the client system, wherein the data blocks that are not stored in the client-side repository are restored from the destination storage system to the client system. The method can also include generating a performance metric relating to the at least one restore operation. The method may further include modifying a parameter associated with the client-side repository in response to the performance metric not meeting a threshold condition.
  • In certain embodiments, a storage system is provided. The storage system can include at least one destination storage device storing a plurality of data blocks corresponding to data that has been previously copied from a client system to the destination storage device according to a deduplication scheme. The storage system may further include a control module executing in one or more processors. The control module may be configured to monitor the use of a client-side repository during restore operations. The client-side repository may include a data repository storing at least a portion of the data blocks that were previously copied to the destination storage system. The client-side repository may further include a signature repository storing signatures corresponding to the data blocks in the data repository, the data repository and the signature repository remote from the destination storage device. The restore operations can include restoring the data blocks not stored in the client-side repository from the destination storage device to the client system and restoring the data blocks stored in the client-side repository from the client-side repository to the client system. The control module may further be configured to determine whether the use of the client-side repository meets a usage threshold in response to the monitoring and, upon determining that the use of the client-side repository meets a usage threshold, to tune a client-side repository parameter.
  • In certain embodiments, a method of restoring deduplicated data from a destination storage system to a client system is provided. The method may include, during a restore operation in which data is restored to a client system from a destination storage system, the data previously copied as a plurality of data blocks with corresponding deduplication signatures to the destination storage system according to a deduplication scheme, and at least some of the data blocks previously copied along with corresponding deduplication signatures to a client-side repository that is remote from the destination storage system, grouping a plurality of the deduplication signatures stored at the destination storage system into one or more bundles using one or more processors. The method can further include sending the bundles to the client-side repository. The method may also include receiving an indication from the client-side repository as to which of the data blocks corresponding to the signatures in the bundles are stored in the client-side repository. In certain embodiments, the method includes accessing the destination storage system to restore data blocks not stored in the client-side repository from the destination storage system to the client system, wherein the data blocks that are stored in the client-side repository are restored from the client-side repository to the client system.
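  • The bundling step can be illustrated with a short Python sketch. The bundle size of three, the per-signature boolean reply format, and the function names are assumptions made for the example; the disclosure does not fix them here.

```python
from typing import Iterator


def bundle(signatures: list[str], bundle_size: int) -> Iterator[list[str]]:
    """Group signatures into bundles of at most bundle_size entries."""
    for start in range(0, len(signatures), bundle_size):
        yield signatures[start:start + bundle_size]


def csr_answer(bundle_of_sigs: list[str], csr_signatures: set[str]) -> list[bool]:
    """One reply per bundle: a presence flag for each signature, in order."""
    return [s in csr_signatures for s in bundle_of_sigs]


def plan_restore(signatures, csr_signatures, bundle_size=3):
    """Split the restore into CSR-served and destination-served signatures."""
    from_csr = []
    from_destination = []
    round_trips = 0
    for b in bundle(signatures, bundle_size):
        round_trips += 1  # one exchange with the CSR per bundle
        for s, present in zip(b, csr_answer(b, csr_signatures)):
            if present:
                from_csr.append(s)
            else:
                from_destination.append(s)
    return from_csr, from_destination, round_trips


sigs = ["a", "b", "c", "d", "e", "f", "g"]
plan = plan_restore(sigs, csr_signatures={"a", "c", "g"}, bundle_size=3)
```

  • With seven signatures and a bundle size of three, the destination makes three exchanges with the CSR instead of seven, which is the kind of saving bundling is meant to provide on a high-latency link.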
  • In certain embodiments, a storage system is provided comprising at least one destination storage device storing data that was previously copied to the destination storage device from a client system as a plurality of data blocks and according to a deduplication scheme. The storage system may also include a control module executing in one or more processors. During at least one restore operation in which the data is restored to the client system, the control module may be configured to group a plurality of queries into one or more query bundles, each query of the one or more query bundles being associated with a data block to restore to the client system and comprising a signature associated with the data block. The control module may be configured to send at least one of the query bundles to the client-side repository. The control module can be configured to receive an indication from the client-side repository as to whether one or more of the data blocks associated with the at least one query bundle are stored in the client-side repository. In some embodiments, the control module is configured to access the destination storage device to restore data blocks not stored in the client-side repository from the destination storage device to the client system, wherein the data blocks that are stored in the client-side repository are restored from the client-side repository to the client system.
  • In certain embodiments, a method of restoring deduplicated data from a destination storage system to a client system is provided. The method can include receiving from a destination storage system, at a client-side repository remote from the destination storage system, one or more query bundles, wherein data from the client system was previously copied to the destination storage system as a plurality of data blocks according to a deduplication scheme, each query bundle inquiring as to the presence of a plurality of the data blocks at the client-side repository. In certain embodiments, the method also includes consulting a signature repository of the client-side repository using one or more processors and in response to each of the query bundles to determine which of the plurality of data blocks associated with the query bundle are stored in the client-side repository. The method can further include indicating to the destination storage system which of the plurality of data blocks associated with the respective query bundles are stored in the client-side repository. The method in certain embodiments includes restoring the one or more data blocks stored in the client-side repository from the client-side repository to the client system.
  • In certain embodiments, a storage system is provided having a client-side repository, comprising: a data repository storing a plurality of data blocks, the data blocks corresponding to at least a portion of data that has been previously copied from a client system to a destination storage system according to a deduplication scheme. The client-side repository may include a signature repository storing signatures corresponding to the data blocks in the data repository, the data repository and the signature repository remote from the destination storage system. The client-side repository may also include a control module configured to receive one or more query bundles from the destination storage system, each query bundle inquiring as to the presence of a plurality of the data blocks at the client-side repository. The control module may be configured to consult the signature repository in response to each of the received query bundles to determine which of the plurality of data blocks associated with the query bundle are stored in the data repository. The control module may further be configured to indicate to the destination storage system which of the plurality of data blocks associated with the received query bundles are stored in the data block repository. The control module may also be configured to restore the one or more data blocks stored in the data block repository from the client-side repository to the client system.
  • In certain embodiments, a method for restoring data to a client system from a destination storage system is provided. The method can include, for each of a plurality of data blocks previously copied to a destination storage system according to a deduplication scheme, consulting an archive file identifier corresponding to the data block to determine age information associated with the data block. Based on the age information and using one or more processors, the method can include determining whether to query a client-side repository remote from the destination storage system as to whether the client-side repository is populated with a copy of the data block. The method can also include querying the client-side repository from the destination storage system as to whether the client-side repository is populated with a copy of the data block based on the determination. The method may include restoring data blocks that are not stored in the client-side repository from the destination storage system to the client system, wherein the data blocks that are stored in the client-side repository are restored from the client-side repository to the client system.
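  • One way to picture the age-based decision is the Python sketch below. It rests on assumptions that are not stated in this passage: archive file identifiers are taken to be integers that increase with each copy operation, and the CSR is taken to hold only blocks from archive files at or after a known cutoff (`oldest_id_in_csr`). All names are hypothetical.

```python
def should_query_csr(archive_file_id: int, oldest_id_in_csr: int) -> bool:
    """Query only for blocks whose archive file is recent enough to be in the CSR."""
    return archive_file_id >= oldest_id_in_csr


def restore(blocks, csr, destination, oldest_id_in_csr):
    """blocks is a list of (signature, archive_file_id); returns (data, queries_sent)."""
    out = []
    queries_sent = 0
    for signature, archive_file_id in blocks:
        block = None
        if should_query_csr(archive_file_id, oldest_id_in_csr):
            queries_sent += 1
            block = csr.get(signature)     # None when the CSR lacks the block
        if block is None:
            block = destination[signature]  # fall back to the destination copy
        out.append(block)
    return b"".join(out), queries_sent


destination = {"s1": b"A", "s2": b"B", "s3": b"C"}
csr = {"s3": b"C"}
blocks = [("s1", 10), ("s2", 42), ("s3", 43)]
data, queries_sent = restore(blocks, csr, destination, oldest_id_in_csr=40)
```

  • The block from archive file 10 is too old to be in the CSR under the assumed cutoff, so no query is sent for it; only two of the three blocks trigger a query.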
  • In certain embodiments, a storage system is provided comprising at least one destination storage device storing data that was previously copied to the destination storage device from a client system as a plurality of data blocks and according to a deduplication scheme. The storage system may further include a control module executing in one or more processors. The control module may be configured to consult an archive file identifier corresponding to the data block to determine age information associated with the data block. The control module can also be configured to, based on the age information and using one or more processors, determine whether to query a client-side repository remote from the destination storage system as to whether the client-side repository is populated with a copy of the data block. The control module may also be configured to query the client-side repository from the destination storage system as to whether the client-side repository is populated with a copy of the data block based on the determination. In some embodiments, the control module is configured to restore data blocks that are not stored in the client-side repository from the destination storage system to the client system, wherein the data blocks that are stored in the client-side repository are restored from the client-side repository to the client system.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIGS. 1 and 2 are block diagrams that illustrate components of example storage systems configured to implement techniques compatible with embodiments described herein.
  • FIG. 3 is a block diagram illustrative of an expanded view of an example client-side repository.
  • FIGS. 4A-4B are state diagrams illustrative of the interaction between the various components of an example storage system with respect to example backup and restore operations, respectively.
  • FIG. 5 is a flow diagram illustrative of one embodiment of a routine implemented by a storage system for restoring data using a client-side repository.
  • FIG. 6 is a flow diagram illustrative of one embodiment of a routine implemented by a storage system for tuning a client-side repository parameter.
  • FIG. 7 is a flow diagram illustrative of one embodiment of a routine implemented by a storage system for restoring data using a client-side repository.
  • FIG. 8 is a flow diagram illustrative of one embodiment of a routine implemented by a storage system for bundling queries for a client-side repository.
  • DETAILED DESCRIPTION
  • Client-Side Repository Overview
  • The present disclosure is directed to a system, method, and computer-readable non-transitory storage medium for storing data to and restoring data from a storage system including a client-side repository (CSR). Specifically, aspects of the disclosure will be described with regard to storing deduplicated data in both a CSR and secondary storage (e.g., during backup or other copy operations) and restoring data from both the CSR and secondary storage during restore. Although various aspects of the disclosure will be described with regard to examples and embodiments, one skilled in the art will appreciate that the disclosed embodiments and examples should not be construed as limiting.
  • While described primarily with respect to backup operations for the purposes of illustration, the techniques described herein may be equally compatible with other types of storage operations including copy, replication, snapshot and archive operations, to name a few. A description of some storage operations compatible with embodiments described herein is provided near the end of this disclosure.
  • In accordance with aspects described herein, data is broken up into data blocks, or data segments for processing. For example, the data blocks can be used for the purposes of removing duplicate data blocks and replacing them with references to those blocks during data deduplication. Thus, a data block refers to a portion of data. The data blocks can vary in size based on system preferences. While other compatible data reduction techniques are possible, the embodiments described herein are described primarily in relation to data deduplication for clarity. Moreover, certain aspects described herein are compatible with systems that do not incorporate data reduction techniques.
  • In order to identify data blocks, various functions can be performed on individual data blocks to generate a unique or substantially unique signature corresponding to the data block. For example, hash functions and the like can be used, as described in greater detail in any of the applications incorporated by reference herein, such as, for example, the application entitled “Content-Aligned Block-Based Deduplication.” Any number of different hash functions or other operations can be performed on the data blocks, such as SHA-512, for example. The hash or other signature can be used for a variety of purposes. For example, the signature can be used to determine if two data blocks contain the same data for the purposes of deduplication. As will be described in greater detail below, the signature can also be used to efficiently determine whether a data block exists in a client-side repository.
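  • By way of non-limiting illustration, the following Python sketch shows one way data could be broken into blocks and a signature derived for each block. The fixed 64 kB block size and the names `split_into_blocks`, `signature`, and `same_content` are assumptions made for this example only; the disclosure permits other block sizes and other signature functions.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # assumed block size; the disclosure lets block sizes vary


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Break data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def signature(block):
    """Return a substantially unique signature for a block (SHA-512 here)."""
    return hashlib.sha512(block).digest()


def same_content(block_a, block_b):
    """Treat two blocks as duplicates when their signatures match."""
    return signature(block_a) == signature(block_b)
```

  • In this sketch a signature match is taken as evidence that two blocks hold the same data, which is the comparison the deduplication and CSR lookups described herein rely on.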
  • As described above, storage systems described herein can back up and restore data to a client using a CSR. The data can include deduplicated data. The present disclosure describes certain embodiments that selectively store at least some of the data that is sent to the backup storage device in the CSR. Moreover, the data can be kept in the CSR for a predetermined period of time. For example, a client can communicate with a media agent associated with the backup storage devices to back up the data stored in the client at a predetermined time interval. The system can employ deduplication techniques to reduce the amount of data stored and the time and network resources used to back up the data.
  • The CSR can be employed to reduce the time and network resources used during restore operations. For instance, during backup of client data, the storage system stores a first copy of the data in the backup storage device and stores a second copy of the data in the CSR. The second copy may include a subset or signature of the first copy, and not all of the data in some cases. A hash or other signature corresponding to each data block can be stored along with the respective data block.
  • At least some of the data is restored from the CSR rather than from backup storage in some embodiments. For example, during restore, the storage system queries the CSR for the data blocks stored therein. The query can include a hash or other signature of a data block that is to be restored. If the data block is located in the CSR, the storage system restores the data block using the copy in the CSR. To determine if the data block is stored in the CSR, a signature, or hash, included in the query may be compared with signatures, or hashes, located in the CSR. A match indicates that the data block is stored in the CSR, and the corresponding data block can be restored to the client from the CSR rather than from secondary storage. On the other hand, if the data block is not located in the CSR, the storage system can restore the data block from secondary storage.
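  • The restore path described above can be sketched as follows. This is a minimal illustration only: `csr` and `secondary` are hypothetical in-memory dictionaries standing in for the CSR and secondary storage, each mapping a signature to the bytes of a data block.

```python
def restore_blocks(signatures, csr, secondary):
    """Restore each block from the CSR when present, else from secondary storage.

    `csr` and `secondary` are hypothetical dict stand-ins mapping
    signature -> block bytes. Returns the restored data and, for each
    block, which store supplied it.
    """
    restored, sources = [], []
    for sig in signatures:
        if sig in csr:                 # signature match: block is in the CSR
            restored.append(csr[sig])
            sources.append("csr")
        else:                          # miss: fall back to secondary storage
            restored.append(secondary[sig])
            sources.append("secondary")
    return b"".join(restored), sources
```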
  • In addition, the description includes embodiments for altering, or tuning, the CSR according to system preferences. For example, as network demand increases between the client and media agent as a result of restore operations, the storage system can determine that a threshold is met. In response to the threshold being met, the storage system can advantageously tune the CSR to accommodate the increased network demand. For example, the storage system can increase the storage capacity of CSR to reduce the network traffic between the client and the media agent. By dynamically tuning the CSR, the system can achieve further system performance improvement.
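  • One possible tuning rule is sketched below for illustration. The names `CsrConfig` and `tune_capacity`, the use of WAN restore traffic as the monitored quantity, and the growth factor are all assumptions of this example; the disclosure does not prescribe a particular threshold or adjustment.

```python
from dataclasses import dataclass


@dataclass
class CsrConfig:
    capacity_bytes: int        # current CSR storage capacity
    max_capacity_bytes: int    # upper bound the CSR may grow to


def tune_capacity(config, wan_restore_bytes, threshold_bytes, growth_factor=1.5):
    """Grow the CSR capacity when restore traffic meets the threshold.

    Returns the unchanged config when the threshold is not met.
    """
    if wan_restore_bytes < threshold_bytes:
        return config
    new_capacity = min(int(config.capacity_bytes * growth_factor),
                       config.max_capacity_bytes)
    return CsrConfig(new_capacity, config.max_capacity_bytes)
```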
  • According to other aspects, systems described herein bundle queries to the CSR. The communication channel between the CSR and the media agent may be a relatively high latency channel, and during restore operations, as the media agents query the CSR for various data blocks, system performance can be adversely affected. Thus, the storage system can bundle the queries to the CSR to efficiently utilize network resources. In an embodiment, instead of sending queries for groups of data blocks to the CSR serially, the storage system packages together and transmits multiple queries at the same time. Additional logic can be used to determine which and how many queries to bundle. For example, bundling can be implemented based on a predefined number of queries, network bandwidth, data/file location within the backup storage device or information store of the client, etc. Furthermore, the queries can be bundled according to a signature block value, an archive file identifier (AFID), a hash signature value, a location within the backup storage device, an offset, and/or a previous storage location within the information store and/or pseudo-randomly. Bundling the queries can reduce the overhead associated with each query, and free up network bandwidth for other operations.
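  • A simple form of query bundling, based on a predefined number of queries per bundle, can be sketched as follows. The functions `bundle_queries` and `answer_bundle` are hypothetical names for this example, and the CSR is modeled as a set of signatures; grouping by AFID, offset, or other criteria named above would replace the fixed-size slicing.

```python
def bundle_queries(signatures, bundle_size):
    """Group per-block queries into bundles of at most `bundle_size` signatures."""
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    signatures = list(signatures)
    return [signatures[i:i + bundle_size]
            for i in range(0, len(signatures), bundle_size)]


def answer_bundle(csr_signatures, bundle):
    """Answer one bundle: for each queried signature, is the block in the CSR?"""
    return [sig in csr_signatures for sig in bundle]
```

  • Each bundle costs one round trip over the high-latency channel instead of one round trip per data block, which is the overhead reduction the bundling is intended to achieve.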
  • The description further includes embodiments for reviewing age or other appropriate information related to data blocks before querying the CSR for those data blocks. As mentioned previously, during the restore operation many queries can be sent to the CSR. Rather than querying the CSR for all data blocks associated with a client, the storage system can determine which data blocks are likely stored in CSR and query the CSR for only those data blocks, thereby reducing the overall number of queries. For example, over time the data in CSR can be pruned (e.g., deleted or overwritten) according to client preferences. In one embodiment, the data blocks in CSR are overwritten after a predefined time interval, such as 10 days.
  • In order to track data block aging, each data block stored in CSR and the backup storage device can have age information associated with it. For example, the storage system can assign an archive file identifier (AFID) indicating an age associated with the data block. For example, AFIDs are assigned sequentially incrementing values in one configuration. The AFIDs may be unique to each backup or other storage operation session, to each data block, or can be assigned according to some other scheme, depending on the embodiment. The storage system can review the AFID associated with the data blocks to be restored and determine the relative age of the block based on various factors, such as the number of AFIDs assigned over a period of time, last AFID assigned vs. AFID of data block to be restored, etc. In this manner, the AFID can be used to determine the likelihood that the data block associated with the AFID is stored in the CSR. If it is likely that the data block is stored in the CSR, the storage system can query the CSR for the data block. Otherwise, the storage system can restore the data using the backup storage device without querying the CSR.
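  • As a non-limiting illustration of an AFID-based check, the sketch below estimates a block's age from the gap between the last AFID assigned and the block's AFID. It assumes sequentially incrementing AFIDs, a known average rate of AFID assignment, and a 10-day CSR retention interval; the function name `likely_in_csr` and its parameters are hypothetical.

```python
def likely_in_csr(block_afid, last_afid, afids_per_day, retention_days=10):
    """Estimate whether a block is young enough to still be in the CSR.

    Age is approximated as (last AFID assigned - block AFID) divided by the
    average number of AFIDs assigned per day.
    """
    if afids_per_day <= 0:
        raise ValueError("afids_per_day must be positive")
    estimated_age_days = (last_afid - block_afid) / afids_per_day
    return estimated_age_days <= retention_days
```

  • A block for which this returns false would be restored directly from the backup storage device, and no query for it would be sent to the CSR.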
  • Illustrative explanations of several terms used throughout the disclosure are provided herein. While these meanings apply to the respective terms as used with respect to certain embodiments, it will be appreciated that the meanings can vary depending on the embodiment. Additionally, the meanings of these and other terms used herein will be understood in view of their usage throughout the entirety of the disclosure.
  • Example Storage Systems Including Client-Side Repositories
  • FIG. 1 illustrates a block diagram of an example network storage architecture compatible with embodiments described herein. The system 100 is configured to perform storage operations on electronic data, including deduplicated data, in a computer network.
  • As shown, the storage system 100 includes a storage manager 108 and one or more of the following: a client 102, an information store 106, a data agent 104, a media agent 112, and a secondary storage device 116. The storage system 100 can further include one or more client-side repositories (CSR) 118, which will be described in greater detail below with reference to FIGS. 2 and 3. In addition, the storage system can also include one or more index caches as part of the media agent 112 and/or the storage manager 108. The index caches can indicate logical associations between components of the system, user preferences, management tasks, and other useful data, as described in greater detail in application Ser. No. 10/818,749, now U.S. Pat. No. 7,246,207, issued Jul. 17, 2007, herein incorporated by reference in its entirety.
  • As illustrated, the client computer 102 can be communicatively coupled with the information store 106, the storage manager 108, and/or the CSR 118. The information store contains data associated with the client 102. Although not illustrated in FIG. 1, the client 102 can also be in direct communication with the media agent 112 and/or the secondary storage device 116. For simplicity, and not to be construed as limiting, the components of storage system 100 are illustrated as communicating indirectly via the storage manager 108. However, all components of the storage system 100 can be in direct communication with each other or communicate indirectly via the client 102, the storage manager 108, the media agent 112, or the like.
  • With further reference to FIG. 1, the client computer 102 (also generally referred to as a client) contains data in the information store 106 that can be copied to and then restored from the secondary storage device 116 and/or the CSR 118. In an illustrative embodiment, the client 102 can correspond to a wide variety of computing devices including personal computing devices, laptop computing devices, hand-held computing devices, terminal computing devices, mobile devices, wireless devices, various electronic devices, appliances and the like. In an illustrative embodiment, the client 102 includes necessary hardware and software components for establishing communication with the other components of storage system 100. For example, the client 102 can be equipped with networking equipment and browser software applications that facilitate communication with the rest of the components from storage system 100. Although not illustrated in FIG. 1, each client 102 can also display a user interface. The user interface can include various menus and fields for entering storage and restore options. The user interface can further present the results of any processing performed by the storage manager 108 in an easy to understand format.
  • A data agent 104 can be a software module that is generally responsible for archiving, migrating, and recovering data of a client computer 102 stored in an information store 106 or other memory location. Each client computer 102 has at least one data agent 104 and the storage system 100 can support many client computers 102. The storage system 100 provides a plurality of data agents 104 each of which is intended to back up, migrate, and recover data associated with a different application. For example, different individual data agents 104 may be designed to handle Microsoft Exchange™ data, Microsoft Windows file system data, and other types of data known in the art. If a client computer 102 has two or more types of data, one data agent 104 may be implemented for each data type to archive, migrate, and restore the client computer 102 data.
  • The storage manager 108 is generally a software module or application that coordinates and controls the system. The storage manager 108 communicates with all elements of the storage system 100 including the client computers 102, data agents 104, the media agents 112, and the secondary storage devices 116, to initiate and manage system backups, migrations, recoveries, and the like. The storage manager 108 can be located within the client 102, the CSR 118, the media agent 112, or can be a software module within a separate computing device. In other words, the media agent 112, the client 102 and/or the CSR 118 can include a storage manager module. In one embodiment, the storage manager 108 is located in close proximity to the client 102 and communicates with the client 102 via a LAN. In another embodiment, the storage manager 108 communicates with the client 102 via a WAN. Similarly, in one embodiment, the storage manager 108 communicates with the media agent 112 via a LAN, and in another embodiment communicates with the media agent 112 via a WAN.
  • The storage manager 108 can also deduplicate the data that is being backed up in storage device 116. For example, the storage manager 108 can analyze individual data blocks being backed up, and replace duplicate data blocks with pointers to other data blocks already stored in the secondary storage device 116. To identify duplicate data blocks, the storage manager 108 can perform a hash or other signature function on each data block. The signatures of the different data blocks can be compared. Matching signatures of different data blocks can indicate duplicate data, which can be replaced with a pointer to previously stored data. Other components of storage system 100 can perform the deduplication techniques on the data blocks, such as the media agent 112, the client 102, the CSR 118, and/or storage device 116.
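  • The replacement of duplicate data blocks with pointers can be sketched as follows. This example is illustrative only: it keeps one copy of each distinct block in a hypothetical in-memory `store` keyed by a SHA-512 signature and records, for every input block, a pointer (the signature) to the stored copy.

```python
import hashlib


def deduplicate(blocks):
    """Store each distinct block once and return pointers for all blocks."""
    store = {}      # signature -> block bytes, kept once
    pointers = []   # one signature reference per input block, in order
    for block in blocks:
        sig = hashlib.sha512(block).hexdigest()
        if sig not in store:       # first occurrence: keep the data
            store[sig] = block
        pointers.append(sig)       # duplicates only add a pointer
    return store, pointers


def rehydrate(store, pointers):
    """Rebuild the original data by following the pointers in order."""
    return b"".join(store[sig] for sig in pointers)
```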
  • A media agent 112 is generally a software module that conducts data, as directed by the storage manager 108, between locations in the storage system 100. For example, the media agent 112 may conduct data between the client computer 102 and one or more secondary storage devices 116, between two or more secondary storage devices 116, etc. Although not shown in FIG. 1, one or more of the media agents 112 can also be communicatively coupled to one another. In some embodiments, the media agent communicates with the storage manager 108 via a LAN or SAN. In other embodiments, the media agent 112 communicates with the storage manager 108 via a WAN. The media agent 112 generally communicates with the secondary storage devices 116 via a local bus. In some embodiments, the secondary storage device 116 is communicatively coupled to the media agent(s) 112 via a Storage Area Network (“SAN”).
  • The secondary storage devices 116 can include a tape library, a magnetic media secondary storage device, an optical media secondary storage device, or other secondary storage device. The secondary storage devices 116 can further store the data according to a deduplication schema as discussed above. The storage devices 116 can also include a signature block corresponding to each stored data block. As will be described in greater detail below with reference to FIGS. 2 and 3, the signature block can include various information related to the data block and in one embodiment the signature block includes a signature of the data block, an archive file identifier (AFID), and an offset.
  • Further embodiments of storage systems such as the one shown in FIG. 1 are described in application Ser. No. 10/818,749, now U.S. Pat. No. 7,246,207, issued Jul. 17, 2007, which is hereby incorporated by reference in its entirety. In various embodiments, components of the storage system 100 may be distributed amongst multiple computers, or one or more of the components may reside and execute on the same computer.
  • Furthermore, components of the storage system 100 of FIG. 1 can also communicate with each other via a computer network. For example, the network may comprise a public network such as the Internet, virtual private network (VPN), token ring or TCP/IP based network, wide area network (WAN), local area network (LAN), an intranet network, point-to-point link, a wireless network, cellular network, wireless data transmission system, two-way cable system, interactive kiosk network, satellite network, broadband network, baseband network, combinations of the same or the like.
  • FIG. 2 illustrates a block diagram of an embodiment of a storage system 200 similar to storage system 100 of FIG. 1. The storage system 200 includes a client-side repository (CSR) 204, clients 208A-208 c, information stores 210 a-210 c, the media agents 212 a-212 b, and the secondary storage devices 214 a-214 b. Clients 208A-208 c, information stores 210 a-210 c, the media agents 212 a-212 b, and the secondary storage devices 214 a-214 b can be similar to the similarly named components of FIG. 1.
  • As described above with respect to FIG. 1, the various components can communicate directly or indirectly with each other. For simplicity, and not to be construed as limiting, line 220 illustrates communication occurring between any of clients 208 a-208 c and the CSR 204, line 230 illustrates communication occurring between any of the clients 208A-208 c and any of the media agents 212 a-212 b and/or the secondary storage device 214 a-214 b, and line 240 illustrates communication occurring between the CSR 204 and any of the media agents 212 a-212 b and/or the secondary storage devices 214 a-214 b. Although a storage manager is not illustrated in FIG. 2, communication can also be facilitated via a storage manager.
  • The storage system 200 also includes a client-side repository (CSR) 204, which can be made up of one or more storage devices. The CSR 204 can also include a computing device having one or more processors. As illustrated, the CSR 204 can be in communication with any of clients 208A-208 c (“client 208”), information stores 210 a-210 c (“information store 210”), the media agents 212 a-212 b (“media agent 212”) and/or the secondary storage devices 214 a-214 b (“secondary storage device 214”). The CSR 204 can communicate with these devices over any number of different network topologies including, but not limited to, the Internet, VPN, token ring or TCP/IP based network, WAN, LAN, an intranet, point-to-point link, wireless, cellular, wireless data transmission system, two-way cable system, interactive kiosk, satellite, broadband, baseband, combinations of the same, or the like.
  • In certain embodiments, the CSR 204 is part of a client 208. For example, the client 208 can include additional local storage configured as the CSR 204. In an embodiment, each client 208 has a dedicated CSR 204. For example, each client 208 can communicate with a separate CSR 204 via a LAN. In another embodiment, more than one client 208 shares a CSR 204. In other embodiments, the CSR 204 is in close proximity to the client 208 and communicates with the client 208 using a different network topology than the topology used for communication between the clients 208 and the media agents 212. For example, in an embodiment, the clients 208 communicate with the CSR 204 over a LAN and communicate with the media agents 212 over a WAN. In certain embodiments, communication between the clients 208 and the CSR 204 takes place at a higher data rate than communication between the clients 208 and the media agents 212. By storing data blocks in the CSR 204 the amount of traffic between the clients 208 and the media agents 212 (or storage manager) can be reduced in favor of traffic between the client 208 and the CSR 204. As such, the data blocks stored in the CSR 204 can more quickly or efficiently be restored to the client 208 during restore operations, and traffic over a WAN can be reduced. Furthermore, although not illustrated, the CSR 204 can communicate with the media agents 212 and/or the clients 208 via a storage manager.
  • In general, the CSR 204 is used by the storage system 200 to store data signature blocks and data blocks, which will be described in greater detail below with reference to FIG. 3, and can restore data blocks to the client 208 in the event of a restore operation. In some embodiments, the data blocks are deduplicated data blocks, and the signature blocks include signatures of the deduplicated data blocks. In some embodiments, the signatures are hash signatures. As mentioned above, restore times and network resources used can be reduced by locating the CSR 204 in close proximity to the client 208 and communicating via a LAN. Data not restored using the CSR 204 can be restored using the media agent 212 and the secondary storage device 214.
  • Data can be stored in the CSR 204 at any number of different intervals, such as upon request by a user, during each backup or other storage operation, at set intervals (e.g. daily, weekly, etc.), and the like. In an embodiment, the CSR 204 is populated during each backup or other secondary storage operation associated with a client 208.
  • Furthermore, the storage system can determine which data blocks to copy to the CSR 204 in a number of ways including, but not limited to, a storage policy such as a policy defining relative priorities associated with the clients, most recently used data blocks, file type, data/file location in the information store 210, backup data/file location in the secondary storage device 214, and the like. The CSR 204 can also store the signature blocks corresponding to each data block. In an embodiment, the CSR 204 is populated during each backup of the client 208 with the most recently used or changed data blocks. In such an embodiment, during backup, the most recently used or changed data blocks from the client 208 as well as corresponding signature blocks are stored in the CSR 204. Any number of different components can determine which data blocks are the most recently used or changed, including the clients 208, the media agents 212, a storage manager, the CSR 204, or the like. In some embodiments, all the data, including the data blocks copied to the CSR 204, is also backed-up in the secondary storage device 214. Furthermore, any one of the various components of the storage system 200 can generate the signature for each data block, such as the client 208, the CSR 204, the media agent 212, and/or a storage manager.
  • In one embodiment, upon restoring the data of the client 208, the most recently used data blocks are retrieved from the CSR 204 and the rest of the data blocks are retrieved from the secondary storage device 214. The restore request and determining the location from which to restore the data can be accomplished using any number of methods implemented by any one, or a multiple of, the components of storage system 200. In an embodiment a storage manager requests a restore for a particular client 208 and selects the appropriate media agent to conduct the restore. The selected media agent 212 determines which data blocks are to be restored from the CSR 204 and which data blocks are to be restored from the secondary storage device 214.
  • In such an embodiment, to determine which data blocks are stored in the CSR 204, the media agent 212 can query the CSR 204. A query can include a request for a specific data block, or an acknowledgement that the specific data block is stored in the CSR 204, based on a signature of that data block. In response to the query, the CSR 204 can check a signature block repository to determine if the data block requested is in the CSR 204. In checking the signature block repository, the CSR 204 can compare the signature received in the query with signatures stored in the signature block repository. A match indicates the data block is stored in the CSR 204. If the data block is stored in the CSR 204, the CSR 204 supplies the data block to the client 208. If the data block is not stored in the CSR 204, the media agents 212 can use the secondary storage device 214 to restore the data block to the client 208. The media agents 212 can also include an index of which data blocks are stored in the CSR 204. In this manner, the media agent 212 can use the index to determine which data blocks to restore using the CSR 204 and which data blocks to restore using the secondary storage device 214.
  • In an embodiment, the media agent 212 can use information regarding data blocks, such as an archive file identifier (AFID), which will be described in greater detail below, to determine if it is likely that a data block is in the CSR 204. Based on the determination, the media agent 212 can determine whether to query the CSR 204 or instead to restore the data block using the secondary storage device 214 and without querying the CSR 204.
  • In another embodiment, the media agent 212 reduces network traffic by bundling the queries to the CSR 204, e.g., by transmitting multiple queries at the same time, rather than one at a time.
  • Although the above-embodiment is described in terms of the media agent 212 implementing the restore request, determining which data blocks to restore from the CSR 204, and determining which data blocks to restore from the secondary storage device 214, any of the other components of storage system 200 can implement this process, including, but not limited to, the client 208, the CSR 204, and the secondary storage device 214. For example, the client 208 can request a restore and then determine which data blocks should be restored from the CSR 204 and which data blocks should be restored from the secondary storage device 214. Alternatively, in one embodiment the client 208A requests a restore on behalf of the client 208B, and similarly determines from what location the data blocks should be restored. In another embodiment, a client 208 can request a restore and the media agent 212 can determine the location of the data blocks for the restore and manage the restore. Various components can be used to implement the restore request and determining the location of the data blocks to be restored and managing the restore without departing from the spirit and scope of the description.
  • Furthermore, the above example describes the CSR 204 being populated with the most recently used or changed data blocks. However, many variations exist for determining which data blocks to store in the CSR 204, and thus which data blocks to restore. For example, in an embodiment, the CSR 204 can be populated based on user-determined criteria, such as specific files and/or folders, or file types. Furthermore, the data blocks stored in the CSR 204 can be based on the original location of the data blocks within the information store 210 or the location of the backed-up copy of the data blocks in the secondary storage device 214, and the like. In addition, client preference can be used to determine which data blocks to store in the CSR 204. For example, in an embodiment, the clients can be given relative priorities with respect to one another. Thus, where client 208A has a higher priority than client 208B, the data blocks from client 208A can be given higher storage priority than the data blocks from client 208B. Accordingly, the system may store data blocks from the client 208A in the CSR 204 for longer periods of time or overwrite data blocks in the CSR 204 that came from the client 208B with data blocks from the client 208A.
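  • The priority-based overwriting described above might be sketched as follows. The function `choose_victim` and the representation of CSR contents as `(signature, client_priority)` pairs are assumptions for this example, in which a larger number denotes a higher-priority client.

```python
def choose_victim(entries, incoming_priority):
    """Pick a stored block to overwrite with a block from a higher-priority client.

    `entries` is a list of (signature, client_priority) pairs describing the
    blocks currently in the CSR. Returns the signature of the lowest-priority
    block if the incoming block outranks it, otherwise None.
    """
    if not entries:
        return None
    signature, priority = min(entries, key=lambda entry: entry[1])
    return signature if priority < incoming_priority else None
```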
  • In another embodiment, upon receiving a restore request from a client 208, the CSR 204 restores all the data blocks stored therein that are related to the client 208. In such an embodiment, following the restore of the data blocks from the CSR 204, the client 208 (or CSR 204) can supply the media agent 212 with an index of the data blocks restored by the CSR 204. The media agent 212 can restore the remaining data blocks using the secondary storage device 214. In yet another embodiment, upon receiving a restore request from a client 208, the CSR 204 supplies the media agent 212 with an index of the data blocks stored in the CSR 204. The media agent 212 determines which data blocks are to be restored from the CSR 204 and which data blocks are to be restored from the secondary storage device 214. In certain embodiments, a storage manager, the client 208, and/or a different client make the determination instead of the media agent 212.
  • Over time, the data blocks stored in the CSR 204 may be pruned or overwritten based on any of the criteria mentioned above. Thus, overwriting data blocks can be based on time, client preferences, or other criteria as described above. In an embodiment, the data blocks are overwritten based on time. For example, data blocks are stored in the CSR 204 for 10 days and then deleted, or overwritten. In other embodiments, the data blocks are overwritten at different time intervals, such as daily, weekly, monthly, or some other pre-defined time interval. In another embodiment, as data blocks change within an information store 210, they are overwritten in the CSR 204. Thus, the CSR 204 can have the most up-to-date version of the data blocks in the information store 210.
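  • The time-based pruning described above might be sketched as follows, assuming each stored block records the day on which it was written to the CSR. The `StoredBlock` type, the `stored_day` field, and the choice to drop a block once its age reaches the retention interval are assumptions made for illustration.

```python
from dataclasses import dataclass

RETENTION_DAYS = 10  # example interval taken from the text


@dataclass
class StoredBlock:
    signature: str
    stored_day: int  # day number on which the block was written (hypothetical field)


def prune_expired(blocks, today, retention_days=RETENTION_DAYS):
    """Keep only the blocks whose age in days is below the retention interval."""
    return [b for b in blocks if today - b.stored_day < retention_days]


blocks = [StoredBlock("old", 0), StoredBlock("recent", 5)]
kept = prune_expired(blocks, today=10)
```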
  • Example Client-Side Repository
  • FIG. 3 is a block diagram illustrative of an expanded view of a client-side repository associated with the storage system of FIGS. 1 and 2. As illustrated, the client-side repository 204 can be made up of at least two repositories: a signature block repository 302 and a data repository 304, which will now be explained in greater detail.
  • The signature block repository 302 includes a signature block 306 for each data block in the data repository 304. Although a variety of implementations are possible, the signature block 306 of one embodiment includes a signature 308, an archive file identifier (AFID) 310, and an offset 312.
  • When archiving or otherwise copying data blocks, a signature 308 can be derived for a specific data block by performing a hash or other function on the data block. The signature 308 is used to uniquely or substantially uniquely identify the data block and/or determine the likelihood that the data block is a duplicate of an already stored data block with the same signature 308. In one embodiment, the signature 308 is a deduplication signature derived using a deduplication function, such as a hash function.
  • In an embodiment, the SHA-512 algorithm is used on a 64 kB or 128 kB data block to derive the signature 308. The resulting signature 308 is 256 bytes and can be used for deduplication purposes. As illustrated in FIG. 3, in an embodiment, the signature 308 is part of a signature block 306 stored in the CSR 204. Hash functions other than SHA-512, as well as non-hash functions, can be used on the data blocks to derive the signature 308. In addition, different sized signatures 308 may be used without departing from the spirit and scope of the description. Additionally, the signatures 308 for each of the backed up data blocks are also stored at the secondary storage device in certain embodiments. In other cases, the signatures 308 are generated on the fly on a per-use basis instead of being stored at the CSR 204 and/or the secondary storage device.
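  • For illustration only, deriving a signature per fixed-size data block can be sketched with Python's standard `hashlib` module. The function name `block_signatures` and the returned (offset, digest) layout are assumptions. Note that `hashlib.sha512` yields a 64-byte (512-bit) digest; any additional fields or padding that an implementation might use to reach a different signature size are outside this sketch.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # 64 kB data blocks, one of the sizes mentioned in the text


def block_signatures(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and return a list of (offset, digest) pairs."""
    signatures = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        signatures.append((offset, hashlib.sha512(block).digest()))
    return signatures


# Two identical full blocks followed by a short trailing block.
data = b"a" * BLOCK_SIZE * 2 + b"b" * 10
sigs = block_signatures(data)
```

  • Identical blocks produce identical digests, which is what allows a signature comparison to flag a likely duplicate.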
  • The AFID 310 according to certain embodiments provides aging information associated with the data blocks. For example, the AFID 310 in one embodiment includes a number indicative of when the data block was last backed up (or replicated). For instance, the AFID may be a unique identifier associated with a particular backup, backup catalog, or other storage operation associated with the data block. The AFID 310 in some embodiments is generated during a backup operation, e.g., when the data block is backed up. During a restore, the AFID 310 can be used as a handle to get and restore the data block. As shown, the AFIDs 310 can reside in the signature block repository 302 of the CSR 204 and each AFID 310 can be embedded with or otherwise be associated with the hash signature 308 and/or offset 312 of the corresponding data block. Additionally, the AFID 310 in some embodiments is embedded in or is otherwise associated with the respective data blocks, e.g., in the data repository 304 of the CSR 204. In some alternative embodiments, the AFIDs 310 are stored separately from the data blocks in the CSR 204, or are stored at the secondary storage device instead of or in addition to being stored in the CSR 204.
  • The offset 312 can be used to identify the actual location of the data block in storage. The offset 312 can be made up of one or more bytes of data, and can be used by the CSR 204 or other system component to locate a data block during a restore operation. The offset 312 can be populated during backup operations (or replication or other copy operations) once the location where the data block is to be stored is known. As shown, the offsets 312 can reside in the signature block repository 302 of the CSR 204 and each offset 312 can be embedded with or otherwise be associated with the hash signature 308 and/or AFID 310 of the corresponding data block. Additionally, the offset 312 in some embodiments is embedded in or is otherwise associated with the respective data blocks, e.g., in the data repository 304 of the CSR 204.
  • The signature block 306 can have fewer or more parts than what is illustrated in FIG. 3. For example, in an embodiment, the signature block 306 can include only a signature 308. In another embodiment, the signature block 306 can include additional information instead of or in addition to the signature 308, AFID 310 and offset 312. For example, the signature block 306 can include information regarding the source of the data block.
  • The data repository 304 contains one or more of the data blocks from the information store 210 of the client 208. The data blocks can be stored in any type of format. In one embodiment, the data blocks are deduplicated data blocks and are stored according to a deduplication scheme. Furthermore, the data blocks for multiple clients 208 can be stored in the data repository 304 of the CSR. The data repository 304 can also include an index of the source client 208 for the different data blocks. Although illustrated as two separate repositories, the data repository 304 and the signature block repository 302 can be a single, co-mingled repository. For example, in an embodiment, a signature block precedes each data block. In another embodiment, the signature blocks are all contained in a group separate from the data blocks. In such an embodiment, each signature block can include a pointer to the corresponding data block, or the offset 312 can indicate the location of the corresponding data block.
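  • One possible in-memory layout for the two repositories of FIG. 3 is sketched below. The class and method names, the use of a `bytearray` as the data repository, and the separate length table are assumptions for illustration. The signature block holds a signature, an AFID, and an offset, and the offset locates the data block in the data repository.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SignatureBlock:
    signature: bytes  # signature 308
    afid: int         # archive file identifier 310
    offset: int       # offset 312 into the data repository


class ClientSideRepository:
    def __init__(self) -> None:
        self.signature_blocks: Dict[bytes, SignatureBlock] = {}  # signature block repository 302
        self.data = bytearray()                                   # data repository 304
        self._lengths: Dict[bytes, int] = {}                      # block lengths (assumed helper)

    def store(self, signature: bytes, afid: int, block: bytes) -> SignatureBlock:
        """Store a block once; a repeated signature returns the existing entry."""
        existing = self.signature_blocks.get(signature)
        if existing is not None:
            return existing
        entry = SignatureBlock(signature, afid, len(self.data))
        self.data.extend(block)
        self.signature_blocks[signature] = entry
        self._lengths[signature] = len(block)
        return entry

    def fetch(self, signature: bytes) -> Optional[bytes]:
        """Return the block for a signature, or None if it is not stored."""
        entry = self.signature_blocks.get(signature)
        if entry is None:
            return None
        end = entry.offset + self._lengths[signature]
        return bytes(self.data[entry.offset:end])


csr = ClientSideRepository()
first = csr.store(b"sig-1", afid=7, block=b"hello")
second = csr.store(b"sig-2", afid=7, block=b"world!")
duplicate = csr.store(b"sig-1", afid=8, block=b"hello")
```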
  • With reference now to FIGS. 4A-4B, the interaction between the various components of a storage system is illustrated with respect to example backup and restore operations, respectively. For example, the storage system may be similar to or the same as either of the storage systems 100, 200 of FIGS. 1 and 2, respectively. For purposes of the example, however, the illustrated example has been simplified to include interaction between one client system 402B and one media agent 408B and associated storage device 410B. In other cases, any of the media agents 408A, 408B and secondary storage devices 410A, 410B, alone or in combination, can be used for backing up and restoring data blocks from any combination of the client systems 402A-C. Client systems 402A-C are similar to the clients discussed with reference to FIGS. 1 and 2. Furthermore, although not shown in FIGS. 4A-4B, information stores (e.g., primary storage) can be associated with each client system.
  • FIG. 4A is a state diagram illustrative of the interaction between the various components of the storage system 400 during a backup operation. In an embodiment, a client system 402B initiates a backup of data blocks stored within an information store (not shown) that is associated with the client system 402B.
  • In initiating the backup, the client system 402B transmits the data blocks to be backed up to both the CSR 404 and the storage manager 406. In another embodiment, the client system 402B transmits the data blocks to be backed up to the storage manager 406. In turn, the storage manager 406 transmits the data blocks to the CSR 404. In one embodiment, the data blocks are transmitted to the storage manager 406 and the CSR 404 simultaneously, or at approximately the same time. In another scenario, the data blocks are transmitted first to either the CSR 404 or the storage manager 406 and then to the other component.
  • The backup (or other storage operation) can be initiated in many different ways, such as at predetermined time intervals, upon client request, upon storage manager request, or upon a CSR request. For example, the backup of the client system 402B can occur daily, weekly, monthly or at some other predetermined time interval. Alternatively, the backup can occur based on the client or system administrator selecting the backup from a user interface. In another embodiment one client can initiate the backup for a different client.
  • The system 400 can determine which data blocks to backup in the CSR 404 in any number of different ways. In some embodiments, all of the data from the client system 402B is copied to the CSR 404, e.g., as it is copied to the secondary storage device 410B. In such embodiments, however, the CSR 404 generally may not be able to retain the entire data image to be backed up. As such, the system 400 implements a data retention policy for the CSR 404. Although a wide variety of retention policies can be used, in one case the system 400 implements a first-in first-out (FIFO) policy in which the least recently written data is pushed out of the CSR 404 in favor of newly written data.
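  • A first-in first-out retention policy of the kind mentioned above can be sketched with Python's `collections.OrderedDict`. The class name `FifoCsr` and the choice to measure capacity as a number of blocks are assumptions made for illustration.

```python
from collections import OrderedDict


class FifoCsr:
    """Holds at most `capacity_blocks` blocks; the oldest written block is evicted first."""

    def __init__(self, capacity_blocks: int) -> None:
        if capacity_blocks < 1:
            raise ValueError("capacity must be at least one block")
        self.capacity = capacity_blocks
        self._blocks = OrderedDict()  # signature -> block, in write order

    def write(self, signature: str, block: bytes) -> None:
        if signature in self._blocks:
            return  # already stored; write order is left unchanged
        while len(self._blocks) >= self.capacity:
            self._blocks.popitem(last=False)  # push out the least recently written block
        self._blocks[signature] = block

    def __contains__(self, signature: str) -> bool:
        return signature in self._blocks


fifo = FifoCsr(capacity_blocks=2)
for name in ("a", "b", "c"):
    fifo.write(name, name.encode())
```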
  • In other embodiments, only some of the data is stored in the CSR 404. Which data blocks to store can be determined based on one or more factors, such as most recently used data blocks, location of the backed-up data blocks in the secondary storage device 410B, the communication path between the secondary storage device 410B and the client system 402B, file type of the data blocks, location of data blocks in the information store of the client system 402B or folder location, client preferences, client priorities, and the like.
  • Additionally, the data can be written to the CSR 404 according to a deduplication policy in which references are written to the CSR 404 in place of data blocks and/or signature blocks previously written to the CSR 404.
  • With continued reference to FIG. 4A, the CSR 404 stores the data blocks and a signature block associated with each data block. The signature block can be determined by the CSR 404, the storage manager 406, the media agent 408B, and/or the client system 402B. In an embodiment where the client system 402B calculates the signature block, the client system 402B can transmit the signature block along with the data block to the CSR 404 and/or the storage manager 406. As discussed previously with reference to FIG. 3, the data blocks and signature blocks can be stored in many different ways and formats without departing from the spirit and scope of the description.
  • Upon receiving the data blocks for backup, the storage manager 406 proceeds to store the data blocks as described above with reference to FIG. 1 using the media agent 408B and the secondary storage device 410B. As described, the data blocks can be stored using deduplication schemes. In addition, the secondary storage device 410B can also store signature blocks corresponding to each data block. The signature blocks can include a signature, an AFID and an offset, similar to the signature blocks described above with reference to FIG. 3.
  • FIG. 4B is a state diagram illustrative of the interaction between the various components of the storage system of FIGS. 1 and 2 during a restore operation. In an embodiment, the client system 402B initiates a restore by requesting a restore of its data from the storage manager 406. The restore request can be initiated by any one of several components of the storage system 400. For example, the restore request can be initiated by a client 402A or 402C on behalf of the client system 402B. Alternatively, the storage manager 406 or the CSR 404 can initiate the restore without a request from the client system 402B. Such a restore may initiate upon the occurrence of some predetermined criteria, such as a power outage, information store error, some other condition that causes a client system to go off-line, addition of a new client, or the like. In one embodiment, the data from the client system 402B can be restored to another client 402A, 402C or a new client.
  • In response to the restore request, the storage manager 406 queries the CSR 404 for data blocks associated with the client system 402B, although the query can come directly from the media agent 408B in other configurations. The query contains a signature of a specific data block to be restored. In some embodiments, the storage manager 406 maintains an index of the data blocks stored in the CSR 404 based on the responses to the queries, and uses the index to determine which data blocks to restore using the CSR 404 and which data blocks to restore using the secondary storage device 410B. The index can include signature blocks of the data blocks stored in the CSR 404.
  • In other embodiments, as will be described below with respect to FIG. 8, the storage manager 406 bundles the queries to the CSR 404, rather than transmitting each query separately. In other embodiments, the storage manager 406 queries the CSR 404 for all the data blocks associated with the client system 402B at once.
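  • As a rough sketch of bundling, the signatures to be queried can be grouped into fixed-size batches so that each transmission to the CSR carries several queries. The function name `bundle_queries` and the fixed `bundle_size` parameter are hypothetical; the description does not specify how bundles are sized.

```python
def bundle_queries(signatures, bundle_size):
    """Group signatures into lists of at most bundle_size, preserving order."""
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    return [signatures[i:i + bundle_size] for i in range(0, len(signatures), bundle_size)]


bundles = bundle_queries(["s1", "s2", "s3", "s4", "s5"], bundle_size=2)
```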
  • In response to the queries from the storage manager 406, the CSR 404 determines which of the data blocks requested are stored therein and notifies the storage manager 406. To determine which of the data blocks are stored in the CSR 404, the CSR 404 can compare the signatures received in the queries with the signatures in a signature block repository. Matching signatures indicate the data block is stored in the CSR 404. The CSR 404 can notify the storage manager 406 which data blocks are found, and begin transmitting the data blocks stored therein to the client system 402B. In one embodiment, the CSR 404 responds to the queries with an index of all the queried data blocks stored therein that are associated with the client system 402B, allowing the storage manager 406 to determine which data blocks to restore using the media agent 408B and the secondary storage device 410B. In an embodiment, the index includes a signature of each data block found in the CSR 404.
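  • The signature comparison described above amounts to partitioning the requested signatures into those found in the CSR and those that must come from secondary storage. The following sketch assumes signatures are hashable values; the function name `partition_by_csr` is illustrative.

```python
def partition_by_csr(requested, csr_signatures):
    """Split requested signatures into (found_in_csr, remaining), preserving order."""
    csr_set = set(csr_signatures)
    found = [sig for sig in requested if sig in csr_set]
    remaining = [sig for sig in requested if sig not in csr_set]
    return found, remaining


found, remaining = partition_by_csr(["s1", "s2", "s3"], ["s3", "s1", "s9"])
```

  • Under this sketch, `found` plays the role of the index returned to the storage manager, and `remaining` lists the data blocks to restore using the media agent and the secondary storage device.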
  • It will be appreciated that the hand-shaking and flow of data between the components can take a variety of forms. For example, the CSR 404 may await instructions from the storage manager 406 before transmitting any data blocks to the client system 402B. The CSR 404 in one scenario transmits the data blocks stored therein to the storage manager 406 instead of directly to the client system 402B, and the storage manager 406 in turn transmits the data blocks to the client system 402B. In another embodiment, the storage manager 406 generates and maintains an index of the data blocks stored in the CSR 404 as the data is written to and/or cycled out of the CSR 404. In such an embodiment, the storage manager 406 uses the index to determine which data blocks to query and/or restore using the CSR 404 and which data blocks to restore using the secondary storage device 410B.
  • Upon receiving the response from the CSR 404 regarding the data blocks stored therein, the storage manager 406 restores the remaining data blocks using the media agent 408B and the secondary storage device 410B. The remaining data blocks are retrieved from the secondary storage device 410B and restored to the client system 402B. Although not illustrated, the secondary storage device 410B can communicate directly with the client system 402B to restore the data blocks rather than transmitting the data via the media agent 408B and/or the storage manager 406. Furthermore, as described previously with reference to FIG. 4A, any of the media agents 408A, 408B and the secondary storage devices 410A, 410B can be used to backup and restore data blocks.
  • One skilled in the art will appreciate that not all of the components of storage system 400 are necessary to store and restore data blocks, and that the processes described herein can be implemented in any number of ways without departing from the spirit and scope of the description. For example, in an embodiment, there is no storage manager 406. In such an embodiment, the client system 402B can query the CSR 404 for the data blocks contained therein and retrieve the remaining data blocks using the media agents 408A, 408B and the secondary storage devices 410A, 410B. In an alternative embodiment, the media agent 408B receives the restore request from the client system 402B, performs the query of the CSR 404, and retrieves the data blocks not found in the CSR 404 from the secondary storage device 410B. In yet another embodiment, the CSR 404 receives the restore request from the client system 402B, restores the data blocks stored therein to the client system 402B, and transmits an index of the data blocks restored to the media agent 408B. In turn, the media agent 408B uses the index to retrieve the remaining data blocks from the secondary storage device 410B and restore them to the client system 402B. In yet another embodiment, the media agent 408B contains an index of the data blocks stored within the CSR 404. The CSR 404 and the media agent 408B receive the restore request. The CSR 404 restores the data blocks stored therein to the client system 402B. Using the index, the media agent 408B retrieves and restores the data blocks not stored in the CSR 404 from the secondary storage device 410B to the client system 402B. One skilled in the art will understand that the data can be stored in any storage device 410A, 410B and can be retrieved using any media agent 408A, 408B without departing from the spirit and scope of the description.
  • FIGS. 5-8 are flow diagrams illustrative of various processes or routines that the storage system 400 can carry out. FIG. 5 is a flow diagram of a routine implemented by the storage system for processing a restore request and restoring data blocks to a client using a client-side repository. FIG. 6 is a flow diagram of a routine implemented by the storage system for tuning the client-side repository. FIG. 7 is a flow diagram of a routine implemented by the storage system for restoring data blocks to a client using a client-side repository and AFID. FIG. 8 is a flow diagram of a routine implemented by the storage system for bundling queries for a client-side repository.
  • FIG. 5 is a flow diagram illustrative of one embodiment of a routine 500 implemented by a storage system for processing a restore request and restoring data to a client using a client-side repository. For example, routine 500 can apply to embodiments described in reference to FIGS. 1, 2, 3, 4A, and 4B. One skilled in the relevant art will appreciate that the elements outlined for routine 500 may be implemented by one or many computing devices/components that are associated with the storage system 400. For example, routine 500 can be implemented by any one, or a combination, of the client 402 (i.e. any one of the clients 402A-402C), the CSR 404, the storage manager 406, the media agent 408 (i.e. any one of the media agents 408A-408B) and/or the secondary storage device 410 (i.e. any one of the secondary storage devices 410A-410B). Accordingly, routine 500 has been logically associated as being generally performed by the storage system 400, and thus the following illustrative embodiments should not be construed as limiting.
  • At block 502, the storage system receives a restore request. The request can be received from or by a client 402, a new client, one client on behalf of another, a storage manager 406, the media agent 408, or the like. The request can occur automatically upon a reboot, information store error, lost data, predetermined time interval, user selection, or the like.
  • At block 504, the storage system sends multiple queries to the CSR 404 for data blocks stored therein. In one embodiment, each query comprises a signature block of a data block being searched for. As discussed previously, the CSR 404 contains data blocks previously stored during a backup or other function, as well as signature blocks corresponding to each data block. In an embodiment, the data blocks are deduplicated blocks and the signature blocks are deduplication signature blocks. Upon receiving each query, the CSR 404 checks the data blocks stored therein using the received signature block and a signature block repository, as described above with reference to FIGS. 3 and 4.
  • At block 506, the storage system determines if a signature block indicates the data block is stored in the CSR 404. In an embodiment, the storage system compares the received signature block with the signature blocks found in the signature block repository. In one embodiment, the signature block indicates the data block is stored in the CSR 404 if a signature block in the signature block repository matches the signature block of the query. If the signature block indicates the data block is stored in the CSR 404, the data block is restored to the client using the CSR 404, as illustrated in block 508. Upon restoring the data block using the CSR 404, the storage system 400 continues to query the CSR 404 for additional data blocks contained therein until all queries have been completed.
  • On the other hand, if the signature block does not indicate that the data block is stored in the CSR 404, the storage system restores the data block using the secondary storage device 410. Upon restoring the data block using the secondary storage device 410, the storage system 400 continues to query the CSR 404 for additional data blocks contained therein, until all queries have been completed.
  • One skilled in the art will appreciate that routine 500 can include fewer, more, or different blocks than those illustrated in FIG. 5. For example, rather than restoring each data block at each iteration, storage system 400 can restore all data blocks once all queries are finished. Furthermore, while some data blocks are being restored, additional queries can continue. Thus, some blocks may be performed concurrently with others.
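  • The per-block loop of routine 500 can be sketched as follows, with plain dictionaries standing in for the CSR 404 and the secondary storage device 410. The function name `restore` and the dictionary-based stores are assumptions for illustration; a real system would query components over a network rather than look up local mappings.

```python
def restore(signatures, csr, secondary):
    """Restore blocks in order, preferring the CSR and falling back to secondary storage.

    csr, secondary: dicts mapping signature -> block bytes (stand-ins for real components).
    Returns the restored data and the source used for each block.
    """
    restored, sources = [], []
    for sig in signatures:              # block 504: one query per data block
        block = csr.get(sig)            # block 506: is there a matching signature in the CSR?
        if block is not None:
            sources.append("csr")       # block 508: restore using the CSR
        else:
            block = secondary[sig]      # block 510: restore using secondary storage
            sources.append("secondary")  # (a missing signature raises KeyError in this sketch)
        restored.append(block)
    return b"".join(restored), sources


csr_store = {"s1": b"AA"}
secondary_store = {"s1": b"AA", "s2": b"BB", "s3": b"CC"}
data, sources = restore(["s1", "s2", "s3"], csr_store, secondary_store)
```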
  • FIG. 6 is a flow diagram illustrative of one embodiment of a routine 600 implemented by the storage system for tuning the client-side repository. For example, routine 600 can apply to embodiments described in reference to FIGS. 1, 2, 3, 4A, and 4B. One skilled in the relevant art will appreciate that the elements outlined for routine 600 may be implemented by one or many computing devices/components that are associated with the storage system 400. For example, routine 600 can be implemented by any one, or a combination, of the client 402, the CSR 404, the storage manager 406, the media agent 408 and/or the secondary storage device 410. Accordingly, routine 600 has been logically associated as being generally performed by the storage system 400, and thus the following illustrative embodiments should not be construed as limiting.
  • At block 602, the storage system 400 monitors the usage of the CSR 404. The monitoring can occur during backup, restore or other operations, and can be done by any number of components of the storage system including, but not limited to the client 402, the storage manager 406, the media agent 408, or even the CSR 404 itself. In monitoring the usage of the CSR 404, the storage system 400 can generate a metric. Thus, to monitor the usage of the CSR 404, the storage system can analyze the generated metric. The metric can relate to a total amount of data transmitted between the client-side repository and the client system, an amount of data transmitted between the client-side repository and the client system within a predefined time interval, a number of restore operations, a data transmit rate, an amount of network bandwidth used during restore operations, an amount of time used during restore operations, a destination of the data blocks during the restore operation, and the like.
  • At decision block 604, the storage system 400 determines if a threshold condition is triggered. In one embodiment, the storage system 400 determines if the metric exceeds a predefined threshold. In one embodiment, the threshold condition is a threshold amount or size of data transmitted, e.g., within a particular time interval. In another embodiment, the threshold condition is a threshold number of restore requests, which may also be within a particular time interval. The threshold condition may also be a maximum or minimum amount of time taken to transmit data, a percentage of network bandwidth used during restore requests, competing needs for the network, and the like. In general, any combination of the above threshold conditions or other appropriate threshold conditions can be used. For example, in one case, the threshold condition is a predefined amount of data restored from the secondary storage device 410 to the client 402. If storage system 400 determines that the threshold condition is not triggered, the storage system 400 continues to monitor the usage of the CSR 404, as illustrated in block 602. In this manner, if a relatively high percentage of data is being restored from secondary storage rather than from the CSR, the system 400 can react in an appropriate fashion.
  • Alternatively, if the storage system 400 determines that the threshold condition is triggered, the storage system 400 tunes at least one CSR 404 parameter. The parameter can include, without limitation, the storage capacity or size of the CSR, the function used to generate the signatures, a hash function, a data transfer rate, and client storage priority. The storage system 400 can tune the CSR 404 parameter in one of many different ways, such as increasing the storage capacity of the CSR 404, changing the function used to generate signatures, changing the hash function used to determine the signature hashes, changing storage parameters, changing which clients use the CSR 404, altering the priority given to data from one client relative to another client, and the like. In further configurations, data may be pruned (e.g., deleted or overwritten) from the CSR 404 in response to the threshold condition being triggered.
  • These changes can be carried out automatically, based upon the threshold being triggered, or upon a client request. For example, in one embodiment, the threshold condition is a predefined amount of data being restored using the secondary storage device 410. Once storage system 400 detects the threshold condition is met, it tunes the CSR 404 to better accommodate the storage needs of the client 402. In one embodiment, storage system 400 tunes the CSR 404 by increasing its storage capacity. Increasing the storage capacity of the CSR 404 can reduce the number of requests made to the secondary storage device 410 to restore data, thereby decreasing the restore time of the client 402 and increasing available network bandwidth. Storage capacity of the CSR 404 can be increased by allocating additional media to the CSR 404 or by pruning the CSR 404, e.g., by deleting data that is used relatively infrequently.
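  • A minimal sketch of routine 600 follows, using the fraction of restored bytes served from secondary storage as the monitored metric and CSR capacity as the tuned parameter. The `CsrUsage` type, the 0.5 threshold, and the doubling growth factor are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CsrUsage:
    bytes_from_csr: int = 0
    bytes_from_secondary: int = 0

    def secondary_fraction(self) -> float:
        """Fraction of restored bytes that came from secondary storage."""
        total = self.bytes_from_csr + self.bytes_from_secondary
        return self.bytes_from_secondary / total if total else 0.0


def tune_capacity(capacity: int, usage: CsrUsage, threshold: float = 0.5, growth: int = 2) -> int:
    """Return a larger capacity when the threshold condition is triggered (blocks 604 onward)."""
    if usage.secondary_fraction() > threshold:
        return capacity * growth
    return capacity  # condition not triggered; keep monitoring (block 602)


heavy_secondary = CsrUsage(bytes_from_csr=100, bytes_from_secondary=300)
new_capacity = tune_capacity(1000, heavy_secondary)
```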
  • FIG. 7 is a flow diagram illustrative of one embodiment of a routine 700 implemented by the storage system for restoring a client using AFIDs associated with the data blocks stored in the CSR 404. For example, routine 700 can apply to embodiments described in reference to FIGS. 1, 2, 3, 4A, and 4B. One skilled in the relevant art will appreciate that the elements outlined for routine 700 may be implemented by one or many computing devices/components that are associated with the storage system 400. The process 700 can be implemented by any one, or a combination, of the client 402, the CSR 404, the storage manager 406, the media agent 408 and/or the secondary storage device 410. Accordingly, routine 700 has been logically associated as being generally performed by the storage system 400, and thus the following illustrative embodiments should not be construed as limiting.
  • Similar to block 502 of FIG. 5, at block 702, the storage system receives a request to restore data to a client system. In an embodiment, the data is made up of a plurality of deduplicated data blocks. Upon receiving the request, the storage system 400 in one embodiment retrieves a signature block of at least one of the deduplicated data blocks to be restored, and extracts a storage indicator from the signature block. The signature block may be organized in a manner similar to the signature block shown in FIG. 3, for instance, or in some other manner. In one embodiment, the storage system retrieves just the storage indicator, and not an entire signature block. The storage indicator provides aging information or information related to some other parameter associated with the data block. In one embodiment, the storage indicator is an AFID. Whether or not the storage indicator is associated with the signature block, the storage indicator can be retrieved in a variety of manners. For instance, the storage indicator for each data block may be received along with the restore request, or the media agent may retrieve the storage indicator by consulting a separate table or index, e.g., by using a signature associated with the data block. In various embodiments, the storage indicator may be transmitted from the client-side repository, e.g., over the WAN, may be retrieved from local storage by the media agent or other component, or may be transmitted to the media agent over a LAN, e.g., from another media agent, from the storage manager, or from secondary storage. In one embodiment, the media agent requests the storage indicator from the CSR, e.g., by sending a signature to the CSR corresponding to the data block, and the CSR returns the appropriate storage indicator.
  • At decision block 706, the storage system determines whether or not to query the CSR 404 for the particular data block(s) in the file that is being restored. For instance, the storage system may review the storage indicator to determine whether it is likely that the data block is in the CSR 404. The media agent or other component of the storage system can make this determination in several different ways. For example, in one embodiment, based on the AFID or other storage indicator, the media agent determines the age of the data block. The age may be an indication of when the data block was last involved in a copy operation, for example. For instance, the AFID may correspond to a unique identifier for a particular copy (e.g., backup) session. The media agent may have access to a list indicating when each copy session took place, and can correlate the AFID associated with the requested data block to the list. A variety of other mechanisms are possible to provide aging information. In one embodiment, the AFID provides a direct numerical indication of the age of the data block. For instance, in one embodiment the AFID may increment as each block (or group of blocks) is created.
  • In an embodiment, where the CSR deletes data blocks after a set time interval, the storage system can use the determined age of the storage indicator to determine if it is likely that the data block is stored in the CSR 404. As one example, if data blocks are deleted after 10 days, and the AFID indicates that the data block was last backed up more than 10 days ago, the media agent may determine that the data block has likely been pruned from the CSR 404 and is therefore not likely currently stored in the CSR 404. On the other hand, if the AFID indicates that the data block was last backed up less than 10 days ago, the media agent may determine that the data block is likely to be found in the CSR 404.
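  • The age check described above can be sketched as a lookup from AFID to the date of the corresponding copy session, followed by a comparison against the CSR retention interval. The function name `likely_in_csr`, the `session_dates` mapping, and the treatment of unknown AFIDs and of blocks exactly at the retention boundary as unlikely to be present are assumptions made for illustration.

```python
from datetime import date, timedelta


def likely_in_csr(afid, session_dates, today, retention_days=10):
    """Return True if the block's last backup falls inside the CSR retention window.

    session_dates: dict mapping AFID -> date of the copy session (hypothetical list
    that the media agent is assumed to have access to).
    """
    backed_up = session_dates.get(afid)
    if backed_up is None:
        return False  # unknown session: fall back to secondary storage (assumption)
    return (today - backed_up) < timedelta(days=retention_days)


session_dates = {1: date(2011, 12, 1), 2: date(2011, 12, 10), 3: date(2011, 12, 3)}
today = date(2011, 12, 13)
```

  • A block for which this check returns False would be restored directly from the secondary storage device, sparing a query to the CSR.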
  • While described primarily with respect to the AFID for the purposes of illustration, the type of information provided by the storage indicator may vary. For example, in another embodiment, the storage indicator provides an indication as to the source of the data block, such as an indication as to which client or clients the data block was backed up from. The storage system can use the information regarding the source(s) of the data block to determine if the data block is likely stored in the CSR 404. For instance, more than one client may share the CSR, but have different priorities with respect to the CSR. Where the storage indicator indicates that the data block came from a client having a relatively high priority with respect to the CSR, the media agent may determine that the data block is likely stored in the CSR. In addition to a client priority policy, other CSR policies can be used such as update frequency, the CSR pruning algorithm (e.g., first-in-first-out), and the like. Generally, any combination of any of the above parameters can be used instead of or in addition to the AFID or other aging information to determine the likelihood that the particular data block is stored in the CSR.
  • If it is determined that the data block is not likely stored in the CSR 404, then storage system 400 restores the data block using the secondary storage device 410, as described in greater detail above with reference to block 510 of FIG. 5. On the other hand, if the storage system 400 determines that it is likely that the data block is in the CSR 404, the storage system 400 can query the CSR 404 for the data block, as illustrated in block 710, and as described in greater detail above with reference to block 504 of FIG. 5.
  • Following the query, the storage system 400 determines if the signature block indicates that the data block is in the CSR 404, as described in greater detail above with reference to decision block 506 of FIG. 5. If the storage system 400 determines that the data block is not within the CSR 404, the storage system restores the data block using the secondary storage device 410, as illustrated in block 708 and described in greater detail above with reference to block 510 of FIG. 5. On the other hand, if the storage system 400 determines that the data block is stored within the CSR 404, the storage system restores the data block using the CSR 404, as illustrated in block 714 and described in greater detail above with reference to block 508 of FIG. 5. In a similar manner, storage system 400 can restore multiple data blocks associated with a particular client. In alternative embodiments, the media agents or other system components are provided with an up to date or substantially up to date listing of what data blocks are stored in the CSR, and may therefore not perform the query. For instance, the CSR may transmit the updates to the media agents and/or storage manager periodically or as blocks are stored in and pruned from the CSR. In yet further embodiments, the media agent queries the CSR for all of the data blocks without determining the likelihood that the data block is stored in the CSR.
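  • As a non-limiting sketch, the per-block flow of routine 700 might be expressed as follows. The CSR and the secondary storage device are modeled as simple in-memory mappings from signature to block contents, and `should_query` stands in for the determination of decision block 706; these names and data structures are assumptions made for the example only.

```python
def restore_block(signature, afid, csr, secondary, should_query):
    """Return (data, source) for one data block.

    csr, secondary: hypothetical dicts mapping signature -> block contents.
    should_query:   callable taking the AFID and returning True when the
                    block is considered likely to be in the CSR.
    """
    if should_query(afid):             # decision block 706
        data = csr.get(signature)      # block 710: query the CSR by signature
        if data is not None:
            return data, "csr"         # block 714: restore from the CSR
    # Block 708: fall back to the secondary storage device.
    return secondary[signature], "secondary"

if __name__ == "__main__":
    csr = {"sigA": b"A"}
    secondary = {"sigA": b"A", "sigB": b"B"}
    recent = lambda afid: afid >= 5
    print(restore_block("sigA", 9, csr, secondary, recent))
    print(restore_block("sigB", 9, csr, secondary, recent))
```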
  • FIG. 8 is a flow diagram illustrative of one embodiment of a routine 800 implemented by the storage system for restoring data blocks to a client using a CSR 404 and an AFID. For example, routine 800 can apply to embodiments described in reference to FIGS. 1, 2, 3, 4A, and 4B. One skilled in the relevant art will appreciate that the elements outlined for routine 800 may be implemented by one or many computing devices/components that are associated with the storage system 400. For example, routine 800 can be implemented by any one, or a combination, of the client 402, the CSR 404, the storage manager 406, the media agent 408 and/or the secondary storage device 410. Accordingly, routine 800 has been logically associated as being generally performed by the storage system 400, and thus the following illustrative embodiments should not be construed as limiting.
  • As discussed previously, during backups all of the data is stored in the secondary storage device 410 as data blocks. However, to expedite restores, some data blocks can also be stored in the CSR 404. During a restore, queries are sent to the CSR 404 to determine which data blocks are stored therein. Each query includes a request for a specific data block potentially stored in the CSR 404. Over the course of a restore there may be many queries sent to the CSR 404. These queries may use network bandwidth that could more effectively be used elsewhere, especially when the queries are made over a WAN. To reduce the network traffic, the storage system 400 can bundle the queries, as will be described in greater detail below with reference to FIG. 8. The storage system can implement bundling based on a predefined number of queries, network bandwidth, data/file location within the secondary storage device or information store of the client, and the like.
  • Similar to block 502 of FIG. 5, at block 802, the storage system 400 receives a request to restore data. In one embodiment, the data blocks to be restored are deduplicated data blocks. At block 804, the storage system bundles a number of queries for a set of data blocks. As mentioned previously, each query can contain a signature block corresponding to a data block that is to be restored to the client. The queries can be bundled in any number of ways, such as based on a signature block value, an AFID value, a time of query, a set number of queries, a location of client, a client identification, a location of data block in the secondary storage device or CSR, and/or pseudo-randomly. For example, in one embodiment, all the queries can be bundled together. Alternatively, some or all of the queries for data blocks that are likely to be found in the CSR 404 can be bundled together. In another embodiment, a set number of queries are bundled.
  • At block 806, the bundled queries are sent to the CSR 404, similar to what is described above with reference to block 504 of FIG. 5. Upon receiving the bundled queries, the CSR 404 parses the bundled queries into the individual queries and determines which data blocks corresponding to the queries are stored therein. Following the determination made by the CSR 404, the storage system 400 restores the requested data, as illustrated in block 808. The data blocks stored in the CSR 404 are restored using the CSR 404, while the data blocks not stored in the CSR 404 are restored using the secondary storage device 410.
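  • For purposes of illustration, blocks 804 through 808 might be sketched as follows, using the "set number of queries" bundling option. The helper names (`bundle`, `csr_answer`, `restore`) and the dictionary-based stores are hypothetical; in practice the bundle would be sent over a network rather than passed as a list.

```python
def bundle(signatures, bundle_size):
    """Block 804: group individual queries into fixed-size bundles."""
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    batch = []
    for sig in signatures:
        batch.append(sig)
        if len(batch) == bundle_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, bundle

def csr_answer(batch, csr_signatures):
    """CSR side of block 806: parse a bundle and report which blocks it holds."""
    return {sig for sig in batch if sig in csr_signatures}

def restore(signatures, csr, secondary, bundle_size=3):
    """Block 808: restore each block from the CSR if present, else secondary."""
    restored = {}
    for batch in bundle(signatures, bundle_size):
        present = csr_answer(batch, csr.keys())  # one round trip per bundle
        for sig in batch:
            restored[sig] = csr[sig] if sig in present else secondary[sig]
    return restored

if __name__ == "__main__":
    csr = {"a": b"1"}
    secondary = {"a": b"1", "b": b"2", "c": b"3"}
    print(restore(["a", "b", "c"], csr, secondary, bundle_size=2))
```

  • With a bundle size of two, the three queries in the usage example travel to the CSR in two messages rather than three, which is the traffic reduction that bundling is intended to provide.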
  • The bundling process 800 of FIG. 8 can advantageously be used in conjunction with the process 700 of FIG. 7. Thus, in one embodiment the media agent or other appropriate component first determines whether data blocks are likely to be found in the CSR according to the process 700 of FIG. 7, and then bundles queries according to the process 800 of FIG. 8 for the data blocks that are likely to be found in the CSR. In another embodiment, the media agent bundles the queries according to the process 800 of FIG. 8 and then determines which of the data blocks corresponding to the bundled queries are likely to be found in the CSR. The media agent may then transmit only those queries in the respective bundles that correspond to data blocks likely to be found in the CSR.
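  • The two orderings described above can be contrasted with a brief sketch. The function names and the `is_likely` predicate are hypothetical and stand in for the determination of process 700; fixed-size bundling is assumed for simplicity.

```python
def chunks(items, size):
    """Split a list into consecutive bundles of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def filter_then_bundle(queries, is_likely, size):
    """First embodiment: keep only likely queries, then bundle them."""
    return chunks([q for q in queries if is_likely(q)], size)

def bundle_then_filter(queries, is_likely, size):
    """Second embodiment: bundle first, then drop unlikely queries per bundle."""
    bundles = [[q for q in b if is_likely(q)] for b in chunks(queries, size)]
    return [b for b in bundles if b]  # do not transmit empty bundles

if __name__ == "__main__":
    queries = [1, 2, 3, 4, 5, 6]
    is_even = lambda q: q % 2 == 0
    print(filter_then_bundle(queries, is_even, 2))
    print(bundle_then_filter(queries, is_even, 2))
```

  • In this example both orderings transmit the same queries, but filtering first packs them into fewer, fuller bundles, whereas bundling first preserves the original grouping (for instance, by location) at the cost of partially filled bundles.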
  • It will be appreciated by those skilled in the art and others that all of the functions described in this disclosure may be embodied in software executed by one or more processors of the disclosed components and mobile communication devices. The software may be persistently stored in any type of non-volatile storage.
  • Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.
  • It is also recognized that the term “remote” may include data, objects, devices, components, and/or modules not stored or located locally, or that are not accessible via the same portion of a network, using the network topology, etc. Thus, a remote device may be located in a separate geographic area, such as, for example, in a different location, country, and so forth. The meaning of the term “remote” will additionally be understood in view of its usage throughout the entirety of the disclosure.
  • In certain embodiments of the invention, operations disclosed herein can be used to copy or otherwise retrieve data of one or more applications residing on and/or being executed by a computing device. For instance, the applications may comprise software applications that interact with a user to process data and may include, for example, database applications (e.g., SQL applications), word processors, spreadsheets, financial applications, management applications, e-commerce applications, browsers, combinations of the same or the like. For example, in certain embodiments, the applications may comprise one or more of the following: MICROSOFT EXCHANGE, MICROSOFT SHAREPOINT, MICROSOFT SQL SERVER, ORACLE, MICROSOFT WORD and LOTUS NOTES.
  • Moreover, in certain embodiments of the invention, data backup systems and methods may be used in a modular storage management system, embodiments of which are described in more detail in U.S. Pat. No. 7,035,880, issued Apr. 5, 2006, and U.S. Pat. No. 6,542,972, issued Jan. 30, 2001, each of which is hereby incorporated herein by reference in its entirety. For example, the disclosed backup systems may be part of one or more storage operation cells that include combinations of hardware and software components directed to performing storage operations on electronic data. Exemplary storage operation cells usable with embodiments of the invention include CommCells as embodied in the QNet storage management system and the QiNetix storage management system by CommVault Systems, Inc., and as further described in U.S. Pat. No. 7,454,569, issued Nov. 18, 2008, which is hereby incorporated herein by reference in its entirety.
  • Storage operations compatible with embodiments described herein will now be described. For example, data can be stored in primary storage as a primary copy or in secondary storage as various types of secondary copies, including a backup copy, a snapshot copy, a hierarchical storage management copy (“HSM”), an archive copy, and other types of copies. Certain embodiments described herein with respect to backup operations are similarly compatible with each of these types of operations.
  • A primary copy of data is generally a production copy or other “live” version of the data which is used by a software application and is generally in the native format of that application. Such primary copy data is typically intended for short term retention (e.g., several hours or days) before some or all of the data is stored as one or more secondary copies, such as, for example, to prevent loss of data in the event a problem occurred with the data stored in primary storage.
  • Secondary copies include point-in-time data and are typically intended for long-term retention (e.g., weeks, months or years) before some or all of the data is moved to other storage or is discarded. Secondary copies may be indexed so users can browse and restore the data at another point in time. After certain primary copy data is backed up, a pointer or other location indicia such as a stub may be placed in the primary copy to indicate the current location of that data.
  • One type of secondary copy is a backup copy. A backup copy is generally a point-in-time copy of the primary copy data stored in a backup format, as opposed to a native application format. For example, a backup copy may be stored in a backup format that facilitates compression and/or efficient long-term storage. Backup copies generally have relatively long retention periods and may be stored on media with slower retrieval times than other types of secondary copies and media. In some cases, backup copies may be stored at an offsite location.
  • Another form of secondary copy is a snapshot copy. From an end-user viewpoint, a snapshot may be thought of as an instant image of the primary copy data at a given point in time. A snapshot generally captures the directory structure of a primary copy volume at a particular moment in time and may also preserve file attributes and contents. In some embodiments, a snapshot may exist as a virtual file system, parallel to the actual file system. Users typically gain read-only access to the record of files and directories of the snapshot. By electing to restore primary copy data from a snapshot taken at a given point in time, users may also return the current file system to the state of the file system that existed when the snapshot was taken.
  • A snapshot may be created instantly, using a minimum amount of file space, but may still function as a conventional file system backup. A snapshot may not actually create another physical copy of all the data, but may simply create pointers that are able to map files and directories to specific disk blocks.
  • In some embodiments, once a snapshot has been taken, subsequent changes to the file system typically do not overwrite the blocks in use at the time of the snapshot. Therefore, the initial snapshot may use only a small amount of disk space needed to record a mapping or other data structure representing or otherwise tracking the blocks that correspond to the current state of the file system. Additional disk space is usually required only when files and directories are actually modified later. Furthermore, when files are modified, typically only the pointers which map to blocks are copied, not the blocks themselves. In some embodiments, for example in the case of copy-on-write snapshots, when a block changes in primary storage, the block is copied to secondary storage before the block is overwritten in primary storage. The snapshot mapping of file system data is also updated to reflect the changed block(s) at that particular point in time.
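  • A minimal sketch of the copy-on-write behavior described above follows. The `Volume` class and its methods are hypothetical and model disk blocks as an in-memory list; a snapshot starts as an empty mapping and records a block's original contents only when that block is first overwritten.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # Creating a snapshot copies no data; it is just an empty mapping
        # of block index -> preserved original contents.
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, preserve the old block for every snapshot
        # that has not already saved its own copy of this block.
        for snap in self.snapshots:
            if index not in snap:
                snap[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap, index):
        # Unchanged blocks are read from the live volume.
        return snap.get(index, self.blocks[index])

if __name__ == "__main__":
    vol = Volume([b"a", b"b", b"c"])
    snap = vol.snapshot()
    vol.write(1, b"B")
    print(vol.blocks)
    print([vol.read_snapshot(snap, i) for i in range(3)])
```

  • The snapshot in the usage example consumes space for only the one block that changed, while still presenting the volume as it existed when the snapshot was taken.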
  • An HSM copy is generally a copy of the primary copy data but typically includes only a subset of the primary copy data that meets certain criteria and is usually stored in a format other than the native application format. For example, an HSM copy may include data from the primary copy that is larger than a given size threshold or older than a given age threshold and that is stored in a backup format. Often, HSM data is removed from the primary copy, and a stub is stored in the primary copy to indicate the new location of the HSM data. When a user requests access to the HSM data that has been removed or migrated, systems use the stub to locate the data and often make recovery of the data appear transparent, even though the HSM data may be stored at a location different from the remaining primary copy data.
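  • The stub mechanism described above might be sketched as follows, using a size threshold as the migration criterion. The `Stub` type, the function names, and the dictionary-based stores are assumptions for the example and are not drawn from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Stub:
    location: str  # key under which the migrated data lives in secondary storage

def migrate_large_files(primary, secondary, size_threshold):
    """Move files larger than the threshold out of primary, leaving stubs."""
    for name, item in list(primary.items()):
        if isinstance(item, bytes) and len(item) > size_threshold:
            secondary[name] = item
            primary[name] = Stub(location=name)

def read_file(primary, secondary, name):
    """Return file contents, following a stub transparently if one is found."""
    item = primary[name]
    if isinstance(item, Stub):
        return secondary[item.location]
    return item

if __name__ == "__main__":
    primary = {"small.txt": b"hi", "big.bin": b"x" * 100}
    secondary = {}
    migrate_large_files(primary, secondary, size_threshold=10)
    print(primary["big.bin"])
    print(len(read_file(primary, secondary, "big.bin")))
```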
  • An archive copy is generally similar to an HSM copy. However, the data satisfying criteria for removal from the primary copy is generally completely removed with no stub left in the primary copy to indicate the new location (i.e., where the archive copy data has been moved to). Archive copies of data are generally stored in a backup format or other non-native application format. In addition, archive copies are generally retained for very long periods of time (e.g., years) and, in some cases, are never deleted. In certain embodiments, such archive copies may be made and kept for extended periods in order to meet compliance regulations or for other permanent storage applications.
  • In some embodiments, application data over its lifetime moves from more expensive quick access storage to less expensive slower access storage. This process of moving data through these various tiers of storage is sometimes referred to as information lifecycle management (“ILM”). This is the process by which data is “aged” from forms of primary storage with faster access/restore times down through less expensive secondary storage with slower access/restore times. For example, such aging may occur as data becomes less important or mission critical over time.
  • Similar data transfers associated with location-specific criteria are performed when restoring data from secondary storage to primary storage. For example, to restore data a user or system process generally must specify a particular secondary storage device, piece of media, or archive file. Thus, the precision with which conventional storage management systems perform storage operations on electronic data is generally limited by the ability to define or specify storage operations based on data location.
  • Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, and other devices suitable for the purposes described herein. Software and other modules may be accessible via local memory, via a network, via a browser, or via other means suitable for the purposes described herein. Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein. User interface elements described herein may comprise elements from graphical user interfaces, command line interfaces, and other interfaces suitable for the purposes described herein.
  • Embodiments of the invention are also described above with reference to flow chart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products. It will be understood that each block of the flow chart illustrations and/or block diagrams, and combinations of blocks in the flow chart illustrations and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the acts specified in the flow chart and/or block diagram block or blocks.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the acts specified in the flow chart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operations to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the acts specified in the flow chart and/or block diagram block or blocks.
  • While certain embodiments of the inventions have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the disclosure. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the disclosure. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the disclosure.

Claims (18)

1. A method of restoring deduplicated data from secondary storage to an information store associated with a client, the method comprising:
in response to instructions to copy data from an information store associated with a client system to secondary storage remote from the client system:
copying at least a portion of the data from the information store to a data repository of a client-side repository as a plurality of data blocks, the client-side repository being remote from the secondary storage, wherein the data from the information store is copied to the secondary storage according to a deduplication scheme; and
populating a signature repository of the client-side repository with a plurality of deduplication signatures corresponding to the data blocks stored in the data repository of the client-side repository; and
for a restore operation in which at least some of the copied data is restored from secondary storage to the client:
receiving a plurality of queries inquiring as to the presence of at least some of the plurality of data blocks in the client-side repository;
using one or more processors and in response to the queries, consulting the signature repository of the client-side repository to determine which of the data blocks are stored in the data repository of the client-side repository; and
restoring data blocks that are stored in the data repository of the client-side repository from the client-side repository to the information store associated with the client, the data blocks not stored in the data repository of the client-side repository being restored from the secondary storage to the information store associated with the client.
2. The method of claim 1, wherein the client-side repository and the client communicate with one another over a different network topology than the network topology that the secondary storage and the client communicate with one another over.
3. The method of claim 2, wherein the client-side repository and the client communicate via a local area network.
4. The method of claim 3, wherein the secondary storage and the client communicate via a wide area network.
5. The method of claim 1, wherein before sending a query inquiring as to the presence of a data block, the secondary storage consults an identifier indicative of an age of the corresponding data block to determine whether to send the query.
6. The method of claim 1, wherein consulting the signature repository comprises comparing signatures stored in the client-side repository to signatures received in the queries.
7. The method of claim 1, further comprising modifying the size of the client-side repository in response to an amount of usage of the client-side repository as part of one or more restore operations.
8. The method of claim 1, further comprising deleting one or more of the data blocks from the client-side repository based on a predetermined retention scheme.
9. The method of claim 1, wherein the blocks are deleted based on a period of time that the data blocks have been stored in the client-side repository according to the retention scheme.
10. The method of claim 1, wherein the blocks are deleted on a first-in, first-out basis according to the retention scheme.
11. The method of claim 1, wherein the data was previously copied to the secondary storage as part of a backup operation.
12. A storage system, comprising:
a client-side repository, comprising:
a data repository storing a plurality of data blocks, the data blocks corresponding to at least a portion of data that has been previously copied from an information store of a client to secondary storage according to a deduplication scheme;
a signature repository storing signatures corresponding to the data blocks in the data repository, the data repository and the signature repository remote from the secondary storage; and
a control module executing in one or more processors and configured to:
receive one or more queries inquiring as to the presence of a plurality of data blocks in the data repository;
consult the signature repository in response to the one or more queries to determine which of the queried data blocks are stored in the data block repository; and
restore the data blocks that are stored in the data block repository from the data block repository to the information store of the client,
wherein data blocks that are not stored in the data block repository are restored from the secondary storage to the information store of the client.
13. The storage system of claim 12, wherein the client-side repository and the client communicate with one another over a different network topology than the network topology that the secondary storage and the client communicate with one another over.
14. The storage system of claim 12, wherein before sending a query inquiring as to the presence of a data block, the secondary storage consults an identifier indicative of an age of the corresponding data block to determine whether to send the query.
15. The storage system of claim 12, wherein the control module is configured to consult the signature repository by comparing signatures stored in the signature repository to signatures received in the queries.
16. The storage system of claim 12, further comprising deleting one or more of the data blocks from the client-side repository based on a predetermined retention scheme.
17. The storage system of claim 12, wherein the control module is configured to modify the size of the data repository of the client-side repository in response to an amount of usage of the client-side repository as part of one or more restore operations.
18. A method of restoring deduplicated data to an information store associated with a client from secondary storage, comprising:
sending one or more queries to a client-side repository inquiring as to the presence of a plurality of data blocks in the client-side repository, the data blocks corresponding to at least a portion of data that has been previously copied from an information store of a client to the secondary storage according to a deduplication scheme, the secondary storage remote from the client and the client-side repository;
receiving an indication as to which of the queried data blocks are stored in the client-side repository; and
restoring the data blocks that are not stored in the client-side repository from the secondary storage to the information store of the client,
wherein data blocks that are stored in the client-side repository are restored from the client-side repository to the information store of the client.
US13/324,884 2010-12-14 2011-12-13 Client-side repository in a networked deduplicated storage system Abandoned US20120150818A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US42303110P 2010-12-14 2010-12-14
US13/324,884 US20120150818A1 (en) 2010-12-14 2011-12-13 Client-side repository in a networked deduplicated storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/324,884 US20120150818A1 (en) 2010-12-14 2011-12-13 Client-side repository in a networked deduplicated storage system

Publications (1)

Publication Number Publication Date
US20120150818A1 US20120150818A1 (en) 2012-06-14

Family

ID=46200387

Family Applications (5)

Application Number Title Priority Date Filing Date
US13/324,817 Active US8954446B2 (en) 2010-12-14 2011-12-13 Client-side repository in a networked deduplicated storage system
US13/324,884 Abandoned US20120150818A1 (en) 2010-12-14 2011-12-13 Client-side repository in a networked deduplicated storage system
US13/324,848 Active US9104623B2 (en) 2010-12-14 2011-12-13 Client-side repository in a networked deduplicated storage system
US13/324,792 Active US9116850B2 (en) 2010-12-14 2011-12-13 Client-side repository in a networked deduplicated storage system
US14/673,021 Active 2034-06-26 US10191816B2 (en) 2010-12-14 2015-03-30 Client-side repository in a networked deduplicated storage system

Country Status (1)

Country Link
US (5) US8954446B2 (en)

Cited By (102)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120303595A1 (en) * 2011-05-25 2012-11-29 Inventec Corporation Data restoration method for data de-duplication
US8572340B2 (en) 2010-09-30 2013-10-29 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US8577851B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Content aligned block-based deduplication
US8930306B1 (en) 2009-07-08 2015-01-06 Commvault Systems, Inc. Synchronized data deduplication
US8938481B2 (en) 2012-08-13 2015-01-20 Commvault Systems, Inc. Generic file level restore from a block-level secondary copy
US8954446B2 (en) 2010-12-14 2015-02-10 Comm Vault Systems, Inc. Client-side repository in a networked deduplicated storage system
US8977672B2 (en) 2012-06-08 2015-03-10 Commvault Systems, Inc. Intelligent scheduling for remote computers
US9020900B2 (en) 2010-12-14 2015-04-28 Commvault Systems, Inc. Distributed deduplicated storage system
US9026498B2 (en) 2012-08-13 2015-05-05 Commvault Systems, Inc. Lightweight mounting of a secondary copy of file system data
US9092378B2 (en) 2011-03-31 2015-07-28 Commvault Systems, Inc. Restoring computing environments, such as autorecovery of file systems at certain points in time
US9124611B2 (en) 2006-12-18 2015-09-01 Commvault Systems, Inc. Systems and methods for writing data and storage system specific metadata to network attached storage device
US9128883B2 (en) 2008-06-19 2015-09-08 Commvault Systems, Inc Data storage resource allocation by performing abbreviated resource checks based on relative chances of failure of the data storage resources to determine whether data storage requests would fail
US9164850B2 (en) 2001-09-28 2015-10-20 Commvault Systems, Inc. System and method for archiving objects in an information store
US9189170B2 (en) 2012-06-12 2015-11-17 Commvault Systems, Inc. External storage manager for a data storage cell
US9189167B2 (en) 2012-05-31 2015-11-17 Commvault Systems, Inc. Shared library in a data storage system
US9218374B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Collaborative restore in a networked storage system
US9262435B2 (en) 2013-01-11 2016-02-16 Commvault Systems, Inc. Location-based data synchronization management
US9262226B2 (en) 2008-06-19 2016-02-16 Commvault Systems, Inc. Data storage resource allocation by employing dynamic methods and blacklisting resource request pools
US9275086B2 (en) 2012-07-20 2016-03-01 Commvault Systems, Inc. Systems and methods for database archiving
US9274803B2 (en) 2000-01-31 2016-03-01 Commvault Systems, Inc. Storage of application specific profiles correlating to document versions
US9286327B2 (en) 2012-03-30 2016-03-15 Commvault Systems, Inc. Data storage recovery automation
US9292815B2 (en) 2012-03-23 2016-03-22 Commvault Systems, Inc. Automation of data storage activities
US9313143B2 (en) 2005-12-19 2016-04-12 Commvault Systems, Inc. Systems and methods for granular resource management in a storage network
US9323466B2 (en) 2011-04-27 2016-04-26 Commvault Systems, Inc. System and method for client policy assignment in a data storage system
US9367702B2 (en) 2013-03-12 2016-06-14 Commvault Systems, Inc. Automatic file encryption
US9390109B2 (en) 2012-12-21 2016-07-12 Commvault Systems, Inc. Systems and methods to detect deleted files
US9405635B2 (en) 2013-04-16 2016-08-02 Commvault Systems, Inc. Multi-source restore in an information management system
US9405763B2 (en) 2008-06-24 2016-08-02 Commvault Systems, Inc. De-duplication systems and methods for application-specific data
US9405928B2 (en) 2014-09-17 2016-08-02 Commvault Systems, Inc. Deriving encryption rules based on file content
US9411986B2 (en) 2004-11-15 2016-08-09 Commvault Systems, Inc. System and method for encrypting secondary copies of data
US9444811B2 (en) 2014-10-21 2016-09-13 Commvault Systems, Inc. Using an enhanced data agent to restore backed up data across autonomous storage management systems
US9451023B2 (en) 2011-09-30 2016-09-20 Commvault Systems, Inc. Information management of virtual machines having mapped storage devices
US9459968B2 (en) 2013-03-11 2016-10-04 Commvault Systems, Inc. Single index to query multiple backup formats
US9483362B2 (en) 2013-05-08 2016-11-01 Commvault Systems, Inc. Use of auxiliary data protection software in failover operations
US9483489B2 (en) 2013-01-14 2016-11-01 Commvault Systems, Inc. Partial sharing of secondary storage files in a data storage system
US9483558B2 (en) 2013-05-29 2016-11-01 Commvault Systems, Inc. Assessing user performance in a community of users of data storage resources
US9483201B2 (en) 2012-07-31 2016-11-01 Commvault Systems, Inc. Administering a shared, on-line pool of data storage resources for performing data storage operations
US9557929B2 (en) 2010-09-30 2017-01-31 Commvault Systems, Inc. Data recovery operations, such as recovery from modified network data management protocol data
US9563518B2 (en) 2014-04-02 2017-02-07 Commvault Systems, Inc. Information management by a media agent in the absence of communications with a storage manager
US9563514B2 (en) 2015-06-19 2017-02-07 Commvault Systems, Inc. Assignment of proxies for virtual-machine secondary copy operations including streaming backup jobs
US9575673B2 (en) 2014-10-29 2017-02-21 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US9575804B2 (en) 2015-03-27 2017-02-21 Commvault Systems, Inc. Job management and resource allocation
US9588972B2 (en) 2010-09-30 2017-03-07 Commvault Systems, Inc. Efficient data management improvements, such as docking limited-feature data management modules to a full-featured data management system
US9588849B2 (en) 2015-01-20 2017-03-07 Commvault Systems, Inc. Synchronizing selected portions of data in a storage management system
US9590886B2 (en) 2013-11-01 2017-03-07 Commvault Systems, Inc. Systems and methods for differential health checking of an information management system
US20170097770A1 (en) * 2015-01-15 2017-04-06 Commvault Systems, Inc. Intelligent hybrid drive caching
US9619339B2 (en) 2012-12-21 2017-04-11 Commvault Systems, Inc. Systems and methods to confirm replication data accuracy for data backup in data storage systems
US9633216B2 (en) 2012-12-27 2017-04-25 Commvault Systems, Inc. Application of information management policies based on operation with a geographic entity
US9633026B2 (en) 2014-03-13 2017-04-25 Commvault Systems, Inc. Systems and methods for protecting email data
US9632713B2 (en) 2014-12-03 2017-04-25 Commvault Systems, Inc. Secondary storage editor
US9633056B2 (en) 2014-03-17 2017-04-25 Commvault Systems, Inc. Maintaining a deduplication database
US9633033B2 (en) 2013-01-11 2017-04-25 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9641388B2 (en) 2014-07-29 2017-05-02 Commvault Systems, Inc. Customized deployment in information management systems
US9639286B2 (en) 2015-05-14 2017-05-02 Commvault Systems, Inc. Restore of secondary data using thread pooling
US9648100B2 (en) 2014-03-05 2017-05-09 Commvault Systems, Inc. Cross-system storage management for transferring data across autonomous information management systems
US9645891B2 (en) 2014-12-04 2017-05-09 Commvault Systems, Inc. Opportunistic execution of secondary copy operations
US9652283B2 (en) 2013-01-14 2017-05-16 Commvault Systems, Inc. Creation of virtual machine placeholders in a data storage system
US9710465B2 (en) 2014-09-22 2017-07-18 Commvault Systems, Inc. Efficiently restoring execution of a backed up virtual machine based on coordination with virtual-machine-file-relocation operations
US9720787B2 (en) 2013-01-11 2017-08-01 Commvault Systems, Inc. Table level database restore in a data storage system
US9740578B2 (en) 2014-07-16 2017-08-22 Commvault Systems, Inc. Creating customized bootable image for client computing device from backup copy
US9740574B2 (en) 2014-05-09 2017-08-22 Commvault Systems, Inc. Load balancing across multiple data paths
US9740702B2 (en) 2012-12-21 2017-08-22 Commvault Systems, Inc. Systems and methods to identify unprotected virtual machines
US9747169B2 (en) 2012-12-21 2017-08-29 Commvault Systems, Inc. Reporting using data obtained during backup of primary storage
US9753816B2 (en) 2014-12-05 2017-09-05 Commvault Systems, Inc. Synchronization based on filtered browsing
US9760444B2 (en) 2013-01-11 2017-09-12 Commvault Systems, Inc. Sharing of secondary storage data
US9766825B2 (en) 2015-07-22 2017-09-19 Commvault Systems, Inc. Browse and restore for block-level backups
US9804930B2 (en) 2013-01-11 2017-10-31 Commvault Systems, Inc. Partial file restore in a data storage system
US9823977B2 (en) 2014-11-20 2017-11-21 Commvault Systems, Inc. Virtual machine change block tracking
US9823978B2 (en) 2014-04-16 2017-11-21 Commvault Systems, Inc. User-level quota management of data objects stored in information management systems
US9848046B2 (en) 2014-11-13 2017-12-19 Commvault Systems, Inc. Archiving applications in information management systems
US9852026B2 (en) 2014-08-06 2017-12-26 Commvault Systems, Inc. Efficient application recovery in an information management system based on a pseudo-storage-device driver
US9904598B2 (en) 2015-04-21 2018-02-27 Commvault Systems, Inc. Content-independent and database management system-independent synthetic full backup of a database based on snapshot technology
US9912625B2 (en) 2014-11-18 2018-03-06 Commvault Systems, Inc. Storage and management of mail attachments
US9928001B2 (en) 2014-09-22 2018-03-27 Commvault Systems, Inc. Efficiently restoring execution of a backed up virtual machine based on coordination with virtual-machine-file-relocation operations
US9928144B2 (en) 2015-03-30 2018-03-27 Commvault Systems, Inc. Storage management of data using an open-archive architecture, including streamlined access to primary data originally stored on network-attached storage and archived to secondary storage
US9934265B2 (en) 2015-04-09 2018-04-03 Commvault Systems, Inc. Management of log data
US9939981B2 (en) 2013-09-12 2018-04-10 Commvault Systems, Inc. File manager integration with virtualization in an information management system with an enhanced storage manager, including user control and storage management of virtual machines
US9952934B2 (en) 2015-01-20 2018-04-24 Commvault Systems, Inc. Synchronizing selected portions of data in a storage management system
US9965316B2 (en) 2012-12-21 2018-05-08 Commvault Systems, Inc. Archiving virtual machines in a data storage system
US9977687B2 (en) 2013-01-08 2018-05-22 Commvault Systems, Inc. Virtual server agent load balancing
US10031917B2 (en) 2014-07-29 2018-07-24 Commvault Systems, Inc. Efficient volume-level replication of data via snapshots in an information management system
US10048889B2 (en) 2014-09-22 2018-08-14 Commvault Systems, Inc. Efficient live-mount of a backed up virtual machine in a storage management system
US10061663B2 (en) 2015-12-30 2018-08-28 Commvault Systems, Inc. Rebuilding deduplication data in a distributed deduplication data storage system
US10084873B2 (en) 2015-06-19 2018-09-25 Commvault Systems, Inc. Assignment of data agent proxies for executing virtual-machine secondary copy operations including streaming backup jobs
US10091146B2 (en) 2008-11-05 2018-10-02 Commvault Systems, Inc. System and method for monitoring and copying multimedia messages to storage locations in compliance with a policy
US10102192B2 (en) 2015-11-03 2018-10-16 Commvault Systems, Inc. Summarization and processing of email on a client computing device based on content contribution to an email thread using weighting techniques
US10101913B2 (en) 2015-09-02 2018-10-16 Commvault Systems, Inc. Migrating data to disk without interrupting running backup operations
US10108652B2 (en) 2013-01-11 2018-10-23 Commvault Systems, Inc. Systems and methods to process block-level backup for selective file restoration for virtual machines
US10108687B2 (en) 2015-01-21 2018-10-23 Commvault Systems, Inc. Database protection using block-level mapping
US10152251B2 (en) 2016-10-25 2018-12-11 Commvault Systems, Inc. Targeted backup of virtual machine
US10157184B2 (en) 2012-03-30 2018-12-18 Commvault Systems, Inc. Data previewing before recalling large data files
US10162528B2 (en) 2016-10-25 2018-12-25 Commvault Systems, Inc. Targeted snapshot based on virtual machine location
US10169121B2 (en) 2014-02-27 2019-01-01 Commvault Systems, Inc. Work flow management for an information management system
US10192065B2 (en) 2015-08-31 2019-01-29 Commvault Systems, Inc. Automated intelligent provisioning of data storage resources in response to user requests in a data storage management system
US10198324B2 (en) 2008-06-18 2019-02-05 Commvault Systems, Inc. Data protection scheduling, such as providing a flexible backup window in a data protection system
US10204010B2 (en) 2014-10-03 2019-02-12 Commvault Systems, Inc. Intelligent protection of off-line mail data
US10210048B2 (en) 2016-10-25 2019-02-19 Commvault Systems, Inc. Selective snapshot and backup copy operations for individual virtual machines in a shared storage
US10228962B2 (en) 2015-12-09 2019-03-12 Commvault Systems, Inc. Live synchronization and management of virtual machines across computing and virtualization platforms and using live synchronization to support disaster recovery
US10296368B2 (en) 2016-03-09 2019-05-21 Commvault Systems, Inc. Hypervisor-independent block-level live browse for access to backed up virtual machine (VM) data and hypervisor-free file-level recovery (block-level pseudo-mount)
US10303557B2 (en) 2016-03-09 2019-05-28 Commvault Systems, Inc. Data transfer to a distributed storage environment
US10303559B2 (en) 2012-12-27 2019-05-28 Commvault Systems, Inc. Restoration of centralized data storage manager, such as data storage manager in a hierarchical data storage system
US10310950B2 (en) 2017-08-21 2019-06-04 Commvault Systems, Inc. Load balancing across multiple data paths

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8214646B2 (en) * 2008-05-06 2012-07-03 Research In Motion Limited Bundle verification
KR20130087810A (en) * 2012-01-30 2013-08-07 삼성전자주식회사 Method and apparatus for cooperative caching in mobile communication system
US9014540B1 (en) * 2012-07-17 2015-04-21 Time Warner Cable Enterprises Llc Techniques for provisioning local media players with content
US8996478B2 (en) * 2012-10-18 2015-03-31 Netapp, Inc. Migrating deduplicated data
US8738577B1 (en) 2013-03-01 2014-05-27 Storagecraft Technology Corporation Change tracking for multiphase deduplication
US8732135B1 (en) 2013-03-01 2014-05-20 Storagecraft Technology Corporation Restoring a backup from a deduplication vault storage
US8874527B2 (en) * 2013-03-01 2014-10-28 Storagecraft Technology Corporation Local seeding of a restore storage for restoring a backup from a remote deduplication vault storage
US8682870B1 (en) 2013-03-01 2014-03-25 Storagecraft Technology Corporation Defragmentation during multiphase deduplication
US9740422B1 (en) * 2013-03-14 2017-08-22 EMC IP Holding Company LLC Version-based deduplication of incremental forever type backup
US8751454B1 (en) 2014-01-28 2014-06-10 Storagecraft Technology Corporation Virtual defragmentation in a deduplication vault
US9886457B2 (en) * 2014-03-10 2018-02-06 International Business Machines Corporation Deduplicated data processing hierarchical rate control in a data deduplication system
US9218407B1 (en) * 2014-06-25 2015-12-22 Pure Storage, Inc. Replication and intermediate read-write state for mediums
US9921756B2 (en) * 2015-12-29 2018-03-20 EMC IP Holding Company LLC Method and system for synchronizing an index of data blocks stored in a storage system using a shared storage module

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7472238B1 (en) * 2004-11-05 2008-12-30 Commvault Systems, Inc. Systems and methods for recovering electronic information from a storage medium
US20100070478A1 (en) * 2008-09-15 2010-03-18 International Business Machines Corporation Retrieval and recovery of data chunks from alternate data stores in a deduplicating system
US7814149B1 (en) * 2008-09-29 2010-10-12 Symantec Operating Corporation Client side data deduplication
US7822939B1 (en) * 2007-09-25 2010-10-26 Emc Corporation Data de-duplication using thin provisioning
WO2010131292A1 (en) * 2009-05-13 2010-11-18 Hitachi, Ltd. Storage system and utilization management method for storage system
US20100312752A1 (en) * 2009-06-08 2010-12-09 Symantec Corporation Source Classification For Performing Deduplication In A Backup Operation
US20110161723A1 (en) * 2009-12-28 2011-06-30 Riverbed Technology, Inc. Disaster recovery using local and cloud spanning deduplicated storage system

Family Cites Families (438)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4267568A (en) 1975-12-03 1981-05-12 System Development Corporation Information storage and retrieval system
US4084231A (en) 1975-12-18 1978-04-11 International Business Machines Corporation System for facilitating the copying back of data in disc and tape units of a memory hierarchial system
GB2035014B (en) 1978-11-06 1982-09-29 British Broadcasting Corp Cyclic redundancy data check encoding method and apparatus
US4417321A (en) 1981-05-18 1983-11-22 International Business Machines Corp. Qualifying and sorting file record data
US4641274A (en) 1982-12-03 1987-02-03 International Business Machines Corporation Method for communicating changes made to text form a text processor to a remote host
US4686620A (en) 1984-07-26 1987-08-11 American Telephone And Telegraph Company, At&T Bell Laboratories Database backup method
GB8622010D0 (en) 1986-09-12 1986-10-22 Hewlett Packard Ltd File backup facility
US5193154A (en) 1987-07-10 1993-03-09 Hitachi, Ltd. Buffered peripheral system and method for backing up and retrieving data to and from backup memory device
US5005122A (en) 1987-09-08 1991-04-02 Digital Equipment Corporation Arrangement with cooperating management server node and network service node
JPH0743676B2 (en) 1988-03-11 1995-05-15 株式会社日立製作所 Backup data dump control method and apparatus
US4912637A (en) 1988-04-26 1990-03-27 Tandem Computers Incorporated Version management tool
US4995035A (en) 1988-10-31 1991-02-19 International Business Machines Corporation Centralized management in a computer network
US5093912A (en) 1989-06-26 1992-03-03 International Business Machines Corporation Dynamic resource pool expansion and contraction in multiprocessing environments
EP0405926B1 (en) 1989-06-30 1996-12-04 Digital Equipment Corporation Method and apparatus for managing a shadow set of storage media
US5454099A (en) 1989-07-25 1995-09-26 International Business Machines Corporation CPU implemented method for backing up modified data sets in non-volatile store for recovery in the event of CPU failure
US5133065A (en) 1989-07-27 1992-07-21 Personal Computer Peripherals Corporation Backup computer program for networks
US5321816A (en) 1989-10-10 1994-06-14 Unisys Corporation Local-remote apparatus with specialized image storage modules
US5504873A (en) 1989-11-01 1996-04-02 E-Systems, Inc. Mass data storage and retrieval system
US5276860A (en) 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data processor with improved backup storage
US5276867A (en) 1989-12-19 1994-01-04 Epoch Systems, Inc. Digital data storage system with improved data migration
JPH0410041A (en) 1990-04-27 1992-01-14 Toshiba Corp Data saving system
GB2246218B (en) 1990-07-18 1994-02-09 Stc Plc Distributed data processing systems
US5239647A (en) 1990-09-07 1993-08-24 International Business Machines Corporation Data storage hierarchy with shared storage level
US5544347A (en) 1990-09-24 1996-08-06 Emc Corporation Data storage system controlled remote data mirroring with respectively maintained data indices
US5301286A (en) 1991-01-02 1994-04-05 At&T Bell Laboratories Memory archiving indexing arrangement
US5212772A (en) 1991-02-11 1993-05-18 Gigatrend Incorporated System for storing data in backup tape device
US5625793A (en) 1991-04-15 1997-04-29 International Business Machines Corporation Automatic cache bypass for instructions exhibiting poor cache hit ratio
US5287500A (en) 1991-06-03 1994-02-15 Digital Equipment Corporation System for allocating storage spaces based upon required and optional service attributes having assigned piorities
US5333315A (en) 1991-06-27 1994-07-26 Digital Equipment Corporation System of device independent file directories using a tag between the directories and file descriptors that migrate with the files
US5347653A (en) 1991-06-28 1994-09-13 Digital Equipment Corporation System for reconstructing prior versions of indexes using records indicating changes between successive versions of the indexes
US5410700A (en) 1991-09-04 1995-04-25 International Business Machines Corporation Computer system which supports asynchronous commitment of data
EP0541281B1 (en) 1991-11-04 1998-04-29 Commvault Systems, Inc. Incremental-computer-file backup using signatures
US5499367A (en) 1991-11-15 1996-03-12 Oracle Corporation System for database integrity with multiple logs assigned to client subsets
US5263154A (en) 1992-04-20 1993-11-16 International Business Machines Corporation Method and system for incremental time zero backup copying of data
US5241670A (en) 1992-04-20 1993-08-31 International Business Machines Corporation Method and system for automated backup copy ordering in a time zero backup copy session
US5241668A (en) 1992-04-20 1993-08-31 International Business Machines Corporation Method and system for automated termination and resumption in a time zero backup copy process
US5842033A (en) 1992-06-30 1998-11-24 Discovision Associates Padding apparatus for passing an arbitrary number of bits through a buffer in a pipeline system
US5403639A (en) 1992-09-02 1995-04-04 Storage Technology Corporation File server having snapshot application data groups
CA2153769C (en) 1993-01-21 2001-08-07 Steven E. Kullick Apparatus and method for transferring and storing data from an arbitrarily large number of networked computer storage devices
DE69434311D1 (en) 1993-02-01 2005-04-28 Sun Microsystems Inc Archiving file system for data providers in a distributed network environment
US5889935A (en) 1996-05-28 1999-03-30 Emc Corporation Disaster control features for remote data mirroring
CA2121852A1 (en) 1993-04-29 1994-10-30 Larry T. Jost Disk meshing and flexible storage mapping with enhanced flexible caching
US5664106A (en) * 1993-06-04 1997-09-02 Digital Equipment Corporation Phase-space surface representation of server computer performance in a computer network
JPH0721135A (en) 1993-07-02 1995-01-24 Fujitsu Ltd Data processing system with duplex monitor function
US5544345A (en) 1993-11-08 1996-08-06 International Business Machines Corporation Coherence controls for store-multiple shared data coordinated by cache directory entries in a shared electronic storage
JPH09509768A (en) 1993-11-09 1997-09-30 シーゲート テクノロジー,インコーポレイテッド Backup and restore system data for the computer network
US5495607A (en) 1993-11-15 1996-02-27 Conner Peripherals, Inc. Network management system having virtual catalog overview of files distributively stored across network domain
US5491810A (en) 1994-03-01 1996-02-13 International Business Machines Corporation Method and system for automated data storage system space allocation utilizing prioritized data set parameters
US5673381A (en) 1994-05-27 1997-09-30 Cheyenne Software International Sales Corp. System and parallel streaming and data stripping to back-up a network
US5638509A (en) 1994-06-10 1997-06-10 Exabyte Corporation Data storage and protection system
US5574906A (en) 1994-10-24 1996-11-12 International Business Machines Corporation System and method for reducing storage requirement in backup subsystems utilizing segmented compression and differencing
US5990810A (en) 1995-02-17 1999-11-23 Williams; Ross Neil Method for partitioning a block of data into subblocks and for storing and communcating such subblocks
US5930831A (en) 1995-02-23 1999-07-27 Powerquest Corporation Partition manipulation architecture supporting multiple file systems
US5559957A (en) 1995-05-31 1996-09-24 Lucent Technologies Inc. File system for a data storage device having a power fail recovery mechanism for write/replace operations
US5699361A (en) 1995-07-18 1997-12-16 Industrial Technology Research Institute Multimedia channel formulation mechanism
US5813009A (en) 1995-07-28 1998-09-22 Univirtual Corp. Computer based records management system method
US5619644A (en) 1995-09-18 1997-04-08 International Business Machines Corporation Software directed microcode state save for distributed storage controller
US5907672A (en) 1995-10-04 1999-05-25 Stac, Inc. System for backing up computer disk volumes with error remapping of flawed memory addresses
JP3856855B2 (en) 1995-10-06 2006-12-13 三菱電機株式会社 Differential backup method
US5819020A (en) 1995-10-16 1998-10-06 Network Specialists, Inc. Real time backup system
US5778395A (en) 1995-10-23 1998-07-07 Stac, Inc. System for backing up files from disk volumes on multiple nodes of a computer network
US5729743A (en) 1995-11-17 1998-03-17 Deltatech Research, Inc. Computer apparatus and method for merging system deltas
US5761677A (en) 1996-01-03 1998-06-02 Sun Microsystems, Inc. Computer system method and apparatus providing for various versions of a file without requiring data copy or log operations
US5765173A (en) 1996-01-11 1998-06-09 Connected Corporation High performance backup via selective file saving which can perform incremental backups and exclude files and uses a changed block signature list
US6131095A (en) 1996-12-11 2000-10-10 Hewlett-Packard Company Method of accessing a target entity over a communications network
KR970076238A (en) 1996-05-23 1997-12-12 포만 제프리 엘 Server, and how to create multiple copies of client data files, and manage product and program
US5812398A (en) 1996-06-10 1998-09-22 Sun Microsystems, Inc. Method and system for escrowed backup of hotelled world wide web sites
US5940833A (en) 1996-07-12 1999-08-17 Microsoft Corporation Compressing sets of integers
US5813008A (en) 1996-07-12 1998-09-22 Microsoft Corporation Single instance storage of information
US5758359A (en) 1996-10-24 1998-05-26 Digital Equipment Corporation Method and apparatus for performing retroactive backups in a computer system
US5875478A (en) 1996-12-03 1999-02-23 Emc Corporation Computer backup using a file system, network, disk, tape and remote archiving repository media system
US5878408A (en) 1996-12-06 1999-03-02 International Business Machines Corporation Data management system and process
WO1998033113A1 (en) 1997-01-23 1998-07-30 Overland Data, Inc. Virtual media library
US5875481A (en) 1997-01-30 1999-02-23 International Business Machines Corporation Dynamic reconfiguration of data storage devices to balance recycle throughput
US6658526B2 (en) 1997-03-12 2003-12-02 Storage Technology Corporation Network attached virtual data storage subsystem
US5924102A (en) 1997-05-07 1999-07-13 International Business Machines Corporation System and method for managing critical files
US6094416A (en) 1997-05-09 2000-07-25 I/O Control Corporation Multi-tier architecture for control network
US5887134A (en) 1997-06-30 1999-03-23 Sun Microsystems System and method for preserving message order while employing both programmed I/O and DMA operations
US6366988B1 (en) 1997-07-18 2002-04-02 Storactive, Inc. Systems and methods for electronic data storage management
WO1999009480A1 (en) 1997-07-29 1999-02-25 Telebackup Systems, Inc. Method and system for nonredundant backup of identical files stored on remote computers
DE69802294T2 (en) 1997-08-29 2002-05-16 Hewlett Packard Co Systems for data backup and recovery
EP0899662A1 (en) 1997-08-29 1999-03-03 Hewlett-Packard Company Backup and restore system for a computer network
US5950205A (en) 1997-09-25 1999-09-07 Cisco Technology, Inc. Data transmission over the internet using a cache memory file system
US6275953B1 (en) 1997-09-26 2001-08-14 Emc Corporation Recovery from failure of a data processor in a network server
US6052735A (en) 1997-10-24 2000-04-18 Microsoft Corporation Electronic mail object synchronization between a desktop computer and mobile device
US6021415A (en) 1997-10-29 2000-02-01 International Business Machines Corporation Storage management system with file aggregation and space reclamation within aggregated files
US7581077B2 (en) 1997-10-30 2009-08-25 Commvault Systems, Inc. Method and system for transferring data in a storage operation
US6418478B1 (en) 1997-10-30 2002-07-09 Commvault Systems, Inc. Pipelined high speed data transfer mechanism
JPH11143754A (en) 1997-11-05 1999-05-28 Hitachi Ltd Version information and constitution information display method and device therefor, and computer readable recording medium for recording version information and constitution information display program
US6131190A (en) 1997-12-18 2000-10-10 Sidwell; Leland P. System for modifying JCL parameters to optimize data storage allocations
US6374336B1 (en) 1997-12-24 2002-04-16 Avid Technology, Inc. Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner
US6076148A (en) 1997-12-26 2000-06-13 Emc Corporation Mass storage subsystem and backup arrangement for digital data processing system which permits information to be backed up while host computer(s) continue(s) operating in connection with information stored on mass storage subsystem
US6154787A (en) 1998-01-21 2000-11-28 Unisys Corporation Grouping shared resources into one or more pools and automatically re-assigning shared resources from where they are not currently needed to where they are needed
US6260069B1 (en) 1998-02-10 2001-07-10 International Business Machines Corporation Direct data retrieval in a distributed computing system
DE69816415T2 (en) 1998-03-02 2004-04-15 Hewlett-Packard Co. (N.D.Ges.D.Staates Delaware), Palo Alto Data Backup System
US6026414A (en) 1998-03-05 2000-02-15 International Business Machines Corporation System including a proxy client to backup files in a distributed computing environment
US6289432B1 (en) 1998-03-25 2001-09-11 International Business Machines Corporation Sharing segments of storage by enabling the sharing of page tables
US6161111A (en) 1998-03-31 2000-12-12 Emc Corporation System and method for performing file-handling operations in a digital data processing system using an operating system-independent file map
US6167402A (en) 1998-04-27 2000-12-26 Sun Microsystems, Inc. High performance message store
US6163856A (en) 1998-05-29 2000-12-19 Sun Microsystems, Inc. Method and apparatus for file system disaster recovery
US20010052015A1 (en) * 1998-06-24 2001-12-13 Chueng-Hsien Lin Push-pull sevices for the internet
US6421711B1 (en) 1998-06-29 2002-07-16 Emc Corporation Virtual ports for data transferring of a data storage system
US6366986B1 (en) 1998-06-30 2002-04-02 Emc Corporation Method and apparatus for differential backup in a computer storage system
US6094605A (en) 1998-07-06 2000-07-25 Storage Technology Corporation Virtual automated cartridge system
US6415385B1 (en) 1998-07-29 2002-07-02 Unisys Corporation Digital signaturing method and system for packaging specialized native files for open network transport and for burning onto CD-ROM
US6269431B1 (en) 1998-08-13 2001-07-31 Emc Corporation Virtual storage and block level direct access of secondary storage for recovery of backup data
US6353878B1 (en) 1998-08-13 2002-03-05 Emc Corporation Remote control of backup media in a secondary storage subsystem through access to a primary storage subsystem
US6757705B1 (en) 1998-08-14 2004-06-29 Microsoft Corporation Method and system for client-side caching
GB2341249A (en) 1998-08-17 2000-03-08 Connected Place Limited A method of generating a difference file defining differences between an updated file and a base file
US6425057B1 (en) 1998-08-27 2002-07-23 Hewlett-Packard Company Caching protocol method and system based on request frequency and relative storage duration
US6286084B1 (en) * 1998-09-16 2001-09-04 Cisco Technology, Inc. Methods and apparatus for populating a network cache
US6920537B2 (en) 1998-12-31 2005-07-19 Emc Corporation Apparatus and methods for copying, backing up and restoring logical objects in a computer storage system by transferring blocks out of order or in parallel
US6487561B1 (en) 1998-12-31 2002-11-26 Emc Corporation Apparatus and methods for copying, backing up, and restoring data using a backup segment size larger than the storage block size
US7107395B1 (en) 1998-12-31 2006-09-12 Emc Corporation Apparatus and methods for operating a computer storage system
US6397308B1 (en) 1998-12-31 2002-05-28 Emc Corporation Apparatus and method for differential backup and restoration of data in a computer storage system
US6212512B1 (en) 1999-01-06 2001-04-03 Hewlett-Packard Company Integration of a database into file management software for protecting, tracking and retrieving data
US6324581B1 (en) 1999-03-03 2001-11-27 Emc Corporation File server system using file system storage, data movers, and an exchange of meta data among data movers for file locking and direct access to shared file systems
US6389432B1 (en) 1999-04-05 2002-05-14 Auspex Systems, Inc. Intelligent virtual volume access
US6519679B2 (en) 1999-06-11 2003-02-11 Dell Usa, L.P. Policy based storage configuration
US7035880B1 (en) 1999-07-14 2006-04-25 Commvault Systems, Inc. Modular backup and retrieval system used in conjunction with a storage area network
US6538669B1 (en) 1999-07-15 2003-03-25 Dell Products L.P. Graphical user interface for configuration of a storage system
US7395282B1 (en) 1999-07-15 2008-07-01 Commvault Systems, Inc. Hierarchical backup and retrieval system
US7389311B1 (en) 1999-07-15 2008-06-17 Commvault Systems, Inc. Modular backup and retrieval system
US6912629B1 (en) 1999-07-28 2005-06-28 Storage Technology Corporation System and method for restoring data from secondary volume to primary volume in a data storage system
US6490666B1 (en) 1999-08-20 2002-12-03 Microsoft Corporation Buffering data in a hierarchical data storage environment
US6496850B1 (en) 1999-08-31 2002-12-17 Accenture Llp Clean-up of orphaned server contexts
US6343324B1 (en) 1999-09-13 2002-01-29 International Business Machines Corporation Method and system for controlling access share storage devices in a network environment by configuring host-to-volume mapping data structures in the controller memory for granting and denying access to the devices
US7028096B1 (en) 1999-09-14 2006-04-11 Streaming21, Inc. Method and apparatus for caching for streaming data
US6625623B1 (en) 1999-12-16 2003-09-23 Livevault Corporation Systems and methods for backing up data files
US6564228B1 (en) 2000-01-14 2003-05-13 Sun Microsystems, Inc. Method of enabling heterogeneous platforms to utilize a universal file system in a storage area network
US6823377B1 (en) 2000-01-28 2004-11-23 International Business Machines Corporation Arrangements and methods for latency-sensitive hashing for collaborative web caching
US6542972B2 (en) 2000-01-31 2003-04-01 Commvault Systems, Inc. Logical view and access to physical storage in modular data and storage management system
US6760723B2 (en) 2000-01-31 2004-07-06 Commvault Systems Inc. Storage management across multiple time zones
US7003641B2 (en) 2000-01-31 2006-02-21 Commvault Systems, Inc. Logical view with granular access to exchange data managed by a modular data and storage management system
US6658436B2 (en) 2000-01-31 2003-12-02 Commvault Systems, Inc. Logical view and access to data managed by a modular data and storage management system
US6721767B2 (en) 2000-01-31 2004-04-13 Commvault Systems, Inc. Application specific rollback in a computer system
US6704730B2 (en) 2000-02-18 2004-03-09 Avamar Technologies, Inc. Hash file system and method for use in a commonality factoring system
US7117246B2 (en) 2000-02-22 2006-10-03 Sendmail, Inc. Electronic mail system with methodology providing distributed message store
US6952737B1 (en) 2000-03-03 2005-10-04 Intel Corporation Method and apparatus for accessing remote storage in a distributed storage cluster architecture
US7730113B1 (en) 2000-03-07 2010-06-01 Applied Discovery, Inc. Network-based system and method for accessing and processing emails and other electronic legal documents that may include duplicate information
US6438368B1 (en) 2000-03-30 2002-08-20 Ikadega, Inc. Information distribution system and method
US6356801B1 (en) 2000-05-19 2002-03-12 International Business Machines Corporation High availability work queuing in an automated data storage library
US6557030B1 (en) 2000-05-31 2003-04-29 Prediwave Corp. Systems and methods for providing video-on-demand services for broadcasting systems
US6665815B1 (en) 2000-06-22 2003-12-16 Hewlett-Packard Development Company, L.P. Physical incremental backup using snapshots
US6330642B1 (en) 2000-06-29 2001-12-11 Bull Hn Information Systems Inc. Three interconnected RAID disk controller data processing system architecture
US6909722B1 (en) 2000-07-07 2005-06-21 Qualcomm, Incorporated Method and apparatus for proportionately multiplexing data streams onto one data stream
US7082441B1 (en) 2000-08-17 2006-07-25 Emc Corporation Method and storage and manipulation of storage system metrics
US6886020B1 (en) 2000-08-17 2005-04-26 Emc Corporation Method and apparatus for storage system metrics management and archive
US6732125B1 (en) 2000-09-08 2004-05-04 Storage Technology Corporation Self archiving log structured volume with intrinsic data protection
EP1193616A1 (en) 2000-09-29 2002-04-03 Sony France S.A. Fixed-length sequence generation of items out of a database using descriptors
US6760812B1 (en) 2000-10-05 2004-07-06 International Business Machines Corporation System and method for coordinating state between networked caches
US6810398B2 (en) 2000-11-06 2004-10-26 Avamar Technologies, Inc. System and method for unorchestrated determination of data sequences using sticky byte factoring to determine breakpoints in digital sequences
US6557089B1 (en) 2000-11-28 2003-04-29 International Business Machines Corporation Backup by ID-suppressed instant virtual copy then physical backup copy with ID reintroduced
US7003551B2 (en) 2000-11-30 2006-02-21 Bellsouth Intellectual Property Corp. Method and apparatus for minimizing storage of common attachment files in an e-mail communications server
US6799258B1 (en) 2001-01-10 2004-09-28 Datacore Software Corporation Methods and apparatus for point-in-time volumes
US7194454B2 (en) 2001-03-12 2007-03-20 Lucent Technologies Method for organizing records of database search activity by topical relevance
US20020133601A1 (en) 2001-03-16 2002-09-19 Kennamer Walter J. Failover of servers over which data is partitioned
EP1244221A1 (en) 2001-03-23 2002-09-25 Sun Microsystems, Inc. Method and system for eliminating data redundancies
TW518513B (en) 2001-03-28 2003-01-21 Synq Technology Inc System and method to update an executing application software by modular way
US7315884B2 (en) * 2001-04-03 2008-01-01 Hewlett-Packard Development Company, L.P. Reduction of network retrieval latency using cache and digest
US20040181519A1 (en) 2002-07-09 2004-09-16 Mohammed Shahbaz Anwar Method for generating multidimensional summary reports from multidimensional data
US7685126B2 (en) 2001-08-03 2010-03-23 Isilon Systems, Inc. System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US7243163B1 (en) 2001-08-07 2007-07-10 Good Technology, Inc. System and method for full wireless synchronization of a data processing apparatus with a messaging system
US6662198B2 (en) 2001-08-30 2003-12-09 Zoteca Inc. Method and system for asynchronous transmission, backup, distribution of data and file sharing
US7586914B2 (en) 2001-09-27 2009-09-08 Broadcom Corporation Apparatus and method for hardware creation of a DOCSIS header
US20030174648A1 (en) 2001-10-17 2003-09-18 Mea Wang Content delivery network by-pass system
US7139809B2 (en) 2001-11-21 2006-11-21 Clearcube Technology, Inc. System and method for providing virtual network attached storage using excess distributed storage capacity
US7496604B2 (en) 2001-12-03 2009-02-24 Aol Llc Reducing duplication of files on a network
US20030115346A1 (en) * 2001-12-13 2003-06-19 Mchenry Stephen T. Multi-proxy network edge cache system and methods
CA2475319A1 (en) 2002-02-04 2003-08-14 Cataphora, Inc. A method and apparatus to visually present discussions for data mining purposes
US7539735B2 (en) 2002-03-06 2009-05-26 International Business Machines Corporation Multi-session no query restore
US20030188106A1 (en) * 2002-03-26 2003-10-02 At&T Corp. Cache validation using rejuvenation in a data network
US8650266B2 (en) 2002-03-26 2014-02-11 At&T Intellectual Property Ii, L.P. Cache validation using smart source selection in a data network
US6983351B2 (en) 2002-04-11 2006-01-03 International Business Machines Corporation System and method to guarantee overwrite of expired data in a virtual tape server
US20060089954A1 (en) 2002-05-13 2006-04-27 Anschutz Thomas A Scalable common access back-up architecture
JP4221646B2 (en) * 2002-06-26 2009-02-12 NEC Corporation Shared cache server
US6865655B1 (en) 2002-07-30 2005-03-08 Sun Microsystems, Inc. Methods and apparatus for backing up and restoring data portions stored in client computer systems
US6952758B2 (en) 2002-07-31 2005-10-04 International Business Machines Corporation Method and system for providing consistent data modification information to clients in a storage system
US7100089B1 (en) 2002-09-06 2006-08-29 3Pardata, Inc. Determining differences between snapshots
US7130970B2 (en) 2002-09-09 2006-10-31 Commvault Systems, Inc. Dynamic storage device pooling in a computer system
WO2004025423A2 (en) 2002-09-16 2004-03-25 Commvault Systems, Inc. System and method for blind media support
WO2004025483A1 (en) 2002-09-16 2004-03-25 Commvault Systems, Inc. System and method for optimizing storage operations
US7284030B2 (en) 2002-09-16 2007-10-16 Network Appliance, Inc. Apparatus and method for processing data in a network
US7171469B2 (en) 2002-09-16 2007-01-30 Network Appliance, Inc. Apparatus and method for storing data in a proxy cache in a network
US7287252B2 (en) 2002-09-27 2007-10-23 The United States Of America Represented By The Secretary Of The Navy Universal client and consumer
AU2003279847A1 (en) 2002-10-07 2004-05-04 Commvault Systems, Inc. System and method for managing stored data
US7664771B2 (en) 2002-10-16 2010-02-16 Microsoft Corporation Optimizing defragmentation operations in a differential snapshotter
US8176186B2 (en) * 2002-10-30 2012-05-08 Riverbed Technology, Inc. Transaction accelerator for client-server communications systems
US7065619B1 (en) 2002-12-20 2006-06-20 Data Domain, Inc. Efficient data storage system
US8375008B1 (en) 2003-01-17 2013-02-12 Robert Gomes Method and system for enterprise-wide retention of digital or electronic data
JP2006516341A (en) 2003-01-17 2006-06-29 Tacit Networks, Inc. Storage caching method and system for use with a distributed file system
GB0303192D0 (en) 2003-02-12 2003-03-19 Saviso Group Ltd Methods and apparatus for traffic management in peer-to-peer networks
US7174433B2 (en) 2003-04-03 2007-02-06 Commvault Systems, Inc. System and method for dynamically sharing media in a computer network
US7457982B2 (en) 2003-04-11 2008-11-25 Network Appliance, Inc. Writable virtual disk of read-only snapshot file objects
US8069225B2 (en) * 2003-04-14 2011-11-29 Riverbed Technology, Inc. Transparent client-server transaction accelerator
US7155465B2 (en) 2003-04-18 2006-12-26 Lee Howard F Method and apparatus for automatically archiving a file system
US20040230753A1 (en) 2003-05-16 2004-11-18 International Business Machines Corporation Methods and apparatus for providing service differentiation in a shared storage environment
US7454569B2 (en) 2003-06-25 2008-11-18 Commvault Systems, Inc. Hierarchical system and method for performing storage operations in a computer network
US8280926B2 (en) 2003-08-05 2012-10-02 Sepaton, Inc. Scalable de-duplication mechanism
US8938595B2 (en) 2003-08-05 2015-01-20 Sepaton, Inc. Emulated storage system
US20050060643A1 (en) 2003-08-25 2005-03-17 Miavia, Inc. Document similarity detection and classification system
US7904428B2 (en) 2003-09-23 2011-03-08 Symantec Corporation Methods and apparatus for recording write requests directed to a data store
US7725760B2 (en) 2003-09-23 2010-05-25 Symantec Operating Corporation Data storage system
US7577806B2 (en) 2003-09-23 2009-08-18 Symantec Operating Corporation Systems and methods for time dependent data storage and recovery
JP4267420B2 (en) 2003-10-20 2009-05-27 Hitachi, Ltd. Storage devices and backup acquisition method
US7251680B2 (en) 2003-10-31 2007-07-31 Veritas Operating Corporation Single instance backup of email message attachments
US7539707B2 (en) 2003-11-13 2009-05-26 Commvault Systems, Inc. System and method for performing an image level snapshot and for restoring partial volume data
US7613748B2 (en) 2003-11-13 2009-11-03 Commvault Systems, Inc. Stored data reverification management system and method
US7440982B2 (en) 2003-11-13 2008-10-21 Commvault Systems, Inc. System and method for stored data archive verification
WO2005065084A2 (en) 2003-11-13 2005-07-21 Commvault Systems, Inc. System and method for providing encryption in pipelined storage operations in a storage network
WO2005050381A2 (en) 2003-11-13 2005-06-02 Commvault Systems, Inc. Systems and methods for performing storage operations using network attached storage
US7412583B2 (en) 2003-11-14 2008-08-12 International Business Machines Corporation Virtual incremental storage method
US7272606B2 (en) 2003-11-26 2007-09-18 Veritas Operating Corporation System and method for detecting and storing file content access information within a file system
DE10356724B3 (en) * 2003-12-02 2005-06-16 Deutsches Zentrum für Luft- und Raumfahrt e.V. A method for reducing the volume of transport of data in data networks
US7155633B2 (en) 2003-12-08 2006-12-26 Solid Data Systems, Inc. Exchange server method and system
US7519726B2 (en) 2003-12-12 2009-04-14 International Business Machines Corporation Methods, apparatus and computer programs for enhanced access to resources within a network
WO2005064469A1 (en) 2003-12-19 2005-07-14 Network Appliance, Inc. System and method for supporting asynchronous data replication with very short update intervals
US7734820B1 (en) 2003-12-31 2010-06-08 Symantec Operating Corporation Adaptive caching for a distributed file sharing system
US7246272B2 (en) 2004-01-16 2007-07-17 International Business Machines Corporation Duplicate network address detection
JP4402997B2 (en) * 2004-03-26 2010-01-20 Hitachi, Ltd. Storage devices
US7246258B2 (en) * 2004-04-28 2007-07-17 Lenovo (Singapore) Pte. Ltd. Minimizing resynchronization time after backup system failures in an appliance-based business continuance architecture
US7343356B2 (en) 2004-04-30 2008-03-11 Commvault Systems, Inc. Systems and methods for storage modeling and costing
US7370163B2 (en) 2004-05-03 2008-05-06 Gemini Storage Adaptive cache engine for storage area network including systems and methods related thereto
US8055745B2 (en) 2004-06-01 2011-11-08 Inmage Systems, Inc. Methods and apparatus for accessing data from a primary data storage system for secondary storage
US7293035B2 (en) 2004-06-30 2007-11-06 International Business Machines Corporation System and method for performing compression/encryption on data such that the number of duplicate blocks in the transformed data is increased
US7383462B2 (en) 2004-07-02 2008-06-03 Hitachi, Ltd. Method and apparatus for encrypted remote copy for secure data backup and restoration
US20060020660A1 (en) * 2004-07-20 2006-01-26 Vishwa Prasad Proxy and cache architecture for document storage
US7631194B2 (en) 2004-09-09 2009-12-08 Microsoft Corporation Method, system, and apparatus for creating saved searches and auto discovery groups for a data protection system
US7587423B2 (en) 2004-09-17 2009-09-08 Sap Ag Multistep master data cleansing in operative business processes
US7386578B2 (en) 2004-10-29 2008-06-10 Sap Ag Associations between duplicate master data objects
US7536291B1 (en) 2004-11-08 2009-05-19 Commvault Systems, Inc. System and method to support simulated storage operations
JP4349301B2 (en) 2004-11-12 2009-10-21 NEC Corporation Storage management system and method and program
US20060136685A1 (en) 2004-12-17 2006-06-22 Sanrad Ltd. Method and system to maintain data consistency over an internet small computer system interface (iSCSI) network
US7437388B1 (en) 2004-12-21 2008-10-14 Symantec Corporation Protecting data for distributed applications using cooperative backup agents
US7711695B2 (en) 2005-01-18 2010-05-04 Oracle International Corporation Reducing memory used by metadata for duplicate user defined types
US8245131B2 (en) 2005-02-10 2012-08-14 Hewlett-Packard Development Company, L.P. Constraining layout variations for accommodating variable content in electronic documents
US7765186B1 (en) 2005-04-13 2010-07-27 Progress Software Corporation Update-anywhere replication of distributed systems
US7672979B1 (en) 2005-04-22 2010-03-02 Symantec Operating Corporation Backup and restore techniques using inconsistent state indicators
US8024292B2 (en) 2005-06-29 2011-09-20 Emc Corporation Creation of a single snapshot using a server job request
DE602005005312T2 (en) 2005-06-30 2009-03-12 Ixos Software Ag Method and system for managing electronic messages
US7401080B2 (en) 2005-08-17 2008-07-15 Microsoft Corporation Storage reports duplicate file detection
US7747577B2 (en) 2005-08-17 2010-06-29 International Business Machines Corporation Management of redundant objects in storage systems
US8296369B2 (en) 2005-09-27 2012-10-23 Research In Motion Limited Email server with proxy caching of unique identifiers
US7584338B1 (en) 2005-09-27 2009-09-01 Data Domain, Inc. Replication of deduplicated storage system
US7613752B2 (en) 2005-11-28 2009-11-03 Commvault Systems, Inc. Systems and methods for using metadata to enhance data management operations
US7606844B2 (en) 2005-12-19 2009-10-20 Commvault Systems, Inc. System and method for performing replication copy storage operations
US7543125B2 (en) 2005-12-19 2009-06-02 Commvault Systems, Inc. System and method for performing time-flexible calendric storage operations
US7617253B2 (en) 2005-12-19 2009-11-10 Commvault Systems, Inc. Destination systems and methods for performing data replication
US7636743B2 (en) 2005-12-19 2009-12-22 Commvault Systems, Inc. Pathname translation in a data replication system
US7620710B2 (en) 2005-12-19 2009-11-17 Commvault Systems, Inc. System and method for performing multi-path storage operations
US7651593B2 (en) 2005-12-19 2010-01-26 Commvault Systems, Inc. Systems and methods for performing data replication
US7617262B2 (en) 2005-12-19 2009-11-10 Commvault Systems, Inc. Systems and methods for monitoring application data in a data replication system
US7870355B2 (en) 2005-12-19 2011-01-11 Commvault Systems, Inc. Log based data replication system with disk swapping below a predetermined rate
US8301839B2 (en) * 2005-12-30 2012-10-30 Citrix Systems, Inc. System and method for performing granular invalidation of cached dynamically generated objects in a data communication network
US7840618B2 (en) * 2006-01-03 2010-11-23 Nec Laboratories America, Inc. Wide area networked file system
US7743051B1 (en) 2006-01-23 2010-06-22 Clearwell Systems, Inc. Methods, systems, and user interface for e-mail search and retrieval
US7899871B1 (en) 2006-01-23 2011-03-01 Clearwell Systems, Inc. Methods and systems for e-mail topic classification
US8170985B2 (en) 2006-01-31 2012-05-01 Emc Corporation Primary stub file retention and secondary retention coordination in a hierarchical storage system
US7472242B1 (en) 2006-02-14 2008-12-30 Network Appliance, Inc. Eliminating duplicate blocks during backup writes
US7761663B2 (en) * 2006-02-16 2010-07-20 Hewlett-Packard Development Company, L.P. Operating a replicated cache that includes receiving confirmation that a flush operation was initiated
US7725655B2 (en) 2006-02-16 2010-05-25 Hewlett-Packard Development Company, L.P. Method of operating distributed storage system in which data is read from replicated caches and stored as erasure-coded data
US8543782B2 (en) 2006-04-25 2013-09-24 Hewlett-Packard Development Company, L.P. Content-based, compression-enhancing routing in distributed, differential electronic-data storage systems
US8165221B2 (en) 2006-04-28 2012-04-24 Netapp, Inc. System and method for sampling based elimination of duplicate data
US8175875B1 (en) 2006-05-19 2012-05-08 Google Inc. Efficient indexing of documents with similar content
US8412682B2 (en) * 2006-06-29 2013-04-02 Netapp, Inc. System and method for retrieving and using block fingerprints for data deduplication
US20080005509A1 (en) * 2006-06-30 2008-01-03 International Business Machines Corporation Caching recovery information on a local system to expedite recovery
US8726242B2 (en) 2006-07-27 2014-05-13 Commvault Systems, Inc. Systems and methods for continuous data replication
US7720841B2 (en) * 2006-10-04 2010-05-18 International Business Machines Corporation Model-based self-optimizing distributed information management
US8527469B2 (en) 2006-10-13 2013-09-03 Sony Corporation System and method for automatic detection of duplicate digital photos
US7882077B2 (en) 2006-10-17 2011-02-01 Commvault Systems, Inc. Method and system for offline indexing of content and classifying stored data
US9465823B2 (en) 2006-10-19 2016-10-11 Oracle International Corporation System and method for data de-duplication
US10296629B2 (en) * 2006-10-20 2019-05-21 Oracle International Corporation Server supporting a consistent client-side cache
US8214517B2 (en) 2006-12-01 2012-07-03 Nec Laboratories America, Inc. Methods and systems for quick and efficient data management and/or processing
WO2008070688A1 (en) 2006-12-04 2008-06-12 Commvault Systems, Inc. Systems and methods for creating copies of data, such as archive copies
JP4997950B2 (en) 2006-12-11 2012-08-15 Fujitsu Limited Network management systems, network management programs and network management methods
US7840537B2 (en) 2006-12-22 2010-11-23 Commvault Systems, Inc. System and method for storing redundant information
US7831566B2 (en) 2006-12-22 2010-11-09 Commvault Systems, Inc. Systems and methods of hierarchical storage management, such as global management of storage operations
US7734669B2 (en) 2006-12-22 2010-06-08 Commvault Systems, Inc. Managing copies of data
US8775823B2 (en) 2006-12-29 2014-07-08 Commvault Systems, Inc. System and method for encrypting secondary copies of data
US7733910B2 (en) 2006-12-29 2010-06-08 Riverbed Technology, Inc. Data segmentation using shift-varying predicate function fingerprinting
JP5020673B2 (en) 2007-03-27 2012-09-05 Hitachi, Ltd. Computer system to prevent the storage of duplicate files
US7873809B2 (en) * 2007-03-29 2011-01-18 Hitachi, Ltd. Method and apparatus for de-duplication after mirror operation
US7769971B2 (en) * 2007-03-29 2010-08-03 Data Center Technologies Replication and restoration of single-instance storage pools
US7761425B1 (en) 2007-03-29 2010-07-20 Symantec Corporation Low-overhead means of performing data backup
JP4900811B2 (en) 2007-03-30 2012-03-21 Hitachi Computer Peripherals Co., Ltd. A storage system and storage control method
US8489830B2 (en) 2007-03-30 2013-07-16 Symantec Corporation Implementing read/write, multi-versioned file system on top of backup data
US8768895B2 (en) 2007-04-11 2014-07-01 Emc Corporation Subsegmenting for efficient storage, resemblance determination, and transmission
US20080256431A1 (en) 2007-04-13 2008-10-16 Arno Hornberger Apparatus and Method for Generating a Data File or for Reading a Data File
US7827150B1 (en) 2007-04-30 2010-11-02 Symantec Corporation Application aware storage appliance archiving
US9930099B2 (en) 2007-05-08 2018-03-27 Riverbed Technology, Inc. Hybrid segment-oriented file server and WAN accelerator
US8315984B2 (en) 2007-05-22 2012-11-20 Netapp, Inc. System and method for on-the-fly elimination of redundant data
US8626741B2 (en) 2007-06-15 2014-01-07 Emc Corporation Process for cataloging data objects backed up from a content addressed storage system
US20090012984A1 (en) 2007-07-02 2009-01-08 Equivio Ltd. Method for Organizing Large Numbers of Documents
US8028106B2 (en) 2007-07-06 2011-09-27 Proster Systems, Inc. Hardware acceleration of commonality factoring with removable media
US20090043767A1 (en) 2007-08-07 2009-02-12 Ashutosh Joshi Approach For Application-Specific Duplicate Detection
US8078729B2 (en) 2007-08-21 2011-12-13 Ntt Docomo, Inc. Media streaming with online caching and peer-to-peer forwarding
US7809765B2 (en) 2007-08-24 2010-10-05 General Electric Company Sequence identification and analysis
WO2009032711A1 (en) 2007-08-29 2009-03-12 Nirvanix, Inc. Policy-based file management for a storage delivery network
US8738575B2 (en) * 2007-09-17 2014-05-27 International Business Machines Corporation Data recovery in a hierarchical data storage system
US7870409B2 (en) 2007-09-26 2011-01-11 Hitachi, Ltd. Power efficient data storage with data de-duplication
US8548953B2 (en) 2007-11-12 2013-10-01 F5 Networks, Inc. File deduplication using storage tiers
US8180747B2 (en) 2007-11-12 2012-05-15 F5 Networks, Inc. Load sharing cluster file systems
US8244846B2 (en) 2007-12-26 2012-08-14 Symantec Corporation Balanced consistent hashing for distributed resource management
US8209334B1 (en) 2007-12-28 2012-06-26 Don Doerner Method to direct data to a specific one of several repositories
US8145614B1 (en) * 2007-12-28 2012-03-27 Emc Corporation Selection of a data path based on the likelihood that requested information is in a cache
US7962452B2 (en) 2007-12-28 2011-06-14 International Business Machines Corporation Data deduplication by separating data from meta data
US7797279B1 (en) 2007-12-31 2010-09-14 Emc Corporation Merging of incremental data streams with prior backed-up data
US8621240B1 (en) 2007-12-31 2013-12-31 Emc Corporation User-specific hash authentication
US8190835B1 (en) 2007-12-31 2012-05-29 Emc Corporation Global de-duplication in shared architectures
US8261240B2 (en) 2008-01-15 2012-09-04 Microsoft Corporation Debugging lazily evaluated program components
US8473956B2 (en) 2008-01-15 2013-06-25 Microsoft Corporation Priority based scheduling system for server
US20090204636A1 (en) 2008-02-11 2009-08-13 Microsoft Corporation Multimodal object de-duplication
US7814074B2 (en) 2008-03-14 2010-10-12 International Business Machines Corporation Method and system for assuring integrity of deduplicated data
US8199911B1 (en) 2008-03-31 2012-06-12 Symantec Operating Corporation Secure encryption algorithm for data deduplication on untrusted storage
US7516186B1 (en) 2008-04-01 2009-04-07 International Business Machines Corporation Thread based view and archive for simple mail transfer protocol (SMTP) clients devices and methods
JP2009251725A (en) 2008-04-02 2009-10-29 Hitachi Ltd Storage controller and duplicated data detection method using storage controller
US7539710B1 (en) 2008-04-11 2009-05-26 International Business Machines Corporation Method of and system for deduplicating backed up data in a client-server environment
US9395929B2 (en) 2008-04-25 2016-07-19 Netapp, Inc. Network storage server with integrated encryption, compression and deduplication capability
US8515909B2 (en) 2008-04-29 2013-08-20 International Business Machines Corporation Enhanced method and system for assuring integrity of deduplicated data
US8620877B2 (en) 2008-04-30 2013-12-31 International Business Machines Corporation Tunable data fingerprinting for optimizing data deduplication
US8200638B1 (en) 2008-04-30 2012-06-12 Netapp, Inc. Individual file restore from block-level incremental backups by using client-server backup protocol
US8527482B2 (en) 2008-06-06 2013-09-03 Chrysalis Storage, Llc Method for reducing redundancy between two or more datasets
US8484162B2 (en) 2008-06-24 2013-07-09 Commvault Systems, Inc. De-duplication systems and methods for application-specific data
US8219524B2 (en) 2008-06-24 2012-07-10 Commvault Systems, Inc. Application-aware and remote single instance data management
US9098495B2 (en) 2008-06-24 2015-08-04 Commvault Systems, Inc. Application-aware and remote single instance data management
US8108446B1 (en) 2008-06-27 2012-01-31 Symantec Corporation Methods and systems for managing deduplicated data using unilateral referencing
US8468320B1 (en) 2008-06-30 2013-06-18 Symantec Operating Corporation Scalability of data deduplication through the use of a locality table
US8176269B2 (en) * 2008-06-30 2012-05-08 International Business Machines Corporation Managing metadata for data blocks used in a deduplication system
US8166263B2 (en) 2008-07-03 2012-04-24 Commvault Systems, Inc. Continuous data protection over intermittent connections, such as continuous data backup for laptops or wireless devices
US8046550B2 (en) 2008-07-14 2011-10-25 Quest Software, Inc. Systems and methods for performing backup operations of virtual machine files
JP4322958B1 (en) 2008-07-31 2009-09-02 国立大学法人広島大学 Measuring apparatus and methods
US7913114B2 (en) 2008-07-31 2011-03-22 Quantum Corporation Repair of a corrupt data segment used by a de-duplication engine
US8788466B2 (en) 2008-08-05 2014-07-22 International Business Machines Corporation Efficient transfer of deduplicated data
US8086799B2 (en) 2008-08-12 2011-12-27 Netapp, Inc. Scalable deduplication of stored data
US20100049927A1 (en) 2008-08-21 2010-02-25 International Business Machines Corporation Enhancement of data mirroring to provide parallel processing of overlapping writes
US20100049926A1 (en) 2008-08-21 2010-02-25 International Business Machines Corporation Enhancement of data mirroring to provide parallel processing of overlapping writes
US8725688B2 (en) 2008-09-05 2014-05-13 Commvault Systems, Inc. Image level copy or restore, such as image level restore without knowledge of data object metadata
US8307177B2 (en) 2008-09-05 2012-11-06 Commvault Systems, Inc. Systems and methods for management of virtualization data
US9098519B2 (en) 2008-09-16 2015-08-04 File System Labs Llc Methods and apparatus for distributed data storage
US8620845B2 (en) 2008-09-24 2013-12-31 Timothy John Stoakes Identifying application metadata in a backup stream
US9015181B2 (en) 2008-09-26 2015-04-21 Commvault Systems, Inc. Systems and methods for managing single instancing data
US8495032B2 (en) 2008-10-01 2013-07-23 International Business Machines Corporation Policy based sharing of redundant data across storage pools in a deduplicating system
US20100088296A1 (en) * 2008-10-03 2010-04-08 Netapp, Inc. System and method for organizing data to facilitate data deduplication
US8626723B2 (en) 2008-10-14 2014-01-07 Vmware, Inc. Storage-network de-duplication
US8082228B2 (en) 2008-10-31 2011-12-20 Netapp, Inc. Remote office duplication
US8412677B2 (en) 2008-11-26 2013-04-02 Commvault Systems, Inc. Systems and methods for byte-level or quasi byte-level single instancing
US8495161B2 (en) 2008-12-12 2013-07-23 Verizon Patent And Licensing Inc. Duplicate MMS content checking
US8200923B1 (en) 2008-12-31 2012-06-12 Emc Corporation Method and apparatus for block level data de-duplication
US8291183B2 (en) 2009-01-15 2012-10-16 Emc Corporation Assisted mainframe data de-duplication
US20100306180A1 (en) 2009-01-28 2010-12-02 Digitiliti, Inc. File revision management
US8074043B1 (en) 2009-01-30 2011-12-06 Symantec Corporation Method and apparatus to recover from interrupted data streams in a deduplication system
US8108638B2 (en) 2009-02-06 2012-01-31 International Business Machines Corporation Backup of deduplicated data
US8645334B2 (en) 2009-02-27 2014-02-04 Andrew LEPPARD Minimize damage caused by corruption of de-duplicated data
US8140491B2 (en) 2009-03-26 2012-03-20 International Business Machines Corporation Storage management through adaptive deduplication
US8205065B2 (en) 2009-03-30 2012-06-19 Exar Corporation System and method for data deduplication
US8401996B2 (en) 2009-03-30 2013-03-19 Commvault Systems, Inc. Storing a variable number of instances of data objects
US8261126B2 (en) 2009-04-03 2012-09-04 Microsoft Corporation Bare metal machine recovery from the cloud
US20100257403A1 (en) 2009-04-03 2010-10-07 Microsoft Corporation Restoration of a system from a set of full and partial delta system snapshots across a distributed system
US8805953B2 (en) 2009-04-03 2014-08-12 Microsoft Corporation Differential file and system restores from peers and the cloud
US8095756B1 (en) 2009-04-28 2012-01-10 Netapp, Inc. System and method for coordinating deduplication operations and backup operations of a storage volume
US8578120B2 (en) 2009-05-22 2013-11-05 Commvault Systems, Inc. Block-level single instancing
US8214611B2 (en) 2009-06-04 2012-07-03 Hitachi, Ltd. Storage subsystem and its data processing method, and computer system
US20100318759A1 (en) 2009-06-15 2010-12-16 Microsoft Corporation Distributed RDC chunk store
US8122284B2 (en) 2009-06-18 2012-02-21 Taylor Tracy M N+1 failover and resynchronization of data storage appliances
US20100332401A1 (en) 2009-06-30 2010-12-30 Anand Prahlad Performing data storage operations with a cloud storage environment, including automatically selecting among multiple cloud storage sites
US8930306B1 (en) 2009-07-08 2015-01-06 Commvault Systems, Inc. Synchronized data deduplication
GB2471715A (en) 2009-07-10 2011-01-12 Hewlett Packard Development Co Determining the data chunks to be used as seed data to restore a database, from manifests of chunks stored in a de-duplicated data chunk store.
US8280854B1 (en) 2009-09-01 2012-10-02 Symantec Corporation Systems and methods for relocating deduplicated data within a multi-device storage system
US8204862B1 (en) 2009-10-02 2012-06-19 Symantec Corporation Systems and methods for restoring deduplicated data
US8595188B2 (en) 2009-11-06 2013-11-26 International Business Machines Corporation Operating system and file system independent incremental data backup
US8380688B2 (en) 2009-11-06 2013-02-19 International Business Machines Corporation Method and apparatus for data compression
US8504528B2 (en) * 2009-11-09 2013-08-06 Ca, Inc. Duplicate backup data identification and consolidation
US20110119741A1 (en) 2009-11-18 2011-05-19 Hotchalk Inc. Method for Conditionally Obtaining Files From a Local Appliance
WO2011082138A1 (en) 2009-12-31 2011-07-07 Commvault Systems, Inc. Systems and methods for performing data management operations using snapshots
US8224875B1 (en) 2010-01-05 2012-07-17 Symantec Corporation Systems and methods for removing unreferenced data segments from deduplicated data systems
US8452932B2 (en) 2010-01-06 2013-05-28 Storsimple, Inc. System and method for efficiently creating off-site data volume back-ups
US8352422B2 (en) 2010-03-30 2013-01-08 Commvault Systems, Inc. Data restore systems and methods in a replication environment
US8468135B2 (en) * 2010-04-14 2013-06-18 International Business Machines Corporation Optimizing data transmission bandwidth consumption over a wide area network
US8244992B2 (en) 2010-05-24 2012-08-14 Spackman Stephen P Policy based data retrieval performance for deduplicated data
US8572038B2 (en) 2010-05-28 2013-10-29 Commvault Systems, Inc. Systems and methods for performing data replication
US8370315B1 (en) 2010-05-28 2013-02-05 Symantec Corporation System and method for high performance deduplication indexing
US8504526B2 (en) 2010-06-04 2013-08-06 Commvault Systems, Inc. Failover systems and methods for performing backup operations
US20110314070A1 (en) 2010-06-18 2011-12-22 Microsoft Corporation Optimization of storage and transmission of data
US8965907B2 (en) 2010-06-21 2015-02-24 Microsoft Technology Licensing, Llc Assisted filtering of multi-dimensional data
US20120011101A1 (en) 2010-07-12 2012-01-12 Computer Associates Think, Inc. Integrating client and server deduplication systems
US8548944B2 (en) 2010-07-15 2013-10-01 Delphix Corp. De-duplication based backup of file systems
US9678688B2 (en) 2010-07-16 2017-06-13 EMC IP Holding Company LLC System and method for data deduplication for disk storage subsystems
US8838624B2 (en) 2010-09-24 2014-09-16 Hitachi Data Systems Corporation System and method for aggregating query results in a fault-tolerant database management system
US8549350B1 (en) * 2010-09-30 2013-10-01 Emc Corporation Multi-tier recovery
US8578109B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US8364652B2 (en) 2010-09-30 2013-01-29 Commvault Systems, Inc. Content aligned block-based deduplication
US8886613B2 (en) 2010-10-12 2014-11-11 Don Doerner Prioritizing data deduplication
US9442806B1 (en) 2010-11-30 2016-09-13 Veritas Technologies Llc Block-level deduplication
US8954446B2 (en) 2010-12-14 2015-02-10 Comm Vault Systems, Inc. Client-side repository in a networked deduplicated storage system
US9020900B2 (en) 2010-12-14 2015-04-28 Commvault Systems, Inc. Distributed deduplicated storage system
KR20120072909A (en) 2010-12-24 2012-07-04 주식회사 케이티 Distribution storage system with content-based deduplication function and object distributive storing method thereof, and computer-readable recording medium
US9823981B2 (en) 2011-03-11 2017-11-21 Microsoft Technology Licensing, Llc Backup and restore strategies for data deduplication
US8719264B2 (en) 2011-03-31 2014-05-06 Commvault Systems, Inc. Creating secondary copies of data based on searches for content
US8849762B2 (en) 2011-03-31 2014-09-30 Commvault Systems, Inc. Restoring computing environments, such as autorecovery of file systems at certain points in time
US9323820B1 (en) 2011-06-30 2016-04-26 Emc Corporation Virtual datacenter redundancy
US8775376B2 (en) 2011-06-30 2014-07-08 International Business Machines Corporation Hybrid data backup in a networked computing environment
US9128901B1 (en) 2011-12-30 2015-09-08 Emc Corporation Continuous protection of data and storage management configuration
US9298715B2 (en) 2012-03-07 2016-03-29 Commvault Systems, Inc. Data storage system utilizing proxy device for storage operations
US9286327B2 (en) 2012-03-30 2016-03-15 Commvault Systems, Inc. Data storage recovery automation
US9342537B2 (en) 2012-04-23 2016-05-17 Commvault Systems, Inc. Integrated snapshot interface for a data storage system
US20130339310A1 (en) 2012-06-13 2013-12-19 Commvault Systems, Inc. Restore using a client side signature repository in a networked storage system
US8909980B1 (en) 2012-06-29 2014-12-09 Emc Corporation Coordinating processing for request redirection
US9075820B2 (en) 2012-07-30 2015-07-07 Hewlett-Packard Development Company, L.P. Distributed file system at network switch
US8938481B2 (en) 2012-08-13 2015-01-20 Commvault Systems, Inc. Generic file level restore from a block-level secondary copy
US9665591B2 (en) 2013-01-11 2017-05-30 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9886346B2 (en) 2013-01-11 2018-02-06 Commvault Systems, Inc. Single snapshot for multiple agents
US9372865B2 (en) 2013-02-12 2016-06-21 Atlantis Computing, Inc. Deduplication metadata access in deduplication file system
US9705730B1 (en) 2013-05-07 2017-07-11 Axcient, Inc. Cloud storage using Merkle trees
US9483363B2 (en) 2013-05-08 2016-11-01 Commvault Systems, Inc. Use of temporary secondary copies in failover operations
EP2997475A4 (en) 2013-05-16 2017-03-22 Hewlett-Packard Enterprise Development LP Deduplicated data storage system having distributed manifest
US9298724B1 (en) 2013-06-14 2016-03-29 Symantec Corporation Systems and methods for preserving deduplication efforts after backup-job failures
US9201800B2 (en) 2013-07-08 2015-12-01 Dell Products L.P. Restoring temporal locality in global and local deduplication storage systems
US9753812B2 (en) 2014-01-24 2017-09-05 Commvault Systems, Inc. Generating mapping information for single snapshot for multiple applications
US20150212894A1 (en) 2014-01-24 2015-07-30 Commvault Systems, Inc. Restoring application data from a single snapshot for multiple applications
US9632874B2 (en) 2014-01-24 2017-04-25 Commvault Systems, Inc. Database application backup in single snapshot for multiple applications
US9639426B2 (en) 2014-01-24 2017-05-02 Commvault Systems, Inc. Single snapshot for multiple applications
US9495251B2 (en) 2014-01-24 2016-11-15 Commvault Systems, Inc. Snapshot readiness checking and reporting
US9779153B2 (en) 2014-03-03 2017-10-03 Netapp, Inc. Data transfer between storage systems using data fingerprints
US20150261776A1 (en) 2014-03-17 2015-09-17 Commvault Systems, Inc. Managing deletions from a deduplication database
US9633056B2 (en) 2014-03-17 2017-04-25 Commvault Systems, Inc. Maintaining a deduplication database
US20160042090A1 (en) 2014-08-06 2016-02-11 Commvault Systems, Inc. Preserving the integrity of a snapshot on a storage device via ephemeral write operations in an information management system
US20160154709A1 (en) 2014-08-06 2016-06-02 Commvault Systems, Inc. Point-in-time backups of a production application made accessible over fibre channel and/or iscsi as data sources to a remote application by representing the backups as pseudo-disks operating apart from the production application and its host
US9852026B2 (en) 2014-08-06 2017-12-26 Commvault Systems, Inc. Efficient application recovery in an information management system based on a pseudo-storage-device driver
US9774672B2 (en) 2014-09-03 2017-09-26 Commvault Systems, Inc. Consolidated processing of storage-array commands by a snapshot-control media agent
US10042716B2 (en) 2014-09-03 2018-08-07 Commvault Systems, Inc. Consolidated processing of storage-array commands using a forwarder media agent in conjunction with a snapshot-control media agent
US9916206B2 (en) 2014-09-30 2018-03-13 Code 42 Software, Inc. Deduplicated data distribution techniques
US9575673B2 (en) 2014-10-29 2017-02-21 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US9448731B2 (en) 2014-11-14 2016-09-20 Commvault Systems, Inc. Unified snapshot storage management
US9648105B2 (en) 2014-11-14 2017-05-09 Commvault Systems, Inc. Unified snapshot storage management, using an enhanced storage manager and enhanced media agents
US20160299818A1 (en) 2015-04-09 2016-10-13 Commvault Systems, Inc. Highly reusable deduplication database after disaster recovery
US20160350391A1 (en) 2015-05-26 2016-12-01 Commvault Systems, Inc. Replication using deduplicated secondary copy data
US10089183B2 (en) 2015-07-31 2018-10-02 Hiveio Inc. Method and apparatus for reconstructing and checking the consistency of deduplication metadata of a deduplication file system
US20170193003A1 (en) 2015-12-30 2017-07-06 Commvault Systems, Inc. Redundant and robust distributed deduplication data storage system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7472238B1 (en) * 2004-11-05 2008-12-30 Commvault Systems, Inc. Systems and methods for recovering electronic information from a storage medium
US7822939B1 (en) * 2007-09-25 2010-10-26 Emc Corporation Data de-duplication using thin provisioning
US20100070478A1 (en) * 2008-09-15 2010-03-18 International Business Machines Corporation Retrieval and recovery of data chunks from alternate data stores in a deduplicating system
US7814149B1 (en) * 2008-09-29 2010-10-12 Symantec Operating Corporation Client side data deduplication
WO2010131292A1 (en) * 2009-05-13 2010-11-18 Hitachi, Ltd. Storage system and utilization management method for storage system
US20100312752A1 (en) * 2009-06-08 2010-12-09 Symantec Corporation Source Classification For Performing Deduplication In A Backup Operation
US20110161723A1 (en) * 2009-12-28 2011-06-30 Riverbed Technology, Inc. Disaster recovery using local and cloud spanning deduplicated storage system

Cited By (196)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9274803B2 (en) 2000-01-31 2016-03-01 Commvault Systems, Inc. Storage of application specific profiles correlating to document versions
US9164850B2 (en) 2001-09-28 2015-10-20 Commvault Systems, Inc. System and method for archiving objects in an information store
US9633232B2 (en) 2004-11-15 2017-04-25 Commvault Systems, Inc. System and method for encrypting secondary copies of data
US9411986B2 (en) 2004-11-15 2016-08-09 Commvault Systems, Inc. System and method for encrypting secondary copies of data
US9930118B2 (en) 2005-12-19 2018-03-27 Commvault Systems, Inc. Systems and methods for granular resource management in a storage network
US9313143B2 (en) 2005-12-19 2016-04-12 Commvault Systems, Inc. Systems and methods for granular resource management in a storage network
US9400803B2 (en) 2006-12-18 2016-07-26 Commvault Systems, Inc. Systems and methods for restoring data from network attached storage
US9652335B2 (en) 2006-12-18 2017-05-16 Commvault Systems, Inc. Systems and methods for restoring data from network attached storage
US9124611B2 (en) 2006-12-18 2015-09-01 Commvault Systems, Inc. Systems and methods for writing data and storage system specific metadata to network attached storage device
US10198324B2 (en) 2008-06-18 2019-02-05 Commvault Systems, Inc. Data protection scheduling, such as providing a flexible backup window in a data protection system
US9262226B2 (en) 2008-06-19 2016-02-16 Commvault Systems, Inc. Data storage resource allocation by employing dynamic methods and blacklisting resource request pools
US10162677B2 (en) 2008-06-19 2018-12-25 Commvault Systems, Inc. Data storage resource allocation list updating for data storage operations
US9612916B2 (en) 2008-06-19 2017-04-04 Commvault Systems, Inc. Data storage resource allocation using blacklisting of data storage requests classified in the same category as a data storage request that is determined to fail if attempted
US9823979B2 (en) 2008-06-19 2017-11-21 Commvault Systems, Inc. Updating a list of data storage requests if an abbreviated resource check determines that a request in the list would fail if attempted
US9128883B2 (en) 2008-06-19 2015-09-08 Commvault Systems, Inc. Data storage resource allocation by performing abbreviated resource checks based on relative chances of failure of the data storage resources to determine whether data storage requests would fail
US9639400B2 (en) 2008-06-19 2017-05-02 Commvault Systems, Inc. Data storage resource allocation by employing dynamic methods and blacklisting resource request pools
US9405763B2 (en) 2008-06-24 2016-08-02 Commvault Systems, Inc. De-duplication systems and methods for application-specific data
US10091146B2 (en) 2008-11-05 2018-10-02 Commvault Systems, Inc. System and method for monitoring and copying multimedia messages to storage locations in compliance with a policy
US8930306B1 (en) 2009-07-08 2015-01-06 Commvault Systems, Inc. Synchronized data deduplication
US9639289B2 (en) 2010-09-30 2017-05-02 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US9898225B2 (en) 2010-09-30 2018-02-20 Commvault Systems, Inc. Content aligned block-based deduplication
US9557929B2 (en) 2010-09-30 2017-01-31 Commvault Systems, Inc. Data recovery operations, such as recovery from modified network data management protocol data
US10126973B2 (en) 2010-09-30 2018-11-13 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US9239687B2 (en) 2010-09-30 2016-01-19 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US10275318B2 (en) 2010-09-30 2019-04-30 Commvault Systems, Inc. Data recovery operations, such as recovery from modified network data management protocol data
US9110602B2 (en) 2010-09-30 2015-08-18 Commvault Systems, Inc. Content aligned block-based deduplication
US8577851B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Content aligned block-based deduplication
US8578109B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US8572340B2 (en) 2010-09-30 2013-10-29 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
US9619480B2 (en) 2010-09-30 2017-04-11 Commvault Systems, Inc. Content aligned block-based deduplication
US9588972B2 (en) 2010-09-30 2017-03-07 Commvault Systems, Inc. Efficient data management improvements, such as docking limited-feature data management modules to a full-featured data management system
US9898478B2 (en) 2010-12-14 2018-02-20 Commvault Systems, Inc. Distributed deduplicated storage system
US9116850B2 (en) 2010-12-14 2015-08-25 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US10191816B2 (en) 2010-12-14 2019-01-29 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US8954446B2 (en) 2010-12-14 2015-02-10 Comm Vault Systems, Inc. Client-side repository in a networked deduplicated storage system
US9020900B2 (en) 2010-12-14 2015-04-28 Commvault Systems, Inc. Distributed deduplicated storage system
US9104623B2 (en) 2010-12-14 2015-08-11 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US9092378B2 (en) 2011-03-31 2015-07-28 Commvault Systems, Inc. Restoring computing environments, such as autorecovery of file systems at certain points in time
US9648106B2 (en) 2011-04-27 2017-05-09 Commvault Systems, Inc. System and method for client policy assignment in a data storage system
US9323466B2 (en) 2011-04-27 2016-04-26 Commvault Systems, Inc. System and method for client policy assignment in a data storage system
US20120303595A1 (en) * 2011-05-25 2012-11-29 Inventec Corporation Data restoration method for data de-duplication
US9451023B2 (en) 2011-09-30 2016-09-20 Commvault Systems, Inc. Information management of virtual machines having mapped storage devices
US9292815B2 (en) 2012-03-23 2016-03-22 Commvault Systems, Inc. Automation of data storage activities
US10157184B2 (en) 2012-03-30 2018-12-18 Commvault Systems, Inc. Data previewing before recalling large data files
US9734018B2 (en) 2012-03-30 2017-08-15 Commvault Systems, Inc. Data storage recovery automation
US9286327B2 (en) 2012-03-30 2016-03-15 Commvault Systems, Inc. Data storage recovery automation
US9189167B2 (en) 2012-05-31 2015-11-17 Commvault Systems, Inc. Shared library in a data storage system
US10126949B2 (en) 2012-05-31 2018-11-13 Commvault Systems, Inc. Shared library in a data storage system
US8977672B2 (en) 2012-06-08 2015-03-10 Commvault Systems, Inc. Intelligent scheduling for remote computers
US10033813B2 (en) 2012-06-12 2018-07-24 Commvault Systems, Inc. External storage manager for a data storage cell
US9189170B2 (en) 2012-06-12 2015-11-17 Commvault Systems, Inc. External storage manager for a data storage cell
US9485311B2 (en) 2012-06-12 2016-11-01 Commvault Systems, Inc. External storage manager for a data storage cell
US9858156B2 (en) 2012-06-13 2018-01-02 Commvault Systems, Inc. Dedicated client-side signature generator in a networked storage system
US9218376B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Intelligent data sourcing in a networked storage system
US9251186B2 (en) 2012-06-13 2016-02-02 Commvault Systems, Inc. Backup using a client-side signature repository in a networked storage system
US9218375B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Dedicated client-side signature generator in a networked storage system
US10176053B2 (en) 2012-06-13 2019-01-08 Commvault Systems, Inc. Collaborative restore in a networked storage system
US9218374B2 (en) 2012-06-13 2015-12-22 Commvault Systems, Inc. Collaborative restore in a networked storage system
US9659076B2 (en) 2012-07-20 2017-05-23 Commvault Systems, Inc. Systems and methods for database archiving
US9275086B2 (en) 2012-07-20 2016-03-01 Commvault Systems, Inc. Systems and methods for database archiving
US9483201B2 (en) 2012-07-31 2016-11-01 Commvault Systems, Inc. Administering a shared, on-line pool of data storage resources for performing data storage operations
US10152231B2 (en) 2012-07-31 2018-12-11 Commvault Systems, Inc. Administering a shared, on-line pool of data storage resources for performing data storage operations
US9632882B2 (en) 2012-08-13 2017-04-25 Commvault Systems, Inc. Generic file level restore from a block-level secondary copy
US10089193B2 (en) 2012-08-13 2018-10-02 Commvault Systems, Inc. Generic file level restore from a block-level secondary copy
US8938481B2 (en) 2012-08-13 2015-01-20 Commvault Systems, Inc. Generic file level restore from a block-level secondary copy
US9483478B2 (en) 2012-08-13 2016-11-01 Commvault Systems, Inc. Lightweight mounting of a secondary copy of file system data
US9026498B2 (en) 2012-08-13 2015-05-05 Commvault Systems, Inc. Lightweight mounting of a secondary copy of file system data
US10007453B2 (en) 2012-08-13 2018-06-26 Commvault Systems, Inc. Lightweight mounting of a secondary copy of file system data
US9740702B2 (en) 2012-12-21 2017-08-22 Commvault Systems, Inc. Systems and methods to identify unprotected virtual machines
US9965316B2 (en) 2012-12-21 2018-05-08 Commvault Systems, Inc. Archiving virtual machines in a data storage system
US9747169B2 (en) 2012-12-21 2017-08-29 Commvault Systems, Inc. Reporting using data obtained during backup of primary storage
US9619339B2 (en) 2012-12-21 2017-04-11 Commvault Systems, Inc. Systems and methods to confirm replication data accuracy for data backup in data storage systems
US10296607B2 (en) 2012-12-21 2019-05-21 Commvault Systems, Inc. Systems and methods to detect deleted files
US9390109B2 (en) 2012-12-21 2016-07-12 Commvault Systems, Inc. Systems and methods to detect deleted files
US9633216B2 (en) 2012-12-27 2017-04-25 Commvault Systems, Inc. Application of information management policies based on operation with a geographic entity
US10303559B2 (en) 2012-12-27 2019-05-28 Commvault Systems, Inc. Restoration of centralized data storage manager, such as data storage manager in a hierarchical data storage system
US9977687B2 (en) 2013-01-08 2018-05-22 Commvault Systems, Inc. Virtual server agent load balancing
US9846620B2 (en) 2013-01-11 2017-12-19 Commvault Systems, Inc. Table level database restore in a data storage system
US9336226B2 (en) 2013-01-11 2016-05-10 Commvault Systems, Inc. Criteria-based data synchronization management
US9811423B2 (en) 2013-01-11 2017-11-07 Commvault Systems, Inc. Partial file restore in a data storage system
US9804930B2 (en) 2013-01-11 2017-10-31 Commvault Systems, Inc. Partial file restore in a data storage system
US10108652B2 (en) 2013-01-11 2018-10-23 Commvault Systems, Inc. Systems and methods to process block-level backup for selective file restoration for virtual machines
US9766987B2 (en) 2013-01-11 2017-09-19 Commvault Systems, Inc. Table level database restore in a data storage system
US9898481B2 (en) 2013-01-11 2018-02-20 Commvault Systems, Inc. Data synchronization management
US9665591B2 (en) 2013-01-11 2017-05-30 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9262435B2 (en) 2013-01-11 2016-02-16 Commvault Systems, Inc. Location-based data synchronization management
US9633033B2 (en) 2013-01-11 2017-04-25 Commvault Systems, Inc. High availability distributed deduplicated storage system
US10229133B2 (en) 2013-01-11 2019-03-12 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9720787B2 (en) 2013-01-11 2017-08-01 Commvault Systems, Inc. Table level database restore in a data storage system
US9430491B2 (en) 2013-01-11 2016-08-30 Commvault Systems, Inc. Request-based data synchronization management
US9760444B2 (en) 2013-01-11 2017-09-12 Commvault Systems, Inc. Sharing of secondary storage data
US10140037B2 (en) 2013-01-14 2018-11-27 Commvault Systems, Inc. Partial sharing of secondary storage files in a data storage system
US9483489B2 (en) 2013-01-14 2016-11-01 Commvault Systems, Inc. Partial sharing of secondary storage files in a data storage system
US9766989B2 (en) 2013-01-14 2017-09-19 Commvault Systems, Inc. Creation of virtual machine placeholders in a data storage system
US9652283B2 (en) 2013-01-14 2017-05-16 Commvault Systems, Inc. Creation of virtual machine placeholders in a data storage system
US9459968B2 (en) 2013-03-11 2016-10-04 Commvault Systems, Inc. Single index to query multiple backup formats
US9734348B2 (en) 2013-03-12 2017-08-15 Commvault Systems, Inc. Automatic file encryption
US9367702B2 (en) 2013-03-12 2016-06-14 Commvault Systems, Inc. Automatic file encryption
US9990512B2 (en) 2013-03-12 2018-06-05 Commvault Systems, Inc. File backup with selective encryption
US9483655B2 (en) 2013-03-12 2016-11-01 Commvault Systems, Inc. File backup with selective encryption
US9934103B2 (en) 2013-04-16 2018-04-03 Commvault Systems, Inc. Managing multi-source restore operations in an information management system
US9405635B2 (en) 2013-04-16 2016-08-02 Commvault Systems, Inc. Multi-source restore in an information management system
US9483363B2 (en) 2013-05-08 2016-11-01 Commvault Systems, Inc. Use of temporary secondary copies in failover operations
US9483362B2 (en) 2013-05-08 2016-11-01 Commvault Systems, Inc. Use of auxiliary data protection software in failover operations
US10001935B2 (en) 2013-05-08 2018-06-19 Commvault Systems, Inc. Use of auxiliary data protection software in failover operations
US9483364B2 (en) 2013-05-08 2016-11-01 Commvault Systems, Inc. Synchronization of local secondary copies with a remote storage management component
US9483361B2 (en) 2013-05-08 2016-11-01 Commvault Systems, Inc. Information management cell with failover management capability
US10043147B2 (en) 2013-05-29 2018-08-07 Commvault Systems, Inc. Assessing user performance in a community of users of data storage resources
US9483558B2 (en) 2013-05-29 2016-11-01 Commvault Systems, Inc. Assessing user performance in a community of users of data storage resources
US9939981B2 (en) 2013-09-12 2018-04-10 Commvault Systems, Inc. File manager integration with virtualization in an information management system with an enhanced storage manager, including user control and storage management of virtual machines
US9590886B2 (en) 2013-11-01 2017-03-07 Commvault Systems, Inc. Systems and methods for differential health checking of an information management system
US9928258B2 (en) 2013-11-01 2018-03-27 Commvault Systems, Inc. Differential health checking of an information management system
US10169121B2 (en) 2014-02-27 2019-01-01 Commvault Systems, Inc. Work flow management for an information management system
US9769260B2 (en) 2014-03-05 2017-09-19 Commvault Systems, Inc. Cross-system storage management for transferring data across autonomous information management systems
US10205780B2 (en) 2014-03-05 2019-02-12 Commvault Systems, Inc. Cross-system storage management for transferring data across autonomous information management systems
US9648100B2 (en) 2014-03-05 2017-05-09 Commvault Systems, Inc. Cross-system storage management for transferring data across autonomous information management systems
US9633026B2 (en) 2014-03-13 2017-04-25 Commvault Systems, Inc. Systems and methods for protecting email data
US9633056B2 (en) 2014-03-17 2017-04-25 Commvault Systems, Inc. Maintaining a deduplication database
US9811427B2 (en) 2014-04-02 2017-11-07 Commvault Systems, Inc. Information management by a media agent in the absence of communications with a storage manager
US10013314B2 (en) 2014-04-02 2018-07-03 Commvault Systems, Inc. Information management by a media agent in the absence of communications with a storage manager
US9563518B2 (en) 2014-04-02 2017-02-07 Commvault Systems, Inc. Information management by a media agent in the absence of communications with a storage manager
US9823978B2 (en) 2014-04-16 2017-11-21 Commvault Systems, Inc. User-level quota management of data objects stored in information management systems
US9740574B2 (en) 2014-05-09 2017-08-22 Commvault Systems, Inc. Load balancing across multiple data paths
US9740578B2 (en) 2014-07-16 2017-08-22 Commvault Systems, Inc. Creating customized bootable image for client computing device from backup copy
US10031917B2 (en) 2014-07-29 2018-07-24 Commvault Systems, Inc. Efficient volume-level replication of data via snapshots in an information management system
US9893942B2 (en) 2014-07-29 2018-02-13 Commvault Systems, Inc. Customized deployment in information management systems
US9641388B2 (en) 2014-07-29 2017-05-02 Commvault Systems, Inc. Customized deployment in information management systems
US9852026B2 (en) 2014-08-06 2017-12-26 Commvault Systems, Inc. Efficient application recovery in an information management system based on a pseudo-storage-device driver
US9727491B2 (en) 2014-09-17 2017-08-08 Commvault Systems, Inc. Token-based encryption determination process
US9405928B2 (en) 2014-09-17 2016-08-02 Commvault Systems, Inc. Deriving encryption rules based on file content
US9720849B2 (en) 2014-09-17 2017-08-01 Commvault Systems, Inc. Token-based encryption rule generation process
US9984006B2 (en) 2014-09-17 2018-05-29 Commvault Systems, Inc. Data storage systems and methods
US9928001B2 (en) 2014-09-22 2018-03-27 Commvault Systems, Inc. Efficiently restoring execution of a backed up virtual machine based on coordination with virtual-machine-file-relocation operations
US9710465B2 (en) 2014-09-22 2017-07-18 Commvault Systems, Inc. Efficiently restoring execution of a backed up virtual machine based on coordination with virtual-machine-file-relocation operations
US9996534B2 (en) 2014-09-22 2018-06-12 Commvault Systems, Inc. Efficiently restoring execution of a backed up virtual machine based on coordination with virtual-machine-file-relocation operations
US10048889B2 (en) 2014-09-22 2018-08-14 Commvault Systems, Inc. Efficient live-mount of a backed up virtual machine in a storage management system
US10204010B2 (en) 2014-10-03 2019-02-12 Commvault Systems, Inc. Intelligent protection of off-line mail data
US10073650B2 (en) 2014-10-21 2018-09-11 Commvault Systems, Inc. Using an enhanced data agent to restore backed up data across autonomous storage management systems
US9645762B2 (en) 2014-10-21 2017-05-09 Commvault Systems, Inc. Using an enhanced data agent to restore backed up data across autonomous storage management systems
US9444811B2 (en) 2014-10-21 2016-09-13 Commvault Systems, Inc. Using an enhanced data agent to restore backed up data across autonomous storage management systems
US9575673B2 (en) 2014-10-29 2017-02-21 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US9934238B2 (en) 2014-10-29 2018-04-03 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US9848046B2 (en) 2014-11-13 2017-12-19 Commvault Systems, Inc. Archiving applications in information management systems
US9912625B2 (en) 2014-11-18 2018-03-06 Commvault Systems, Inc. Storage and management of mail attachments
US9996287B2 (en) 2014-11-20 2018-06-12 Commvault Systems, Inc. Virtual machine change block tracking
US9823977B2 (en) 2014-11-20 2017-11-21 Commvault Systems, Inc. Virtual machine change block tracking
US9983936B2 (en) 2014-11-20 2018-05-29 Commvault Systems, Inc. Virtual machine change block tracking
US9632713B2 (en) 2014-12-03 2017-04-25 Commvault Systems, Inc. Secondary storage editor
US10168924B2 (en) 2014-12-03 2019-01-01 Commvault Systems, Inc. Secondary storage editor
US9645891B2 (en) 2014-12-04 2017-05-09 Commvault Systems, Inc. Opportunistic execution of secondary copy operations
US9753816B2 (en) 2014-12-05 2017-09-05 Commvault Systems, Inc. Synchronization based on filtered browsing
US10019172B2 (en) * 2015-01-15 2018-07-10 Commvault Systems, Inc. Hybrid drive caching in a backup system with SSD deletion management
US20170097770A1 (en) * 2015-01-15 2017-04-06 Commvault Systems, Inc. Intelligent hybrid drive caching
US9588849B2 (en) 2015-01-20 2017-03-07 Commvault Systems, Inc. Synchronizing selected portions of data in a storage management system
US9952934B2 (en) 2015-01-20 2018-04-24 Commvault Systems, Inc. Synchronizing selected portions of data in a storage management system
US10126977B2 (en) 2015-01-20 2018-11-13 Commvault Systems, Inc. Synchronizing selected portions of data in a storage management system
US9928005B2 (en) 2015-01-20 2018-03-27 Commvault Systems, Inc. Synchronizing selected portions of data in a storage management system
US10108687B2 (en) 2015-01-21 2018-10-23 Commvault Systems, Inc. Database protection using block-level mapping
US10223212B2 (en) 2015-01-21 2019-03-05 Commvault Systems, Inc. Restoring archived object-level database data
US10191819B2 (en) 2015-01-21 2019-01-29 Commvault Systems, Inc. Database protection using block-level mapping
US10223211B2 (en) 2015-01-21 2019-03-05 Commvault Systems, Inc. Object-level database restore
US10210051B2 (en) 2015-01-21 2019-02-19 Commvault Systems, Inc. Cross-application database restore
US9720736B2 (en) 2015-03-27 2017-08-01 Commvault Systems, Inc. Job management and resource allocation in a data protection system
US9575804B2 (en) 2015-03-27 2017-02-21 Commvault Systems, Inc. Job management and resource allocation
US10025632B2 (en) 2015-03-27 2018-07-17 Commvault Systems, Inc. Job management and resource allocation in a data protection system
US9928144B2 (en) 2015-03-30 2018-03-27 Commvault Systems, Inc. Storage management of data using an open-archive architecture, including streamlined access to primary data originally stored on network-attached storage and archived to secondary storage
US9934265B2 (en) 2015-04-09 2018-04-03 Commvault Systems, Inc. Management of log data
US10296613B2 (en) 2015-04-09 2019-05-21 Commvault Systems, Inc. Management of log data
US10303550B2 (en) 2015-04-21 2019-05-28 Commvault Systems, Inc. Content-independent and database management system-independent synthetic full backup of a database based on snapshot technology
US9904598B2 (en) 2015-04-21 2018-02-27 Commvault Systems, Inc. Content-independent and database management system-independent synthetic full backup of a database based on snapshot technology
US9870164B2 (en) 2015-05-14 2018-01-16 Commvault Systems, Inc. Restore of secondary data using thread pooling
US9639286B2 (en) 2015-05-14 2017-05-02 Commvault Systems, Inc. Restore of secondary data using thread pooling
US9563514B2 (en) 2015-06-19 2017-02-07 Commvault Systems, Inc. Assignment of proxies for virtual-machine secondary copy operations including streaming backup jobs
US10148780B2 (en) 2015-06-19 2018-12-04 Commvault Systems, Inc. Assignment of data agent proxies for executing virtual-machine secondary copy operations including streaming backup jobs
US10298710B2 (en) 2015-06-19 2019-05-21 Commvault Systems, Inc. Assigning data agent proxies for executing virtual-machine secondary copy operations including streaming backup jobs
US10169067B2 (en) 2015-06-19 2019-01-01 Commvault Systems, Inc. Assignment of proxies for virtual-machine secondary copy operations including streaming backup job
US10084873B2 (en) 2015-06-19 2018-09-25 Commvault Systems, Inc. Assignment of data agent proxies for executing virtual-machine secondary copy operations including streaming backup jobs
US9766825B2 (en) 2015-07-22 2017-09-19 Commvault Systems, Inc. Browse and restore for block-level backups
US10168929B2 (en) 2015-07-22 2019-01-01 Commvault Systems, Inc. Browse and restore for block-level backups
US10192065B2 (en) 2015-08-31 2019-01-29 Commvault Systems, Inc. Automated intelligent provisioning of data storage resources in response to user requests in a data storage management system
US10101913B2 (en) 2015-09-02 2018-10-16 Commvault Systems, Inc. Migrating data to disk without interrupting running backup operations
US10102192B2 (en) 2015-11-03 2018-10-16 Commvault Systems, Inc. Summarization and processing of email on a client computing device based on content contribution to an email thread using weighting techniques
US10228962B2 (en) 2015-12-09 2019-03-12 Commvault Systems, Inc. Live synchronization and management of virtual machines across computing and virtualization platforms and using live synchronization to support disaster recovery
US10255143B2 (en) 2015-12-30 2019-04-09 Commvault Systems, Inc. Deduplication replication in a distributed deduplication data storage system
US10061663B2 (en) 2015-12-30 2018-08-28 Commvault Systems, Inc. Rebuilding deduplication data in a distributed deduplication data storage system
US10313196B2 (en) 2016-02-26 2019-06-04 Commvault Systems, Inc. Automated grouping of computing devices in a networked data storage system
US10296368B2 (en) 2016-03-09 2019-05-21 Commvault Systems, Inc. Hypervisor-independent block-level live browse for access to backed up virtual machine (VM) data and hypervisor-free file-level recovery (block-level pseudo-mount)
US10303557B2 (en) 2016-03-09 2019-05-28 Commvault Systems, Inc. Data transfer to a distributed storage environment
US10310953B2 (en) 2016-10-20 2019-06-04 Commvault Systems, Inc. System for redirecting requests after a secondary storage computing device failure
US10152251B2 (en) 2016-10-25 2018-12-11 Commvault Systems, Inc. Targeted backup of virtual machine
US10210048B2 (en) 2016-10-25 2019-02-19 Commvault Systems, Inc. Selective snapshot and backup copy operations for individual virtual machines in a shared storage
US10162528B2 (en) 2016-10-25 2018-12-25 Commvault Systems, Inc. Targeted snapshot based on virtual machine location
US10318542B2 (en) 2016-12-13 2019-06-11 Commvault Systems, Inc. Information management of mobile device data
US10313442B2 (en) 2017-03-30 2019-06-04 Commvault Systems, Inc. System and method for client policy assignment in a data storage system
US10310950B2 (en) 2017-08-21 2019-06-04 Commvault Systems, Inc. Load balancing across multiple data paths
US10318157B2 (en) 2018-08-10 2019-06-11 Commvault Systems, Inc. Migrating data to disk without interrupting running operations

Also Published As

Publication number Publication date
US20120150814A1 (en) 2012-06-14
US10191816B2 (en) 2019-01-29
US9104623B2 (en) 2015-08-11
US8954446B2 (en) 2015-02-10
US20120150949A1 (en) 2012-06-14
US20120150817A1 (en) 2012-06-14
US9116850B2 (en) 2015-08-25
US20150205681A1 (en) 2015-07-23

Similar Documents

Publication Title
DK2329377T3 (en) Use of a snapshot, as the data source
US8335776B2 (en) Distributed indexing system for data storage
US8024294B2 (en) Systems and methods for performing replication copy storage operations
CA2783370C (en) Systems and methods for performing data management operations using snapshots
US9495404B2 (en) Systems and methods to process block-level backup for selective file restoration for virtual machines
CA2632935C (en) Systems and methods for performing data replication
US9003374B2 (en) Systems and methods for continuous data replication
US9342537B2 (en) Integrated snapshot interface for a data storage system
US9298715B2 (en) Data storage system utilizing proxy device for storage operations
US9639563B2 (en) Archiving data objects using secondary copies
US9058117B2 (en) Block-level single instancing
US9495251B2 (en) Snapshot readiness checking and reporting
US8121983B2 (en) Systems and methods for monitoring application data in a data replication system
US9495382B2 (en) Systems and methods for performing discrete data replication
US7962455B2 (en) Pathname translation in a data replication system
US8285684B2 (en) Systems and methods for performing data replication
US9996287B2 (en) Virtual machine change block tracking
US7617253B2 (en) Destination systems and methods for performing data replication
US7962709B2 (en) Network redirector systems and methods for performing data replication
US9430491B2 (en) Request-based data synchronization management
US9965316B2 (en) Archiving virtual machines in a data storage system
US8095756B1 (en) System and method for coordinating deduplication operations and backup operations of a storage volume
US9639426B2 (en) Single snapshot for multiple applications
US10204010B2 (en) Intelligent protection of off-line mail data
US9632874B2 (en) Database application backup in single snapshot for multiple applications

Legal Events

Date Code Title Description
AS Assignment
Owner name: COMMVAULT SYSTEMS, INC., NEW JERSEY
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RETNAMMA, MANOJ KUMAR VIJAYAN;ATTARDE, DEEPAK RAGHUNATH;JOSHI, HETALKUMAR N.;REEL/FRAME:027698/0219
Effective date: 20120117

AS Assignment
Owner name: BANK OF AMERICA, N.A., AS ADMINISTRATIVE AGENT, NO
Free format text: SECURITY INTEREST;ASSIGNOR:COMMVAULT SYSTEMS, INC.;REEL/FRAME:033266/0678
Effective date: 20140630

STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION