US11782795B2 - Source versus target metadata-based data integrity checking - Google Patents

Source versus target metadata-based data integrity checking

Publication number
US11782795B2
Authority
US
United States
Prior art keywords
asset
backup
data
target
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US17/384,975
Other versions
US20220398165A1 (en)
Inventor
Savitha Susan Bijoy
Gururaj Kulkarni
Mahesh Kamath
Kiran Kumar Malle Gowda
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EMC Corp
Original Assignee
EMC IP Holding Co LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by EMC IP Holding Co LLC
Assigned to EMC IP Holding Company LLC reassignment EMC IP Holding Company LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BIJOY, SAVITHA SUSAN, GOWDA, KIRAN KUMAR MALLE, KAMATH, MAHESH, KULKARNI, GURURAJ
Assigned to CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH reassignment CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH SECURITY AGREEMENT Assignors: DELL PRODUCTS, L.P., EMC IP Holding Company LLC
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DELL PRODUCTS L.P., EMC IP Holding Company LLC
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DELL PRODUCTS L.P., EMC IP Holding Company LLC
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DELL PRODUCTS L.P., EMC IP Holding Company LLC
Assigned to DELL PRODUCTS L.P., EMC IP Holding Company LLC reassignment DELL PRODUCTS L.P. RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (057758/0286) Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Assigned to DELL PRODUCTS L.P., EMC IP Holding Company LLC reassignment DELL PRODUCTS L.P. RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (057931/0392) Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Assigned to DELL PRODUCTS L.P., EMC IP Holding Company LLC reassignment DELL PRODUCTS L.P. RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (058014/0560) Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Publication of US20220398165A1
Publication of US11782795B2
Application granted
Active
Adjusted expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/64Protecting data integrity, e.g. using checksums, certificates or signatures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/12Applying verification of the received information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0894Escrow, recovery or storing of secret information, e.g. secret key escrow or cryptographic key storage
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/84Using snapshots, i.e. a logical point-in-time copy of the data

Definitions

  • the invention in general, in one aspect, relates to a method for data protection.
  • the method includes providing, to an asset source, instructions for initiating a backup operation targeting an asset on the asset source, receiving, from the asset source and in response to the instructions, source backup information pertinent to the asset, receiving, from a backup target, target backup information pertinent to an asset backup associated with the asset, making a determination that the source backup information matches the target backup information, and instructing, based on the determination, the backup target to commit the asset backup.
  • the invention relates to a non-transitory computer readable medium (CRM).
  • the non-transitory CRM includes computer readable program code, which when executed by a computer processor, enables the computer processor to perform a method for data protection.
  • the method includes providing, to an asset source, instructions for initiating a backup operation targeting an asset on the asset source, receiving, from the asset source and in response to the instructions, source backup information pertinent to the asset, receiving, from a backup target, target backup information pertinent to an asset backup associated with the asset, making a determination that the source backup information matches the target backup information, and instructing, based on the determination, the backup target to commit the asset backup.
  • FIG. 1 shows a system in accordance with one or more embodiments of the invention.
  • FIG. 2 shows a flowchart describing a method for source versus target metadata-based data integrity checking in accordance with one or more embodiments of the invention.
  • FIG. 3 shows an exemplary computing system in accordance with one or more embodiments of the invention.
  • any component described with regard to a figure, in various embodiments of the invention, may be equivalent to one or more like-named components described with regard to any other figure.
  • descriptions of these components will not be repeated with regard to each figure.
  • each and every embodiment of the components of each figure is incorporated by reference and assumed to be optionally present within every other figure having one or more like-named components.
  • any description of the components of a figure is to be interpreted as an optional embodiment which may be implemented in addition to, in conjunction with, or in place of the embodiments described with regard to a corresponding like-named component in any other figure.
  • ordinal numbers (e.g., first, second, third, etc.) may be used as an adjective for an element (i.e., any noun in the application).
  • the use of ordinal numbers is not to necessarily imply or create any particular ordering of the elements nor to limit any element to being only a single element unless expressly disclosed, such as by the use of the terms “before”, “after”, “single”, and other such terminology. Rather, the use of ordinal numbers is to distinguish between the elements.
  • a first element is distinct from a second element, and a first element may encompass more than one element and succeed (or precede) the second element in an ordering of elements.
  • embodiments of the invention relate to a method and system for source versus target metadata-based data integrity checking.
  • said given data may be subjected to corruption detection at the source prior to initiating a backup operation; however, said given data may not be checked for data integrity following its transfer to a target storage medium and prior to being committed thereto. That is, at least presently, the prospect of data corruption compromising given data during the time window through which the given data journeys, usually via a network, from its source to a target storage medium, is often overlooked.
  • Embodiments of the invention, accordingly, propose a scheme directed to detecting corruption amongst data transferred from a source to a target storage medium, and handling said data given the determined integrity of said data.
  • FIG. 1 shows a system in accordance with one or more embodiments of the invention.
  • the system ( 100 ) may include an asset source ( 102 ), a backup target ( 110 ), a data protection manager ( 114 ), and an admin device ( 118 ). Each of these system ( 100 ) components is described below.
  • the asset source ( 102 ) may represent any physical appliance or computing system designed and configured to receive, generate, process, store, and/or transmit data, as well as to provide an environment in which one or more computer programs may execute thereon.
  • the computer program(s) may, for example, implement large-scale and complex data processing; or implement one or more services offered locally or over a network.
  • the asset source ( 102 ) may include and allocate various resources (e.g., computer processors, memory, storage, virtualization, network bandwidth, etc.), as needed, to the computer program(s) and the workloads instantiated thereby.
  • asset source ( 102 ) may perform other functionalities without departing from the scope of the invention.
  • examples of the asset source ( 102 ) may include, but are not limited to, a desktop computer, a laptop computer, a server, a mainframe, or any other computing system similar to the exemplary computing system shown in FIG. 3 .
  • the asset source ( 102 ) may include one or more assets ( 104 A- 104 N), a backup and recovery agent ( 106 ), and a source corruption detection agent ( 108 ). Each of these asset source ( 102 ) subcomponents is described below.
  • an asset ( 104 A- 104 N) may refer to a database, or any logical container to and from which data (and/or metadata thereof), which has been received by or generated on the asset source ( 102 ), may be stored and retrieved, respectively.
  • An asset ( 104 A- 104 N) may occupy any portion of persistent storage (not shown) available on the asset source ( 102 ). Examples of persistent storage may include, but are not limited to, optical storage, magnetic storage, NAND Flash Memory, NOR Flash Memory, Magnetic Random Access Memory (M-RAM), Spin Torque Magnetic RAM (ST-MRAM), Phase Change Memory (PCM), or any other storage defined as non-volatile Storage Class Memory (SCM).
  • the backup and recovery agent ( 106 ) may refer to a computer program that may execute on the underlying hardware of the asset source ( 102 ), which may be responsible for facilitating backup and recovery operations targeting one or more assets ( 104 A- 104 N) on the asset source ( 102 ). To that extent, the backup and recovery agent ( 106 ) may protect one or more assets ( 104 A- 104 N) against data loss (i.e., backup the targeted data and/or metadata); and reconstruct one or more assets ( 104 A- 104 N) following such data loss (i.e., recover the targeted data and/or metadata). Further, one of ordinary skill will appreciate that the backup and recovery agent ( 106 ) may perform other functionalities without departing from the scope of the invention.
  • the source corruption detection agent ( 108 ) may refer to a computer program that may execute on the underlying hardware of the asset source ( 102 ), which may be responsible for metadata submission therefrom. More specifically, the source corruption detection agent ( 108 ) may include functionality to: upon receipt of instructions from the data protection manager ( 114 ) to initiate a backup operation at the asset source ( 102 ) for a given asset ( 104 A- 104 N), obtain metadata (or a cryptographic hash thereof) descriptive of the data belonging to the given asset ( 104 A- 104 N); and submit the obtained metadata (or the cryptographic hash thereof) to the data protection manager ( 114 ) prior to initiating the backup operation per the received instructions.
  • the source corruption detection agent ( 108 ) may perform other functionalities without departing from the scope of the invention.
  • the backup target ( 110 ) may represent any data backup, archiving, and/or disaster recovery storage system.
  • the backup target ( 110 ) may be implemented using one or more storage servers (or computing systems similar to the exemplary computing system shown in FIG. 3 ) (not shown)—each of which may house one or many storage devices for storing data.
  • the backup target ( 110 ) may, at least in part, include persistent storage. Examples of persistent storage may include, but are not limited to, optical storage, magnetic storage, NAND Flash Memory, NOR Flash Memory, Magnetic Random Access Memory (M-RAM), Spin Torque Magnetic RAM (ST-MRAM), Phase Change Memory (PCM), or any other storage defined as non-volatile Storage Class Memory (SCM).
  • the backup target ( 110 ) may include a target corruption detection agent ( 112 ), which is described below.
  • the target corruption detection agent ( 112 ) may refer to a computer program that may execute on the underlying hardware of the backup target ( 110 ), which may be responsible for metadata submission therefrom. More specifically, the target corruption detection agent ( 112 ) may include functionality to: receive data belonging to a given asset ( 104 A- 104 N) from the asset source ( 102 ) during a backup operation targeting the given asset ( 104 A- 104 N); upon completing the transfer of the data from the asset source ( 102 ) to the backup target ( 110 ), notify the data protection manager ( 114 ) of the completion of said data transfer; receive a request thereafter, from the data protection manager ( 114 ), to provide metadata (or a cryptographic hash thereof) descriptive of the transferred data; obtain the requested metadata (or the requested cryptographic hash thereof) for the transferred data; submit, in response to the received request, the obtained metadata (or cryptographic hash thereof) to the data protection manager ( 114 ); and receive, thereafter in return, instructions from the data protection manager (
  • the data protection manager ( 114 ) may represent information technology (IT) infrastructure configured for data integrity check management.
  • the data protection manager ( 114 ) may include functionality to perform the method outlined and described through FIG. 2 , below.
  • the data protection manager ( 114 ) may perform other functionalities without departing from the scope of the invention.
  • the data protection manager ( 114 ) may include and employ an integrity verifier ( 116 ) to, at least in part, fulfill the aforementioned functionality, which is described below.
  • the integrity verifier ( 116 ) may refer to a computer program that may execute on the underlying hardware of the data protection manager ( 114 ), which may be responsible for determining whether metadata (or a cryptographic hash thereof) submitted from the asset source ( 102 ), and metadata (or a cryptographic hash thereof) submitted from the backup target ( 110 ), match or mismatch.
  • on a match, the integrity verifier ( 116 ) may deduce that the data being targeted for backup, a copy of which has now been transferred to the backup target ( 110 ), is corruption-free and, accordingly, may instruct the backup target ( 110 ) to commit the aforementioned copy of the data into storage.
  • on a mismatch, the integrity verifier ( 116 ) may alternatively deduce that the data being targeted for backup, a copy of which has now been transferred to the backup target ( 110 ), is corrupted and, accordingly, may alternatively instruct the backup target ( 110 ) to discard the aforementioned copy of the data it has obtained.
  • the integrity verifier ( 116 ) may perform other functionalities without departing from the scope of the invention.
  • the admin device ( 118 ) may represent any physical appliance or computing system operated by one or more administrators of the system ( 100 ).
  • An administrator may refer to an individual or entity who may be responsible for overseeing system ( 100 ) operations and maintenance.
  • the admin device ( 118 ) may include functionality to enable an administrator to: register the asset source ( 102 ) with the data protection manager ( 114 ); and submit protection policies, concerning one or more assets (described above) on the asset source ( 102 ), to the data protection manager ( 114 ).
  • These functionalities are described in further detail in FIG. 2 , below.
  • the admin device ( 118 ) may perform other functionalities without departing from the scope of the invention.
  • the above-mentioned system ( 100 ) components (or subcomponents thereof) may communicate with one another through a network (not shown) (e.g., a local area network (LAN), a wide area network (WAN) such as the Internet, a mobile network, any other network type, or a combination thereof).
  • the network may be implemented using any combination of wired and/or wireless connections.
  • the network may encompass various interconnected, network-enabled subcomponents (or systems) (e.g., switches, routers, etc.) that may facilitate communications between the above-mentioned system ( 100 ) components (or subcomponents thereof).
  • the above-mentioned system ( 100 ) components (or subcomponents thereof) may employ any combination of wired and/or wireless communication protocols.
  • while FIG. 1 shows a configuration of components, other system ( 100 ) configurations may be used without departing from the scope of the invention.
  • the system ( 100 ) may include more than one asset source (not shown) and/or more than one backup target (not shown).
  • FIG. 2 shows a flowchart describing a method for source versus target metadata-based data integrity checking in accordance with one or more embodiments of the invention.
  • the various steps outlined below may be performed by the data protection manager (see e.g., FIG. 1 ). Further, while the various steps in the flowchart are presented and described sequentially, one of ordinary skill will appreciate that some or all steps may be executed in different orders, may be combined or omitted, and some or all steps may be executed in parallel.
  • in Step 200, an asset source registration for an asset source (see e.g., FIG. 1 ) is received from an admin device.
  • the asset source registration may refer to connection information for the asset source.
  • Connection information may entail information necessary to connect to and/or interact with the asset source, which may include, but is not limited to: an Internet Protocol (IP) address assigned to the asset source; a network port number of the asset source through which a connection thereto may be attempted; and authentication information (e.g., authentication mode, username or login, and password) for accessing the asset source.
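The connection information listed above can be pictured as a small record. The following is a minimal Python sketch, not taken from the patent: the class name `AssetSourceRegistration`, its field names, and the `validate` helper are all hypothetical, chosen only to mirror the IP address, port, and authentication items named in the text.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AssetSourceRegistration:
    """Connection information for one asset source (all names hypothetical)."""
    ip_address: str                      # IP address assigned to the asset source
    port: int                            # network port through which to connect
    auth_mode: str                       # e.g. "basic"; the set of modes is an assumption
    username: str
    password: str = field(repr=False)    # keep the secret out of logs and reprs

    def validate(self) -> None:
        """Raise ValueError if the address or port is malformed."""
        ipaddress.ip_address(self.ip_address)   # raises ValueError on a bad address
        if not 0 < self.port < 65536:
            raise ValueError(f"port out of range: {self.port}")
```

A data protection manager built along these lines might call `validate()` before attempting discovery, so that malformed registrations are rejected before any connection is made.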
  • in Step 202, based on the asset source registration (received in Step 200 ), the asset source is discovered and agents are deployed thereto.
  • discovering the asset source may entail establishing a connection with and successfully accessing the asset source using the provided connection information.
  • agents deployed to and/or installed on the asset source may include, but are not limited to, a backup and recovery agent and a source corruption detection agent (both described above) (see e.g., FIG. 1 ).
  • in Step 204, a protection policy for one or more assets (described above) (see e.g., FIG. 1 ) on the asset source is received from the admin device.
  • the protection policy may refer to a collection of rules and/or preferences directed to the protection of asset data and/or metadata.
  • the rules and/or preferences specified in/by the protection policy may include, but are not limited to: which asset data and/or metadata is/are critical regarding data protection; which modes of backup operations (e.g., incremental, full, etc.) should be performed that target the asset data and/or metadata, and how often (or on what schedule) the backup operations should occur; to which backup target or storage medium the asset backup(s) should be committed; what retention span or recovery point objective (RPO) is assigned to the asset data and/or metadata; whether data integrity verification, in accordance with embodiments of the invention, should be applied while performing backup operations targeting the asset data and/or metadata; and in which information mode (described below) the aforementioned data integrity verification should be performed.
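As an illustration of the rules listed above, a protection policy could be modeled as a typed record from which backup instructions are derived. This is a sketch under assumptions: the names `ProtectionPolicy`, `BackupMode`, `InformationMode`, and `build_backup_instructions`, and the cron-style schedule string, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class BackupMode(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"


class InformationMode(Enum):
    NON_HASH = "non-hash"
    HASH = "hash"
    COMBINATION = "combination"


@dataclass(frozen=True)
class ProtectionPolicy:
    """One protection policy; every field name here is an assumption."""
    critical_assets: Tuple[str, ...]   # which asset data is critical
    backup_mode: BackupMode            # full, incremental, ...
    schedule: str                      # e.g. a cron-style expression
    backup_target: str                 # where asset backups are committed
    retention_days: int                # retention span
    verify_integrity: bool = True      # apply source-vs-target checking?
    information_mode: InformationMode = InformationMode.HASH


def build_backup_instructions(policy: ProtectionPolicy) -> dict:
    """Derive the instructions sent to the backup and recovery agent."""
    instructions = {
        "assets": list(policy.critical_assets),
        "mode": policy.backup_mode.value,
        "target": policy.backup_target,
    }
    # The information mode is only meaningful when verification is requested.
    if policy.verify_integrity:
        instructions["information_mode"] = policy.information_mode.value
    return instructions
```

Keeping the information mode out of the instructions when verification is disabled is a design choice of this sketch; the source only states that the instructions may specify it.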
  • in Step 206, based on the protection policy (received in Step 204 ), the backup and recovery agent (deployed to the asset source in Step 202 ) is instructed to initiate a backup operation targeting the asset(s) (or more specifically, certain data and/or metadata therein) with which the protection policy is associated.
  • the instructions may specify performing a data integrity verification of the asset data and/or metadata and, accordingly, may further specify an information mode, where the information mode references a format of a predefined collection of metadata required to perform the data integrity verification.
  • the information mode may reference a non-hash mode, where a predefined collection of metadata, descriptive of the asset data, may be sought to perform the data integrity verification.
  • the information mode may reference a hash mode, where, alternatively, a cryptographic hash of the aforementioned, predefined collection of metadata, descriptive of the asset data, may instead be sought to perform the data integrity verification.
  • the information mode may reference a combination mode, where: (a) a hash-mode based data integrity verification may be performed first; and (b) upon a successful result of (a), then a non-hash mode based data integrity verification may be performed second.
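The three information modes can be sketched as one comparison routine. The following Python is illustrative only: `BackupInformation`, `InformationMode`, and `verify` are hypothetical names, and the assumption that each side submits its metadata, its hash, or both in a single record is made for the example rather than stated in the source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InformationMode(Enum):
    NON_HASH = "non-hash"
    HASH = "hash"
    COMBINATION = "combination"


@dataclass(frozen=True)
class BackupInformation:
    """What one side (source or target) submits; either part may be absent."""
    metadata: Optional[dict] = None   # predefined collection of metadata
    digest: Optional[str] = None      # cryptographic hash of that collection


def _require(value, name):
    # Refuse to treat two missing values as a match.
    if value is None:
        raise ValueError(f"{name} missing from backup information")
    return value


def verify(mode: InformationMode, source: BackupInformation,
           target: BackupInformation) -> bool:
    """Return True when source and target information match under `mode`."""
    if mode is InformationMode.NON_HASH:
        return _require(source.metadata, "metadata") == _require(target.metadata, "metadata")
    hashes_match = _require(source.digest, "digest") == _require(target.digest, "digest")
    if mode is InformationMode.HASH:
        return hashes_match
    # Combination mode: hash check first, metadata check only if it succeeded.
    return hashes_match and (
        _require(source.metadata, "metadata") == _require(target.metadata, "metadata")
    )
```

In combination mode the `and` short-circuits, so the metadata comparison runs only after the hash comparison has succeeded, matching the (a)-then-(b) ordering described above.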
  • the predefined collection of metadata, descriptive of the given asset data may include, but is not limited to: a data type of the given asset data; a creation timestamp generated for the given asset data; a last modification timestamp generated for the given asset data; permission and/or ownership information associated with the given asset data; a format or extension of the given asset data; a size (e.g., in bytes) of the given asset data; and a checksum associated with the given asset data.
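For a file-based asset, the metadata collection above might be gathered as follows. This is a minimal sketch that assumes the asset data is a regular file on a local filesystem; the function names, the dictionary keys, and the choice of CRC-32 as the checksum are assumptions of the example, not details given in the patent.

```python
import mimetypes
import stat
import zlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> int:
    """CRC-32 of the file contents, read in chunks to bound memory use."""
    crc = 0
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def collect_metadata(path) -> dict:
    """Gather a metadata collection for one regular file (keys are hypothetical)."""
    p = Path(path)
    st = p.stat()
    return {
        "data_type": mimetypes.guess_type(p.name)[0] or "application/octet-stream",
        # st_birthtime exists only on some platforms; st_ctime is the fallback and
        # on Linux is the last metadata change rather than the creation time.
        "created": getattr(st, "st_birthtime", st.st_ctime),
        "modified": st.st_mtime,
        "permissions": stat.filemode(st.st_mode),
        "owner_uid": st.st_uid,
        "extension": p.suffix,
        "size_bytes": st.st_size,
        "checksum": file_checksum(p),
    }
```

Whether timestamps, ownership, and permissions are comparable between source and target depends on whether the transfer preserves them; an implementation would presumably restrict the comparison to fields that survive the copy, such as size and checksum.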
  • a cryptographic hash of a given information set may refer to a hash value or output that results when the given information set is fed through a cryptographic hash function.
  • a cryptographic hash function may refer to a deterministic algorithm configured to map data of any arbitrary size to a bit array of a fixed size. Accordingly, being deterministic, a cryptographic hash function always results in a same hash value/output for a given same input or information set.
  • Examples of the cryptographic hash function may include, but are not limited to: the MD5 message-digest algorithm (or MD5, for short); and one or more variants of the Secure Hash Algorithm 2 (SHA-2) (e.g., SHA-256, which produces a 256-bit hash value/output).
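The determinism property described above is what lets two independently computed hashes be compared. A short Python sketch using the standard `hashlib` module is given here; the function name `metadata_digest` and the choice of canonical JSON as the serialization are assumptions, since the source does not say how the metadata collection is encoded before hashing.

```python
import hashlib
import json


def metadata_digest(metadata: dict, algorithm: str = "sha256") -> str:
    """Hash a metadata collection after serializing it canonically.

    Sorting keys and fixing separators ensures that source and target
    produce byte-identical input for the same logical metadata.
    """
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.new(algorithm, canonical.encode("utf-8")).hexdigest()
```

Without a canonical encoding, two sides holding the same metadata in a different key order could produce different hashes, which would be reported as a false mismatch.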
  • in Step 208, source backup information is received from the asset source.
  • the source backup information may include a predefined collection of metadata descriptive of asset data from the asset(s) with which the protection policy (received in Step 204 ) is associated.
  • the source backup information may include a cryptographic hash of the aforementioned, predefined collection of metadata. Further, in either embodiment, the source backup information may pertain to a state of the asset data on the asset source just prior to the initiation of a backup operation targeting said asset data based on the instructions (provided to the asset source in Step 206 ).
  • in Step 210, a backup transferred notification is received from a backup target (described above) (see e.g., FIG. 1 ).
  • the backup transferred notification may inform the data protection manager that transfer of an asset backup (i.e., a copy of asset data of the asset(s) targeted for protection by way of the backup operation (initiated in Step 206 )) from the asset source to the backup target is complete.
  • in Step 212, an information request is submitted to the backup target.
  • the information request may include the information mode (specified by way of the protection policy received in Step 204 ). Accordingly, in one embodiment of the invention, the information mode may reference a non-hash mode. In another embodiment of the invention, the information mode may alternatively reference a hash mode.
  • in Step 214, target backup information is received from the backup target.
  • the target backup information may include a predefined collection of metadata descriptive of asset backup data pertaining to the asset backup.
  • the target backup information may include a cryptographic hash of the aforementioned, predefined collection of metadata descriptive of asset backup data pertaining to the asset backup. Further, in either embodiment, the target backup information may associate with a state of the asset backup data, which had arrived at the backup target from the asset source.
  • in Step 216, a comparison is performed between the source backup information (received in Step 208 ) and the target backup information (received in Step 214 ).
  • in Step 218, a determination is made as to whether the source backup information matches the target backup information. In one embodiment of the invention, if it is determined that the source backup information matches the target backup information, then the method proceeds to Step 220 . On the other hand, in another embodiment of the invention, if it is alternatively determined that the source backup information mismatches the target backup information, then the method alternatively proceeds to Step 222 .
  • in Step 220, following the determination (in Step 218 ) that the source backup information (received in Step 208 ) matches the target backup information (received in Step 214 ), instructions are provided to the backup target to commit the asset backup into storage. That is, in one embodiment of the invention, in having matching backup information, the asset backup that had arrived at the backup target from the asset source is found to be corruption-free and, accordingly, the asset backup may be committed.
  • in Step 222, following the alternative determination (in Step 218 ) that the source backup information (received in Step 208 ) mismatches the target backup information (received in Step 214 ), instructions are provided to the backup target to discard the asset backup. That is, in one embodiment of the invention, in having mismatching backup information, the asset backup that had arrived at the backup target from the asset source is found to be corrupt and, accordingly, the asset backup may be discarded.
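The commit-or-discard flow of FIG. 2 can be exercised end to end with in-memory stand-ins. The sketch below is a toy model under stated assumptions: `InMemorySource`, `InMemoryTarget`, `describe`, and `run_backup` are hypothetical names, the asset is a byte string, the backup information is just a size and a SHA-256 checksum, and the `transfer` argument simulates the network hop where corruption may occur.

```python
import hashlib
from typing import Callable, Optional


def describe(data: bytes) -> dict:
    """Backup information for a blob: size plus a content checksum."""
    return {"size_bytes": len(data), "checksum": hashlib.sha256(data).hexdigest()}


class InMemorySource:
    def __init__(self, data: bytes):
        self.data = data

    def backup_information(self) -> dict:
        return describe(self.data)


class InMemoryTarget:
    def __init__(self):
        self.staged: Optional[bytes] = None      # transferred, not yet committed
        self.committed: Optional[bytes] = None

    def receive(self, data: bytes) -> None:
        self.staged = data

    def backup_information(self) -> dict:
        return describe(self.staged)

    def commit(self) -> None:
        self.committed, self.staged = self.staged, None

    def discard(self) -> None:
        self.staged = None


def run_backup(source: InMemorySource, target: InMemoryTarget,
               transfer: Callable[[bytes], bytes] = lambda blob: blob) -> str:
    """Manager-side flow: collect source info, transfer, compare, then decide."""
    source_info = source.backup_information()    # captured before transfer (Step 208)
    target.receive(transfer(source.data))        # asset backup travels to the target
    target_info = target.backup_information()    # requested after the transfer completes
    if source_info == target_info:               # comparison and determination (Steps 216-218)
        target.commit()                          # Step 220
        return "committed"
    target.discard()                             # Step 222
    return "discarded"
```

Passing a `transfer` function that alters a byte shows the mismatch path: the staged copy is discarded and nothing is committed.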
  • FIG. 3 shows an exemplary computing system in accordance with one or more embodiments of the invention.
  • the computing system ( 300 ) may include one or more computer processors ( 302 ), non-persistent storage ( 304 ) (e.g., volatile memory, such as random access memory (RAM), cache memory), persistent storage ( 306 ) (e.g., a hard disk, an optical drive such as a compact disk (CD) drive or digital versatile disk (DVD) drive, a flash memory, etc.), a communication interface ( 312 ) (e.g., Bluetooth interface, infrared interface, network interface, optical interface, etc.), input devices ( 310 ), output devices ( 308 ), and numerous other elements (not shown) and functionalities. Each of these components is described below.
  • the computer processor(s) ( 302 ) may be an integrated circuit for processing instructions.
  • the computer processor(s) may be one or more cores or micro-cores of a central processing unit (CPU) and/or a graphics processing unit (GPU).
  • the computing system ( 300 ) may also include one or more input devices ( 310 ), such as a touchscreen, keyboard, mouse, microphone, touchpad, electronic pen, or any other type of input device.
  • the communication interface ( 312 ) may include an integrated circuit for connecting the computing system ( 300 ) to a network (not shown) (e.g., a local area network (LAN), a wide area network (WAN) such as the Internet, mobile network, or any other type of network) and/or to another device, such as another computing device.
  • the computing system ( 300 ) may include one or more output devices ( 308 ), such as a screen (e.g., a liquid crystal display (LCD), a plasma display, touchscreen, cathode ray tube (CRT) monitor, projector, or other display device), a printer, external storage, or any other output device.
  • One or more of the output devices may be the same as or different from the input device(s).
  • the input and output device(s) may be locally or remotely connected to the computer processor(s) ( 302 ), non-persistent storage ( 304 ), and persistent storage ( 306 ).
  • Software instructions in the form of computer readable program code to perform embodiments of the invention may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer readable medium such as a CD, DVD, storage device, a diskette, a tape, flash memory, physical memory, or any other computer readable storage medium.
  • the software instructions may correspond to computer readable program code that, when executed by a processor(s), is configured to perform one or more embodiments of the invention.

Abstract

A method and system for source versus target metadata-based data integrity checking. Concerning backup operations directed to protecting given data, said given data may be subjected to corruption detection at the source prior to initiating a backup operation; however, said given data may not be checked for data integrity following transfer of said given data to a target storage medium and prior to committing said given data thereto. That is, at least presently, the prospect of data corruption compromising given data during the time window through which the given data journeys, usually via a network, from its source to a target storage medium, is often overlooked. The disclosed method and system, accordingly, propose a scheme directed to detecting corruption amongst data transferred from a source to a target storage medium, and handling said data given the determined integrity of said data.

Description

BACKGROUND
At least presently, the prospect of data corruption compromising given data during the time window through which the given data journeys, usually via a network, from its source to a target storage medium, is often overlooked.
SUMMARY
In general, in one aspect, the invention relates to a method for data protection. The method includes providing, to an asset source, instructions for initiating a backup operation targeting an asset on the asset source, receiving, from the asset source and in response to the instructions, source backup information pertinent to the asset, receiving, from a backup target, target backup information pertinent to an asset backup associated with the asset, making a determination that the source backup information matches the target backup information, and instructing, based on the determination, the backup target to commit the asset backup.
In general, in one aspect, the invention relates to a non-transitory computer readable medium (CRM). The non-transitory CRM includes computer readable program code, which when executed by a computer processor, enables the computer processor to perform a method for data protection. The method includes providing, to an asset source, instructions for initiating a backup operation targeting an asset on the asset source, receiving, from the asset source and in response to the instructions, source backup information pertinent to the asset, receiving, from a backup target, target backup information pertinent to an asset backup associated with the asset, making a determination that the source backup information matches the target backup information, and instructing, based on the determination, the backup target to commit the asset backup.
Other aspects of the invention will be apparent from the following description and the appended claims.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 shows a system in accordance with one or more embodiments of the invention.
FIG. 2 shows a flowchart describing a method for source versus target metadata-based data integrity checking in accordance with one or more embodiments of the invention.
FIG. 3 shows an exemplary computing system in accordance with one or more embodiments of the invention.
DETAILED DESCRIPTION
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. In the following detailed description of the embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
In the following description of FIGS. 1-3 , any component described with regard to a figure, in various embodiments of the invention, may be equivalent to one or more like-named components described with regard to any other figure. For brevity, descriptions of these components will not be repeated with regard to each figure. Thus, each and every embodiment of the components of each figure is incorporated by reference and assumed to be optionally present within every other figure having one or more like-named components. Additionally, in accordance with various embodiments of the invention, any description of the components of a figure is to be interpreted as an optional embodiment which may be implemented in addition to, in conjunction with, or in place of the embodiments described with regard to a corresponding like-named component in any other figure.
Throughout the application, ordinal numbers (e.g., first, second, third, etc.) may be used as an adjective for an element (i.e., any noun in the application). The use of ordinal numbers is not to necessarily imply or create any particular ordering of the elements nor to limit any element to being only a single element unless expressly disclosed, such as by the use of the terms “before”, “after”, “single”, and other such terminology. Rather, the use of ordinal numbers is to distinguish between the elements. By way of an example, a first element is distinct from a second element, and a first element may encompass more than one element and succeed (or precede) the second element in an ordering of elements.
In general, embodiments of the invention relate to a method and system for source versus target metadata-based data integrity checking. Concerning backup operations directed to protecting given data, said given data may be subjected to corruption detection at the source prior to initiating a backup operation; however, said given data may not be checked for data integrity following transfer of said given data to a target storage medium and prior to committing said given data thereto. That is, at least presently, the prospect of data corruption compromising given data during the time window through which the given data journeys, usually via a network, from its source to a target storage medium, is often overlooked. Embodiments of the invention, accordingly, propose a scheme directed to detecting corruption amongst data transferred from a source to a target storage medium, and handling said data given the determined integrity of said data.
FIG. 1 shows a system in accordance with one or more embodiments of the invention. The system (100) may include an asset source (102), a backup target (110), a data protection manager (114), and an admin device (118). Each of these system (100) components is described below.
In one embodiment of the invention, the asset source (102) may represent any physical appliance or computing system designed and configured to receive, generate, process, store, and/or transmit data, as well as to provide an environment in which one or more computer programs may execute thereon. The computer program(s) may, for example, implement large-scale and complex data processing; or implement one or more services offered locally or over a network. Further, in providing an execution environment for any computer program(s) installed thereon, the asset source (102) may include and allocate various resources (e.g., computer processors, memory, storage, virtualization, network bandwidth, etc.), as needed, to the computer program(s) and the workloads instantiated thereby. One of ordinary skill will appreciate that the asset source (102) may perform other functionalities without departing from the scope of the invention. Examples of the asset source (102) may include, but are not limited to, a desktop computer, a laptop computer, a server, a mainframe, or any other computing system similar to the exemplary computing system shown in FIG. 3 . Moreover, the asset source (102) may include one or more assets (104A-104N), a backup and recovery agent (106), and a source corruption detection agent (108). Each of these asset source (102) subcomponents is described below.
In one embodiment of the invention, an asset (104A-104N) may refer to a database, or any logical container to and from which data (and/or metadata thereof), which has been received by or generated on the asset source (102), may be stored and retrieved, respectively. An asset (104A-104N) may occupy any portion of persistent storage (not shown) available on the asset source (102). Examples of persistent storage may include, but are not limited to, optical storage, magnetic storage, NAND Flash Memory, NOR Flash Memory, Magnetic Random Access Memory (M-RAM), Spin Torque Magnetic RAM (ST-MRAM), Phase Change Memory (PCM), or any other storage defined as non-volatile Storage Class Memory (SCM).
In one embodiment of the invention, the backup and recovery agent (106) may refer to a computer program that may execute on the underlying hardware of the asset source (102), which may be responsible for facilitating backup and recovery operations targeting one or more assets (104A-104N) on the asset source (102). To that extent, the backup and recovery agent (106) may protect one or more assets (104A-104N) against data loss (i.e., back up the targeted data and/or metadata); and reconstruct one or more assets (104A-104N) following such data loss (i.e., recover the targeted data and/or metadata). Further, one of ordinary skill will appreciate that the backup and recovery agent (106) may perform other functionalities without departing from the scope of the invention.
In one embodiment of the invention, the source corruption detection agent (108) may refer to a computer program that may execute on the underlying hardware of the asset source (102), which may be responsible for metadata submission therefrom. More specifically, the source corruption detection agent (108) may include functionality to: upon receipt of instructions from the data protection manager (114) to initiate a backup operation at the asset source (102) for a given asset (104A-104N), obtain metadata (or a cryptographic hash thereof) descriptive of the data belonging to the given asset (104A-104N); and submit the obtained metadata (or the cryptographic hash thereof) to the data protection manager (114) prior to initiating the backup operation per the received instructions. One of ordinary skill, however, will appreciate that the source corruption detection agent (108) may perform other functionalities without departing from the scope of the invention.
In one embodiment of the invention, the backup target (110) may represent any data backup, archiving, and/or disaster recovery storage system. The backup target (110) may be implemented using one or more storage servers (or computing systems similar to the exemplary computing system shown in FIG. 3 ) (not shown)—each of which may house one or many storage devices for storing data. Further, the backup target (110) may, at least in part, include persistent storage. Examples of persistent storage may include, but are not limited to, optical storage, magnetic storage, NAND Flash Memory, NOR Flash Memory, Magnetic Random Access Memory (M-RAM), Spin Torque Magnetic RAM (ST-MRAM), Phase Change Memory (PCM), or any other storage defined as non-volatile Storage Class Memory (SCM). Moreover, the backup target (110) may include a target corruption detection agent (112), which is described below.
In one embodiment of the invention, the target corruption detection agent (112) may refer to a computer program that may execute on the underlying hardware of the backup target (110), which may be responsible for metadata submission therefrom. More specifically, the target corruption detection agent (112) may include functionality to: receive data belonging to a given asset (104A-104N) from the asset source (102) during a backup operation targeting the given asset (104A-104N); upon completing the transfer of the data from the asset source (102) to the backup target (110), notify the data protection manager (114) of the completion of said data transfer; receive a request thereafter, from the data protection manager (114), to provide metadata (or a cryptographic hash thereof) descriptive of the transferred data; obtain the requested metadata (or the requested cryptographic hash thereof) for the transferred data; submit, in response to the received request, the obtained metadata (or cryptographic hash thereof) to the data protection manager (114); and receive, thereafter in return, instructions from the data protection manager (114) to either commit the transferred data into storage on the backup target (110) or discard the transferred data, based on the submitted metadata (or cryptographic hash thereof). One of ordinary skill, however, will appreciate that the target corruption detection agent (112) may perform other functionalities without departing from the scope of the invention.
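By way of a non-limiting illustration, the receive, notify, describe, and commit-or-discard responsibilities of the target corruption detection agent might be sketched as follows. The class name `TargetAgent`, its method names, the in-memory staging dictionaries, and the reduction of the metadata to a size and a checksum are all assumptions made for this sketch and are not taken from the disclosure.

```python
import hashlib
import json


class TargetAgent:
    """Hypothetical in-memory stand-in for the target corruption detection agent."""

    def __init__(self):
        self.staged = {}     # transferred backups awaiting a decision
        self.committed = {}  # backups committed into storage

    def receive(self, backup_id: str, data: bytes) -> str:
        # Hold the transferred data in staging, then report completion.
        self.staged[backup_id] = data
        return "backup-transferred"

    def describe(self, backup_id: str, hashed: bool):
        # Return the metadata itself, or a SHA-256 hash of its canonical form.
        data = self.staged[backup_id]
        metadata = {"size": len(data), "checksum": hashlib.sha256(data).hexdigest()}
        if not hashed:
            return metadata
        canonical = json.dumps(metadata, sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()

    def apply(self, backup_id: str, instruction: str) -> None:
        # Act on the instruction returned by the data protection manager.
        if instruction not in ("commit", "discard"):
            raise ValueError(f"unknown instruction: {instruction}")
        data = self.staged.pop(backup_id)
        if instruction == "commit":
            self.committed[backup_id] = data
```

In this sketch a discarded backup is simply dropped from staging; an actual backup target may handle discarded data differently.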
In one embodiment of the invention, the data protection manager (114) may represent information technology (IT) infrastructure configured for data integrity check management. To that extent, the data protection manager (114) may include functionality to perform the method outlined and described through FIG. 2 , below. One of ordinary skill, however, will appreciate that the data protection manager (114) may perform other functionalities without departing from the scope of the invention. Furthermore, the data protection manager (114) may include and employ an integrity verifier (116) to, at least in part, fulfill the aforementioned functionality, which is described below.
In one embodiment of the invention, the integrity verifier (116) may refer to a computer program that may execute on the underlying hardware of the data protection manager (114), which may be responsible for determining whether metadata (or a cryptographic hash thereof) submitted from the asset source (102), and metadata (or a cryptographic hash thereof) submitted from the backup target (110), match or mismatch. As a result of the former, the integrity verifier (116) may deduce that the data being targeted for backup, a copy of which has now been transferred to the backup target (110), is corruption-free and, accordingly, may instruct the backup target (110) to commit the aforementioned copy of the data into storage. On the other hand, as a result of the latter, the integrity verifier (116) may alternatively deduce that the data being targeted for backup, a copy of which has now been transferred to the backup target (110), is corrupted and, accordingly, may alternatively instruct the backup target (110) to discard the aforementioned copy of the data it has obtained. One of ordinary skill will appreciate that the integrity verifier (116) may perform other functionalities without departing from the scope of the invention.
In one embodiment of the invention, the admin device (118) may represent any physical appliance or computing system operated by one or more administrators of the system (100). An administrator may refer to an individual or entity who may be responsible for overseeing system (100) operations and maintenance. To that extent, and at least as it pertains to embodiments of the invention, the admin device (118) may include functionality to enable an administrator to: register the asset source (102) with the data protection manager (114); and submit protection policies, concerning one or more assets (described above) on the asset source (102), to the data protection manager (114). These functionalities are described in further detail in FIG. 2 , below. Further, one of ordinary skill will appreciate that the admin device (118) may perform other functionalities without departing from the scope of the invention.
In one embodiment of the invention, the above-mentioned system (100) components (or subcomponents thereof) may communicate with one another through a network (not shown) (e.g., a local area network (LAN), a wide area network (WAN) such as the Internet, a mobile network, any other network type, or a combination thereof). The network may be implemented using any combination of wired and/or wireless connections. Further, the network may encompass various interconnected, network-enabled subcomponents (or systems) (e.g., switches, routers, etc.) that may facilitate communications between the above-mentioned system (100) components (or subcomponents thereof). Moreover, in communicating with one another, the above-mentioned system (100) components (or subcomponents thereof) may employ any combination of wired and/or wireless communication protocols.
While FIG. 1 shows a configuration of components, other system (100) configurations may be used without departing from the scope of the invention. For example, the system (100) may include more than one asset source (not shown) and/or more than one backup target (not shown).
FIG. 2 shows a flowchart describing a method for source versus target metadata-based data integrity checking in accordance with one or more embodiments of the invention. The various steps outlined below may be performed by the data protection manager (see e.g., FIG. 1 ). Further, while the various steps in the flowchart are presented and described sequentially, one of ordinary skill will appreciate that some or all steps may be executed in different orders, may be combined or omitted, and some or all steps may be executed in parallel.
Turning to FIG. 2 , in Step 200, an asset source registration, for an asset source (see e.g., FIG. 1 ), is received from an admin device. In one embodiment of the invention, the asset source registration may refer to connection information for the asset source. Connection information may entail information necessary to connect to and/or interact with the asset source, which may include, but is not limited to: an Internet Protocol (IP) address assigned to the asset source; a network port number of the asset source through which a connection thereto may be attempted; and authentication information (e.g., authentication mode, username or login, and password) for accessing the asset source.
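For illustration only, the connection information carried by an asset source registration might be represented as a small record. The class name `AssetSourceRegistration`, its field names, and the sample values (an address from the documentation range 192.0.2.0/24 and a placeholder password) are hypothetical and are not specified by the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetSourceRegistration:
    """Hypothetical record of the connection information received in Step 200."""
    ip_address: str
    port: int
    auth_mode: str
    username: str
    password: str

    def __post_init__(self):
        # Reject port numbers outside the valid TCP/UDP range.
        if not 1 <= self.port <= 65535:
            raise ValueError(f"invalid port: {self.port}")

    def endpoint(self) -> str:
        # Address through which a connection to the asset source may be attempted.
        return f"{self.ip_address}:{self.port}"


registration = AssetSourceRegistration(
    ip_address="192.0.2.10",
    port=8443,
    auth_mode="basic",
    username="backup-admin",
    password="example-only",
)
```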
In Step 202, based on the asset source registration (received in Step 200), the asset source is discovered and agents are deployed thereto. In one embodiment of the invention, discovering the asset source may entail establishing a connection with and successfully accessing the asset source using the provided connection information. Further, agents deployed to and/or installed on the asset source may include, but are not limited to, a backup and recovery agent and a source corruption detection agent (both described above) (see e.g., FIG. 1 ).
In Step 204, a protection policy, for one or more assets (described above) (see e.g., FIG. 1 ) on the asset source, is received from the admin device. In one embodiment of the invention, the protection policy may refer to a collection of rules and/or preferences directed to the protection of asset data and/or metadata. The rules and/or preferences specified in/by the protection policy may include, but are not limited to: which asset data and/or metadata is/are critical regarding data protection; which modes of backup operations (e.g., incremental, full, etc.) should be performed that target the asset data and/or metadata, and how often (or on what schedule) should the backup operations occur; which backup target or storage medium should asset backup(s) be committed to; what is the retention span or recovery point objective (RPO) assigned to the asset data and/or metadata; whether data integrity verification, in accordance with embodiments of the invention, should be applied while performing backup operations targeting the asset data and/or metadata; and in which information mode (described below) should the aforementioned data integrity verification be performed.
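As a non-limiting sketch, the rules and preferences of a protection policy might be captured in a record such as the one below. The class name `ProtectionPolicy`, the field names, the default values, and the string labels used for the information modes are assumptions made for illustration; the disclosure does not prescribe a policy format.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for the information modes described in the text.
ALLOWED_INFORMATION_MODES = {"non-hash", "hash", "combination"}


@dataclass
class ProtectionPolicy:
    """Hypothetical representation of a protection policy (Step 204)."""
    assets: List[str]                 # which asset data/metadata is protected
    backup_mode: str = "full"         # e.g., "full" or "incremental"
    schedule: str = "daily"           # how often backup operations occur
    backup_target: str = "target-01"  # where asset backups are committed
    retention_days: int = 30          # retention span
    verify_integrity: bool = True     # whether integrity verification applies
    information_mode: str = "hash"    # format of the metadata to compare

    def __post_init__(self):
        if self.information_mode not in ALLOWED_INFORMATION_MODES:
            raise ValueError(f"unknown information mode: {self.information_mode}")


policy = ProtectionPolicy(assets=["orders-db"], backup_mode="incremental")
```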
In Step 206, based on the protection policy (received in Step 204), the backup and recovery agent (deployed to the asset source in Step 202) is instructed to initiate a backup operation targeting the asset(s) (or more specifically, certain data and/or metadata therein) with which the protection policy is associated. The instructions may specify performing a data integrity verification of the asset data and/or metadata and, accordingly, may further specify an information mode, where the information mode references a format of a predefined collection of metadata required to perform the data integrity verification. In one embodiment of the invention, the information mode may reference a non-hash mode, where a predefined collection of metadata, descriptive of the asset data, may be sought to perform the data integrity verification. In another embodiment of the invention, the information mode may reference a hash mode, where, alternatively, a cryptographic hash of the aforementioned, predefined collection of metadata, descriptive of the asset data, may instead be sought to perform the data integrity verification. Further, in yet another embodiment of the invention, the information mode may reference a combination mode, where: (a) a hash-mode based data integrity verification may be performed first; and (b) upon a successful result of (a), then a non-hash mode based data integrity verification may be performed second.
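The three information modes might be sketched as follows, by way of example only. The names `InformationMode`, `BackupInformation`, and `verify` are hypothetical, and the sketch assumes that the hash and/or the metadata have already been submitted by the asset source and the backup target.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InformationMode(Enum):
    NON_HASH = "non-hash"
    HASH = "hash"
    COMBINATION = "combination"


@dataclass
class BackupInformation:
    """Hypothetical container for what a source or target submits."""
    metadata: Optional[dict] = None
    metadata_hash: Optional[str] = None


def verify(mode: InformationMode, source: BackupInformation,
           target: BackupInformation) -> bool:
    def hashes_match() -> bool:
        return (source.metadata_hash is not None
                and source.metadata_hash == target.metadata_hash)

    def metadata_match() -> bool:
        return source.metadata is not None and source.metadata == target.metadata

    if mode is InformationMode.HASH:
        return hashes_match()
    if mode is InformationMode.NON_HASH:
        return metadata_match()
    # Combination mode: the hash check runs first; the metadata check
    # runs only if the hash check succeeds.
    return hashes_match() and metadata_match()
```

Because `and` short-circuits, the non-hash comparison in combination mode is skipped whenever the hash comparison fails, which mirrors the ordering of (a) and (b) above.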
In one embodiment of the invention, for a given asset data of one or more assets being targeted for data protection through a backup operation, the predefined collection of metadata, descriptive of the given asset data, may include, but is not limited to: a data type of the given asset data; a creation timestamp generated for the given asset data; a last modification timestamp generated for the given asset data; permission and/or ownership information associated with the given asset data; a format or extension of the given asset data; a size (e.g., in bytes) of the given asset data; and a checksum associated with the given asset data.
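As a minimal sketch, and assuming the given asset data is a single file on a local file system, a subset of the above metadata might be gathered as shown below. The function name `collect_metadata` and the dictionary keys are hypothetical. The creation timestamp is omitted because its availability is platform-dependent, and SHA-256 is used as the checksum purely as an assumption.

```python
import hashlib
import os
import stat


def collect_metadata(path: str) -> dict:
    """Gather an illustrative collection of metadata describing one file."""
    info = os.stat(path)
    with open(path, "rb") as handle:
        checksum = hashlib.sha256(handle.read()).hexdigest()
    return {
        "data_type": "file",
        "modified": info.st_mtime,                   # last modification timestamp
        "permissions": stat.filemode(info.st_mode),  # e.g., "-rw-r--r--"
        "owner_uid": info.st_uid,                    # ownership information
        "extension": os.path.splitext(path)[1],      # format or extension
        "size_bytes": info.st_size,                  # size in bytes
        "checksum": checksum,                        # checksum of the contents
    }
```

Fields such as timestamps, permissions, and ownership would presumably only match between source and target if the transfer preserves them, so a practical comparison may need to restrict itself to fields that are expected to survive the transfer.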
Further, in one embodiment of the invention, a cryptographic hash of a given information set (e.g., the above-mentioned predefined collection of metadata descriptive of a given asset data) may refer to a hash value or output that results when the given information set is fed through a cryptographic hash function. A cryptographic hash function may refer to a deterministic algorithm configured to map data of any arbitrary size to a bit array of a fixed size. Accordingly, being deterministic, a cryptographic hash function always results in a same hash value/output for a given same input or information set. Examples of the cryptographic hash function, which may be employed, may include, but are not limited to: the MD5 message-digest algorithm (or MD5, for short); and one or more variants of the Secure Hash Algorithm 2 (SHA-2) (e.g., SHA-256, which produces a 256-bit hash value/output).
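By way of example, a cryptographic hash of a metadata collection might be computed as sketched below. The function name `hash_metadata` and the choice of a sorted-key JSON serialization as the canonical input are assumptions; the disclosure does not specify how the collection is serialized before hashing.

```python
import hashlib
import json


def hash_metadata(metadata: dict, algorithm: str = "sha256") -> str:
    """Hash a metadata collection after serializing it deterministically."""
    # Sorting keys ensures that two collections with the same contents
    # serialize, and therefore hash, identically.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.new(algorithm, canonical.encode("utf-8")).hexdigest()


source_hash = hash_metadata({"size_bytes": 5, "extension": ".bin"})
target_hash = hash_metadata({"extension": ".bin", "size_bytes": 5})
```

Here `source_hash` and `target_hash` are equal because the inputs differ only in key order, while any change to a value would, with overwhelming probability, yield a different hash. Passing `"md5"` as the algorithm is also possible where the Python build makes MD5 available.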
In Step 208, source backup information is received from the asset source. In one embodiment of the invention, based on the information mode referencing a non-hash mode (described above), the source backup information may include a predefined collection of metadata descriptive of asset data from the asset(s) with which the protection policy (received in Step 204) is associated. In another embodiment of the invention, based on the information mode alternatively referencing a hash mode (described above), the source backup information may include a cryptographic hash of the aforementioned, predefined collection of metadata. Further, in either embodiment, the source backup information may pertain to a state of the asset data on the asset source just prior to the initiation of a backup operation targeting said asset data based on the instructions (provided to the asset source in Step 206).
In Step 210, a backup transferred notification is received from a backup target (described above) (see e.g., FIG. 1 ). In one embodiment of the invention, the backup transferred notification may inform the data protection manager that transfer of an asset backup (i.e., a copy of asset data of the asset(s) targeted for protection by way of the backup operation (initiated in Step 206)) from the asset source to the backup target is complete.
In Step 212, in response to the backup transferred notification (received in Step 210), an information request is submitted to the backup target. The information request may include the information mode (specified by way of the protection policy received in Step 204). Accordingly, in one embodiment of the invention, the information mode may reference a non-hash mode. In another embodiment of the invention, the information mode may alternatively reference a hash mode.
In Step 214, in response to the information request (submitted in Step 212), target backup information is received from the backup target. In one embodiment of the invention, based on the information mode referencing a non-hash mode (described above), the target backup information may include a predefined collection of metadata descriptive of asset backup data pertaining to the asset backup. In another embodiment of the invention, based on the information mode alternatively referencing a hash mode (described above), the target backup information may include a cryptographic hash of the aforementioned, predefined collection of metadata descriptive of asset backup data pertaining to the asset backup. Further, in either embodiment, the target backup information may associate with a state of the asset backup data, which had arrived at the backup target from the asset source.
In Step 216, a comparison is performed between the source backup information (received in Step 208) and the target backup information (received in Step 214). By way of the comparison, in Step 218, a determination is made as to whether the source backup information matches the target backup information. In one embodiment of the invention, if it is determined that the source backup information matches the target backup information, then the method proceeds to Step 220. On the other hand, in another embodiment of the invention, if it is alternatively determined that the source backup information mismatches the target backup information, then the method alternatively proceeds to Step 222.
In Step 220, following the determination (in Step 218) that the source backup information (received in Step 208) matches the target backup information (received in Step 214), instructions are provided to the backup target to commit the asset backup into storage. That is, in one embodiment of the invention, in having matching backup information, the asset backup that had arrived at the backup target from the asset source is found to be corruption-free and, accordingly, the asset backup may be committed.
In Step 222, following the alternative determination (in Step 218) that the source backup information (received in Step 208) mismatches the target backup information (received in Step 214), instructions are provided to the backup target to discard the asset backup. That is, in one embodiment of the invention, in having mismatching backup information, the asset backup that had arrived at the backup target from the asset source is found to be corrupt and, accordingly, the asset backup may be discarded.
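Steps 208 through 222 might be sketched end to end as follows, for illustration only. The function `run_verified_backup`, its callback parameters, and the `simulate` helper (which models the asset source, the transfer, and the backup target in memory) are hypothetical, and the metadata is reduced to a size and a SHA-256 checksum for brevity.

```python
import hashlib
from typing import Callable


def run_verified_backup(get_source_info: Callable[[], dict],
                        transfer: Callable[[], None],
                        get_target_info: Callable[[], dict],
                        commit: Callable[[], None],
                        discard: Callable[[], None]) -> str:
    source_info = get_source_info()  # Step 208: state prior to the backup
    transfer()                       # backup runs; completion noted in Step 210
    target_info = get_target_info()  # Steps 212-214: request and receive
    if source_info == target_info:   # Steps 216-218: compare and determine
        commit()                     # Step 220
        return "committed"
    discard()                        # Step 222
    return "discarded"


def describe(data: bytes) -> dict:
    return {"size": len(data), "checksum": hashlib.sha256(data).hexdigest()}


def simulate(corrupt_in_transit: bool) -> str:
    source_data = b"asset data"
    staged, storage = {}, {}

    def transfer() -> None:
        # Optionally alter the last byte to model corruption during transfer.
        arrived = (source_data[:-1] + b"?") if corrupt_in_transit else source_data
        staged["backup"] = arrived

    return run_verified_backup(
        get_source_info=lambda: describe(source_data),
        transfer=transfer,
        get_target_info=lambda: describe(staged["backup"]),
        commit=lambda: storage.update(staged),
        discard=staged.clear,
    )
```

Calling `simulate(False)` follows the commit path of Step 220, whereas `simulate(True)` follows the discard path of Step 222 because the checksum of the altered copy no longer matches.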
FIG. 3 shows an exemplary computing system in accordance with one or more embodiments of the invention. The computing system (300) may include one or more computer processors (302), non-persistent storage (304) (e.g., volatile memory, such as random access memory (RAM), cache memory), persistent storage (306) (e.g., a hard disk, an optical drive such as a compact disk (CD) drive or digital versatile disk (DVD) drive, a flash memory, etc.), a communication interface (312) (e.g., Bluetooth interface, infrared interface, network interface, optical interface, etc.), input devices (310), output devices (308), and numerous other elements (not shown) and functionalities. Each of these components is described below.
In one embodiment of the invention, the computer processor(s) (302) may be an integrated circuit for processing instructions. For example, the computer processor(s) may be one or more cores or micro-cores of a central processing unit (CPU) and/or a graphics processing unit (GPU). The computing system (300) may also include one or more input devices (310), such as a touchscreen, keyboard, mouse, microphone, touchpad, electronic pen, or any other type of input device. Further, the communication interface (312) may include an integrated circuit for connecting the computing system (300) to a network (not shown) (e.g., a local area network (LAN), a wide area network (WAN) such as the Internet, mobile network, or any other type of network) and/or to another device, such as another computing device.
In one embodiment of the invention, the computing system (300) may include one or more output devices (308), such as a screen (e.g., a liquid crystal display (LCD), a plasma display, touchscreen, cathode ray tube (CRT) monitor, projector, or other display device), a printer, external storage, or any other output device. One or more of the output devices may be the same as or different from the input device(s). The input and output device(s) may be locally or remotely connected to the computer processor(s) (302), non-persistent storage (304), and persistent storage (306). Many different types of computing systems exist, and the aforementioned input and output device(s) may take other forms.
Software instructions in the form of computer readable program code to perform embodiments of the invention may be stored, in whole or in part, temporarily or permanently, on a non-transitory computer readable medium such as a CD, DVD, storage device, a diskette, a tape, flash memory, physical memory, or any other computer readable storage medium. Specifically, the software instructions may correspond to computer readable program code that, when executed by one or more processors, is configured to perform one or more embodiments of the invention.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.

Claims (20)

What is claimed is:
1. A method for data protection, comprising:
providing, to an asset source, instructions for initiating a backup operation targeting asset data of an asset on the asset source;
receiving, from the asset source and in response to the instructions, source backup information indicating a state of the asset data prior to initiation of the backup operation by the asset source;
receiving, from a backup target, a backup transferred notification informing that the backup target has received a completed transfer of an asset backup representing a copy of the asset data from the asset source by way of the backup operation;
submitting, to the backup target and based on receiving the backup transferred notification, a target backup information request comprising an information mode;
receiving, from the backup target and in response to submission of the target backup information request, target backup information indicating a state of the asset backup;
making a determination that the asset backup, having arrived at the backup target, is corruption-free based on the state of the asset data matching the state of the asset backup; and
instructing, based on the determination, the backup target to commit the asset backup.
2. The method of claim 1, wherein the state of the asset data comprises asset metadata descriptive of the asset data, the asset metadata comprising a data type of the asset data, a creation timestamp generated for the asset data, a last modification timestamp generated for the asset data, ownership information associated with the asset data, a size of the asset data, and a checksum associated with the asset data.
3. The method of claim 1, wherein the source backup information comprises a cryptographic hash of asset metadata describing the asset data prior to the initiation of the backup operation.
4. The method of claim 1, wherein the information mode references a non-hash mode.
5. The method of claim 4, wherein based on the information mode referencing the non-hash mode, the target backup information comprises asset backup metadata describing asset backup data pertaining to the asset backup following a transfer of the asset backup from the asset source to the backup target.
6. The method of claim 1, wherein the information mode references a hash mode.
7. The method of claim 6, wherein based on the information mode referencing the hash mode, the target backup information comprises a cryptographic hash of asset backup metadata describing asset backup data pertaining to the asset backup following a transfer of the asset backup from the asset source to the backup target.
8. The method of claim 1, further comprising:
providing, to the asset source, second instructions for initiating a second backup operation targeting second asset data of a second asset on the asset source;
receiving, from the asset source and in response to the second instructions, second source backup information indicating a second state of the second asset data prior to initiation of the second backup operation by the asset source;
receiving, from the backup target and following a second arrival thereon of a second asset backup representing a copy of the second asset data, second target backup information indicating a second state of the second asset backup,
wherein the second arrival of the second asset backup follows a second completed transfer thereof from the asset source by way of the second backup operation;
making a second determination that the second asset backup, having arrived at the backup target, is corrupted based on the second state of the second asset data mismatching the second state of the second asset backup; and
instructing, based on the second determination, the backup target to discard the second asset backup.
9. A non-transitory computer readable medium (CRM) comprising computer readable program code, which when executed by a computer processor, enables the computer processor to perform a method for data protection, the method comprising:
providing, to an asset source, instructions for initiating a backup operation targeting asset data of an asset on the asset source;
receiving, from the asset source and in response to the instructions, source backup information indicating a state of the asset data prior to initiation of the backup operation by the asset source;
receiving, from a backup target, a backup transferred notification informing that the backup target has received a completed transfer of an asset backup representing a copy of the asset data from the asset source by way of the backup operation;
submitting, to the backup target and based on receiving the backup transferred notification, a target backup information request comprising an information mode;
receiving, from the backup target and in response to submission of the target backup information request, target backup information indicating a state of the asset backup;
making a determination that the asset backup, having arrived at the backup target, is corruption-free based on the state of the asset data matching the state of the asset backup; and
instructing, based on the determination, the backup target to commit the asset backup.
10. The non-transitory CRM of claim 9, wherein the state of the asset data comprises asset metadata descriptive of the asset data, the asset metadata comprising a data type of the asset data, a creation timestamp generated for the asset data, a last modification timestamp generated for the asset data, ownership information associated with the asset data, a size of the asset data, and a checksum associated with the asset data.
11. The non-transitory CRM of claim 9, wherein the source backup information comprises a cryptographic hash of asset metadata describing the asset data prior to the initiation of the backup operation.
12. The non-transitory CRM of claim 9, wherein the information mode references a non-hash mode.
13. The non-transitory CRM of claim 12, wherein based on the information mode referencing the non-hash mode, the target backup information comprises asset backup metadata describing asset backup data pertaining to the asset backup following a transfer of the asset backup from the asset source to the backup target.
14. The non-transitory CRM of claim 9, wherein the information mode references a hash mode.
15. The non-transitory CRM of claim 14, wherein based on the information mode referencing the hash mode, the target backup information comprises a cryptographic hash of asset backup metadata describing asset backup data pertaining to the asset backup following a transfer of the asset backup from the asset source to the backup target.
16. The non-transitory CRM of claim 9, the method further comprising:
providing, to the asset source, second instructions for initiating a second backup operation targeting second asset data of a second asset on the asset source;
receiving, from the asset source and in response to the second instructions, second source backup information indicating a second state of the second asset data prior to initiation of the second backup operation by the asset source;
receiving, from the backup target and following a second arrival thereon of a second asset backup representing a copy of the second asset data, second target backup information indicating a second state of the second asset backup,
wherein the second arrival of the second asset backup follows a second completed transfer thereof from the asset source by way of the second backup operation;
making a second determination that the second asset backup, having arrived at the backup target, is corrupted based on the second state of the second asset data mismatching the second state of the second asset backup; and
instructing, based on the second determination, the backup target to discard the second asset backup.
17. A system, comprising:
an asset source comprising an asset;
a backup target operatively connected to the asset source; and
a data protection manager operatively connected to the asset source and the backup target, and comprising a computer processor configured to perform a method for data protection, the method comprising:
providing, to the asset source, instructions for initiating a backup operation targeting asset data of the asset on the asset source;
receiving, from the asset source and in response to the instructions, source backup information indicating a state of the asset data prior to initiation of the backup operation by the asset source;
receiving, from the backup target, a backup transferred notification informing that the backup target has received a completed transfer of an asset backup representing a copy of the asset data from the asset source by way of the backup operation;
submitting, to the backup target and based on receiving the backup transferred notification, a target backup information request comprising an information mode;
receiving, from the backup target and in response to submission of the target backup information request, target backup information indicating a state of the asset backup;
making a determination that the asset backup, having arrived at the backup target, is corruption-free based on the state of the asset data matching the state of the asset backup; and
instructing, based on the determination, the backup target to commit the asset backup.
18. The system of claim 17, wherein the source backup information comprises a cryptographic hash of asset metadata describing the asset data prior to the initiation of the backup operation.
19. The system of claim 17, wherein the information mode references a non-hash mode, and wherein based on the information mode referencing the non-hash mode: the target backup information comprises asset backup metadata describing asset backup data pertaining to the asset backup following a transfer of the asset backup from the asset source to the backup target.
20. The system of claim 17, wherein the information mode references a hash mode, and wherein based on the information mode referencing the hash mode: the target backup information comprises a cryptographic hash of asset backup metadata describing asset backup data pertaining to the asset backup following a transfer of the asset backup from the asset source to the backup target.
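For illustration only, the hash and non-hash information modes recited above might be sketched as follows. Everything concrete here is an assumption that the claims do not specify: the use of SHA-256, the JSON serialization, the mode strings "hash" and "non-hash", the metadata field names, and the function names `metadata_hash` and `target_backup_information`.

```python
import hashlib
import json


def metadata_hash(metadata: dict) -> str:
    """Return a SHA-256 hex digest of the metadata (assumed hash function)."""
    # Sorted keys and fixed separators give equal metadata identical bytes.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def target_backup_information(asset_backup_metadata: dict, information_mode: str):
    """Build target backup information for the requested information mode."""
    if information_mode == "hash":
        # Hash mode: return only a cryptographic hash of the metadata.
        return metadata_hash(asset_backup_metadata)
    if information_mode == "non-hash":
        # Non-hash mode: return a copy of the metadata itself.
        return dict(asset_backup_metadata)
    raise ValueError(f"unknown information mode: {information_mode!r}")


if __name__ == "__main__":
    # Hypothetical metadata fields, loosely following the kinds listed in claim 2.
    metadata = {
        "data_type": "database",
        "created": "2021-06-01T00:00:00Z",
        "modified": "2021-06-10T12:30:00Z",
        "owner": "svc-backup",
        "size": 1048576,
        "checksum": "9f2c",
    }
    source_info = metadata_hash(metadata)
    target_info = target_backup_information(metadata, "hash")
    print(source_info == target_info)
```

In hash mode only a fixed-length digest crosses the network, whereas non-hash mode returns the metadata itself and leaves any hashing to the party performing the comparison.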
US17/384,975 2021-06-11 2021-07-26 Source versus target metadata-based data integrity checking Active 2041-11-02 US11782795B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202141026166 2021-06-11
IN202141026166 2021-06-11

Publications (2)

Publication Number Publication Date
US20220398165A1 (en) 2022-12-15
US11782795B2 (en) 2023-10-10

Family

ID=84390219

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/384,975 Active 2041-11-02 US11782795B2 (en) 2021-06-11 2021-07-26 Source versus target metadata-based data integrity checking

Country Status (1)

Country Link
US (1) US11782795B2 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070050777A1 (en) * 2003-06-09 2007-03-01 Hutchinson Thomas W Duration of alerts and scanning of large data stores
US20090059915A1 (en) * 2007-08-29 2009-03-05 Dell Products, Lp System and method of automating use of a data integrity routine within a network
US20100191907A1 (en) * 2009-01-26 2010-07-29 Lsi Corporation RAID Converter and Methods for Transforming a First RAID Array to a Second RAID Array Without Creating a Backup Copy
US20140188805A1 (en) * 2012-12-28 2014-07-03 Commvault Systems, Inc. Backup and restoration for a deduplicated file system
US20160285872A1 (en) * 2011-10-04 2016-09-29 Electro Industries/Gauge Tech Intelligent electronic devices, systems and methods for communicating messages over a network
CN106936771A (en) * 2015-12-29 2017-07-07 航天信息股份有限公司 A kind of secure cloud storage method and system based on graded encryption
US9760445B1 (en) * 2014-06-05 2017-09-12 EMC IP Holding Company LLC Data protection using change-based measurements in block-based backup
US20200293671A1 (en) * 2019-03-13 2020-09-17 Hewlett Packard Enterprise Development Lp Device and method for secure data backup

Also Published As

Publication number Publication date
US20220398165A1 (en) 2022-12-15

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BIJOY, SAVITHA SUSAN;KULKARNI, GURURAJ;KAMATH, MAHESH;AND OTHERS;REEL/FRAME:056993/0097

Effective date: 20210719

AS Assignment

Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, NORTH CAROLINA

Free format text: SECURITY AGREEMENT;ASSIGNORS:DELL PRODUCTS, L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:057682/0830

Effective date: 20211001

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS

Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:057931/0392

Effective date: 20210908

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS

Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:058014/0560

Effective date: 20210908

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS

Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:057758/0286

Effective date: 20210908

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (057931/0392);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:062022/0382

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (057931/0392);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:062022/0382

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (057758/0286);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061654/0064

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (057758/0286);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061654/0064

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (058014/0560);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:062022/0473

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (058014/0560);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:062022/0473

Effective date: 20220329

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE