WO2009031157A2 - Method and apparatus for grid based data recovery - Google Patents

Method and apparatus for grid based data recovery

Info

Publication number
WO2009031157A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
volume
blocks
disk
agent
Prior art date
Application number
PCT/IL2008/001210
Other languages
French (fr)
Other versions
WO2009031157A3 (en)
Inventor
Leonid Remennik
Henry Broodney
Eli Bernstein
Original Assignee
Ingrid Networks Ltd
Priority date
Filing date
Publication date
Application filed by Ingrid Networks Ltd
Publication of WO2009031157A2
Publication of WO2009031157A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • G06F11/1453 Management of the data involved in backup or backup restore using de-duplication of the data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Definitions

  • the present invention relates generally to recovery of data from a backup, and in particular to recovery of data from a backup stored on a network incorporating a grid based backup system.
  • a major concern in computerized systems is reliably backing up data, so that the data can be restored if needed.
  • the need to restore data may arise due to external physical causes (e.g. theft, fire, and flood), hardware failure (e.g. a disk crash) or software related issues, such as viruses, accidental or intentional erasure or other causes.
  • Some backup methods require installation of additional hardware specifically for this purpose (e.g. a tape backup).
  • Other methods use existing multipurpose hardware, for example a CD/DVD writer to backup data. Some methods are performed automatically and some methods are performed manually by a user. Additionally, some methods are performed at a specific time and other methods backup data continuously each time the file is altered.
  • the current invention deals with a backup system that utilizes free storage space on computers connected together in a network to backup each other.
  • a backup system is referred to as a backup grid.
  • a user may copy files from his/her computer over the network to another computer's disk to serve as a backup copy in case of damage to a specific file on the user's station.
  • an agent is activated on all participating nodes of the network. The agent automatically backs up data based on user preferences, for example specific disk drives, directories or specific files.
  • the backup data may replace previous versions of the files or may be implemented as incremental backups and provide multiple versions of a file so that the user may restore the file to any point in time.
  • a backup grid over a network may enable backing up some or all data of the stations of a network without the use of additional hardware.
  • the damaged file can be recovered from other stations in the network.
  • if the local disk of a computer is damaged, more complex methods of data recovery will be required to access the data and recover from the damaged disk.
  • An aspect of an embodiment of the invention relates to a system and method of backup recovery using a backup grid over a network.
  • the backup grid is based on a group of computer stations connected over a network wherein data from a specific station is stored on other stations of the network and can then be restored if needed.
  • each participating station executes an agent software application to enable it to participate in the backup grid.
  • data from a specific station selected to be backed up is defined to be a volume. The contents of the volume are divided up into a linked list of blocks and the blocks are dispersed over the stations participating in the backup grid.
  • To restore data a user provides an agent with a volume identifier.
  • the agent locates the first block of the volume and then uses the first block to locate the rest of the blocks belonging to the volume.
  • the agent mounts the volume as a virtual disk so that the user can access the data like accessing a local hard disk without needing to take into account the identity of the stations that are storing the data.
  • some or all of the blocks that make up the volume are stored more than once on other stations of the grid. This provides a certain level of tolerance so that if some stations are not available or lose data the backed up volume can still be accessed.
  • the station from which the user wants to access the data has a damaged disk or corrupted content, so that it cannot boot.
  • the user may replace the local disk with a functional disk and boot the station from an external disk, for example an external SCSI disk or external USB disk.
  • the user may boot the station from a CD or from a boot ROM that accesses a network server.
  • the server may or may not be participating in the backup grid as long as it is accessible from the network.
  • the agent is executed to provide access to the backup grid, so that a volume can be mounted.
  • the agent can format a new or unreadable local disk and copy the content of a mounted volume to the local disk, so that once the station is restarted it will function as it did before the local disk's contents were damaged.
  • a method of restoring data to a selected station from a backup grid wherein the backup grid is implemented as a network of computer stations serving as peers providing a backup service, the method comprising:
  • the collection of blocks that make up the volume of data forms a linked list.
  • the collection of blocks that make up the volume of data forms a tree structure.
  • the identifier is translated by the agent to provide the location of the first block of the volume of data.
  • copies of at least some of the blocks from the collection of blocks are stored at more than one participating station.
  • the selected station is booted from an external source and loads the agent from the external source.
  • the external source is a bootable external USB drive or SCSI drive. Alternatively or additionally, the external source is a bootable CD.
  • the external source is a server that enables booting from a network boot ROM.
  • the server executes a copy of the agent and participates in the backup grid.
  • the server does not participate in the backup grid.
  • the method includes booting a virtual machine from the mounted virtual disk to enable a user to work with the operating system of the backed up volume and use the data of the backed up volume.
  • the virtual machine uses a local disk to satisfy write requests by the virtual machine.
  • the virtual machine uses an accessible network disk to satisfy write requests by the virtual machine.
  • the method includes restoring a local disk with the content of the virtual disk while working with the virtual machine.
  • the blocks of data required by the user or the virtual machine while the local disk is being restored will be provided from the local disk if already available on the local disk, otherwise they will be preferentially retrieved from the virtual disk and written to the local disk.
  • some of the agents provide local storage to store blocks and some agents do not.
  • a single agent provides storage to store blocks.
  • a grid system for backing up and restoring data comprising:
  • At least two computer stations with local disks connected over a network
  • An agent application executed on each computer station participating in the grid system
  • the agent application is adapted to define selected information from the station on which it is installed as a volume of data made up by a collection of blocks of data; and backup the volume of data by dispersing the blocks to the other agents participating in the network, so that they may store copies on their local disks;
  • the agent is adapted to identify a volume of data; locate the blocks belonging to the volume of data in the network; and mount the volume of data as a virtual disk by providing access to the data in the volume, wherein the location of the blocks that make up the volume of data is transparent to the accessor.
  • the agent is further adapted to boot a virtual machine from the mounted virtual disk to enable a user to work with the operating system of the backed up volume and use the data of the backed up volume.
  • the agent is further adapted to restore the content of the local disk of the computer station that mounted the virtual machine from the content of the virtual disk while the user is working with the virtual machine.
  • FIG. 1 is a schematic illustration of a grid based backup system incorporated over a network, according to an exemplary embodiment of the invention
  • Fig. 2 is a block diagram of the elements of an agent application, according to an exemplary embodiment of the invention.
  • Fig. 3 is a schematic representation of a backed up disk volume on a grid, according to an exemplary embodiment of the invention.
  • Fig. 4 is a flow diagram of a process of restoring data, according to an exemplary embodiment of the invention.
  • Fig. 5 is a schematic illustration of a system with a damaged hard drive having a non-functional local boot-up, according to an exemplary embodiment of the invention.
  • Fig. 6 is a flow diagram of a process of repairing a station with a nonfunctional boot-up, according to an exemplary embodiment of the invention.
  • Fig. 1 is a schematic illustration of a grid based backup system 100 incorporated over a network 120, according to an exemplary embodiment of the invention.
  • network 120 may be a LAN, or a WAN (e.g. the Internet) or any other type of network.
  • multiple computer stations 110 are connected via network 120 to participate together as a peer to peer grid based backup system, referred to henceforth as grid 100.
  • each station executes an agent application 200 to identify it as a member of grid 100 and to enable it to interact with grid 100.
  • a station 110 that needs to restore data will need to load an agent 200 to communicate with the agents of the other stations 110 that are participating in grid 100.
  • Fig. 2 is a block diagram of the elements of agent application 200, according to an exemplary embodiment of the invention.
  • agent application 200 includes a grid network interface 230 to communicate with other agents 200 over the network, for example to receive data, provide data or provide control information between members of grid 100.
  • agent 200 includes a backup service 250 to handle the backup of data over grid 100 from the local disks of the station 110 hosting the agent 200.
  • agent 200 optionally includes a recovery service 240 to handle restoration of data from grid 100 back to the local disks of station 110.
  • agent 200 includes a backup control 220 to deal with the control of the storage of data provided by other stations 110 to its local disk.
  • a section of the local disk is designated to be used as a local storage 210 for the members of grid 100.
  • the designated local storage 210 may be allocated statically or dynamically, for example statically by being decremented from the overall disk space available to the local user or dynamically by being set as a maximum size that is dynamically allocated until reaching the maximum value or being used up by the user.
  • a user can designate, which content is of interest to be backed up and the designated content will be backed up by agent 200 to grid 100 every time the content is altered.
  • the designated content can be an entire disk, an entire partition, a specific directory or group of directories or specific files.
  • the content will be split into blocks which will be dispersed among the stations 110 participating in grid 100.
  • each block of data is saved 2, 3 or 4 times or more depending on the reliability of the stations 110 or network 120.
  • some blocks are stored more times than others, optionally depending on their importance to being able to restore the backed up data. For example, a block holding directory information may be more important than a block of a single file, since if the directory information is missing all the blocks in the backup set may be useless.
  • the selected content from each station 110 is referred to as a volume and transformed into a linked list of blocks or a tree of blocks.
  • Each block is identified by a key value, and each block points to other blocks of the list or tree.
  • the data may be compressed, encoded or preprocessed in other ways before or after being divided into blocks.
  • agent 200 receives a block of data and applies a hashing algorithm to the data to generate a key value that identifies the specific block based on its content.
  • two blocks with the same data will provide the same key value and can be handled once.
  • agent 200 executes a translation function on the key value to generate a list of one or more stations 110, which will be selected to store a copy of each specific block.
  • Fig. 3 is a schematic representation of a backed up disk volume on a grid, according to an exemplary embodiment of the invention.
  • the backed up disk volume is labeled "disk c" and represented as a tree of blocks 300. Copies of the blocks are stored in local storage 210 of various stations of grid 100 as illustrated by table 310.
  • the first block of the volume is given a unique key value by agent 200 to identify the volume for future access.
  • the key value for the first block is based on a user selected name or the name of the content (e.g. station name and disk name) instead of being based on the content, as it is for the rest of the blocks.
  • the key value for the first block is created by applying a translation function to a name and password provided by the user to identify the content.
  • Fig. 4 is a flow diagram of a process 400 of restoring data, according to an exemplary embodiment of the invention.
  • the user needs to access (410) a station with agent 200 participating in grid 100.
  • the user provides (420) the identifier of the volume (e.g. disk, partition) containing the desired content (e.g. file, directory) to be restored.
  • agent 200 calculates (430) the key value of the first block of the backed up volume.
  • agent 200 uses (440) the translation function to determine the identity of the stations with copies of the first block.
  • Agent 200 accesses (450) the first block and uses it to verify the content of the entire volume, since the first block includes pointers to one or more sequential blocks.
  • agent 200 mounts (460) the volume as a virtual disk (VD) to allow access to the content of the volume.
  • mounting the volume under an operating system such as Microsoft Windows™ includes providing a drive letter and allowing the user to view and copy the content of the drive, so that he/she may restore the desired files/folders back to local disk drives.
  • the disk of station 110 is damaged or the operating system is damaged so that the user cannot boot-up station 110 and access agent 200 to recover the content of the disk.
  • Use of the above process would only allow access to the backed up data from a functional station.
  • the user could repair station 110, for example by replacing the disk, installing the operating system, installing agent 200, and then accessing the backed up volume to restore its content.
  • some operating systems (e.g. Microsoft Windows™) do not allow programs to be restored by simply copying the program folder; instead, they require all programs to be reinstalled when the operating system is reinstalled.
  • Fig. 5 is a schematic illustration of a grid system 500 with a station 510 having a non-functional boot-up, according to an exemplary embodiment of the invention.
  • Fig. 6 is a flow diagram of the process 600 of repairing station 510 with a non-functional boot-up, according to an exemplary embodiment of the invention.
  • the hard drive is first replaced (610) with a functional hard drive with a capacity that will suffice to accommodate the backed up content of the station (e.g. a disk of the same size or larger).
  • Station 510 is then booted up (620) using other boot up means, since its hard drive is not loaded with an operating system.
  • One possible method is using a boot CD 560 that will load an operating system, and then executing (630) the software for agent 200 from the boot CD to access grid 100.
  • station 510 can be booted up from an external hard drive 570 (e.g. a SCSI disk or USB disk).
  • station 510 can be booted up using a boot-ROM from its network interface 580, if network 120 provides access of station 510 to a server 540 that supports booting up from a network interface boot ROM.
  • station 510 will load an operating system from server 540 and execute (630) agent 200 to access grid 100.
  • server 540 also executes agent 200 and participates in grid 100.
  • server 540 may provide the boot service to station 510 but does not participate in grid 100.
  • the user can identify (640) the volume that has the backed up image of station 510.
  • Agent 200 will mount (650) the volume as a virtual disk 550 (VD) to make the content of the volume accessible.
  • virtual disk 550 is a read only disk since agent 200 handles backing up data from a local disk and not direct writing from virtual disk 550.
  • station 510 can then boot (660) a virtual machine (VM) providing the user with an instance of the previous operating system running off of virtual disk 550.
  • agent 200 can be activated to format the local functional disk and restore (670) the entire content of the volume previously backed up from station 510 while the user is working.
  • the virtual machine will use disk space from the local disk or from an accessible network disk as a cache for write requests issued by the virtual machine.
  • station 510 can be restarted (680) to function as it did before the damage to the disk or disk content occurred.
  • the transition from working off of the virtual disk to working off the local disk is performed continuously without requiring the user to restart station 510.
  • agent 200 keeps track of the blocks that have been restored and written to the local disk 520 and the blocks that have not been restored, for example using a bit map with a bit representing each block of the volume being restored and resetting the bit when the block is read from grid 100 and written to the local disk.
  • agent 200 will check if the required blocks have been restored and their content is available on local disk 520 that is being reconstructed. If the blocks are available locally, agent 200 will provide them from local disk 520, otherwise they will be restored ahead of turn and provided to agent 200 for the user and written to local disk 520.
  • Agent 200 may then disconnect the restore operation from grid 100 since the entire volume has been restored.
  • a user may boot a virtual machine to access his/her data and not attempt to restore the local disk, for example when booting from an alternative station.
  • a cache is allocated by agent 200 from the available disk space on the local machine to buffer retrieved data and satisfy write requests by the virtual machine. If no local disk space is available, for example if the local disk is full or damaged, an accessible network drive or a USB device can be used.
  • not all agents 200 have a local storage available on the station 110 for storing data or some agents may have blocked off this option.
  • some agents may only store data and not backup data. As a result some agents may only retrieve backed up data and some may only backup data.
  • grid 100 may include a single station backing up data for a plurality of stations.
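
The key handling described in the list above (a content-based key for each block, a key for the first block derived from a name and password, and a translation function that maps a key to the stations that store the block) might be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the text names neither a hash algorithm nor a placement scheme, so SHA-256, the function names, and the rank-by-hash placement (a form of rendezvous hashing) are all hypothetical choices.

```python
import hashlib


def content_key(block: bytes) -> str:
    """Key derived from the block content (SHA-256 is an assumed choice).

    Two blocks with the same data yield the same key, so they can be
    handled once.
    """
    return hashlib.sha256(block).hexdigest()


def first_block_key(name: str, password: str) -> str:
    """Key for a volume's first block, derived from a user-supplied name
    and password rather than from the block content (hypothetical scheme)."""
    return hashlib.sha256(f"{name}\0{password}".encode("utf-8")).hexdigest()


def stations_for_key(key: str, stations: list[str], copies: int) -> list[str]:
    """Hypothetical translation function: rank every station by a hash of
    (key, station) and select the first `copies` stations to hold the block."""
    ranked = sorted(
        stations,
        key=lambda s: hashlib.sha256(f"{key}:{s}".encode("utf-8")).hexdigest(),
    )
    return ranked[:copies]
```

Because the placement depends only on the key and the station names, any agent that knows the key and the station list can recompute where the copies are held without consulting a central index. That property is a consequence of the assumed scheme, not a statement from the source.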

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A method of restoring data to a selected station from a backup grid, wherein the backup grid is implemented as a network of computer stations serving as peers providing a backup service, the method including: executing an agent on each computer station participating in the backup grid; providing the agent of the selected station with an identifier to find a volume of data previously backed up to the backup grid as a collection of blocks dispersed over the participating stations; locating the blocks belonging to the volume of data; and mounting the volume of data as a virtual disk by providing access to the data in the volume, wherein the location of the blocks that make up the volume of data is transparent to the accessor.

Description

METHOD AND APPARATUS FOR GRID BASED DATA RECOVERY
RELATED APPLICATIONS
The current application claims priority from US provisional application No. 60/970,957, titled Method and Apparatus for Grid Based Data Protection, filed on September 9, 2007, the disclosure of which is incorporated herein by reference.
FIELD OF THE INVENTION
The present invention relates generally to recovery of data from a backup, and in particular to recovery of data from a backup stored on a network incorporating a grid based backup system.
BACKGROUND OF THE INVENTION
A major concern in computerized systems is reliably backing up data, so that the data can be restored if needed. The need to restore data may arise due to external physical causes (e.g. theft, fire, and flood), hardware failure (e.g. a disk crash) or software related issues, such as viruses, accidental or intentional erasure or other causes.
Some backup methods require installation of additional hardware specifically for this purpose (e.g. a tape backup). Other methods use existing multipurpose hardware, for example a CD/DVD writer to backup data. Some methods are performed automatically and some methods are performed manually by a user. Additionally, some methods are performed at a specific time and other methods backup data continuously each time the file is altered.
The current invention deals with a backup system that utilizes free storage space on computers connected together in a network to backup each other. Such a backup system is referred to as a backup grid. In the simplest form a user may copy files from his/her computer over the network to another computer's disk to serve as a backup copy in case of damage to a specific file on the user's station. In a more complex system such as described in the above related application an agent is activated on all participating nodes of the network. The agent automatically backs up data based on user preferences, for example specific disk drives, directories or specific files. The backup data may replace previous versions of the files or may be implemented as incremental backups and provide multiple versions of a file so that the user may restore the file to any point in time.
The use of a backup grid over a network may enable backing up some or all data of the stations of a network without the use of additional hardware. In case of damage to a specific file, the damaged file can be recovered from other stations in the network. However, if the local disk of a computer is damaged, more complex methods of data recovery will be required to access the data and recover from the damaged disk.
A PCT application filed on September 7, 2008 by the applicant, titled "Method and Apparatus for Grid Based Data Protection", the disclosure of which is incorporated herein by reference describes backup methods based on the description in the provisional application listed above.
SUMMARY OF THE INVENTION
An aspect of an embodiment of the invention relates to a system and method of backup recovery using a backup grid over a network. The backup grid is based on a group of computer stations connected over a network, wherein data from a specific station is stored on other stations of the network and can then be restored if needed. In an exemplary embodiment of the invention, each participating station executes an agent software application to enable it to participate in the backup grid. Optionally, data from a specific station selected to be backed up is defined to be a volume. The contents of the volume are divided up into a linked list of blocks and the blocks are dispersed over the stations participating in the backup grid. To restore data, a user provides an agent with a volume identifier. The agent locates the first block of the volume and then uses the first block to locate the rest of the blocks belonging to the volume. The agent mounts the volume as a virtual disk so that the user can access the data as if it were on a local hard disk, without needing to take into account the identity of the stations that are storing the data.
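The restore path summarized above, in which the agent finds the first block and follows it to the remaining blocks, might be sketched as follows. This is a minimal illustration that assumes the linked-list layout; the `Block` type, the `next_key` field and the `fetch` callback are hypothetical names that do not appear in the application.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class Block:
    payload: bytes
    next_key: Optional[str]  # key of the following block; None marks the end


def read_volume(first_key: str, fetch: Callable[[str], Block]) -> Iterator[bytes]:
    """Yield the payload of each block of a volume, in list order.

    `fetch` stands in for whatever the agent uses to retrieve a block
    from the grid by its key (an assumption of this sketch).
    """
    seen = set()
    key: Optional[str] = first_key
    while key is not None:
        if key in seen:
            # Guard against a corrupted list that points back on itself.
            raise ValueError(f"cycle detected at block {key!r}")
        seen.add(key)
        block = fetch(key)
        yield block.payload
        key = block.next_key
```

A caller that holds the blocks in a dictionary could pass `grid.__getitem__` as `fetch`; the caller never needs to know which station supplied each block.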
In some embodiments of the invention, some or all of the blocks that make up the volume are stored more than once on other stations of the grid. This provides a certain level of tolerance so that if some stations are not available or lose data the backed up volume can still be accessed.
In some embodiments of the invention, the station from which the user wants to access the data has a damaged disk or corrupted content, so that it cannot boot. Optionally, the user may replace the local disk with a functional disk and boot the station from an external disk, for example an external SCSI disk or external USB disk. Alternatively, the user may boot the station from a CD or from a boot ROM that accesses a network server. The server may or may not be participating in the backup grid as long as it is accessible from the network. After booting the station, the agent is executed to provide access to the backup grid, so that a volume can be mounted. In an exemplary embodiment of the invention, the agent can format a new or unreadable local disk and copy the content of a mounted volume to the local disk, so that once the station is restarted it will function as it did before the local disk's contents were damaged.
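The tolerance gained from storing a block on several stations might be sketched as a lookup that tries each holder in turn. The function and parameter names are hypothetical, and modelling an unreachable station as `None` is an assumption of this sketch rather than anything stated in the application.

```python
from typing import Dict, Iterable, Optional


def fetch_with_fallback(
    key: str,
    holders: Iterable[str],
    station_stores: Dict[str, Optional[Dict[str, bytes]]],
) -> bytes:
    """Return the block stored under `key` from the first holder that has it.

    `station_stores` maps a station name to its block store; a value of
    None models a station that is currently unreachable.
    """
    for station in holders:
        store = station_stores.get(station)
        if store is not None and key in store:
            return store[key]
    raise LookupError(f"no reachable station holds block {key!r}")
```

With three copies of a block, the lookup still succeeds when one holder is offline and another has lost the block, which is the situation the paragraph above describes.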
There is thus provided according to an exemplary embodiment of the invention, a method of restoring data to a selected station from a backup grid, wherein the backup grid is implemented as a network of computer stations serving as peers providing a backup service, the method comprising:
Executing an agent on each computer station participating in the backup grid;
Providing the agent of the selected station with an identifier to find a volume of data previously backed up to the backup grid as a collection of blocks dispersed over the participating stations;
Locating the blocks belonging to the volume of data; and
Mounting the volume of data as a virtual disk by providing access to the data in the volume, wherein the location of the blocks that make up the volume of data is transparent to the accessor.
In an exemplary embodiment of the invention, the collection of blocks that make up the volume of data forms a linked list. Optionally, the collection of blocks that make up the volume of data forms a tree structure. In an exemplary embodiment of the invention, the identifier is translated by the agent to provide the location of the first block of the volume of data. Optionally, copies of at least some of the blocks from the collection of blocks are stored at more than one participating station. In an exemplary embodiment of the invention, the selected station is booted from an external source and loads the agent from the external source. Optionally, the external source is a bootable external USB drive or SCSI drive. Alternatively or additionally, the external source is a bootable CD. Further alternatively or additionally, the external source is a server that enables booting from a network boot ROM. In an exemplary embodiment of the invention, the server executes a copy of the agent and participates in the backup grid. Alternatively, the server does not participate in the backup grid.
In an exemplary embodiment of the invention, the method includes booting a virtual machine from the mounted virtual disk to enable a user to work with the operating system of the backed up volume and use the data of the backed up volume. Optionally, the virtual machine uses a local disk to satisfy write requests by the virtual machine. In an exemplary embodiment of the invention, the virtual machine uses an accessible network disk to satisfy write requests by the virtual machine. Optionally, the method includes restoring a local disk with the content of the virtual disk while working with the virtual machine. In an exemplary embodiment of the invention, the blocks of data required by the user or the virtual machine while the local disk is being restored will be provided from the local disk if already available on the local disk, otherwise they will be preferentially retrieved from the virtual disk and written to the local disk. Optionally, some of the agents provide local storage to store blocks and some agents do not. In an exemplary embodiment of the invention, a single agent provides storage to store blocks.
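The read path used while the local disk is being restored (serve a block locally if it has already been restored, otherwise fetch it ahead of turn and write it to the local disk) might be sketched as follows. The class, its method names and the per-block boolean list standing in for a bitmap are assumptions made for illustration; `fetch_from_grid` is a hypothetical callback.

```python
from typing import Callable, List, Optional


class RestoringDisk:
    """Local disk under reconstruction, readable before the restore completes."""

    def __init__(self, block_count: int, fetch_from_grid: Callable[[int], bytes]):
        self.local: List[Optional[bytes]] = [None] * block_count
        self.restored: List[bool] = [False] * block_count  # one flag per block
        self.fetch = fetch_from_grid

    def _restore(self, index: int) -> None:
        # Read the block from the grid and write it to the local disk.
        self.local[index] = self.fetch(index)
        self.restored[index] = True

    def read(self, index: int) -> Optional[bytes]:
        """Serve a read request, restoring the block ahead of turn if needed."""
        if not self.restored[index]:
            self._restore(index)
        return self.local[index]

    def restore_next(self) -> bool:
        """Background step: restore the lowest-numbered missing block.

        Returns False once every block is on the local disk.
        """
        for index, done in enumerate(self.restored):
            if not done:
                self._restore(index)
                return True
        return False
```

Each block is fetched from the grid at most once, whether the background restore or a user read reaches it first. Once `restore_next` returns False, every read is served from the local disk.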
There is further provided according to an exemplary embodiment of the invention, a grid system for backing up and restoring data, comprising:
at least two computer stations with local disks connected over a network;
an agent application executed on each computer station participating in the grid system;
wherein the agent application is adapted to define selected information from the station on which it is installed as a volume of data made up of a collection of blocks of data, and to back up the volume of data by dispersing the blocks to the other agents participating in the network, so that they may store copies on their local disks; and
wherein the agent is then adapted to identify a volume of data, locate the blocks belonging to the volume of data in the network, and mount the volume of data as a virtual disk by providing access to the data in the volume, wherein the location of the blocks that make up the volume of data is transparent to the accessor.
In an exemplary embodiment of the invention, the agent is further adapted to boot a virtual machine from the mounted virtual disk to enable a user to work with the operating system of the backed up volume and use the data of the backed up volume. Optionally, the agent is further adapted to restore the content of the local disk of the computer station that mounted the virtual machine from the content of the virtual disk while the user is working with the virtual machine.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and better appreciated from the following detailed description taken in conjunction with the drawings. Identical structures, elements or parts, which appear in more than one figure, are generally labeled with the same or similar number in all the figures in which they appear, wherein:
Fig. 1 is a schematic illustration of a grid based backup system incorporated over a network, according to an exemplary embodiment of the invention;
Fig. 2 is a block diagram of the elements of an agent application, according to an exemplary embodiment of the invention;
Fig. 3 is a schematic representation of a backed up disk volume on a grid, according to an exemplary embodiment of the invention;
Fig. 4 is a flow diagram of a process of restoring data, according to an exemplary embodiment of the invention;
Fig. 5 is a schematic illustration of a system with a damaged hard drive having a non-functional local boot-up, according to an exemplary embodiment of the invention; and
Fig. 6 is a flow diagram of a process of repairing a station with a non-functional boot-up, according to an exemplary embodiment of the invention.
DETAILED DESCRIPTION
Fig. 1 is a schematic illustration of a grid based backup system 100 incorporated over a network 120, according to an exemplary embodiment of the invention. Optionally, network 120 may be a LAN, or a WAN (e.g. the Internet) or any other type of network.
In an exemplary embodiment of the invention, multiple computer stations 110 are connected via network 120 to participate together as a peer-to-peer grid based backup system, referred to henceforth as grid 100. Optionally, each station executes an agent application 200 to identify it as a member of grid 100 and to enable it to interact with grid 100. In an exemplary embodiment of the invention, a station 110 that needs to restore data will need to load an agent 200 to communicate with the agents of the other stations 110 that are participating in grid 100.
Fig. 2 is a block diagram of the elements of agent application 200, according to an exemplary embodiment of the invention. In an exemplary embodiment of the invention, agent application 200 includes a grid network interface 230 to communicate with other agents 200 over the network, for example to receive data, provide data or provide control information between members of grid 100. Optionally, agent 200 includes a backup service 250 to handle the backup of data over grid 100 from the local disks of the station 110 hosting the agent 200. Likewise, agent 200 optionally includes a recovery service 240 to handle restoration of data from grid 100 back to the local disks of station 110. In an exemplary embodiment of the invention, agent 200 includes a backup control 220 to control the storage, on its local disk, of data provided by other stations 110. Optionally, a section of the local disk is designated to be used as a local storage 210 for the members of grid 100. The designated local storage 210 may be allocated statically or dynamically, for example statically by being deducted from the overall disk space available to the local user, or dynamically by being set as a maximum size that is allocated on demand until the maximum value is reached or the space is used up by the user. In some embodiments of the invention, a user can designate which content is of interest to be backed up, and the designated content will be backed up by agent 200 to grid 100 every time the content is altered. Optionally, the designated content can be an entire disk, an entire partition, a specific directory or group of directories, or specific files. In an exemplary embodiment of the invention, the content will be split into blocks which will be dispersed among the stations 110 participating in grid 100.
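The splitting of designated content into blocks, described above, can be sketched as follows. This is a minimal illustration only, assuming fixed-size blocks; the description does not specify a block size or a splitting rule, and the names `BLOCK_SIZE` and `split_into_blocks` are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size; the description does not specify one


def split_into_blocks(content: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split content into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [content[i:i + block_size] for i in range(0, len(content), block_size)]


# Example: 10000 bytes of content yield two full blocks and one partial block.
blocks = split_into_blocks(b"x" * 10000)
```

Concatenating the blocks in order reproduces the original content, which is the property a restore operation relies on.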
In some embodiments of the invention, multiple copies of each block will be stored in grid 100 in case a station 110 is unavailable or in case of damage to the data at a specific station, for example due to disk failure of a station. Optionally, each block of data is saved 2, 3 or 4 times or more depending on the reliability of the stations 110 or network 120. In some embodiments of the invention, some blocks are stored more times than others, optionally depending on their importance for restoring the backed up data. For example, a block holding directory information may be more important than a block of a single file, since if the directory information is missing all the blocks in the backup set may be useless.
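One way to relate the number of copies to station reliability is sketched below. It assumes, purely for illustration, that stations become unavailable independently with the same probability, which the description does not state; under that assumption the chance that all n copies are unavailable is p to the power n. The function name and the numeric targets are hypothetical.

```python
def copies_needed(p_unavailable: float, max_loss_probability: float) -> int:
    """Return the smallest n such that p_unavailable ** n <= max_loss_probability.

    Assumes independent station failures (an illustrative assumption).
    """
    if not 0.0 < p_unavailable < 1.0:
        raise ValueError("p_unavailable must be strictly between 0 and 1")
    if not 0.0 < max_loss_probability < 1.0:
        raise ValueError("max_loss_probability must be strictly between 0 and 1")
    copies = 1
    while p_unavailable ** copies > max_loss_probability:
        copies += 1
    return copies


# Hypothetical targets: directory blocks get a stricter target than file blocks.
file_copies = copies_needed(0.2, 0.01)
directory_copies = copies_needed(0.2, 0.0001)
```

With these assumed figures a directory block is stored more times than an ordinary file block, matching the idea that more important blocks receive more copies.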
In an exemplary embodiment of the invention, the selected content from each station 110 (e.g. a disk, partition, folder, file) is referred to as a volume and transformed into a linked list of blocks or a tree of blocks. Each block is identified by a key value, and each block points to other blocks of the list or tree. Optionally, the data may be compressed, encoded or preprocessed in other ways before or after being divided into blocks. In an exemplary embodiment of the invention, agent 200 receives a block of data and applies a hashing algorithm to the data to generate a key value that identifies the specific block based on its content. Optionally, two blocks with the same data will produce the same key value and can be handled once. In an exemplary embodiment of the invention, agent 200 executes a translation function on the key value to generate a list of one or more stations 110, which will be selected to store a copy of each specific block. Fig. 3 is a schematic representation of a backed up disk volume on a grid, according to an exemplary embodiment of the invention. The backed up disk volume is labeled "disk c" and represented as a tree of blocks 300. Copies of the blocks are stored in local storage 210 of various stations of grid 100 as illustrated by table 310.
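The content-based key and the translation function can be sketched as follows. The description names neither a hashing algorithm nor a translation function, so SHA-256 and a rendezvous-style ranking of stations are assumptions chosen for illustration; `block_key`, `stations_for_key` and the station names are hypothetical.

```python
import hashlib


def block_key(data: bytes) -> str:
    """Derive a key from block content; identical blocks yield identical keys."""
    return hashlib.sha256(data).hexdigest()


def stations_for_key(key: str, stations: list[str], copies: int) -> list[str]:
    """Assumed translation function: rank stations by hash(key, station)
    and return the first `copies` stations in that ranking."""
    def score(station: str) -> str:
        return hashlib.sha256(f"{key}:{station}".encode("utf-8")).hexdigest()

    return sorted(stations, key=score)[:copies]


stations = ["station-a", "station-b", "station-c", "station-d"]
key = block_key(b"directory information for disk c")
chosen = stations_for_key(key, stations, copies=2)
```

Because the station list is computed from the key alone, any agent that knows a block's key can work out where its copies are stored without consulting a central index.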
In an exemplary embodiment of the invention, the first block of the volume is given a unique key value by agent 200 to identify the volume for future access. Optionally, the key value for the first block is based on a user selected name or the name of the content (e.g. station name and disk name) instead of being based on the content, as is the case for the rest of the blocks. Optionally, the key value for the first block is created by applying a translation function to a name and password provided by the user to identify the content.
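A key for the first block derived from a name and password might look like the sketch below. The description does not specify the function; PBKDF2 with SHA-256, the use of the volume name as salt, and the iteration count are all assumptions, and `volume_root_key` is a hypothetical name.

```python
import hashlib


def volume_root_key(name: str, password: str, iterations: int = 100_000) -> str:
    """Derive a repeatable key for the first block from a name and a password.

    Assumed scheme: PBKDF2-HMAC-SHA256 with the volume name used as the salt.
    """
    derived = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        name.encode("utf-8"),
        iterations,
    )
    return derived.hex()


root_key = volume_root_key("station-7/disk-c", "example password")
```

The same name and password always give the same key, so a user can locate the volume from any station running the agent.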
In some embodiments of the invention, the user is interested in restoring content to a functional station, for example to restore a damaged file or a file that was accidentally erased. Fig. 4 is a flow diagram of a process 400 of restoring data, according to an exemplary embodiment of the invention. In an exemplary embodiment of the invention, to restore any content, for example a specific file from a specific station that was previously backed up, the user needs to access (410) a station with agent 200 participating in grid 100. Optionally, the user provides (420) the identifier of the volume (e.g. disk, partition) containing the desired content (e.g. file, directory) to be restored. In an exemplary embodiment of the invention, agent 200 calculates (430) the key value of the first block of the backed up volume. Then agent 200 uses (440) the translation function to determine the identity of the stations with copies of the first block. Agent 200 accesses (450) the first block and uses it to verify the content of the entire volume, since the first block includes pointers to one or more subsequent blocks. Then agent 200 mounts (460) the volume as a virtual disk (VD) to allow access to the content of the volume. In an exemplary embodiment of the invention, mounting the volume under an operating system such as Microsoft Windows™ includes providing a drive letter and allowing the user to view and copy the content of the drive, so that he/she may restore the desired files/folders back to local disk drives.
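The step of following pointers from the first block to the rest of the volume can be sketched as a tree walk. The block layout (a payload plus an ordered list of child keys), the in-memory dictionary standing in for the grid, and the names `Block` and `read_volume` are assumptions for illustration; a real agent would fetch each block over the network from the stations given by the translation function.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Block:
    payload: bytes
    children: list[str] = field(default_factory=list)  # keys of pointed-to blocks


def read_volume(root_key: str, fetch: Callable[[str], Block]) -> bytes:
    """Walk the tree depth-first from the first block, joining payloads in order."""
    out = bytearray()
    stack = [root_key]
    while stack:
        block = fetch(stack.pop())
        out += block.payload
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(block.children))
    return bytes(out)


# Stand-in for the grid: a dictionary from key to block.
grid = {
    "root": Block(b"", ["a", "b"]),
    "a": Block(b"hello ", ["a1"]),
    "a1": Block(b"grid "),
    "b": Block(b"world"),
}
volume = read_volume("root", grid.__getitem__)
```

A missing block surfaces as a failed fetch during the walk, which is one way the first block can be used to check that the whole volume is reachable before it is mounted.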
In some embodiments of the invention, the disk of station 110 is damaged or the operating system is damaged so that the user cannot boot up station 110 and access agent 200 to recover the content of the disk. Use of the above process would only allow access to the backed up data from a functional station. Optionally, the user could repair station 110, for example by replacing the disk, installing the operating system, installing agent 200, and then accessing the backed up volume to restore its content. However, it would be more desirable to be able to restore the entire image of a disk without first reinstalling the operating system, especially for operating systems that do not allow restoration of programs by simply copying the program folder and instead require reinstallation of all programs when the operating system is reinstalled (e.g. Microsoft Windows™).
Fig. 5 is a schematic illustration of a grid system 500 with a station 510 having a non-functional boot-up, according to an exemplary embodiment of the invention. Fig. 6 is a flow diagram of the process 600 of repairing station 510 with a non-functional boot-up, according to an exemplary embodiment of the invention.
In an exemplary embodiment of the invention, if the hard disk 520 of station 510 is damaged, the hard drive is first replaced (610) with a functional hard drive with a capacity that will suffice to accommodate the backed up content of the station (e.g. a disk of the same size or larger). Station 510 is then booted up (620) using other boot-up means, since its hard drive is not loaded with an operating system. One possible method is using a boot CD 560 that will load an operating system, and then executing (630) the software for agent 200 from the boot CD to access grid 100. Alternatively, station 510 can be booted up from an external hard drive 570 (e.g. a SCSI disk or USB disk). Further alternatively, station 510 can be booted up using a boot ROM from its network interface 580, if network 120 provides station 510 with access to a server 540 that supports booting up from a network interface boot ROM. Optionally, station 510 will load an operating system from server 540 and execute (630) agent 200 to access grid 100. In some embodiments of the invention, server 540 also executes agent 200 and participates in grid 100. Alternatively, server 540 may provide the boot service to station 510 but does not participate in grid 100.
In an exemplary embodiment of the invention, after loading agent 200 the user can identify (640) the volume that has the backed up image of station 510. Agent 200 will mount (650) the volume as a virtual disk 550 (VD) to make the content of the volume accessible. Optionally, virtual disk 550 is a read-only disk, since agent 200 handles backing up data from a local disk and not direct writing to virtual disk 550. In an exemplary embodiment of the invention, station 510 can then boot (660) a virtual machine (VM) providing the user with an instance of the previous operating system running off virtual disk 550. Once the user is running the virtual machine off virtual disk 550, agent 200 can be activated to format the local functional disk and restore (670) the entire content of the volume previously backed up from station 510 while the user is working. Optionally, the virtual machine will use disk space from the local disk or from an accessible network disk as a cache for write requests issued by the virtual machine. After restoring the disk content, station 510 can be restarted (680) to function as it did before the damage to the disk or disk content occurred.
In some embodiments of the invention, the transition from working off of the virtual disk to working off the local disk is performed continuously without requiring the user to restart station 510. Optionally, agent 200 keeps track of the blocks that have been restored and written to the local disk 520 and the blocks that have not been restored, for example using a bit map with a bit representing each block of the volume being restored and resetting the bit when the block is read from grid 100 and written to the local disk. When the user requests to read data, agent 200 will check if the required blocks have been restored and their content is available on local disk 520 that is being reconstructed. If the blocks are available locally, agent 200 will provide them from local disk 520, otherwise they will be restored ahead of turn and provided to agent 200 for the user and written to local disk 520. Optionally, when all the blocks have been restored from grid 100 to local disk 520 the user will have smoothly transitioned to a state wherein all data access is performed from local disk 520. Agent 200 may then disconnect the restore operation from grid 100 since the entire volume has been restored.
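The bit map bookkeeping described above can be sketched as follows. A flag is set for every block that still has to be restored and is reset once the block has been read from the grid and written locally. The class name `RestoringDisk`, the in-memory list standing in for local disk 520, and the `fetch_from_grid` callback are assumptions for illustration.

```python
from typing import Callable, Optional


class RestoringDisk:
    """Serve reads during a restore, fetching blocks ahead of turn when needed."""

    def __init__(self, block_count: int, fetch_from_grid: Callable[[int], bytes]):
        self.pending = bytearray([1] * block_count)  # 1 = not yet restored
        self.local: list[Optional[bytes]] = [None] * block_count  # stand-in for the local disk
        self.fetch_from_grid = fetch_from_grid

    def _restore(self, index: int) -> None:
        self.local[index] = self.fetch_from_grid(index)
        self.pending[index] = 0  # reset the flag once the block is written locally

    def read(self, index: int) -> bytes:
        if self.pending[index]:
            self._restore(index)  # restored ahead of turn for the requester
        return self.local[index]

    def restore_next(self) -> bool:
        """Background step: restore the lowest pending block; False when done."""
        for index, flag in enumerate(self.pending):
            if flag:
                self._restore(index)
                return True
        return False


fetched = []


def fetch(index: int) -> bytes:
    fetched.append(index)
    return f"block-{index}".encode()


disk = RestoringDisk(4, fetch)
first_read = disk.read(2)   # not yet local, so fetched from the grid
second_read = disk.read(2)  # now served locally, no second fetch
while disk.restore_next():
    pass
```

Once every flag has been reset, all reads are served locally and the connection to the grid is no longer needed for the restore.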
In some embodiments of the invention, a user may boot a virtual machine to access his/her data and not attempt to restore the local disk, for example when booting from an alternative station. Optionally, a cache is allocated by agent 200 from the available disk space on the local machine to buffer retrieved data and satisfy write requests by the virtual machine. If no local disk space is available, for example if the local disk is full or damaged, an accessible network drive or a USB device can be used.
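A cache that satisfies write requests on top of a read-only virtual disk can be sketched as an overlay: written blocks are kept in the cache and override the backed up copy on later reads. The class name `WriteOverlay` and the dictionary used as the cache are assumptions; in practice the cache would live on the local disk, a network drive or a USB device as described above.

```python
from typing import Callable


class WriteOverlay:
    """Layer write requests over a read-only block source."""

    def __init__(self, read_base: Callable[[int], bytes]):
        self.read_base = read_base           # reads from the read-only virtual disk
        self.cache: dict[int, bytes] = {}    # blocks written by the virtual machine

    def write(self, index: int, data: bytes) -> None:
        self.cache[index] = data             # the backed up volume is never modified

    def read(self, index: int) -> bytes:
        if index in self.cache:
            return self.cache[index]         # latest written version wins
        return self.read_base(index)


backup = {0: b"boot", 1: b"data"}
overlay = WriteOverlay(backup.__getitem__)
overlay.write(1, b"edited")
```

Reads of block 1 now return the edited data while the backed up copy stays unchanged, which keeps the virtual disk usable as a read-only source.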
In some embodiments of the invention, not all agents 200 have a local storage available on the station 110 for storing data, or some agents may have blocked off this option. Alternatively or additionally, some agents may only store data and not back up data. As a result, some agents may only retrieve backed up data and some may only back up data. In an exemplary embodiment of the invention, grid 100 may include a single station backing up data for a plurality of stations.
It should be appreciated that the above described methods and apparatus may be varied in many ways, including omitting or adding steps, changing the order of steps and the type of devices used. It should be appreciated that different features may be combined in different ways. In particular, not all the features shown above in a particular embodiment are necessary in every embodiment of the invention. Further combinations of the above features are also considered to be within the scope of some embodiments of the invention.
It will be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described hereinabove. Rather the scope of the present invention is defined only by the claims, which follow.

Claims

1. A method of restoring data to a selected station from a backup grid, wherein said backup grid is implemented as a network of computer stations serving as peers providing a backup service, said method comprising: executing an agent on each computer station participating in the backup grid; providing the agent of said selected station with an identifier to find a volume of data previously backed up to the backup grid as a collection of blocks dispersed over the participating stations; locating the blocks belonging to said volume of data; mounting said volume of data as a virtual disk by providing access to the data in said volume wherein the location of the blocks from which the volume of data is made up from is transparent to the accessor.
2. A method according to claim 1, wherein the collection of blocks from which the volume of data is made up from, forms a linked list.
3. A method according to claim 1, wherein the collection of blocks from which the volume of data is made up from, forms a tree structure.
4. A method according to claim 1, wherein said identifier is translated by said agent to provide the location of the first block of said volume of data.
5. A method according to claim 1, wherein copies of at least some of the blocks from said collection of blocks are stored at more than one participating station.
6. A method according to claim 1, further comprising booting said selected station from an external source and loading said agent from the external source.
7. A method according to claim 6, wherein said external source is a bootable external USB drive or SCSI drive.
8. A method according to claim 6, wherein said external source is a bootable CD.
9. A method according to claim 6, wherein said external source is a server that enables booting from a network boot ROM.
10. A method according to claim 9, wherein said server executes a copy of said agent and participates in said backup grid.
11. A method according to claim 9, wherein said server does not participate in said backup grid.
12. A method according to claim 1, further comprising booting a virtual machine from said mounted virtual disk to enable a user to work with the operating system of the backed up volume and use the data of the backed up volume.
13. A method according to claim 12, wherein said virtual machine uses a local disk to satisfy write requests by the virtual machine.
14. A method according to claim 12, wherein said virtual machine uses an accessible network disk to satisfy write requests by the virtual machine.
15. A method according to claim 12, further comprising restoring a local disk with the content of said virtual disk while working with said virtual machine.
16. A method according to claim 15, wherein blocks of data required by the user or the virtual machine while said local disk is being restored will be provided from the local disk if already available on the local disk, otherwise they will be preferentially retrieved from said virtual disk and written to said local disk.
17. A method according to claim 1, wherein some of said agents provide local storage to store blocks and some agents do not.
18. A method according to claim 1, wherein a single agent provides storage to store blocks.
19. A grid system for backing up and restoring data, comprising: at least two computer stations with local disks connected over a network; an agent application executed on each computer station participating in the grid system; wherein said agent application is adapted to define selected information from the station on which it is installed as a volume of data made up by a collection of blocks of data; and backup the volume of data by dispersing the blocks to the other agents participating in the network, so that they may store copies on their local disks; then said agent is adapted to identify a volume of data; locate the blocks belonging to said volume of data in said network and mount said volume of data as a virtual disk by providing access to the data in said volume wherein the location of the blocks from which the volume of data is made up from is transparent to the accessor.
20. A grid system according to claim 19, wherein said agent is further adapted to boot a virtual machine from said mounted virtual disk to enable a user to work with the operating system of the backed up volume and use the data of the backed up volume.
21. A grid system according to claim 20, wherein said agent is further adapted to restore the content of the local disk of the computer station that mounted the virtual machine from the content of said virtual disk while the user is working with said virtual machine.
PCT/IL2008/001210 2007-09-09 2008-09-09 Method and apparatus for grid based data recovery WO2009031157A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US97095707P 2007-09-09 2007-09-09
US60/970,957 2007-09-09

Publications (2)

Publication Number Publication Date
WO2009031157A2 true WO2009031157A2 (en) 2009-03-12
WO2009031157A3 WO2009031157A3 (en) 2010-03-04

Family

ID=40429502

Family Applications (3)

Application Number Title Priority Date Filing Date
PCT/IL2008/001206 WO2009031156A2 (en) 2007-09-09 2008-09-07 Method and apparatus for grid based data protection
PCT/IL2008/001211 WO2009031158A2 (en) 2007-09-09 2008-09-09 Method and apparatus for network based data recovery
PCT/IL2008/001210 WO2009031157A2 (en) 2007-09-09 2008-09-09 Method and apparatus for grid based data recovery

Country Status (1)

Country Link
WO (3) WO2009031156A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2499415A (en) * 2012-02-15 2013-08-21 Tivarri Ltd Backup and restore method using virtual disk on an onsite or offsite storage appliance
US20150355984A1 (en) * 2014-06-04 2015-12-10 Pure Storage, Inc. Disaster recovery at high reliability in a storage cluster
US9442803B2 (en) 2014-06-24 2016-09-13 International Business Machines Corporation Method and system of distributed backup for computer devices in a network
CN110018878A (en) * 2018-01-09 2019-07-16 阿里巴巴集团控股有限公司 A kind of distributed system data load method and device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7716179B1 (en) 2009-10-29 2010-05-11 Wowd, Inc. DHT-based distributed file system for simultaneous use by millions of frequently disconnected, world-wide users
US8930320B2 (en) 2011-09-30 2015-01-06 Accenture Global Services Limited Distributed computing backup and recovery system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060053332A1 (en) * 2004-09-07 2006-03-09 Emc Corporation Systems and methods for recovering and backing up data
US20070156793A1 (en) * 2005-02-07 2007-07-05 D Souza Roy P Synthetic full copies of data and dynamic bulk-to-brick transformation
US20070180306A1 (en) * 2003-08-14 2007-08-02 Soran Philip E Virtual Disk Drive System and Method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5212772A (en) * 1991-02-11 1993-05-18 Gigatrend Incorporated System for storing data in backup tape device
US5778395A (en) * 1995-10-23 1998-07-07 Stac, Inc. System for backing up files from disk volumes on multiple nodes of a computer network
US6912645B2 (en) * 2001-07-19 2005-06-28 Lucent Technologies Inc. Method and apparatus for archival data storage
US7296125B2 (en) * 2001-11-29 2007-11-13 Emc Corporation Preserving a snapshot of selected data of a mass storage system
US7814056B2 (en) * 2004-05-21 2010-10-12 Computer Associates Think, Inc. Method and apparatus for data backup using data blocks
US7284019B2 (en) * 2004-08-18 2007-10-16 International Business Machines Corporation Apparatus, system, and method for differential backup using snapshot on-write data

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2499415A (en) * 2012-02-15 2013-08-21 Tivarri Ltd Backup and restore method using virtual disk on an onsite or offsite storage appliance
GB2499415B (en) * 2012-02-15 2014-09-24 Tivarri Ltd A method of backing-up, and making available by alternative means, electronic data and software initially stored on a client server
US20150355984A1 (en) * 2014-06-04 2015-12-10 Pure Storage, Inc. Disaster recovery at high reliability in a storage cluster
US10152397B2 (en) * 2014-06-04 2018-12-11 Pure Storage, Inc. Disaster recovery at high reliability in a storage cluster
US9442803B2 (en) 2014-06-24 2016-09-13 International Business Machines Corporation Method and system of distributed backup for computer devices in a network
CN110018878A (en) * 2018-01-09 2019-07-16 阿里巴巴集团控股有限公司 A kind of distributed system data load method and device
CN110018878B (en) * 2018-01-09 2022-08-30 阿里巴巴集团控股有限公司 Distributed system data loading method and device

Also Published As

Publication number Publication date
WO2009031158A2 (en) 2009-03-12
WO2009031156A3 (en) 2010-03-04
WO2009031157A3 (en) 2010-03-04
WO2009031158A3 (en) 2010-03-04
WO2009031156A2 (en) 2009-03-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08808018

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS EPO FORM 1205A DATED 15.09.2010.

122 Ep: pct application non-entry in european phase

Ref document number: 08808018

Country of ref document: EP

Kind code of ref document: A2