WO2017095465A1 - Create snapshot of subset of data - Google Patents

Publication number
WO2017095465A1
Authority
WO
WIPO (PCT)
Prior art keywords
snapshot
subset
data
unit
create
Prior art date
Application number
PCT/US2016/021799
Other languages
French (fr)
Inventor
Venkatesh Marisamy
Kanthimathi Vedaraman
Original Assignee
Hewlett Packard Enterprise Development Lp
Priority date
Filing date
Publication date
Application filed by Hewlett Packard Enterprise Development Lp
Publication of WO2017095465A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0605 Improving or facilitating administration, e.g. storage management by facilitating the interaction with a user or administrator
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G06F21/74 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information operating in dual or compartmented mode, i.e. at least one secure mode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094 Redundant storage or storage space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2097 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/84 Using snapshots, i.e. a logical point-in-time copy of the data

Definitions

  • The determine instructions 324 may be executed by the processor 310 to determine a physical address of the subset.
  • The create instructions 326 may be executed by the processor 310 to create a snapshot of the subset based on the determined address.
  • The created snapshot may not include a remainder of the data set. Further, the snapshot may only be modified if a block of data of the subset changes.
  • FIG. 4 is an example flowchart of a method 400 for creating a snapshot of a subset of data.
  • Although execution of the method 400 is described with reference to the system 100, other suitable components for execution of the method 400 can be utilized, such as the system 200. Additionally, the components for executing the method 400 may be spread among multiple devices (e.g., a processing device in communication with input and output devices). In certain scenarios, multiple devices acting in coordination can be considered a single device to perform the method 400.
  • The method 400 may be implemented in the form of executable instructions stored on a machine-readable storage medium, such as storage medium 320, and/or in the form of electronic circuitry.
  • The system 100 selects a subset 112 of a data set stored on a single logical unit number (LUN) of a storage system.
  • The system 100 determines a location of the subset 112.
  • The system 100 creates a snapshot 132 of the blocks of data corresponding to the subset 112. The snapshot 132 is triggered if any of the blocks of data corresponding to the subset 112 change. However, the snapshot 132 may not be triggered in response to a change in a block of data that corresponds to a remainder of the data set.

Abstract

A subset of a data set is determined at a Logical Unit Number (LUN) to be replicated to a storage system. A physical address of the subset is determined. A snapshot of the subset corresponding to the determined address is created. The created snapshot includes less than an entirety of the LUN.

Description

CREATE SNAPSHOT OF SUBSET OF DATA
BACKGROUND
[0001] Today, data centers are the hosting platforms for private and public cloud offerings. The data sets at these data centers are often protected and managed. Example data sets may include files as well as Virtual Machines (VMs) residing on a Network-Attached Storage (NAS) server or on an array.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] The following detailed description references the drawings, wherein:
[0003] FIG. 1 is an example block diagram of a system to create a snapshot of a subset of data;
[0004] FIG. 2 is another example block diagram of a system to create a snapshot of a subset of data;
[0005] FIG. 3 is an example block diagram of a computing device including instructions for creating a snapshot of a subset of data; and
[0006] FIG. 4 is an example flowchart of a method for creating a snapshot of a subset of data.
DETAILED DESCRIPTION
[0007] Specific details are given in the following description to provide a thorough understanding of embodiments. However, it will be understood that embodiments may be practiced without these specific details. For example, systems may be shown in block diagrams in order not to obscure embodiments in unnecessary detail. In other instances, well-known processes, structures and techniques may be shown without unnecessary detail in order to avoid obscuring embodiments.
[0008] In Big Data centers, data management may be integral to continuous operations and to providing effective services. Generally, the data may be hosted on Redundant Array of Independent Disks (RAID) based arrays, and administrators may run array-based snapshots to create replicas for vertical and horizontal management of data. Snapshots may also be used as sources for data backups. These snapshots may be taken at a Logical Unit Number (LUN) level, which is not aware of the applications or databases since they sit on top of the OS/file system. Thus, no application-based snapshots are possible. Further, a LUN snapshot meant for a particular application may include all types of data that are not relevant to the purpose for which it was created. This may result in overly expensive snapshots in terms of size and operational efficiency of the array.
[0009] Management and protection of virtualized data may not be as straightforward as in the case of physical servers. To protect a VM, it may first have to be placed in a quiescent mode to maintain VM consistency, and then a snapshot at the array level may have to be taken. For example, if a hundred VMs reside on a single LUN and there is a request to create a snapshot of the VMs belonging to a particular user (e.g., 10 VMs), the snapshot may be taken of the complete LUN, which also holds the other 90 VMs that do not belong to that user. This does not address the actual requirement of data center management and also results in excessive use or wastage of processing cycles and of space on the array. Further, current techniques do not provide a mechanism of application awareness at the block level to protect only the blocks corresponding to the particular data set chosen by the data/backup administrator to be protected.
[0010] Examples provide a method and/or system to take a snapshot of only the relevant data, such as a data set chosen by an administrator. In one example, a system may include a data protection unit, an address unit and a snapshot unit. The data protection unit may determine a subset of a data set at a Logical Unit Number (LUN) to be replicated to a storage system. The address unit may determine a physical address of the subset. The snapshot unit may create a snapshot of the subset corresponding to the determined address. The created snapshot may include less than an entirety of the LUN.
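The division of work among the three units can be illustrated with a minimal Python sketch. It is not the claimed implementation: the in-memory LUN, the layout table, the object names and all three function names are hypothetical, chosen only to show a subset being selected, resolved to block addresses, and captured without the rest of the LUN.

```python
# Hypothetical layout of a LUN: object name -> list of (start_block, block_count) extents.
LUN_LAYOUT = {
    "vm-1": [(0, 4)],
    "vm-2": [(4, 4)],
    "vm-3": [(8, 2), (14, 2)],
    "vm-4": [(10, 4)],
}

# Hypothetical LUN contents: one short payload per block.
LUN_BLOCKS = [f"block-{i}".encode() for i in range(16)]


def determine_subset(requested, layout):
    """Data protection unit: keep only the requested objects that exist on the LUN."""
    return [name for name in requested if name in layout]


def determine_addresses(subset, layout):
    """Address unit: expand each object's extents into a sorted list of block numbers."""
    blocks = set()
    for name in subset:
        for start, count in layout[name]:
            blocks.update(range(start, start + count))
    return sorted(blocks)


def create_snapshot(block_numbers, lun_blocks):
    """Snapshot unit: copy only the addressed blocks, keyed by block number."""
    return {n: lun_blocks[n] for n in block_numbers}


subset = determine_subset(["vm-1", "vm-3"], LUN_LAYOUT)
addresses = determine_addresses(subset, LUN_LAYOUT)
snapshot = create_snapshot(addresses, LUN_BLOCKS)
print(len(snapshot), "of", len(LUN_BLOCKS), "blocks captured")
```

In this toy layout the snapshot holds 8 of the 16 blocks, so it includes less than the entirety of the LUN.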
[0011] Examples may reduce the number of times copy-on-write is performed. Copy-on-write is performed when any change is made to the blocks for which the snapshot is created, so reducing the number of blocks in the snapshot reduces the number of copy-on-write operations. Hence, examples may avoid copy-on-write for the blocks not related to the snapshot, so that processing cycles and space are spent only on the required blocks.
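The effect on copy-on-write can be shown with a small simulation. The sketch below is an assumption-laden toy, not the array's mechanism: the SubsetSnapshot class, the 16-block volume and the write sequence are all hypothetical. It preserves the original contents of a block only when that block belongs to the tracked set, and counts how many copies were needed.

```python
class SubsetSnapshot:
    """Toy copy-on-write snapshot that tracks only a chosen set of blocks (hypothetical)."""

    def __init__(self, volume, tracked_blocks):
        self.volume = volume                 # live data, modified in place
        self.tracked = set(tracked_blocks)   # blocks that belong to the subset
        self.preserved = {}                  # block number -> original contents
        self.cow_count = 0

    def write(self, block, data):
        # Copy-on-write only the first time a tracked block is overwritten.
        if block in self.tracked and block not in self.preserved:
            self.preserved[block] = self.volume[block]
            self.cow_count += 1
        self.volume[block] = data

    def read_snapshot(self, block):
        """Return the point-in-time contents of a tracked block."""
        if block not in self.tracked:
            raise KeyError(f"block {block} is not part of the snapshot")
        return self.preserved.get(block, self.volume[block])


# Hypothetical write workload: (block number, new contents).
writes = [(0, b"new-0"), (5, b"new-5"), (6, b"new-6"), (0, b"newer-0"), (9, b"new-9")]


def run(tracked):
    snap = SubsetSnapshot([b"old"] * 16, tracked)
    for block, data in writes:
        snap.write(block, data)
    return snap


full = run(range(16))            # whole-LUN snapshot
thin = run([0, 1, 2, 3, 8, 9])   # snapshot of a subset only
print(full.cow_count, thin.cow_count)
```

With this workload the whole-LUN snapshot performs four copies and the subset snapshot performs two, because writes to blocks 5 and 6 fall outside the subset and are applied without preserving the old contents.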
[0012] Examples may provide continuous block level thin snapshots, which may remove the dependency on application level incremental backups. Examples may also store metadata based on the thin snapshots, which may then be presented instantaneously to choose the Files/VMs to be accessed for further data management, such as restoring and/or replica access.
[0013] Referring now to the drawings, FIG. 1 is an example block diagram of a system 100 to create a snapshot of a subset of data. The system 100 may include or be part of a microprocessor, a controller, a memory module or device, a notebook computer, a desktop computer, an all-in-one system, a server, a network device, a wireless device, a network and the like.
[0014] The system 100 is shown to include a data protection unit 110, an address unit 120 and a snapshot unit 130. The data protection, address and snapshot units 110, 120 and 130 may include, for example, a hardware device including electronic circuitry for implementing the functionality described below, such as control logic and/or memory. In addition or as an alternative, the data protection, address and snapshot units 110, 120 and 130 may be implemented as a series of instructions encoded on a machine-readable storage medium and executable by a processor.
[0015] The data protection unit 110 may determine a subset 112 of a data set at a Logical Unit Number (LUN) to be replicated to a storage system. The data set may include a file, a file system, a virtual machine (VM), and the like. The storage system may include at least one of storage virtualization, a storage area network (SAN), and network-attached storage (NAS).
[0016] The address unit 120 may determine a physical address 122 of the subset 112. The snapshot unit 130 may create a snapshot 132 of the subset 112 corresponding to the determined address 122. The created snapshot 132 may include less than an entirety of the LUN. The term snapshot may refer to the state of a system at a particular point in time, and the term LUN may refer to a virtual device that provides an area of usable storage capacity on one or more physical disk drive(s) in a computer system.
[0017] The system 100 may be located separately from the LUN or storage system. Similarly, the LUN may be located separately from the storage system. For example, the data protection, address and/or snapshot units 110, 120 and 130 may be located at a host while the LUN is located at a client. The storage system may be located at the host or at a site remote from both the host and the client. The system 100 is explained in greater detail below with respect to FIGS. 2-4.
[0018] FIG. 2 is another example block diagram of a system 200 to create a snapshot of a subset of data. The system 200 may include or be part of a microprocessor, a controller, a memory module or device, a notebook computer, a desktop computer, an all-in-one system, a server, a network device, a wireless device, a network and the like. Further, the system 200 of FIG. 2 may include at least the functionality and/or hardware of the system 100 of FIG. 1. For example, a data protection unit 210, an address unit 220 and a snapshot unit 230 of the system 200 of FIG. 2 may include at least the respective functionality and/or hardware of the data protection, address and/or snapshot units 110, 120 and 130 of the system 100 of FIG. 1. The system 200 of FIG. 2 is also shown to include the LUN 240 and the storage system 250. The storage system 250 may be, for example, a NAS array.
[0019] As noted above, the data protection unit 210 may determine a subset 212 of a data set 241 at the LUN 240 to be replicated to the storage system 250. For example, an administrator, protection policy and/or a user may determine the subset 212 to be replicated and then indicate the subset to the data protection unit 210, such as through a Graphical User Interface (GUI).
[0020] The address unit 220 may determine a physical address 222 of the subset 212, such as via a plugin or a tool like vmkfstools or hdparm. The snapshot unit 230 may create a snapshot of the subset 212 corresponding to the determined address 222. In one example, the address unit 220 may create a continuous filter string 224 indicating an address 222 of the subset 212. In this case, the snapshot unit 230 may use the filter string 224 to create the snapshot 232 in the storage system 250.
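The description does not specify the format of the filter string 224. As a rough illustration only, the Python sketch below assumes a hypothetical comma-separated "start-end" encoding of block ranges; the function names and the encoding are inventions for this example, and the block numbers are taken as already known rather than queried from vmkfstools or hdparm.

```python
def build_filter_string(block_numbers):
    """Collapse block numbers into a compact 'start-end' range string (hypothetical format)."""
    ranges = []
    for n in sorted(set(block_numbers)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1][1] = n          # extend the current run of adjacent blocks
        else:
            ranges.append([n, n])      # start a new run
    return ",".join(f"{a}-{b}" for a, b in ranges)


def parse_filter_string(text):
    """Expand a filter string back into the list of block numbers it covers."""
    if not text:
        return []
    blocks = []
    for part in text.split(","):
        start, end = part.split("-")
        blocks.extend(range(int(start), int(end) + 1))
    return blocks


filter_string = build_filter_string([8, 9, 0, 1, 2, 3, 14, 15])
print(filter_string)
```

A snapshot component could then iterate over parse_filter_string(filter_string) to copy exactly the addressed blocks, which keeps the address unit and the snapshot unit decoupled through a single string.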
[0021] Here, the data set 241 at the LUN 240 is shown to include a set of 4 VMs 242-1 to 242-4. However, examples are not limited to VMs and may include other types of objects, such as file systems or files. Further, while FIG. 2 shows 4 VMs 242-1 to 242-4, examples of the data set 241 may include more or fewer than 4 VMs. The LUN 240 may be a thin-provisioned volume, which allows many VMs to be stored on a same data volume.
[0022] The data protection unit 210 may select the subset 212 of the set 241 of VMs 242. Here, it may be decided, such as by the administrator or user, to select the first and third VMs 242-1 and 242-3. Therefore, the data protection unit 210 may determine the subset 212 to correspond to only some of the VMs 242, such as the first and third VMs 242-1 and 242-3. The snapshot unit 230 may then create the snapshot 232 to correspond to the subset 212 of VMs 242. The created snapshot 232 may not include any of a remainder of the set 241 of VMs 242. Thus, the snapshot 232 created at the storage system 250 may only include the first and third VMs 242-1 and 242-3. Further, the created snapshot 232 may include less than an entirety of the LUN.
[0023] In another example, the data protection unit 210 may further determine the data set 241 to correspond to only an application (not shown) of the subset 212 of VMs 242. In this case, the snapshot unit 230 may create a snapshot 232 of only blocks of data corresponding to the application.
[0024] The data protection unit 210 may trigger the snapshot unit 230 to create a new snapshot 232 and/or modify the current snapshot 232, if at least one block of data of the subset 212 is changed. For example, a copy-on-write may be carried out for a change to any selected block of data corresponding to the subset 212. Conversely, a copy-on-write may be skipped for a change to a block of data not corresponding to the subset 212. In another example, the snapshot unit 230 may also store metadata 260, such as a mapping table, of the subset 212 separately for data management when creating the snapshot 232.
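The separately stored metadata can be pictured as a table that maps each protected object to its blocks. The sketch below is a hypothetical illustration, not the format of metadata 260: the function names, the JSON serialization and the eight-block LUN are assumptions made for the example. It builds the mapping table while copying the subset's blocks, then uses the table alone to reassemble one object from the snapshot.

```python
import json


def create_snapshot_with_metadata(subset_layout, lun_blocks):
    """Copy the subset's blocks and build a separate object -> blocks mapping table."""
    snapshot_blocks = {}
    mapping_table = {}
    for name, block_numbers in subset_layout.items():
        mapping_table[name] = list(block_numbers)
        for n in block_numbers:
            snapshot_blocks[n] = lun_blocks[n]
    return snapshot_blocks, mapping_table


def restore_object(name, snapshot_blocks, mapping_table):
    """Reassemble one object's data using only the mapping table and the snapshot."""
    return b"".join(snapshot_blocks[n] for n in mapping_table[name])


lun = [bytes([65 + i]) for i in range(8)]     # hypothetical blocks holding b"A" .. b"H"
layout = {"vm-1": [0, 1], "vm-3": [4, 6]}     # hypothetical subset and its block numbers
blocks, table = create_snapshot_with_metadata(layout, lun)
metadata_json = json.dumps(table, sort_keys=True)   # kept apart from the block data
print(restore_object("vm-3", blocks, table))
```

Keeping the table apart from the block data means the list of restorable objects can be presented without reading the snapshot itself.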
[0025] FIG. 3 is an example block diagram of a computing device 300 including instructions for creating a snapshot of a subset of data. In the embodiment of FIG. 3, the computing device 300 includes a processor 310 and a machine-readable storage medium 320. The machine-readable storage medium 320 further includes instructions 322, 324 and 326 for creating a snapshot of a subset of data.
[0026] The computing device 300 may be included in or part of, for example, a microprocessor, a controller, a memory module or device, a notebook computer, a desktop computer, an all-in-one system, a server, a network device, a wireless device, or any other type of device capable of executing the instructions 322, 324 and 326. In certain examples, the computing device 300 may include or be connected to additional components such as memories, controllers, etc.
[0027] The processor 310 may be at least one central processing unit (CPU), at least one semiconductor-based microprocessor, at least one graphics processing unit (GPU), a microcontroller, special purpose logic hardware controlled by microcode or other hardware devices suitable for retrieval and execution of instructions stored in the machine-readable storage medium 320, or combinations thereof. The processor 310 may fetch, decode, and execute instructions 322, 324 and 326 to implement creating the snapshot of the subset of data. As an alternative or in addition to retrieving and executing instructions, the processor 310 may include at least one integrated circuit (IC), other control logic, other electronic circuits, or combinations thereof that include a number of electronic components for performing the functionality of instructions 322, 324 and 326.
[0028] The machine-readable storage medium 320 may be any electronic, magnetic, optical, or other physical storage device that contains or stores executable instructions. Thus, the machine-readable storage medium 320 may be, for example, Random Access Memory (RAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a storage drive, a Compact Disc Read Only Memory (CD-ROM), and the like. As such, the machine-readable storage medium 320 can be non-transitory. As described in detail below, machine-readable storage medium 320 may be encoded with a series of executable instructions for creating the snapshot of the subset of data.
[0029] Moreover, the instructions 322, 324 and 326, when executed by a processor (e.g., via one processing element or multiple processing elements of the processor) can cause the processor to perform processes, such as the process of FIG. 4. For example, the receive instructions 322 may be executed by the processor 310 to receive a request to take a snapshot of a subset of a data set.
[0030] The determine instructions 324 may be executed by the processor 310 to determine a physical address of the subset. The create instructions 326 may be executed by the processor 310 to create a snapshot of the subset based on the determined address. The created snapshot may not include a remainder of the data set. Further, the snapshot may only be modified if a block of data of the subset changes.
[0031] FIG. 4 is an example flowchart of a method 400 for creating a snapshot of a subset of data. Although execution of the method 400 is described below with reference to the system 100, other suitable components for execution of the method 400 can be utilized, such as the system 200. Additionally, the components for executing the method 400 may be spread among multiple devices (e.g., a processing device in communication with input and output devices). In certain scenarios, multiple devices acting in coordination can be considered a single device to perform the method 400. The method 400 may be implemented in the form of executable instructions stored on a machine-readable storage medium, such as storage medium 320, and/or in the form of electronic circuitry.
[0032] At block 410, the system 100 selects a subset 112 of a data set stored on a single logical unit number (LUN) of a storage system. At block 420, the system 100 determines a location of the subset 112. At block 430, the system 100 creates a snapshot 132 of the blocks of data corresponding to the subset 112. The snapshot 132 is triggered if any of the blocks of data corresponding to the subset 112 change. However, the snapshot 132 may not be triggered in response to a block of data changing that corresponds to a remainder of the data set.
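For illustration, the location step of block 420 and the trigger condition of block 430 might be sketched as follows. The extent map of object name to (start block, length), and the function names `locate` and `snapshot_triggered`, are assumptions made for this sketch and are not taken from the disclosure.

```python
def locate(extent_map, selected_names):
    """Block 420 (sketch): resolve the selected objects to LUN block addresses."""
    addrs = set()
    for name in selected_names:
        start, length = extent_map[name]
        addrs.update(range(start, start + length))
    return addrs


def snapshot_triggered(changed_addrs, subset_addrs):
    """Block 430 (sketch): trigger only if a changed block lies in the subset."""
    return not subset_addrs.isdisjoint(changed_addrs)


# Hypothetical layout: each object occupies a contiguous extent (start, length).
extent_map = {"vm1": (0, 2), "vm2": (2, 2), "vm3": (4, 2), "vm4": (6, 2)}

# Block 410 (sketch): the selected subset is the first and third VMs.
subset_addrs = locate(extent_map, ["vm1", "vm3"])

change_in_subset = snapshot_triggered({5}, subset_addrs)        # block 5 belongs to vm3
change_in_remainder = snapshot_triggered({2, 3}, subset_addrs)  # blocks of vm2 only
```

With this layout, a change to block 5 would trigger the snapshot, while changes confined to the blocks of the second VM would not.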

Claims

We claim:
1. A system, comprising:
a data protection unit to determine a subset of a data set at a Logical Unit Number (LUN) to be replicated to a storage system;
an address unit to determine a physical address of the subset; and
a snapshot unit to create a snapshot of the subset corresponding to the determined address, wherein
the created snapshot includes less than an entirety of the LUN.
2. The system of claim 1, wherein,
the data set includes at least one of a file, file system and a virtual machine (VM), and
the storage system includes at least one of storage virtualization, a storage area network (SAN), and a Network attached storage (NAS).
3. The system of claim 2, wherein,
the LUN includes a set of VMs, and
a subset of the set of VMs is selected.
4. The system of claim 3, wherein,
the data protection unit is to determine the subset to correspond to only the subset of VMs, and
the snapshot unit is to create the snapshot to correspond to the subset of VMs.
5. The system of claim 4, wherein,
the data protection unit is to further determine the data set to correspond to only an application of the subset of VMs, and
the snapshot unit is to create a snapshot of only blocks of data corresponding to the application.
6. The system of claim 4, wherein the created snapshot does not include any of a remainder of the set of VMs.
7. The system of claim 1, wherein at least one of an administrator, a user, and a protection policy are to determine the subset to be replicated.
8. The system of claim 1, wherein,
the address unit is to create a continuous filter string indicating an address of the subset, and
the snapshot unit is to use the filter string to create the snapshot in the storage system.
9. The system of claim 1, wherein the data protection unit is to trigger the snapshot unit to at least one of create a new snapshot and modify the current snapshot, if at least one block of data of the subset is changed.
10. The system of claim 10, wherein,
a copy-on-write is carried out for a change to any selected block of data corresponding to the subset, and
a copy-on-write is skipped for a change to a block of data not corresponding to the subset.
11. The system of claim 1, wherein the snapshot unit is to store metadata of the subset separately for data management when creating the snapshot.
12. A method, comprising:
selecting a subset of a data set stored on a single logical unit number (LUN) of a storage system;
determining a location of the subset; and
creating a snapshot of the blocks of data corresponding to the subset, wherein
the snapshot is triggered, if any of the blocks of data corresponding to the subset change.
13. The method of claim 12, wherein a snapshot is not triggered in response to a block of data changing that corresponds to a remainder of the data set.
14. A non-transitory computer-readable storage medium storing instructions that, if executed by a processor of a device, cause the processor to:
receive a request to take a snapshot of a subset of a data set;
determine a physical address of the subset; and
create a snapshot of the subset based on the determined address, wherein
the created snapshot does not include a remainder of the data set.
15. The non-transitory computer-readable storage medium of claim 14, wherein the snapshot is modified only if a block of data of the subset changes.
PCT/US2016/021799 2015-12-04 2016-03-10 Create snapshot of subset of data WO2017095465A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN6514CH2015 2015-12-04
IN6514/CHE/2015 2015-12-04

Publications (1)

Publication Number Publication Date
WO2017095465A1 (en) 2017-06-08

Family

ID=58797680

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/021799 WO2017095465A1 (en) 2015-12-04 2016-03-10 Create snapshot of subset of data

Country Status (1)

Country Link
WO (1) WO2017095465A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030229651A1 (en) * 2002-03-22 2003-12-11 Hitachi, Ltd. Snapshot acquisition method, storage system
US7243198B2 (en) * 2002-02-07 2007-07-10 Microsoft Corporation Method and system for transporting data content on a storage area network
US20100077160A1 (en) * 2005-06-24 2010-03-25 Peter Chi-Hsiung Liu System And Method for High Performance Enterprise Data Protection
CN104407935A (en) * 2014-11-07 2015-03-11 华为数字技术(成都)有限公司 Snapshot rollback method and storage equipment
US20150212893A1 (en) * 2014-01-24 2015-07-30 Commvault Systems, Inc. Single snapshot for multiple applications

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16871182

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16871182

Country of ref document: EP

Kind code of ref document: A1