CN109154905A - Multiple data set backup versions across multi-tiered storage - Google Patents
Multiple data set backup versions across multi-tiered storage
- Publication number
- CN109154905A CN201780031635.XA
- Authority
- CN
- China
- Prior art keywords
- backup
- data
- data set
- cloud
- storage layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
- G06F3/0605—Improving or facilitating administration, e.g. storage management by facilitating the interaction with a user or administrator
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- G06F11/1453—Management of the data involved in backup or backup restore using de-duplication of the data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1464—Management of the backup or restore process for networked environments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
- G06F3/0641—De-duplication techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
- G06F3/0649—Lifecycle management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0652—Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Abstract
A storage layer manager creates different versions of a data set backup that have different retention periods. Each version is distinctly identifiable even though the versions initially represent the same data set backup. One version can be described as the cached version of the data set backup and another as the cloud version. When the retention period of the cached version expires, the storage layer manager migrates the cloud version of the data set backup from the cache storage layer to the cloud storage layer. The storage layer manager can then reclaim the storage space occupied by the migrated data, as long as that data is not shared with cached versions of other data set backups because of deduplication.
Description
Technical field
The disclosure generally relates to the field of data processing, and more particularly to backing up data.
Background
Organizations back up to public and/or private cloud storage ("cloud backup") to reduce information technology ("IT") costs. With cloud backup, an organization can scale more easily because its IT department can avoid the time and money costs of expanding its storage infrastructure. For cloud backup, an organization's data is typically deduplicated and compressed before it is stored in public or private cloud storage.
Brief description of the drawings
Aspects of the disclosure may be better understood by referencing the accompanying drawings.
Fig. 1 is a conceptual block diagram depicting a backup appliance that creates two different representations of a data set backup with different retention periods.
Fig. 2 is a flowchart of example operations for creating multiple representations of a data set backup with multiple retention periods in response to explicit requests for the different representations from a backup application.
Fig. 3 is a conceptual diagram of an example storage layer manager that uses an arrangement of references to deduplicated data to efficiently create a cloud-type backup representation.
Fig. 4 is a flowchart of example operations for efficiently creating a cloud backup object based on a cache backup object of a data set.
Fig. 5 is a flowchart of example operations for migrating a cloud backup object and the underlying data arranged in data slabs to object-based cloud storage.
Fig. 6 is a flowchart of example operations for releasing the cached representation of a data set backup after the cloud representation of the data set backup has been migrated to a cloud target.
Fig. 7 is a flowchart of example operations for reclaiming storage space of the storage system that hosts the storage layer manager after the retention period of the cached representation of a data set backup expires.
Fig. 8 depicts an example storage system with a storage layer manager that creates multiple representations of a data set backup based on multiple retention periods.
Detailed description
Description
The description that follows includes example systems, methods, techniques, and program flows that embody aspects of the disclosure. However, it is understood that this disclosure may be practiced without these specific details. For example, this disclosure refers to temporarily storing data at a local backup appliance before migrating a data set backup to cloud storage. Moving data from a backup appliance to the cloud is only one example of layer-to-layer migration. Aspects of the disclosure can be applied to other layer-to-layer data migrations, such as migration between two cloud targets that differ in input/output performance capability. In other instances, well-known instruction instances, protocols, structures, and techniques have not been shown in detail in order not to obscure the description.
Introduction
To facilitate cloud backup while still allowing local recovery of data, an organization can use a backup appliance that integrates local caching with cloud backup (an "integrated cloud backup appliance"). When data is to be backed up to the cloud, the data, which can take various forms, travels from a storage server through the integrated cloud backup appliance to the cloud. The integrated cloud backup appliance stores the data set backup locally, which allows the data set backup to be restored efficiently from the appliance. The integrated cloud backup appliance then migrates the data backup to the designated cloud target. When storing locally, the integrated cloud backup appliance can deduplicate, compress, and encrypt the data from the storage server. Thus, the integrated cloud backup appliance can migrate compressed, encrypted data to the designated cloud target.
Overview
An application that manages data at a storage layer (a "storage layer manager") can be designed to create different representations, or versions, of a data set backup. Each representation is distinctly identifiable even though the representations initially represent the same data set backup. The different representations are associated with different retention periods. This gives an administrator greater control over data lifecycle management and allows additional data management functions. A representation corresponds to the structured metadata of a data set backup. Although deduplication causes the representations to reference the same data set backup, the representations are logically two different data set backups that can diverge with subsequent manipulation. One representation is the cache backup version of the data set backup ("cache backup" or "cached representation"), which, in accordance with a lifecycle management policy, resides at the storage layer of the backup appliance for a relatively short retention period to provide low-latency access. The other representation is the cloud backup version of the data set backup ("cloud backup" or "cloud representation"), which, in accordance with the lifecycle management policy, is retained in cloud storage for a longer retention period. When the retention period of the cached version of the data set backup expires, the storage layer manager migrates the cloud version of the data set backup from the cache storage layer to the cloud storage layer. The storage layer manager then reclaims the storage space occupied by the migrated data, as long as that data is not shared with cached versions of other data set backups because of deduplication.
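The pairing of one data set backup with two independently identified, independently expiring representations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `BackupRepresentation`, the helper names, and the use of UUIDs and `timedelta` retention periods are hypothetical choices made here, not structures taken from the disclosure.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
import uuid


@dataclass
class BackupRepresentation:
    """Hypothetical metadata record for one representation of a backup."""
    dataset_id: str
    kind: str                 # "cache" or "cloud" (assumed labels)
    retention: timedelta      # retention period for this representation
    created: datetime
    rep_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def expired(self, now: datetime) -> bool:
        # A representation expires once its retention period has elapsed.
        return now >= self.created + self.retention


def create_representations(dataset_id, cache_retention, cloud_retention, now):
    """Create a cached and a cloud representation of the same backup."""
    cache = BackupRepresentation(dataset_id, "cache", cache_retention, now)
    cloud = BackupRepresentation(dataset_id, "cloud", cloud_retention, now)
    return cache, cloud


def should_migrate(cache_rep, now):
    """Expiry of the cached representation triggers cloud migration."""
    return cache_rep.kind == "cache" and cache_rep.expired(now)
```

Both records point at the same data set but carry different identifiers and retention periods, so each can be tracked and expired on its own schedule.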
Example illustrations
Fig. 1 is a conceptual block diagram depicting a backup appliance that creates two different representations of a data set backup with different retention periods. From the perspective of a data site 103, a backup appliance 110 operates as a cache for data set backups. The data site 103 also includes a data store 115 and a backup server 116. The data store 115 hosts a backup application 114, and the backup server 116 hosts a backup application 113.
Fig. 1 is annotated with a series of letters A-D. These letters represent stages of operations, and each stage can include one or more operations. The stages are not necessarily mutually exclusive and can overlap. Although the stages are ordered for this example, they illustrate one example to aid in understanding the disclosure and should not be used to limit the claims. Subject matter falling within the scope of the claims can vary with respect to the order and to some of the operations.
Stage A includes the backup application 114 transferring data set A 112 to the backup appliance 110 for backup of data set A 112. Stage A is triggered by the backup application 113 requesting that the backup appliance 110 back up data set A to a cloud target. In this illustration, the cloud target is in cloud storage 140. Before writing data set A 112, the backup application 114 opens a connection with the backup appliance 110 according to a storage networking protocol such as the Common Internet File System (CIFS) protocol, the Network File System (NFS) protocol, or a proprietary protocol that uses custom remote procedure calls (RPCs). The backup application 114 communicates a request that indicates an identifier of data set A 112, the cloud storage 140 as the cloud target for the cloud backup, and an indication that two retention periods (NP and NC) govern the backup of data set A. A data lifecycle management policy 101 defines the two retention periods. Retention period NP specifies how long the data set A backup is cached at the backup appliance 110 (typically on the order of days or weeks), and retention period NC specifies how long the data set A backup remains in the cloud storage 140 (typically on the order of months or years). In some cases, expiration of time period NP can trigger migration of the data set A backup to cloud storage with different attributes (for example, different accessibility and different recovery guarantees). After establishing the connection with the backup appliance 110, the backup application 114 sets the backup appliance 110 as the backup target, effectively an intermediate or temporary backup target, and begins transferring data set A 112 to the backup appliance 110. The backup application 114 transfers data set A 112 as constituent data units of fixed or variable size (for example, extents, data blocks, and so on).
Stage B includes the backup appliance 110 creating a first representation of the backup of data set A 112 on the backup appliance 110. Stage B is triggered by the request from the backup application 114. As mentioned, the request indicates that the backup of data set A 112 has two retention periods. Based on the indication of two retention periods, the backup appliance 110 creates a first representation of the data set A backup, and the first representation corresponds to retention period NC. This first representation includes data set A backup metadata 122. The data set A backup metadata 122 includes an identifier of the metadata 122 (for example, a unique identifier within a namespace managed by the backup appliance 110). The metadata 122 also includes an identifier of data set A 112 and metadata of data set A (for example, permissions, size, creation date, and so on). As it receives the constituent data units of data set A 112, the backup appliance 110 performs storage efficiency operations that include deduplication. The backup appliance 110 can also apply compression and encryption to the backup of data set A 112. As data set A 112 is processed and stored locally, the backup appliance 110 updates the metadata 122 with references to the constituent units of the data set A backup. Because deduplication is being performed, the metadata 122 can reference both deduplicated data and non-deduplicated data. In this illustration, the backup application 113 transfers data set B 111 to the backup appliance 110 for backup before stage A or at some time that overlaps with stage A. When performing deduplication, the backup appliance 110 finds duplicate data between data set A 112 and data set B 111. Therefore, the metadata 122 references non-deduplicated data 129 (that is, data with no duplicate on the backup appliance 110) and deduplicated data 130. To avoid overcomplicating the illustration, the deduplicated data 130 is referenced only by the metadata 122 and data set B cache backup metadata 121. The data set B cache backup metadata 121 also references non-deduplicated data 124 that corresponds to data set B 111.
It can be by being triggered from the explicit request of back-up device 114 to create the second of data set A backup and indicate, it can be for from data
Collect the implicit request of the instruction of two retention periods of A backup, or can be the default action (example for the data with particular community
Such as, it second indicates to be in the data for tissue completely or only for the creation of certain departments).Back-up device 110 is based on number
Indicate that creation second indicates according to the first of collection A backup.The creation of backup application 110 second indicates standby to include data set A cloud
Part metadata 123 and data set A backup 125.Back-up device 110 can be indicated by duplication first to create the second expression.Although
It is copy, but the different identifiers that the result of the first copy indicated will be indicated at least with second.Back-up device 110 can answer
Data set A 112 processed inhibits duplicate removal to create data set A backup 125 to maintain the separation of lower data.However, backup
Device 110 also allows for duplicate removal, this will lead to data set A backup 125 for institute in duplicate removal data 130 and non-duplicate removal data 129
The data of the identical data set A 112 indicated.In other words, data set A backup 125 can be the reference to data.These draw
With the part that can be data set A cloud backup metadata 123 or the independent structure quoted by data set A cloud backup metadata 123.
Stage D includes the backup appliance 110 migrating the metadata 123 and the data set A backup 125 to the cloud storage 140. Assuming the cloud storage 140 uses object-based storage technology, migration of the metadata 123 and the data set A backup 125 results in an object 141 (which includes metadata 142 and data set backup 143). The form in which the metadata 123 and the data set A backup 125 are migrated can vary with the service and/or protocol used (for example, an Amazon storage service, the Microsoft Azure platform, Webscale object storage software, a Swift object/binary large object storage interface, and so on). Retention period NC governs the object 141 in the cloud storage 140. After NC expires, the object 141 can be migrated again or moved to a different level of storage or archive. After the metadata 123 and the data set A backup 125 have been migrated successfully, the metadata 123 can be removed from the backup appliance 110, and the backup appliance 110 can notify the backup application 114 that the backup of data set A 112 has been stored in the cloud storage 140. The notification includes an identifier of the object 141 to allow retrieval from the cloud storage 140. After notifying the backup application 114, the backup appliance 110 can begin removing the data set A backup from the backup appliance 110 to the extent that its constituent data units are not shared by other backups on the backup appliance 110.
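The rule that local space is freed only for units not shared with other backups amounts to a set difference. The following sketch assumes each backup's metadata is available as a list of unit references; the function name and the reference labels are hypothetical.

```python
def reclaimable_units(migrated_refs, other_backup_refs):
    """Return the unit references that can be deleted locally.

    migrated_refs: references held by the backup that was migrated.
    other_backup_refs: a list of reference lists, one per backup that
    still resides on the appliance.
    """
    still_referenced = set().union(*other_backup_refs)
    return set(migrated_refs) - still_referenced


dataset_a_refs = ["A1", "S1", "AN"]   # S1 is shared through deduplication
dataset_b_refs = ["B1", "S1"]
free_now = reclaimable_units(dataset_a_refs, [dataset_b_refs])
```

Here `free_now` contains `A1` and `AN` only; `S1` must stay on the appliance because the data set B backup still references it.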
The data site 103 can include various other hardware and/or software elements that are not depicted to avoid unnecessarily complicating Fig. 1. Fig. 1 depicts the data store 115 in contrast to the backup server 116 to avoid the misunderstanding that the disclosure is limited to receiving data sets for backup from a backup server. For example, a client device can host a backup application that transfers a data set for backup to the backup appliance 110.
In addition to allowing separate retention periods to be managed for a data set backup, creating multiple representations of a data set backup allows better control of other data lifecycle management variables (for example, backup latency, backup policies prescribed for different user groups, and so on). Providing a unique identifier for each of the multiple representations promotes this control by allowing different variables to be associated with a specific one of the representations. A service level agreement ("SLA") and/or a storage lifecycle policy ("SLP") can be assigned to each representation of a data set backup by its unique identifier. Uniqueness of the representation identifiers can be ensured by using mutually exclusive namespaces. For example, identifiers can be made to occupy mutually exclusive namespaces by modifying directories or subdirectories, storing the representations in different logical containers (for example, volumes), storing the representations on media with different mount points, and so on.
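One way to picture mutually exclusive namespaces is to derive each representation identifier from the logical container that holds it. The sketch below is an assumption-laden illustration: the volume paths, the policy fields, and the function name are all hypothetical and are not drawn from the disclosure.

```python
from pathlib import PurePosixPath


def representation_id(container: str, dataset_name: str) -> str:
    """Build an identifier that is unique because the container differs."""
    return str(PurePosixPath(container) / dataset_name)


# The same backup name lives under two different (hypothetical) volumes.
cache_id = representation_id("/vol_cache", "datasetA")
cloud_id = representation_id("/vol_cloud", "datasetA")

# Each identifier can then carry its own policy (illustrative fields only).
policies = {
    cache_id: {"retention_days": 14, "restore_latency": "minutes"},
    cloud_id: {"retention_days": 365 * 7, "restore_latency": "hours"},
}
```

Because the container prefix differs, the two identifiers cannot collide even though the data set name is the same, and each can be looked up to find the policy that governs it.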
Creating at least a cached representation and a cloud representation of a data set backup allows data isolation, for example for security compliance. Creation and/or storage of the cached representation can be restricted by permissions to restricted storage devices or memory. The cloud representation can be subject to, or governed by, a security policy that restricts movement of the cloud representation to a limited set of destinations (for example, only to particular cloud targets) and that allows transfer to those destinations only over secure connections and/or protocols.
Creating multiple representations of a data set backup also allows different representations to be associated with different SLAs/SLPs that satisfy jurisdictional requirements. For example, a backup appliance can create two cloud representations of a data set backup, each of which will be stored in cloud storage under a different jurisdiction. Because jurisdictions can have different approaches to data privacy, separate cloud representations allow jurisdiction-specific SLAs/SLPs to be applied efficiently. For example, a data owner can avoid creating an overall data management policy that contains the rules of every jurisdiction and evaluating those rules for each data set backup that is migrated to cloud storage. In addition, an additional representation can be created with its own SLA/SLP and migrated to a different backup appliance as a standby for failover.
Fig. 2 is a flowchart of example operations for creating multiple representations of a data set backup for multiple retention periods in response to explicit requests for the different representations from a backup application. Fig. 2 refers to a storage layer manager as performing the example operations. The term "storage layer manager" is used because the creation of multiple representations for multiple retention periods corresponds to backing up data into cloud storage. Fig. 2 refers to a storage layer manager rather than a backup appliance to avoid an interpretation that requires a specifically configured device (for example, a storage appliance). For example, the storage layer manager can execute in a virtual machine. Dashed lines in Fig. 2 indicate indirect or asynchronous flow between the depicted example operations.
At block 201, the storage layer manager detects a request for a low-latency backup of a data set. The request can be an explicit request to create a backup on low-latency storage, or it can be implicit. A backup request can implicitly request a low-latency backup as a side effect of requesting that the data set be backed up to cloud storage. The storage layer manager can be programmed to treat a request for a cloud backup as a request for both a cloud backup and a low-latency backup. As another example, the storage layer manager can treat any request to back up a data set as a request to create two representations of the data set, each with a different retention period.
At block 203, the storage layer manager creates a backup of the data set and a representation of that backup. The storage layer manager assigns an identifier to the created backup representation and sets an indication that the backup representation is of the cache type. Setting the backup representation as cache type indicates that the underlying data resides locally; the underlying data does not necessarily reside in memory of a conventional cache type. Although the data set is being backed up to a storage device that is local relative to the cloud storage, the data set is ultimately backed up into cloud storage. The local storage device will have lower access latency than the cloud storage. As illustrated in Fig. 1, the local storage device can be a storage device (for example, a disk storage array or a flash storage array) managed by a device (for example, a storage appliance) at a data site that is "local" to the source data. "Local" can mean being in the same local network as the source data, in the same building, and so on. Regardless of the specific deployment, the cache-type backup representation can be accessed with lower latency than cloud storage. The backup representation includes metadata of both the representation and the data set. The backup representation metadata includes the identifier assigned to the backup representation by the storage layer manager (for example, a universally unique identifier (UUID)). The backup representation can be considered to include the data set backup or can include references to the data set backup. The storage layer manager also sets the indication that the backup representation is of the cache type, for example in the backup representation metadata. This indication can later be used for operations that act on representations by type. The backup representation is logically viewed as including the data set backup, although the backup representation may have references to the constituent units of the data set backup. More generally, the type indication can be a value that represents the corresponding storage layer. For example, the type indication for a first storage layer of low-access-latency storage devices can be the value "1", which corresponds to the first storage layer, or can be "cache" to indicate that it corresponds to the low-access-latency storage layer.
At block 206, the storage layer manager communicates the cache backup representation identifier and the cache type indication to the requestor. The storage layer manager communicates the identifier and type indication in accordance with the communication protocol or storage networking protocol that the backup application uses to communicate with the storage layer manager. The storage layer manager provides the identifier and type indication to the backup application so that the backup application can control the backup representation or perform the previously described management of multiple representations of a data set backup. For example, the requestor can notify the storage layer manager when the retention period of the cache backup representation expires.
At block 207, the storage layer manager detects a request that indicates a cloud target and the data set. The backup application can send another request that indicates the data set previously indicated in the other request. In some implementations, the backup application communicates a single backup request to the storage layer manager, and the storage layer manager can treat the single request as a request to create multiple representations.
At block 208, the storage layer manager creates another backup representation of the data set. The storage layer manager assigns a different identifier to this additional backup representation and sets an indication that the representation is of the cloud type. The storage layer manager sets the cloud type indication to indicate that the backup representation and the data set backup it represents will be stored in cloud storage. The cloud-type backup representation has a longer retention period, typically substantially longer than that of the cache-type backup representation (for example, years compared with days).
At block 209, the storage layer manager communicates the cloud backup representation identifier and the cloud type indication to the requestor. The requestor can use the cloud backup representation identifier to access the data set backup in cloud storage. The storage layer manager can use the type to distinguish among data set backups and metadata when evicting data or performing garbage collection.
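A type indication stored in representation metadata can be modeled as a small enumeration, which also makes it simple to select representations by type during eviction or garbage collection. In the sketch below, the value 1 for the low-latency layer follows the example in the text; the value 2 for the cloud layer, the dictionary layout, and the function names are assumptions made for illustration.

```python
from enum import Enum


class TierType(Enum):
    CACHE = 1   # first storage layer: low access latency
    CLOUD = 2   # assumed value for the cloud storage layer


def make_metadata(rep_id, tier):
    """Build representation metadata that records the type indication."""
    return {"id": rep_id, "type": tier.value, "type_name": tier.name.lower()}


def select_by_type(representations, tier):
    """Pick out representations of one type, e.g. eviction candidates."""
    return [rep for rep in representations if rep["type"] == tier.value]


catalog = [
    make_metadata("rep-001", TierType.CACHE),
    make_metadata("rep-002", TierType.CLOUD),
    make_metadata("rep-003", TierType.CACHE),
]
eviction_candidates = select_by_type(catalog, TierType.CACHE)
```

Only `rep-001` and `rep-003` are returned as candidates, so an eviction pass over cached data never touches the cloud-type record.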
At block 210, the storage layer manager migrates the cloud-type backup representation of the data set to the cloud target in response to a migration trigger. The migration trigger can be expiration of the retention period of the cache backup representation. If the cloud-type backup representation references an underlying data set backup, the underlying data set backup is also migrated, and the relationship between the two is maintained in cloud storage as part of the migration.
Fig. 2 presents example operations that cover variations in creating and managing multiple representations. Figs. 3-4 provide example illustrations of multiple representations of a backup with deduplicated data and multiple retention periods. With deduplicated data, metadata can be arranged in a way that allows the multiple representations of a backup to be created faster. Fig. 3 is a conceptual diagram of an example storage layer manager that uses an arrangement of references to deduplicated data to efficiently create a cloud-type backup representation. Fig. 4 is a flowchart of example operations for efficiently creating a cloud-type backup representation.
In Fig. 3, storage shelf manager 302 backs up data to cloud storage 340, which uses object storage
technology. Storage shelf manager 302 is illustrated as managing backups of data set A and data set
B. Because of deduplication, the data set A backup and the data set B backup share some data.
Storage shelf manager 302 aggregates the data units that make up a data set ("composition data
units"). An aggregation of composition data units is referred to herein as a "data plate." Storage
shelf manager 302 can form a data plate from the composition data units of multiple data sets, and
each composition data unit of a data plate can be shared by multiple data sets. Storage shelf
manager 302 can form data plates based on a configured data plate size: it can accumulate
deduplicated composition data units into a data plate until the configured data plate size is
reached, with or without padding. Storage shelf manager 302 maintains metadata for each data set so
that the data set can be restored from the data plates. This metadata is referred to herein as a
composition data mapping. The composition data mapping of a data set includes the identifiers of the
data plates that hold the composition data units of the data set, together with location
information for each data plate. The location information indicates where in the data plate each
composition data unit of the data set begins (the "data plate offset") and the length or size of the
composition data unit. The composition data mapping can also indicate the compression algorithm and
encryption of the data plates.
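The role of the composition data mapping can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation of the storage shelf manager: the names UnitLocation and restore, the plate identifiers, and the in-memory dictionary of data plates are all assumptions introduced here. The sketch shows how a data set is restored from data plate identifiers, offsets, and lengths, and how two mappings can reference the same shared unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitLocation:
    """Hypothetical mapping entry: where one composition data unit lives."""
    plate_id: str  # identifier of the data plate holding the unit
    offset: int    # where the unit begins within the data plate
    length: int    # size of the unit in bytes


def restore(mapping, plates):
    """Reassemble a data set by reading each unit from its data plate, in mapping order."""
    return b"".join(
        plates[loc.plate_id][loc.offset:loc.offset + loc.length]
        for loc in mapping
    )


# Two data plates; the unit at plate-1[4:8] is shared by both data sets.
plates = {"plate-1": b"AAAASSSS", "plate-2": b"BBBBZZZZ"}
mapping_a = [UnitLocation("plate-1", 0, 4), UnitLocation("plate-1", 4, 4)]
mapping_b = [UnitLocation("plate-1", 4, 4), UnitLocation("plate-2", 0, 4)]
```

Because each mapping stores only locations, the shared unit is stored once even though both data sets can be restored in full.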
For the data set A backup and the data set B backup, storage shelf manager 302 has formed data
plates 309 in Fig. 3. Data set A cache backup object 301 references composition data mapping 305A.
Composition data mapping 305A identifies data plates 307, which are a subset of data plates 309.
Data set A cache backup object 301 includes the metadata of data set A, an indication that the
object is of the cache type, and an object identifier. Fig. 3 refers to objects rather than
representations because the metadata of data set A is distinct from the metadata for the underlying
composition data units of the data set A backup. This arrangement can be viewed as decomposing a
backup into three parts: 1) the data units that make up the data set, 2) the metadata for locating
the composition data units, or retrieval metadata, and 3) the data set backup metadata. Data plates
307 include the composition data units of data set A. For this limited example, a few composition
data units are identified as A1, S1, and AN, where S1 is a shared composition data unit: S1 is also
a composition data unit of data set B. Composition data mapping 311 of the data set B backup
references composition data unit S1 in data plates 307 as well as elements of other data plates in
data plates 309. Data set B cache backup metadata 313 references composition data mapping 311.
As in Fig. 1, Fig. 3 is annotated with a series of letters A-C. These letters represent stages of
operations, and each stage can include one or more operations. The stages are not necessarily
mutually exclusive and can overlap. Although the stages are ordered for this example, they
illustrate one example to aid understanding of the disclosure and should not be used to limit the
claims. Subject matter falling within the scope of the claims can vary with respect to the order
and some of the operations.
Stage A includes storage shelf manager 302 copying and modifying data set A cache backup object 301
to create data set A cloud backup object 303. Storage shelf manager 302 modifies the copy at least
to indicate a new identifier for object 303 and to indicate the object type as the cloud type. Stage
A is triggered by an explicit or implicit request from a backup application to back up data set A,
where the data set A backup is governed by multiple retention periods.
Stage B includes storage shelf manager 302 copying composition data mapping 305A to create
composition data mapping 305B. Storage shelf manager 302 modifies data set A cloud backup object 303
to reference the copy, composition data mapping 305B. At this point, storage shelf manager 302 has
created two representations of the data set A backup without the overhead of copying the underlying
data. In addition, storage shelf manager 302 can rely on deduplication program code to manage
subsequent modifications to the underlying data that cause the two representations of the data set
A backup to diverge. If a request modifies the cache backup of data set A, the deduplication program
code manages the references so that composition data mapping 305A is updated to reference the
changed data, which may reside in a different data plate, while composition data mapping 305B
continues to reference the unchanged data.
Stage C includes storage shelf manager 302 migrating the cloud backup of data set A. To migrate the
cloud backup of data set A, storage shelf manager 302 communicates data set A cloud backup object
303, composition data mapping 305B, and data plates 307 to cloud storage 340, possibly after a
transformation (e.g., compression, encryption, etc.). The migration results in four objects stored
in cloud storage 340: 1) data set A cloud backup metadata object 315; 2) composition data mapping
object 317; 3) object 319 for a first data plate of data plates 307; and 4) object 321 for a second
data plate of data plates 307. Storage shelf manager 302 creates these objects in cloud storage 340
with object keys based on the identifiers of the corresponding structures managed by storage shelf
manager 302. Storage shelf manager 302 creates object 321 with an object key based on the data plate
identifier of the corresponding data plate. Similarly, storage shelf manager 302 creates object 319
with an object key based on the data plate identifier of its corresponding data plate. Storage shelf
manager 302 creates composition data mapping object 317 with an object key based on the identifier
of composition data mapping 305B. Finally, storage shelf manager 302 creates data set A cloud backup
metadata object 315 with an object key based on the identifier of data set A cloud backup object
303.
Although driven somewhat by the data management/efficiency functions of storage shelf manager 302,
separating the backup into these multiple objects allows efficient retrieval of different aspects
of the data set and efficient expiration in storage. With metadata object 315, the metadata of data
set A can be retrieved without retrieving the underlying data set A, which would involve retrieving
data plate objects 319, 321 and then reconstructing data set A from the retrieved data plate
objects 319, 321. The overhead of reconstruction varies with the transformations applied to the
data before migration to cloud storage. For example, storage shelf manager 302 can compress and then
encrypt a data plate before storing it in cloud storage. To reconstruct the data set, a retrieved
data plate is decrypted and then decompressed before the composition data units of the data set can
be extracted from it. The storage efficiency of deduplication at storage shelf manager 302 carries
over into cloud storage 340, because the data plate objects contain composition data units shared
across multiple data set backups. After a successful migration, storage shelf manager 302 can,
depending on the governing SLP, remove data set A cloud backup object 303 and composition data
mapping 305B from the managed storage. For example, the SLP may allow a data set backup to exist on
multiple layers with overlapping retention periods. After the removal and/or, depending on the
governing SLP, after migration to cloud storage 340 is confirmed, the migration of the cloud backup
of data set A can be considered complete, and storage shelf manager 302 can notify the requesting
backup application that data set A has been successfully stored in cloud storage 340. Storage shelf
manager 302 can also remove or evict the composition data units of data set A that are not
composition data units of other data sets.
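The ordering of the transformations described above (compress, then encrypt; decrypt, then decompress) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and xor_cipher is a deliberately trivial stand-in for a real encryption algorithm, which the description does not specify. Only zlib from the Python standard library is used.

```python
import zlib


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder for real encryption (NOT secure); applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def prepare_for_cloud(plate: bytes, key: bytes) -> bytes:
    """Compress the data plate first, then encrypt the compressed bytes."""
    return xor_cipher(zlib.compress(plate), key)


def recover_from_cloud(blob: bytes, key: bytes) -> bytes:
    """Reverse the order on retrieval: decrypt first, then decompress."""
    return zlib.decompress(xor_cipher(blob, key))
```

Compressing before encrypting matters because encrypted output generally looks random and compresses poorly; the reconstruction overhead mentioned above is the cost of undoing both steps for every retrieved data plate.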
At some point, storage shelf manager 302 can apply similar operations to data set B cache backup
metadata 313 and composition data mapping 311. Storage shelf manager 302 copies and modifies
metadata 313 to generate a data set B cloud backup object. Storage shelf manager 302 also copies
composition data mapping 311, which is referenced by metadata 313, and updates the reference in the
data set B cloud backup object to the copy of composition data mapping 311. Storage shelf manager
302 then migrates the data set B cloud backup object and the copy of composition data mapping 311,
with the reference modifications described above for data set A. Storage shelf manager 302 also
migrates those of data plates 309 that are referenced by the copy of composition data mapping 311
and have not yet been migrated to cloud storage 340. For this example, storage shelf manager 302
already created data plate object 321 when migrating the data set A backup. Therefore, the migrated
data set B backup will reference data plate object 321. Storage shelf manager 302 can use different
techniques to track the migration of data plates. Storage shelf manager 302 can locally mark a data
plate with an indication of successful migration to cloud storage and/or maintain a separate data
structure that lists the data plates in the managed storage layer that have been migrated to
another storage layer and identifies the target storage layer. Storage shelf manager 302 can also
use a function defined by a cloud service API to determine whether a data plate has been migrated
to cloud storage 340.
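One of the tracking techniques mentioned above, a separate structure that lists migrated data plates and their target layers, might look like the following sketch. The class and function names are hypothetical, and the in-memory dictionary is an assumption; a real system would persist this state.

```python
class MigrationTracker:
    """Records which data plates have been migrated to which storage layer."""

    def __init__(self):
        self._migrated = {}  # data plate identifier -> set of target layer names

    def mark_migrated(self, plate_id, target_layer):
        self._migrated.setdefault(plate_id, set()).add(target_layer)

    def is_migrated(self, plate_id, target_layer):
        return target_layer in self._migrated.get(plate_id, set())


def plates_still_to_migrate(plate_ids, tracker, target_layer):
    """Return only the data plates that have not yet reached the target layer."""
    return [p for p in plate_ids if not tracker.is_migrated(p, target_layer)]
```

With such a structure, migrating the data set B backup after the data set A backup would skip any data plate that was already uploaded for data set A and would reference the existing data plate object instead.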
Fig. 4 depicts a flowchart of exemplary operations for efficiently creating a cloud backup object
based on the cache backup object of a data set. For consistency with Figs. 2-3, Fig. 4 refers to a
storage shelf manager as performing the operations. The operations of Fig. 4 provide an exemplary
illustration of one embodiment for creating another backup representation, as indicated in box 208
of Fig. 2. When the operations of Fig. 4 begin, the storage shelf manager has already determined
that the data set backup will have two representations.
At box 402, the storage shelf manager creates a copy of the cache backup object in order to create
the cloud backup object. The copy operation copies the backup object but assigns a different
identifier to the copy. This identifier can be generated and assigned to the copy by the operating
system in which the storage shelf manager executes.
At box 403, the storage shelf manager modifies the copy to indicate that the object is a cloud-type
backup object. The storage shelf manager can also update the copy with a name/path provided by the
backup application. For example, cache backup objects and cloud backup objects can be written to
different paths and/or to different sets of storage media that correspond to the object type.
At box 404, the storage shelf manager copies the composition data mapping referenced by the copy
that serves as the cloud backup object. When the storage shelf manager copied the cache backup
object, the reference to the composition data mapping was copied as well.
At box 408, the storage shelf manager updates the cloud backup object to reference the copy of the
composition data mapping. The storage shelf manager copies the composition data mapping and updates
the reference from the cloud backup object to the copy to ensure that the backup objects are
distinct. Changes to the cache backup object will no longer affect the cloud backup object.
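The operations of boxes 402-408 can be sketched as follows. This is a minimal illustration under assumptions, not the described implementation: BackupObject, its fields, and create_cloud_backup are hypothetical names, the mapping is modeled as a plain list of (data plate identifier, offset, length) tuples, and the identifier comes from uuid rather than from an operating system.

```python
import copy
import uuid
from dataclasses import dataclass


@dataclass
class BackupObject:
    object_id: str
    object_type: str        # "cache" or "cloud"
    dataset_metadata: dict
    mapping: list           # composition data mapping: (plate_id, offset, length) tuples


def create_cloud_backup(cache_obj: BackupObject) -> BackupObject:
    # Box 402: copy the cache backup object; the mapping reference is copied with it.
    cloud_obj = copy.copy(cache_obj)
    cloud_obj.object_id = uuid.uuid4().hex          # the copy gets its own identifier
    cloud_obj.dataset_metadata = dict(cache_obj.dataset_metadata)
    # Box 403: mark the copy as a cloud-type backup object.
    cloud_obj.object_type = "cloud"
    # Box 404: copy the composition data mapping (references only, no data is copied).
    mapping_copy = list(cache_obj.mapping)
    # Box 408: point the cloud backup object at the copy of the mapping.
    cloud_obj.mapping = mapping_copy
    return cloud_obj
```

Only references are duplicated, so creating the second representation costs roughly the size of the metadata rather than the size of the data set, and later changes to the mapping of the cache backup leave the mapping of the cloud backup untouched.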
Fig. 5 is a flowchart of exemplary operations for migrating a cloud backup object and the underlying
data, arranged in data plates, to object-based cloud storage. Fig. 5 relates to the operations
indicated by stage C of Fig. 3.
At box 501, the storage shelf manager detects a migration trigger. The migration trigger can be
sent from the backup application. The migration trigger can be detection that retention period N_P
(the retention period specified for the cache backup object) has expired and/or a notification of
that expiration. Although a retention period can be defined for each type of backup object,
embodiments can set multiple retention periods for the data set backup. If a cloud backup object
exists for the data set backup, the storage shelf manager can, as a default action, interpret
expiration of a retention period of the data set backup as expiration of the cache backup object.
In addition, the backup application or another entity can notify the storage shelf manager that N_P
has expired and can communicate an instruction to migrate the cloud backup object.
Box 503 begins a processing loop in the exemplary operations of Fig. 5 such that, together with box
509 (below), boxes 505 and 507 are repeated at least once for substantially each data plate
referenced by the composition data mapping of the cloud backup object. "Substantially each data
plate" means that, under certain conditions, a particular data plate referenced by the composition
data mapping may not be included in the processing loop formed by boxes 503 and 509. For example,
the storage shelf manager can perform additional operations to determine whether a data plate has
already been migrated to the cloud target and then avoid migrating the same data plate again. The
storage shelf manager accesses the cloud backup object to determine the composition data mapping
referenced by the cloud backup object. With the composition data mapping of the cloud backup
object, the storage shelf manager can begin iterating over the references in the composition data
mapping.
During each iteration of the processing loop established by boxes 503 and 509, at box 505, the
storage shelf manager creates an object in the cloud target for the dereferenced data plate of the
iteration. For example, the storage shelf manager can call a function defined by the application
programming interface of the cloud service provider to create the object. One argument of the
function can be the data plate, perhaps after compression and encryption transformations, and
another argument of the function can be the object key that identifies the object being created.
At box 507, the storage shelf manager updates the composition data mapping to indicate the object
key of the created data plate object. The storage shelf manager will ultimately update the
composition data mapping referenced by the cloud backup object with the object keys that identify
the created data plate objects, in place of the references to the data plates at the backup device.
At box 509, the storage shelf manager determines whether there is another data plate referenced by
the composition data mapping. If so, control returns to box 503 to process the next referenced data
plate. Otherwise, control continues to box 511.
At box 511, the backup application creates an object in the cloud target with the composition data
mapping. After the referenced data plates have been migrated to the cloud target, the composition
data mapping includes data plate object keys rather than references to data plates in the
low-latency or local storage layer (i.e., local relative to the storage shelf manager). If the
composition data mapping object is retrieved from the cloud target, the data plate object keys are
used to retrieve the required data plates. The composition data mapping object still includes the
information for restoring the composition data units of the data set (e.g., location information,
decryption information, compression information, etc.).
At box 513, after confirming that the composition data mapping object has been created, the storage
shelf manager updates the cloud backup object to indicate the object key of the composition data
mapping object. In essence, the storage shelf manager replaces the local reference to the
composition data mapping with a cloud reference (i.e., the object key).
At box 515, the storage shelf manager creates an object in the cloud target with the cloud backup
object. For example, the storage shelf manager calls the previously mentioned object creation
function with the cloud backup object as an argument. The storage shelf manager can use the
identifier of the cloud backup object as the object key, or can derive the object key from the
identifier or an exposed name (e.g., a file system handle) of the cloud backup object.
At box 517, the storage shelf manager generates an indication that the data set backup is retained
in the cloud target. The storage shelf manager can communicate to the backup application that the
data set backup has been stored in the cloud target and can provide the object key of the cloud
backup object.
At box 519, the storage shelf manager removes the cloud backup object and the composition data
mapping referenced by the cloud backup object from the associated low-latency storage layer. If the
composition data units in the data plates referenced by the composition data mapping are not
referenced by other objects, they can be removed by garbage collection.
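The flow of boxes 503-519 can be sketched with an in-memory dictionary standing in for the cloud target. Everything named here is an assumption for illustration: migrate_cloud_backup, the dictionary layouts, and the "plate/", "mapping/", and "backup/" key prefixes are hypothetical, and a real deployment would call the object creation function of the cloud provider instead of assigning into a dictionary.

```python
def migrate_cloud_backup(cloud_obj, local_mappings, local_plates, cloud, migrated):
    """Migrate a cloud backup object, its mapping, and its data plates to `cloud`.

    `cloud` is a dict acting as the object store; `migrated` is a set of data
    plate identifiers that already have an object in the cloud target.
    """
    mapping = local_mappings[cloud_obj["mapping_ref"]]
    cloud_entries = []
    for entry in mapping["entries"]:                    # boxes 503/509: loop over references
        plate_id = entry["plate"]
        plate_key = f"plate/{plate_id}"                 # object key derived from the identifier
        if plate_id not in migrated:                    # skip data plates migrated earlier
            cloud[plate_key] = local_plates[plate_id]   # box 505: create the data plate object
            migrated.add(plate_id)
        cloud_entries.append({**entry, "plate": plate_key})  # box 507: local ref -> object key

    mapping_key = f"mapping/{mapping['id']}"
    cloud[mapping_key] = {"id": mapping["id"], "entries": cloud_entries}  # box 511

    backup_key = f"backup/{cloud_obj['id']}"
    cloud[backup_key] = {**cloud_obj, "mapping_ref": mapping_key}         # boxes 513 and 515

    # Box 519: drop the local mapping; the caller discards its local cloud backup object.
    del local_mappings[cloud_obj["mapping_ref"]]
    return backup_key                                   # box 517: key reported to the requester
```

For a backup whose mapping references two data plates, this produces four objects, matching the count given for stage C of Fig. 3: one backup metadata object, one mapping object, and two data plate objects.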
Removal of the composition data units of a migrated data set backup can be carried out in different
ways. Fig. 6 depicts a flowchart of exemplary operations that remove composition data units as part
of deleting the cached representation of a data set backup from the local/low-latency storage
layer. Fig. 7 incorporates garbage collection aspects into removal of the composition data units of
an expired cached representation of a data set backup.
Fig. 6 is a flowchart of exemplary operations for releasing the cached representation of a data set
backup after the cloud representation of the data set backup has been migrated to the cloud target.
The storage shelf manager removes the cached representation in response to a storage space
reclamation trigger. Examples of storage space reclamation triggers include completion of the
migration of the data set backup to a different storage layer, a request to delete the data set
backup from the current layer, and/or expiration of the retention period associated with the
current layer.
At box 601, the storage shelf manager detects a retention-period-based trigger to remove the cache
backup. The retention-period-based trigger corresponds to expiration of data retention period N_P.
However, the trigger is not necessarily the expiration of the retention period itself. The trigger
can be the successful migration of the corresponding cloud backup, which is itself triggered in
response to expiration of the retention period.
At box 602, the storage shelf manager generates a list of the composition data units referenced by
the cache backup metadata. The storage shelf manager can populate an array, hash table, linked
list, etc., with references to the composition data units (e.g., logical addresses) and/or with
identifiers of the composition data units (e.g., block numbers).
At box 603, the storage shelf manager begins processing each of the composition data units
indicated in the list. The storage shelf manager traverses the list and selects each indicated
composition data unit for processing.
At box 605, the storage shelf manager determines whether the metadata of another cache backup
references the selected composition data unit. If a fingerprint database or associated structure
identifies the objects that reference the data represented by a fingerprint, the storage shelf
manager can use the fingerprint database to determine whether the selected composition data unit is
shared. If the fingerprint database or associated structure does not identify referencing objects,
the storage shelf manager can traverse all cache backups to determine whether any other cache
backup references the selected composition data unit. In some embodiments, the storage shelf
manager can access the fingerprint database to determine whether an entry exists for the selected
composition data unit. If no entry exists, or if the reference counter is set to 1, the storage
shelf manager can proceed as if no other cache backup references the selected composition data
unit. If the reference counter is greater than 1, the storage shelf manager proceeds to determine
whether the additional reference is from a cache backup or from a cloud backup. If another cache
backup references the selected composition data unit, control flows to box 606. Otherwise, control
proceeds to box 607.
At box 606, the storage shelf manager removes the indication of the selected composition data unit
from the list. If another cache backup references the selected composition data unit, releasing the
selected composition data unit would be inappropriate.
At box 607, the storage shelf manager determines whether the list includes another composition data
unit that has not yet been selected. If so, control flows back to box 603. Otherwise, control flows
to box 609.
At box 609, the storage shelf manager deletes from the low-latency storage layer all composition
data units still indicated in the list. At this point, the list should indicate only composition
data units referenced solely by the cache backup whose retention period has expired. A cloud backup
should not reference a composition data unit that is not referenced by its corresponding cache
backup. Thus, any referencing cloud backup will likely be removed or is being removed.
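The list-based release of boxes 602-609 can be sketched as follows. This is an illustration under assumptions: release_cache_backup and the dictionary layouts are hypothetical, and the sharing check traverses all cache backups (one of the options described for box 605) rather than consulting a fingerprint database.

```python
def release_cache_backup(expired_id, cache_backups, local_units):
    """Delete units referenced only by the expired cache backup.

    `cache_backups` maps a backup identifier to the unit identifiers it
    references; `local_units` maps unit identifiers to locally stored data.
    """
    candidates = set(cache_backups[expired_id])          # box 602: build the list
    for unit in list(candidates):                        # boxes 603/607: visit each unit
        shared = any(                                    # box 605: another cache backup?
            unit in units
            for backup_id, units in cache_backups.items()
            if backup_id != expired_id
        )
        if shared:
            candidates.discard(unit)                     # box 606: keep shared units
    for unit in candidates:                              # box 609: delete what remains
        local_units.pop(unit, None)
    del cache_backups[expired_id]
    return candidates
```

Applied to the example of Fig. 3, releasing the cache backup of data set A would delete A1 and AN but keep S1, because the cache backup of data set B still references S1.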
Fig. 7 is a flowchart of exemplary operations for reclaiming space in a storage layer managed by the
storage shelf manager after the retention period of a cache backup expires. Storage space
reclamation is achieved by a sweeping scan of substantially all data blocks on the low-latency
layer. Fig. 7 reclaims the storage space of data units that have been migrated but still reside on
the managed layer.
At box 701, the storage shelf manager detects expiration of the retention period of a cache backup.
As with box 601, the retention-period-based trigger corresponds to expiration of data retention
period N_P. However, the trigger is not necessarily the expiration of the retention period itself.
The trigger can be the successful migration of the corresponding cloud backup, which is itself
triggered in response to expiration of the retention period.
At box 702, the storage shelf manager generates a list of the composition data units referenced by
the cache backup metadata. As with box 602, the storage shelf manager can populate an array, hash
table, linked list, etc., with references to the composition data units (e.g., logical addresses)
and/or with identifiers of the composition data units (e.g., block numbers).
At box 703, the storage shelf manager begins scanning the storage space of the storage layer that
it manages for data units. Each data unit that the storage shelf manager encounters during the scan
is referred to as the selected data unit. The description of Fig. 7 uses "data unit" rather than
"composition data unit" for the operations during the scan because a discovered data unit may not
be a constituent of any data set.
At box 705, the storage shelf manager determines whether the metadata of another cache backup
references the selected data unit. As with box 605 of Fig. 6, how the storage shelf manager makes
this determination depends on the information maintained for data set backups, such as the specific
implementation of a fingerprint database. If the fingerprint database or associated structure
identifies the objects that reference the data represented by a fingerprint, the storage shelf
manager can use the fingerprint database to determine whether the selected data unit is shared. If
the fingerprint database or associated structure does not identify referencing objects, the storage
shelf manager can traverse all cache backups to determine whether any other cache backup references
the selected data unit. In some embodiments, the storage shelf manager can access the fingerprint
database to determine whether an entry exists for the selected data unit. If no entry exists, or if
an entry exists and its reference counter is set to 1, the storage shelf manager can proceed as if
no other cache backup references the selected data unit. If the reference counter is greater than
1, the storage shelf manager proceeds to determine whether the additional reference is from a cache
backup or from a cloud backup. If another cache backup references the selected data unit, control
flows to box 707. Otherwise, control proceeds to box 709.
At box 707, the storage shelf manager removes the indication of the selected data unit from the
list, if it is indicated in the list. The selected data unit may be referenced by another cache
backup and not by the current cache backup, in which case the selected data unit will not be in the
list. Control flows from box 707 to box 713.
If the storage shelf manager determines at box 705 that the metadata of another cache backup does
not reference the selected data unit, then, at box 709, the storage shelf manager determines 1)
whether the list of composition data units includes the selected data unit and 2) whether cloud
backup metadata also references the selected data unit. The storage shelf manager makes this
determination to identify those selected data units that are referenced both by the metadata of the
cache backup and by the metadata of a cloud backup that still resides in the managed storage layer.
If the list indicates the selected data unit and the metadata of a cloud backup references the
selected data unit, control flows to box 711. Otherwise, control flows to box 713.
At box 711, the storage shelf manager marks the indication of the selected data unit in the list.
The storage shelf manager marks the indication with a data flag (e.g., a bit or a multi-bit value).
The data flag identifies composition data units that are to be retained in the cloud storage layer.
After box 711, flow continues to box 713.
At box 713, the storage shelf manager determines whether the scan of the storage layer managed by
the storage shelf manager is complete. If it is complete, control flows to box 715. Otherwise,
control flows back to box 703.
At box 715, the storage shelf manager migrates the composition data units that are still indicated
in the list and are marked with the data flag. The storage shelf manager migrates the data units
marked in the list to the cloud storage layer. This migration can create objects in the cloud
storage layer. As mentioned in the description of box 709, the composition data units migrated to
the cloud storage layer are referenced by the metadata of a cloud backup. That cloud backup is not
necessarily the cloud backup corresponding to the cache backup. In other words, the scan of the
storage layer ensures migration of the composition data units referenced by the metadata of cloud
backups whose migration to the cloud storage layer has not yet completed. Alternatively, a cloud
backup may not yet be able to migrate, even though the corresponding cache backup has expired,
because the cloud backup metadata references, for example, data plates that are not yet ready to be
migrated to the cloud storage layer. Unmarked composition data units indicated in the list are
composition data units that are referenced by the metadata of the cache backup and for which the
cloud backup has already been migrated to the cloud storage layer.
At box 717, after the migration to the cloud storage layer, the storage shelf manager deletes both
the unmarked and the marked composition data units indicated in the list. That is, once the
migration of the marked composition data units indicated in the list completes successfully, the
storage shelf manager can proceed to delete or expire the composition data units indicated in the
list regardless of marking. The unmarked composition data units are composition data units that
were previously migrated, perhaps as shared data units migrated for another data set backup. Their
removal can therefore be considered garbage collection that, assuming idempotent migration, also
avoids spending resources on migrating them again.
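The scan-based reclamation of boxes 702-717 can be sketched as follows. This is a simplified illustration under assumptions: reclaim_by_scan and its parameters are hypothetical names, the sets other_cache_refs and pending_cloud_refs stand in for the fingerprint database or metadata lookups, and the storage layers are plain dictionaries.

```python
def reclaim_by_scan(expired_units, other_cache_refs, pending_cloud_refs, layer, cloud):
    """Scan `layer`, migrate flagged units to `cloud`, then delete all listed units.

    `other_cache_refs`: units referenced by other cache backups.
    `pending_cloud_refs`: units referenced by cloud backups still on the managed layer.
    """
    listed = {unit: False for unit in expired_units}     # box 702: unit -> data flag
    for unit in list(layer):                             # boxes 703/713: scan every data unit
        if unit in other_cache_refs:                     # box 705: shared with a cache backup
            listed.pop(unit, None)                       # box 707: drop from list if present
        elif unit in listed and unit in pending_cloud_refs:  # box 709
            listed[unit] = True                          # box 711: mark for migration
    for unit, marked in listed.items():                  # box 715: migrate marked units
        if marked:
            cloud[unit] = layer[unit]
    for unit in listed:                                  # box 717: delete marked and unmarked
        layer.pop(unit, None)
    return listed
```

Units that were never in the list and are not referenced by anything are left alone by this sketch, which mirrors the description: only units indicated in the list are deleted at box 717.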
Variation
The foregoing exemplary illustrations refer to managing a data set backup according to two retention
periods with multiple representations of the data set backup. However, embodiments can create
multiple representations of a data set backup to accommodate several retention periods and more
than two storage layers. For example, each storage layer can host a storage shelf manager. The
storage shelf manager at storage layer N hosts representation N of the data set backup. The storage
shelf manager can be notified that the data set backup is subject to retention period N_N and to
retention periods N_{N+1,J1} and N_{N+1,J2}, both of which are greater than N_N. The symbols J1 and
J2 denote migration targets in different jurisdictions. Based on the multiple retention periods, the
storage shelf manager creates representation N_{N+1,J1} and representation N_{N+1,J2}. When N_N
expires, the storage shelf manager migrates representation N_{N+1,J1} to the cloud target in
jurisdiction J1 and migrates representation N_{N+1,J2} to the cloud target in jurisdiction J2.
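The multi-jurisdiction variation can be sketched as a small planning function. The names Representation and plan_migrations, the use of days as the retention unit, and the target names are all assumptions made for this illustration; the description does not prescribe a data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    name: str
    retention_days: int
    jurisdiction: str  # empty string for the local storage layer


def plan_migrations(local, remote_reps, elapsed_days, cloud_targets):
    """Once the local retention period has expired, pair each remaining
    representation with the cloud target of its jurisdiction."""
    if elapsed_days < local.retention_days:
        return []  # N_N has not expired yet; nothing to migrate
    return [(rep.name, cloud_targets[rep.jurisdiction]) for rep in remote_reps]
```

For example, with a local retention period of 30 days, no migration is planned on day 10, and on day 30 each representation is routed to the target of its own jurisdiction.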
In addition, migration is not necessarily to storage layers of progressively lower performance (e.g.,
progressively lower reliability or progressively higher access latency). In some cases, expiration
of the retention period of a data set backup representation can trigger migration to a
higher-performance storage layer. To illustrate, the financial documents of an enterprise can be
migrated to a high-access-latency storage layer for 9 months and then migrated to a
low-access-latency storage layer for the duration of tax season.
In addition, the terminology used herein is flexible to some extent. For example, the exemplary
illustrations refer to representations of a data set backup. The disclosure then decomposes a
representation of a data set backup into the metadata and the data of the data set backup. The
exemplary illustrations then separate the metadata into a backup object and a composition data
mapping. Logically, different representations of a data set backup can be considered different
backups of the data set. Although the different backups can have the same composition data units
(e.g., reference the same data blocks, extents, or plates), the metadata is distinct, which allows
the backups to be manipulated and accessed individually. To illustrate, a backup device can, in
response to a request to back up file EX1, create backup file EX1_Cache and create backup file
EX2_Cloud. The two files EX1_Cache and EX2_Cloud have pointers that resolve to the same composition
data units. The backup files are distinct, but the files initially share the same composition data
units because they back up the same data set.
The flowcharts are provided to aid understanding of the illustrations and are not to be used to
limit the scope of the claims. The flowcharts depict exemplary operations that can vary within the
scope of the claims. Additional operations may be performed; fewer operations may be performed;
operations may be performed in parallel; and operations may be performed in a different order. It
will be understood that each box of the flowchart illustrations and/or block diagrams, and
combinations of boxes in the flowchart illustrations and/or block diagrams, can be implemented by
program code. The program code may be provided to a processor of a general-purpose computer, a
special-purpose computer, or another programmable machine or apparatus.
As will be appreciated, aspects of the disclosure may be embodied as a system, a method, or program
code/instructions stored in one or more machine-readable media. Accordingly, aspects may take the
form of hardware, software (including firmware, resident software, microcode, etc.), or a
combination of software and hardware aspects that may all generally be referred to herein as a
"circuit," "module," or "system." The functionality presented as individual modules/units in the
exemplary illustrations can be organized differently according to any one of platform (operating
system and/or hardware), application ecosystem, interfaces, programmer preferences, programming
language, administrator preferences, etc.
Any combination of one or more machine-readable media may be utilized. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable storage medium may be, for example, but not limited to, a system, apparatus, or device that employs any one of or a combination of electronic, magnetic, optical, electromagnetic, infrared, or semiconductor technology to store program code. More specific examples (a non-exhaustive list) of the machine-readable storage medium include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a machine-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A machine-readable storage medium is not a machine-readable signal medium.
Program code embodied on a machine-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as the [...] programming language, C++, or the like; a dynamic programming language such as Python; a scripting language such as the Perl programming language or the PowerShell scripting language; and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on a stand-alone machine, may execute in a distributed manner across multiple machines, and may execute on one machine while providing results and/or accepting input on another machine.
The program code/instructions may also be stored in a machine-readable medium that can direct a machine to function in a particular manner, such that the instructions stored in the machine-readable medium produce an article of manufacture including instructions that implement the function/act specified in the flowchart and/or block diagram block or blocks.
Fig. 8 depicts an example storage system with a storage tier manager that generates multiple representations of a dataset backup based on multiple retention periods. The storage system includes a processor unit 801 (possibly including multiple processors, multiple cores, multiple nodes, and/or implementing multithreading, etc.). The storage system includes memory 807. The memory 807 may be system memory (for example, one or more of cache, SRAM, DRAM, zero-capacitor RAM, twin-transistor RAM, eDRAM, EDO RAM, DDR RAM, EEPROM, NRAM, RRAM, SONOS, PRAM, etc.) or any one or more of the possible realizations of machine-readable media already described above. The storage system also includes a bus 803 (for example, PCI, ISA, PCI-Express, [...] bus, [...] bus, NuBus, etc.) and a network interface 805 (for example, a Fibre Channel interface, an Ethernet interface, an Internet Small Computer System Interface, a SONET interface, a wireless interface, etc.). The system also includes a storage tier manager 811 and a set of storage media 815.
The storage tier manager 811 creates multiple representations of a dataset backup and allows lifecycle management of each of the representations based on the corresponding retention period and storage tier. When the storage system is a cache storage tier or a low-access-latency storage tier, the storage tier manager 811 creates the multiple representations and can store each of the representations in mutually exclusive storage media of the storage media 815, or at least in mutually exclusive logical containers within the set of storage media 815. The storage media 815 may be a disk array, a flash storage array, a hybrid array of flash storage and disk devices, etc. When removing constituent data units from a storage tier, the storage tier manager 811 preserves constituent data units that, because of deduplication, are shared by other representations of other dataset backups in the managed storage tier. The storage tier manager 811 can use information created by deduplication program code to determine which constituent data units are shared by backup representations of the managed storage tier and which are not.
Any one of the previously described functionalities may be partially (or entirely) implemented in hardware and/or on the processor unit 801. For example, the functionality may be implemented with an application-specific integrated circuit, in logic implemented in the processor unit 801, in a co-processor on a peripheral device or card, etc. Further, realizations may include fewer components or additional components not illustrated in Fig. 8 (for example, video cards, audio cards, additional network interfaces, peripheral devices, etc.). The processor unit 801 and the network interface 805 are coupled to the bus 803. Although illustrated as being coupled to the bus 803, the memory 807 may be coupled to the processor unit 801.
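As a non-limiting sketch, the way a storage tier manager might use deduplication information to preserve shared data units while reclaiming the rest could look like the following Python fragment. The form of the deduplication information (a map from a data unit identifier to the names of the representations that reference it), the function names, and the in-memory `tier_store` dictionary are all assumptions made for this sketch; the disclosure does not specify them.

```python
from typing import Dict, List, Set, Tuple


def split_shared(expired_name: str, unit_ids: List[str],
                 ref_map: Dict[str, Set[str]]) -> Tuple[List[str], List[str]]:
    """Split the expired representation's units into (shared, unshared)."""
    shared, unshared = [], []
    for unit_id in unit_ids:
        # Any reference held by a different representation makes the unit shared.
        other_refs = ref_map.get(unit_id, set()) - {expired_name}
        (shared if other_refs else unshared).append(unit_id)
    return shared, unshared


def remove_representation(expired_name: str, unit_ids: List[str],
                          ref_map: Dict[str, Set[str]],
                          tier_store: Dict[str, bytes]) -> List[str]:
    """Reclaim only the units that no other representation in the tier references."""
    shared, unshared = split_shared(expired_name, unit_ids, ref_map)
    for unit_id in unshared:
        tier_store.pop(unit_id, None)   # space for this unit is reclaimed
        ref_map.pop(unit_id, None)
    for unit_id in shared:
        ref_map[unit_id].discard(expired_name)  # unit is preserved for the others
    return unshared
```

For example, if a representation named `B1` references units `a` and `b`, and another representation `B2` also references `b`, removing `B1` would reclaim only `a` and leave `b` in the tier.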
Plural instances may be provided for components, operations, or structures described herein as a single instance. Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the disclosure. In general, structures and functionality presented as separate components in the example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the disclosure.
Terminology
For efficiency and ease of explanation, this description uses shorthand terms related to cloud technology. When referring to "a cloud," this description is referring to the resources of a cloud service provider. For instance, a cloud can encompass the servers, virtual machines, and storage devices of a cloud service provider. The terms "cloud storage device" and "cloud storage tier" refer to a logical collection of "cloud targets." The term "cloud target" refers to an entity that has a network address that can be used as an endpoint for a network connection. The entity may be a physical device (for example, a server) or may be a virtual entity (for example, a virtual server or virtual storage device). More generally, a cloud service provider resource accessible to customers is a resource owned/managed by the cloud service provider entity that is accessible via network connections. Often, the access is in accordance with an application programming interface or software development kit provided by the cloud service provider.
Use of the phrase "at least one of" preceding a list with the conjunction "and" should not be treated as an exclusive list and should not be construed as a list of categories with one item from each category, unless specifically stated otherwise. A clause that recites "at least one of A, B, and C" can be infringed with only one of the listed items, multiple of the listed items, and one or more of the items in the list and another item not listed.
Claims (12)
1. A method comprising:
detecting a trigger related to expiration of a first retention period, the first retention period being associated with a first backup of a dataset, wherein the first backup is a backup of a first type; and
after detecting the trigger,
identifying a plurality of constituent data units of the dataset in a first storage tier;
determining a set of one or more of the plurality of constituent data units that are not shared with another backup of the first type of a different dataset;
migrating the plurality of constituent data units from the first storage tier to a second storage tier; and
removing the set of one or more constituent data units from the first storage tier.
2. The method of claim 1, wherein the migrating comprises:
creating an object in the second storage tier for each of the plurality of constituent data units.
3. The method of any one of claims 1 to 2, wherein determining the set of one or more constituent data units that are not shared comprises:
scanning the first storage tier to determine references to the plurality of constituent data units;
determining the backups that correspond to the references; and
determining the types of the backups that correspond to the references.
4. The method of any one of claims 1 to 2, wherein determining the set of one or more constituent data units that are not shared with another backup of the first type comprises:
identifying a plurality of backups in the first storage tier that share at least one of the plurality of constituent data units; and
for each of the plurality of backups, determining whether the backup is of the first type.
5. The method of any one of claims 1 to 3, or of claims 1 to 2 and 4,
wherein identifying the plurality of constituent data units comprises creating a data structure that includes identifiers of the plurality of constituent data units;
wherein determining the set of one or more constituent data units that are not shared with another backup of the first type comprises:
determining those of the constituent data units that are shared with another backup of the first type; and
removing, from the data structure, the identifiers that identify those of the constituent data units that are shared with another backup of the first type.
6. The method of claim 5, wherein determining the set of one or more constituent data units that are not shared with another backup of the first type further comprises:
determining those of the plurality of constituent data units that are shared with another backup of a second type; and
marking those of the identifiers that identify those of the plurality of constituent data units that are shared with another backup of the second type,
wherein migrating the set of constituent data units comprises migrating each constituent data unit still identified in the data structure after removing the identifiers of those of the constituent data units that are shared with another backup of the first type.
7. The method of claim 5, wherein the first type corresponds to the first retention period and a first access latency, and the second type corresponds to a second retention period that is longer than the first retention period and to a second access latency that is higher than the first access latency.
8. The method of any preceding claim, further comprising modifying a second backup of the dataset to indicate the migrated plurality of constituent data units in the second storage tier instead of the plurality of constituent data units in the first storage tier, wherein the second backup is a backup of a second type.
9. The method of claim 8, further comprising migrating the modified second backup to the second storage tier after detecting the trigger.
10. A machine-readable medium having program code executable by a processor to perform the method of claim 1.
11. A deduplication storage system comprising:
a processor unit; and
a machine-readable medium comprising program code executable by the processor unit to cause the deduplication storage system to:
create, in a first storage tier, a plurality of representations of a dataset backup for a plurality of retention periods associated with the dataset backup,
wherein each of the plurality of representations comprises metadata about the dataset backup and references to a plurality of data units that constitute the dataset backup,
wherein a first representation of the plurality of representations corresponds to the first storage tier and to a first retention period of the plurality of retention periods,
wherein the first storage tier corresponds to the deduplication storage system;
copy the plurality of data units and a second representation of the plurality of representations to a second storage tier, wherein the second storage tier corresponds to a second retention period of the plurality of retention periods; and
after expiration of the first retention period, reclaim the storage space occupied by those of the plurality of data units that are not referenced by a representation of another dataset backup in the first storage tier.
12. The deduplication storage system of claim 11, wherein the program code to copy comprises program code executable by the processor unit to cause the deduplication storage system to:
create, for the plurality of data units, a first set of one or more objects in the second storage tier, wherein the second storage tier implements object storage; and
create, in the second storage tier, an object for the second representation with references to the first set of one or more objects instead of references to the plurality of data units in the first storage tier.
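Purely as an illustrative sketch, and not as the claimed implementation, the sequence of operations recited in claims 1 and 2 could be expressed in Python as follows. The function name `on_retention_expired`, the dictionary-based tiers, and the `type`/`units` keys are hypothetical names introduced for this sketch; a real second tier would typically be an object store reached over a network rather than an in-memory dictionary.

```python
from typing import Dict, List


def on_retention_expired(expired: Dict, other_backups: List[Dict],
                         tier1: Dict[str, bytes], tier2: Dict[str, bytes]) -> List[str]:
    """Handle expiry of a backup's retention period in the first tier.

    expired / other_backups entries: {"type": str, "units": [unit ids]} (assumed shape).
    Returns the ids of the units removed from the first tier.
    """
    # Identify the data units of the expired backup (ordered, without duplicates).
    worklist = list(dict.fromkeys(expired["units"]))

    # Determine the units not shared with another backup of the same (first) type.
    shared_same_type = set()
    for backup in other_backups:
        if backup["type"] == expired["type"]:
            shared_same_type.update(backup["units"])
    unshared = [u for u in worklist if u not in shared_same_type]

    # Migrate every identified unit: one object per unit in the second tier.
    for unit_id in worklist:
        tier2[unit_id] = tier1[unit_id]

    # Remove only the unshared units from the first tier.
    for unit_id in unshared:
        del tier1[unit_id]
    return unshared
```

In this sketch, a unit that is still referenced by another backup of the same type stays in the first tier, while a copy of every unit of the expired backup becomes available in the second tier.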
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/081,546 | 2016-03-25 | ||
US15/081,546 US10620834B2 (en) | 2016-03-25 | 2016-03-25 | Managing storage space based on multiple dataset backup versions |
PCT/US2017/024156 WO2017165857A1 (en) | 2016-03-25 | 2017-03-24 | Multiple dataset backup versions across multi-tiered storage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109154905A true CN109154905A (en) | 2019-01-04 |
CN109154905B CN109154905B (en) | 2022-03-25 |
Family
ID=58548880
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201780031635.XA Active CN109154905B (en) | 2016-03-25 | 2017-03-24 | Multiple data set backup versions across multiple tiers of storage |
Country Status (4)
Country | Link |
---|---|
US (1) | US10620834B2 (en) |
EP (1) | EP3433739B1 (en) |
CN (1) | CN109154905B (en) |
WO (1) | WO2017165857A1 (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107870728B (en) * | 2016-09-23 | 2021-02-09 | 伊姆西Ip控股有限责任公司 | Method and apparatus for moving data |
US11169960B2 (en) * | 2017-06-29 | 2021-11-09 | Ashish Govind Khurange | Data transfer appliance method and system |
US10721304B2 (en) | 2017-09-14 | 2020-07-21 | International Business Machines Corporation | Storage system using cloud storage as a rank |
US10372363B2 (en) | 2017-09-14 | 2019-08-06 | International Business Machines Corporation | Thin provisioning using cloud based ranks |
US10581969B2 (en) * | 2017-09-14 | 2020-03-03 | International Business Machines Corporation | Storage system using cloud based ranks as replica storage |
US10817204B1 (en) * | 2017-10-11 | 2020-10-27 | EMC IP Holding Company LLC | Migration of versioned data between storage devices |
US10936238B2 (en) * | 2017-11-28 | 2021-03-02 | Pure Storage, Inc. | Hybrid data tiering |
US10990282B1 (en) | 2017-11-28 | 2021-04-27 | Pure Storage, Inc. | Hybrid data tiering with cloud storage |
US11436344B1 (en) | 2018-04-24 | 2022-09-06 | Pure Storage, Inc. | Secure encryption in deduplication cluster |
US11392553B1 (en) | 2018-04-24 | 2022-07-19 | Pure Storage, Inc. | Remote data management |
WO2019209392A1 (en) * | 2018-04-24 | 2019-10-31 | Pure Storage, Inc. | Hybrid data tiering |
US11106378B2 (en) | 2018-11-21 | 2021-08-31 | At&T Intellectual Property I, L.P. | Record information management based on self describing attributes |
US11853575B1 (en) | 2019-06-08 | 2023-12-26 | Veritas Technologies Llc | Method and system for data consistency across failure and recovery of infrastructure |
US11593215B2 (en) * | 2020-02-05 | 2023-02-28 | EMC IP Holding Company LLC | Method and system for generating immutable backups with configurable retention spans |
US11593017B1 (en) * | 2020-08-26 | 2023-02-28 | Pure Storage, Inc. | Protection of objects in an object store from deletion or overwriting |
US11436103B2 (en) * | 2020-10-13 | 2022-09-06 | EMC IP Holding Company LLC | Replication for cyber recovery for multiple tier data |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101499924A (en) * | 2008-01-31 | 2009-08-05 | 杭州美创科技有限公司 | On-line switchover method for computer production system |
US20120095968A1 (en) * | 2010-10-17 | 2012-04-19 | Stephen Gold | Storage tiers for different backup types |
US20120117029A1 (en) * | 2010-11-08 | 2012-05-10 | Stephen Gold | Backup policies for using different storage tiers |
CN102917072A (en) * | 2012-10-31 | 2013-02-06 | 北京奇虎科技有限公司 | Device, system and method for carrying out data migration between data server clusters |
CN102982085A (en) * | 2012-10-31 | 2013-03-20 | 北京奇虎科技有限公司 | System and method of data migration |
CN103544075A (en) * | 2011-12-31 | 2014-01-29 | 华为数字技术(成都)有限公司 | Data processing method and system |
US20150261792A1 (en) * | 2014-03-17 | 2015-09-17 | Commvault Systems, Inc. | Maintaining a deduplication database |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5875481A (en) * | 1997-01-30 | 1999-02-23 | International Business Machines Corporation | Dynamic reconfiguration of data storage devices to balance recycle throughput |
US6088694A (en) | 1998-03-31 | 2000-07-11 | International Business Machines Corporation | Continuous availability and efficient backup for externally referenced objects |
US9075851B2 (en) | 2003-12-09 | 2015-07-07 | Emc Corporation | Method and apparatus for data retention in a storage system |
US7536424B2 (en) * | 2004-05-02 | 2009-05-19 | Yoram Barzilai | System and methods for efficiently managing incremental data backup revisions |
JP4377790B2 (en) * | 2004-09-30 | 2009-12-02 | 株式会社日立製作所 | Remote copy system and remote copy method |
US8527468B1 (en) | 2005-02-08 | 2013-09-03 | Renew Data Corp. | System and method for management of retention periods for content in a computing system |
US8825971B1 (en) * | 2007-12-31 | 2014-09-02 | Emc Corporation | Age-out selection in hash caches |
US8484162B2 (en) * | 2008-06-24 | 2013-07-09 | Commvault Systems, Inc. | De-duplication systems and methods for application-specific data |
US20100199036A1 (en) * | 2009-02-02 | 2010-08-05 | Atrato, Inc. | Systems and methods for block-level management of tiered storage |
US20100274772A1 (en) | 2009-04-23 | 2010-10-28 | Allen Samuels | Compressed data objects referenced via address references and compression references |
US20100293147A1 (en) | 2009-05-12 | 2010-11-18 | Harvey Snow | System and method for providing automated electronic information backup, storage and recovery |
US8554735B1 (en) * | 2009-05-27 | 2013-10-08 | MiMedia LLC | Systems and methods for data upload and download |
US8356017B2 (en) * | 2009-08-11 | 2013-01-15 | International Business Machines Corporation | Replication of deduplicated data |
US8850142B2 (en) | 2009-09-15 | 2014-09-30 | Hewlett-Packard Development Company, L.P. | Enhanced virtual storage replication |
US8694469B2 (en) * | 2009-12-28 | 2014-04-08 | Riverbed Technology, Inc. | Cloud synthetic backups |
US8799413B2 (en) * | 2010-05-03 | 2014-08-05 | Panzura, Inc. | Distributing data for a distributed filesystem across multiple cloud storage systems |
US8473886B2 (en) | 2010-09-10 | 2013-06-25 | Synopsys, Inc. | Parallel parasitic processing in static timing analysis |
US9128948B1 (en) * | 2010-09-15 | 2015-09-08 | Symantec Corporation | Integration of deduplicating backup server with cloud storage |
US8909845B1 (en) * | 2010-11-15 | 2014-12-09 | Symantec Corporation | Systems and methods for identifying candidate duplicate memory pages in a virtual environment |
US8886901B1 (en) * | 2010-12-31 | 2014-11-11 | Emc Corporation | Policy based storage tiering |
US9715434B1 (en) * | 2011-09-30 | 2017-07-25 | EMC IP Holding Company LLC | System and method for estimating storage space needed to store data migrated from a source storage to a target storage |
US9262449B2 (en) | 2012-03-08 | 2016-02-16 | Commvault Systems, Inc. | Automated, tiered data retention |
US9116851B2 (en) | 2012-12-28 | 2015-08-25 | Futurewei Technologies, Inc. | System and method for virtual tape library over S3 |
US9563517B1 (en) * | 2013-12-30 | 2017-02-07 | EMC IP Holding Company LLC | Cloud snapshots |
US11315197B2 (en) | 2014-03-13 | 2022-04-26 | Fannie Mae | Dynamic display of representative property information with interactive access to source data |
US10380072B2 (en) * | 2014-03-17 | 2019-08-13 | Commvault Systems, Inc. | Managing deletions from a deduplication database |
US10089185B2 (en) | 2014-09-16 | 2018-10-02 | Actifio, Inc. | Multi-threaded smart copy |
US11783898B2 (en) | 2014-09-18 | 2023-10-10 | Jonker Llc | Ephemeral storage elements, circuits, and systems |
US9659047B2 (en) * | 2014-12-03 | 2017-05-23 | Netapp, Inc. | Data deduplication utilizing extent ID database |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11310137B2 (en) | 2017-02-05 | 2022-04-19 | Veritas Technologies Llc | System and method to propagate information across a connected set of entities irrespective of the specific entity type |
US11429640B2 (en) | 2020-02-28 | 2022-08-30 | Veritas Technologies Llc | Methods and systems for data resynchronization in a replication environment |
US11531604B2 (en) | 2020-02-28 | 2022-12-20 | Veritas Technologies Llc | Methods and systems for data resynchronization in a replication environment |
US11847139B1 (en) | 2020-02-28 | 2023-12-19 | Veritas Technologies Llc | Methods and systems for data resynchronization in a replication environment |
US11928030B2 (en) * | 2020-03-31 | 2024-03-12 | Veritas Technologies Llc | Optimize backup from universal share |
Also Published As
Publication number | Publication date |
---|---|
EP3433739B1 (en) | 2020-02-05 |
US20170277435A1 (en) | 2017-09-28 |
WO2017165857A1 (en) | 2017-09-28 |
CN109154905B (en) | 2022-03-25 |
EP3433739A1 (en) | 2019-01-30 |
US10620834B2 (en) | 2020-04-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: California, USA; Applicant after: NetApp, Inc.; Address before: California, USA; Applicant before: Network Area Storage Technology Co., Ltd. |
GR01 | Patent grant | ||