CN109697021A - Data processing method and device for disk snapshots - Google Patents

Data processing method and device for disk snapshots

Info

Publication number
CN109697021A
Authority
CN
China
Prior art keywords
data block
block identifier
disk
snapshot
disk snapshot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710994265.1A
Other languages
Chinese (zh)
Inventor
廖武钧
鲁振伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201710994265.1A priority Critical patent/CN109697021A/en
Priority to PCT/CN2018/109933 priority patent/WO2019080717A1/en
Publication of CN109697021A publication Critical patent/CN109697021A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • G06F3/0641De-duplication techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0619Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device
    • G06F3/0676Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed herein are a data processing method and device for disk snapshots. The method comprises: determining, according to the data block identifiers of the data blocks corresponding to disk snapshots, whether duplicate data block identifiers exist; if duplicate data block identifiers exist, deleting the duplicates to obtain deduplicated data block identifiers; and performing at least one of the following according to the deduplicated data block identifiers: checking the data integrity of the disk snapshots, or migrating the disk snapshots across storage domains.

Description

Data processing method and device for disk snapshots
Technical field
The present invention relates to data processing technology, and in particular to a data processing method and device for disk snapshots.
Background
A disk snapshot is a complete record of the content stored on a disk at a point in time. A disk can be restored from a disk snapshot to the data content recorded by that snapshot, that is, to its state at the time the snapshot was generated. The disk snapshots created for a disk at different points in time form a snapshot chain. Disk snapshots are mainly used for backup and disaster recovery. If disk data needs to be restored, the disk can be rolled back according to the snapshot chain to the data content recorded by any disk snapshot on the chain. When disk snapshots are used for disk data recovery and migration, how to improve the efficiency of processing disk snapshots is a problem to be solved.
Summary of the invention
One aspect of the present application provides a method comprising: determining, according to the data block identifiers of the data blocks corresponding to disk snapshots, whether duplicate data block identifiers exist; if duplicate data block identifiers exist, deleting the duplicates to obtain deduplicated data block identifiers; and performing at least one of the following according to the deduplicated data block identifiers: checking the data integrity of the disk snapshots, or migrating the disk snapshots across storage domains.
Brief description of the drawings
The drawings are presented by way of example and not limitation. For simplicity and clarity of illustration, the elements shown in the drawings are not necessarily drawn to scale. For example, the dimensions of some elements may be exaggerated relative to other elements for clarity. In addition, where considered appropriate, reference numerals are repeated among the drawings to indicate corresponding or similar elements.
Fig. 1 is a schematic diagram of an implementation of the present application;
Fig. 2 is a schematic diagram of disk snapshot deduplication;
Fig. 3 is a schematic diagram of a first embodiment of the present application;
Fig. 4 is a schematic diagram of a second embodiment of the present application;
Fig. 5 is a schematic diagram of a third embodiment of the present application;
Fig. 6 is a schematic diagram of a fourth embodiment of the present application;
Fig. 7 is a schematic diagram of a fifth embodiment of the present application;
Fig. 8 is a schematic diagram of a sixth embodiment of the present application;
Fig. 9 is an example diagram of a system provided by the present application.
Detailed description
The embodiments of the present application are described in detail below with reference to the drawings. It should be understood that the embodiments described below serve only to illustrate and explain the present application and are not intended to limit it.
Although the present application is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are described in detail herein. It should be understood that the scope of the present application is not limited to the disclosed embodiments; on the contrary, it is intended to cover all modifications, equivalents, and alternatives consistent with the spirit of the application and the claims.
In the description, references to "one embodiment", "an embodiment", "an example embodiment", and the like indicate that the described embodiment may include a particular feature, structure, or characteristic, but not every embodiment necessarily includes that feature, structure, or characteristic. In addition, when a particular feature, structure, or characteristic is described in connection with one embodiment, it is submitted that it is within the knowledge of those skilled in the art to implement that feature, structure, or characteristic in connection with other embodiments, whether or not they are described in detail. Further, "at least one of A, B, and C" means (A), (B), (C), (A and B), (A and C), (B and C), or (A, B, and C). Similarly, "at least one of A, B, or C" means (A), (B), (C), (A and B), (A and C), (B and C), or (A, B, and C). "A/B" means "A or B". "A and/or B" means (A), (B), or (A and B).
The embodiments of the present application may be implemented in hardware, firmware, software, or a combination thereof. They may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable media (for example, computer-readable media), which may be read or executed by one or more processors. A machine-readable medium may be implemented as any storage device, mechanism, or other physical structure for storing or transmitting information in a machine-readable form (for example, volatile or non-volatile memory, a media disc, or another media device).
Computer-readable media include permanent and non-permanent, removable and non-removable storage media. A storage medium may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
In the drawings, some structural or method features may be shown in specific arrangements and/or orders. However, it should be appreciated that such specific arrangements and/or orders may not be required. Rather, in some embodiments, these features may be arranged in a manner and/or order different from that shown in the illustrative drawings. In addition, the inclusion of a structural or method feature in a particular figure does not imply that the feature is required in all embodiments; in some embodiments it may be omitted or combined with other features.
Fig. 1 is a schematic diagram of an implementation of the present application. As shown in Fig. 1, a storage system may include at least one client computing device (for example, client computing devices 10a to 10n), at least one disk (for example, disks 14a to 14n), and at least one server connected to the disks (for example, servers 12a to 12n). Each server may be connected to one or more disks. A client computing device may read data from and write data to a disk through a server. The present application does not limit the type of client computing device, which may include a desktop computer or various portable computers or electronic devices, such as a PC, a notebook computer, a smartphone, or another electronic device.
As shown in Fig. 1, server 12a may include a processor 120 and a system memory 122, which may be connected by a system bus. The system bus may include at least one of the following types of bus structure: a memory bus or memory controller, a peripheral bus, or a local bus using any of various bus architectures. System memory 122 may include volatile memory (for example, RAM), non-volatile memory (for example, ROM), flash memory, or a combination thereof. System memory 122 may include an operating system 124 and a snapshot module 126. Operating system 124 controls the operation of server 12a, for example by cooperating with other operating systems or applications. Snapshot module 126 may include a deduplication unit 1264, a snapshot integrity check unit 1266, and a snapshot migration unit 1268. Processor 120 may carry out various snapshot-related operations and tasks by executing snapshot module 126 stored in system memory 122. For example, in response to a snapshot integrity check request from client computing device 10a, processor 120 may check the snapshot integrity of disk 14a by executing deduplication unit 1264 and snapshot integrity check unit 1266; alternatively, in response to a snapshot migration request, processor 120 may migrate the snapshots of disk 14a to disk 14n by executing deduplication unit 1264 and snapshot migration unit 1268.
A snapshot integrity check examines whether every data block included in a snapshot is readable. The storage space of a disk can be divided into multiple sections by address offset, for example one section per 2 MB, and the data of each section can be stored as one data block (also called a slice or fragment) of a disk snapshot. When disk data is restored after a disk failure or power loss, the data integrity of the disk snapshot needs to be checked to ensure that the data is safe and reliable; that is, it is checked whether all data blocks in the disk snapshot can be read normally. If any data block cannot be read normally, the data of the disk snapshot is incomplete.
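The division of disk space into fixed-size sections can be sketched as follows. This is a minimal illustration that assumes the 2 MB section size from the example above; the names BLOCK_SIZE, block_index, and block_range are hypothetical and do not come from the application.

```python
BLOCK_SIZE = 2 * 1024 * 1024  # 2 MB per section, taken from the example in the text


def block_index(offset: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the index of the section (data block) that contains a byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return offset // block_size


def block_range(index: int, block_size: int = BLOCK_SIZE) -> tuple:
    """Return the half-open byte range [start, end) covered by a section."""
    start = index * block_size
    return start, start + block_size
```

With this layout, every byte offset on the disk maps to exactly one section, so a snapshot of the disk can be described as an ordered list of data blocks, one per section.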
Snapshot migration refers to migrating disk snapshots across storage domains. A storage domain is a storage area with independent access permissions, for example a machine room or a computer cluster. A user in a storage domain (such as a computer or virtual machine) can directly access the storage resources in that domain (such as cloud disks and disk snapshots) but cannot access the storage resources of other storage domains. When a user in one storage domain needs data that resides in another storage domain, the data must first be migrated into the user's storage domain, after which the user can read the migrated data there. In addition, other situations, such as decommissioning a machine room or consolidating storage resources, can also give rise to data migration across storage domains.
It should be noted that in other implementations snapshot module 126 may also be integrated into operating system 124. When there are multiple servers, the structure of each server may follow that of server 12a and is therefore not described again here.
It should be noted that the present application can also be applied to cloud servers. One or more disk instances may be created on a cloud server and read and written like a computer disk, while the actual data is stored on one or more physical disks in the back end.
In this embodiment, because some data on a disk changes only after a long time, the data content recorded by different disk snapshots in a snapshot chain often differs only slightly. To save storage space, the data blocks of disk snapshots can be stored in a deduplicated manner. That is, when a disk snapshot is created for a disk at a point in time, each section of the disk is checked. Take the disk snapshot created at the current point in time as the new disk snapshot and the disk snapshot created at the previous point in time as the old disk snapshot. If the data in a section at the current point in time has changed compared with the data of the corresponding section in the old disk snapshot, a new data block is created from the latest data in that section, and the new disk snapshot uses the newly created data block; otherwise, the new disk snapshot reuses the data block of the corresponding section in the old disk snapshot.
As shown in Fig. 2, assume that a disk is divided into four sections by address offset. When disk snapshot A is created, it corresponds to four data blocks. When disk snapshot B is created, assume that only the data in the disk sections corresponding to data block 1-A and data block 3-A has been modified; then only data block 1-B and data block 3-B are newly created, and disk snapshot B reuses data block 2-A and data block 4-A of disk snapshot A. To save storage space, only one copy of a duplicated data block is stored. In Fig. 2, for example, although disk snapshots A and B both correspond to data blocks 2-A and 4-A, only one data block 2-A and one data block 4-A are stored. In addition, each disk snapshot has corresponding metadata, which includes the data block identifier list of the disk snapshot. The list records the identifiers (for example, data block names) of the data blocks the disk snapshot uses, so that the data blocks indicated by the identifiers can be read. For example, the metadata of disk snapshot A records the names data block 1-A, data block 2-A, data block 3-A, and data block 4-A; the metadata of disk snapshot B records the names data block 1-B, data block 2-A, data block 3-B, and data block 4-A.
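The Fig. 2 scenario can be reproduced with a small sketch. It assumes an in-memory dict as the block store and detects changes by comparing section contents with the old block; both choices, and the names create_snapshot, store, snap_a, and snap_b, are illustrative assumptions rather than the mechanism described in the application.

```python
def create_snapshot(tag, sections, old_ids, store):
    """Build the data block identifier list for a new snapshot.

    tag      -- label of the new snapshot, e.g. "A" or "B"
    sections -- current content of each disk section, in address order
    old_ids  -- identifier list of the previous snapshot, or None for the first
    store    -- dict mapping block identifier -> block content (hypothetical store)
    """
    ids = []
    for number, data in enumerate(sections, start=1):
        old_id = old_ids[number - 1] if old_ids is not None else None
        if old_id is not None and store[old_id] == data:
            ids.append(old_id)          # unchanged section: reuse the old block
        else:
            new_id = f"{number}-{tag}"  # changed section: create a new block
            store[new_id] = data
            ids.append(new_id)
    return ids


store = {}
snap_a = create_snapshot("A", [b"a1", b"a2", b"a3", b"a4"], None, store)
# Sections 1 and 3 are modified before snapshot B is taken.
snap_b = create_snapshot("B", [b"b1", b"a2", b"b3", b"a4"], snap_a, store)
```

Here snap_a lists blocks 1-A to 4-A, snap_b lists 1-B, 2-A, 3-B, and 4-A, and the store holds six blocks rather than eight because 2-A and 4-A are shared.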
In this embodiment, deduplication unit 1264 may be configured to determine, according to the data block identifiers of the data blocks corresponding to disk snapshots, whether duplicate data block identifiers exist and, if so, to delete the duplicates to obtain deduplicated data block identifiers. Snapshot integrity check unit 1266 may be configured to check the data integrity of the disk snapshots according to the deduplicated data block identifiers. Snapshot migration unit 1268 may be configured to migrate the disk snapshots across storage domains according to the deduplicated data block identifiers.
In an exemplary embodiment, deduplication unit 1264 may determine whether duplicate data block identifiers exist as follows: according to the data block identifiers of the data blocks corresponding to the disk snapshots in a disk snapshot chain, determine whether the data blocks corresponding to different disk snapshots in the chain have duplicate data block identifiers.
In an exemplary embodiment, the data block identifiers of the data blocks corresponding to any disk snapshot are stored in a data block identifier list.
Deduplication unit 1264 may obtain the deduplicated data block identifiers as follows:
Create a data block identifier set that is initially empty; traverse the data block identifier lists of multiple disk snapshots, adding to the set each data block identifier that the set does not already contain; and, from the set obtained after the traversal, determine the deduplicated data block identifiers corresponding to the multiple disk snapshots.
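The set-based deduplication just described can be sketched in a few lines. The function name dedupe_block_ids is hypothetical, and keeping first-seen order alongside the set is a convenience of this sketch, not something the application requires.

```python
def dedupe_block_ids(id_lists):
    """Merge several data block identifier lists into one list without duplicates."""
    seen = set()   # the data block identifier set, initially empty
    ordered = []   # the same identifiers, kept in first-seen order
    for id_list in id_lists:
        for block_id in id_list:
            if block_id not in seen:  # only add identifiers the set lacks
                seen.add(block_id)
                ordered.append(block_id)
    return ordered


unique_ids = dedupe_block_ids([
    ["1-A", "2-A", "3-A", "4-A"],  # snapshot A
    ["1-B", "2-A", "3-B", "4-A"],  # snapshot B
])
```

For the two snapshots of Fig. 2, the eight listed identifiers reduce to six distinct ones.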
In an exemplary embodiment, snapshot integrity check unit 1266 may check the data integrity of disk snapshots according to the deduplicated data block identifiers as follows:
Read the data blocks indicated by the deduplicated data block identifiers. If the data block indicated by every deduplicated data block identifier is read successfully, determine that the data of the disk snapshots is complete. If reading the data block indicated by at least one deduplicated data block identifier fails, determine that the data of each disk snapshot that uses that data block is incomplete.
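A minimal sketch of this check follows. It assumes a caller-supplied read_block function that raises OSError when a block cannot be read; that convention, the in-memory store, and all names here are illustrative assumptions.

```python
def check_integrity(unique_ids, read_block):
    """Read every deduplicated block once and return the identifiers that failed."""
    unreadable = []
    for block_id in unique_ids:
        try:
            read_block(block_id)
        except OSError:
            unreadable.append(block_id)
    return unreadable


# In-memory stand-in for physical storage (hypothetical); block 3-A is missing.
store = {"1-A": b"x", "2-A": b"y"}


def read_from_store(block_id):
    if block_id not in store:
        raise OSError(f"cannot read {block_id}")
    return store[block_id]


unreadable = check_integrity(["1-A", "2-A", "3-A"], read_from_store)
```

An empty result means the snapshots are complete; any identifier in the result marks the snapshots that use it as incomplete.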
In an exemplary embodiment, snapshot migration unit 1268 may further be configured to determine the migration order of multiple disk snapshots. Snapshot migration unit 1268 may migrate disk snapshots across storage domains according to the deduplicated data block identifiers as follows: according to the determined migration order, copy the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain; and copy the metadata of the disk snapshots from the source storage domain to the destination storage domain, so that the disk snapshots can be rebuilt in the destination storage domain from their metadata and corresponding data blocks.
Each deduplicated data block identifier may correspond to a migration label.
Snapshot migration unit 1268 may copy the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain according to the migration order as follows: traverse the deduplicated data block identifiers according to the migration order; for each data block identifier, if its migration label indicates "not migrated", copy the data block indicated by the identifier from the source storage domain to the destination storage domain and update the label to indicate "migrated"; if its migration label indicates "migrated", do not copy the data block indicated by the identifier.
Fig. 3 is the schematic diagram of the first embodiment of the application.As shown in figure 3, in square 301, according to disk snapshot The data block identifier of corresponding data block judges whether there is duplicate data block identifier;If there is duplicate data block identifier, delete Except the duplicate data block identifier of institute, to obtain the data block identifier after duplicate removal;In square 302, according to the data block after duplicate removal Mark carries out at least one of following processing: examining the data integrity of disk snapshot, across storage domain migration disk snapshot.
Wherein, the data block identifier of the corresponding data block of disk snapshot can be read from the metadata of disk snapshot.Often It may include a data block identifier list, all numbers used including the disk snapshot in the metadata of a disk snapshot According to the mark of block.Data block identifier can be data block title or number (ID), and the application does not limit this.According to data Block identification can read the data block being stored on physical disk.
Wherein, in square 301, the data block identifier set for being initially empty set can be created;Traverse multiple magnetic of disk The data block identifier list of disk snapshot (such as a disk snapshot chain), the data block identifier having in data block identifier set It is added in data block identifier set, to obtain the data block identifier after duplicate removal.So, it is ensured that in data block identifier set not Including duplicate data block identifier, i.e., the number of every kind of data block identifier is one in data block identifier set.
The present application is illustrated below through multiple embodiments with reference to Fig. 4 to Fig. 8.
As shown in Fig. 4, this embodiment describes the data integrity check process for a snapshot chain of a disk. In this embodiment, the snapshot chain of the disk includes multiple disk snapshots, each of which has a data block identifier list containing the names of all data blocks the disk snapshot uses.
As shown in Fig. 4, in block 401, a data block identifier set is created and is initially empty. In block 402, the data block identifier lists of the multiple disk snapshots in the snapshot chain are traversed, and each data block name not already in the data block identifier set is added to the set. If the data block identifier lists of all disk snapshots have not yet been traversed, block 402 is repeated; once they have all been traversed, block 403 is executed. After the data block identifier lists of all disk snapshots of the disk have been traversed, the resulting data block identifier set is a deduplicated set: the data block names it contains are not repeated, and together they cover all data block names used by all disk snapshots in the snapshot chain of the disk.
In block 403, each data block name in the data block identifier set is traversed and the data block it indicates is read. A data block name indicates the storage location of the data block on the physical disk, so the corresponding data block can be read from the physical disk according to its name. If reading a data block fails, the data block is unavailable; if the data block is read successfully, it is available.
After all data block names in the data block identifier set have been traversed in block 403, it is known which data block names indicate available data blocks and which indicate unavailable ones. In block 404, it is determined whether all data blocks indicated by the data block identifier set were read successfully, that is, whether they are all available. If they are all available, the data of all disk snapshots in the snapshot chain of the disk is complete. If any data block indicated by the set is unavailable, the data block identifier lists that contain an unavailable data block name can be found by comparing the unavailable names with the data block identifier list of each disk snapshot, which identifies the disk snapshots whose data is incomplete.
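The last step of block 404, mapping unavailable data block names back to the affected snapshots, can be sketched as follows. The function name incomplete_snapshots and the dict layout of the metadata are assumptions made for illustration.

```python
def incomplete_snapshots(snapshot_ids, unavailable):
    """Return the snapshots whose identifier list contains an unavailable block.

    snapshot_ids -- dict mapping snapshot name -> data block identifier list
    unavailable  -- iterable of data block names that failed to read
    """
    bad = set(unavailable)
    return [name for name, ids in snapshot_ids.items() if bad.intersection(ids)]


chain = {
    "A": ["1-A", "2-A", "3-A", "4-A"],
    "B": ["1-B", "2-A", "3-B", "4-A"],
}
only_a = incomplete_snapshots(chain, ["3-A"])    # block used by A alone
both = incomplete_snapshots(chain, ["2-A"])      # shared block affects A and B
```

A failure in a shared block such as 2-A marks every snapshot that reuses it, even though the block itself was read only once.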
In this embodiment, because the data block identifiers are deduplicated, each data block is read only once when checking whether the data blocks are readable. This avoids many unnecessary read requests and improves the efficiency of the data integrity check for disk snapshots.
As shown in Fig. 5, this embodiment describes the process of migrating multiple disk snapshots in the snapshot chain of a disk from a source storage domain to a destination storage domain, where the migration order of the disk snapshots can be set. In this embodiment, the snapshot chain of the disk includes multiple disk snapshots, and the metadata of each disk snapshot includes a data block identifier list containing the names of all data blocks the disk snapshot uses.
As shown in Fig. 5, in block 501, a data block identifier set is created and is initially empty. In block 502, the data block identifier lists of the multiple disk snapshots of the disk are traversed, and each data block name not already in the data block identifier set is added to the set; if the data block identifier lists of all disk snapshots have not yet been traversed, block 502 is repeated. After the data block identifier lists of all disk snapshots of the disk have been traversed, the resulting data block identifier set is a deduplicated set: the data block names it contains are not repeated, and together they cover all data block names used by the multiple disk snapshots of the disk. Each data block identifier in the final set corresponds to a migration label that indicates whether the data block indicated by the name has been migrated. At the start (before migration begins), the migration label of every data block identifier in the set indicates "not migrated". In some implementations, the migration label may be represented by 0 or 1; for example, 1 indicates "migrated" and 0 indicates "not migrated". However, the present application does not limit this.
In block 503, the migration order of the multiple disk snapshots of the disk is determined. The order may be specified according to business needs, for example migrating important snapshots first; alternatively, by default, the disk snapshots may be migrated in chronological order. The present application does not limit this.
In block 504, the data block identifier lists of the multiple disk snapshots of the disk are traversed according to the migration order. For each data block name in the data block identifier list of each disk snapshot: if the migration label of that data block name in the data block identifier set indicates "not migrated", the data block indicated by the name is copied from the source storage domain to the destination storage domain, and the migration label of the name in the set is updated to "migrated"; if the migration label indicates "migrated" (that is, the data block indicated by the name has already been copied to the destination storage domain), the data block is not copied.
After the data block identifier list of a disk snapshot has been fully traversed, it is confirmed that the data blocks of that disk snapshot have been copied to the destination storage domain. At this point, in block 505, the metadata of the disk snapshot (including the data block identifier list) is copied to the destination storage domain. The disk snapshot can then be rebuilt in the destination storage domain from its metadata and corresponding data blocks. In other words, a disk snapshot has been migrated successfully once its metadata and all of its data blocks have been copied. In block 506, it is determined whether all disk snapshots of the disk have been migrated successfully. If so, the migration of the disk snapshots of the disk is complete; if not, the data block identifier list of the next disk snapshot is traversed to determine whether the next disk snapshot has been migrated.
In the present embodiment, it is added to the process of duplicate removal processing, each is repeated the data block of reference, only copy one It is secondary, then the data copy amount of disk snapshot migration can be greatly reduced, is repeatedly copied to avoid duplicate data block.And And allow the copy sequence of specified snapshot chain, preferentially to back up important disk snapshot.
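The flow of Fig. 5 can be sketched as follows. This is an illustrative model only: storage domains are represented by plain dicts, and the names `migrate_in_order`, `source`, `dest_blocks`, and `dest_meta` are assumptions, not terms from the application.

```python
def migrate_in_order(order, block_lists, source, dest_blocks, dest_meta):
    """Copy each distinct block once, then each snapshot's metadata; return copy count."""
    # Blocks 501-502: deduplicated set, every migration label starts at 0.
    labels = {name: 0 for snap in order for name in block_lists[snap]}
    copied = 0
    for snap in order:                                # block 504: follow the order
        for name in block_lists[snap]:
            if labels[name] == 0:                     # not migrated yet
                dest_blocks[name] = source[name]
                labels[name] = 1                      # mark as migrated
                copied += 1
        dest_meta[snap] = list(block_lists[snap])     # block 505: copy metadata
    return copied

source = {"b1": b"aa", "b2": b"bb", "b3": b"cc", "b4": b"dd"}
block_lists = {"s1": ["b1", "b2", "b3"], "s2": ["b2", "b3", "b4"]}
dest_blocks, dest_meta = {}, {}
copied = migrate_in_order(["s2", "s1"], block_lists, source, dest_blocks, dest_meta)
```

In this hypothetical run, snapshot s2 is migrated first, and only block b1 remains to be copied when s1 is processed, so four blocks are copied in total instead of six.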
As shown in Fig. 6, this embodiment describes the process of migrating multiple disk snapshots in the snapshot chain of a disk from the source storage domain to the destination storage domain, where the migration order of the disk snapshots can be set. This embodiment differs from the embodiment shown in Fig. 5 in that data blocks can be copied in batches according to the migration order.
As shown in Fig. 6, for blocks 601, 602, and 603, refer to the description of blocks 501, 502, and 503 in Fig. 5; details are not repeated here.
In block 604, the data blocks of the disk snapshots are copied according to the migration order. Following the migration order determined in block 603, each disk snapshot is traversed, and its data block identifier list is obtained by reading its metadata. Combining the migration order with the data block identifier list of each disk snapshot makes it possible to determine which data block names correspond to data blocks that need to be copied first, and those data blocks are copied in batches.
It should be noted that, after a data block has been copied to the destination storage domain, the migration label of the data block name that indicates that data block must be updated in the data block identifier set to indicate "migrated"; for example, the migration label is updated from 0 to 1 (1 indicates migrated; 0 indicates not migrated).
In block 605, it is determined whether the migration labels of all data block names in the data block identifier set indicate "migrated". If they all do, data block copying is complete. If the migration label of any data block name still indicates "not migrated", data block copying is not complete, and block 604 continues to be executed.
In block 606, the data block identifier lists of the multiple disk snapshots are traversed in the migration order, and the migration labels in the data block identifier set are used to determine whether all data blocks named in the data block identifier list of a disk snapshot have been migrated. For each data block name in each data block identifier list, the name is looked up in the data block identifier set and its migration label is read. If the migration label indicates "not migrated", the named data block is copied to the destination storage domain and the migration label is changed to indicate "migrated"; if the migration label indicates "migrated", no further migration is performed for that data block. Once a data block identifier list has been fully traversed, that is, once it is confirmed that the data blocks of a disk snapshot have been copied to the destination storage domain, the metadata of the disk snapshot (including its data block identifier list) is copied to the destination storage domain in block 607. The snapshot can then be rebuilt in the destination storage domain from its metadata and its data blocks.
In block 608, it is determined whether all disk snapshots of the disk have been migrated successfully. If so, migration of the disk snapshots of this disk is confirmed complete; if not, the data block identifier list of the next disk snapshot is traversed to determine whether the next disk snapshot has been migrated.
It should be noted that blocks 604 and 606 may start executing at the same time, or block 606 may start after block 604 has been executing for a period of time.
In this embodiment, deduplication is added, so each data block that is referenced more than once is copied only once. This can greatly reduce the amount of data copied during snapshot migration and avoids copying the same data block repeatedly. Moreover, the copy order of the snapshot chain may be specified, so that important snapshots are backed up first. Because disk snapshots do not depend on one another in any fixed order, multiple disk snapshots can be migrated at the same time. In this embodiment, the data block copy process is carried out separately from the process that checks whether all data blocks of a disk snapshot have been copied, which allows data blocks to be copied in batches and improves migration efficiency.
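The separation between batch copying (block 604) and per-snapshot checking (blocks 606 and 607) can be sketched as below. For simplicity the two phases run one after the other here, whereas the text allows them to overlap; the function names, the `batch_size` parameter, and the dict-based storage domains are assumptions for illustration.

```python
def pending_in_order(order, block_lists, labels):
    """Unmigrated block names, in the order the migration sequence first needs them."""
    seen, pending = set(), []
    for snap in order:
        for name in block_lists[snap]:
            if labels[name] == 0 and name not in seen:
                seen.add(name)
                pending.append(name)
    return pending

def batch_copy(order, block_lists, labels, source, dest, batch_size=2):
    """Block 604: copy pending blocks batch by batch; return the number of batches."""
    pending = pending_in_order(order, block_lists, labels)
    batches = 0
    for start in range(0, len(pending), batch_size):
        for name in pending[start:start + batch_size]:
            dest[name] = source[name]
            labels[name] = 1          # update the label after each copy
        batches += 1
    return batches

def check_and_finalize(order, block_lists, labels, source, dest, dest_meta):
    """Blocks 606-607: copy any block still unmigrated, then the snapshot metadata."""
    for snap in order:
        for name in block_lists[snap]:
            if labels[name] == 0:
                dest[name] = source[name]
                labels[name] = 1
        dest_meta[snap] = list(block_lists[snap])
    return list(dest_meta)

source = {f"b{i}": bytes([i]) for i in range(1, 6)}
block_lists = {"s1": ["b1", "b2", "b3"], "s2": ["b3", "b4", "b5"]}
labels = {name: 0 for name in source}
dest, dest_meta = {}, {}
batches = batch_copy(["s2", "s1"], block_lists, labels, source, dest)
done = check_and_finalize(["s2", "s1"], block_lists, labels, source, dest, dest_meta)
```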
In the embodiments shown in Fig. 5 and Fig. 6, the deduplicated data block identifier set clearly marks whether each data block has been copied. As soon as all data blocks used by a disk snapshot have been copied to the destination storage domain, that disk snapshot can be reconstructed there. That is, once the data block names of the snapshot chain have been deduplicated, no particular copy order is required for the data block identifier set. The migration order of the disk snapshots can therefore be specified: the data blocks and metadata of important disk snapshots are copied to the destination storage domain first, so that important disk snapshots can be fully restored there first. This improves the flexibility of disk snapshot migration.
As shown in Fig. 7, this embodiment describes the process of migrating multiple disk snapshots of a disk from the source storage domain to the destination storage domain, where no preferred order is specified for the migration. In this embodiment, the snapshot chain of the disk includes multiple disk snapshots; the metadata block of each disk snapshot includes a data block identifier list, which contains all data block names used by that disk snapshot.
As shown in Fig. 7, for blocks 701 and 702, refer to the description of blocks 501 and 502 in Fig. 5; details are not repeated here.
In block 703, the data blocks named by the multiple data block names in the data block identifier set are copied to the destination storage domain in batches. In this embodiment, batch copying improves the concurrency of data block copying.
In block 704, after all data blocks have been copied, the metadata of the multiple disk snapshots is copied to the destination storage domain, so that the multiple disk snapshots of the disk can be reconstructed in the destination storage domain from the metadata and the copied data blocks.
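A minimal sketch of the Fig. 7 flow follows, assuming a thread pool as the means of concurrent copying. The source does not say how concurrency is achieved, so the use of `ThreadPoolExecutor`, the `workers` parameter, and the dict-based storage domains are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def migrate_unordered(block_lists, source, dest_blocks, dest_meta, workers=4):
    """Copy every distinct block concurrently, then all metadata; return block count."""
    unique = set()                                # blocks 701-702: deduplicate
    for names in block_lists.values():
        unique.update(names)

    def copy_one(name):
        dest_blocks[name] = source[name]

    # Block 703: copy the deduplicated blocks with a pool of worker threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(copy_one, sorted(unique)))  # list() surfaces any copy error

    # Block 704: metadata is copied only after every block has been copied.
    for snap, names in block_lists.items():
        dest_meta[snap] = list(names)
    return len(unique)

source = {"b1": b"aa", "b2": b"bb", "b3": b"cc", "b4": b"dd"}
block_lists = {"s1": ["b1", "b2", "b3"], "s2": ["b2", "b3", "b4"]}
dest_blocks, dest_meta = {}, {}
count = migrate_unordered(block_lists, source, dest_blocks, dest_meta)
```

Because no snapshot has priority here, the metadata step waits for the whole deduplicated set rather than for the blocks of one snapshot.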
As shown in Fig. 8, this embodiment describes the process of migrating the snapshot chains of multiple disks from the source storage domain to the destination storage domain. The snapshot chain of each disk includes multiple disk snapshots; the metadata block of each disk snapshot includes a data block identifier list, which contains all data block names used by that disk snapshot.
As shown in Fig. 8, in block 801, a mapping set of disks and disk snapshots is created; the mapping set is initially empty. In block 802, the list of disk snapshots to be migrated in the source storage domain is traversed; each disk name that is not already in the mapping set is added to it, and the disk snapshots corresponding to each disk name are recorded in the mapping set. The mapping set thus shows, for each disk, which one or more disk snapshots are to be migrated; that is, each disk name in the mapping set corresponds to a list of disk snapshots to be migrated.
For example, suppose the disk snapshots to be migrated in the current source storage domain are the following three: disk snapshot A, disk snapshot B, and disk snapshot C. Assume that disk snapshot A belongs to disk x, disk snapshot B belongs to disk x, and disk snapshot C belongs to disk y. The mapping set then yields the disk set {x, y}, where the disk snapshots corresponding to disk x are the first two (A and B), and the disk snapshot corresponding to disk y is the third (C).
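The mapping set for an example of this shape (three snapshots, two disks) can be sketched as follows. The snapshot names `snap1` to `snap3`, the pair-based input format, and the function name `build_disk_mapping` are assumptions made for illustration.

```python
def build_disk_mapping(pending):
    """Group (snapshot, disk) pairs into {disk name: [snapshots to migrate]}."""
    mapping = {}                                  # block 801: initially empty
    for snapshot, disk in pending:                # block 802: traverse the list
        mapping.setdefault(disk, []).append(snapshot)
    return mapping

# The first two snapshots belong to disk x, the third to disk y.
pending = [("snap1", "x"), ("snap2", "x"), ("snap3", "y")]
mapping = build_disk_mapping(pending)
```

Each key of `mapping` can then be handed to a per-disk migration routine.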
In block 803, the mapping set is traversed, and the disk snapshot migration process is executed for each disk in the mapping set. For the disk snapshot migration process of each disk, refer to the description of any of the embodiments of Fig. 5 to Fig. 7; details are not repeated here.
In addition, an embodiment of the application further provides a method, including: obtaining the data block identifiers of the data blocks corresponding to a disk snapshot and deduplicating the data block identifiers; and, according to the deduplicated data block identifiers, performing at least one of the following: checking the data integrity of the disk snapshot, and migrating the disk snapshot across storage domains.
For the deduplication of the data block identifiers of the data blocks corresponding to a disk snapshot, refer to the description of deduplication unit 1264 in the above embodiments. For the processing performed on the deduplicated data block identifiers, refer to the description of snapshot integrity check unit 1266 and snapshot migration unit 1268 in the above embodiments. Details are not repeated here.
Fig. 9 is an example diagram of a system 900 according to various embodiments. System 900 may include one or more processors 904, system control logic 908 coupled to at least one of the processors 904, system memory 912 coupled to system control logic 908, non-volatile memory (NVM)/storage device 916 coupled to system control logic 908, a network interface 920 coupled to system control logic 908, and input/output (I/O) devices 932 coupled to system control logic 908.
Processor 904 may include one or more single-core or multi-core processors. Processor 904 may include any combination of general-purpose processors and special-purpose processors (for example, graphics processors, application processors, baseband processors, and so on).
In one embodiment, system control logic 908 may include any suitable interface controller to provide any suitable interface to at least one of the processors 904 and/or to any suitable device or component that communicates with system control logic 908.
In one embodiment, system control logic 908 may include one or more memory controllers to provide an interface to system memory 912. System memory 912 may be used to load and store data and/or instructions for system 900, such as instructions 924. In one embodiment, system memory 912 may include any suitable volatile memory, such as suitable dynamic random access memory (DRAM).
NVM/storage device 916 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions, such as instructions 924. NVM/storage device 916 may include any suitable non-volatile memory, such as flash memory, and/or any suitable non-volatile storage device, such as one or more hard disk drives (HDDs), one or more compact disc (CD) drives, and/or one or more digital versatile disc (DVD) drives.
NVM/storage device 916 may include storage resources that are physically part of the device on which system 900 is installed, or storage resources that the device can access without their being part of the device. For example, NVM/storage device 916 may be accessed over a network via network interface 920 and/or via input/output devices 932.
When instructions 924 are executed by one or more of the processors 904, system 900 may be caused to implement the method described in any of the embodiments of Fig. 3 to Fig. 8. In various embodiments, instructions 924, or hardware, firmware, and/or software components thereof, may additionally or alternatively be located in other elements of system 900.
Network interface 920 may have a transceiver to provide a radio interface for system 900 to communicate over one or more networks and/or with any other suitable device. In various embodiments, the transceiver may be integrated with other components of system 900. For example, the transceiver may include a processor of processor 904, memory of system memory 912, and NVM/storage of NVM/storage device 916. Network interface 920 may include any suitable hardware and/or firmware. Network interface 920 may include multiple antennas to provide a multiple-input, multiple-output radio interface. In one embodiment, network interface 920 may include a wired network adapter, a wireless network adapter, a telephone modem, and/or a wireless modem.
In one embodiment, at least one of the processors 904 may be packaged together with the logic of one or more controllers of system control logic 908. In one embodiment, at least one of the processors 904 may be packaged together with the logic of one or more controllers of system control logic 908 to form a system in package (SiP). In one embodiment, at least one of the processors 904 may be integrated on the same chip as the logic of one or more controllers of system control logic 908. In one embodiment, at least one of the processors 904 may be integrated on the same chip as the logic of one or more controllers of system control logic 908 to form a system on chip (SoC).
In various embodiments, input/output devices 932 may include a user interface designed to enable user interaction with system 900, a peripheral component interface designed to enable peripheral components to interact with system 900, and/or sensors designed to determine environmental conditions and/or location information related to system 900.
In various embodiments, the user interface may include, but is not limited to: a display (for example, a liquid crystal display or a touch screen display), a speaker, a microphone, one or more cameras (for example, a still camera and/or a video camera), a flash (for example, an LED flash), and a keyboard.
In various embodiments, the peripheral component interface may include, but is not limited to: a non-volatile memory port, a universal serial bus (USB) port, an audio jack, and a power interface.
In various embodiments, the sensors may include, but are not limited to: a gyroscope sensor, an accelerometer, a proximity sensor, an ambient light sensor, and a positioning unit. The positioning unit may also be part of network interface 920, or may interact with network interface 920, to communicate with components of a positioning network such as global positioning system (GPS) satellites.
In various embodiments, system 900 can be mobile computing device.In various embodiments, system 900 can have More or fewer components and/or different frameworks.
Several example embodiments are described below.
In example embodiment one, a method includes: determining, according to the data block identifiers of the data blocks corresponding to a disk snapshot, whether there are duplicate data block identifiers; if there are duplicate data block identifiers, deleting the duplicates to obtain deduplicated data block identifiers; and, according to the deduplicated data block identifiers, performing at least one of the following: checking the data integrity of the disk snapshot, and migrating the disk snapshot across storage domains.
In example embodiment two, in the method according to example embodiment one, determining whether there are duplicate data block identifiers according to the data block identifiers of the data blocks corresponding to a disk snapshot may include:
determining, according to the data block identifiers of the data blocks corresponding to the disk snapshots in a disk snapshot chain, whether the data blocks corresponding to different disk snapshots in the disk snapshot chain have duplicate data block identifiers.
In example embodiment three, in the method according to example embodiment two, performing at least one of the following according to the deduplicated data block identifiers may include:
performing, according to the deduplicated data block identifiers corresponding to the disk snapshot chain and one or more disk snapshots in the disk snapshot chain, at least one of the following: checking the data integrity of the disk snapshot, and migrating the disk snapshot across storage domains.
In example embodiment four, in the method according to example embodiment one, the data block identifiers of the data blocks corresponding to any disk snapshot are stored in a data block identifier list;
determining whether there are duplicate data block identifiers according to the data block identifiers of the data blocks corresponding to a disk snapshot and, if there are, deleting the duplicates to obtain deduplicated data block identifiers may include:
creating a data block identifier set that is initially empty;
traversing the data block identifier lists of multiple disk snapshots, adding to the data block identifier set each data block identifier that is not already in the set, and determining the deduplicated data block identifiers corresponding to the multiple disk snapshots from the data block identifier set obtained after the multiple disk snapshots have been traversed.
In example embodiment five, in the method according to example embodiment one, checking the data integrity of the disk snapshot according to the deduplicated data block identifiers may include: reading the data blocks indicated by the deduplicated data block identifiers; if the data block indicated by every deduplicated data block identifier is read successfully, determining that the data of the disk snapshot is complete; and if reading the data block indicated by at least one deduplicated data block identifier fails, determining that the data of the disk snapshot that corresponds to the data block indicated by that data block identifier is incomplete.
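The integrity check of example embodiment five can be sketched as follows. Each distinct block is read once, and a read failure marks every snapshot that references the failed block as incomplete. The `read_block` callback, the use of `OSError` to signal a failed read, and the sample store are assumptions for illustration.

```python
def check_integrity(block_lists, read_block):
    """Return {snapshot: True if all of its blocks could be read, else False}."""
    unique = set()
    for names in block_lists.values():
        unique.update(names)                      # deduplicated identifiers
    failed = set()
    for name in unique:                           # each block is read only once
        try:
            read_block(name)
        except OSError:
            failed.add(name)
    return {snap: not (failed & set(names)) for snap, names in block_lists.items()}

store = {"b1": b"aa", "b2": b"bb", "b3": b"cc"}   # hypothetical store; b4 is missing

def read_block(name):
    if name not in store:
        raise OSError(f"cannot read {name}")
    return store[name]

result = check_integrity({"s1": ["b1", "b2"], "s2": ["b2", "b4"]}, read_block)
```

In this hypothetical run, the shared block b2 is read once even though two snapshots reference it, and only s2 is reported as incomplete.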
In example embodiment six, in the method according to example embodiment one, before the disk snapshot is migrated across storage domains according to the deduplicated data block identifiers, the method may further include: determining the migration order of multiple disk snapshots of a disk;
migrating the disk snapshot across storage domains according to the deduplicated data block identifiers includes:
copying, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain; and copying the metadata of the disk snapshot from the source storage domain to the destination storage domain, so that the disk snapshot is rebuilt in the destination storage domain based on the metadata of the disk snapshot and the data blocks corresponding to the disk snapshot.
In example embodiment seven, in the method according to example embodiment six, each deduplicated data block identifier may correspond to a migration label;
copying, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain may include:
traversing the deduplicated data block identifiers according to the migration order; for each data block identifier, if its migration label indicates "not migrated", copying the data block indicated by the data block identifier from the source storage domain to the destination storage domain and updating its migration label to indicate "migrated"; and if its migration label indicates "migrated", not copying the data block indicated by the data block identifier.
In example embodiment eight, in the method according to example embodiment one, migrating the disk snapshot across storage domains according to the deduplicated data block identifiers includes: copying the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain; and copying the metadata of the disk snapshot from the source storage domain to the destination storage domain, so that the disk snapshot is rebuilt in the destination storage domain based on the metadata of the disk snapshot and the data blocks corresponding to the disk snapshot.
In example embodiment nine, in the method according to any one of example embodiments one to eight, the method may further include: before the disk snapshots of multiple disks are migrated across storage domains, determining the mapping relationship between each disk and its disk snapshots.
In example embodiment ten, a device includes: a deduplication unit and at least one of a snapshot integrity check unit and a snapshot migration unit. The deduplication unit is configured to determine, according to the data block identifiers of the data blocks corresponding to a disk snapshot, whether there are duplicate data block identifiers and, if there are, to delete the duplicates to obtain deduplicated data block identifiers. The snapshot integrity check unit is configured to check the data integrity of the disk snapshot according to the deduplicated data block identifiers. The snapshot migration unit is configured to migrate the disk snapshot across storage domains according to the deduplicated data block identifiers.
In example embodiment eleven, in the device according to example embodiment ten, the deduplication unit is configured to determine whether there are duplicate data block identifiers, according to the data block identifiers of the data blocks corresponding to a disk snapshot, in the following manner:
determining, according to the data block identifiers of the data blocks corresponding to the disk snapshots in a disk snapshot chain, whether the data blocks corresponding to different disk snapshots in the disk snapshot chain have duplicate data block identifiers.
In example embodiment twelve, in the device according to example embodiment ten, the data block identifiers of the data blocks corresponding to any disk snapshot are stored in a data block identifier list;
the deduplication unit is configured to obtain the deduplicated data block identifiers in the following manner:
creating a data block identifier set that is initially empty;
traversing the data block identifier lists of multiple disk snapshots, adding to the data block identifier set each data block identifier that is not already in the set, and determining the deduplicated data block identifiers corresponding to the multiple disk snapshots from the data block identifier set obtained after the multiple disk snapshots have been traversed.
In example embodiment thirteen, in the device according to example embodiment ten, the snapshot integrity check unit may be configured to check the data integrity of the disk snapshot according to the deduplicated data block identifiers in the following manner: reading the data blocks indicated by the deduplicated data block identifiers; if the data block indicated by every deduplicated data block identifier is read successfully, determining that the data of the disk snapshot is complete; and if reading the data block indicated by at least one deduplicated data block identifier fails, determining that the data of the disk snapshot that corresponds to the data block indicated by that data block identifier is incomplete.
In example embodiment fourteen, in the device according to example embodiment ten, the snapshot migration unit may further be configured to determine the migration order of multiple disk snapshots;
the snapshot migration unit may be configured to migrate the disk snapshot across storage domains according to the deduplicated data block identifiers in the following manner:
copying, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain; and copying the metadata of the disk snapshot from the source storage domain to the destination storage domain, so that the disk snapshot is rebuilt in the destination storage domain based on the metadata of the disk snapshot and the data blocks corresponding to the disk snapshot.
In example embodiment fifteen, in the device according to example embodiment fourteen, each deduplicated data block identifier may correspond to a migration label;
the snapshot migration unit may be configured to copy, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain in the following manner:
traversing the deduplicated data block identifiers according to the migration order; for each data block identifier, if its migration label indicates "not migrated", copying the data block indicated by the data block identifier from the source storage domain to the destination storage domain and updating its migration label to indicate "migrated"; and if its migration label indicates "migrated", not copying the data block indicated by the data block identifier.
In example embodiment sixteen, a system includes: one or more processors; and one or more machine-readable media storing a plurality of instructions that, when executed by the one or more processors, cause the system to implement the method according to any one of example embodiments one to nine.
In example embodiment seventeen, a machine-readable medium stores a plurality of instructions that, when executed by one or more processors, implement the method according to any one of example embodiments one to nine.
In example embodiment eighteen, a method includes: obtaining the data block identifiers of the data blocks corresponding to a disk snapshot and deduplicating the data block identifiers; and, according to the deduplicated data block identifiers, performing at least one of the following: checking the data integrity of the disk snapshot, and migrating the disk snapshot across storage domains.
The basic principles, main features, and advantages of the application have been shown and described above. The application is not limited by the above embodiments; the embodiments and the description only illustrate the principles of the application. Various changes and improvements may be made to the application without departing from its spirit and scope, and all such changes and improvements fall within the scope of protection claimed.

Claims (18)

1. A method, comprising:
determining, according to the data block identifiers of the data blocks corresponding to a disk snapshot, whether there are duplicate data block identifiers; if there are duplicate data block identifiers, deleting the duplicates to obtain deduplicated data block identifiers;
performing, according to the deduplicated data block identifiers, at least one of the following: checking the data integrity of the disk snapshot, and migrating the disk snapshot across storage domains.
2. The method according to claim 1, wherein determining whether there are duplicate data block identifiers according to the data block identifiers of the data blocks corresponding to a disk snapshot comprises:
determining, according to the data block identifiers of the data blocks corresponding to the disk snapshots in a disk snapshot chain, whether the data blocks corresponding to different disk snapshots in the disk snapshot chain have duplicate data block identifiers.
3. The method according to claim 2, wherein performing at least one of the following according to the deduplicated data block identifiers comprises:
performing, according to the deduplicated data block identifiers corresponding to the disk snapshot chain and one or more disk snapshots in the disk snapshot chain, at least one of the following: checking the data integrity of the disk snapshot, and migrating the disk snapshot across storage domains.
4. The method according to claim 1, wherein the data block identifiers of the data blocks corresponding to any disk snapshot are stored in a data block identifier list;
determining whether there are duplicate data block identifiers according to the data block identifiers of the data blocks corresponding to a disk snapshot and, if there are, deleting the duplicates to obtain deduplicated data block identifiers comprises:
creating a data block identifier set that is initially empty; traversing the data block identifier lists of multiple disk snapshots, adding to the data block identifier set each data block identifier that is not already in the set, and determining the deduplicated data block identifiers corresponding to the multiple disk snapshots from the data block identifier set obtained after the multiple disk snapshots have been traversed.
5. The method according to claim 1, wherein checking the data integrity of the disk snapshot according to the deduplicated data block identifiers comprises:
reading the data blocks indicated by the deduplicated data block identifiers; if the data block indicated by every deduplicated data block identifier is read successfully, determining that the data of the disk snapshot is complete; and if reading the data block indicated by at least one deduplicated data block identifier fails, determining that the data of the disk snapshot that corresponds to the data block indicated by that data block identifier is incomplete.
6. The method according to claim 1, wherein, before migrating the disk snapshot across storage domains according to the deduplicated data block identifiers, the method further comprises: determining a migration order of multiple disk snapshots;
migrating the disk snapshot across storage domains according to the deduplicated data block identifiers comprises:
copying, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from a source storage domain to a destination storage domain; and copying the metadata of the disk snapshot from the source storage domain to the destination storage domain, so that the disk snapshot is rebuilt in the destination storage domain based on the metadata of the disk snapshot and the data blocks corresponding to the disk snapshot.
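By way of non-limiting illustration, the ordered migration of claim 6 might be sketched in Python as follows. Storage domains are modelled as in-memory dictionaries and snapshot metadata as an ordered list of data block identifiers; both are assumptions made for the example, as are the names `migrate_snapshots` and `rebuild`.

```python
from typing import Dict, List, Tuple


def migrate_snapshots(
    order: List[str],
    metadata: Dict[str, List[str]],
    src_blocks: Dict[str, bytes],
) -> Tuple[Dict[str, bytes], Dict[str, List[str]]]:
    """Copy each distinct block once, then copy the snapshot metadata."""
    # Deduplicated identifiers, visited in the snapshots' migration order.
    dedup_ids = list(dict.fromkeys(b for name in order for b in metadata[name]))
    # Copy every distinct block from the source to the destination domain.
    dst_blocks = {block_id: src_blocks[block_id] for block_id in dedup_ids}
    # Copy the metadata so the destination can rebuild each snapshot.
    dst_meta = {name: list(metadata[name]) for name in order}
    return dst_blocks, dst_meta


def rebuild(name: str, dst_meta: Dict[str, List[str]],
            dst_blocks: Dict[str, bytes]) -> bytes:
    """Reassemble one snapshot from migrated metadata and blocks."""
    return b"".join(dst_blocks[block_id] for block_id in dst_meta[name])
```

In this sketch the destination holds one copy of each shared block, yet every snapshot can still be reassembled from its own metadata.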
7. The method according to claim 6, wherein each deduplicated data block identifier corresponds to a migration marker;
copying, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain comprises:
traversing the deduplicated data block identifiers according to the migration order; for each data block identifier, if the migration marker corresponding to the data block identifier indicates that the data block has not been migrated, copying the data block indicated by the data block identifier from the source storage domain to the destination storage domain and updating the corresponding migration marker to indicate that the data block has been migrated; and if the migration marker corresponding to the data block identifier indicates that the data block has been migrated, not copying the data block indicated by the data block identifier.
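By way of non-limiting illustration, the marker-driven copy of claim 7 might be sketched in Python as follows. The marker is modelled as a boolean in a dictionary and the storage domains as dictionaries; these representations and the name `copy_with_markers` are assumptions made for the example.

```python
from typing import Dict, Iterable


def copy_with_markers(
    ordered_ids: Iterable[str],
    markers: Dict[str, bool],
    src: Dict[str, bytes],
    dst: Dict[str, bytes],
) -> int:
    """Copy blocks whose marker says 'not migrated'; return the copy count."""
    copied = 0
    for block_id in ordered_ids:             # traversal follows the migration order
        if markers.get(block_id, False):     # marker says already migrated: skip
            continue
        dst[block_id] = src[block_id]        # copy source domain -> destination domain
        markers[block_id] = True             # update marker to 'migrated'
        copied += 1
    return copied
```

Because the markers persist between calls in this sketch, repeating the traversal copies nothing further.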
8. The method according to claim 1, wherein migrating the disk snapshot across storage domains according to the deduplicated data block identifiers comprises:
copying the data blocks indicated by the deduplicated data block identifiers from a source storage domain to a destination storage domain; and
copying the metadata of the disk snapshot from the source storage domain to the destination storage domain, so that the disk snapshot is rebuilt in the destination storage domain based on the metadata of the disk snapshot and the data blocks corresponding to the disk snapshot.
9. The method according to any one of claims 1 to 8, wherein the method further comprises: before migrating the disk snapshots of multiple disks across storage domains, determining the mapping relationship between each disk and its disk snapshots.
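By way of non-limiting illustration, the mapping relationship of claim 9 might be built in Python as follows. The input format, a sequence of (snapshot identifier, disk identifier) pairs, and the name `map_disks_to_snapshots` are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_disks_to_snapshots(
    records: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Group snapshot identifiers under the disk each snapshot belongs to."""
    mapping = defaultdict(list)
    for snapshot_id, disk_id in records:
        mapping[disk_id].append(snapshot_id)  # input order is kept per disk
    return dict(mapping)
```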
10. A device, comprising: a deduplication unit, and at least one of a snapshot integrity check unit and a snapshot migration unit;
the deduplication unit is configured to determine, according to the data block identifiers of the data blocks corresponding to a disk snapshot, whether duplicate data block identifiers exist, and, if duplicate data block identifiers exist, to delete the duplicate data block identifiers to obtain deduplicated data block identifiers;
the snapshot integrity check unit is configured to check the data integrity of the disk snapshot according to the deduplicated data block identifiers;
the snapshot migration unit is configured to migrate the disk snapshot across storage domains according to the deduplicated data block identifiers.
11. The device according to claim 10, wherein the deduplication unit is configured to determine, according to the data block identifiers of the data blocks corresponding to the disk snapshot, whether duplicate data block identifiers exist in the following manner: determining, according to the data block identifiers of the data blocks corresponding to the disk snapshots in a disk snapshot chain, whether the data blocks corresponding to different disk snapshots in the disk snapshot chain have duplicate data block identifiers.
12. The device according to claim 10, wherein the data block identifiers of the data blocks corresponding to any disk snapshot are stored in a data block identifier list;
the deduplication unit is configured to obtain the deduplicated data block identifiers in the following manner:
creating a data block identifier set that is initially empty; traversing the data block identifier lists of multiple disk snapshots and adding to the data block identifier set each data block identifier that is not already in the set; and determining, from the data block identifier set obtained after traversing the multiple disk snapshots, the deduplicated data block identifiers corresponding to the multiple disk snapshots.
13. The device according to claim 10, wherein the snapshot integrity check unit is configured to check the data integrity of the disk snapshot according to the deduplicated data block identifiers in the following manner:
reading the data blocks indicated by the deduplicated data block identifiers; if the data block indicated by every deduplicated data block identifier is read successfully, determining that the data of the disk snapshot is complete; and if reading the data block indicated by at least one deduplicated data block identifier fails, determining that the data of the disk snapshot corresponding to the data block indicated by that data block identifier is incomplete.
14. The device according to claim 10, wherein the snapshot migration unit is further configured to determine a migration order of multiple disk snapshots;
the snapshot migration unit is configured to migrate the disk snapshot across storage domains according to the deduplicated data block identifiers in the following manner: copying, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from a source storage domain to a destination storage domain; and copying the metadata of the disk snapshot from the source storage domain to the destination storage domain, so that the disk snapshot is rebuilt in the destination storage domain based on the metadata of the disk snapshot and the data blocks corresponding to the disk snapshot.
15. The device according to claim 14, wherein each deduplicated data block identifier corresponds to a migration marker;
the snapshot migration unit is configured to copy, according to the migration order, the data blocks indicated by the deduplicated data block identifiers from the source storage domain to the destination storage domain in the following manner:
traversing the deduplicated data block identifiers according to the migration order; for each data block identifier, if the migration marker corresponding to the data block identifier indicates that the data block has not been migrated, copying the data block indicated by the data block identifier from the source storage domain to the destination storage domain and updating the corresponding migration marker to indicate that the data block has been migrated; and if the migration marker corresponding to the data block identifier indicates that the data block has been migrated, not copying the data block indicated by the data block identifier.
16. A system, comprising:
one or more processors; and
one or more machine-readable media storing a plurality of instructions that, when executed by the one or more processors, cause the system to implement the method according to any one of claims 1 to 9.
17. A machine-readable medium storing a plurality of instructions that, when executed by one or more processors, implement the method according to any one of claims 1 to 9.
18. A method, comprising:
obtaining the data block identifiers of the data blocks corresponding to a disk snapshot, and deduplicating the data block identifiers; and
performing, according to the deduplicated data block identifiers, at least one of the following processes: checking the data integrity of the disk snapshot, and migrating the disk snapshot across storage domains.
CN201710994265.1A 2017-10-23 2017-10-23 A kind of data processing method and device of disk snapshot Pending CN109697021A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710994265.1A CN109697021A (en) 2017-10-23 2017-10-23 A kind of data processing method and device of disk snapshot
PCT/CN2018/109933 WO2019080717A1 (en) 2017-10-23 2018-10-12 Method and device for processing data of disk snapshot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710994265.1A CN109697021A (en) 2017-10-23 2017-10-23 A kind of data processing method and device of disk snapshot

Publications (1)

Publication Number Publication Date
CN109697021A true CN109697021A (en) 2019-04-30

Family

ID=66226809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710994265.1A Pending CN109697021A (en) 2017-10-23 2017-10-23 A kind of data processing method and device of disk snapshot

Country Status (2)

Country Link
CN (1) CN109697021A (en)
WO (1) WO2019080717A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113031851A (en) * 2019-12-25 2021-06-25 阿里巴巴集团控股有限公司 Data snapshot method, device and equipment
CN114077569A (en) * 2020-08-18 2022-02-22 富泰华工业(深圳)有限公司 Method and equipment for compressing data and method and equipment for decompressing data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201774A (en) * 2006-12-15 2008-06-18 英业达股份有限公司 Method for snapshot of magnetic disc
CN102081552A (en) * 2009-12-01 2011-06-01 华为技术有限公司 Method, device and system for transferring from physical machine to virtual machine on line
CN102378969A (en) * 2009-03-30 2012-03-14 惠普开发有限公司 Deduplication of data stored in a copy volume
CN102567218A (en) * 2010-12-17 2012-07-11 微软公司 Garbage collection and hotspots relief for a data deduplication chunk store
CN104484480A (en) * 2014-12-31 2015-04-01 华为技术有限公司 Deduplication-based remote replication method and device
US20170206219A1 (en) * 2013-01-11 2017-07-20 Commvault Systems, Inc. High availability distributed deduplicated storage system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101650679A (en) * 2009-07-27 2010-02-17 浪潮电子信息产业股份有限公司 Efficient snapshot technology based on disk IO read-write change
US9047301B2 (en) * 2010-04-19 2015-06-02 Greenbytes, Inc. Method for optimizing the memory usage and performance of data deduplication storage systems
CN105095016B (en) * 2014-05-16 2018-05-18 北京云巢动脉科技有限公司 A kind of disk snapshot rollback method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113031851A (en) * 2019-12-25 2021-06-25 阿里巴巴集团控股有限公司 Data snapshot method, device and equipment
CN113031851B (en) * 2019-12-25 2024-06-11 阿里巴巴集团控股有限公司 Data snapshot method, device and equipment
CN114077569A (en) * 2020-08-18 2022-02-22 富泰华工业(深圳)有限公司 Method and equipment for compressing data and method and equipment for decompressing data
CN114077569B (en) * 2020-08-18 2023-07-18 富泰华工业(深圳)有限公司 Method and device for compressing data, and method and device for decompressing data

Also Published As

Publication number Publication date
WO2019080717A1 (en) 2019-05-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190430