CN105938457B - Filter method, device and the data reading system of data - Google Patents

Publication number
CN105938457B
Authority
CN
China
Prior art keywords
data
snapshot
target
bitmap
baseline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610200260.2A
Other languages
Chinese (zh)
Other versions
CN105938457A (en)
Inventor
梁峰
周江鲤
曾强
胡伟
朱磊
龙红梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201610200260.2A
Publication of CN105938457A
Application granted
Publication of CN105938457B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F12/0646 Configuration or reconfiguration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems

Abstract

This application discloses a data filtering method, a data filtering apparatus, and a data reading system, which are used to improve the processing efficiency of storage resources. The data filtering method includes: creating a target snapshot for a source volume; comparing the target snapshot with a baseline snapshot for differences to obtain a comparison result, where the comparison result records difference data; receiving a bitmap query command sent by a host, where the bitmap query command is used to query the bitmap of the valid data of the target snapshot; generating the bitmap of the valid data according to the difference data, where the bitmap of the valid data records the positions of the valid data of the target snapshot; and returning the bitmap of the valid data to the host.

Description

Data filtering method, data filtering apparatus, and data reading system
Technical field
This application relates to the storage field, and in particular to a data filtering method, a data filtering apparatus, and a data reading system.
Background
A logical unit number (Logical Unit Number, LUN) identifies an individual storage unit on a storage device that an application server can recognize. The space of a LUN comes from a storage pool, and the space of the storage pool comes from the hard disks that make up a disk domain. From the point of view of the application server, a LUN can be regarded as a usable hard disk. The Thick LUN (a traditional, non-thin LUN) and the Thin LUN are two types of LUN. Both support virtual resource allocation and can be created, expanded, and shrunk relatively easily. Both Thick LUNs and Thin LUNs produce invalid data space, that is, logical space in the LUN that is occupied by invalid data. For this logical space, either no physical space has been allocated, or physical space has been allocated but its content is all zeros. Reading invalid data space usually returns all-zero data. Invalid data arises in the following two cases:
1. Allocated space of a Thick LUN that has not been written by host I/O (Input/Output). For a Thick LUN, all of the physical space corresponding to the LUN is allocated at creation. In general, the allocated physical space is immediately filled with zeros, and this zero-filled physical space is invalid data space. Reading LUN space that has been allocated but not yet overwritten by host I/O returns all-zero data.
2. Unallocated space of a Thin LUN. For a Thin LUN, physical space matching the LUN size is not allocated at creation. Instead, physical space is allocated on demand as data is written. Part of the logical space of a Thin LUN may therefore have no physical space allocated. This logical space is also invalid data space, and reading it returns all-zero data.
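The two cases can be illustrated with a toy model. This is only a sketch: the class names `ThinLun` and `ThickLun`, the 4096-byte block size, and the dictionary-based allocation are assumptions made for illustration and are not taken from the patent.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes (hypothetical)


class ThinLun:
    """Toy Thin LUN: a physical block is allocated only on first write."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.allocated = {}  # block index -> block contents

    def write(self, index, data):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.allocated[index] = bytes(data)

    def read(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        # Case 2: unallocated logical space reads back as all zeros.
        return self.allocated.get(index, bytes(BLOCK_SIZE))


class ThickLun(ThinLun):
    """Toy Thick LUN: every block is allocated and zero-filled at creation."""

    def __init__(self, num_blocks):
        super().__init__(num_blocks)
        # Case 1: allocated space that the host has not written is zero-filled.
        for i in range(num_blocks):
            self.allocated[i] = bytes(BLOCK_SIZE)
```

In both models a read of space the host never wrote returns a block of zeros, which is the invalid data the rest of the document sets out to filter.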
In existing snapshot processing, the key to filtering invalid data is to filter out the invalid data in a snapshot by means of zero-data identification. There are three main zero-data identification methods:
Method one: byte-by-byte inspection. After obtaining the data block to be checked, this method checks it for zero data byte by byte in real time. If the block meets the condition, it is judged to be zero data.
Method two: zero-data flags. Each time data is written, or during periodic checks, this method checks whether a given data block is all zeros. If it is, the flag bit of that data block is set to 1. To determine later whether a data block is all-zero data, the corresponding flag bit is queried directly.
Method three: Thin LUN marking. For a Thin LUN, this method obtains the bitmap of its allocated space and thereby derives the remaining unallocated space, which is treated as all-zero data space.
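The three methods can be sketched as follows. The function and class names (`is_zero_block`, `ZeroFlagTable`, `unallocated_blocks`) are hypothetical, and the flag table assumes a volume that starts out zero-filled; the patent does not prescribe these details.

```python
def is_zero_block(block: bytes) -> bool:
    """Method one: inspect every byte of the block."""
    return not any(block)


class ZeroFlagTable:
    """Method two: one flag per block, set to 1 while the block is all zeros."""

    def __init__(self, num_blocks: int):
        # Assumption: the volume starts zero-filled, so every flag starts at 1.
        self.flags = [1] * num_blocks

    def on_write(self, index: int, block: bytes) -> None:
        # The check happens once at write time instead of at every query.
        self.flags[index] = 1 if is_zero_block(block) else 0

    def is_zero(self, index: int) -> bool:
        return self.flags[index] == 1


def unallocated_blocks(allocation_bitmap):
    """Method three: blocks whose allocation bit is 0 count as all-zero space."""
    return [i for i, bit in enumerate(allocation_bitmap) if bit == 0]
```

Methods two and three trade the per-query scan of method one for bookkeeping that has to be kept up to date on every write or allocation.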
Methods two and three, which are the more common ones, both require data to be marked. When invalid data in a snapshot is filtered, the marks are queried to identify the corresponding data, and the invalid data among the identified data is then filtered out to obtain the valid data. This adds system resource overhead and reduces the processing efficiency of storage resources.
Summary of the invention
This application provides a data filtering method, a data filtering apparatus, and a data reading system, which are used to improve the processing efficiency of storage resources.
In a first aspect, this application provides a data filtering method. The method is applied to a storage device that includes a source volume and a baseline snapshot of the source volume. The source volume provides data storage for a host connected to the storage device. The baseline snapshot is a snapshot taken before data is written to the source volume, and what the baseline snapshot records is invalid data. The method includes:
Creating a target snapshot for the source volume; comparing the target snapshot with the baseline snapshot for differences to obtain a comparison result, where the comparison result records difference data; receiving a bitmap query command sent by the host, where the bitmap query command is used to query the bitmap of the valid data of the target snapshot; generating the bitmap of the valid data according to the difference data, where the bitmap of the valid data records the positions of the valid data of the target snapshot; and returning the bitmap of the valid data to the host. Because the bitmap of the valid data records the positions of the valid data of the target snapshot, the host can read the valid data in the target snapshot according to that bitmap. This avoids identifying and filtering the invalid data in the snapshot by adding marks, which reduces system resource overhead and improves the processing efficiency of storage resources.
With reference to the first aspect, in a first implementation of the first aspect, comparing the target snapshot with the baseline snapshot for differences specifically includes:
Comparing the data blocks in the target snapshot with the data blocks in the baseline snapshot block by block; when a first data block in the target snapshot is identical to the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; and when the first data block in the target snapshot differs from the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is difference data. This implementation reads the data of both snapshots and compares the data blocks one by one to obtain the positions of the difference data blocks of the snapshot. It has little impact on existing procedures, and the comparison result is accurate.
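A minimal sketch of the block-by-block comparison, assuming each snapshot is available as an equal-length list of blocks. The function name `compare_block_by_block` and the list representation are illustrative assumptions.

```python
def compare_block_by_block(target_blocks, baseline_blocks):
    """Return the indexes of blocks that differ from the baseline snapshot.

    Blocks identical to the baseline are treated as invalid data;
    blocks that differ are difference data.
    """
    if len(target_blocks) != len(baseline_blocks):
        raise ValueError("snapshots must cover the same number of blocks")
    return [
        index
        for index, (target, baseline) in enumerate(zip(target_blocks, baseline_blocks))
        if target != baseline
    ]
```

The cost is proportional to the full snapshot size, since every block of both snapshots has to be read.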
With reference to the first aspect, in a second implementation of the first aspect, comparing the target snapshot with the baseline snapshot for differences specifically includes:
Marking the data blocks in the target snapshot and in the baseline snapshot with the version numbers used by changed block tracking (CBT); when the CBT version number of a first data block in the target snapshot is the same as the CBT version number of the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; and when the CBT version number of the first data block in the target snapshot differs from the CBT version number of the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is difference data. The lookup time of this implementation is very short: because the differences are already recorded in the CBT, only a small amount of time is needed to traverse the CBT table and extract them.
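The CBT comparison might look like the sketch below. How version numbers are assigned is an assumption here: the toy `CbtVolume` stamps each written block with the current version and advances the version whenever a snapshot is taken. Only the comparison rule (same version means invalid data, different version means difference data) comes from the text.

```python
class CbtVolume:
    """Toy volume that records a CBT version number per block (hypothetical)."""

    def __init__(self, num_blocks):
        self.versions = [0] * num_blocks
        self.current_version = 0

    def snapshot(self):
        # Freeze the per-block versions; later writes get a newer version.
        frozen = list(self.versions)
        self.current_version += 1
        return frozen

    def write(self, index):
        self.versions[index] = self.current_version


def cbt_diff(target_versions, baseline_versions):
    """Blocks whose CBT version differs from the baseline are difference data."""
    return [
        index
        for index, (target, baseline) in enumerate(zip(target_versions, baseline_versions))
        if target != baseline
    ]
```

Only the version table is traversed, so no block data is read during the comparison.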
With reference to the first aspect, in a third implementation of the first aspect, comparing the target snapshot with the baseline snapshot for differences specifically includes:
Looking up the private mapping table of the target snapshot and reading the address records in the private mapping table; looking up, in the data blocks of the target snapshot, the data corresponding to each address record, and determining that this data is difference data; looking up the shared mapping table of the target snapshot and the baseline snapshot and reading the address records in the shared mapping table; and looking up, in the data blocks of the target snapshot, the data corresponding to each of these address records, and determining that this data is difference data. This implementation takes little time because it only needs to query existing mapping table records. It also adds little extra work, because snapshots already have a mapping table mechanism.
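As a rough sketch of the mapping-table approach, assume each mapping table is a dictionary from block index to an address record. The function name, the dictionary layout, and the rule that a private record takes precedence when a block appears in both tables are all illustrative assumptions.

```python
def diff_blocks_from_mapping_tables(private_map, shared_map):
    """Collect every block that has an address record in either table.

    Each recorded block is treated as difference data. The result maps
    block index -> address record, ordered by block index.
    """
    diff = {}
    # Shared records first, so that a private record for the same block
    # (an update written to the snapshot itself) replaces it. This
    # precedence is an assumption, not stated in the source.
    for table in (shared_map, private_map):
        for block_index, address in table.items():
            diff[block_index] = address
    return dict(sorted(diff.items()))
```

No block data is compared at all; the differences fall out of records the snapshot mechanism already maintains.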
With reference to the first aspect or either of the first and second implementations of the first aspect, in a fourth implementation of the first aspect, generating the bitmap of the valid data according to the difference data specifically includes:
Determining, according to the bitmap query command, the range of the data to be queried in the target snapshot; determining, according to the difference data, the difference data blocks that fall within that range; and generating the bitmap of the valid data according to those difference data blocks. In this implementation, the data corresponding to a difference data block is valid data.
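Bitmap generation for a queried range can be sketched as below, assuming the query command carries a start block and a block count and the difference data is available as a collection of block indexes. The parameter names and the one-bit-per-block list representation are assumptions.

```python
def build_valid_bitmap(diff_blocks, start, count):
    """Return one bit per block in [start, start + count).

    A bit is 1 when the block is a difference data block (valid data)
    and 0 when it matches the baseline snapshot (invalid data).
    """
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    diff = set(diff_blocks)
    return [1 if index in diff else 0 for index in range(start, start + count)]
```

For example, with difference blocks 1, 3, and 9, a query for blocks 0 to 3 yields `[0, 1, 0, 1]`.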
In a second aspect, this application provides a data filtering apparatus. The apparatus includes a source volume and a baseline snapshot of the source volume. The source volume provides data storage for a host connected to the storage device. The baseline snapshot is a snapshot taken before data is written to the source volume, and what the baseline snapshot records is invalid data. The apparatus includes:
A snapshot management module, configured to create a target snapshot for the source volume;
A snapshot comparison module, configured to compare the target snapshot with the baseline snapshot for differences to obtain a comparison result, where the comparison result records difference data;
The snapshot management module is further configured to receive a bitmap query command sent by the host, where the bitmap query command is used to query the bitmap of the valid data of the target snapshot;
The snapshot comparison module is further configured to generate the bitmap of the valid data according to the difference data and return the bitmap of the valid data to the host, where the bitmap of the valid data records the positions of the valid data of the target snapshot.
Because the bitmap of the valid data records the positions of the valid data of the target snapshot, the host can read the valid data in the target snapshot according to that bitmap. This avoids identifying and filtering the invalid data in the snapshot by adding marks, which reduces system resource overhead and improves the processing efficiency of storage resources.
With reference to the second aspect, in a first implementation of the second aspect, the snapshot comparison module being configured to compare the target snapshot with the baseline snapshot for differences specifically includes:
The snapshot comparison module is configured to compare the data blocks in the target snapshot with the data blocks in the baseline snapshot block by block; when a first data block in the target snapshot is identical to the first data block in the baseline snapshot, determine that the data corresponding to the first data block in the target snapshot is invalid data; and when the first data block in the target snapshot differs from the first data block in the baseline snapshot, determine that the data corresponding to the first data block in the target snapshot is difference data. This implementation has little impact on existing procedures, and the comparison result is accurate.
With reference to the second aspect, in a second implementation of the second aspect, the snapshot comparison module being configured to compare the target snapshot with the baseline snapshot for differences specifically includes:
The snapshot comparison module is configured to mark the data blocks in the target snapshot and in the baseline snapshot with the version numbers used by changed block tracking (CBT); when the CBT version number of a first data block in the target snapshot is the same as the CBT version number of the first data block in the baseline snapshot, determine that the data corresponding to the first data block in the target snapshot is invalid data; and when the CBT version number of the first data block in the target snapshot differs from the CBT version number of the first data block in the baseline snapshot, determine that the data corresponding to the first data block in the target snapshot is difference data. The lookup time of this implementation is very short: because the differences are already recorded in the CBT, only a small amount of time is needed to traverse the CBT table and extract them.
With reference to the second aspect, in a third implementation of the second aspect, the snapshot comparison module being configured to compare the target snapshot with the baseline snapshot for differences specifically includes:
The snapshot comparison module is configured to look up the private mapping table of the target snapshot and read the address records in the private mapping table; look up, in the data blocks of the target snapshot, the data corresponding to each address record, and determine that this data is difference data; look up the shared mapping table of the target snapshot and the baseline snapshot and read the address records in the shared mapping table; and look up, in the data blocks of the target snapshot, the data corresponding to each of these address records, and determine that this data is difference data. This implementation takes little time because it only needs to query existing mapping table records. It also adds little extra work, because snapshots already have a mapping table mechanism.
With reference to the second aspect or any of the first to third implementations of the second aspect, in a fourth implementation of the second aspect, the snapshot comparison module being configured to generate the bitmap of the valid data according to the difference data specifically includes:
The snapshot comparison module is configured to determine, according to the bitmap query command, the range of the data to be queried in the target snapshot; determine, according to the difference data, the difference data blocks that fall within that range; and generate the bitmap of the valid data according to those difference data blocks. In this implementation, the data corresponding to a difference data block is valid data.
In a third aspect, this application provides a storage device. The storage device includes a source volume and a baseline snapshot of the source volume. The source volume provides data storage for a host connected to the storage device. The baseline snapshot is a snapshot taken before data is written to the source volume, and what the baseline snapshot records is invalid data. The storage device includes a controller and a memory. The memory is used to store instructions, and the controller is used to execute the instructions to perform the data filtering method provided in the first aspect or any implementation of the first aspect.
In a fourth aspect, this application provides a data reading system, which includes a host and the storage device provided in the third aspect. The host and the storage device communicate over a communication network. The host is configured to receive the bitmap of the valid data sent by the storage device and to read the valid data in the target snapshot according to that bitmap.
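On the host side, reading by bitmap can be sketched as follows. The function name `read_valid_blocks` and the `read_block` callback (standing in for whatever block read the host issues to the storage device) are assumptions for illustration.

```python
def read_valid_blocks(bitmap, start, read_block):
    """Read only the blocks whose bit is 1.

    bitmap      -- one bit per block, beginning at block index `start`
    read_block  -- hypothetical callable that fetches one block by index
    Returns a dict mapping block index -> block data.
    """
    valid = {}
    for offset, bit in enumerate(bitmap):
        if bit:
            index = start + offset
            valid[index] = read_block(index)
    return valid
```

Blocks whose bit is 0 are never requested, which is where the bandwidth saving comes from.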
In a fifth aspect, this application provides a storage medium that stores program code. When the program code is run by a storage device, the data filtering method provided in the first aspect or any implementation of the first aspect is performed. The storage medium includes but is not limited to flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
In the technical solution provided by this application, a baseline snapshot is created for the source volume of the storage device before data is written to the source volume, and what the baseline snapshot records is invalid data. Snapshot comparison is then used to compare the target snapshot with the baseline snapshot for differences, and the resulting comparison result records the difference data. After a bitmap query command sent by the host is received, the bitmap of the valid data is generated according to the difference data recorded in the comparison result and is returned to the host. The bitmap of the valid data records the positions of the valid data of the target snapshot, so the host can read the valid data in the target snapshot according to that bitmap. This avoids identifying and filtering the invalid data in the snapshot by adding marks, which reduces system resource overhead and improves the processing efficiency of storage resources.
Brief description of the drawings
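The overall flow summarized above can be sketched end to end. This is a toy model under stated assumptions: the class `StorageDevice`, its method names, the 4-byte blocks, and the use of in-memory list copies as snapshots are all hypothetical, and block-by-block comparison is used as the comparison step.

```python
class StorageDevice:
    """Toy storage device with a baseline snapshot taken before any write."""

    def __init__(self, num_blocks, block_size=4):
        self.source = [bytes(block_size)] * num_blocks
        # Baseline snapshot: the source volume before data is written,
        # so everything it records is invalid (all-zero) data.
        self.baseline = list(self.source)
        self.diff = None

    def host_write(self, index, data):
        self.source[index] = data

    def create_target_snapshot(self):
        target = list(self.source)
        # Difference comparison against the baseline (block by block here).
        self.diff = {
            i for i, (t, b) in enumerate(zip(target, self.baseline)) if t != b
        }
        return target

    def query_bitmap(self, start, count):
        """Handle a bitmap query: 1 marks valid data, 0 marks invalid data."""
        if self.diff is None:
            raise RuntimeError("no target snapshot has been created")
        return [1 if i in self.diff else 0 for i in range(start, start + count)]
```

No per-block marks are maintained on the write path; the bitmap is derived from the comparison result only when the host asks for it.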
Fig. 1 is a schematic diagram of a COW snapshot structure provided in this application;
Fig. 2 is an architecture diagram of a data reading system provided in this application;
Fig. 3 is a schematic structural diagram of a storage device provided in this application;
Fig. 4 is a schematic flowchart of an embodiment of the data filtering method provided in this application;
Fig. 5 is a schematic diagram of a difference bitmap data structure provided in this application;
Fig. 6 is a schematic diagram of the data structure of a CBT table provided in this application;
Fig. 7 is a schematic diagram of CBT generation provided in this application;
Fig. 8 is a schematic diagram of a snapshot structure provided in this application;
Fig. 9 is a schematic structural diagram of an embodiment of the data filtering apparatus provided in this application;
Fig. 10 is a schematic structural diagram of another embodiment of the data filtering apparatus provided in this application;
Fig. 11 is a schematic flowchart of another embodiment of the data filtering method provided in this application.
Detailed description
The following clearly and completely describes the technical solutions in the embodiments of this application with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person skilled in the art based on the embodiments of this application without creative effort shall fall within the protection scope of this application.
It should be understood that although the terms first, second, and so on may be used in the embodiments of this application to describe forwarding ports or modules, the forwarding ports or modules should not be limited by these terms. The terms are used only to distinguish the forwarding ports or modules from one another. For example, without departing from the scope of the embodiments of this application, a first data block may also be called a second data block, and similarly a second data block may also be called a first data block; likewise, a second data block may also be called a third data block, and so on. There is no logical or temporal dependency among the first, second, and third data blocks.
The embodiments of this application use snapshot comparison to identify zero data. The main principle is as follows: invalid data does not change between the time the source volume is created and the time point at which the snapshot to be transmitted is generated. Snapshot comparison can therefore be applied: a baseline snapshot is generated when the source volume is first created, the snapshot to be transmitted is then compared with the baseline snapshot, and the data outside the resulting differences is invalid data. Invalid data can thus be filtered effectively by snapshot comparison.
Throughout this specification, a snapshot is also commonly called a virtual copy or a point-in-time logical copy. A snapshot is a fully usable, purely logical copy of a specified data set; the copy contains a frozen image of the source data at the point in time of the copy. Snapshots support generating a virtual consistent image of a source LUN at a point in time, so that a data copy consistent with the source LUN or file system is obtained quickly without interrupting normal services. The copy is available immediately after it is generated, and reads and writes to the copy no longer affect the source data. Backup is a typical application scenario of snapshot technology. In a backup scenario, snapshots can be generated periodically according to the user's backup policy and then transferred to backup media. Depending on how the placement of newly written data is handled, snapshots are divided into copy-on-write snapshots (Copy On Write Snapshot, COW; or Copy On First Write Snapshot, COFW) and redirect-on-write snapshots (Redirect On Write Snapshot, ROW; or Redirect On First Write Snapshot, ROFW). Taking a COW snapshot as an example, one possible implementation is shown in Fig. 1, which shows a snapshot generated at time point time0. A snapshot includes three data spaces, namely the source volume, the COW space, and the snapshot volume, and two mapping tables, namely the shared mapping table and the private mapping table. The source volume, also called the production volume, stores the user's production data and its updates. The COW space stores the data of the time points protected by all snapshots. The snapshot volume stores the updates written to the snapshot itself. The shared mapping table stores the mapping between the data in the COW space and each snapshot. The private mapping table stores the mapping between the data in the snapshot volume and the snapshot. Each row in the shared mapping table and the private mapping table corresponds to a snapshot time point.
When a snapshot has just been generated, no I/O has yet been written to the source volume or the snapshot volume, so all pointers of the snapshot point to the corresponding data blocks of the source volume. The COW space and the snapshot volume are empty, and the shared mapping table and the private mapping table are also empty.
If the user writes I/O to the source volume after the snapshot is generated, the data block in the source volume that is about to be overwritten is first copied to the COW space, and the shared mapping table is modified so that the snapshot's pointer to the corresponding data block (which originally pointed to the source volume by default) points to the COW space. The written I/O is then applied to the source volume.
If the user writes I/O to the snapshot after the snapshot is generated, there are the following cases: 1. The data block that the write I/O targets is in the source volume or the COW space. The shared data block (which may come from the COW space or the source volume) is first copied to the snapshot volume, the written I/O is then applied to the snapshot volume, and the address through which the snapshot points to the corresponding data block is changed to point to the snapshot volume and stored in the private mapping table. 2. The data block that the write I/O targets is in the snapshot volume. The write I/O is applied directly to the corresponding position in the snapshot volume.
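The COW write paths described above can be sketched with a toy single-snapshot model. The class name `CowSnapshot` is hypothetical, the dictionaries stand in for the COW space plus shared mapping table and for the snapshot volume plus private mapping table, and writes are assumed to cover whole blocks, so the copy of the shared block into the snapshot volume before a snapshot write is omitted.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over a list-based source volume."""

    def __init__(self, source):
        self.source = source   # live source volume (list of blocks)
        self.cow_space = {}    # block index -> preserved point-in-time data
        self.snap_volume = {}  # block index -> data written to the snapshot

    def write_source(self, index, data):
        # Preserve the original block before its first overwrite, then
        # apply the write to the source volume.
        if index not in self.cow_space:
            self.cow_space[index] = self.source[index]
        self.source[index] = data

    def write_snapshot(self, index, data):
        # Whole-block write: the block now lives in the snapshot volume.
        self.snap_volume[index] = data

    def read_snapshot(self, index):
        # Lookup order: snapshot volume, then COW space, then source volume.
        if index in self.snap_volume:
            return self.snap_volume[index]
        if index in self.cow_space:
            return self.cow_space[index]
        return self.source[index]
```

Reads of the snapshot keep returning the point-in-time data after the source volume is overwritten, while writes to the snapshot leave the source volume untouched.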
For a volume (a LUN or a file system), many scenarios require copying all of its data to a remote end. For example, in a backup scenario, the volume is initially backed up to another independent backup medium. In a synchronous or asynchronous remote replication scenario, the first data transfer requires an initial synchronization that transfers the entire primary volume to the remote site. In an active-active scenario, the first data synchronization requires an initial synchronization that transfers the entire primary volume to the remote active-active site. In cross-device cloning, the entire volume needs to be copied to the remote device. When these operations are performed, a snapshot of the volume is generally generated first to obtain a point-in-time copy of the volume, and this point-in-time copy is then transferred to the specified destination by initial synchronization. The data of this snapshot includes the valid I/O that the user had written when the snapshot was generated and the remaining all-zero data. Because this step is generally performed early in the life of the LUN, the user has probably written little I/O, so the snapshot contains a relatively large amount of zero data. This all-zero data occupies a large amount of bandwidth but has no practical significance for the service. If the all-zero data in the snapshot can be identified and filtered out before transmission, bandwidth is saved and transmission efficiency is improved without affecting normal service operation.
Because snapshots can be created quickly and occupy relatively little space, some scenarios, such as periodic backup or replication, generate snapshots periodically and transfer the snapshot data to the destination. Each snapshot contains all the difference data between the generation time point of the previous snapshot and the generation time point of the latest snapshot, as well as the data that did not change during this period. The unchanged data is identical to that of the previous snapshot. If only the difference data relative to the previous snapshot is transmitted each time a snapshot is transferred, bandwidth is saved and transmission efficiency is improved while normal service operation is preserved and no valid information is lost.
There are many ways to compare snapshots and obtain the differences between them; the common approaches fall into block-by-block snapshot comparison and tracking a difference record table. The calling interface for snapshot comparison can be backed by three common methods of obtaining the differences between snapshots: block-by-block snapshot comparison, changed block tracking (Changed Block Tracking, CBT) of snapshot differences, and snapshot mapping tables that record differences.
Snapshot difference acquisition extracts the data differences between two specified time points, which reduces the amount of backup data, improves backup efficiency, and shortens the backup window. It is a very important function in implementing incremental backup or replication.
System architecture to which the embodiments of this application apply
The configuration diagram for the data reading system that Fig. 2 is applied by the embodiment of the present application is set including host, storage Standby, which includes controller and storage medium.Wherein host includes host application, this is possible to issue to read snapshot life One of source of order.Controller includes the characteristic using snapshot, and snapshot reads interface, snapshot difference extraction module and data pipe Manage module.Wherein, the use of the characteristic of snapshot refer to that other such as replicate breeding property, these characteristics, which may issue, reads snapshot life It enables, snapshot, which reads interface, to be responsible for receiving the order of snapshot reading and be handed down to data management module, and data management module is responsible for The snapshot data in storage medium is read, is then returned it into.Snapshot difference extraction module is for reading data management module Snapshot data carry out being compared with ready baseline snapshot in advance before it returns to snapshot management module, and return Its difference.Storage medium includes snapshot data.
The storage device in Fig. 2 may be implemented by the storage device 200 in Fig. 3. As shown in Fig. 3, which is a schematic structural diagram of the storage device 200, the storage device 200 includes a controller 202 and a memory 204, and may further include a bus 208 and a communication interface 206.
The controller 202, the memory 204, and the communication interface 206 may be communicatively connected through the bus 208, or may communicate by other means such as wireless transmission.
The memory 204 may include volatile memory, such as random-access memory (RAM); it may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 204 may also include a combination of the foregoing kinds of memory. When the technical solution provided by this application is implemented in software, the program code for implementing the data filtering method shown in Fig. 4 is stored in the memory 204 and executed by the controller 202.
The storage device 200 communicates with the host through the communication interface 206.
The storage device 200 in this application includes a source volume and a baseline snapshot of the source volume. The source volume provides data storage for the host connected to the storage device. The baseline snapshot is a snapshot taken before data is written to the source volume, and what the baseline snapshot records is invalid data. The controller 202 in the storage device 200 is configured to create a target snapshot for the source volume. The target snapshot is another snapshot, different from the baseline snapshot; the word "target" is used here only for convenience of description and does not limit the function or use of the snapshot.
The controller 202 performs a difference comparison between the target snapshot and the baseline snapshot to obtain a comparison result, and the comparison result records the difference data. The controller 202 receives a bitmap query command. The command may come from the host or from the storage system, specifically from an application on the host side or from another feature inside the storage system. After receiving the bitmap query command, the controller 202 generates a bitmap of valid data according to the difference data and returns the bitmap of valid data to the host. Because the bitmap of valid data records the positions of the valid data in the target snapshot, the host can read the valid data in the target snapshot according to that bitmap. This avoids identifying and filtering the invalid data in the snapshot by adding marks, which reduces system resource overhead and improves the processing efficiency of storage resources.
Optionally, the controller 202 performs the difference comparison between the target snapshot and the baseline snapshot in one of the following three manners:
Manner 1, block-by-block snapshot comparison, specifically includes:
comparing the data blocks in the target snapshot with the data blocks in the baseline snapshot block by block;
when a first data block in the target snapshot is identical to the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; when the first data block in the target snapshot differs from the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is difference data.
Manner 2, tracking snapshot differences with CBT, specifically includes:
marking the data blocks in the target snapshot and in the baseline snapshot respectively with the version numbers used by changed block tracking (CBT);
when the CBT version number corresponding to a first data block in the target snapshot is the same as the CBT version number corresponding to the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; when the two CBT version numbers differ, determining that the data corresponding to the first data block in the target snapshot is difference data.
Manner 3, recording differences in snapshot mapping tables, specifically includes:
searching the private mapping table of the target snapshot and reading the address records in the private mapping table; searching the data blocks of the target snapshot, according to the address records, for the data corresponding to the address records, and determining that the data corresponding to the address records is difference data;
searching the shared mapping table of the target snapshot and the baseline snapshot and reading the address records in the shared mapping table; searching the data blocks of the target snapshot, according to the address records, for the data corresponding to the address records, and determining that the data corresponding to the address records is difference data.
Optionally, the controller 202 generating the bitmap of valid data according to the difference data specifically includes:
determining, according to the bitmap query command, the range of data to be queried in the target snapshot;
determining, according to the difference data, the difference data blocks corresponding to the range of data to be queried;
generating the bitmap of valid data according to the difference data blocks.
Optionally, the controller 202 is further configured to receive a valid data read command for the target snapshot sent by the host, read the valid data in the target snapshot according to the valid data read command, and return the valid data to the host.
Optionally, the controller 202 is further configured to create a difference bitmap, and the difference bitmap is used to store the comparison result.
This application also provides a data filtering method applied to a storage device. The storage device includes a source volume and a baseline snapshot of the source volume. The source volume provides data storage for the host connected to the storage device, the baseline snapshot is a snapshot taken before data is written to the source volume, and what the baseline snapshot records is invalid data. The controller in Fig. 2 and the storage device 200 in Fig. 3 execute this method when running; its flowchart is shown in Fig. 4.
401. Create a target snapshot for the source volume.
The target snapshot is another snapshot, different from the baseline snapshot; the word "target" is used here only for convenience of description and does not limit the function or use of the snapshot.
402. Perform a difference comparison between the target snapshot and the baseline snapshot to obtain a comparison result, where the comparison result records the difference data.
The difference comparison between the target snapshot and the baseline snapshot is generally performed by one of three methods: block-by-block snapshot comparison, tracking snapshot differences with CBT, and recording differences in snapshot mapping tables. The invocation interface for snapshot comparison under these three methods can be expressed by the following command:
QuerySnapShotDiff<SnapshotID1>,<SnapshotID2>,<chunkSize>
QuerySnapShotDiff is the command name and indicates a query for the differences between two snapshots. SnapshotID1 and SnapshotID2 identify the two snapshots to be compared. chunkSize specifies the granularity at which the snapshots are divided into blocks when the corresponding data blocks of the two snapshots are compared.
The command returns the following value:
<ChunkBitmap>
ChunkBitmap is a difference bitmap in which the value of each bit indicates whether the data blocks at the corresponding position are identical. An example data structure of the difference bitmap is shown in Fig. 5.
Optionally, the following step is also performed before step 403: creating a difference bitmap, where the difference bitmap is used to store the comparison result.
In the difference bitmap data structure shown in Fig. 5, each bit represents the comparison result of the data blocks at the corresponding position of the two compared snapshots. Without loss of generality, a value of 1 indicates that the contents of the data blocks at the corresponding position differ (difference data), and a value of 0 indicates that the contents are identical (identical data).
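As an illustration of the one-bit-per-block layout described for Fig. 5, the following is a minimal Python sketch. The class name DiffBitmap, its methods, and the packing of bits into bytes are assumptions made for this example; the application only specifies that each bit holds the comparison result for one block position.

```python
class DiffBitmap:
    """Hypothetical difference bitmap: 1 = blocks differ, 0 = blocks identical."""

    def __init__(self, num_chunks: int) -> None:
        self.num_chunks = num_chunks
        # One bit per chunk, packed eight to a byte; every bit starts at 0.
        self._bits = bytearray((num_chunks + 7) // 8)

    def set(self, index: int, differs: bool) -> None:
        """Record the comparison result for the block at position index."""
        if not 0 <= index < self.num_chunks:
            raise IndexError(index)
        byte, bit = divmod(index, 8)
        if differs:
            self._bits[byte] |= 1 << bit
        else:
            self._bits[byte] &= ~(1 << bit) & 0xFF

    def get(self, index: int) -> int:
        """Return 1 if the block at position index differs, otherwise 0."""
        if not 0 <= index < self.num_chunks:
            raise IndexError(index)
        byte, bit = divmod(index, 8)
        return (self._bits[byte] >> bit) & 1
```

Packing the bits keeps the structure small: a snapshot divided into one million blocks needs roughly 122 KiB of bitmap under this layout.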
The three comparison methods are described in detail below.
For the first comparison method, optionally, performing the difference comparison between the target snapshot and the baseline snapshot specifically includes:
comparing the data blocks in the target snapshot with the data blocks in the baseline snapshot block by block;
when a first data block in the target snapshot is identical to the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; when the first data block in the target snapshot differs from the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is difference data.
A specific implementation process may be as follows:
1. Create a difference bitmap. This step creates the difference bitmap shown in Fig. 5, which stores the snapshot difference comparison result to be returned. After creation, go to step 2.
2. Determine whether the end of the two snapshots has been reached, that is, whether the last data block of the two snapshots has already been compared. If so, the comparison is complete and the process goes to step 6; if not, data blocks remain to be compared and the process goes to step 3. The two snapshots are the target snapshot and the baseline snapshot described above.
3. Compare the two data blocks byte by byte or bit by bit and evaluate the result. If the two data blocks are identical, go to step 4; if they differ, go to step 5.
4. Record the comparison result for this position in the difference bitmap as 0, then go to step 2. This result indicates that the corresponding data blocks are identical.
5. Record the comparison result for this position in the difference bitmap as 1, then go to step 2. This result indicates that the corresponding data blocks differ.
6. Return the difference bitmap. Reaching step 6 means that the two snapshots have been fully compared, so the difference bitmap is returned as the result.
This method reads the data of the two snapshots and compares the data blocks one by one to obtain the positions of the differing data blocks. It has little impact on existing processes, and the comparison result is accurate.
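The six steps above can be sketched in Python as follows. This is an illustrative sketch, not the implementation of the application: the function name query_snapshot_diff mirrors the QuerySnapShotDiff command, snapshots are modeled as in-memory byte strings of equal length, and the bitmap is a plain list of 0/1 values. All of these are assumptions made for the example.

```python
from typing import List


def query_snapshot_diff(snap_a: bytes, snap_b: bytes, chunk_size: int) -> List[int]:
    """Compare two snapshots chunk by chunk and return a difference bitmap.

    Entry i is 1 when chunk i differs between the snapshots, 0 when identical.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(snap_a) != len(snap_b):
        raise ValueError("snapshots of one volume are assumed to have equal size")

    bitmap: List[int] = []                               # step 1: create the bitmap
    for offset in range(0, len(snap_a), chunk_size):     # step 2: until the last block
        block_a = snap_a[offset:offset + chunk_size]
        block_b = snap_b[offset:offset + chunk_size]
        # step 3: compare block contents; steps 4 and 5: record 0 or 1
        bitmap.append(0 if block_a == block_b else 1)
    return bitmap                                        # step 6: return the bitmap
```

With an all-zero baseline snapshot, every chunk that is flagged 1 is a chunk the host has actually written, which is how the comparison separates valid data from invalid data.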
For the second comparison method, optionally, performing the difference comparison between the target snapshot and the baseline snapshot specifically includes:
marking the data blocks in the target snapshot and in the baseline snapshot respectively with the version numbers used by changed block tracking (CBT);
when the CBT version number corresponding to a first data block in the target snapshot is the same as the CBT version number corresponding to the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; when the two CBT version numbers differ, determining that the data corresponding to the first data block in the target snapshot is difference data.
A specific implementation process may be as follows:
Tracking snapshot differences means tracking and recording each change as it occurs, so that the differences between two snapshots are obtained and recorded in a tracking table. The difference recording may be performed by the storage system or by an upper-layer application (such as virtual machine platform software). The typical approach is CBT, that is, changed block tracking. This approach uses a CBT table to track and record the changes to the data blocks of a production volume, marking the positions of the data blocks that have changed since the last snapshot. In this way, the differences between two snapshots can be extracted. The data structure of the CBT table is shown in Fig. 6. In that data structure, each int element is a version number consisting of a positive integer value, representing the version number of the data block at the corresponding position. After each new snapshot is generated, if the corresponding data block changes, its version number is increased to the corresponding snapshot sequence number.
The detailed operation is described with reference to Fig. 7. In the CBT generation diagram shown in Fig. 7, each column represents one snapshot time point. The first row, "snapshot data at this time point", is the data contained in the snapshot after the snapshot of that column is generated. The second row, "production volume data up to the next snapshot time", shows how the production volume changes, because production data is written, between the generation of the snapshot and the next snapshot time point. The third row, "difference between production volume and snapshot", is the difference record that the system generates by tracking the written data; it shows the difference between the current production volume and the most recent snapshot caused by the written production data. The fourth row, "CBT records up to the next snapshot time point", is the CBT record that the system generates from the differences tracked in the third row; this record changes in real time as differences are collected. The fifth row, "current CBT queried by the user", is the CBT record for the current volume as queried by the user. The sixth row, "CBT table when the snapshot at the current time point is generated", is the CBT record table that can be queried in the system when the snapshot at the current time point is generated. A copy of the CBT table for the corresponding time point can be made when a snapshot is generated.
CBT records in real time, in the CBT table, the data blocks of the production volume that have changed since the previous snapshot was generated. Version numbers in the CBT table start at 0 and are incremented by 1 each time a new snapshot is generated. Taking Fig. 7 as an example, the point-in-time data protected by snapshot 2 is "A, B, 0, 0". After snapshot 2 is generated, production data is written, so the production volume becomes "A1, B, C, D". That is, between the time point of snapshot 2 and the time point of snapshot 3, the content of one data block changes from A to A1, the value of another data block changes from 0 to C, and the value of a third data block changes from 0 to D. The CBT of snapshot 2 intercepts the written production data and records these changes in real time, so in the row "CBT records up to the next snapshot time point" for snapshot 2, the corresponding CBT entries become 2, the CBT version number corresponding to snapshot 2; the version numbers of the data blocks that have not changed remain unchanged.
To look up the CBT data between the latest snapshot and an earlier snapshot, the CBT table for the corresponding time point can be retained when each snapshot is generated, that is, the row "CBT table when the snapshot at the current time point is generated" in Fig. 7. From it, the data differences between that snapshot and the snapshot at any past time point can be calculated. Specifically, the CBT records whose value is greater than the version number of the past snapshot are found in the CBT table; they indicate that the corresponding data blocks differ between that past time point and the current time point. Taking Fig. 7 as an example, to find the data differences between snapshot 4 and snapshot 2, the CBT table in the column of snapshot 4 is located in the row "CBT table when the snapshot at the current time point is generated". That CBT table contains one "1", one "3", and two "2"s. The version number of snapshot 2 is 1, and since 3 > 1 and 2 > 1, the data blocks corresponding to the one "3" and the two "2"s have changed. The actual data changed from "A, B, 0, 0" in snapshot 2 to "A2, B, C, D", so the data blocks corresponding to the one "3" and the two "2"s did change, which matches the result.
The lookup time of this method is very short: because the differences are already recorded in the CBT, only a small amount of time is needed to traverse the CBT table and extract them.
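The version-number bookkeeping can be sketched in Python as below. The sketch is only one reading of the text, under these assumptions: versions start at 0, a write stamps the block with the version currently in force, and taking a snapshot saves a copy of the table and then increments the version. The names CbtTracker and cbt_diff are hypothetical, and the block order [3, 1, 2, 2] used in the usage note is inferred from the data "A2, B, C, D", since the text gives only the counts of each value.

```python
from typing import List


class CbtTracker:
    """Hypothetical changed-block tracker for one production volume."""

    def __init__(self, num_blocks: int) -> None:
        self.version = 0                  # version numbers start at 0
        self.table = [0] * num_blocks     # one version number per data block

    def record_write(self, block: int) -> None:
        # Stamp the block with the version in force since the last snapshot.
        self.table[block] = self.version

    def take_snapshot(self) -> List[int]:
        # Keep a copy of the table for this time point, then bump the version.
        saved = list(self.table)
        self.version += 1
        return saved


def cbt_diff(cbt_table: List[int], base_version: int) -> List[int]:
    """Return 1 for each block whose CBT entry is greater than base_version."""
    return [1 if entry > base_version else 0 for entry in cbt_table]
```

Replaying the Fig. 7 narrative with this sketch (A and B written before snapshot 2; blocks 0, 2 and 3 written before snapshot 3; block 0 written again before snapshot 4) yields a saved table of [3, 1, 2, 2] for snapshot 4, and cbt_diff against snapshot 2's version number 1 flags blocks 0, 2 and 3.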
For the third comparison method, optionally, performing the difference comparison between the target snapshot and the baseline snapshot specifically includes:
searching the private mapping table of the target snapshot and reading the address records in the private mapping table; searching the data blocks of the target snapshot, according to the address records, for the data corresponding to the address records, and determining that the data corresponding to the address records is difference data;
searching the shared mapping table of the target snapshot and the baseline snapshot and reading the address records in the shared mapping table; searching the data blocks of the target snapshot, according to the address records, for the data corresponding to the address records, and determining that the data corresponding to the address records is difference data.
A specific implementation process may be as follows:
In most snapshot implementations, the snapshot itself uses a data structure to record its differences from the source volume (for example, the shared and private mapping tables of copy-on-write (COW) snapshots). Therefore, by comparing the entries of the two snapshots in the shared mapping table and the mapping entries in their respective private mapping tables, the differences between the snapshots, and hence the changed data blocks, can be obtained. Taking COW snapshots as an example, the algorithm for extracting the differences is as follows:
1. Create a difference bitmap table and set all of its bits to 0. This step initializes the difference bitmap table; 0 indicates no difference between the two snapshots.
2. Query the respective private mapping tables of the two snapshots and take out the records in them one by one. These records are all modifications to the snapshots themselves and are therefore regarded as differences between the two snapshots. If neither private mapping table contains records, or the COW implementation has no private mapping tables, go directly to step 4.
3. According to the result of step 2, set the corresponding bits of the difference bitmap table to 1.
4. In the shared mapping table, query the records that are greater than or equal to the earlier snapshot time point and less than the later snapshot time point. These records all represent differences between the two snapshots.
5. According to the changes found in the shared mapping table in step 4, set the corresponding bits of the difference bitmap table to 1.
Fig. 8 is used as an example to illustrate how snapshot differences are obtained from the mapping tables.
In the snapshot structure diagram shown in Fig. 8, assume that a time0 snapshot is generated at time time0. It includes the time0 private snapshot volume and the time0 private mapping table, as well as the COW space, the shared mapping table, and the source volume. At time time1, after the time0 snapshot, a time1 snapshot is generated. The two snapshots share the source volume, the COW space, and the shared mapping table; each has its own private volume and private mapping table. At time time2, after the time0 and time1 snapshots, a time2 snapshot is generated. The three snapshots share the source volume, the COW space, and the shared mapping table; each has its own private volume and private mapping table.
Fig. 8 shows the changes that occur, after the three snapshots are generated, in the source volume, the COW space, the shared mapping table, the snapshot volumes of the three snapshots, and their respective private mapping tables. By looking up the corresponding shared mapping table and private mapping tables, the differences between snapshots can be found. Following the difference extraction algorithm described above, the differences between snapshot time2 and snapshot time0 in Fig. 8 are computed as follows:
First, the private mapping tables of the time2 and time0 snapshots are searched for difference records (if neither the time0 snapshot nor the time2 snapshot has been written to, the private mapping tables need not be queried). One record, "k0, G0", is found for time0, indicating that the time0 snapshot has been modified; this is a difference between the two snapshots, so the difference bitmap position corresponding to address G3 is set to 1. The shared mapping table is then queried, and three records are found that are greater than or equal to time0 and less than time2: "h1, G1", "h1', G1", and "h0, G0". These three records show that the data corresponding to G1 and G0 has been modified, so the difference bitmap positions corresponding to G1 and G0 are set to 1. In summary, the differences between the time2 snapshot and the time0 snapshot are "G0, G1, G3".
This method takes little time because only existing mapping table records need to be queried; it also adds little extra work, because existing snapshots already have a mapping table mechanism.
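The five-step extraction from mapping tables can be sketched in Python as follows. The table layouts are assumptions made for this example and are not taken from Fig. 8: a private mapping table is modeled as a dict keyed by block index, and the shared mapping table as a list of (snapshot time point, block index) records. The function name mapping_table_diff is hypothetical.

```python
from typing import Dict, List, Tuple


def mapping_table_diff(num_blocks: int,
                       private_a: Dict[int, str],
                       private_b: Dict[int, str],
                       shared: List[Tuple[int, int]],
                       t_small: int,
                       t_large: int) -> List[int]:
    """Build a difference bitmap for two COW snapshots from their mapping tables."""
    bitmap = [0] * num_blocks                 # step 1: all bits start at 0

    # Steps 2 and 3: every private record is a change to a snapshot itself.
    for table in (private_a, private_b):
        for block in table:
            bitmap[block] = 1

    # Steps 4 and 5: shared records in [t_small, t_large) are differences.
    for time_point, block in shared:
        if t_small <= time_point < t_large:
            bitmap[block] = 1
    return bitmap
```

For instance, one private record on block 3 plus shared records on blocks 1 and 0 dated within the window produce [1, 1, 0, 1], the same shape as the "G0, G1, G3" result of the worked example.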
403. Receive a bitmap query command sent by the host, where the bitmap query command is used to query the bitmap of valid data of the target snapshot.
In this step, an application or feature on the host sends the bitmap query command, which includes the ID of the target snapshot and the start and end of the range of data to be queried (for example, a query for the valid data of snapshot snap1 from address 0 to address 100). After receiving the bitmap query command, the storage device converts it into an internally recognizable instruction according to the corresponding ID, so that the command can be processed. The bitmap query command may come from an application on the host side, or from another feature inside the storage system.
404. Generate the bitmap of valid data according to the difference data, where the bitmap of valid data records the positions of the valid data of the target snapshot.
Optionally, generating the bitmap of valid data according to the difference data specifically includes:
determining, according to the bitmap query command, the range of data to be queried in the target snapshot;
determining, according to the difference data recorded in the comparison result, the difference data blocks corresponding to the range of data to be queried;
generating the bitmap of valid data according to the difference data blocks.
It should be noted that the storage device treats the difference data blocks within the range specified by the bitmap query command as valid data blocks. Specifically, for the range from the start address specified by the query command (address 0 in the example of step 403) to the end address (address 100 in the example of step 403), the storage device looks up, in the comparison result obtained in step 402, the entries corresponding to the data blocks within that address range. If the data corresponding to a data block is difference data, the data block is determined to be a difference data block and the corresponding bit of the bitmap of valid data is set to 1; otherwise, the corresponding bit is set to 0. In the resulting bitmap of valid data, a data block whose bit is 1 is a difference data block, and a difference data block is valid data.
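A minimal Python sketch of step 404 follows. It assumes that the comparison result is a list of 0/1 values indexed by block, that addresses are byte offsets, that the end address is inclusive, and that blocks have a fixed size chunk_size. None of these details is fixed by the text, and the function name valid_data_bitmap is hypothetical.

```python
from typing import List


def valid_data_bitmap(diff_bitmap: List[int], start_addr: int,
                      end_addr: int, chunk_size: int) -> List[int]:
    """Return the valid-data bits for the blocks covering [start_addr, end_addr]."""
    if chunk_size <= 0 or not 0 <= start_addr <= end_addr:
        raise ValueError("invalid range or chunk size")
    first = start_addr // chunk_size      # block holding the start address
    last = end_addr // chunk_size         # block holding the end address
    if last >= len(diff_bitmap):
        raise ValueError("range extends past the compared snapshot")
    # A difference data block within the queried range is a valid data block.
    return [1 if diff_bitmap[i] == 1 else 0 for i in range(first, last + 1)]
```

With the example range of address 0 to address 100 and 32-byte blocks, the query covers blocks 0 through 3, so only the first four bits of the comparison result are returned.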
405. Return the bitmap of valid data to the host.
The storage device returns the bitmap of valid data to the host, and after receiving it the host reads the valid data in the target snapshot according to the bitmap. Specifically, after receiving the bitmap of valid data returned by the storage device, the host checks the data bits of the bitmap one by one. If a bit is 1, the corresponding data is valid data, and the host issues a valid data read command to the storage device to read the valid data from the snapshot; if the bit is 0, the corresponding data is invalid data, and the host skips that bit and continues with the next one.
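The host-side loop can be sketched in Python as below. The callback read_block stands in for the valid data read command issued to the storage device; it and the function name read_valid_blocks are assumptions made for this example.

```python
from typing import Callable, Dict, List


def read_valid_blocks(valid_bitmap: List[int],
                      read_block: Callable[[int], bytes]) -> Dict[int, bytes]:
    """Read only the blocks whose bit is 1 and skip the blocks whose bit is 0."""
    data: Dict[int, bytes] = {}
    for index, bit in enumerate(valid_bitmap):
        if bit == 1:
            # Valid data: issue a read for this block.
            data[index] = read_block(index)
        # bit == 0: invalid data, move on to the next bit without reading.
    return data
```

Because reads are issued only for bits set to 1, the number of read commands equals the number of valid blocks rather than the size of the snapshot.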
Optionally, in the process of reading the valid data, the method further includes:
receiving a valid data read command for the target snapshot sent by the host;
reading the valid data in the target snapshot according to the valid data read command, and returning the read valid data to the host.
In the embodiments of this application, a baseline snapshot is created for the source volume before data is written to the source volume created on the storage device, and what the baseline snapshot records is invalid data. A snapshot comparison method is then used to perform a difference comparison between the target snapshot and the baseline snapshot, and the resulting comparison result records the difference data. After a bitmap query command sent by the host is received, the bitmap of valid data is generated according to the difference data recorded in the comparison result and returned to the host; the bitmap of valid data records the positions of the valid data of the target snapshot. This avoids identifying and filtering the invalid data in the snapshot by adding marks, which reduces system resource overhead and improves the processing efficiency of storage resources. In addition, this application does not require a byte-by-byte check periodically or on every write, which reduces write steps and system resource overhead, and it is applicable to both thin LUNs and thick LUNs.
The embodiments of this application also provide a data filtering apparatus 600. The apparatus may be implemented by the storage device 200 shown in Fig. 3, by an application-specific integrated circuit (ASIC), or by a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The data filtering apparatus 600 is configured to implement the data filtering method shown in Fig. 4. When the data filtering method shown in Fig. 4 is implemented in software, the data filtering apparatus 600 may also be a software module.
A schematic structural diagram of the data filtering apparatus 600 is shown in Fig. 9. The apparatus includes a snapshot management module 602 and a snapshot comparison module 604. When working, the snapshot management module 602 performs steps 401 and 403 of the data filtering method shown in Fig. 4; when working, the snapshot comparison module 604 performs steps 402, 404, and 405 of the data filtering method shown in Fig. 4.
With the data filtering apparatus provided by the embodiments of this application, a baseline snapshot can be created for the source volume before data is written to the source volume created on the storage device, and what the baseline snapshot records is invalid data. A snapshot comparison method is then used to perform a difference comparison between the target snapshot and the baseline snapshot, and the resulting comparison result records the difference data. After a bitmap query command sent by the host is received, the bitmap of valid data is generated according to the difference data recorded in the comparison result and returned to the host; the bitmap of valid data records the positions of the valid data of the target snapshot. This avoids identifying and filtering the invalid data in the snapshot by adding marks, which reduces system resource overhead and improves the processing efficiency of storage resources.
Optionally, as shown in Fig. 10, the data filtering apparatus 600 further includes a data management module 606. When working, the data management module 606 performs the following steps: receiving a valid data read command for the target snapshot sent by the host; reading the valid data in the target snapshot according to the valid data read command, and returning the read valid data to the host.
For the description of the above apparatus, refer to the corresponding description and effects in the method embodiment; details are not repeated here.
Based on the data filtering method of Fig. 4 and the data filtering apparatus shown in Fig. 10, this application provides an example of the method. Figure 11 is a flow chart of this example of the data filtering method, and the specific steps are as follows:
1. Send a volume creation command. An application or feature on the host sends the command to create a volume. This step is prior art.
2. Create a baseline snapshot immediately after the volume is created. This snapshot is created together with the volume. Because the volume is empty when the snapshot is created, the snapshot contains all-zero data. After receiving the volume creation command, the snapshot management module first creates the volume and then, before any data is written to the snapshot or the source volume, immediately creates a baseline snapshot. This snapshot is invisible to the user.
3. Return the result of the volume creation command. After the volume and the baseline snapshot have been created, the result of the volume creation command is returned.
4. Send a snapshot creation command. Point-in-time snapshots that will later need to be copied or transmitted are created for the target volume.
5. Return the snapshot creation result. The snapshot management module creates the snapshot after receiving the command and then returns the result to the originator of the command.
6. Determine the target snapshot and perform a difference comparison between the target snapshot and the baseline snapshot. The snapshot comparison module selects a target snapshot from the created point-in-time snapshots, compares it with the baseline snapshot, and stores the comparison result in a difference bitmap.
7. Send a bitmap query command; this step comprises 7.1 to 7.2. An application or feature sends a snapshot bitmap query command. The command may be sent to the snapshot management module, which converts it into an internally recognizable instruction according to the corresponding ID and forwards it to the snapshot comparison module. The command may come from an application on the host side or from another feature inside the storage system.
8. Generate the bitmap of valid data. After receiving the bitmap query command, the snapshot comparison module determines, from the difference data recorded in the comparison result, the difference data blocks that correspond to the range of data to be queried, and generates the bitmap of valid data from those difference data blocks.
9. Return the bitmap of valid data; this step comprises 9.1 to 9.2. The snapshot comparison module returns the generated bitmap of valid data to the application or feature on the host side or in the storage system.
10. Send a valid-data read command for the target snapshot; this step comprises 10.1 to 10.2. Based on the returned bitmap of valid data, the application or feature checks the bits of the bitmap one by one and issues valid-data read commands to the data management module to read the valid data in the target snapshot.
11. Return the valid data that was read; this step comprises 11.1 to 11.2. The data management module returns the valid data after reading it.
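Steps 6 to 8 can be illustrated with a minimal sketch. It assumes, purely for illustration, that both snapshots are available as in-memory byte strings of equal length and that a block is 4 bytes; the names `diff_bitmap`, `valid_data_bitmap`, and `BLOCK_SIZE` are hypothetical and do not appear in the patent.

```python
from typing import List

BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for illustration


def diff_bitmap(target: bytes, baseline: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Step 6: compare the snapshots block by block; 1 marks a block that differs."""
    if len(target) != len(baseline):
        raise ValueError("snapshots must cover the same address range")
    bits = []
    for off in range(0, len(target), block_size):
        same = target[off:off + block_size] == baseline[off:off + block_size]
        bits.append(0 if same else 1)
    return bits


def valid_data_bitmap(diff: List[int], first_block: int, block_count: int) -> List[int]:
    """Steps 7-8: restrict the recorded differences to the queried block range."""
    return diff[first_block:first_block + block_count]


baseline = bytes(16)            # baseline snapshot of an empty volume: all zeros
target = bytearray(16)
target[4:8] = b"data"           # block 1 written after the baseline was taken
target[12:16] = b"more"         # block 3 written after the baseline was taken
diff = diff_bitmap(bytes(target), baseline)
print(diff)                            # [0, 1, 0, 1]
print(valid_data_bitmap(diff, 1, 2))   # [1, 0]
```

In this byte-level sketch, a block that the host explicitly overwrote with zeros would compare equal to the all-zero baseline and be reported as invalid; whether that matters depends on the comparison variant actually used by the storage device.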
With the data reading method provided by the embodiments of this application, a baseline snapshot can be created for the source volume before any data is written to the source volume created on the storage device; the baseline snapshot records invalid data. A snapshot comparison method is then used to perform a difference comparison between the target snapshot and the baseline snapshot, and the difference data is recorded in the resulting comparison result. After a bitmap query command sent by the host is received, the bitmap of valid data is generated from the difference data recorded in the comparison result and returned to the host; the bitmap of valid data records the positions of the valid data in the target snapshot. This avoids identifying and filtering the invalid data in a snapshot by adding marks, which in turn reduces system resource overhead and improves the processing efficiency of storage resources.
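The comparison itself does not have to read block contents. One variant recited in claim 3 compares per-block changed block tracking (CBT) version numbers instead. The following is a rough sketch of that idea, assuming that version numbers are kept in plain dictionaries keyed by block number and that an unwritten block has version 0; both assumptions, and the name `diff_by_cbt_version`, are illustrative only.

```python
from typing import Dict, List


def diff_by_cbt_version(target_versions: Dict[int, int],
                        baseline_versions: Dict[int, int],
                        block_count: int) -> List[int]:
    """Return 1 for blocks whose CBT version number differs from the baseline, else 0."""
    bits = []
    for block_no in range(block_count):
        # Assumption: a block never touched by a write keeps version 0.
        target_v = target_versions.get(block_no, 0)
        baseline_v = baseline_versions.get(block_no, 0)
        bits.append(0 if target_v == baseline_v else 1)
    return bits


baseline_versions = {}             # empty volume: no block has been written
target_versions = {1: 3, 3: 5}     # blocks 1 and 3 were written later
print(diff_by_cbt_version(target_versions, baseline_versions, 4))  # [0, 1, 0, 1]
```

Comparing version numbers avoids reading the data blocks of either snapshot, at the cost of maintaining the per-block version metadata.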
This application also provides a data reading system whose structure is shown schematically in Fig. 2. The system includes the storage device 200 described in Fig. 3 and a host, and the host and the storage device 200 communicate over a communication network. The host is configured to receive the bitmap of valid data sent by the storage device and to read the valid data in the target snapshot according to that bitmap.
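On the host side, reading according to the bitmap amounts to checking each bit and issuing a read only for blocks marked valid. A minimal sketch follows, assuming a bit value of 1 means valid and using a hypothetical `read_block` callback as a stand-in for the read command sent to the storage device; neither name comes from the patent.

```python
from typing import Callable, Dict, List


def read_valid_blocks(bitmap: List[int], first_block: int,
                      read_block: Callable[[int], bytes]) -> Dict[int, bytes]:
    """Check the bitmap bit by bit and read only blocks marked valid (bit == 1)."""
    result = {}
    for i, bit in enumerate(bitmap):
        if bit:
            block_no = first_block + i
            result[block_no] = read_block(block_no)
    return result


# Hypothetical stand-in for the storage device's data management module.
snapshot_blocks = [b"\x00" * 4, b"data", b"\x00" * 4, b"more"]
reads = []


def fake_read(block_no: int) -> bytes:
    reads.append(block_no)  # record which blocks were actually requested
    return snapshot_blocks[block_no]


print(read_valid_blocks([0, 1, 0, 1], 0, fake_read))  # {1: b'data', 3: b'more'}
print(reads)                                          # [1, 3]
```

Only blocks 1 and 3 are requested, so the all-zero blocks are never transferred between the storage device and the host.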
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, apparatus, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (12)

1. A data filtering method, applied to a storage device, characterized in that the storage device comprises a source volume and a baseline snapshot of the source volume, the source volume is used to provide data storage for a host connected to the storage device, the baseline snapshot is a snapshot taken before data is written to the source volume, and the baseline snapshot records invalid data, the method comprising:
creating a target snapshot for the source volume;
performing a difference comparison between the target snapshot and the baseline snapshot to obtain a comparison result, the comparison result being used to record difference data;
receiving a bitmap query command sent by the host, the bitmap query command being used to query the bitmap of valid data of the target snapshot;
generating the bitmap of valid data according to the difference data, the bitmap of valid data recording the positions of the valid data of the target snapshot; and
returning the bitmap of valid data to the host.
2. The method according to claim 1, characterized in that performing the difference comparison between the target snapshot and the baseline snapshot specifically comprises:
comparing the data blocks in the target snapshot with the data blocks in the baseline snapshot block by block; and
when a first data block in the target snapshot is identical to a first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; and when the first data block in the target snapshot differs from the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is difference data.
3. The method according to claim 1, characterized in that performing the difference comparison between the target snapshot and the baseline snapshot specifically comprises:
marking the data blocks in the target snapshot and in the baseline snapshot, respectively, with the version numbers used by changed block tracking (CBT); and
when the CBT version number corresponding to a first data block in the target snapshot is identical to the CBT version number corresponding to a first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is invalid data; and when the CBT version number corresponding to the first data block in the target snapshot differs from the CBT version number corresponding to the first data block in the baseline snapshot, determining that the data corresponding to the first data block in the target snapshot is difference data.
4. The method according to claim 1, characterized in that performing the difference comparison between the target snapshot and the baseline snapshot specifically comprises:
looking up the private mapping table of the target snapshot and reading an address record in the private mapping table; looking up, according to the address record, the data corresponding to the address record in the data blocks of the target snapshot, and determining that the data corresponding to the address record is difference data; and
looking up the mapping table shared by the target snapshot and the baseline snapshot and reading an address record in the shared mapping table; looking up, according to the address record, the data corresponding to the address record in the data blocks of the target snapshot, and determining that the data corresponding to the address record is difference data.
5. The method according to any one of claims 1 to 4, characterized in that generating the bitmap of valid data according to the difference data specifically comprises:
determining, according to the bitmap query command, the range of data to be queried in the target snapshot;
determining, according to the difference data, the difference data blocks corresponding to the range of data to be queried; and
generating the bitmap of valid data according to the difference data blocks.
6. A data filtering apparatus, characterized in that the apparatus comprises a source volume and a baseline snapshot of the source volume, the source volume is used to provide data storage for a host connected to a storage device, the baseline snapshot is a snapshot taken before data is written to the source volume, and the baseline snapshot records invalid data, the apparatus comprising:
a snapshot management module, configured to create a target snapshot for the source volume; and
a snapshot comparison module, configured to perform a difference comparison between the target snapshot and the baseline snapshot to obtain a comparison result, the comparison result being used to record difference data;
wherein the snapshot management module is further configured to receive a bitmap query command sent by the host, the bitmap query command being used to query the bitmap of valid data of the target snapshot; and
the snapshot comparison module is further configured to generate the bitmap of valid data according to the difference data and to return the bitmap of valid data to the host, the bitmap of valid data recording the positions of the valid data of the target snapshot.
7. The apparatus according to claim 6, characterized in that the snapshot comparison module being configured to perform the difference comparison between the target snapshot and the baseline snapshot specifically comprises:
the snapshot comparison module being configured to compare the data blocks in the target snapshot with the data blocks in the baseline snapshot block by block; and
when a first data block in the target snapshot is identical to a first data block in the baseline snapshot, to determine that the data corresponding to the first data block in the target snapshot is invalid data; and when the first data block in the target snapshot differs from the first data block in the baseline snapshot, to determine that the data corresponding to the first data block in the target snapshot is difference data.
8. The apparatus according to claim 6, characterized in that the snapshot comparison module being configured to perform the difference comparison between the target snapshot and the baseline snapshot specifically comprises:
the snapshot comparison module being configured to mark the data blocks in the target snapshot and in the baseline snapshot, respectively, with the version numbers used by changed block tracking (CBT); and
when the CBT version number corresponding to a first data block in the target snapshot is identical to the CBT version number corresponding to a first data block in the baseline snapshot, to determine that the data corresponding to the first data block in the target snapshot is invalid data; and when the CBT version number corresponding to the first data block in the target snapshot differs from the CBT version number corresponding to the first data block in the baseline snapshot, to determine that the data corresponding to the first data block in the target snapshot is difference data.
9. The apparatus according to claim 6, characterized in that the snapshot comparison module being configured to perform the difference comparison between the target snapshot and the baseline snapshot specifically comprises:
the snapshot comparison module being configured to look up the private mapping table of the target snapshot and read an address record in the private mapping table; to look up, according to the address record, the data corresponding to the address record in the data blocks of the target snapshot, and to determine that the data corresponding to the address record is difference data; and
to look up the mapping table shared by the target snapshot and the baseline snapshot and read an address record in the shared mapping table; to look up, according to the address record, the data corresponding to the address record in the data blocks of the target snapshot, and to determine that the data corresponding to the address record is difference data.
10. The apparatus according to any one of claims 6 to 9, characterized in that the snapshot comparison module being configured to generate the bitmap of valid data according to the difference data specifically comprises:
the snapshot comparison module being configured to determine, according to the bitmap query command, the range of data to be queried in the target snapshot; to determine, according to the difference data, the difference data blocks corresponding to the range of data to be queried; and to generate the bitmap of valid data according to the difference data blocks.
11. A storage device, characterized in that the storage device comprises a source volume and a baseline snapshot of the source volume, the source volume is used to provide data storage for a host connected to the storage device, the baseline snapshot is a snapshot taken before data is written to the source volume, and the baseline snapshot records invalid data; the storage device comprises a controller and a memory, the memory is configured to store instructions, the controller is configured to execute the instructions, and the instructions, when executed by the controller, cause the storage device to perform the method according to any one of claims 1 to 5.
12. A data reading system, characterized in that it comprises a host and the storage device according to claim 11, the host and the storage device communicating over a communication network; the host is configured to receive the bitmap of valid data sent by the storage device and to read the valid data in the target snapshot according to the bitmap of valid data.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610200260.2A CN105938457B (en) 2016-03-31 2016-03-31 Filter method, device and the data reading system of data

Publications (2)

Publication Number Publication Date
CN105938457A CN105938457A (en) 2016-09-14
CN105938457B true CN105938457B (en) 2018-10-02

Family

ID=57152002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610200260.2A Active CN105938457B (en) 2016-03-31 2016-03-31 Filter method, device and the data reading system of data

Country Status (1)

Country Link
CN (1) CN105938457B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018119998A1 (en) * 2016-12-30 2018-07-05 华为技术有限公司 Snapshot rollback method, apparatus, storage controller, and system
CN106909514B (en) * 2017-03-01 2021-04-30 郑州云海信息技术有限公司 Method and device for positioning snapshot disk address
CN109753228B (en) * 2017-11-08 2022-08-02 阿里巴巴集团控股有限公司 Snapshot deleting method, device and system
CN108733513A (en) * 2018-05-07 2018-11-02 杭州宏杉科技股份有限公司 A kind of data-updating method and device
CN108664593A (en) * 2018-05-08 2018-10-16 东软集团股份有限公司 Data consistency verification method, device, storage medium and electronic equipment
CN109165120B (en) * 2018-08-08 2022-04-05 华为技术有限公司 Method and product for generating management snapshot and difference bitmap in distributed storage system
CN110858248B (en) * 2018-12-13 2023-03-24 安天科技集团股份有限公司 Information security detection method and device for printing equipment
CN111611110A (en) * 2020-06-30 2020-09-01 上海爱数信息技术股份有限公司 Difference recovery method and device based on fusion computer platform
CN112000279A (en) * 2020-07-29 2020-11-27 北京浪潮数据技术有限公司 Data volume synchronization method, device and medium
CN112230846B (en) * 2020-09-18 2023-01-06 苏州浪潮智能科技有限公司 Method and system for accelerating synchronization of snapshot data
EP4258097A1 (en) * 2022-04-07 2023-10-11 Samsung Electronics Co., Ltd. Operation method of host device and operation method of storage device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7373366B1 (en) * 2005-06-10 2008-05-13 American Megatrends, Inc. Method, system, apparatus, and computer-readable medium for taking and managing snapshots of a storage volume
CN101840362A (en) * 2009-10-28 2010-09-22 创新科存储技术有限公司 Method and device for achieving copy-on-write snapshot
CN102411524A (en) * 2011-12-30 2012-04-11 云海创想信息技术(天津)有限公司 Snapshot volume data copying method
CN103064763A (en) * 2012-12-27 2013-04-24 华为技术有限公司 Data backup method and related device and system
CN103415842A (en) * 2010-11-16 2013-11-27 阿克蒂菲奥股份有限公司 Systems and methods for data management virtualization

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4385215B2 (en) * 2003-10-21 2009-12-16 日本電気株式会社 Disk array device having snapshot simulation function

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant