CN103577565B - Method and apparatus for exporting files to tape
- Publication number: CN103577565B (application CN201310513281.6A)
- Authority: CN (China)
- Prior art keywords: stub file, tape, data block, subset, file
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1748—De-duplication implemented within the file system, e.g. based on file segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
Abstract
The invention discloses a method and apparatus for exporting files to tape. The method exports at least two original files stored as deduplicated data to tape, where the deduplicated data comprises a stub file set, a single-instance repository, and a fingerprint database. The method includes: selecting from the stub file set at least one stub file still to be exported to tape to form a stub file subset; exporting the stub file subset and the corresponding sub-single-instance repository to one tape; and repeating the two preceding steps while the stub file set still contains stub files not yet exported to tape. The method keeps the original files in deduplicated form, so the benefit of deduplication is preserved and tape space is saved. Because each deduplication domain is kept within a single tape, original files can be recovered and read quickly.
Description
Technical field
The present invention relates to the field of data storage, and in particular to a method and apparatus for exporting files to tape.
Background
Data deduplication divides a file into data blocks, computes a fingerprint for each data block, and compares it with the fingerprints already stored. If the fingerprint already exists, the data block already exists and need not be stored again; only the block's reference count is incremented by one to record that the block has been referenced once more. If the fingerprint does not exist, the data block is unique, and both the fingerprint and the corresponding data block are stored.
After deduplication, the storage system typically holds deduplicated data made up of three parts. The first is the single-instance repository (SIR), which stores the data blocks. The second is the fingerprint database, which stores every fingerprint together with the reference count of the data block it identifies. The third is the stub files, each of which records the fingerprints of the data blocks into which a file was divided and the location of the data block corresponding to each fingerprint.
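The three parts described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `DedupStore`, the tiny fixed block size, and the use of SHA-256 as the fingerprint function are not taken from the patent, which does not prescribe a chunking or hashing scheme.

```python
import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # assumed tiny fixed block size, for illustration only


@dataclass
class DedupStore:
    # Single-instance repository: each unique data block is stored once.
    sir: list = field(default_factory=list)
    # Fingerprint database: fingerprint -> [position in SIR, reference count].
    fingerprints: dict = field(default_factory=dict)
    # Stub files: file name -> list of (fingerprint, position in SIR).
    stubs: dict = field(default_factory=dict)

    def ingest(self, name: str, data: bytes) -> None:
        stub = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp in self.fingerprints:
                # Block already stored: only bump its reference count.
                self.fingerprints[fp][1] += 1
            else:
                # New unique block: store the block and its fingerprint.
                self.sir.append(block)
                self.fingerprints[fp] = [len(self.sir) - 1, 1]
            stub.append((fp, self.fingerprints[fp][0]))
        self.stubs[name] = stub

    def restore(self, name: str) -> bytes:
        # Rebuild the original file from the positions recorded in its stub.
        return b"".join(self.sir[pos] for _, pos in self.stubs[name])


store = DedupStore()
store.ingest("a", b"AAAABBBB")
store.ingest("b", b"BBBBCCCC")  # the block b"BBBB" is shared with file "a"
```

In this sketch the two files together contain four blocks, but only three are kept in the repository, and each stub file is enough to rebuild its original file.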
Deduplication greatly reduces the resources and space needed to store files. For long-term archiving, however, files usually also need to be saved to tape media. In one prior-art method of exporting files stored as deduplicated data to tape, the stub files are first restored to the original files, which are then backed up to tape; that is, the deduplicated data is restored to its original form as it is exported. Clearly, this method gives up the benefits of deduplication and consumes a large amount of storage space and maintenance resources.
Another prior-art method exports the deduplicated data directly to tape. Although this preserves the benefits of deduplication, it ignores the physical characteristics of tape, so recovering original files from tape becomes very slow and inefficient. Specifically, because tape is a sequential-access medium, high performance requires reading and writing data sequentially as far as possible and avoiding rewind operations. In addition, a tape library usually has only a limited number of drives, so not all tapes can be online at the same time. To recover an original file from a tape or tape library, the data blocks referenced by the file must be read from tape in addition to its stub file. These data blocks may be spread over several tapes, so recovering one original file can involve loading, unloading, reading, and rewinding several tapes, and the time and resources consumed become unacceptable.
Summary of the invention
Technical problem
In view of this, the technical problem to be solved by the present invention is how to export files that are backed up on a storage server in deduplicated form to a tape system while ensuring that the files can be recovered quickly from the tape system.
Solution
To solve the above technical problem, one embodiment of the present invention provides a method for exporting files to tape, used to export at least two original files stored as deduplicated data to tape. The deduplicated data comprises a stub file set, a single-instance repository, and a fingerprint database. The single-instance repository contains all the unique data blocks divided from the original files. The stub file set contains at least two stub files, each corresponding to one of the original files. Each stub file contains at least one fingerprint data entry, which comprises a fingerprint and location information: the fingerprint identifies a data block divided from the original file corresponding to the stub file, and the location information gives the position in the single-instance repository of the data block corresponding to the fingerprint data entry. The fingerprint database contains each fingerprint and its reference count, where the reference count of a fingerprint is the number of stub files that reference the data block identified by that fingerprint.
The method for exporting files to tape includes: selecting from the stub file set at least one stub file still to be exported to tape to form a stub file subset, where the total data size of the stub file subset and of the sub-single-instance repository corresponding to it does not exceed the capacity of one tape, and the sub-single-instance repository contains all the unique data blocks referenced by the stub files in the stub file subset; exporting the stub file subset and the sub-single-instance repository to one tape; and, while the stub file set still contains stub files not yet exported to tape, repeating the two preceding steps until all stub files in the stub file set have been exported to tape.
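The select, export, and repeat steps above can be sketched as a loop. This is an illustrative sketch under stated assumptions, not the patented implementation: stub files are modelled as lists of fingerprints, `ENTRY_SIZE` is an assumed size for one fingerprint entry, and the selection shown is a simple greedy pass in saved order, which is only one way to satisfy the capacity constraint.

```python
ENTRY_SIZE = 1  # assumed size of one fingerprint entry in a stub file


def footprint(names, stubs, block_size):
    """Size of the given stub files plus the unique blocks they reference."""
    stub_bytes = sum(len(stubs[n]) * ENTRY_SIZE for n in names)
    unique = {fp for n in names for fp in stubs[n]}
    return stub_bytes + sum(block_size[fp] for fp in unique)


def export_all(stubs, block_size, tape_capacity):
    pending = list(stubs)  # stub files not yet exported, in saved order
    tapes = []
    while pending:  # repeat while any stub file is still unexported
        subset = []
        for name in pending:  # greedy selection (an assumption of this sketch)
            if footprint(subset + [name], stubs, block_size) <= tape_capacity:
                subset.append(name)
        if not subset:
            raise ValueError(f"{pending[0]} does not fit on a single tape")
        # Sub-single-instance repository: unique blocks used by this subset.
        sub_sir = sorted({fp for n in subset for fp in stubs[n]})
        tapes.append({"stubs": subset, "sub_sir": sub_sir})
        pending = [n for n in pending if n not in subset]
    return tapes


stubs = {"f1": ["a", "b"], "f2": ["b", "c"], "f3": ["d"]}
block_size = {"a": 10, "b": 10, "c": 10, "d": 10}
tapes = export_all(stubs, block_size, tape_capacity=35)
```

Note that a block shared by stub files on different tapes would be written to each of those tapes in this sketch, which is the price of keeping every tape self-contained.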
In one possible implementation of the above method, exporting the stub file subset and the sub-single-instance repository to one tape includes: modifying the fingerprint data in the stub files of the stub file subset so that the location information in the modified fingerprint data gives the position of the corresponding data block in the sub-single-instance repository; exporting the modified stub file subset to the tape; and exporting the sub-single-instance repository to the tape.
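The modification of the location information can be sketched as follows. The function name `remap_for_tape` and the representation of a fingerprint data entry as a `(fingerprint, position)` tuple are assumptions made for illustration; positions in the sub-single-instance repository are assigned here in first-use order, which the patent does not prescribe.

```python
def remap_for_tape(subset):
    """Rewrite positions so they refer to the sub-single-instance repository.

    subset: stub file name -> list of (fingerprint, position in the full SIR).
    Returns the rewritten stub files and the order of blocks in the sub-SIR.
    """
    new_position = {}  # fingerprint -> position in the sub-SIR
    sub_sir_order = []  # fingerprints in the order their blocks are written
    remapped = {}
    for name, entries in subset.items():
        rewritten = []
        for fp, _old_position in entries:
            if fp not in new_position:
                new_position[fp] = len(sub_sir_order)
                sub_sir_order.append(fp)
            rewritten.append((fp, new_position[fp]))
        remapped[name] = rewritten
    return remapped, sub_sir_order


subset = {
    "f1": [("a", 7), ("b", 9)],
    "f2": [("b", 9), ("c", 42)],
}
remapped, sub_sir_order = remap_for_tape(subset)
```

After the rewrite, every position is valid on the tape alone, so the stub files no longer depend on the layout of the full repository on the storage server.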
In one possible implementation of the above method, before the fingerprint data in the stub files of the stub file subset is modified, the method further includes: determining a first data block in the sub-single-instance repository, where the first data block is a data block referenced by only one stub file; replacing the fingerprint data corresponding to the first data block, in the stub file of the stub file subset, with the first data block itself; and removing the first data block from the sub-single-instance repository.
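Restoring unshared blocks into their stub files can be sketched as below. The data model is assumed for illustration: a stub file is a list of fingerprints (strings), and an inlined entry is represented by the block's bytes. The sketch also treats a block as unshared only when it is referenced exactly once overall, an assumption that avoids duplicating a block that repeats inside a single file.

```python
from collections import Counter


def inline_unshared(subset, blocks):
    """Replace references to unshared blocks with the blocks themselves.

    subset: stub file name -> list of fingerprints.
    blocks: fingerprint -> block bytes (the sub-single-instance repository).
    """
    refs = Counter(fp for fps in subset.values() for fp in fps)
    inlined = {
        # An entry becomes raw bytes if its block is referenced once,
        # otherwise it stays a fingerprint pointing into the sub-SIR.
        name: [blocks[fp] if refs[fp] == 1 else fp for fp in fps]
        for name, fps in subset.items()
    }
    # Only blocks that are still referenced by fingerprint remain.
    sub_sir = {fp: blocks[fp] for fp, count in refs.items() if count > 1}
    return inlined, sub_sir


subset = {"f1": ["a", "b"], "f2": ["b", "c"]}
blocks = {"a": b"AA", "b": b"BB", "c": b"CC"}
inlined, sub_sir = inline_unshared(subset, blocks)
```

Each inlined block moves from the repository into exactly one stub file, so the total amount of block data written to tape stays the same.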
In one possible implementation of the above method, selecting from the stub file set at least one stub file still to be exported to tape to form the stub file subset includes: selecting from the stub file set a predetermined number of stub files still to be exported to tape to form a candidate subset; a calculation step of calculating the total data size of the candidate subset and of the candidate sub-single-instance repository corresponding to it, where the candidate sub-single-instance repository contains all the unique data blocks referenced by the stub files in the candidate subset; and, if the calculated total data size does not exceed the capacity of one tape, taking the candidate subset as the stub file subset, or otherwise removing one stub file from the candidate subset and repeating the calculation step.
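The candidate-and-shrink procedure can be sketched as follows. The names `choose_subset`, `batch`, and `entry_size` are hypothetical, and the choice of which stub file to remove (here, the last one in the candidate subset) is an assumption, since the text above only requires that one stub file be removed.

```python
def choose_subset(pending, stubs, block_size, tape_capacity,
                  batch=3, entry_size=1):
    """Pick a stub file subset that fits, together with its blocks, on a tape.

    pending: names of stub files not yet exported.
    stubs: stub file name -> list of fingerprints.
    block_size: fingerprint -> size of the corresponding data block.
    """
    candidate = list(pending[:batch])  # predetermined number of stub files
    while candidate:
        # Calculation step: stub files plus the unique blocks they reference.
        unique = {fp for name in candidate for fp in stubs[name]}
        total = (sum(len(stubs[name]) * entry_size for name in candidate)
                 + sum(block_size[fp] for fp in unique))
        if total <= tape_capacity:
            return candidate
        candidate.pop()  # remove one stub file and repeat the calculation
    raise ValueError("no pending stub file fits on a single tape")


stubs = {"f1": ["a", "b"], "f2": ["b", "c"], "f3": ["d"]}
block_size = {"a": 10, "b": 10, "c": 10, "d": 10}
chosen = choose_subset(["f1", "f2", "f3"], stubs, block_size, tape_capacity=35)
```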
In one possible implementation of the above method, selecting from the stub file set a predetermined number of stub files still to be exported to tape to form the candidate subset includes: selecting the predetermined number of stub files in turn, following the order in which the stub files still to be exported are saved in the stub file set, to form the candidate subset; or, based on the data block sharing relationships among the stub files still to be exported, selecting the predetermined number of stub files that share the most data blocks to form the candidate subset.
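The two selection strategies can be contrasted in a short sketch. The function names and the sharing score (the number of a stub file's blocks that at least one other pending stub file also references) are assumptions for illustration; the text above does not define how "most shared" is measured.

```python
from collections import Counter


def by_saved_order(pending, count):
    """Take the first `count` pending stub files in their saved order."""
    return pending[:count]


def by_most_sharing(pending, stubs, count):
    """Take the `count` pending stub files that share the most data blocks."""
    # Number of pending stub files that reference each block.
    refs = Counter(fp for name in pending for fp in set(stubs[name]))

    def shared_blocks(name):
        return sum(1 for fp in set(stubs[name]) if refs[fp] > 1)

    # sorted() is stable, so ties keep their saved order.
    return sorted(pending, key=shared_blocks, reverse=True)[:count]


stubs = {"f1": ["a", "b"], "f2": ["b", "c"], "f3": ["d"], "f4": ["c", "b"]}
pending = ["f3", "f1", "f2", "f4"]
in_order = by_saved_order(pending, 2)
most_shared = by_most_sharing(pending, stubs, 2)
```

The first strategy costs nothing to compute, while the second needs a pass over all pending stub files but tends to place files with common blocks on the same tape.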
In one possible implementation of the above method, before at least one stub file still to be exported to tape is selected from the stub file set to form the stub file subset, the method further includes: determining a second data block in the single-instance repository, where the second data block is a data block referenced by only one stub file; replacing the fingerprint data corresponding to the second data block, in the stub file of the stub file set, with the second data block itself; and removing the second data block from the single-instance repository.
To solve the above technical problem, one embodiment of the present invention provides an apparatus for exporting files to tape, used to export at least two original files stored as deduplicated data to tape. The deduplicated data comprises a stub file set, a single-instance repository, and a fingerprint database. The single-instance repository contains all the unique data blocks divided from the original files. The stub file set contains at least two stub files, each corresponding to one of the original files. Each stub file contains at least one fingerprint data entry, which comprises a fingerprint and location information: the fingerprint identifies a data block divided from the original file corresponding to the stub file, and the location information gives the position in the single-instance repository of the data block corresponding to the fingerprint data entry. The fingerprint database contains each fingerprint and its reference count, where the reference count of a fingerprint is the number of stub files that reference the data block identified by that fingerprint.
The apparatus for exporting files to tape includes: a selection module, configured to select from the stub file set at least one stub file still to be exported to tape to form a stub file subset, where the total data size of the stub file subset and of the sub-single-instance repository corresponding to it does not exceed the capacity of one tape, and the sub-single-instance repository contains all the unique data blocks referenced by the stub files in the stub file subset; an execution module, connected to the selection module and configured to export the stub file subset and the sub-single-instance repository to one tape; and a judging module, connected to the execution module and the selection module and configured to judge whether the stub file set still contains stub files not yet exported to tape.
In one possible implementation of the above apparatus, the execution module includes: a modification unit, connected to the selection module and configured to modify the fingerprint data in the stub files of the stub file subset so that the location information in the modified fingerprint data gives the position of the corresponding data block in the sub-single-instance repository; and an export unit, connected to the modification unit and configured to export the modified stub file subset to the tape and to export the sub-single-instance repository to the tape.
In one possible implementation of the above apparatus, the apparatus further includes a restoration module connected to the execution module. The restoration module is configured to: determine a first data block in the sub-single-instance repository, where the first data block is a data block referenced by only one stub file; replace the fingerprint data corresponding to the first data block, in the stub file of the stub file subset, with the first data block itself; remove the first data block from the sub-single-instance repository; and output the stub file subset after the replacement to the modification unit.
In one possible implementation of the above apparatus, the selection module includes: a candidate subset selection unit, configured to select from the stub file set a predetermined number of stub files still to be exported to tape to form a candidate subset; a calculation unit, connected to the candidate subset selection unit and configured to calculate the total data size of the candidate subset and of the candidate sub-single-instance repository corresponding to it, where the candidate sub-single-instance repository contains all the unique data blocks referenced by the stub files in the candidate subset; and a determination unit, connected to the calculation unit and configured to take the candidate subset as the stub file subset if the total data size calculated by the calculation unit does not exceed the capacity of one tape, and otherwise to remove one stub file from the candidate subset.
In one possible implementation of the above apparatus, the candidate subset selection unit is configured to: select the predetermined number of stub files in turn, following the order in which the stub files still to be exported are saved in the stub file set, to form the candidate subset; or, based on the data block sharing relationships among the stub files still to be exported, select the predetermined number of stub files that share the most data blocks to form the candidate subset.
In one possible implementation of the above apparatus, the restoration module is also connected to the selection module and is further configured to: determine a second data block in the single-instance repository, where the second data block is a data block referenced by only one stub file; replace the fingerprint data corresponding to the second data block, in the stub file of the stub file set, with the second data block itself; remove the second data block from the single-instance repository; and output the stub file set after the replacement to the selection module.
Beneficial effects
By exporting to a single tape the stub files corresponding to original files stored as deduplicated data, together with the data blocks corresponding to the fingerprint data of those stub files, the method for exporting files to tape according to the embodiments of the present invention keeps the original files in deduplicated form. The benefit of deduplication is therefore preserved and tape space is saved, and because each deduplication domain is kept within a single tape, original files can be recovered and read quickly.
Further features and aspects of the present invention will become clear from the following detailed description of exemplary embodiments with reference to the drawings.
Brief description of the drawings
The drawings, which are included in and form part of the specification, illustrate exemplary embodiments, features, and aspects of the present invention together with the specification, and serve to explain the principles of the present invention.
Fig. 1 is a schematic diagram of a backup storage server with a deduplication function;
Fig. 2 is a flowchart of a method for exporting files to tape according to one embodiment of the present invention;
Fig. 3 is a flowchart of a method for exporting files to tape according to another embodiment of the present invention;
Fig. 4 is a flowchart of a method for exporting files to tape according to yet another embodiment of the present invention;
Fig. 5 is a flowchart of the specific steps for determining the candidate stub file subset in a method for exporting files to tape according to one embodiment of the present invention;
Fig. 6 is a flowchart of a method for exporting files to tape according to a further embodiment of the present invention;
Fig. 7 is a structural block diagram of an apparatus for exporting files to tape according to one embodiment of the present invention;
Fig. 8 is a structural block diagram of an apparatus for exporting files to tape according to another embodiment of the present invention;
Fig. 9 is a structural block diagram of the execution module of an apparatus for exporting files to tape according to one embodiment of the present invention;
Fig. 10 is a structural block diagram of the selection module of an apparatus for exporting files to tape according to one embodiment of the present invention;
Fig. 11 is a structural block diagram of an apparatus for exporting files to tape according to a further embodiment of the present invention.
Detailed description
Various exemplary embodiments, features, and aspects of the present invention are described in detail below with reference to the drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" should not necessarily be construed as preferred over or superior to other embodiments.
In addition, numerous specific details are given in the detailed description below to better explain the present invention. Those skilled in the art will understand that the present invention can be practised without some of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to keep the focus on the substance of the present invention.
Embodiment 1
The method for exporting files to tape according to this embodiment of the present invention is used to export at least two original files stored as deduplicated data to tape, where the deduplicated data comprises a stub file set, a single-instance repository, and a fingerprint database. The original files stored as deduplicated data may reside in a storage server such as the one shown in Fig. 1. As shown in Fig. 1, the single-instance repository contains all the unique data blocks divided from the original files, such as data block DB11 and data block DB12. The stub file set contains at least two stub files, each corresponding to one of the original files. Each stub file contains at least one fingerprint data entry comprising a fingerprint and location information: the fingerprint identifies a data block divided from the original file corresponding to the stub file, and the location information gives the position of the corresponding data block in the single-instance repository. For example, the stub file set contains stub file SF1 and stub file SF2. Stub file SF1 contains fingerprint Fp11, corresponding to data block DB11 divided from its original file, fingerprint Fp12, corresponding to data block DB12, and so on. Stub file SF2 contains fingerprint Fp21, corresponding to data block DB21 divided from its original file, fingerprint Fp22, corresponding to data block DB22, and so on. The fingerprint database contains each fingerprint and its reference count, where the reference count of a fingerprint is the number of stub files that reference the data block identified by that fingerprint.
Fig. 2 is a flowchart of a method for exporting files to tape according to one embodiment of the present invention. As shown in Fig. 2, the method mainly includes the following steps.
Step S210, select from the stub file set at least one stub file still to be exported to tape to form a stub file subset, where the total data size of the stub file subset and of the sub-single-instance repository corresponding to it does not exceed the capacity of one tape, and the sub-single-instance repository contains all the unique data blocks referenced by the stub files in the stub file subset.
Specifically, the stub file subset may be determined from the data block sharing relationships among the stub files still to be exported in the stub file set, or from the order in which those stub files are saved, provided that the total data size of the finally determined stub file subset and its corresponding sub-single-instance repository does not exceed the capacity of one tape. This ensures that each tape is one deduplication domain; that is, the stub file of an original file and the single-instance repository it depends on are on the same tape, so reading an original file from tape requires loading only one tape.
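The property that a tape forms one deduplication domain can be expressed as a simple check. The dictionary layout of a tape and the function name `is_dedup_domain` are assumptions made for this sketch only.

```python
def is_dedup_domain(tape):
    """Return True if every block referenced by a stub file is on the tape.

    tape: {"stubs": {name: [fingerprint, ...]},
           "sub_sir": {fingerprint: block bytes}}
    """
    available = set(tape["sub_sir"])
    return all(fp in available
               for fingerprints in tape["stubs"].values()
               for fp in fingerprints)


self_contained = {
    "stubs": {"SF1": ["Fp11", "Fp12"], "SF2": ["Fp12", "Fp21"]},
    "sub_sir": {"Fp11": b"DB11", "Fp12": b"DB12", "Fp21": b"DB21"},
}
# Here the block for Fp21 lives on some other tape, so restoring SF2
# would require loading a second tape.
split_across_tapes = {
    "stubs": {"SF2": ["Fp12", "Fp21"]},
    "sub_sir": {"Fp12": b"DB12"},
}
```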
Step S220, export the stub file subset and the sub-single-instance repository to one tape.
Step S230, judge whether the stub file set still contains stub files not yet exported to tape. If it does, repeat step S210 and step S220 until all stub files in the stub file set have been exported to tape; if it does not, the procedure for exporting files to tape according to this embodiment of the present invention ends.
In this way, by exporting to a single tape the stub files corresponding to original files stored as deduplicated data, together with the data blocks corresponding to the fingerprint data of those stub files, the method for exporting files to tape according to this embodiment keeps the original files in deduplicated form. The benefit of deduplication is therefore preserved and tape space is saved, and because each deduplication domain is kept within a single tape, original files can be recovered and read quickly.
Embodiment 2
Fig. 3 is a flowchart of a method for exporting files to tape according to another embodiment of the present invention. Components in Fig. 3 with the same reference numerals as in Fig. 2 have the same functions; for brevity, their detailed description is omitted.
As shown in Fig. 3, the method shown in Fig. 3 differs from the method shown in Fig. 2 mainly in that, in one possible implementation, step S220 of Embodiment 1 may specifically include the following steps.
Step S321, modify the fingerprint data in the stub files of the stub file subset so that the location information in the modified fingerprint data gives the position of the corresponding data block in the sub-single-instance repository.
Step S322, export the modified stub file subset to the tape.
Step S323, export the sub-single-instance repository to the tape.
Because tape is accessed sequentially, the modified stub file subset is exported first and the sub-single-instance repository is exported to the tape afterwards. A user reading an original file therefore needs no rewind operation, so original files can be recovered quickly.
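The effect of writing the stub files before the blocks can be illustrated with a small simulation. The on-tape format below (a length-prefixed JSON header followed by length-prefixed blocks) and the use of `io.BytesIO` as a stand-in for a tape are assumptions made for this sketch; the patent does not specify a tape format.

```python
import io
import json
import struct


def write_tape(stubs, sub_sir):
    """Write stub files first, then the blocks, as one sequential stream.

    stubs: name -> list of block positions in sub_sir.
    sub_sir: list of block bytes.
    """
    tape = io.BytesIO()
    header = json.dumps(stubs).encode()
    tape.write(struct.pack(">I", len(header)))
    tape.write(header)
    for block in sub_sir:
        tape.write(struct.pack(">I", len(block)))
        tape.write(block)
    return tape.getvalue()


def restore_file(tape_bytes, name):
    """Rebuild one file in a single forward pass, never seeking backwards."""
    tape = io.BytesIO(tape_bytes)
    (header_len,) = struct.unpack(">I", tape.read(4))
    positions = json.loads(tape.read(header_len))[name]
    wanted = set(positions)
    found = {}
    index = 0
    while len(found) < len(wanted):
        raw = tape.read(4)
        if len(raw) < 4:
            raise ValueError("a referenced block is missing from the tape")
        (block_len,) = struct.unpack(">I", raw)
        block = tape.read(block_len)
        if index in wanted:
            found[index] = block
        index += 1
    return b"".join(found[p] for p in positions)


tape_image = write_tape({"f1": [0, 1], "f2": [1, 2]},
                        [b"AAAA", b"BB", b"CCC"])
```

Because the positions are known before the first block is reached, the reader can pick out the blocks it needs as they pass by; with the opposite order it would have to read the whole repository and then go back.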
Embodiment 3
Fig. 4 is a flowchart of a method for exporting files to tape according to yet another embodiment of the present invention. Components in Fig. 4 with the same reference numerals as in Fig. 3 have the same functions; for brevity, their detailed description is omitted.
As shown in Fig. 4, the method shown in Fig. 4 differs from the method shown in Fig. 3 mainly in that, in one possible implementation, the following steps may also be performed before step S321.
Step S410, determine a first data block in the sub-single-instance repository, where the first data block is a data block referenced by only one stub file.
Step S420, replace the fingerprint data corresponding to the first data block, in the stub file of the stub file subset, with the first data block itself.
Step S430, remove the first data block from the sub-single-instance repository.
The physical characteristics of tape make recovering original files on tape extremely slow and inefficient. The method of this embodiment therefore first restores the data blocks that are not shared between stub files in the stub file subset into the stub files of the subset, then removes those data blocks from the sub-single-instance repository, and then performs step S220. This reduces the recovery work needed when a user reads an original file, so original files can be read quickly without any additional storage.
Embodiment 4
Fig. 5 shows the specific steps of step S210 in the method for exporting files to tape according to Embodiment 1 of the present invention. As shown in Fig. 5, in one possible implementation, this step may include the following steps.
Step S510, select from the stub file set a predetermined number of stub files still to be exported to tape to form a candidate subset.
Step S520, the calculation step: calculate the total data size of the candidate subset and of the candidate sub-single-instance repository corresponding to it, where the candidate sub-single-instance repository contains all the unique data blocks referenced by the stub files in the candidate subset.
Step S530, judge whether the calculated total data size exceeds the capacity of one tape. If the calculated total data size does not exceed the capacity of one tape, perform step S540 and take the candidate subset as the stub file subset; otherwise perform step S550, remove one stub file from the candidate subset, and repeat calculation step S520.
Specifically, based on the data block sharing relationships among the stub files still to be exported in the stub file set, the predetermined number of stub files that share the most data blocks may be selected to form the candidate subset. This makes tape space utilization as high as possible; that is, the same amount of tape space holds as much data as possible.
Alternatively, the predetermined number of stub files may be selected in turn, following the order in which the stub files still to be exported are saved in the stub file set, to form the candidate subset. Although this does not guarantee maximum tape space utilization, it avoids the cost of counting how often data blocks are shared within each subset of the stub file set. Those skilled in the art will understand that the method of selecting the candidate subset can be chosen freely according to preference and/or the actual application scenario.
The stub file subset is thus formed from at least one stub file still to be exported to tape, with the total data size of the stub file subset and its corresponding sub-single-instance repository guaranteed not to exceed the capacity of one tape. This ensures that each tape is one deduplication domain, so reading an original file from tape requires loading only one tape, and original files can be recovered and read quickly.
Embodiment 5
Fig. 6 shows a flowchart of a method for exporting files to tape according to a further embodiment of the present invention. Components in Fig. 6 with the same reference numerals as in Fig. 2 have the same functions; for brevity, detailed descriptions of these components are omitted.
As shown in Fig. 6, the main difference between the method shown in Fig. 6 and the method shown in Fig. 2 is that, in one possible implementation, the following steps may also be performed before step S210 of Embodiment 1:
Step S610: determine a second data block in the single-instance store, where the second data block is a data block referenced by a single stub file.
Step S620: in the stub files of the stub file set, replace the fingerprint data corresponding to the second data block with the second data block.
Step S630: remove the second data block from the single-instance store.
It should be noted that the method of this embodiment may include steps S410, S420, and S430 described in Embodiment 3. Steps S610, S620, and S630 are likewise performed in view of the physical characteristics of tape: recovering an original file on tape is extremely time-consuming and inefficient. Therefore, the method of this embodiment first restores the data blocks that are not shared among the stub files in the stub file set into those stub files, then removes these data blocks from the single-instance store, and then performs step S220. This reduces the recovery operations needed when the user reads an original file, so that original files can be read quickly without any additional storage.
Embodiment 6
The apparatus for exporting files to tape according to an embodiment of the present invention is used to export at least two original files stored in the form of deduplicated data to tape, where the deduplicated data includes a stub file set, a single-instance store, and a fingerprint library. The original files stored in the form of deduplicated data may be stored, for example, in the storage server shown in Fig. 1. For details, refer to the related description of the foregoing method embodiments, which is not repeated here.
Fig. 7 shows a structural block diagram of an apparatus 700 for exporting files to tape according to an embodiment of the present invention. As shown in Fig. 7, the apparatus 700 mainly includes a selection module 710, an execution module 720, and a judging module 730. The selection module 710 is configured to select, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, where the total data amount of the stub file subset and of the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store includes all single data blocks referenced by the stub files in the stub file subset. The execution module 720 is connected to the selection module 710 and is configured to export the stub file subset and the sub-single-instance store to one tape. The judging module 730 is connected to the execution module 720 and is configured to judge whether the stub file set includes stub files that have not yet been exported to tape.
In one possible implementation, the selection module 710 may determine the stub file subset according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, or according to the storage order of those stub files, as long as the total data amount of the resulting stub file subset and of the sub-single-instance store corresponding to it is not greater than the capacity of one tape. This guarantees that one tape is one deduplication domain; that is, the stub file of an original file and the single-instance store it depends on are on the same tape, so only one tape needs to be loaded when an original file is read from tape.
In this way, the execution module 720 exports the stub file corresponding to an original file stored in the form of deduplicated data, together with the data blocks corresponding to the fingerprint data of that stub file, to one tape. The apparatus of this embodiment can therefore still store original files in deduplicated form; that is, the value of deduplication is preserved and tape storage space is saved. In addition, by guaranteeing that a deduplication domain lies within a single tape, the apparatus ensures fast recovery and reading of original files.
Embodiment 7
Fig. 8 shows a structural block diagram of an apparatus for exporting files to tape according to another embodiment of the present invention. Components in Fig. 8 with the same reference numerals as in Fig. 7 have the same functions; for brevity, detailed descriptions of these components are omitted.
The main difference between the apparatus shown in Fig. 8 and the apparatus shown in Fig. 7 is that, in one possible implementation, the apparatus 700 further includes a restoration module 740. The restoration module 740 is connected to the execution module 720 and is configured to: determine a first data block in the sub-single-instance store, where the first data block is a data block referenced by a single stub file; replace the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block; remove the first data block from the sub-single-instance store; and output the stub file subset after replacement to the execution module 720.
In one possible implementation, the restoration module 740 may also be connected to the selection module 710. Specifically, the restoration module 740 is configured to: determine a second data block in the single-instance store, where the second data block is a data block referenced by a single stub file; replace the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; remove the second data block from the single-instance store; and output the stub file set after replacement to the selection module 710.
The physical characteristics of tape make recovering original files on tape extremely time-consuming and inefficient. The apparatus of this embodiment first uses the restoration module 740 to restore the data blocks that are not shared among the stub files in the stub file set and/or the stub file subset, and then uses the execution module 720 to export the stub file subset and the sub-single-instance store to one tape. This reduces the recovery operations needed when the user reads an original file, so that original files can be read quickly without any additional storage.
In one possible implementation, as shown in Fig. 9, the execution module 720 may include a modification unit 721 and an export unit 722. Specifically, the modification unit 721 is connected to the selection module 710 and is configured to modify the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data. The export unit 722 is connected to the modification unit 721 and is configured to export the modified stub file subset to the tape and to export the sub-single-instance store to the tape.
Because tape operates sequentially, the export unit 722 first exports the modified stub file subset and then exports the sub-single-instance store to the tape. As a result, no rewind operation is needed when the user reads an original file, which enables fast recovery of the original file.
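The work of the modification unit and the export unit can be sketched as below. The layout is hypothetical: the stub file subset is a dict from stub file name to an ordered list of fingerprints, the single-instance store is a dict from fingerprint to block bytes, position information is modeled as a byte offset into the sub-single-instance store, and the tape is modeled as a list of records written in order.

```python
def build_tape_image(subset, store):
    """Rewrite position information, then lay out stub files before the sub-store."""
    positions, sub_store, offset = {}, [], 0
    # Build the sub-single-instance store from the blocks the subset references.
    for fingerprints in subset.values():
        for fp in fingerprints:
            if fp not in positions:
                positions[fp] = offset          # new position inside the sub-store
                sub_store.append(store[fp])
                offset += len(store[fp])
    # Modification: each fingerprint now carries its position in the sub-store.
    rewritten = {
        name: [(fp, positions[fp]) for fp in fingerprints]
        for name, fingerprints in subset.items()
    }
    # Export order: stub files first, then the sub-store, so a reader that
    # moves forward along the tape never has to rewind.
    tape = [("stub", name, entries) for name, entries in rewritten.items()]
    tape.append(("sub_store", b"".join(sub_store)))
    return tape
```

Blocks in the full store that no stub file in the subset references are left out of the tape image, and a block shared by several stub files in the subset is written only once.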
In one possible implementation, as shown in Fig. 10, the selection module 710 may include a candidate subset selection unit 711, a calculation unit 712, and a determination unit 713.
Specifically, the candidate subset selection unit 711 is configured to select, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset. The calculation unit 712 is connected to the candidate subset selection unit 711 and is configured to calculate the total data amount of the candidate subset and of the candidate sub-single-instance store corresponding to the candidate subset, where the candidate sub-single-instance store includes all single data blocks referenced by the stub files in the candidate subset. The determination unit 713 is connected to the calculation unit 712 and is configured to determine the candidate subset to be the stub file subset when the total data amount calculated by the calculation unit is not greater than the capacity of one tape, and otherwise to remove one stub file from the candidate subset.
In one possible implementation, the candidate subset selection unit 711 is configured to: select the predetermined number of stub files in turn, following the storage order of the stub files to be exported to tape in the stub file set, to form the candidate subset; or, according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, select the predetermined number of stub files that share the most data blocks to form the candidate subset. Those skilled in the art will understand that the user can flexibly set the method by which the candidate subset selection unit 711 selects the candidate subset, according to personal preference and/or the actual application scenario.
The apparatus first determines a stub file subset consisting of at least one stub file to be exported to tape, and ensures that the total data amount of this stub file subset and of the sub-single-instance store corresponding to it is not greater than the capacity of one tape. This guarantees that each tape is one deduplication domain: when an original file is read from tape, only one tape needs to be loaded, which ensures fast recovery and reading of the original file.
Embodiment 8
Fig. 11 is a structural block diagram of an apparatus for exporting files to tape according to another embodiment of the present invention. The apparatus 1100 for exporting files to tape may be a host server with computing capability, a personal computer (PC), a portable computer, a terminal, or the like. The specific embodiments of the present invention do not limit the specific implementation of the computing node.
The apparatus 1100 for exporting files to tape includes a processor 1110, a communications interface 1120, a memory 1130, and a bus 1140. The processor 1110, the communications interface 1120, and the memory 1130 communicate with one another over the bus 1140.
The communications interface 1120 is used to communicate with network elements, where the network elements include, for example, a virtual machine management center and shared storage.
The processor 1110 is used to execute a program. The processor 1110 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 1130 is used to store files. The memory 1130 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory. The memory 1130 may also be a memory array. The memory 1130 may also be divided into blocks, and the blocks may be combined into virtual volumes according to certain rules.
In one possible implementation, the above program may be program code including computer operation instructions. The program may specifically be used to: select, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, where the total data amount of the stub file subset and of the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store includes all single data blocks referenced by the stub files in the stub file subset; export the stub file subset and the sub-single-instance store to one tape; and, when the stub file set includes stub files that have not yet been exported to tape, repeat the two preceding steps until all stub files in the stub file set have been exported to tape.
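The overall select, export, and repeat cycle of the program can be sketched as follows. All names and the data layout are assumptions for illustration: the stub file set is a dict from stub file name to the set of fingerprints it references, every stub file has the same hypothetical size `stub_size`, candidates are taken in storage order, and each tape is modeled as a dict listing the stub files and the data blocks written to it.

```python
def export_all(stubs, block_sizes, stub_size, tape_capacity, batch):
    """Repeat 'select a subset that fits, export it' until nothing is pending."""
    pending = list(stubs)
    tapes = []
    while pending:  # stub files not yet exported remain
        candidate = pending[:batch]
        while candidate:
            # Blocks shared inside the candidate subset are counted only once.
            blocks = set().union(*(stubs[name] for name in candidate))
            total = stub_size * len(candidate) + sum(block_sizes[fp] for fp in blocks)
            if total <= tape_capacity:
                break
            candidate = candidate[:-1]  # assumption: drop the last stub file
        if not candidate:
            raise ValueError("no stub file fits on one tape")
        # One tape receives the stub file subset and its sub-single-instance store.
        tapes.append({"stubs": candidate, "blocks": sorted(blocks)})
        pending = pending[len(candidate):]
    return tapes
```

In this sketch a data block referenced by stub files on different tapes is written to each of those tapes, which is what keeps every tape a self-contained deduplication domain.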
In one possible implementation, the program may specifically be used to: modify the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data; export the modified stub file subset to the tape; and export the sub-single-instance store to the tape.
In one possible implementation, the program may specifically be used to: determine a first data block in the sub-single-instance store, where the first data block is a data block referenced by a single stub file; replace the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block; and remove the first data block from the sub-single-instance store.
In one possible implementation, the program may specifically be used to: select, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset; perform a calculation step of calculating the total data amount of the candidate subset and of the candidate sub-single-instance store corresponding to the candidate subset, where the candidate sub-single-instance store includes all single data blocks referenced by the stub files in the candidate subset; and, when the calculated total data amount is not greater than the capacity of one tape, determine the candidate subset to be the stub file subset, and otherwise remove one stub file from the candidate subset and repeat the calculation step.
In one possible implementation, the program may specifically be used to: select the predetermined number of stub files in turn, following the storage order of the stub files to be exported to tape in the stub file set, to form the candidate subset; or, according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, select the predetermined number of stub files that share the most data blocks to form the candidate subset.
In one possible implementation, the program may specifically be used to: determine a second data block in the single-instance store, where the second data block is a data block referenced by a single stub file; replace the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and remove the second data block from the single-instance store.
The apparatus for exporting files to tape of this embodiment is similar to the apparatus for exporting files to tape described in Embodiments 6 to 8. Those skilled in the art will understand that the foregoing possible implementations all apply to this embodiment and achieve the same beneficial effects, so they are not described again here.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (14)
1. A method for exporting files to tape, used to export at least two original files stored in the form of deduplicated data to tape, wherein the deduplicated data comprises a stub file set, a single-instance store, and a fingerprint library; the single-instance store comprises all single data blocks divided from the original files; the stub file set comprises at least two stub files respectively corresponding to the original files; each stub file comprises at least one piece of fingerprint data, the fingerprint data comprising a fingerprint and position information; the fingerprint identifies a data block divided from the original file corresponding to the stub file; the position information indicates the position, in the single-instance store, of the data block corresponding to the fingerprint data; the fingerprint library comprises each fingerprint and its reference count, the reference count of a fingerprint indicating the number of stub files that reference the data block of that fingerprint; characterized in that the method comprises:
selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, wherein the total data amount of the stub file subset and of the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store comprises all single data blocks referenced by the stub files in the stub file subset;
exporting the stub file subset and the sub-single-instance store to one tape; and
when the stub file set includes stub files that have not yet been exported to tape, repeating the two preceding steps until all stub files in the stub file set have been exported to tape.
2. The method according to claim 1, characterized in that exporting the stub file subset and the sub-single-instance store to one tape comprises:
modifying the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data;
exporting the modified stub file subset to the tape; and
exporting the sub-single-instance store to the tape.
3. The method according to claim 2, characterized in that, before modifying the fingerprint data in the stub files of the stub file subset, the method further comprises:
determining a first data block in the sub-single-instance store, wherein the first data block is a data block referenced by a single stub file;
replacing the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block; and
removing the first data block from the sub-single-instance store.
4. The method according to any one of claims 1 to 3, characterized in that selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset comprises:
selecting, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset;
a calculation step of calculating the total data amount of the candidate subset and of the candidate sub-single-instance store corresponding to the candidate subset, wherein the candidate sub-single-instance store comprises all single data blocks referenced by the stub files in the candidate subset; and
when the calculated total data amount is not greater than the capacity of one tape, determining the candidate subset to be the stub file subset, and otherwise removing one stub file from the candidate subset and repeating the calculation step.
5. The method according to claim 4, characterized in that selecting, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset comprises:
selecting the predetermined number of stub files in turn, following the storage order of the stub files to be exported to tape in the stub file set, to form the candidate subset; or
according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, selecting the predetermined number of stub files that share the most data blocks to form the candidate subset.
6. The method according to any one of claims 1 to 3, characterized in that, before selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, the method further comprises:
determining a second data block in the single-instance store, wherein the second data block is a data block referenced by a single stub file;
replacing the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and
removing the second data block from the single-instance store.
7. The method according to claim 4, characterized in that, before selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, the method further comprises:
determining a second data block in the single-instance store, wherein the second data block is a data block referenced by a single stub file;
replacing the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and
removing the second data block from the single-instance store.
8. The method according to claim 5, characterized in that, before selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, the method further comprises:
determining a second data block in the single-instance store, wherein the second data block is a data block referenced by a single stub file;
replacing the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and
removing the second data block from the single-instance store.
9. An apparatus for exporting files to tape, used to export at least two original files stored in the form of deduplicated data to tape, wherein the deduplicated data comprises a stub file set, a single-instance store, and a fingerprint library; the single-instance store comprises all single data blocks divided from the original files; the stub file set comprises at least two stub files respectively corresponding to the original files; each stub file comprises at least one piece of fingerprint data, the fingerprint data comprising a fingerprint and position information; the fingerprint identifies a data block divided from the original file corresponding to the stub file; the position information indicates the position, in the single-instance store, of the data block corresponding to the fingerprint data; the fingerprint library comprises each fingerprint and its reference count, the reference count of a fingerprint indicating the number of stub files that reference the data block of that fingerprint; characterized in that the apparatus comprises:
a selection module, configured to select, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, wherein the total data amount of the stub file subset and of the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store comprises all single data blocks referenced by the stub files in the stub file subset;
an execution module, connected to the selection module and configured to export the stub file subset and the sub-single-instance store to one tape; and
a judging module, connected to the execution module and the selection module and configured to judge whether the stub file set includes stub files that have not yet been exported to tape.
10. The apparatus according to claim 9, characterized in that the execution module comprises:
a modification unit, connected to the selection module and configured to modify the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data; and
an export unit, connected to the modification unit and configured to export the modified stub file subset to the tape and to export the sub-single-instance store to the tape.
11. The apparatus according to claim 10, characterized in that it further comprises a restoration module, the restoration module being connected to the execution module and configured to:
determine a first data block in the sub-single-instance store, wherein the first data block is a data block referenced by a single stub file; and
replace the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block, remove the first data block from the sub-single-instance store, and output the stub file subset after replacement to the modification unit.
12. The apparatus according to any one of claims 9 to 11, characterized in that the selection module comprises:
a candidate subset selection unit, configured to select, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset;
a calculation unit, connected to the candidate subset selection unit and configured to calculate the total data amount of the candidate subset and of the candidate sub-single-instance store corresponding to the candidate subset, wherein the candidate sub-single-instance store comprises all single data blocks referenced by the stub files in the candidate subset; and
a determination unit, connected to the calculation unit and configured to determine the candidate subset to be the stub file subset when the total data amount calculated by the calculation unit is not greater than the capacity of one tape, and otherwise to remove one stub file from the candidate subset.
13. The apparatus according to claim 12, characterized in that the candidate subset selection unit is configured to:
select the predetermined number of stub files in turn, following the storage order of the stub files to be exported to tape in the stub file set, to form the candidate subset; or
according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, select the predetermined number of stub files that share the most data blocks to form the candidate subset.
14. The apparatus according to claim 11, characterized in that the restoration module is also connected to the selection module and is further configured to:
determine a second data block in the single-instance store, wherein the second data block is a data block referenced by a single stub file; and
replace the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block, remove the second data block from the single-instance store, and output the stub file set after replacement to the selection module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310513281.6A CN103577565B (en) | 2013-10-25 | 2013-10-25 | A kind of method and apparatus that file is exported to tape |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310513281.6A CN103577565B (en) | 2013-10-25 | 2013-10-25 | A kind of method and apparatus that file is exported to tape |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103577565A CN103577565A (en) | 2014-02-12 |
CN103577565B true CN103577565B (en) | 2017-01-04 |
Family
ID=50049341
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310513281.6A Active CN103577565B (en) | 2013-10-25 | 2013-10-25 | A kind of method and apparatus that file is exported to tape |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103577565B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110647077B (en) * | 2019-09-26 | 2020-12-25 | 珠海格力电器股份有限公司 | Control method and system of industrial control device, storage medium and industrial control device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103049391A (en) * | 2012-12-29 | 2013-04-17 | 华为技术有限公司 | Data processing method, data format and equipment |
CN103154950A (en) * | 2012-05-04 | 2013-06-12 | 华为技术有限公司 | Repeated data deleting method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8108638B2 (en) * | 2009-02-06 | 2012-01-31 | International Business Machines Corporation | Backup of deduplicated data |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103154950A (en) * | 2012-05-04 | 2013-06-12 | 华为技术有限公司 | Repeated data deleting method and device |
CN103049391A (en) * | 2012-12-29 | 2013-04-17 | 华为技术有限公司 | Data processing method, data format and equipment |
Non-Patent Citations (3)
Title |
---|
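The Effectiveness of Deduplication on Virtual Machine Disk Images; Keren Jin et al.; 《Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference》; 20090527; 7-13 * |
Application of disk-based data deduplication technology in data backup systems; 马锡坤 et al.; 《中国医疗设备》; 20121031; Vol. 27, No. 10; 78-79, 171 * |
Research on the application of data deduplication algorithms in VTL systems; 孙虎威 et al.; 《微型机与应用》; 20130607; Vol. 32, No. 6; 82-85 * |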
Also Published As
Publication number | Publication date |
---|---|
CN103577565A (en) | 2014-02-12 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104461390B (en) | Write data into the method and device of imbricate magnetic recording SMR hard disks | |
CN102629258B (en) | Repeating data deleting method and device | |
CN104866497B (en) | The metadata updates method, apparatus of distributed file system column storage, host | |
CN106874066B (en) | Virtual machine migration method and device and electronic equipment | |
CN103473298B (en) | Data archiving method and device and storage system | |
CN106909372B (en) | Method and system for calculating purchase path of mobile terminal user | |
CN108132838A (en) | A kind of method, apparatus and system of diagram data processing | |
CN108491333A (en) | Method for writing data, device, equipment and the medium of buffer circle | |
CN102750317B (en) | Method and device for data persistence processing and data base system | |
CN103370691A (en) | Managing buffer overflow conditions | |
CN106469120A (en) | Scrap cleaning method, device and equipment | |
CN104731515B (en) | Control the method and apparatus of storage device group of planes abrasion equilibrium | |
CN101945131A (en) | Storage virtualization-based data migration method | |
CN109558213A (en) | The method and apparatus for managing the virtual machine snapshot of OpenStack platform | |
CN111930716A (en) | Database capacity expansion method, device and system | |
CN103294799B (en) | A kind of data parallel batch imports the method and system of read-only inquiry system | |
CN110134646B (en) | Knowledge platform service data storage and integration method and system | |
CN104050057A (en) | Historical sensed data duplicate removal fragment eliminating method and system | |
CN107368545A (en) | A kind of De-weight method and device based on MerkleTree deformation algorithms | |
CN103577565B (en) | A kind of method and apparatus that file is exported to tape | |
US20160210372A1 (en) | Method and system for obtaining knowledge point implicit relationship | |
CN116339643B (en) | Formatting method, formatting device, formatting equipment and formatting medium for disk array | |
CN103177173B (en) | Method and device for selecting network virtual character attiring | |
CN107783826A (en) | A kind of virtual machine migration method, apparatus and system | |
CN103210389B (en) | A kind for the treatment of method and apparatus of metadata |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |