CN103577565B - Method and apparatus for exporting files to tape - Google Patents

Method and apparatus for exporting files to tape

Info

Publication number
CN103577565B
Authority
CN
China
Prior art keywords
stub file
tape
data block
subset
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310513281.6A
Other languages
Chinese (zh)
Other versions
CN103577565A (en)
Inventor
李育国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Priority to CN201310513281.6A
Publication of CN103577565A
Application granted
Publication of CN103577565B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1748 De-duplication implemented within the file system, e.g. based on file segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files

Abstract

The invention discloses a method and apparatus for exporting files to tape. The method exports at least two original files stored as deduplicated data to tape, where the deduplicated data comprises a stub file set, a single instance repository, and a fingerprint database. The method includes: selecting, from the stub file set, at least one stub file that is to be exported to tape to form a stub file subset; exporting the stub file subset and the sub single instance repository corresponding to it to one tape; and repeating the two preceding steps while the stub file set still contains stub files not yet exported to tape. With the method of the embodiments of the invention, the original files remain stored in deduplicated form on tape, which preserves the benefit of deduplication and saves tape storage space; and because each deduplication domain is kept within a single tape, the original files can be restored and read quickly.

Description

Method and apparatus for exporting files to tape
Technical field
The present invention relates to the field of data storage, and in particular to a method and apparatus for exporting files to tape.
Background
Data deduplication divides a file into data blocks, computes a fingerprint for each data block, and compares that fingerprint with the fingerprints already stored. If the fingerprint already exists, the data block has already been stored and does not need to be saved again; only the reference count of the data block is incremented by one, to record that the block is referenced once more. If the fingerprint does not exist, the data block is unique, and the fingerprint and its corresponding data block are saved.
After data deduplication, the storage system typically holds deduplicated data made up of the following three parts. The first part is the single instance repository (SIR), which stores the data blocks. The second part is the fingerprint database, which stores all fingerprints together with the reference count of the data block corresponding to each fingerprint. The third part is the stub files; each stub file records, for one file, the fingerprints of the data blocks the file was divided into and the location of the data block corresponding to each fingerprint.
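The deduplication procedure and the three-part layout described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the class name `DedupStore`, the fixed 4-byte block size, and the use of SHA-256 as the fingerprint function are all hypothetical choices that do not come from the source. The reference count here is incremented once per reference, as in the background description.

```python
import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # assumption: tiny fixed-size blocks, for illustration only


@dataclass
class DedupStore:
    # Single instance repository: each unique data block is stored once.
    sir: list = field(default_factory=list)
    # Fingerprint database: fingerprint -> [position in sir, reference count].
    fingerprints: dict = field(default_factory=dict)
    # Stub files: file name -> list of (fingerprint, position in sir).
    stubs: dict = field(default_factory=dict)

    def ingest(self, name: str, data: bytes) -> None:
        stub = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp in self.fingerprints:
                # Block already stored: only bump its reference count.
                self.fingerprints[fp][1] += 1
            else:
                # New block: store it and register its fingerprint.
                self.sir.append(block)
                self.fingerprints[fp] = [len(self.sir) - 1, 1]
            stub.append((fp, self.fingerprints[fp][0]))
        self.stubs[name] = stub

    def restore(self, name: str) -> bytes:
        # Rebuild the original file from the positions recorded in its stub.
        return b"".join(self.sir[pos] for _, pos in self.stubs[name])
```

Ingesting two files that share a block stores that block once, and `restore` rebuilds either file from its stub alone.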
Data deduplication greatly reduces the resources and space needed to store files. For long-term archiving, however, files usually also need to be saved to tape media. In the prior art, one method of exporting files stored as deduplicated data to tape is to restore each stub file to its original file and then back the original file up to tape; that is, the data is restored to its original form when the deduplicated data is exported to tape. This method clearly loses the benefit of deduplication, and it requires a large amount of storage space and maintenance resources.
Another prior-art method of exporting files stored as deduplicated data to tape is to export the deduplicated data directly to tape. Although this method preserves the deduplicated form, it does not take the physical characteristics of tape into account, so restoring original files from tape is very time-consuming and inefficient. Specifically, because tape is a sequential-access medium, high performance requires reading and writing data sequentially as far as possible and avoiding rewind operations. In addition, a tape library usually has only a limited number of drives, so generally not all tapes can be online at the same time. When an original file is restored from a tape or tape library, the data blocks referenced by the file must be read from tape in addition to its stub file. These data blocks may be stored on multiple tapes, so restoring one original file can involve loading and unloading multiple tapes, reading data, and rewinding, and the time and resources consumed are unacceptable.
Summary of the invention
Technical problem
In view of this, the technical problem to be solved by the present invention is how to export files that are backed up on a storage server in deduplicated form to a tape system, while ensuring that the files can be restored quickly from the tape system.
Solution
To solve the above technical problem, according to an embodiment of the present invention, a method for exporting files to tape is provided, for exporting at least two original files stored as deduplicated data to tape. The deduplicated data comprises a stub file set, a single instance repository, and a fingerprint database. The single instance repository contains all unique data blocks divided from the original files. The stub file set contains at least two stub files, each corresponding to one of the original files. Each stub file contains at least one fingerprint entry, and each fingerprint entry comprises a fingerprint and location information: the fingerprint identifies a data block divided from the original file corresponding to the stub file, and the location information indicates the position, in the single instance repository, of the data block corresponding to the fingerprint entry. The fingerprint database contains each fingerprint and its reference count, where the reference count of a fingerprint is the number of stub files that reference the data block identified by that fingerprint.
The method for exporting files to tape includes: selecting, from the stub file set, at least one stub file that is to be exported to tape to form a stub file subset, where the total data volume of the stub file subset and the sub single instance repository corresponding to it is not greater than the capacity of one tape, and the sub single instance repository contains all unique data blocks referenced by the stub files in the stub file subset; exporting the stub file subset and the sub single instance repository to one tape; and, while the stub file set still contains stub files not yet exported to tape, repeating the two preceding steps until all stub files in the stub file set have been exported to tape.
For the above method, in one possible implementation, exporting the stub file subset and the sub single instance repository to one tape includes: modifying the fingerprint entries in the stub files of the stub file subset so that the location information in each modified fingerprint entry indicates the position of the corresponding data block in the sub single instance repository; exporting the modified stub file subset to the tape; and exporting the sub single instance repository to the tape.
For the above method, in one possible implementation, before the fingerprint entries in the stub files of the stub file subset are modified, the method further includes: determining a first data block in the sub single instance repository, where the first data block is a data block referenced by only one stub file; replacing, in the stub files of the stub file subset, the fingerprint entry corresponding to the first data block with the first data block itself; and removing the first data block from the sub single instance repository.
For the above method, in one possible implementation, selecting from the stub file set at least one stub file that is to be exported to tape to form a stub file subset includes: selecting, from the stub file set, a predetermined number of stub files that are to be exported to tape to form a candidate subset; a calculation step of calculating the total data volume of the candidate subset and the candidate sub single instance repository corresponding to it, where the candidate sub single instance repository contains all unique data blocks referenced by the stub files in the candidate subset; and, if the calculated total data volume is not greater than the capacity of one tape, determining the candidate subset as the stub file subset, and otherwise removing one stub file from the candidate subset and repeating the calculation step.
For the above method, in one possible implementation, selecting from the stub file set a predetermined number of stub files that are to be exported to tape to form a candidate subset includes: selecting the predetermined number of stub files in sequence, according to the order in which the stub files to be exported are stored in the stub file set, to form the candidate subset; or selecting, according to the data block sharing relationships among the stub files to be exported in the stub file set, the predetermined number of stub files that share the most data blocks to form the candidate subset.
For the above method, in one possible implementation, before at least one stub file that is to be exported to tape is selected from the stub file set to form the stub file subset, the method further includes: determining a second data block in the single instance repository, where the second data block is a data block referenced by only one stub file; replacing, in the stub files of the stub file set, the fingerprint entry corresponding to the second data block with the second data block itself; and removing the second data block from the single instance repository.
To solve the above technical problem, according to an embodiment of the present invention, an apparatus for exporting files to tape is provided, for exporting at least two original files stored as deduplicated data to tape. The deduplicated data comprises a stub file set, a single instance repository, and a fingerprint database. The single instance repository contains all unique data blocks divided from the original files. The stub file set contains at least two stub files, each corresponding to one of the original files. Each stub file contains at least one fingerprint entry, and each fingerprint entry comprises a fingerprint and location information: the fingerprint identifies a data block divided from the original file corresponding to the stub file, and the location information indicates the position, in the single instance repository, of the data block corresponding to the fingerprint entry. The fingerprint database contains each fingerprint and its reference count, where the reference count of a fingerprint is the number of stub files that reference the data block identified by that fingerprint.
The apparatus for exporting files to tape includes: a selection module, configured to select, from the stub file set, at least one stub file that is to be exported to tape to form a stub file subset, where the total data volume of the stub file subset and the sub single instance repository corresponding to it is not greater than the capacity of one tape, and the sub single instance repository contains all unique data blocks referenced by the stub files in the stub file subset; an execution module, connected to the selection module and configured to export the stub file subset and the sub single instance repository to one tape; and a judgment module, connected to the execution module and the selection module and configured to determine whether the stub file set still contains stub files not yet exported to tape.
For the above apparatus, in one possible implementation, the execution module includes: a modification unit, connected to the selection module and configured to modify the fingerprint entries in the stub files of the stub file subset so that the location information in each modified fingerprint entry indicates the position of the corresponding data block in the sub single instance repository; and an export unit, connected to the modification unit and configured to export the modified stub file subset to the tape and to export the sub single instance repository to the tape.
For the above apparatus, in one possible implementation, the apparatus further includes a restoration module connected to the execution module. The restoration module is configured to: determine a first data block in the sub single instance repository, where the first data block is a data block referenced by only one stub file; replace, in the stub files of the stub file subset, the fingerprint entry corresponding to the first data block with the first data block itself; remove the first data block from the sub single instance repository; and output the stub file subset after the replacement to the modification unit.
For the above apparatus, in one possible implementation, the selection module includes: a candidate subset selection unit, configured to select, from the stub file set, a predetermined number of stub files that are to be exported to tape to form a candidate subset; a calculation unit, connected to the candidate subset selection unit and configured to calculate the total data volume of the candidate subset and the candidate sub single instance repository corresponding to it, where the candidate sub single instance repository contains all unique data blocks referenced by the stub files in the candidate subset; and a determination unit, connected to the calculation unit and configured to determine the candidate subset as the stub file subset if the total data volume calculated by the calculation unit is not greater than the capacity of one tape, and otherwise to remove one stub file from the candidate subset.
For the above apparatus, in one possible implementation, the candidate subset selection unit is configured to: select the predetermined number of stub files in sequence, according to the order in which the stub files to be exported are stored in the stub file set, to form the candidate subset; or select, according to the data block sharing relationships among the stub files to be exported in the stub file set, the predetermined number of stub files that share the most data blocks to form the candidate subset.
For the above apparatus, in one possible implementation, the restoration module is also connected to the selection module and is further configured to: determine a second data block in the single instance repository, where the second data block is a data block referenced by only one stub file; replace, in the stub files of the stub file set, the fingerprint entry corresponding to the second data block with the second data block itself; remove the second data block from the single instance repository; and output the stub file set after the replacement to the selection module.
Beneficial effects
By exporting to one tape the stub files corresponding to original files stored as deduplicated data, together with the data blocks corresponding to the fingerprint entries of those stub files, the method for exporting files to tape according to the embodiments of the invention keeps the original files in deduplicated form on tape. This preserves the benefit of deduplication and saves tape storage space; and because each deduplication domain is kept within a single tape, the original files can be restored and read quickly.
Other features and aspects of the invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic diagram of a backup storage server with a deduplication function;
Fig. 2 is a flowchart of a method for exporting files to tape according to an embodiment of the invention;
Fig. 3 is a flowchart of a method for exporting files to tape according to another embodiment of the invention;
Fig. 4 is a flowchart of a method for exporting files to tape according to yet another embodiment of the invention;
Fig. 5 is a flowchart of the specific steps for determining the candidate stub file subset in a method for exporting files to tape according to an embodiment of the invention;
Fig. 6 is a flowchart of a method for exporting files to tape according to a further embodiment of the invention;
Fig. 7 is a structural block diagram of an apparatus for exporting files to tape according to an embodiment of the invention;
Fig. 8 is a structural block diagram of an apparatus for exporting files to tape according to another embodiment of the invention;
Fig. 9 is a structural block diagram of the execution module of an apparatus for exporting files to tape according to an embodiment of the invention;
Fig. 10 is a structural block diagram of the selection module of an apparatus for exporting files to tape according to an embodiment of the invention;
Fig. 11 is a structural block diagram of an apparatus for exporting files to tape according to a further embodiment of the invention.
Detailed description
Various exemplary embodiments, features, and aspects of the invention are described in detail below with reference to the accompanying drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used herein means "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are given in the detailed description below in order to better explain the invention. Those skilled in the art will understand that the invention can be practiced without some of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as not to obscure the gist of the invention.
Embodiment 1
The method for exporting files to tape according to this embodiment is used to export at least two original files stored as deduplicated data to tape, where the deduplicated data comprises a stub file set, a single instance repository, and a fingerprint database. The original files stored as deduplicated data may be held in a storage server such as the one shown in Fig. 1. As shown in Fig. 1, the single instance repository contains all unique data blocks divided from the original files, for example data block DB11, data block DB12, and so on. The stub file set contains at least two stub files, each corresponding to one of the original files. Each stub file contains at least one fingerprint entry, and each fingerprint entry comprises a fingerprint and location information: the fingerprint identifies a data block divided from the original file corresponding to the stub file, and the location information indicates the position, in the single instance repository, of the data block corresponding to the fingerprint entry. For example, the stub file set contains stub file SF1 and stub file SF2. SF1 contains fingerprint Fp11 corresponding to data block DB11 divided from its original file, fingerprint Fp12 corresponding to data block DB12, and so on. SF2 contains fingerprint Fp21 corresponding to data block DB21 divided from its original file, fingerprint Fp22 corresponding to data block DB22, and so on. The fingerprint database contains each fingerprint and its reference count, where the reference count of a fingerprint is the number of stub files that reference the data block identified by that fingerprint.
Fig. 2 is a flowchart of a method for exporting files to tape according to an embodiment of the invention. As shown in Fig. 2, the method mainly includes:
Step S210: select, from the stub file set, at least one stub file that is to be exported to tape to form a stub file subset, where the total data volume of the stub file subset and the sub single instance repository corresponding to it is not greater than the capacity of one tape, and the sub single instance repository contains all unique data blocks referenced by the stub files in the stub file subset.
Specifically, the stub file subset may be determined according to the data block sharing relationships among the stub files to be exported in the stub file set, or according to the order in which those stub files are stored in the stub file set, provided that the total data volume of the finally determined stub file subset and its corresponding sub single instance repository is not greater than the capacity of one tape. This ensures that each tape is one deduplication domain; that is, the stub file of an original file and its single instance repository are on the same tape, so only one tape needs to be loaded to read an original file from tape.
Step S220: export the stub file subset and the sub single instance repository to one tape.
Step S230: determine whether the stub file set still contains stub files not yet exported to tape. If so, steps S210 and S220 are repeated until all stub files in the stub file set have been exported to tape; if not, the export procedure of this embodiment ends.
In this way, by exporting to one tape the stub files corresponding to original files stored as deduplicated data, together with the data blocks corresponding to the fingerprint entries of those stub files, the method of this embodiment keeps the original files in deduplicated form on tape. This preserves the benefit of deduplication and saves tape storage space; and because each deduplication domain is kept within a single tape, the original files can be restored and read quickly.
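The loop formed by steps S210 to S230 can be sketched roughly as below. This is a sketch under assumptions rather than the method as claimed: the function names, the dictionary-based data model (`stubs` maps a file name to the fingerprints it references, `block_size` and `stub_size` give sizes in arbitrary units), and the first-fit selection in stored order are all hypothetical. The source only requires that each subset, together with its sub single instance repository, fits on one tape.

```python
def subset_size(names, stubs, block_size, stub_size):
    """Total size of the stub files plus the unique blocks they reference."""
    unique = {fp for n in names for fp in stubs[n]}
    return sum(stub_size[n] for n in names) + sum(block_size[fp] for fp in unique)


def export_all(stubs, block_size, stub_size, capacity):
    """Return a list of tapes; each tape is (stub names, set of block fingerprints)."""
    pending = list(stubs)  # stub files not yet exported, in stored order
    tapes = []
    while pending:  # S230: are there stub files still to export?
        subset = []
        for name in pending:  # S210: grow the subset while it still fits on one tape
            if subset_size(subset + [name], stubs, block_size, stub_size) <= capacity:
                subset.append(name)
        if not subset:
            raise ValueError(f"{pending[0]!r} does not fit on an empty tape")
        # The sub single instance repository: unique blocks referenced by the subset.
        blocks = {fp for n in subset for fp in stubs[n]}
        tapes.append((subset, blocks))  # S220: this pair goes onto one tape
        pending = [n for n in pending if n not in subset]
    return tapes
```

Because every tape carries all blocks that its own stub files reference, a block shared by stub files on different tapes is stored once per tape; that duplication is the price of keeping each tape self-contained.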
Embodiment 2
Fig. 3 is a flowchart of a method for exporting files to tape according to another embodiment of the invention. Components in Fig. 3 with the same reference numerals as in Fig. 2 have the same functions; for brevity, detailed descriptions of these components are omitted.
As shown in Fig. 3, the method shown in Fig. 3 differs from the method shown in Fig. 2 mainly in that, in one possible implementation, step S220 of Embodiment 1 may specifically include:
Step S321: modify the fingerprint entries in the stub files of the stub file subset so that the location information in each modified fingerprint entry indicates the position of the corresponding data block in the sub single instance repository.
Step S322: export the modified stub file subset to the tape.
Step S323: export the sub single instance repository to the tape.
Because tape is a sequential-access medium, the modified stub file subset is exported first and the sub single instance repository is then exported to the tape, so that no rewind operation is needed when a user reads an original file, which enables fast restoration of the original file.
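The position rewrite of step S321 can be illustrated with a short sketch. The function name `build_tape_image` and the tuple layout `(fingerprint, position)` are assumptions made for this example; the source does not specify how location information is encoded.

```python
def build_tape_image(subset, sir):
    """Rewrite stub positions so they index the sub single instance repository.

    subset: {file name: [(fingerprint, position in the full SIR), ...]}
    sir:    list of data blocks (the full single instance repository)
    Returns (rewritten stubs, sub_sir), in the order they would be written
    to tape: stub files first (S322), then the data blocks (S323).
    """
    new_pos = {}   # fingerprint -> position in the sub-repository
    sub_sir = []
    rewritten = {}
    for name, entries in subset.items():
        out = []
        for fp, old_pos in entries:
            if fp not in new_pos:
                # First time this block is seen: copy it into the sub-repository.
                new_pos[fp] = len(sub_sir)
                sub_sir.append(sir[old_pos])
            out.append((fp, new_pos[fp]))  # S321: position now refers to sub_sir
        rewritten[name] = out
    return rewritten, sub_sir
```

With the stubs placed ahead of the blocks, a reader can fetch a stub file and then continue forward on the tape to the blocks it names.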
Embodiment 3
Fig. 4 is a flowchart of a method for exporting files to tape according to yet another embodiment of the invention. Components in Fig. 4 with the same reference numerals as in Fig. 3 have the same functions; for brevity, detailed descriptions of these components are omitted.
As shown in Fig. 4, the method shown in Fig. 4 differs from the method shown in Fig. 3 mainly in that, in one possible implementation, the following steps may also be included before step S321:
Step S410: determine a first data block in the sub single instance repository, where the first data block is a data block referenced by only one stub file.
Step S420: replace, in the stub files of the stub file subset, the fingerprint entry corresponding to the first data block with the first data block itself.
Step S430: remove the first data block from the sub single instance repository.
Because of the physical characteristics of tape, restoring original files on tape is extremely time-consuming and inefficient. The method of this embodiment therefore first restores the data blocks that are not shared among the stub files of the stub file subset into that subset, then removes those data blocks from the sub single instance repository, and then performs step S220. This reduces the restoration work needed when a user reads an original file, so that original files can be read quickly without any additional storage being consumed.
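Steps S410 to S430 amount to inlining every block that only one stub file in the subset references. The sketch below illustrates the idea under assumptions: the name `inline_unshared` is hypothetical, stub entries are modelled as a `str` fingerprint or, once inlined, as the raw `bytes` of the block, and sharing is counted across distinct stub files.

```python
from collections import Counter


def inline_unshared(stubs, blocks):
    """Inline blocks referenced by exactly one stub file in the subset.

    stubs:  {file name: [fingerprint, ...]}
    blocks: {fingerprint: bytes}, the sub single instance repository
    Returns (new stubs, remaining blocks).
    """
    # S410: count how many distinct stub files reference each block.
    refs = Counter(fp for entries in stubs.values() for fp in set(entries))
    # S420: replace the fingerprint of an unshared block with the block itself.
    new_stubs = {
        name: [blocks[fp] if refs[fp] == 1 else fp for fp in entries]
        for name, entries in stubs.items()
    }
    # S430: only shared blocks stay in the sub-repository.
    remaining = {fp: data for fp, data in blocks.items() if refs[fp] > 1}
    return new_stubs, remaining
```

Each block still appears exactly once on the tape, either inside its only stub file or in the sub-repository, which is why the total size does not grow.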
Embodiment 4
Fig. 5 illustrates step S210 in the method that file exports to tape of according to embodiments of the present invention 1 Concrete operation step, as it is shown in figure 5, in a kind of possible implementation, this step may include that
Step S510, from described stub file concentrate select predetermined quantity to derive to tape counterfoil literary composition Part composition candidate subset.
Step S520, calculation procedure, calculate described candidate subset and the time corresponding with described candidate subset Selecting the total amount of data in sub-single-instance storehouse, wherein, described candidate sub-single-instance storehouse includes by described candidate The all single data block that stub file in subset is quoted.
Whether the total amount of data that step S530, judgement are calculated is more than the capacity of a tape.If institute The total amount of data calculated is not more than the capacity of a tape, then perform step S540, by described candidate Collection is defined as described stub file subset, otherwise performs step S550, removes one from described candidate subset Stub file also repeats described calculation procedure S520.
Specifically, the number between the stub file of tape can be derived according to described stub file concentration According to block shared relationship, select stub file composition described candidate that the shared data block of predetermined quantity is most Collection.So can ensure that the space utilisation of tape is the biggest, say, that identical tape storage is empty Between the storage data that can maintain up to.
In addition it is also possible to concentrate the preservation deriving the stub file to tape suitable according to described stub file Sequence, the stub file selecting predetermined quantity successively forms described candidate subset.So while it is not guaranteed that magnetic The space utilisation of band is maximum, but can avoid because calculating data block in each subset of stub file collection The number being shared and the amount of calculation brought.Those skilled in the art will be understood that user can root completely The method selecting described candidate subset is set flexibly according to personal like and/or actual application scenarios.
A stub file subset consisting of at least one stub file to be exported to tape is determined first, and the total data amount of this stub file subset and the sub-single-instance store corresponding to it is guaranteed to be not greater than the capacity of one tape. This ensures that one tape is one deduplication domain: when an original file is read from tape, only one tape needs to be loaded, which ensures that the original file can be recovered and read quickly.
Embodiment 5
Fig. 6 illustrates a flowchart of a method for exporting files to tape according to a further embodiment of the present invention. Components in Fig. 6 with the same reference numerals as in Fig. 2 have the same functions; for brevity, detailed descriptions of these components are omitted.
As shown in Fig. 6, the method for exporting files to tape shown in Fig. 6 differs from the method shown in Fig. 2 mainly in that, in one possible implementation, before step S210 of Embodiment 1, the method may further include:
Step S610: determine a second data block in the single-instance store, wherein the second data block is a data block referenced by only one stub file.
Step S620: replace the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block.
Step S630: remove the second data block from the single-instance store.
It should be noted that the method for exporting files to tape of this embodiment may also include steps S410, S420 and S430 described in Embodiment 3. Steps S610, S620 and S630 are likewise performed in view of the physical characteristics of tape: recovering original files on tape is extremely time-consuming and inefficient. Therefore, the method of this embodiment first restores the data blocks that are not shared among the stub files of the stub file set into those stub files, then removes these data blocks from the single-instance store, and then performs step S220. This reduces the recovery work needed when a user reads an original file, so that original files can be read quickly without increasing the amount of storage.
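Steps S610 to S630 can be sketched in Python as follows. This is a minimal sketch under assumed data structures: a stub file is modeled as a list of entries that are either `("fp", fingerprint)` or `("data", block_bytes)`, the store as a dictionary from fingerprint to block, and `ref_counts` stands in for the reference counts kept in the fingerprint library. All of these names are hypothetical.

```python
def inline_unshared_blocks(stubs, store, ref_counts):
    """Inline blocks referenced by only one stub file, then drop them from the store.

    stubs:      name -> list of ("fp", fingerprint) or ("data", bytes) entries
    store:      fingerprint -> block bytes (the single-instance store)
    ref_counts: fingerprint -> number of stub files referencing the block
    """
    # S610: blocks whose reference count is 1 are not shared.
    unshared = {fp for fp, count in ref_counts.items() if count == 1}

    # S620: replace the fingerprint entry with the block itself.
    for entries in stubs.values():
        for i, (kind, value) in enumerate(entries):
            if kind == "fp" and value in unshared:
                entries[i] = ("data", store[value])

    # S630: remove the inlined blocks from the single-instance store.
    for fp in unshared:
        store.pop(fp, None)
    return stubs, store
```

An unshared block is moved rather than copied, which is why inlining it leaves the total amount of data to export essentially unchanged.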
Embodiment 6
The apparatus for exporting files to tape according to embodiments of the present invention is used to export to tape at least two original files saved in the form of deduplicated data, the deduplicated data including a stub file set, a single-instance store and a fingerprint library. The original files saved in the form of deduplicated data may be stored in, for example, the storage server shown in Fig. 1; for details, refer to the related description of the foregoing method embodiments, which is not repeated here.
Fig. 7 illustrates a structural block diagram of an apparatus 700 for exporting files to tape according to an embodiment of the present invention. As shown in Fig. 7, the apparatus 700 mainly includes a selection module 710, an execution module 720 and a judgment module 730. The selection module 710 is configured to select, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, wherein the total data amount of the stub file subset and the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store includes all single data blocks referenced by the stub files in the stub file subset. The execution module 720 is connected to the selection module 710 and is configured to export the stub file subset and the sub-single-instance store to one tape. The judgment module 730 is connected to the execution module 720 and is configured to determine whether the stub file set includes stub files that have not yet been exported to tape.
In one possible implementation, the selection module 710 may determine the stub file subset according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, or according to the save order of those stub files, as long as the total data amount of the finally determined stub file subset and its corresponding sub-single-instance store is not greater than the capacity of one tape. This ensures that one tape is one deduplication domain; that is, the stub file of an original file and the single-instance store it depends on are on the same tape, so that only one tape needs to be loaded when an original file is read from tape.
In this way, the execution module 720 exports to one tape the stub files corresponding to the original files saved in the form of deduplicated data, together with the data blocks corresponding to the fingerprint data of these stub files. The apparatus of this embodiment can therefore still save the original files in deduplicated form; that is, it preserves the value of deduplication and saves tape storage space, and by ensuring that a deduplication domain lies within a single tape, it ensures that original files can be recovered and read quickly.
Embodiment 7
Fig. 8 illustrates a structural block diagram of an apparatus for exporting files to tape according to another embodiment of the present invention. Components in Fig. 8 with the same reference numerals as in Fig. 7 have the same functions; for brevity, detailed descriptions of these components are omitted.
The apparatus shown in Fig. 8 differs from the apparatus shown in Fig. 7 mainly in that, in one possible implementation, the apparatus 700 further includes a restoration module 740. The restoration module 740 is connected to the execution module 720 and is configured to: determine a first data block in the sub-single-instance store, wherein the first data block is a data block referenced by only one stub file; replace the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block; remove the first data block from the sub-single-instance store; and output the stub file subset after replacement to the execution module 720.
In one possible implementation, the restoration module 740 may also be connected to the selection module 710. Specifically, the restoration module 740 is configured to: determine a second data block in the single-instance store, wherein the second data block is a data block referenced by only one stub file; replace the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; remove the second data block from the single-instance store; and output the stub file set after replacement to the selection module 710.
The physical characteristics of tape make recovering original files on tape extremely time-consuming and inefficient. The apparatus of this embodiment therefore first uses the restoration module 740 to restore the data blocks that are not shared among the stub files in the stub file set and/or the stub file subset, and then uses the execution module 720 to export the stub file subset and the sub-single-instance store to one tape. This reduces the recovery work needed when a user reads an original file, so that original files can be read quickly without increasing the amount of storage.
In one possible implementation, as shown in Fig. 9, the execution module 720 may include a modification unit 721 and an export unit 722. Specifically, the modification unit 721 is connected to the selection module 710 and is configured to modify the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data. The export unit 722 is connected to the modification unit 721 and is configured to export the modified stub file subset to the tape and to export the sub-single-instance store to the tape.
Because tape is operated sequentially, the export unit 722 first exports the modified stub file subset and then exports the sub-single-instance store to the tape, so that no rewind operation is needed when a user reads an original file, which enables fast recovery of the original file.
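The work of the modification unit and the export unit can be illustrated with a small Python sketch. It is hedged accordingly: `layout_for_tape` is a hypothetical name, positions are modeled as byte offsets into a concatenated sub-store, and the tape is modeled as an ordered list of records. None of these details are specified by the description.

```python
def layout_for_tape(subset, store):
    """Rewrite position information and order the records for a sequential tape.

    subset: stub file name -> list of fingerprints it references
    store:  fingerprint -> block bytes (the full single-instance store)
    """
    offsets, sub_store, cursor = {}, [], 0
    for fps in subset.values():
        for fp in fps:
            if fp not in offsets:  # each block is stored only once
                offsets[fp] = cursor
                sub_store.append(store[fp])
                cursor += len(store[fp])

    # Modified stub files: each fingerprint now carries its position
    # in the sub-single-instance store instead of the full store.
    stubs_out = {name: [(fp, offsets[fp]) for fp in fps]
                 for name, fps in subset.items()}

    # Stub files are written first and the sub-store second, so a reader
    # moving forward along the tape never has to rewind.
    return [("stubs", stubs_out), ("blocks", b"".join(sub_store))]
```

Blocks of the full store that no stub file in the subset references are simply left out of the sub-store.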
In one possible implementation, as shown in Fig. 10, the selection module 710 may include a candidate subset selection unit 711, a calculation unit 712 and a determination unit 713.
Specifically, the candidate subset selection unit 711 is configured to select, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset. The calculation unit 712 is connected to the candidate subset selection unit 711 and is configured to calculate the total data amount of the candidate subset and the candidate sub-single-instance store corresponding to the candidate subset, wherein the candidate sub-single-instance store includes all single data blocks referenced by the stub files in the candidate subset. The determination unit 713 is connected to the calculation unit 712 and is configured to determine the candidate subset as the stub file subset if the total data amount calculated by the calculation unit is not greater than the capacity of one tape, and otherwise to remove one stub file from the candidate subset.
In one possible implementation, the candidate subset selection unit 711 is configured to: select the predetermined number of stub files in sequence according to the save order of the stub files to be exported to tape in the stub file set to form the candidate subset; or, according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, select the predetermined number of stub files that share the most data blocks to form the candidate subset. Those skilled in the art will understand that the user may flexibly set the method by which the candidate subset selection unit 711 selects the candidate subset according to personal preference and/or the actual application scenario.
A stub file subset consisting of at least one stub file to be exported to tape is determined first, and the total data amount of this stub file subset and the sub-single-instance store corresponding to it is guaranteed to be not greater than the capacity of one tape. This ensures that one tape is one deduplication domain: when an original file is read from tape, only one tape needs to be loaded, which ensures that the original file can be recovered and read quickly.
Embodiment 8
Fig. 11 is a structural block diagram of an apparatus for exporting files to tape according to another embodiment of the present invention. The apparatus 1100 for exporting files to tape may be a host server with computing capability, a personal computer (PC), a portable computer, a terminal, or the like. The specific embodiments of the present invention do not limit the specific implementation of the computing node.
The apparatus 1100 for exporting files to tape includes a processor 1110, a communications interface 1120, a memory 1130 and a bus 1140. The processor 1110, the communications interface 1120 and the memory 1130 communicate with one another through the bus 1140.
The communications interface 1120 is used to communicate with network elements, where the network elements include, for example, a virtual machine management center and shared storage.
The processor 1110 is used to execute a program. The processor 1110 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 1130 is used to store files. The memory 1130 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory. The memory 1130 may also be a memory array. The memory 1130 may also be divided into blocks, and the blocks may be combined into virtual volumes according to certain rules.
In one possible implementation, the above program may be program code that includes computer operation instructions. The program may specifically be used to: select, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, wherein the total data amount of the stub file subset and the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store includes all single data blocks referenced by the stub files in the stub file subset; export the stub file subset and the sub-single-instance store to one tape; and, if the stub file set includes stub files that have not yet been exported to tape, repeat the two foregoing steps until all stub files in the stub file set have been exported to tape.
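The overall control flow of such a program (select a subset that fits, export it, repeat while stub files remain) might look like the following Python sketch. It is an illustration only: `export_all`, the fixed `stub_size`, the `batch` parameter standing in for the predetermined number, and the save-order selection are all assumptions, and the actual tape write is reduced to recording which stub files go to which tape.

```python
def export_all(stubs, block_sizes, stub_size, tape_capacity, batch=2):
    """Return one list of stub file names per tape.

    stubs:       stub file name -> list of fingerprints it references
    block_sizes: fingerprint -> block size in bytes
    """
    pending, tapes = list(stubs), []
    while pending:  # stub files not yet exported to tape
        chosen = pending[:batch]
        while chosen:
            blocks = {fp for name in chosen for fp in stubs[name]}
            total = (stub_size * len(chosen)
                     + sum(block_sizes[fp] for fp in blocks))
            if total <= tape_capacity:
                break
            chosen.pop()  # shrink the subset and recalculate
        if not chosen:
            raise ValueError(f"{pending[0]} does not fit on one tape")
        tapes.append(chosen)  # export the subset and its sub-store to one tape
        pending = pending[len(chosen):]
    return tapes
```

A block shared by stub files that land on different tapes is written to each of those tapes, because every tape must remain a self-contained deduplication domain.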
In one possible implementation, the program may specifically be used to: modify the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data; export the modified stub file subset to the tape; and export the sub-single-instance store to the tape.
In one possible implementation, the program may specifically be used to: determine a first data block in the sub-single-instance store, wherein the first data block is a data block referenced by only one stub file; replace the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block; and remove the first data block from the sub-single-instance store.
In one possible implementation, the program may specifically be used to: select, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset; in a calculation step, calculate the total data amount of the candidate subset and the candidate sub-single-instance store corresponding to the candidate subset, wherein the candidate sub-single-instance store includes all single data blocks referenced by the stub files in the candidate subset; and, if the calculated total data amount is not greater than the capacity of one tape, determine the candidate subset as the stub file subset, and otherwise remove one stub file from the candidate subset and repeat the calculation step.
In one possible implementation, the program may specifically be used to: select the predetermined number of stub files in sequence according to the save order of the stub files to be exported to tape in the stub file set to form the candidate subset; or, according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, select the predetermined number of stub files that share the most data blocks to form the candidate subset.
In one possible implementation, the program may specifically be used to: determine a second data block in the single-instance store, wherein the second data block is a data block referenced by only one stub file; replace the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and remove the second data block from the single-instance store.
The apparatus for exporting files to tape of this embodiment is similar to the apparatuses for exporting files to tape described in Embodiments 6 to 8. Those skilled in the art will understand that the foregoing possible implementations are all applicable to this embodiment and achieve the same beneficial effects, which are not repeated here.
The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person familiar with the technical field can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (14)

1. A method for exporting files to tape, used to export to tape at least two original files saved in the form of deduplicated data, the deduplicated data including a stub file set, a single-instance store and a fingerprint library, wherein the single-instance store includes all single data blocks divided from the original files, the stub file set includes at least two stub files respectively corresponding to the original files, each stub file includes at least one piece of fingerprint data, the fingerprint data includes a fingerprint and position information, the fingerprint identifies a data block divided from the original file corresponding to the stub file, the position information indicates the position, in the single-instance store, of the data block corresponding to the fingerprint data, the fingerprint library includes each fingerprint and its reference count, and the reference count of a fingerprint indicates the number of stub files that reference the data block of the fingerprint, characterized in that the method includes:
selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, wherein the total data amount of the stub file subset and the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store includes all single data blocks referenced by the stub files in the stub file subset;
exporting the stub file subset and the sub-single-instance store to one tape; and
if the stub file set includes stub files that have not yet been exported to tape, repeating the two foregoing steps until all stub files in the stub file set have been exported to tape.
2. The method according to claim 1, characterized in that exporting the stub file subset and the sub-single-instance store to one tape includes:
modifying the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data;
exporting the modified stub file subset to the tape; and
exporting the sub-single-instance store to the tape.
3. The method according to claim 2, characterized in that, before modifying the fingerprint data in the stub files of the stub file subset, the method further includes:
determining a first data block in the sub-single-instance store, wherein the first data block is a data block referenced by one said stub file;
replacing the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block; and
removing the first data block from the sub-single-instance store.
4. The method according to any one of claims 1 to 3, characterized in that selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset includes:
selecting, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset;
a calculation step of calculating the total data amount of the candidate subset and the candidate sub-single-instance store corresponding to the candidate subset, wherein the candidate sub-single-instance store includes all single data blocks referenced by the stub files in the candidate subset; and
if the calculated total data amount is not greater than the capacity of one tape, determining the candidate subset as the stub file subset, and otherwise removing one stub file from the candidate subset and repeating the calculation step.
5. The method according to claim 4, characterized in that selecting, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset includes:
selecting the predetermined number of stub files in sequence according to the save order of the stub files to be exported to tape in the stub file set to form the candidate subset; or
according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, selecting the predetermined number of stub files that share the most data blocks to form the candidate subset.
6. The method according to any one of claims 1 to 3, characterized in that, before selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, the method further includes:
determining a second data block in the single-instance store, wherein the second data block is a data block referenced by one said stub file;
replacing the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and
removing the second data block from the single-instance store.
7. The method according to claim 4, characterized in that, before selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, the method further includes:
determining a second data block in the single-instance store, wherein the second data block is a data block referenced by one said stub file;
replacing the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and
removing the second data block from the single-instance store.
8. The method according to claim 5, characterized in that, before selecting, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, the method further includes:
determining a second data block in the single-instance store, wherein the second data block is a data block referenced by one said stub file;
replacing the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block; and
removing the second data block from the single-instance store.
9. An apparatus for exporting files to tape, used to export to tape at least two original files saved in the form of deduplicated data, the deduplicated data including a stub file set, a single-instance store and a fingerprint library, wherein the single-instance store includes all single data blocks divided from the original files, the stub file set includes at least two stub files respectively corresponding to the original files, each stub file includes at least one piece of fingerprint data, the fingerprint data includes a fingerprint and position information, the fingerprint identifies a data block divided from the original file corresponding to the stub file, the position information indicates the position, in the single-instance store, of the data block corresponding to the fingerprint data, the fingerprint library includes each fingerprint and its reference count, and the reference count of a fingerprint indicates the number of stub files that reference the data block of the fingerprint, characterized in that the apparatus includes:
a selection module, configured to select, from the stub file set, at least one stub file to be exported to tape to form a stub file subset, wherein the total data amount of the stub file subset and the sub-single-instance store corresponding to the stub file subset is not greater than the capacity of one tape, and the sub-single-instance store includes all single data blocks referenced by the stub files in the stub file subset;
an execution module, connected to the selection module and configured to export the stub file subset and the sub-single-instance store to one tape; and
a judgment module, connected to the execution module and the selection module and configured to determine whether the stub file set includes stub files that have not yet been exported to tape.
10. The apparatus according to claim 9, characterized in that the execution module includes:
a modification unit, connected to the selection module and configured to modify the fingerprint data in the stub files of the stub file subset, so that the position information in the modified fingerprint data indicates the position, in the sub-single-instance store, of the data block corresponding to the fingerprint data; and
an export unit, connected to the modification unit and configured to export the modified stub file subset to the tape and to export the sub-single-instance store to the tape.
11. The apparatus according to claim 10, characterized by further including a restoration module, wherein the restoration module is connected to the execution module and is configured to:
determine a first data block in the sub-single-instance store, wherein the first data block is a data block referenced by one said stub file; and
replace the fingerprint data corresponding to the first data block in the stub files of the stub file subset with the first data block, remove the first data block from the sub-single-instance store, and output the stub file subset after replacement to the modification unit.
12. The apparatus according to any one of claims 9 to 11, characterized in that the selection module includes:
a candidate subset selection unit, configured to select, from the stub file set, a predetermined number of stub files to be exported to tape to form a candidate subset;
a calculation unit, connected to the candidate subset selection unit and configured to calculate the total data amount of the candidate subset and the candidate sub-single-instance store corresponding to the candidate subset, wherein the candidate sub-single-instance store includes all single data blocks referenced by the stub files in the candidate subset; and
a determination unit, connected to the calculation unit and configured to determine the candidate subset as the stub file subset if the total data amount calculated by the calculation unit is not greater than the capacity of one tape, and otherwise to remove one stub file from the candidate subset.
13. The apparatus according to claim 12, characterized in that the candidate subset selection unit is configured to:
select the predetermined number of stub files in sequence according to the save order of the stub files to be exported to tape in the stub file set to form the candidate subset; or
according to the data block sharing relationships among the stub files to be exported to tape in the stub file set, select the predetermined number of stub files that share the most data blocks to form the candidate subset.
14. The apparatus according to claim 11, characterized in that the restoration module is further connected to the selection module and is further configured to:
determine a second data block in the single-instance store, wherein the second data block is a data block referenced by one said stub file; and
replace the fingerprint data corresponding to the second data block in the stub files of the stub file set with the second data block, remove the second data block from the single-instance store, and output the stub file set after replacement to the selection module.
CN201310513281.6A 2013-10-25 2013-10-25 A kind of method and apparatus that file is exported to tape Active CN103577565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310513281.6A CN103577565B (en) 2013-10-25 2013-10-25 A kind of method and apparatus that file is exported to tape

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310513281.6A CN103577565B (en) 2013-10-25 2013-10-25 A kind of method and apparatus that file is exported to tape

Publications (2)

Publication Number Publication Date
CN103577565A CN103577565A (en) 2014-02-12
CN103577565B true CN103577565B (en) 2017-01-04

Family

ID=50049341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310513281.6A Active CN103577565B (en) 2013-10-25 2013-10-25 A kind of method and apparatus that file is exported to tape

Country Status (1)

Country Link
CN (1) CN103577565B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647077B (en) * 2019-09-26 2020-12-25 珠海格力电器股份有限公司 Control method and system of industrial control device, storage medium and industrial control device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049391A (en) * 2012-12-29 2013-04-17 华为技术有限公司 Data processing method, data format and equipment
CN103154950A (en) * 2012-05-04 2013-06-12 华为技术有限公司 Repeated data deleting method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8108638B2 (en) * 2009-02-06 2012-01-31 International Business Machines Corporation Backup of deduplicated data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
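The Effectiveness of Deduplication on Virtual Machine Disk Images; Keren Jin et al.; 《Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference》; 20090527; 7-13 *
Application of Disk-Based Data Deduplication Technology in Data Backup Systems; Ma Xikun et al.; 《China Medical Devices》 (中国医疗设备); 20121031; Vol. 27 (No. 10); 78-79, 171 *
Research on the Application of Data Deduplication Algorithms in VTL Systems; Sun Huwei et al.; 《Microcomputer & Its Applications》 (微型机与应用); 20130607; Vol. 32 (No. 6); 82-85 *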

Similar Documents

Publication Publication Date Title
CN104461390B (en) Write data into the method and device of imbricate magnetic recording SMR hard disks
CN102629258B (en) Repeating data deleting method and device
CN104866497B (en) The metadata updates method, apparatus of distributed file system column storage, host
CN104091617B (en) Flash memory equipment detection method and apparatus thereof
CN103473298B (en) Data archiving method and device and storage system
CN106909372B (en) Method and system for calculating purchase path of mobile terminal user
CN108491333A (en) Method for writing data, device, equipment and the medium of buffer circle
CN102750317B (en) Method and device for data persistence processing and data base system
CN103370691A (en) Managing buffer overflow conditions
CN106469120A (en) Scrap cleaning method, device and equipment
CN104731515B (en) Control the method and apparatus of storage device group of planes abrasion equilibrium
CN101945131A (en) Storage virtualization-based data migration method
CN109558213A (en) The method and apparatus for managing the virtual machine snapshot of OpenStack platform
CN108021449A (en) One kind association journey implementation method, terminal device and storage medium
CN111930716A (en) Database capacity expansion method, device and system
CN103294799B (en) A kind of data parallel batch imports the method and system of read-only inquiry system
CN110134646B (en) Knowledge platform service data storage and integration method and system
CN107368545A (en) A kind of De-weight method and device based on MerkleTree deformation algorithms
CN103577565B (en) A kind of method and apparatus that file is exported to tape
CN104050057A (en) Historical sensed data duplicate removal fragment eliminating method and system
CN116339643B (en) Formatting method, formatting device, formatting equipment and formatting medium for disk array
CN103177173B (en) Method and device for selecting network virtual character attiring
CN107783826A (en) A kind of virtual machine migration method, apparatus and system
CN103210389B (en) A kind for the treatment of method and apparatus of metadata
CN104246716A (en) Method and device for processing storage space object

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant