CN107506150A - Distributed storage device, and distributed storage deduplication, write, delete, and read methods and systems
- Publication number
- CN107506150A (CN201710764079.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- fingerprint
- target
- osd
- distributed storage
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
- G06F3/0641—De-duplication techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0652—Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Abstract
This application discloses a distributed storage device and distributed storage deduplication, write, delete, and read methods and systems, applied to a distributed storage device. The deduplication method includes: obtaining, at the unified storage layer, the target object data fingerprint of a target data object, and storing the target object data fingerprint on the corresponding OSD; computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object; determining whether the target OSD already holds a historical data object; and, if it does, incrementing the reference count of the historical data object by one. The invention uses the target object data fingerprint to locate the target OSD directly, establishing a correspondence between object data fingerprints and OSDs, so that duplicate data can be identified directly. This avoids the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improves the efficiency of distributed storage deduplication.
Description
Technical field
The present invention relates to the field of storage technology, and in particular to a distributed storage device and to distributed storage deduplication, write, delete, and read methods and systems.
Background art
A distributed storage system spreads data across multiple independent devices. A traditional network storage system keeps all data on a centralized storage server, which becomes both the performance bottleneck of the system and the focal point for reliability and security concerns, and so cannot meet the needs of mass-storage applications. A distributed network storage system uses a scalable architecture: it shares the storage load across multiple storage servers and uses a location server to locate stored information. This improves the reliability, availability, and access efficiency of the system and also makes it easy to scale.
Data deduplication (De-duplication) is a capacity optimization technique. By eliminating duplicate data in a storage system, it reduces the amount of data actually stored or transmitted over the network, and it is widely used in backup, long-term archiving, and disaster recovery. Both industry and academia have produced corresponding products and research results. In recent years, with the development of software-defined storage and the emergence of virtualization technology, special-purpose processors, and new storage media, and given the need to process duplicate data online and to lower the cost per unit of storage capacity, the demand for data deduplication has become increasingly urgent.
Current research on online deduplication for distributed storage mainly creates a fingerprint database and uses it to determine whether data is duplicated. Regardless of how the fingerprint database is stored, matching queries must be carried out across the distributed storage network, which is inefficient.
Summary of the invention
In view of this, the object of the present invention is to provide a distributed storage device and distributed storage deduplication, write, delete, and read methods and systems that avoid the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improve the efficiency of distributed storage deduplication. The specific scheme is as follows:
A distributed storage deduplication method, applied to a distributed storage device, including:
Obtaining, at the unified storage layer, the target object data fingerprint of a target data object, and storing the target object data fingerprint on the corresponding OSD;
Computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object;
Determining whether the target OSD already holds a historical data object;
If the historical data object is already held, incrementing the reference count of the historical data object by one.
Optionally, the process of obtaining the target object data fingerprint of the target data object at the unified storage layer includes:
At the unified storage layer, using the unique identifier of the target data object to obtain the target object data fingerprint through a hash function.
The invention also discloses a distributed storage write method, applied to a distributed storage device, including:
Using the unique identifier of a target data object to query whether a target object data fingerprint exists;
If it exists, computing a new target object data fingerprint from the target data object, and storing the new target object data fingerprint on the corresponding OSD;
Then using the new target object data fingerprint to store the target data object on the corresponding target OSD;
Decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
Optionally, the process of computing the new target object data fingerprint from the target data object and storing the new target object data fingerprint on the corresponding OSD includes:
Determining whether the data block length of the target data object equals the preset chunk size;
If it is equal, computing a first new target object data fingerprint of the target data object with the preset algorithm, and storing the first new target object data fingerprint on the corresponding OSD;
If the data block length of the target data object is less than the chunk size, using the unique identifier of the target data object to obtain the target object data fingerprint, and using the target object data fingerprint to obtain the old target data object;
Splicing the old target data object with the target data object to obtain a spliced data object, and computing a second new target object data fingerprint of the spliced data object with the preset algorithm;
Storing the second new target object data fingerprint on the corresponding OSD.
Optionally, the process of decrementing by one the reference count of the data object corresponding to the target object data fingerprint includes:
When the data block length of the target data object is less than the preset chunk size, using the target object data fingerprint to find the old reference count corresponding to the target object data fingerprint, and decrementing the old reference count by one;
If the old reference count is zero, deleting the old target data object.
The invention also discloses a distributed storage delete method, applied to a distributed storage device, including:
Using the unique identifier of a historical data object to find the history object data fingerprint of the historical data object;
Using the history object data fingerprint to find the reference count of the historical data object, and decrementing the reference count of the historical data object by one;
If the reference count of the historical data object is zero after the decrement, deleting the historical data object and the data related to it.
Optionally, the process of using the unique identifier of the historical data object to find the history object data fingerprint of the historical data object includes:
Applying a hash function to the unique identifier of the historical data object to obtain the OSD of the history object unique identifier, and finding the history object data fingerprint on that OSD.
Optionally, the process of using the history object data fingerprint to find the reference count of the historical data object includes:
Applying a hash function to the history object data fingerprint to obtain the OSD that holds the reference count of the historical data object, and thereby finding the reference count of the historical data object.
Optionally, the process of deleting the historical data object and the data related to it includes:
Deleting the historical data object and the history object data fingerprint.
The invention also discloses a distributed storage read method, applied to a distributed storage device, including:
Using the unique identifier of a target data object to compute, with a preset algorithm, the OSD of the target data object unique identifier;
Reading the target object data fingerprint stored on the OSD of the target data object unique identifier;
Using the target object data fingerprint to compute, with the preset algorithm, the target OSD of the target data object, and reading the target data object.
The invention also discloses a distributed storage deduplication system, applied to a distributed storage device. The system includes:
A fingerprint acquisition module, for obtaining the target object data fingerprint of a target data object at the unified storage layer;
A fingerprint storage module, for storing the target object data fingerprint on the corresponding OSD;
A position computation module, for computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object;
A judgment module, for determining whether the target OSD already holds a historical data object;
A count change module, for incrementing the reference count of the historical data object by one if the historical data object is already held.
The invention also discloses a distributed storage write system, applied to a distributed storage device, including:
A fingerprint query module, for using the unique identifier of a target data object to query whether a target object data fingerprint exists;
A new fingerprint storage module, for computing, if the fingerprint exists, a new target object data fingerprint from the target data object and storing the new target object data fingerprint on the corresponding OSD;
A data object storage module, for then using the new target object data fingerprint to store the target data object on the corresponding target OSD;
A count change module, for decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
The invention also discloses a distributed storage delete system, applied to a distributed storage device, including:
A fingerprint lookup module, for using the unique identifier of a historical data object to find the history object data fingerprint of the historical data object;
A count change module, for using the history object data fingerprint to find the reference count of the historical data object and decrementing the reference count of the historical data object by one;
A deletion module, for deleting the historical data object and the data related to it if the reference count of the historical data object is zero after the decrement.
The invention also discloses a distributed storage read system, applied to a distributed storage device, including:
An identifier lookup module, for using the unique identifier of a target data object to compute, with a preset algorithm, the OSD of the target data object unique identifier;
A fingerprint read module, for reading the target object data fingerprint stored on the OSD of the target data object unique identifier;
A data read module, for using the target object data fingerprint to compute, with the preset algorithm, the target OSD of the target data object, and reading the target data object.
The invention also discloses a distributed storage device that includes the foregoing distributed storage deduplication system, the foregoing distributed storage write system, the foregoing distributed storage delete system, and the foregoing distributed storage read system.
In the present invention, the distributed storage deduplication method, applied to a distributed storage device, includes: obtaining, at the unified storage layer, the target object data fingerprint of a target data object, and storing the target object data fingerprint on the corresponding OSD; computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object; determining whether the target OSD already holds a historical data object; and, if it does, incrementing the reference count of the historical data object by one.
The present invention obtains the target object data fingerprint of the target data object at the unified storage layer and computes on it with a preset algorithm to obtain the target OSD of the target data object, and then determines whether the target OSD already holds a historical data object. Because the OSD is computed from the object data fingerprint, a historical data object already held on the target OSD shows that the target data object and the historical data object are the same data, so the reference count of the historical data object is incremented by one. The target object data fingerprint is used to locate the target OSD directly, establishing a correspondence between object data fingerprints and OSDs, so that duplicate data can be identified directly. This avoids the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improves the efficiency of distributed storage deduplication.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of a distributed storage deduplication method disclosed in an embodiment of the present invention;
Fig. 2 is a flow diagram of a distributed storage write method disclosed in an embodiment of the present invention;
Fig. 3 is a flow diagram of a distributed storage delete method disclosed in an embodiment of the present invention;
Fig. 4 is a flow diagram of a distributed storage read method disclosed in an embodiment of the present invention;
Fig. 5 is a structural diagram of a distributed storage deduplication system disclosed in an embodiment of the present invention;
Fig. 6 is a structural diagram of a distributed storage write system disclosed in an embodiment of the present invention;
Fig. 7 is a structural diagram of a distributed storage delete system disclosed in an embodiment of the present invention;
Fig. 8 is a structural diagram of a distributed storage read system disclosed in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
An embodiment of the invention discloses a distributed storage deduplication method, shown in Fig. 1, applied to a distributed storage device. The method includes:
Step S11: Obtain the target object data fingerprint of the target data object at the unified storage layer, and store the target object data fingerprint on the corresponding OSD.
Specifically, at the unified storage layer, the target object data fingerprint is obtained through a hash function using the unique identifier of the target data object. Whether or not the target data object is duplicate data, the target object data fingerprint is needed to establish the reference relationship between a copy and the historical data object, so the target object data fingerprint is saved to the corresponding OSD (Object Storage Device).
Step S12: Compute on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object.
Specifically, a hash function is applied to the target object data fingerprint, and the result of the hash calculation is mapped to the target OSD of the target data object.
It should be understood that the preset algorithm may be a hash function or any other algorithm that serves a similar purpose; the type of preset algorithm is not limited here.
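As a minimal sketch of step S12, the mapping from a fingerprint to a target OSD might look like the following. SHA-256, the modulo placement rule, the cluster size `NUM_OSDS`, and the function names are all assumptions made for illustration; the source only states that the preset algorithm may be a hash function.

```python
import hashlib

NUM_OSDS = 8  # hypothetical cluster size, not taken from the source


def fingerprint(data: bytes) -> str:
    """Content fingerprint of a data object (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def locate_osd(key: str, num_osds: int = NUM_OSDS) -> int:
    """Map a key (a fingerprint or an object identifier) to an OSD index."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it onto the OSD range.
    return int.from_bytes(digest[:8], "big") % num_osds


fp_a = fingerprint(b"same bytes")
fp_b = fingerprint(b"same bytes")
# Identical content yields identical fingerprints, hence the same target OSD.
same_osd = locate_osd(fp_a) == locate_osd(fp_b)
```

Because placement is a pure function of the fingerprint, a duplicate check only needs to contact one OSD, which is the property the method relies on.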
Step S13: Determine whether the target OSD already holds a historical data object.
Specifically, after the target OSD of the target data object has been computed from the target object data fingerprint, if the target OSD already holds a historical data object, the history object data fingerprint of that historical data object is identical to the target object data fingerprint. The historical data object and the target data object therefore hold the same data content and are duplicate data. If the target OSD does not hold a historical data object, no duplicate data exists and the target data object can be stored.
Step S14: If a historical data object is already held, increment the reference count of the historical data object by one.
Specifically, if the target OSD already holds a historical data object, the target data object is duplicate data. The target data object is therefore not stored again; instead, the reference count of the historical data object is incremented by one to show that a new user is using the historical data object.
It should be understood that if no historical data object is held, the target data object is stored on its target OSD, and a reference count for the target data object is created and incremented by one.
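Steps S11 to S14 can be sketched as follows, assuming an in-memory stand-in for each OSD. The `OSD` class, the `place` and `dedup_write` helpers, and the use of SHA-256 over the object content as the fingerprint are hypothetical choices for illustration and are not defined by the source.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class OSD:
    """Hypothetical in-memory stand-in for one object storage device."""
    fingerprints: dict = field(default_factory=dict)  # object ID -> fingerprint
    objects: dict = field(default_factory=dict)       # fingerprint -> data
    refcounts: dict = field(default_factory=dict)     # fingerprint -> count


def place(key: str, osds: list) -> OSD:
    """Pick an OSD by hashing the key (assumed placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return osds[int.from_bytes(digest[:8], "big") % len(osds)]


def dedup_write(osds: list, object_id: str, data: bytes) -> bool:
    """Store data once per distinct content; return True if it was a duplicate."""
    fp = hashlib.sha256(data).hexdigest()
    place(object_id, osds).fingerprints[object_id] = fp  # S11: save fingerprint
    target = place(fp, osds)                             # S12: fingerprint -> OSD
    if fp in target.objects:                             # S13: already held?
        target.refcounts[fp] += 1                        # S14: count the new user
        return True
    target.objects[fp] = data                            # first copy: store it
    target.refcounts[fp] = 1
    return False


osds = [OSD() for _ in range(4)]
first = dedup_write(osds, "vol1/blk0", b"hello")
second = dedup_write(osds, "vol2/blk7", b"hello")
```

In this sketch the second write stores no new data; it only raises the reference count on the OSD that already holds the content.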
As can be seen, this embodiment of the present invention obtains the target object data fingerprint of the target data object at the unified storage layer and computes on it with a preset algorithm to obtain the target OSD of the target data object, and then determines whether the target OSD already holds a historical data object. Because the OSD is computed from the object data fingerprint, a historical data object already held on the target OSD shows that the target data object and the historical data object are the same data, so the reference count of the historical data object is incremented by one. The target object data fingerprint is used to locate the target OSD directly, establishing a correspondence between object data fingerprints and OSDs, so that duplicate data can be identified directly. This avoids the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improves the efficiency of distributed storage deduplication.
Because the present invention changes the index relationship among the object data unique identifier, the object data fingerprint, and the object data device OSD, an embodiment of the invention also discloses a distributed storage write method, shown in Fig. 2, applied to a distributed storage device. The method includes:
Step S21: Use the unique identifier of the target data object to query whether a target object data fingerprint exists.
Specifically, the target object data fingerprint is computed from the unique identifier of the target data object through a hash function, so as to query whether the target object data fingerprint exists and thereby determine whether the write is a create write or a modify write.
Step S22: If it exists, compute a new target object data fingerprint from the target data object, and store the new target object data fingerprint on the corresponding OSD.
Specifically, if it does not exist, the target data object is stored by proceeding from step S12 of the previous embodiment.
The process of computing the new target object data fingerprint from the target data object and storing it on the corresponding OSD may include the following steps S221 to S225:
Step S221: Determine whether the data block length of the target data object equals the preset chunk size.
Specifically, determining whether the data block length of the target data object equals the preset chunk size reveals the extent of the modification to the target data object.
Step S222: If it is equal, compute the first new target object data fingerprint of the target data object with the preset algorithm, and store the first new target object data fingerprint on the corresponding OSD.
Specifically, if the data block length of the target data object equals the preset chunk size, all of the data of the target data object differs from that of the old target data object. The first new target object data fingerprint of the target data object can therefore be obtained by hash calculation and stored on the corresponding OSD.
Step S223: If the data block length of the target data object is less than the chunk size, use the unique identifier of the target data object to obtain the target object data fingerprint, and use the target object data fingerprint to obtain the old target data object.
Specifically, this case shows that the target data object was obtained by modifying part of the data of the old target data object. The unique identifier shared by the target data object and the old target data object is therefore first run through the hash function to obtain the target object data fingerprint, and the old target data object is then obtained using the target object data fingerprint.
Step S224: Splice the old target data object with the target data object to obtain a spliced data object, and compute the second new target object data fingerprint of the spliced data object with the preset algorithm.
Specifically, old target data objects are spliced with target data objects, make
Step S225: Store the second new target object data fingerprint on the corresponding OSD.
It should be noted that the new target object data fingerprint covers both the first new target object data fingerprint and the second new target object data fingerprint.
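The branch in steps S221 to S225 can be sketched as below. The source does not say how the partial data is positioned when it is spliced with the old object, so the `offset` parameter, the zero padding of a short old object, `BLOCK_SIZE`, and the function name are assumptions made only for this illustration.

```python
import hashlib

BLOCK_SIZE = 8  # hypothetical preset chunk size in bytes


def new_fingerprint(new_data: bytes, old_data: bytes = b"", offset: int = 0):
    """Return (fingerprint, block content) for a full or a partial block write."""
    if len(new_data) == BLOCK_SIZE:
        # S222: the whole block is rewritten, so fingerprint the new data directly.
        merged = new_data
    else:
        # S223/S224: partial write, so splice the new bytes into the old block.
        if offset < 0 or offset + len(new_data) > BLOCK_SIZE:
            raise ValueError("write does not fit inside one block")
        buf = bytearray(old_data.ljust(BLOCK_SIZE, b"\0"))
        buf[offset:offset + len(new_data)] = new_data
        merged = bytes(buf)
    # The fingerprint is always taken over the full block content.
    return hashlib.sha256(merged).hexdigest(), merged


full_fp, full_block = new_fingerprint(b"12345678")
part_fp, part_block = new_fingerprint(b"XY", old_data=b"abcdefgh", offset=2)
```

Fingerprinting the spliced block rather than the partial data keeps the fingerprint comparable with those of full-block writes that carry the same content.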
Step S23: Then use the new target object data fingerprint to store the target data object on the corresponding target OSD.
Step S24: Decrement by one the reference count of the data object corresponding to the target object data fingerprint.
Specifically, the target object data fingerprint is used to find the old reference count corresponding to it, and the old reference count is decremented by one. The target data object was obtained by modifying the old target data object, so the two share the same object unique identifier, and the target object data fingerprint is the old target object data fingerprint of the old target data object. The target object data fingerprint can therefore be used to find the corresponding old reference count. Because the old target data object has been changed into the target data object, the two are no longer the same, so the old reference count is decremented by one.
If the old reference count is zero, the old target data object corresponding to the old reference count is deleted.
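A modify write following steps S21 to S24 might be sketched as follows. For brevity the sketch keeps the three tables in one `DedupStore` object rather than spreading them across OSDs; the class, its methods, the SHA-256 fingerprint, and the early return for unchanged content are all assumptions that the source does not specify.

```python
import hashlib


class DedupStore:
    """Hypothetical single-process model of the three index tables."""

    def __init__(self):
        self.fingerprints = {}  # object ID -> fingerprint
        self.objects = {}       # fingerprint -> data
        self.refcounts = {}     # fingerprint -> reference count

    def _release(self, fp: str) -> None:
        # S24: drop one reference; remove the data when nothing refers to it.
        self.refcounts[fp] -= 1
        if self.refcounts[fp] == 0:
            del self.refcounts[fp]
            del self.objects[fp]

    def write(self, object_id: str, data: bytes) -> None:
        old_fp = self.fingerprints.get(object_id)      # S21: create or modify?
        new_fp = hashlib.sha256(data).hexdigest()      # S22: new fingerprint
        if old_fp == new_fp:
            return  # unchanged content: nothing to do (assumed behaviour)
        self.fingerprints[object_id] = new_fp
        if new_fp in self.objects:                     # S23: store or reference
            self.refcounts[new_fp] += 1
        else:
            self.objects[new_fp] = data
            self.refcounts[new_fp] = 1
        if old_fp is not None:
            self._release(old_fp)                      # S24: old count minus one


store = DedupStore()
store.write("a", b"v1")
store.write("b", b"v1")   # duplicate of "a"
store.write("a", b"v2")   # modify write: "a" stops referencing v1
```

After the last call the content `v1` is still held because object `b` refers to it; it would be removed only once `b` is also rewritten or deleted.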
An embodiment of the invention also discloses a distributed storage delete method, shown in Fig. 3, applied to a distributed storage device. The method includes:
Step S31: Use the unique identifier of the historical data object to find the history object data fingerprint of the historical data object.
Specifically, the object data fingerprint and the object unique identifier are stored on the same OSD, the data object and its reference count are stored on the same OSD, and the OSD of the data object is computed from the object data fingerprint through a hash function. Therefore, when a user deletes a historical data object stored on an OSD, the unique identifier of the historical data object is run through the hash function to obtain the OSD of the history object unique identifier, and the history object data fingerprint is found on that OSD.
Step S32: Use the history object data fingerprint to find the reference count of the historical data object, and decrement the reference count of the historical data object by one.
Specifically, because the historical data object may still be in use by other users, each time a user deletes the historical data object its reference count is decremented by one instead of the object being deleted directly. The history object data fingerprint is run through the hash function to obtain the OSD that holds the reference count of the historical data object, and the reference count is found there.
Step S33: If the reference count of the historical data object is zero after the decrement, delete the historical data object and the data related to it.
Specifically, when no user references the historical data object any longer, that is, when its reference count is zero after the decrement, the historical data object and the data related to it are deleted.
Here, the data related to the historical data object is the history object data fingerprint.
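Steps S31 to S33 can be sketched as below, with the three tables passed in as plain dictionaries. The function name, the table layout, and the choice to remove the deleted identifier's own fingerprint entry on every delete are assumptions for illustration; OSD placement is omitted to keep the sketch short.

```python
import hashlib


def delete_object(object_id, fingerprints, objects, refcounts) -> bool:
    """Drop one reference; return True if the stored data was removed too."""
    # S31: identifier -> fingerprint (this identifier's entry is removed here,
    # which is an assumed detail).
    fp = fingerprints.pop(object_id)
    # S32: fingerprint -> reference count, minus one.
    refcounts[fp] -= 1
    # S33: delete the data only when no user references it any longer.
    if refcounts[fp] == 0:
        del refcounts[fp]
        del objects[fp]
        return True
    return False


fp = hashlib.sha256(b"shared").hexdigest()
fingerprints = {"a": fp, "b": fp}  # two identifiers share one stored object
objects = {fp: b"shared"}
refcounts = {fp: 2}

removed_first = delete_object("a", fingerprints, objects, refcounts)
removed_second = delete_object("b", fingerprints, objects, refcounts)
```

The first delete only lowers the count because object `b` still refers to the data; the second delete brings the count to zero and removes it.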
An embodiment of the invention also discloses a distributed storage read method, shown in Fig. 4, applied to a distributed storage device. The method includes:
Step S41: Use the unique identifier of the target data object to compute, with a preset algorithm, the OSD of the target data object unique identifier;
Step S42: Read the target object data fingerprint stored on the OSD of the target data object unique identifier;
Step S43: Use the target object data fingerprint to compute, with the preset algorithm, the target OSD of the target data object, and read the target data object.
Specifically, because the target data object unique identifier and the target object data fingerprint are stored on the same OSD, the preset algorithm can be applied to the unique identifier to compute the OSD of the target data object unique identifier, and the target object data fingerprint stored there can be read. Because the target OSD of the target data object is computed from the target object data fingerprint with the preset algorithm, the target object data fingerprint is then used to find the target OSD and read the target data object.
The preset algorithm may be a hash function.
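The two-hop read path of steps S41 to S43 can be sketched as follows. Each OSD is modelled as a dictionary with two tables; that layout, the helper names, the `put` function used to seed the example, and the SHA-256 modulo placement are assumptions rather than details given by the source.

```python
import hashlib


def osd_index(key: str, num_osds: int) -> int:
    """Assumed preset algorithm: hash the key and fold it onto the OSD range."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_osds


def put(osds: list, object_id: str, data: bytes) -> None:
    """Seed the example: store the fingerprint by ID and the data by fingerprint."""
    fp = hashlib.sha256(data).hexdigest()
    osds[osd_index(object_id, len(osds))]["fingerprints"][object_id] = fp
    osds[osd_index(fp, len(osds))]["objects"][fp] = data


def read(osds: list, object_id: str) -> bytes:
    id_osd = osds[osd_index(object_id, len(osds))]  # S41: ID -> OSD
    fp = id_osd["fingerprints"][object_id]          # S42: read the fingerprint
    data_osd = osds[osd_index(fp, len(osds))]       # S43: fingerprint -> target OSD
    return data_osd["objects"][fp]


osds = [{"fingerprints": {}, "objects": {}} for _ in range(4)]
put(osds, "vol1/blk0", b"payload")
result = read(osds, "vol1/blk0")
```

Both hops are computed locally from a hash, so the read needs no lookup in a separate fingerprint database.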
Correspondingly, an embodiment of the present invention also discloses a distributed storage deduplication system. Referring to FIG. 5, the system is applied to a distributed storage device and includes:
Fingerprint acquisition module 11, configured to obtain the target object data fingerprint of a target data object at the unified storage layer;
Fingerprint storage module 12, configured to store the target object data fingerprint in the corresponding OSD;
Position calculation module 13, configured to perform a calculation on the target object data fingerprint using a preset algorithm to obtain the target OSD of the target data object;
Judgment module 14, configured to determine whether the target OSD already holds a historical data object;
Count change module 15, configured to increment the reference count of the historical data object by one if the historical data object is already held.
It can be seen that, in this embodiment, the target object data fingerprint of the target data object is obtained at the unified storage layer and the preset algorithm is applied to it to obtain the target OSD of the target data object, after which it is determined whether the target OSD already holds a historical data object. Because the OSD is calculated from the object data fingerprint, a historical data object found in the target OSD indicates that the target data object and the historical data object are the same data, so the reference count of the historical data object is incremented by one. Locating the target OSD directly from the target object data fingerprint establishes a correspondence between object data fingerprints and OSDs, so duplicate data can be detected directly. This avoids the inefficiency of running matching queries against a fingerprint database across the distributed storage network, and improves the efficiency of distributed storage deduplication.
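The flow through modules 11 to 15 can be sketched in Python as follows. This is a toy model under stated assumptions: SHA-256 serves as both the fingerprint and the placement hash, OSDs are in-memory dictionaries keyed by fingerprint, and the class and method names (`Cluster`, `osd_for`, `dedup_store`) are hypothetical. The text only specifies what happens when a historical data object is found; storing a first copy with a count of one is an assumption of this sketch.

```python
import hashlib


class Cluster:
    """Toy model: each OSD is a dict of in-memory maps (illustrative only)."""

    def __init__(self, num_osds: int = 4):
        self.osds = [
            {"fingerprints": {}, "objects": {}, "refcounts": {}}
            for _ in range(num_osds)
        ]

    def osd_for(self, key: bytes) -> dict:
        # Assumed preset algorithm: hash the key, reduce to an OSD index.
        h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
        return self.osds[h % len(self.osds)]

    def dedup_store(self, object_id: bytes, data: bytes) -> bool:
        """Store data; return True when an existing copy was reused."""
        fp = hashlib.sha256(data).digest()                        # module 11
        self.osd_for(object_id)["fingerprints"][object_id] = fp   # module 12
        target = self.osd_for(fp)                                 # module 13
        if fp in target["objects"]:                               # module 14
            target["refcounts"][fp] += 1                          # module 15
            return True
        # Assumption: a first copy is stored with a reference count of one.
        target["objects"][fp] = data
        target["refcounts"][fp] = 1
        return False
```

No fingerprint database is consulted anywhere: the duplicate check is a single lookup on the OSD that the fingerprint hashes to.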
In this embodiment, the fingerprint acquisition module 11 may be specifically configured to obtain the target object data fingerprint at the unified storage layer by applying a hash function using the unique identifier of the target data object.
Correspondingly, an embodiment of the present invention also discloses a distributed storage write system. Referring to FIG. 6, the system is applied to a distributed storage device and includes:
Fingerprint query module 21, configured to query, using the unique identifier of the target data object, whether a target object data fingerprint exists;
New fingerprint storage module 22, configured to, if the fingerprint exists, calculate a new target object data fingerprint from the target data object and store the new target object data fingerprint in the corresponding OSD;
Data object storage module 23, configured to then use the new target object data fingerprint to store the target data object in the corresponding target OSD;
Count change module 24, configured to decrement by one the reference count of the data object corresponding to the target object data fingerprint.
In this embodiment of the present invention, the new fingerprint storage module 22 may include a length determination unit, a first fingerprint storage unit, a data object acquisition unit, a data object splicing unit, and a second fingerprint storage unit, wherein:
Length determination unit, configured to determine whether the data block length of the target data object is equal to a preset chunk size;
First fingerprint storage unit, configured to, if the two are equal, calculate a first new target object data fingerprint of the target data object using the preset algorithm, and store the first new target object data fingerprint in the corresponding OSD;
Data object acquisition unit, configured to, if the data block length of the target data object is less than the chunk size, obtain the target object data fingerprint using the unique identifier of the target data object, and obtain the old target data object using the target object data fingerprint;
Data object splicing unit, configured to splice the old target data object with the target data object to obtain a spliced data object, and calculate a second new target object data fingerprint of the spliced data object using the preset algorithm;
Second fingerprint storage unit, configured to store the second new target object data fingerprint in the corresponding OSD.
The count change module 24 may include:
Count change unit, configured to, when the data block length of the target data object is less than the preset chunk size, use the target object data fingerprint to find the old reference count corresponding to the target object data fingerprint, and decrement the old reference count by one;
Deletion unit, configured to delete the old target data object if the old reference count is zero.
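The write path, including the partial-block case handled by the splicing and count change units, might look like the following Python sketch. Several details are assumptions rather than statements from the text: `BLOCK_SIZE` is a hypothetical preset chunk size, the `offset` parameter and the overwrite-in-place splice are one possible reading of "splicing", SHA-256 stands in for the preset algorithm, and a first write with no existing fingerprint is simply stored.

```python
import hashlib

BLOCK_SIZE = 8  # hypothetical preset chunk size, in bytes


class Cluster:
    """Toy model: each OSD is a dict of in-memory maps (illustrative only)."""

    def __init__(self, num_osds: int = 4):
        self.osds = [
            {"fingerprints": {}, "objects": {}, "refcounts": {}}
            for _ in range(num_osds)
        ]

    def osd_for(self, key: bytes) -> dict:
        h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
        return self.osds[h % len(self.osds)]

    def _store(self, fp: bytes, data: bytes) -> None:
        target = self.osd_for(fp)
        if fp in target["objects"]:
            target["refcounts"][fp] += 1
        else:
            target["objects"][fp] = data
            target["refcounts"][fp] = 1

    def _release(self, fp: bytes) -> None:
        # Count change unit and deletion unit: decrement, delete at zero.
        target = self.osd_for(fp)
        target["refcounts"][fp] -= 1
        if target["refcounts"][fp] == 0:
            del target["objects"][fp]
            del target["refcounts"][fp]

    def write(self, object_id: bytes, data: bytes, offset: int = 0) -> None:
        id_osd = self.osd_for(object_id)
        old_fp = id_osd["fingerprints"].get(object_id)   # module 21
        if old_fp is not None and len(data) < BLOCK_SIZE:
            # Partial block: fetch the old object and splice the new bytes in.
            old = self.osd_for(old_fp)["objects"][old_fp]
            data = old[:offset] + data + old[offset + len(data):]
        new_fp = hashlib.sha256(data).digest()           # module 22
        id_osd["fingerprints"][object_id] = new_fp
        self._store(new_fp, data)                        # module 23
        if old_fp is not None:
            self._release(old_fp)                        # module 24
```

The new copy is stored before the old reference is released, so rewriting an object with identical content leaves its reference count unchanged instead of deleting and recreating the data.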
Correspondingly, an embodiment of the present invention also discloses a distributed storage deletion system. Referring to FIG. 7, the system is applied to a distributed storage device and includes:
Fingerprint lookup module 31, configured to find the history object data fingerprint of a historical data object using the unique identifier of the historical data object;
Count change module 32, configured to find the reference count of the historical data object using the history object data fingerprint, and decrement the reference count of the historical data object by one;
Deletion module 33, configured to delete the historical data object and its related data if the reference count of the historical data object is zero after the decrement.
The data related to the historical data object may be the history object data fingerprint.
In this embodiment of the present invention, the fingerprint lookup module 31 may be specifically configured to perform a hash function calculation on the unique identifier of the historical data object to obtain the OSD that holds the unique identifier, and to find the history object data fingerprint in that OSD.
The count change module 32 may include a reference count lookup unit, wherein:
Reference count lookup unit, configured to perform a hash function calculation on the history object data fingerprint to obtain the OSD that holds the reference count of the historical data object, and thereby find the reference count of the historical data object.
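The deletion flow through modules 31 to 33 can be sketched in Python as below. As before, SHA-256 as the hash function, the dictionary-based OSDs, and the names `osd_for`, `put`, and `delete` are illustrative assumptions. The text deletes the fingerprint together with the data when the count reaches zero; this sketch additionally assumes that the deleted identifier's own fingerprint record is dropped even when other references remain.

```python
import hashlib

NUM_OSDS = 4  # hypothetical cluster size
osds = [
    {"fingerprints": {}, "objects": {}, "refcounts": {}}
    for _ in range(NUM_OSDS)
]


def osd_for(key: bytes) -> dict:
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return osds[h % NUM_OSDS]


def put(object_id: bytes, data: bytes) -> None:
    """Helper that seeds state: store data once and count each reference."""
    fp = hashlib.sha256(data).digest()
    osd_for(object_id)["fingerprints"][object_id] = fp
    target = osd_for(fp)
    target["objects"].setdefault(fp, data)
    target["refcounts"][fp] = target["refcounts"].get(fp, 0) + 1


def delete(object_id: bytes) -> bool:
    """Drop one reference; return True when the data itself was removed."""
    # Module 31: hash the identifier to its OSD and fetch the fingerprint.
    # Assumption: this identifier's record is removed in every case.
    fp = osd_for(object_id)["fingerprints"].pop(object_id)
    # Module 32: hash the fingerprint to the OSD holding the reference count.
    target = osd_for(fp)
    target["refcounts"][fp] -= 1
    # Module 33: delete the data once nothing references it.
    if target["refcounts"][fp] == 0:
        del target["objects"][fp]
        del target["refcounts"][fp]
        return True
    return False
```

With two identifiers pointing at the same content, the first `delete` only lowers the count and the second one removes the stored data.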
Correspondingly, an embodiment of the present invention also discloses a distributed storage read system. Referring to FIG. 8, the system is applied to a distributed storage device and includes:
Identifier lookup module 41, configured to calculate, by a preset algorithm and using the unique identifier of the target data object, the OSD that holds the unique identifier of the target data object;
Fingerprint read module 42, configured to read the target object data fingerprint stored in the OSD that holds the unique identifier of the target data object;
Data read module 43, configured to calculate, by the preset algorithm and using the target object data fingerprint, the target OSD of the target data object, and read the target data object.
In addition, an embodiment of the present invention also discloses a distributed storage device, which includes the distributed storage deduplication system, distributed storage write system, distributed storage deletion system, and distributed storage read system disclosed in the foregoing embodiments. For the specific configuration of each system, reference may be made to the foregoing embodiments, and details are not repeated here.
Finally, it should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The distributed storage device and the distributed storage deduplication, write, delete, and read methods and systems provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, a person of ordinary skill in the art may, in accordance with the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (15)
1. A distributed storage deduplication method, characterized in that it is applied to a distributed storage device and comprises:
obtaining a target object data fingerprint of a target data object at a unified storage layer, and storing the target object data fingerprint in a corresponding OSD;
performing a calculation on the target object data fingerprint using a preset algorithm to obtain a target OSD of the target data object;
determining whether the target OSD already holds a historical data object;
if the historical data object is already held, incrementing the reference count of the historical data object by one.
2. The distributed storage deduplication method according to claim 1, characterized in that the process of obtaining the target object data fingerprint of the target data object at the unified storage layer comprises:
at the unified storage layer, obtaining the target object data fingerprint by a hash function using the unique identifier of the target data object.
3. A distributed storage write method, characterized in that it is applied to a distributed storage device and comprises:
querying, using the unique identifier of a target data object, whether a target object data fingerprint exists;
if it exists, calculating a new target object data fingerprint using the target data object, and storing the new target object data fingerprint in a corresponding OSD;
then using the new target object data fingerprint to store the target data object in a corresponding target OSD;
decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
4. The distributed storage write method according to claim 3, characterized in that the process of calculating the new target object data fingerprint using the target data object and storing the new target object data fingerprint in the corresponding OSD comprises:
determining whether the data block length of the target data object is equal to a preset chunk size;
if they are equal, calculating a first new target object data fingerprint of the target data object using the preset algorithm, and storing the first new target object data fingerprint in the corresponding OSD;
if the data block length of the target data object is less than the chunk size, obtaining the target object data fingerprint using the unique identifier of the target data object, and obtaining an old target data object using the target object data fingerprint;
splicing the old target data object with the target data object to obtain a spliced data object, and calculating a second new target object data fingerprint of the spliced data object using the preset algorithm;
storing the second new target object data fingerprint in the corresponding OSD.
5. The distributed storage write method according to claim 4, characterized in that the process of decrementing by one the reference count of the data object corresponding to the target object data fingerprint comprises:
when the data block length of the target data object is less than the preset chunk size, using the target object data fingerprint to find the old reference count corresponding to the target object data fingerprint, and decrementing the old reference count by one;
if the old reference count is zero, deleting the old target data object.
6. A distributed storage deletion method, characterized in that it is applied to a distributed storage device and comprises:
finding the history object data fingerprint of a historical data object using the unique identifier of the historical data object;
finding the reference count of the historical data object using the history object data fingerprint, and decrementing the reference count of the historical data object by one;
if the reference count of the historical data object is zero after the decrement, deleting the historical data object and its related data.
7. The distributed storage deletion method according to claim 6, characterized in that the process of finding the history object data fingerprint of the historical data object using the unique identifier of the historical data object comprises:
performing a hash function calculation on the unique identifier of the historical data object to obtain the OSD that holds the unique identifier, and finding the history object data fingerprint in that OSD.
8. The distributed storage deletion method according to claim 6, characterized in that the process of finding the reference count of the historical data object using the history object data fingerprint comprises:
performing a hash function calculation on the history object data fingerprint to obtain the OSD that holds the reference count of the historical data object, and thereby finding the reference count of the historical data object.
9. The distributed storage deletion method according to claim 6, characterized in that the process of deleting the historical data object and its related data comprises:
deleting the historical data object and the history object data fingerprint.
10. A distributed storage read method, characterized in that it is applied to a distributed storage device and comprises:
calculating, by a preset algorithm and using the unique identifier of a target data object, the OSD that holds the unique identifier of the target data object;
reading the target object data fingerprint stored in the OSD that holds the unique identifier of the target data object;
calculating, by the preset algorithm and using the target object data fingerprint, the target OSD of the target data object, and reading the target data object.
11. A distributed storage deduplication system, characterized in that it is applied to a distributed storage device and comprises:
a fingerprint acquisition module, configured to obtain a target object data fingerprint of a target data object at a unified storage layer;
a fingerprint storage module, configured to store the target object data fingerprint in a corresponding OSD;
a position calculation module, configured to perform a calculation on the target object data fingerprint using a preset algorithm to obtain a target OSD of the target data object;
a judgment module, configured to determine whether the target OSD already holds a historical data object;
a count change module, configured to increment the reference count of the historical data object by one if the historical data object is already held.
12. A distributed storage write system, characterized in that it is applied to a distributed storage device and comprises:
a fingerprint query module, configured to query, using the unique identifier of a target data object, whether a target object data fingerprint exists;
a new fingerprint storage module, configured to, if it exists, calculate a new target object data fingerprint using the target data object, and store the new target object data fingerprint in a corresponding OSD;
a data object storage module, configured to then use the new target object data fingerprint to store the target data object in a corresponding target OSD;
a count change module, configured to decrement by one the reference count of the data object corresponding to the target object data fingerprint.
13. A distributed storage deletion system, characterized in that it is applied to a distributed storage device and comprises:
a fingerprint lookup module, configured to find the history object data fingerprint of a historical data object using the unique identifier of the historical data object;
a count change module, configured to find the reference count of the historical data object using the history object data fingerprint, and decrement the reference count of the historical data object by one;
a deletion module, configured to delete the historical data object and its related data if the reference count of the historical data object is zero after the decrement.
14. A distributed storage read system, characterized in that it is applied to a distributed storage device and comprises:
an identifier lookup module, configured to calculate, by a preset algorithm and using the unique identifier of a target data object, the OSD that holds the unique identifier of the target data object;
a fingerprint read module, configured to read the target object data fingerprint stored in the OSD that holds the unique identifier of the target data object;
a data read module, configured to calculate, by the preset algorithm and using the target object data fingerprint, the target OSD of the target data object, and read the target data object.
15. A distributed storage device, characterized in that it comprises the distributed storage deduplication system according to claim 11, the distributed storage write system according to claim 12, the distributed storage deletion system according to claim 13, and the distributed storage read system according to claim 14.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710764079.9A CN107506150A (en) | 2017-08-30 | 2017-08-30 | Distributed storage devices, delete, write again, deleting, read method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710764079.9A CN107506150A (en) | 2017-08-30 | 2017-08-30 | Distributed storage devices, delete, write again, deleting, read method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107506150A (en) | 2017-12-22 |
Family
ID=60694359
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710764079.9A Pending CN107506150A (en) | 2017-08-30 | 2017-08-30 | Distributed storage devices, delete, write again, deleting, read method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107506150A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101599079A (en) * | 2009-07-22 | 2009-12-09 | 中国科学院计算技术研究所 | A kind of Backup Data is concentrated the management method of storage |
CN102495894A (en) * | 2011-12-12 | 2012-06-13 | 成都市华为赛门铁克科技有限公司 | Method, device and system for searching repeated data |
CN102915278A (en) * | 2012-09-19 | 2013-02-06 | 浪潮(北京)电子信息产业有限公司 | Data deduplication method |
US20130297884A1 (en) * | 2012-05-07 | 2013-11-07 | International Business Machines Corporation | Enhancing data processing performance by cache management of fingerprint index |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110245129A (en) * | 2019-04-23 | 2019-09-17 | 平安科技(深圳)有限公司 | Distributed global data deduplication method and device |
WO2020215580A1 (en) * | 2019-04-23 | 2020-10-29 | 平安科技(深圳)有限公司 | Distributed global data deduplication method and device |
CN110245129B (en) * | 2019-04-23 | 2022-05-13 | 平安科技(深圳)有限公司 | Distributed global data deduplication method and device |
WO2021109587A1 (en) * | 2019-12-06 | 2021-06-10 | 浪潮电子信息产业股份有限公司 | File storage method and apparatus, and device and readable storage medium |
CN111177088A (en) * | 2019-12-29 | 2020-05-19 | 北京浪潮数据技术有限公司 | Data deduplication method and device, electronic equipment and storage medium |
CN112286457A (en) * | 2020-10-28 | 2021-01-29 | 杭州宏杉科技股份有限公司 | Object deduplication method and device, electronic equipment and machine-readable storage medium |
CN112286457B (en) * | 2020-10-28 | 2022-08-26 | 杭州宏杉科技股份有限公司 | Object deduplication method and device, electronic equipment and machine-readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107506150A (en) | Distributed storage devices, delete, write again, deleting, read method and system | |
CN103890738B (en) | The system and method for the weight that disappears in storage object after retaining clone and separate operation | |
US8799601B1 (en) | Techniques for managing deduplication based on recently written extents | |
CN103136243B (en) | File system duplicate removal method based on cloud storage and device | |
US7606817B2 (en) | Primenet data management system | |
CN104933112A (en) | Distributed Internet transaction information storage and processing method | |
CN103959264A (en) | Managing redundant immutable files using deduplication in storage clouds | |
JP2005267600A5 (en) | ||
CN102169491B (en) | Dynamic detection method for multi-data concentrated and repeated records | |
CN105787037A (en) | Repeated data deleting method and device | |
CN106874481A (en) | A kind of metadata of distributed type file system information-reading method and system | |
CN103530322B (en) | Data processing method and device | |
CN110737680A (en) | Cache data management method and device, storage medium and electronic equipment | |
CN107391761A (en) | A kind of data managing method and device based on data de-duplication technology | |
CN109242458A (en) | Approaches to IM and relevant device based on block chain | |
CN107506484B (en) | Operation and maintenance data association auditing method, system, equipment and storage medium | |
CN112182004A (en) | Method and device for viewing data in real time, computer equipment and storage medium | |
WO2020215580A1 (en) | Distributed global data deduplication method and device | |
CN107368545A (en) | A kind of De-weight method and device based on MerkleTree deformation algorithms | |
CN106528703A (en) | Deduplication mode switching method and apparatus | |
CN106487937A (en) | A kind of cloud storage system file De-weight method and system | |
CN104317955B (en) | File scanning method and device in a kind of mobile terminal memory space | |
CN109753379A (en) | Snapshot data backup, delet method, apparatus and system | |
KR101252375B1 (en) | Mapping management system and method for enhancing performance of deduplication in storage apparatus | |
CN102831240B (en) | The storage means of extended metadata file and storage organization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171222 |