CN107506150A - Distributed storage device, and deduplication, write, delete, and read methods and systems - Google Patents

Info

Publication number
CN107506150A
Authority
CN
China
Prior art keywords
data
fingerprint
target
osd
distributed storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710764079.9A
Other languages
Chinese (zh)
Inventor
胡永刚
张子奇
王利朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201710764079.9A
Publication of CN107506150A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Abstract

This application discloses a distributed storage device together with deduplication, write, delete, and read methods and systems for it. The deduplication method, applied to a distributed storage device, includes: obtaining the target object data fingerprint of a target data object in the unified storage layer and storing the fingerprint on the corresponding OSD; computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object; determining whether the target OSD already holds a historical data object; and, if it does, incrementing the reference count of the historical data object by one. The invention locates the target OSD directly from the target object data fingerprint, establishing a correspondence between object data fingerprints and OSDs, so that duplicate data can be detected directly. This avoids the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improves the efficiency of distributed storage deduplication.

Description

Distributed storage device, and deduplication, write, delete, and read methods and systems
Technical field
The present invention relates to the field of storage technology, and in particular to a distributed storage device and to distributed storage deduplication, write, delete, and read methods and systems.
Background
A distributed storage system spreads data across multiple independent devices. A traditional network storage system keeps all data on a centralized storage server, which becomes both the bottleneck for system performance and the focal point for reliability and security concerns, and so cannot meet the needs of large-scale storage applications. A distributed network storage system uses a scalable architecture that shares the storage load across multiple storage servers and uses a location server to locate stored information. This improves the reliability, availability, and access efficiency of the system and also makes it easy to scale.
Data deduplication (de-duplication) is a capacity optimization technique. By eliminating duplicate data in a storage system, it reduces the amount of data actually stored or transmitted over the network, and it is widely used in backup, long-term archiving, and disaster recovery. Both industry and academia already have corresponding products and research results. In recent years, with the development of software-defined storage and the emergence of virtualization, special-purpose processors, and new storage media, and given the need to process duplicate data online and to reduce the cost per unit of storage capacity, the demand for deduplication technology has become increasingly urgent.
Current research on online deduplication for distributed storage mainly creates a fingerprint database and uses it to determine whether data is duplicated. Regardless of how the fingerprint database is stored, a matching query must be carried out within the distributed storage network, which is inefficient.
Summary of the invention
In view of this, the object of the present invention is to provide a distributed storage device and distributed storage deduplication, write, delete, and read methods and systems that avoid the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improve the efficiency of distributed storage deduplication. The specific scheme is as follows:
A distributed storage deduplication method, applied to a distributed storage device, including:
Obtaining the target object data fingerprint of a target data object in the unified storage layer, and storing the target object data fingerprint on the corresponding OSD;
Computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object;
Determining whether the target OSD already holds a historical data object;
If the historical data object is already held, incrementing the reference count of the historical data object by one.
Optionally, the process of obtaining the target object data fingerprint of the target data object in the unified storage layer includes:
In the unified storage layer, obtaining the target object data fingerprint through a hash function, using the target data object unique identifier of the target data object.
The invention also discloses a distributed storage write method, applied to a distributed storage device, including:
Using the target data object unique identifier of a target data object, querying whether a target object data fingerprint exists;
If it exists, calculating a new target object data fingerprint from the target data object, and storing the new target object data fingerprint on the corresponding OSD;
Using the new target object data fingerprint, storing the target data object on the corresponding target OSD;
Decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
Optionally, the process of calculating the new target object data fingerprint from the target data object and storing the new target object data fingerprint on the corresponding OSD includes:
Determining whether the data block length of the target data object equals the preset block size;
If it does, calculating a first new target object data fingerprint of the target data object with the preset algorithm, and storing the first new target object data fingerprint on the corresponding OSD;
If the data block length of the target data object is less than the block size, obtaining the target object data fingerprint from the target data object unique identifier, and obtaining the old target data object from the target object data fingerprint;
Splicing the old target data object with the target data object to obtain a spliced data object, and calculating a second new target object data fingerprint of the spliced data object with the preset algorithm;
Storing the second new target object data fingerprint on the corresponding OSD.
Optionally, the process of decrementing by one the reference count of the data object corresponding to the target object data fingerprint includes:
When the data block length of the target data object is less than the preset block size, using the target object data fingerprint to find the old reference count corresponding to the target object data fingerprint, and decrementing the old reference count by one;
If the old reference count is zero, deleting the old target data object.
The invention also discloses a distributed storage delete method, applied to a distributed storage device, including:
Using the history object unique identifier of a historical data object, finding the history object data fingerprint of the historical data object;
Using the history object data fingerprint, finding the reference count of the historical data object, and decrementing the reference count of the historical data object by one;
If the reference count of the historical data object is zero after the decrement, deleting the historical data object and its related data.
Optionally, the process of finding the history object data fingerprint of the historical data object using the history object unique identifier of the historical data object includes:
Hashing the history object unique identifier of the historical data object to obtain the OSD of the history object unique identifier, and finding the history object data fingerprint on that OSD.
Optionally, the process of finding the reference count of the historical data object using the history object data fingerprint includes:
Hashing the history object data fingerprint to obtain the OSD holding the reference count of the historical data object, and thereby finding the reference count of the historical data object.
Optionally, the process of deleting the historical data object and its related data includes:
Deleting the historical data object and the history object data fingerprint.
The invention also discloses a distributed storage read method, applied to a distributed storage device, including:
Using the target data object unique identifier, calculating the OSD of the target data object unique identifier with a preset algorithm;
Reading the target object data fingerprint stored on the OSD of the target data object unique identifier;
Using the target object data fingerprint, calculating the target OSD of the target data object with the preset algorithm, and reading the target data object.
The invention also discloses a distributed storage deduplication system, applied to a distributed storage device, the system including:
A fingerprint acquisition module, for obtaining the target object data fingerprint of a target data object in the unified storage layer;
A fingerprint storage module, for storing the target object data fingerprint on the corresponding OSD;
A position computation module, for computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object;
A judgment module, for determining whether the target OSD already holds a historical data object;
A count change module, for incrementing the reference count of the historical data object by one if the historical data object is already held.
The invention also discloses a distributed storage write system, applied to a distributed storage device, including:
A fingerprint query module, for querying whether a target object data fingerprint exists, using the target data object unique identifier of a target data object;
A new fingerprint storage module, for calculating, if the fingerprint exists, a new target object data fingerprint from the target data object and storing the new target object data fingerprint on the corresponding OSD;
A data object storage module, for using the new target object data fingerprint to store the target data object on the corresponding target OSD;
A count change module, for decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
The invention also discloses a distributed storage delete system, applied to a distributed storage device, including:
A fingerprint lookup module, for finding the history object data fingerprint of a historical data object using the history object unique identifier of the historical data object;
A count change module, for using the history object data fingerprint to find the reference count of the historical data object and decrementing the reference count of the historical data object by one;
A deletion module, for deleting the historical data object and its related data if the reference count of the historical data object is zero after the decrement.
The invention also discloses a distributed storage read system, applied to a distributed storage device, including:
An identifier lookup module, for calculating the OSD of the target data object unique identifier with a preset algorithm, using the target data object unique identifier;
A fingerprint read module, for reading the target object data fingerprint stored on the OSD of the target data object unique identifier;
A data read module, for using the target object data fingerprint to calculate the target OSD of the target data object with the preset algorithm and reading the target data object.
The invention also discloses a distributed storage device, including the foregoing distributed storage deduplication system, the foregoing distributed storage write system, the foregoing distributed storage delete system, and the foregoing distributed storage read system.
In the present invention, the distributed storage deduplication method, applied to a distributed storage device, includes: obtaining the target object data fingerprint of a target data object in the unified storage layer and storing the target object data fingerprint on the corresponding OSD; computing on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object; determining whether the target OSD already holds a historical data object; and, if it does, incrementing the reference count of the historical data object by one.
The present invention obtains the target object data fingerprint of the target data object in the unified storage layer and computes on it with a preset algorithm to obtain the target OSD of the target data object, and then determines whether the target OSD already holds a historical data object. Because the OSD is calculated from the object data fingerprint, a target OSD that already holds a historical data object indicates that the target data object and the historical data object are the same data, so the reference count of the historical data object is incremented by one. Locating the target OSD directly from the target object data fingerprint establishes a correspondence between object data fingerprints and OSDs, so that duplicate data can be detected directly. This avoids the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improves the efficiency of distributed storage deduplication.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the drawings provided without creative effort.
Fig. 1 is a schematic flowchart of a distributed storage deduplication method disclosed in an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a distributed storage write method disclosed in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a distributed storage delete method disclosed in an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a distributed storage read method disclosed in an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a distributed storage deduplication system disclosed in an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a distributed storage write system disclosed in an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a distributed storage delete system disclosed in an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a distributed storage read system disclosed in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention discloses a distributed storage deduplication method which, as shown in Fig. 1, is applied to a distributed storage device and includes:
Step S11: Obtain the target object data fingerprint of the target data object in the unified storage layer, and store the target object data fingerprint on the corresponding OSD.
Specifically, in the unified storage layer, the target object data fingerprint is obtained through a hash function using the target data object unique identifier of the target data object. Because the target object data fingerprint is needed to establish the reference relationship between a copy and the historical data object regardless of whether the target data object is duplicate data, the target object data fingerprint is saved to the corresponding OSD (Object Storage Device).
Step S12: Compute on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object.
Specifically, a hash function is applied to the target object data fingerprint, and the result of the hash calculation is mapped to the target OSD of the target data object.
It will be understood that the preset algorithm may be a hash function or any other algorithm that serves a similar function and achieves the same purpose; the type of preset algorithm is not limited here.
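As a minimal sketch of step S12, the following shows one way a fingerprint could be mapped to an OSD. The source leaves the preset algorithm open, so the use of SHA-256, the modulo mapping, and the names `osd_for` and `num_osds` are all assumptions made for illustration.

```python
import hashlib


def osd_for(key: bytes, num_osds: int) -> int:
    """Map a key (an identifier or a fingerprint) to an OSD index.

    Assumption: SHA-256 followed by a modulo stands in for the unspecified
    "preset algorithm" of the source.
    """
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_osds


# Hypothetical fingerprint of a data block (content hash is an assumption).
fingerprint = hashlib.sha256(b"block contents").digest()
target = osd_for(fingerprint, num_osds=4)
```

Because the mapping is deterministic, two objects with the same fingerprint always resolve to the same OSD, which is the property the deduplication check in step S13 relies on. A plain modulo remaps most keys when the number of OSDs changes, so a real deployment would probably need a placement scheme that tolerates resizing.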
Step S13: Determine whether the target OSD already holds a historical data object.
Specifically, after the target OSD of the target data object has been calculated from the target object data fingerprint, if the target OSD already holds a historical data object, the history object data fingerprint of that historical data object is identical to the target object data fingerprint. The historical data object and the target data object therefore hold the same data content and are duplicate data. If the target OSD does not hold a historical data object, no duplicate data exists and the target data object can be saved.
Step S14: If a historical data object is already held, increment the reference count of the historical data object by one.
Specifically, if the target OSD already holds a historical data object, the target data object is duplicate data. The target data object is therefore not saved again; instead, the reference count of the historical data object is incremented by one to record that a new user is using the historical data object.
It will be understood that if no historical data object is held, the target data object is saved to its target OSD, and a reference count for the target data object is created and incremented by one.
It can be seen that this embodiment obtains the target object data fingerprint of the target data object in the unified storage layer and computes on it with a preset algorithm to obtain the target OSD of the target data object, and then determines whether the target OSD already holds a historical data object. Because the OSD is calculated from the object data fingerprint, a target OSD that already holds a historical data object indicates that the target data object and the historical data object are the same data, so the reference count of the historical data object is incremented by one. Locating the target OSD directly from the target object data fingerprint establishes a correspondence between object data fingerprints and OSDs, so that duplicate data can be detected directly. This avoids the inefficiency of running matching queries against a fingerprint database across the distributed storage network and improves the efficiency of distributed storage deduplication.
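Steps S11 to S14 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the fingerprint is taken to be a SHA-256 hash of the object's content, the identifier is hashed to pick the OSD that records the fingerprint (as the read method describes), and the `OSD` class, `place`, and `dedup_store` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class OSD:
    """Hypothetical in-memory stand-in for one object storage device."""
    fingerprints: dict = field(default_factory=dict)  # object id -> fingerprint
    objects: dict = field(default_factory=dict)       # fingerprint -> data
    refcounts: dict = field(default_factory=dict)     # fingerprint -> count


def place(key: bytes, osds: list) -> OSD:
    """Assumed preset algorithm: SHA-256 of the key, modulo the OSD count."""
    index = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(osds)
    return osds[index]


def dedup_store(osds: list, object_id: str, data: bytes) -> bool:
    """Store data under object_id; return True if it was a duplicate."""
    # S11: obtain the fingerprint (assumption: content hash) and record it
    # on the OSD selected by the object's unique identifier.
    fp = hashlib.sha256(data).digest()
    place(object_id.encode(), osds).fingerprints[object_id] = fp
    # S12: the fingerprint alone determines the target OSD.
    target = place(fp, osds)
    # S13: a hit on the target OSD means the same content is already stored.
    if fp in target.objects:
        target.refcounts[fp] += 1  # S14: count the new reference only
        return True
    # No historical object: save the data and start its reference count at 1.
    target.objects[fp] = data
    target.refcounts[fp] = 1
    return False


osds = [OSD() for _ in range(4)]
first = dedup_store(osds, "report-v1", b"same bytes")     # new data
second = dedup_store(osds, "report-copy", b"same bytes")  # duplicate
```

The duplicate check touches only the one OSD selected by the fingerprint, which is the point of the scheme: no cluster-wide fingerprint database is queried.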
Because the present invention changes the index relationship among the object data unique identifier, the object data fingerprint, and the OSD holding the object data, an embodiment of the present invention also discloses a distributed storage write method which, as shown in Fig. 2, is applied to a distributed storage device and includes:
Step S21: Using the target data object unique identifier of the target data object, query whether a target object data fingerprint exists.
Specifically, the target object data fingerprint is calculated through a hash function from the target data object unique identifier of the target data object, and its existence is queried in order to determine whether the write is a create write or a modify write.
Step S22: If it exists, calculate a new target object data fingerprint from the target data object, and store the new target object data fingerprint on the corresponding OSD.
Specifically, if it does not exist, the target data object is saved by proceeding from step S12 of the previous embodiment.
The process of calculating the new target object data fingerprint from the target data object and storing it on the corresponding OSD may include the following steps S221 to S225:
Step S221: Determine whether the data block length of the target data object equals the preset block size.
Specifically, determining whether the data block length of the target data object equals the preset block size establishes the scope of the modification to the target data object.
Step S222: If it does, calculate a first new target object data fingerprint of the target data object with the preset algorithm, and store the first new target object data fingerprint on the corresponding OSD.
Specifically, if the data block length of the target data object equals the preset block size, the target data object differs from the old target data object in all of its data. The first new target object data fingerprint of the target data object can therefore be obtained by hash calculation and stored on the corresponding OSD.
Step S223: If the data block length of the target data object is less than the block size, obtain the target object data fingerprint from the target data object unique identifier, and obtain the old target data object from the target object data fingerprint.
Specifically, this case indicates that the target data object was obtained by modifying part of the data of the old target data object. Therefore, the target data object unique identifier, which is the same for the target data object and the old target data object, is first run through the hash function to obtain the target object data fingerprint, and the old target data object is then obtained from the target object data fingerprint.
Step S224: Splice the old target data object with the target data object to obtain a spliced data object, and calculate a second new target object data fingerprint of the spliced data object with the preset algorithm.
Specifically, old target data objects are spliced with target data objects, make
Step S225: Store the second new target object data fingerprint on the corresponding OSD.
It should be noted that the new target object data fingerprint covers both the first new target object data fingerprint and the second new target object data fingerprint.
Step S23: Using the new target object data fingerprint, store the target data object on the corresponding target OSD.
Step S24: Decrement by one the reference count of the data object corresponding to the target object data fingerprint.
Specifically, the target object data fingerprint is used to find the old reference count corresponding to it, and the old reference count is decremented by one. Because the target data object is obtained by modifying the old target data object, the two share the same object unique identifier, and the target object data fingerprint is the old target object data fingerprint of the old target data object. The target object data fingerprint can therefore be used to find the corresponding old reference count. Because the old target data object has been changed into the target data object, the target data object is no longer the same as the old target data object, so the old reference count is decremented by one.
If the old reference count is zero, the old target data object corresponding to the old reference count is deleted.
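The write path in steps S21 to S24 can be sketched as below. Several details are assumptions because the source does not fix them: `BLOCK_SIZE`, the `offset` parameter, and the overwrite-at-offset splice are hypothetical, as are the helper names `place`, `put`, `release`, and `write`; SHA-256 content hashes and modulo placement stand in for the unspecified preset algorithm.

```python
import hashlib

BLOCK_SIZE = 8  # assumption: a tiny preset block size, for illustration only


class OSD:
    """Hypothetical in-memory stand-in for one object storage device."""

    def __init__(self):
        self.fingerprints = {}  # object id -> fingerprint
        self.objects = {}       # fingerprint -> data
        self.refcounts = {}     # fingerprint -> reference count


def place(key, osds):
    """Assumed preset algorithm: SHA-256 of the key, modulo the OSD count."""
    return osds[int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(osds)]


def put(osds, fp, data):
    """Store data under fp, or add a reference if it is already stored."""
    osd = place(fp, osds)
    if fp in osd.objects:
        osd.refcounts[fp] += 1
    else:
        osd.objects[fp] = data
        osd.refcounts[fp] = 1


def release(osds, fp):
    """Drop one reference; delete the object when no references remain."""
    osd = place(fp, osds)
    osd.refcounts[fp] -= 1
    if osd.refcounts[fp] == 0:
        del osd.objects[fp], osd.refcounts[fp]


def write(osds, object_id, data, offset=0):
    id_osd = place(object_id.encode(), osds)
    old_fp = id_osd.fingerprints.get(object_id)            # S21: create or modify?
    if old_fp is not None and len(data) < BLOCK_SIZE:      # S223: partial write
        old = place(old_fp, osds).objects[old_fp]
        # S224 (assumed splice): overwrite the old bytes starting at offset.
        data = old[:offset] + data + old[offset + len(data):]
    new_fp = hashlib.sha256(data).digest()                 # S222 / S224
    id_osd.fingerprints[object_id] = new_fp                # S22 / S225
    put(osds, new_fp, data)                                # S23
    if old_fp is not None:
        release(osds, old_fp)                              # S24


osds = [OSD() for _ in range(3)]
write(osds, "obj", b"AAAAAAAA")     # create write of one full block
write(osds, "obj", b"BB", offset=2) # modify write of a partial block
```

Storing the new data before releasing the old reference is a deliberate choice in this sketch: if a write happens to leave the content unchanged, the count goes up and then back down instead of the object being deleted and lost.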
An embodiment of the present invention also discloses a distributed storage delete method which, as shown in Fig. 3, is applied to a distributed storage device and includes:
Step S31: Using the history object unique identifier of the historical data object, find the history object data fingerprint of the historical data object.
Specifically, an object data fingerprint and its object unique identifier are stored on the same OSD, a data object and its reference count are stored on the same OSD, and the OSD of a data object is calculated from the object data fingerprint through a hash function. Therefore, when a user deletes a historical data object stored on an OSD, the history object unique identifier of the historical data object is hashed to obtain the OSD of the history object unique identifier, and the history object data fingerprint is found on that OSD.
Step S32: Using the history object data fingerprint, find the reference count of the historical data object, and decrement the reference count of the historical data object by one.
Specifically, because the historical data object may still be in use by other users, each time a user deletes the historical data object its reference count is decremented by one, rather than the historical data object being deleted directly. The history object data fingerprint is hashed to obtain the OSD holding the reference count of the historical data object, and the reference count is thereby found.
Step S33: If the reference count of the historical data object is zero after the decrement, delete the historical data object and its related data.
Specifically, when no user references the historical data object any longer, that is, when its reference count is zero after the decrement, the historical data object and its related data are deleted.
The data related to the historical data object is the history object data fingerprint.
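A minimal sketch of steps S31 to S33 follows. The `OSD` class, `place`, and `delete` are hypothetical names, and SHA-256 with modulo placement stands in for the unspecified hash function. One further assumption is flagged in the code: the identifier's own fingerprint record is dropped on every delete, whereas the source only states that the fingerprint is removed once the count reaches zero.

```python
import hashlib


class OSD:
    """Hypothetical in-memory stand-in for one object storage device."""

    def __init__(self):
        self.fingerprints = {}  # object id -> fingerprint
        self.objects = {}       # fingerprint -> data
        self.refcounts = {}     # fingerprint -> reference count


def place(key, osds):
    """Assumed hash function: SHA-256 of the key, modulo the OSD count."""
    return osds[int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(osds)]


def delete(osds, object_id):
    """Remove one reference; return True if the data itself was deleted."""
    # S31: the identifier selects the OSD that records the fingerprint.
    id_osd = place(object_id.encode(), osds)
    # Assumption: this identifier's fingerprint record is removed right away.
    fp = id_osd.fingerprints.pop(object_id)
    # S32: the fingerprint selects the OSD holding the data and its count.
    data_osd = place(fp, osds)
    data_osd.refcounts[fp] -= 1
    # S33: delete the object only when nothing references it any more.
    if data_osd.refcounts[fp] == 0:
        del data_osd.objects[fp]
        del data_osd.refcounts[fp]
        return True
    return False


# Two identifiers that reference the same stored content.
osds = [OSD() for _ in range(3)]
fp = hashlib.sha256(b"shared").digest()
for oid in ("a", "b"):
    place(oid.encode(), osds).fingerprints[oid] = fp
data_osd = place(fp, osds)
data_osd.objects[fp] = b"shared"
data_osd.refcounts[fp] = 2

first = delete(osds, "a")   # one reference remains
second = delete(osds, "b")  # last reference gone, data removed
```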
An embodiment of the present invention also discloses a distributed storage read method which, as shown in Fig. 4, is applied to a distributed storage device and includes:
Step S41: Using the target data object unique identifier, calculate the OSD of the target data object unique identifier with a preset algorithm;
Step S42: Read the target object data fingerprint stored on the OSD of the target data object unique identifier;
Step S43: Using the target object data fingerprint, calculate the target OSD of the target data object with the preset algorithm, and read the target data object.
Specifically, because the target data object unique identifier and the target object data fingerprint are stored on the same OSD, the preset algorithm can be applied to the target data object unique identifier to calculate that OSD, and the target object data fingerprint can be read from it. Because the target OSD of the target data object is calculated from the target object data fingerprint with the preset algorithm, the target object data fingerprint is then used to find the target OSD and read the target data object.
The preset algorithm may be a hash function.
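The two-hop lookup of steps S41 to S43 can be sketched as follows. As before, the `OSD` class, `place`, and `read` are hypothetical names, and SHA-256 with modulo placement is an assumed stand-in for the preset algorithm.

```python
import hashlib


class OSD:
    """Hypothetical in-memory stand-in for one object storage device."""

    def __init__(self):
        self.fingerprints = {}  # object id -> fingerprint
        self.objects = {}       # fingerprint -> data


def place(key, osds):
    """Assumed preset algorithm: SHA-256 of the key, modulo the OSD count."""
    return osds[int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(osds)]


def read(osds, object_id):
    id_osd = place(object_id.encode(), osds)  # S41: identifier -> OSD
    fp = id_osd.fingerprints[object_id]       # S42: read the stored fingerprint
    return place(fp, osds).objects[fp]        # S43: fingerprint -> target OSD


# Two identifiers that share one stored copy of the data.
osds = [OSD() for _ in range(3)]
fp = hashlib.sha256(b"hello").digest()
place(fp, osds).objects[fp] = b"hello"
for oid in ("a", "b"):
    place(oid.encode(), osds).fingerprints[oid] = fp

data_a = read(osds, "a")
data_b = read(osds, "b")
```

Each read costs two computed lookups, one per hop, and neither requires searching a fingerprint database across the cluster.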
Correspondingly, an embodiment of the invention also discloses a distributed storage deduplication system, shown in Figure 5, applied to a distributed storage device. The system includes:
Fingerprint acquisition module 11, for obtaining the target object data fingerprint of a target data object in the unified storage layer;
Fingerprint storage module 12, for storing the target object data fingerprint in the corresponding OSD;
Position computation module 13, for calculating on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object;
Judging module 14, for judging whether the target OSD already holds a historical data object;
Count changing module 15, for incrementing the reference count of the historical data object by one if the historical data object is already held.
It can be seen that the embodiment of the invention obtains the target object data fingerprint of the target data object in the unified storage layer and calculates on that fingerprint with a preset algorithm to obtain the target OSD of the target data object, and then judges whether the target OSD already holds a historical data object. Because the OSD is calculated from the object data fingerprint, a target OSD that already holds a historical data object indicates that the target data object and the historical data object are the same data, so the reference count of the historical data object is incremented by one. Locating the target OSD directly from the target object data fingerprint establishes a correspondence between object data fingerprints and OSDs, so duplicate data can be detected directly. This avoids the inefficiency of running matching queries against a fingerprint library across the distributed storage network, and improves the efficiency of distributed storage deduplication.
In this embodiment of the invention, fingerprint acquisition module 11 can be specifically used to obtain the target object data fingerprint in the unified storage layer by applying a hash function to the target data object unique identifier of the target data object.
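The deduplication check performed by modules 11 to 15 can be sketched in Python as follows. This is an illustration under assumptions, not the patented implementation: the sketch assumes the fingerprint is a SHA-256 hash of the object's content, places keys with a hypothetical modulo rule (`osd_for`), and stores a first copy with a count of one, a path the text above does not spell out.

```python
import hashlib

NUM_OSDS = 4  # hypothetical cluster size


def osd_for(key: str) -> int:
    """Assumed preset algorithm: hash the key, then take it modulo the OSD count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_OSDS


def new_cluster() -> list:
    return [{"fp_by_id": {}, "data": {}, "refs": {}} for _ in range(NUM_OSDS)]


def dedup_store(cluster: list, object_id: str, data: bytes) -> bool:
    """Store an object; return True when it was detected as a duplicate."""
    fingerprint = hashlib.sha256(data).hexdigest()                    # module 11 (assumed content hash)
    cluster[osd_for(object_id)]["fp_by_id"][object_id] = fingerprint  # module 12
    target = cluster[osd_for(fingerprint)]                            # module 13
    if fingerprint in target["data"]:                                 # module 14
        target["refs"][fingerprint] += 1                              # module 15
        return True
    # Assumed first-write path: keep the data and start the count at one.
    target["data"][fingerprint] = data
    target["refs"][fingerprint] = 1
    return False
```

Two users storing identical bytes under different identifiers land on the same target OSD, so the second call finds the existing object locally and only raises its count; no cluster-wide fingerprint index is queried.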
Correspondingly, an embodiment of the invention also discloses a distributed storage write system, shown in Figure 6, applied to a distributed storage device, including:
Fingerprint query module 21, for using the target data object unique identifier of a target data object to query whether a target object data fingerprint exists;
New fingerprint storage module 22, for, if it exists, calculating a new target object data fingerprint from the target data object and storing the new target object data fingerprint in the corresponding OSD;
Data object storage module 23, for then using the new target object data fingerprint to store the target data object in the corresponding target OSD;
Count changing module 24, for decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
In the embodiment of the invention, the above new fingerprint storage module 22 can include a length determining unit, a first fingerprint storage unit, a data object acquiring unit, a data object concatenation unit and a data object concatenation unit; wherein,
Length determining unit, for judging whether the data block length of the target data object is equal to a preset block size;
First fingerprint storage unit, for, if it is equal, calculating a first new target object data fingerprint of the target data object with the preset algorithm and storing the first new target object data fingerprint in the corresponding OSD;
Data object acquiring unit, for, if the data block length of the target data object is less than the block size, obtaining the target object data fingerprint from the target data object unique identifier and obtaining the old target data object from the target object data fingerprint;
Data object concatenation unit, for splicing the old target data object with the target data object to obtain a spliced data object, and calculating a second new target object data fingerprint of the spliced data object with the preset algorithm;
Data object concatenation unit, for storing the second new target object data fingerprint in the corresponding OSD.
The above count changing module 24 can include:
Count changing unit, for, when the data block length of the target data object is less than the preset block size, using the target object data fingerprint to find the old reference count corresponding to the target object data fingerprint and decrementing the old reference count by one;
Deletion unit, for deleting the old target data object if the old reference count is zero.
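The overwrite path handled by modules 21 to 24 can be sketched in Python as below. Several details are assumptions rather than statements from the text: `BLOCK_SIZE`, the modulo placement in `osd_for`, SHA-256 as the preset algorithm, the splice rule (new bytes replace the leading bytes of the old block), and the way the new object's reference count is initialised.

```python
import hashlib

NUM_OSDS = 4    # hypothetical cluster size
BLOCK_SIZE = 8  # hypothetical preset block size, in bytes


def osd_for(key: str) -> int:
    """Assumed placement rule: hash the key, then take it modulo the OSD count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_OSDS


def fingerprint_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def write_object(cluster: list, object_id: str, data: bytes) -> str:
    """Overwrite an existing object and return its new fingerprint."""
    id_osd = cluster[osd_for(object_id)]
    old_fp = id_osd["fp_by_id"].get(object_id)    # module 21: does a fingerprint exist?
    if old_fp is None:
        raise KeyError("first-write path is not covered by this sketch")
    old_osd = cluster[osd_for(old_fp)]
    if len(data) < BLOCK_SIZE:
        # Partial block: fetch the old object and splice (assumed splice rule).
        old = old_osd["data"][old_fp]
        data = data + old[len(data):]
    new_fp = fingerprint_of(data)                 # module 22: new fingerprint
    id_osd["fp_by_id"][object_id] = new_fp
    target = cluster[osd_for(new_fp)]             # module 23: store by new fingerprint
    target["data"].setdefault(new_fp, data)
    target["refs"][new_fp] = target["refs"].get(new_fp, 0) + 1
    old_osd["refs"][old_fp] -= 1                  # module 24: release the old object
    if old_osd["refs"][old_fp] == 0:
        del old_osd["refs"][old_fp]
        del old_osd["data"][old_fp]
    return new_fp
```

Incrementing the new count before decrementing the old one keeps the sketch safe when a write leaves the content unchanged, since the old and new fingerprints are then the same entry.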
Correspondingly, an embodiment of the invention also discloses a distributed storage deletion system, shown in Figure 7, applied to a distributed storage device, including:
Fingerprint lookup module 31, for using the history object unique identifier of a historical data object to find the history object data fingerprint of the historical data object;
Count changing module 32, for using the history object data fingerprint to find the reference count of the historical data object and decrementing the reference count of the historical data object by one;
Deletion module 33, for deleting the historical data object and the data related to it if the reference count of the historical data object is zero after being decremented by one.
Wherein, the data related to the historical data object can be the history object data fingerprint.
In the embodiment of the invention, the above fingerprint lookup module 31 can be specifically used to calculate on the history object unique identifier of the historical data object with a hash function to obtain the OSD of the history object unique identifier, and to find the history object data fingerprint in the OSD of the history object unique identifier.
The above count changing module 32 can include a reference count lookup unit; wherein,
Reference count lookup unit, for calculating on the history object data fingerprint with a hash function to obtain the OSD of the reference count of the historical data object, and thereby finding the reference count of the historical data object.
Correspondingly, an embodiment of the invention also discloses a distributed storage read system, shown in Figure 8, applied to a distributed storage device, including:
Identifier lookup module 41, for using the target data object unique identifier to calculate, by a preset algorithm, the OSD of the target data object unique identifier;
Fingerprint read module 42, for reading out the target object data fingerprint stored in the OSD of the target data object unique identifier;
Data read module 43, for using the target object data fingerprint to calculate, by the preset algorithm, the target OSD of the target data object, and reading the target data object.
In addition, an embodiment of the invention also discloses a distributed storage device, including the distributed storage deduplication system, distributed storage write system, distributed storage deletion system and distributed storage read system disclosed in the foregoing embodiments. For the specific configuration of each system, refer to the content disclosed in the foregoing embodiments, which is not repeated here.
Finally, it should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" or any other variant are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article or device that includes the element.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the invention.
The distributed storage device and the distributed storage deduplication, write, deletion and read methods and systems provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. At the same time, for those of ordinary skill in the art, there will be changes in the specific embodiments and the scope of application in accordance with the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims (15)

1. A distributed storage deduplication method, characterized in that it is applied to a distributed storage device and includes:
obtaining the target object data fingerprint of a target data object in the unified storage layer, and storing the target object data fingerprint in the corresponding OSD;
calculating on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object;
judging whether the target OSD already holds a historical data object;
if the historical data object is already held, incrementing the reference count of the historical data object by one.
2. The distributed storage deduplication method according to claim 1, characterized in that the process of obtaining the target object data fingerprint of the target data object in the unified storage layer includes:
in the unified storage layer, obtaining the target object data fingerprint by applying a hash function to the target data object unique identifier of the target data object.
3. A distributed storage write method, characterized in that it is applied to a distributed storage device and includes:
using the target data object unique identifier of a target data object to query whether a target object data fingerprint exists;
if it exists, calculating a new target object data fingerprint from the target data object, and storing the new target object data fingerprint in the corresponding OSD;
then using the new target object data fingerprint to store the target data object in the corresponding target OSD;
decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
4. The distributed storage write method according to claim 3, characterized in that the process of calculating a new target object data fingerprint from the target data object and storing the new target object data fingerprint in the corresponding OSD includes:
judging whether the data block length of the target data object is equal to a preset block size;
if it is equal, calculating a first new target object data fingerprint of the target data object with the preset algorithm, and storing the first new target object data fingerprint in the corresponding OSD;
if the data block length of the target data object is less than the block size, obtaining the target object data fingerprint from the target data object unique identifier, and obtaining the old target data object from the target object data fingerprint;
splicing the old target data object with the target data object to obtain a spliced data object, and calculating a second new target object data fingerprint of the spliced data object with the preset algorithm;
storing the second new target object data fingerprint in the corresponding OSD.
5. The distributed storage write method according to claim 4, characterized in that the process of decrementing by one the reference count of the data object corresponding to the target object data fingerprint includes:
when the data block length of the target data object is less than the preset block size, using the target object data fingerprint to find the old reference count corresponding to the target object data fingerprint, and decrementing the old reference count by one;
if the old reference count is zero, deleting the old target data object.
6. A distributed storage deletion method, characterized in that it is applied to a distributed storage device and includes:
using the history object unique identifier of a historical data object to find the history object data fingerprint of the historical data object;
using the history object data fingerprint to find the reference count of the historical data object, and decrementing the reference count of the historical data object by one;
if the reference count of the historical data object is zero after being decremented by one, deleting the historical data object and the data related to it.
7. The distributed storage deletion method according to claim 6, characterized in that the process of using the history object unique identifier of the historical data object to find the history object data fingerprint of the historical data object includes:
calculating on the history object unique identifier of the historical data object with a hash function to obtain the OSD of the history object unique identifier, and finding the history object data fingerprint in the OSD of the history object unique identifier.
8. The distributed storage deletion method according to claim 6, characterized in that the process of using the history object data fingerprint to find the reference count of the historical data object includes:
calculating on the history object data fingerprint with a hash function to obtain the OSD of the reference count of the historical data object, and thereby finding the reference count of the historical data object.
9. The distributed storage deletion method according to claim 6, characterized in that the process of deleting the historical data object and the data related to it includes:
deleting the historical data object and the history object data fingerprint.
10. A distributed storage read method, characterized in that it is applied to a distributed storage device and includes:
using a target data object unique identifier to calculate, by a preset algorithm, the OSD of the target data object unique identifier;
reading out the target object data fingerprint stored in the OSD of the target data object unique identifier;
using the target object data fingerprint to calculate, by the preset algorithm, the target OSD of the target data object, and reading the target data object.
11. A distributed storage deduplication system, characterized in that it is applied to a distributed storage device, the system including:
a fingerprint acquisition module, for obtaining the target object data fingerprint of a target data object in the unified storage layer;
a fingerprint storage module, for storing the target object data fingerprint in the corresponding OSD;
a position computation module, for calculating on the target object data fingerprint with a preset algorithm to obtain the target OSD of the target data object;
a judging module, for judging whether the target OSD already holds a historical data object;
a count changing module, for incrementing the reference count of the historical data object by one if the historical data object is already held.
12. A distributed storage write system, characterized in that it is applied to a distributed storage device and includes:
a fingerprint query module, for using the target data object unique identifier of a target data object to query whether a target object data fingerprint exists;
a new fingerprint storage module, for, if it exists, calculating a new target object data fingerprint from the target data object and storing the new target object data fingerprint in the corresponding OSD;
a data object storage module, for then using the new target object data fingerprint to store the target data object in the corresponding target OSD;
a count changing module, for decrementing by one the reference count of the data object corresponding to the target object data fingerprint.
13. A distributed storage deletion system, characterized in that it is applied to a distributed storage device and includes:
a fingerprint lookup module, for using the history object unique identifier of a historical data object to find the history object data fingerprint of the historical data object;
a count changing module, for using the history object data fingerprint to find the reference count of the historical data object and decrementing the reference count of the historical data object by one;
a deletion module, for deleting the historical data object and the data related to it if the reference count of the historical data object is zero after being decremented by one.
14. A distributed storage read system, characterized in that it is applied to a distributed storage device and includes:
an identifier lookup module, for using a target data object unique identifier to calculate, by a preset algorithm, the OSD of the target data object unique identifier;
a fingerprint read module, for reading out the target object data fingerprint stored in the OSD of the target data object unique identifier;
a data read module, for using the target object data fingerprint to calculate, by the preset algorithm, the target OSD of the target data object, and reading the target data object.
15. A distributed storage device, characterized in that it includes the distributed storage deduplication system according to claim 11, the distributed storage write system according to claim 12, the distributed storage deletion system according to claim 13 and the distributed storage read system according to claim 14.
CN201710764079.9A 2017-08-30 2017-08-30 Distributed storage devices, delete, write again, deleting, read method and system Pending CN107506150A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710764079.9A CN107506150A (en) 2017-08-30 2017-08-30 Distributed storage devices, delete, write again, deleting, read method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710764079.9A CN107506150A (en) 2017-08-30 2017-08-30 Distributed storage devices, delete, write again, deleting, read method and system

Publications (1)

Publication Number Publication Date
CN107506150A true CN107506150A (en) 2017-12-22

Family

ID=60694359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710764079.9A Pending CN107506150A (en) 2017-08-30 2017-08-30 Distributed storage devices, delete, write again, deleting, read method and system

Country Status (1)

Country Link
CN (1) CN107506150A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101599079A (en) * 2009-07-22 2009-12-09 中国科学院计算技术研究所 A kind of Backup Data is concentrated the management method of storage
CN102495894A (en) * 2011-12-12 2012-06-13 成都市华为赛门铁克科技有限公司 Method, device and system for searching repeated data
US20130297884A1 (en) * 2012-05-07 2013-11-07 International Business Machines Corporation Enhancing data processing performance by cache management of fingerprint index
CN102915278A (en) * 2012-09-19 2013-02-06 浪潮(北京)电子信息产业有限公司 Data deduplication method

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110245129A (en) * 2019-04-23 2019-09-17 平安科技(深圳)有限公司 Distributed global data deduplication method and device
WO2020215580A1 (en) * 2019-04-23 2020-10-29 平安科技(深圳)有限公司 Distributed global data deduplication method and device
CN110245129B (en) * 2019-04-23 2022-05-13 平安科技(深圳)有限公司 Distributed global data deduplication method and device
WO2021109587A1 (en) * 2019-12-06 2021-06-10 浪潮电子信息产业股份有限公司 File storage method and apparatus, and device and readable storage medium
CN111177088A (en) * 2019-12-29 2020-05-19 北京浪潮数据技术有限公司 Data deduplication method and device, electronic equipment and storage medium
CN112286457A (en) * 2020-10-28 2021-01-29 杭州宏杉科技股份有限公司 Object deduplication method and device, electronic equipment and machine-readable storage medium
CN112286457B (en) * 2020-10-28 2022-08-26 杭州宏杉科技股份有限公司 Object deduplication method and device, electronic equipment and machine-readable storage medium

Similar Documents

Publication Publication Date Title
CN107506150A (en) Distributed storage devices, delete, write again, deleting, read method and system
CN103890738B (en) The system and method for the weight that disappears in storage object after retaining clone and separate operation
US8799601B1 (en) Techniques for managing deduplication based on recently written extents
CN103136243B (en) File system duplicate removal method based on cloud storage and device
US7606817B2 (en) Primenet data management system
CN104933112A (en) Distributed Internet transaction information storage and processing method
CN103959264A (en) Managing redundant immutable files using deduplication in storage clouds
JP2005267600A5 (en)
CN102169491B (en) Dynamic detection method for multi-data concentrated and repeated records
CN105787037A (en) Repeated data deleting method and device
CN106874481A (en) A kind of metadata of distributed type file system information-reading method and system
CN103530322B (en) Data processing method and device
CN110737680A (en) Cache data management method and device, storage medium and electronic equipment
CN107391761A (en) A kind of data managing method and device based on data de-duplication technology
CN109242458A (en) Approaches to IM and relevant device based on block chain
CN107506484B (en) Operation and maintenance data association auditing method, system, equipment and storage medium
CN112182004A (en) Method and device for viewing data in real time, computer equipment and storage medium
WO2020215580A1 (en) Distributed global data deduplication method and device
CN107368545A (en) A kind of De-weight method and device based on MerkleTree deformation algorithms
CN106528703A (en) Deduplication mode switching method and apparatus
CN106487937A (en) A kind of cloud storage system file De-weight method and system
CN104317955B (en) File scanning method and device in a kind of mobile terminal memory space
CN109753379A (en) Snapshot data backup, delet method, apparatus and system
KR101252375B1 (en) Mapping management system and method for enhancing performance of deduplication in storage apparatus
CN102831240B (en) The storage means of extended metadata file and storage organization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171222
