CN102722450A - Storage method for redundancy deletion block device based on location-sensitive hash - Google Patents

Info

Publication number
CN102722450A
Authority
CN
China
Prior art keywords
data
superfluous
data segment
execution
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012101682422A
Other languages
Chinese (zh)
Other versions
CN102722450B (en
Inventor
余宏亮
孙竞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201210168242.2A
Publication of CN102722450A
Application granted
Publication of CN102722450B
Expired - Fee Related

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a storage method for a deduplication block device based on locality-sensitive hashing, and belongs to the field of data storage. The method comprises the following steps: placing the data blocks of a deduplication write operation, together with their digital fingerprints, into the current working queue; D: judging whether the number of data blocks in the queue exceeds a threshold; if so, dequeuing a threshold number of data blocks as one data segment and executing step F; otherwise, executing step E; E: judging whether the data block at the head of the queue has timed out; if so, taking all data blocks in the queue as one data segment and executing step F; otherwise, executing step D; F: judging whether a metadata set of similar data segments exists for the data segment; if so, executing step G; otherwise, creating an empty set and executing step G; and G: judging in turn whether the digital fingerprint of each data block exists in the metadata set of similar data segments; if so, modifying the storage address of the data block; otherwise, generating metadata for the data block. The method reduces the time spent accessing metadata during deduplication write operations.

Description

A deduplication block device storage method based on locality-sensitive hashing
Technical field
The present invention relates to the technical field of data storage, and in particular to a deduplication block device storage method based on locality-sensitive hashing.
Background art
With the explosive growth of digital information, data occupies more and more space. Over the past 10 years, the storage capacity provided in many industries has grown from tens of GB to hundreds of TB or even several PB, an increase of more than ten thousand times. As data grows exponentially, enterprises face more and more points in time for rapid backup and recovery, the cost of managing and preserving data keeps rising, and data centers consume ever more space and power. Studies have found that up to 60% of the data kept by application systems is redundant, and the problem worsens over time.
To alleviate the capacity growth problem of storage systems, reduce the space that data occupies, lower cost, and make the fullest use of existing resources, redundant data deletion technology (deduplication for short) has emerged. On the one hand, deduplication optimizes storage space utilization. Traditional data compression mainly relies on conventional data analysis tools and techniques to eliminate duplicate data according to fixed patterns, and cannot effectively improve the cost-effectiveness of disk-based data; it is therefore necessary to study the characteristics of duplicate data and apply suitable deduplication techniques to eliminate identical files or data blocks distributed across the storage system. On the other hand, deduplication reduces the volume of data transmitted over the network, and thus reduces energy consumption and network cost. The goal of deduplication is to eliminate identical and similar files or data blocks distributed across the storage system, which saves a large amount of disk space and greatly reduces the network bandwidth needed for data replication.
Deduplication can be widely applied in many fields, such as virtual machine storage, file servers, mail servers, disk backup, and social networks. Traditionally, deduplication was not used in primary storage systems; in recent years, however, with the development of technologies such as cloud storage, building a primary storage system with deduplication has become an important technical problem. A primary storage system built with deduplication is referred to as a deduplication storage system.
The block device is the most basic storage device and is widely used in storage systems such as SAN (Storage Area Network) and NAS (Network Attached Storage). Building a block device storage system that supports inline (embedded) deduplication, however, faces two major technical challenges: (1) how a block device at the bottom layer notifies the upper-layer system of deduplication results, and how it remains compatible with existing upper-layer storage systems that do not support deduplication; (2) storage systems place high performance demands on block devices, whereas deduplication incurs heavy computational overhead and adds a large amount of deduplication metadata, and each write must check whether the data being written already exists in the system, which noticeably increases the time overhead of writes.
Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved by the present invention is: how to provide a deduplication block device storage method based on locality-sensitive hashing that reduces the time spent accessing metadata during deduplication write operations and meets the performance requirements of block devices.
(2) Technical solution
To solve the above technical problem, the present invention provides a deduplication block device storage method based on locality-sensitive hashing, comprising the steps of:
B: obtaining the process ID that initiated the deduplication write operation, and judging according to the process ID whether a deduplication queue corresponding to the deduplication write operation exists; if so, taking that deduplication queue as the current working queue and executing step C; otherwise, creating a new deduplication queue as the current working queue and executing step C;
C: dividing the data of the deduplication write operation into a plurality of data blocks, calculating the digital fingerprint of each data block, and placing each data block and its digital fingerprint into the current working queue;
D: judging whether the number of data blocks in the current working queue exceeds a threshold; if so, dequeuing a threshold number of data blocks as one data segment and executing step F; otherwise, executing step E;
E: judging whether the waiting time of the data block at the head of the queue exceeds a predetermined time; if so, dequeuing all data blocks in the current working queue as one data segment and executing step F; otherwise, executing step D;
F: calculating the locality-sensitive hash value of the data segment, and judging according to that value whether a similar-data-segment metadata set exists for the data segment; if so, executing step G; otherwise, creating an empty set as the similar-data-segment metadata set of the data segment and executing step G;
G: judging in turn whether the digital fingerprint of each data block in the data segment is present in the similar-data-segment metadata set; if so, changing the storage address of the data block to the storage address recorded in the metadata corresponding to that digital fingerprint in the similar-data-segment metadata set; otherwise, generating metadata for the data block in the similar-data-segment metadata set, the metadata comprising the digital fingerprint and the storage address of the data block.
Preferably, before step B the method further comprises step A: adding a deduplication write operation at the block device layer, and judging whether the type of the current write operation is a deduplication write operation; if so, executing step B; otherwise, writing the data of the current write operation directly to secondary storage.
Preferably, step B specifically comprises the steps of:
B1: obtaining the process ID that initiated the deduplication write operation, adding the data of the deduplication write operation to a cache, and setting the state of the I/O completion function of the generic device layer to waiting;
B2: judging according to the process ID whether a deduplication queue corresponding to the deduplication write operation exists; if so, taking that deduplication queue as the current working queue and executing step C; otherwise, creating a new deduplication queue as the current working queue and executing step C.
Preferably, step C specifically comprises: dividing the data of the deduplication write operation into a plurality of data blocks of a predetermined size, calculating the secure hash function value of each data block as its digital fingerprint, and placing each data block and its digital fingerprint into the current working queue.
Preferably, in step D, the threshold is 100.
Preferably, in step E, the predetermined time is 5 seconds.
Preferably, step F specifically comprises the steps of:
F1: normalizing the data segment with a Bloom filter to generate a fixed-length data block feature vector;
F2: calculating the locality-sensitive hash value of the data segment from the fixed-length data block feature vector, using a locality-sensitive hash function based on p-stable distributions;
F3: judging according to the locality-sensitive hash value whether a similar-data-segment metadata set for the data segment exists in secondary storage; if so, executing step G; otherwise, creating an empty set as the similar-data-segment metadata set of the data segment and executing step G.
Preferably, step G specifically comprises the steps of:
G1: judging in turn whether the digital fingerprint of each data block in the data segment is present in the similar-data-segment metadata set; if so, executing step G2; otherwise, executing step G3;
G2: changing the storage address of the data block to the storage address recorded in the metadata corresponding to its digital fingerprint in the similar-data-segment metadata set; incrementing the reference count of that metadata by 1; setting the state of the I/O completion function to modified; returning the original storage address and the modified storage address of the data block; and writing the similar-data-segment metadata set back to secondary storage;
G3: generating metadata for the data block in the similar-data-segment metadata set, setting the state of the I/O completion function to completed, and writing the similar-data-segment metadata set back to secondary storage; the metadata comprises the digital fingerprint, the storage address, and the reference count of the data block, the initial value of the reference count being 1.
(3) Beneficial effects
The deduplication block device storage method based on locality-sensitive hashing of the present invention has the following advantages. The method groups deduplication write operations by process ID, exploiting the data locality that arises when the same process operates on the same file within a given period, which provides the basis for deduplicating similar data segments. It then uses a locality-sensitive hash function to map similar metadata sets to the same position in the hash space, so that similar data segments are identified quickly and accurately. This reduces the memory used by deduplication write operations, enables fast access to metadata sets, reduces the time spent accessing metadata during deduplication write operations, and safeguards the write performance of the deduplication block device. The method lets upper-layer storage systems that support deduplication block devices conveniently use the device's inline deduplication function, while existing upper-layer storage systems that do not support them can use the device as an ordinary block device. Because the upper-layer storage system can choose when to use deduplication write operations, it can reduce storage space usage and still ensure that multiple copies of important data are not deduplicated. By adding two states to the I/O completion function, the method returns deduplication results to the upper-layer storage system without modifying the existing write flow, minimizing the changes an upper-layer storage system needs in order to use a deduplication block device.
Description of drawings
Fig. 1 is a flowchart of the deduplication block device storage method based on locality-sensitive hashing of the present invention;
Fig. 2 is a schematic diagram of the similar-data-segment metadata set index table.
Detailed description of embodiments
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the present invention but do not limit its scope.
Fig. 1 is a flowchart of the deduplication block device storage method based on locality-sensitive hashing of the present invention. As shown in Fig. 1, the method comprises the steps of:
A: adding a deduplication write operation at the block device layer, and judging whether the type of the current write operation is a deduplication write operation; if so, executing step B; otherwise, writing the data of the current write operation directly to secondary storage, i.e. performing the existing write operation. Step A leaves the original read and write operations unchanged, which ensures that existing storage systems that do not support deduplication block devices can still use the deduplication block device in the same way as a conventional block device. Storage systems that do support deduplication block devices can choose to use the deduplication write operation as needed, letting the deduplication block device deduplicate the written data inline. In this way, for important data that must be kept in multiple copies, the upper-layer storage system can continue to use the existing non-deduplication write operation, preventing the deduplication block device from reducing multiple identical copies to a single one and thereby meeting the upper-layer system's data reliability requirements.
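The routing decision in step A can be sketched as follows. This is a minimal illustration, not the patented implementation: `WriteType`, `handle_write`, the dict standing in for secondary storage, and the list standing in for the input of step B are all hypothetical names introduced here.

```python
from enum import Enum


class WriteType(Enum):
    NORMAL = "normal"  # existing write path, never deduplicated
    DEDUP = "dedup"    # additional write type exposed by the block device layer


def handle_write(write_type, address, data, storage, dedup_requests):
    """Step A: route a write according to its type.

    `storage` (dict) stands in for secondary storage and `dedup_requests`
    (list) stands in for the hand-off to step B; both are assumptions.
    """
    if write_type is WriteType.DEDUP:
        dedup_requests.append((address, data))  # continue with step B
        return "queued"
    storage[address] = data  # ordinary write: data is stored unchanged
    return "written"
```

An upper layer that needs several physical copies of important data would keep issuing `WriteType.NORMAL` writes, so those copies never reach the deduplication path.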
B: obtaining the process ID (identification number) that initiated the deduplication write operation, and judging according to the process ID whether a deduplication queue corresponding to the deduplication write operation exists; if so, taking that queue as the current working queue and executing step C; otherwise, creating a new deduplication queue as the current working queue and executing step C, the newly created queue being associated with the deduplication write operation through the process ID. To improve storage access performance, upper-layer storage systems generally use sequential reads and writes wherever possible, and raise the proportion of sequential accesses at the block device layer by means such as file caching. Therefore, if the block device layer exploits the frequently occurring pattern of sequentially accessed data segments and organizes the metadata of consecutively accessed data blocks together, multiple metadata entries can be accessed contiguously, which significantly reduces the number of random disk accesses incurred by metadata access and improves metadata processing performance. A process in the upper-layer storage system usually operates on one file at a given point in time; grouping by the process ID that initiated the deduplication write operation therefore yields data segments that correspond to consecutive accesses to one file, and such a pattern of consecutive accesses to a file is also very likely to recur later.
Step B specifically comprises the steps of:
B1: obtaining the process ID that initiated the deduplication write operation, adding the data of the deduplication write operation to a cache, and setting the state of the I/O (input/output) completion function of the generic device layer to waiting. The present invention adds a waiting state and a modified state (see step G2 below) to the I/O completion function. This avoids major changes to the write flow of an ordinary block device while providing a general way to notify upper-layer storage systems that support deduplication block devices of the deduplication result, so that the upper-layer storage system can act on that result. The flow is also consistent with the write flow of an ordinary block device (i.e. the existing write operation), which minimizes the modifications an upper-layer storage system needs in order to support deduplication block devices. The deduplication write operation is carried out by a separate deduplication thread.
B2: the deduplication thread judges according to the process ID whether a deduplication queue corresponding to the deduplication write operation exists; if so, it takes that queue as the current working queue and executes step C; otherwise, it creates a new deduplication queue as the current working queue and executes step C.
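The per-process queue lookup of step B2 amounts to a get-or-create on a table keyed by process ID. The sketch below assumes an in-memory dict as that table; `get_working_queue` and the `queues` argument are hypothetical names, and the cache and I/O completion state of step B1 are left out.

```python
from collections import deque


def get_working_queue(pid, queues):
    """Step B2: return the queue for `pid`, creating it on first use.

    `queues` is a dict mapping process ID -> deque of pending blocks
    (an assumed in-memory registry, not a structure named in the source).
    """
    queue = queues.get(pid)
    if queue is None:
        queue = deque()      # new queue, tied to this process ID
        queues[pid] = queue
    return queue
```

Keeping one queue per process keeps blocks written by the same process, and therefore usually to the same file, adjacent to each other when they are later cut into data segments.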
C: dividing the data of the deduplication write operation into a plurality of data blocks of a predetermined size, calculating the secure hash function value of each data block as its digital fingerprint, and placing each data block, its digital fingerprint, and the current timestamp into the current working queue. The secure hash function is SHA-1 or SHA-256 (SHA: Secure Hash Algorithm).
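Step C can be illustrated with Python's standard `hashlib`. This is a sketch under assumptions: the 4096-byte block size is chosen here for illustration (the text only says "predetermined size"), SHA-256 is picked from the two algorithms named, and `split_and_fingerprint` is a hypothetical helper.

```python
import hashlib
import time

BLOCK_SIZE = 4096  # assumed value; the source does not fix the block size


def split_and_fingerprint(data, block_size=BLOCK_SIZE, now=None):
    """Step C: return a list of (block, fingerprint, timestamp) entries."""
    timestamp = time.time() if now is None else now
    entries = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # SHA-256 digest of the block content serves as its fingerprint.
        fingerprint = hashlib.sha256(block).hexdigest()
        entries.append((block, fingerprint, timestamp))
    return entries
```

Blocks with identical content always get the same fingerprint, which is what later allows a lookup by fingerprint to stand in for a byte-by-byte comparison.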
D: judging whether the number of data blocks in the current working queue exceeds a threshold; if so, dequeuing a threshold number of data blocks as one data segment and executing step F; otherwise, executing step E. The threshold is typically about 100.
E: judging whether the waiting time of the data block at the head of the queue exceeds a predetermined time; if so, dequeuing all data blocks in the current working queue as one data segment and executing step F; otherwise, executing step D. The predetermined time is 5 seconds. A timeout monitoring thread periodically computes the difference between the timestamp of the data block at the head of the queue and the current time to decide whether that data block has timed out.
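Steps D and E together decide when a data segment is cut from the queue. The sketch below assumes queue entries are `(block, fingerprint, timestamp)` tuples in arrival order and that the caller passes the current time; `take_segment` is a hypothetical name, and the monitoring thread is reduced to a plain function call.

```python
from collections import deque

THRESHOLD = 100        # blocks per segment (value given in the text)
TIMEOUT_SECONDS = 5.0  # maximum wait of the head block (value given in the text)


def take_segment(queue, now, threshold=THRESHOLD, timeout=TIMEOUT_SECONDS):
    """Steps D and E: return the next data segment, or None if none is ready."""
    if len(queue) > threshold:
        # Step D: the queue exceeds the threshold; dequeue exactly `threshold` blocks.
        return [queue.popleft() for _ in range(threshold)]
    if queue and now - queue[0][2] > timeout:
        # Step E: the head block has waited too long; flush the whole queue.
        segment = list(queue)
        queue.clear()
        return segment
    return None  # neither condition holds; check again later
```

The timeout branch bounds how long a write can sit in the queue when a process writes only a few blocks, at the cost of producing shorter segments.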
F: calculating the locality-sensitive hash value of the data segment, and judging according to that value whether a similar-data-segment metadata set exists for the data segment; if so, executing step G; otherwise, creating an empty set as the similar-data-segment metadata set of the data segment and executing step G. A locality-sensitive hash (LSH) function differs from an ordinary hash function in its locality sensitivity: points that are similar before hashing remain similar to some degree after hashing, with a certain probabilistic guarantee. Using a suitable locality-sensitive hash function therefore maps similar data segments to nearby positions in the hash space, which satisfies the requirements for organizing and identifying similar data segments.
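For reference, the probabilistic guarantee mentioned above is usually stated as follows. This is the standard textbook definition of an LSH family, not wording from the patent; the symbols d, r, c, p_1 and p_2 are introduced here for illustration. A family H of hash functions is called (r, cr, p_1, p_2)-sensitive for a distance d if, for any two points x and y:

```latex
\[
\begin{aligned}
d(x, y) \le r  &\;\Longrightarrow\; \Pr_{h \in \mathcal{H}}\bigl[h(x) = h(y)\bigr] \ge p_1, \\
d(x, y) \ge cr &\;\Longrightarrow\; \Pr_{h \in \mathcal{H}}\bigl[h(x) = h(y)\bigr] \le p_2,
\end{aligned}
\qquad c > 1,\; p_1 > p_2 .
\]
```

Close inputs therefore collide more often than distant ones, which is what allows a hash value to serve as a key for grouping similar data segments.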
Step F specifically comprises the steps of:
F1: normalizing the data segment with a Bloom filter (a binary-vector data structure) to generate a fixed-length data block feature vector;
F2: calculating the locality-sensitive hash value of the data segment from the fixed-length data block feature vector, using a locality-sensitive hash function based on p-stable distributions;
F3: judging according to the locality-sensitive hash value whether a similar-data-segment metadata set for the data segment exists in secondary storage; if so, executing step G; otherwise, creating an empty set as the similar-data-segment metadata set of the data segment and executing step G. Fig. 2 is a schematic diagram of the similar-data-segment metadata set index table. As shown in Fig. 2, a similar-data-segment metadata set index table is maintained in memory; each entry holds the locality-sensitive hash value of a data segment and the address of the corresponding similar-data-segment metadata set. Using that address, the similar-data-segment metadata set can be located in external storage (i.e. secondary storage), after which step G is executed.
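Steps F1 to F3 can be sketched end to end as below. Everything concrete here is an assumption made for illustration: the vector width, the number of Bloom probes, the bucket width, the use of SHA-256 to derive Bloom positions, the Gaussian (2-stable) projection, and the dicts standing in for the in-memory index table and for secondary storage. The hash has the usual p-stable form h(v) = floor((a . v + b) / w); practical LSH schemes typically combine several such functions, whereas this sketch uses one.

```python
import hashlib
import math
import random

VECTOR_BITS = 256   # assumed Bloom filter width
NUM_PROBES = 3      # assumed number of Bloom hash functions
BUCKET_WIDTH = 4.0  # assumed p-stable bucket width w


def bloom_vector(fingerprints, bits=VECTOR_BITS, probes=NUM_PROBES):
    """Step F1: fold any number of fingerprints into a fixed-length 0/1 vector."""
    vector = [0] * bits
    for fp in fingerprints:
        for i in range(probes):
            digest = hashlib.sha256(f"{i}:{fp}".encode()).digest()
            vector[int.from_bytes(digest[:8], "big") % bits] = 1
    return vector


def make_lsh(bits=VECTOR_BITS, width=BUCKET_WIDTH, seed=0):
    """Step F2: build h(v) = floor((a . v + b) / w) with Gaussian entries in a."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(bits)]
    b = rng.uniform(0.0, width)

    def lsh(vector):
        dot = sum(ai * vi for ai, vi in zip(a, vector))
        return math.floor((dot + b) / width)

    return lsh


def lookup_similar_set(lsh_value, index, secondary_storage):
    """Step F3: find the metadata set for an LSH value, creating an empty one if absent.

    `index` maps LSH value -> address; `secondary_storage` maps address -> metadata set.
    """
    address = index.get(lsh_value)
    if address is None:
        address = len(secondary_storage)  # hypothetical address allocation
        index[lsh_value] = address
        secondary_storage[address] = {}
    return secondary_storage[address]
```

Because the Bloom vector has the same length regardless of how many blocks a segment holds, segments cut by the threshold and segments cut by the timeout can be hashed by the same function.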
G: judging in turn whether the digital fingerprint of each data block in the data segment is present in the similar-data-segment metadata set; if so, changing the storage address of the data block to the storage address recorded in the metadata corresponding to that digital fingerprint in the similar-data-segment metadata set; otherwise, generating metadata for the data block in the similar-data-segment metadata set, the metadata comprising the digital fingerprint and the storage address of the data block. The reasoning is as follows: when looking up each data block of the data segment, if a small set can be found such that querying only the elements of that set gives, with high probability, the same result as querying the entire data set, query efficiency improves. The present invention exploits exactly this: similar data segments (i.e. data segments that contain a certain number of similar data blocks) are grouped to form a similar-data-segment metadata set, so that searching this set alone achieves a deduplication effect close to that of searching all data segments.
Step G specifically comprises the steps of:
G1: judging in turn whether the digital fingerprint of each data block in the data segment is present in the similar-data-segment metadata set; if so, a data block with identical content already exists in secondary storage and need not be written again, so step G2 is executed; otherwise, the data block is new and step G3 is executed;
G2: changing the storage address of the data block to the storage address recorded in the metadata corresponding to its digital fingerprint in the similar-data-segment metadata set; incrementing the reference count of that metadata by 1; setting the state of the I/O completion function to modified; returning the original storage address and the modified storage address of the data block; and writing the similar-data-segment metadata set back to secondary storage. If the storage address of the similar-data-segment metadata set changes, the similar-data-segment metadata set index table is updated accordingly.
G3: generating metadata for the data block in the similar-data-segment metadata set, and setting the state of the I/O completion function to completed; the metadata comprises the digital fingerprint, the storage address, and the reference count of the data block, the initial value of the reference count being 1. The similar-data-segment metadata set is then written back to secondary storage. If the storage address of the similar-data-segment metadata set changes, the similar-data-segment metadata set index table is updated accordingly.
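The per-block decision of steps G1 to G3 can be sketched as a lookup in one metadata set. The sketch assumes the metadata set is a dict keyed by fingerprint, represents a segment as `(fingerprint, address)` pairs, and omits the write-back to secondary storage and the index table update; `IoState`, `dedup_segment` and the `refcount` field (the counter that G2 increments) are hypothetical names.

```python
from enum import Enum


class IoState(Enum):
    WAITING = "waiting"      # set in step B1 while the block is queued
    MODIFIED = "modified"    # duplicate found, address redirected (G2)
    COMPLETED = "completed"  # new block recorded (G3)


def dedup_segment(segment, metadata_set):
    """Steps G1-G3: return one (state, original_address, final_address) per block.

    `segment` is a list of (fingerprint, address) pairs; `metadata_set` maps
    fingerprint -> {"address": ..., "refcount": ...} and is updated in place.
    """
    results = []
    for fingerprint, address in segment:
        entry = metadata_set.get(fingerprint)  # G1: membership test
        if entry is not None:
            # G2: identical content already stored; redirect and count the reference.
            entry["refcount"] += 1
            results.append((IoState.MODIFIED, address, entry["address"]))
        else:
            # G3: first occurrence; record metadata with reference count 1.
            metadata_set[fingerprint] = {"address": address, "refcount": 1}
            results.append((IoState.COMPLETED, address, address))
    return results
```

Returning both the original and the final address is what lets an upper layer that understands the modified state update its own mapping, while a layer that ignores it still sees a completed write.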
The deduplication block device storage method based on locality-sensitive hashing of the embodiment of the present invention has the following advantages. The method groups deduplication write operations by process ID, exploiting the data locality that arises when the same process operates on the same file within a given period, which provides the basis for deduplicating similar data segments. It then uses a locality-sensitive hash function to map similar metadata sets to the same position in the hash space, so that similar data segments are identified quickly and accurately. This reduces the memory used by deduplication write operations, enables fast access to metadata sets, reduces the time spent accessing metadata during deduplication write operations, and safeguards the write performance of the deduplication block device. The method lets upper-layer storage systems that support deduplication block devices conveniently use the device's inline deduplication function, while existing upper-layer storage systems that do not support them can use the device as an ordinary block device. Because the upper-layer storage system can choose when to use deduplication write operations, it can reduce storage space usage and still ensure that multiple copies of important data are not deduplicated. By adding two states to the I/O completion function, the method returns deduplication results to the upper-layer storage system without modifying the existing write flow, minimizing the changes an upper-layer storage system needs in order to use a deduplication block device.
The above embodiments merely illustrate the present invention and do not limit it. Those of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.

Claims (8)

  1. A deduplication block device storage method based on locality-sensitive hashing, characterized in that it comprises the steps of:
    B: obtaining the process ID that initiated the deduplication write operation, and judging according to the process ID whether a deduplication queue corresponding to the deduplication write operation exists; if so, taking that deduplication queue as the current working queue and executing step C; otherwise, creating a new deduplication queue as the current working queue and executing step C;
    C: dividing the data of the deduplication write operation into a plurality of data blocks, calculating the digital fingerprint of each data block, and placing each data block and its digital fingerprint into the current working queue;
    D: judging whether the number of data blocks in the current working queue exceeds a threshold; if so, dequeuing a threshold number of data blocks as one data segment and executing step F; otherwise, executing step E;
    E: judging whether the waiting time of the data block at the head of the queue exceeds a predetermined time; if so, dequeuing all data blocks in the current working queue as one data segment and executing step F; otherwise, executing step D;
    F: calculating the locality-sensitive hash value of the data segment, and judging according to that value whether a similar-data-segment metadata set exists for the data segment; if so, executing step G; otherwise, creating an empty set as the similar-data-segment metadata set of the data segment and executing step G;
    G: judging in turn whether the digital fingerprint of each data block in the data segment is present in the similar-data-segment metadata set; if so, changing the storage address of the data block to the storage address recorded in the metadata corresponding to that digital fingerprint in the similar-data-segment metadata set; otherwise, generating metadata for the data block in the similar-data-segment metadata set, the metadata comprising the digital fingerprint and the storage address of the data block.
  2. 2. the method for claim 1 is characterized in that, before said step B, also comprises steps A: delete superfluous write operation in the increase of block device layer, judge whether the type of current write operation is to delete superfluous write operation, if, execution in step B; Otherwise directly the data with said current write operation write secondary storage.
  3. 3. the method for claim 1 is characterized in that, said step B specifically comprises step:
    B1: obtain the process ID of initiating to delete superfluous write operation, said data of deleting superfluous write operation are added buffer memory, the state that the I/O of common apparatus layer accomplishes function is set to pending;
    B2: judge whether to exist the corresponding said superfluous formation of deleting of superfluous write operation of deleting according to said process ID, if delete superfluous formation as work at present formation, execution in step C with said; Otherwise, create and new delete superfluous formation as work at present formation, execution in step C.
  4. 4. method as claimed in claim 3; It is characterized in that; Said step C specifically comprises step: said data of deleting superfluous write operation are divided into a plurality of data blocks according to predetermined size; Calculate the digital finger-print of the secure hash functional value of said data block, said data block and corresponding digital fingerprint are put into said work at present formation as said data block.
  5. The method of claim 4, characterized in that, in step D, the threshold is 100.
  6. The method of claim 4, characterized in that, in step E, the predetermined time is 5 seconds.
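The interplay of the step D threshold and the step E timeout can be illustrated with a small segment-cutting routine. This is a sketch under assumptions: each queue entry is taken to carry its enqueue time as the first tuple element, and `try_cut_segment` is a hypothetical name. Only the values 100 and 5 seconds come from claims 5 and 6.

```python
from collections import deque

THRESHOLD = 100   # claim 5
TIMEOUT_S = 5.0   # claim 6


def try_cut_segment(queue, now, threshold=THRESHOLD, timeout=TIMEOUT_S):
    """Return a list of queue entries forming a data segment, or None.

    Each entry is assumed to be a tuple whose first element is its enqueue time.
    """
    # Step D: the queue holds more blocks than the threshold, so take
    # exactly `threshold` blocks from the front as one segment.
    if len(queue) > threshold:
        return [queue.popleft() for _ in range(threshold)]
    # Step E: the oldest block has waited too long, so flush what is queued.
    if queue and now - queue[0][0] >= timeout:
        segment = list(queue)
        queue.clear()
        return segment
    return None
```

A caller would invoke this repeatedly, for example after each enqueue and on a timer, and pass any returned segment on to step F.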
  7. The method of claim 4, characterized in that step F specifically comprises the steps of:
    F1: normalizing the data segment using a Bloom filter to generate a fixed-length data block feature vector;
    F2: computing, from the fixed-length data block feature vector, the location-sensitive hash function value of the data segment using a location-sensitive hash function based on p-stable distributions;
    F3: determining, according to the location-sensitive hash function value, whether a similar data segment metadata set for the data segment exists in secondary storage; if so, performing step G; otherwise, creating an empty set as the similar data segment metadata set of the data segment and performing step G.
  8. The method of claim 7, characterized in that step G specifically comprises the steps of:
    G1: determining, for each data block in the data segment in turn, whether the digital fingerprint of the data block is present in the similar data segment metadata set; if so, performing step G2; otherwise, performing step G3;
    G2: changing the storage address of the data block to the storage address recorded in the metadata that corresponds to its digital fingerprint in the similar data segment metadata set, incrementing the reference count of that metadata by 1, setting the state of the I/O completion function to modified, returning the original storage address and the modified storage address of the data block, and writing the similar data segment metadata set back to secondary storage;
    G3: generating metadata for the data block in the similar data segment metadata set, setting the state of the I/O completion function to completed, and writing the similar data segment metadata set back to secondary storage; the metadata comprises the digital fingerprint, the storage address, and the reference count of the data block, the initial value of the reference count being 1.
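The per-block decision in steps G1 to G3 can be sketched as a dictionary lookup with reference counting. This is an illustration under assumptions: `BlockMeta` and `dedup_segment` are hypothetical names, the metadata set is modelled as an in-memory dictionary, and writing the set back to secondary storage is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class BlockMeta:
    fingerprint: bytes
    address: int
    refcount: int = 1  # G3: the reference count starts at 1


def dedup_segment(segment, meta_set):
    """Process (fingerprint, address) pairs against a similar-segment metadata set.

    Returns one (state, original_address, final_address) tuple per block.
    """
    results = []
    for fingerprint, address in segment:
        meta = meta_set.get(fingerprint)           # G1: fingerprint lookup
        if meta is not None:
            # G2: duplicate block, so redirect to the stored copy and count it.
            meta.refcount += 1
            results.append(("modified", address, meta.address))
        else:
            # G3: new block, so record its metadata at its own address.
            meta_set[fingerprint] = BlockMeta(fingerprint, address)
            results.append(("completed", address, address))
    return results
```

Since the lookup touches only the one metadata set selected in step F, each segment needs a single metadata read and a single write-back rather than one access per block.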
CN201210168242.2A 2012-05-25 2012-05-25 Storage method for redundancy deletion block device based on location-sensitive hash Expired - Fee Related CN102722450B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210168242.2A CN102722450B (en) 2012-05-25 2012-05-25 Storage method for redundancy deletion block device based on location-sensitive hash

Publications (2)

Publication Number Publication Date
CN102722450A true CN102722450A (en) 2012-10-10
CN102722450B CN102722450B (en) 2015-01-14

Family

ID=46948223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210168242.2A Expired - Fee Related CN102722450B (en) 2012-05-25 2012-05-25 Storage method for redundancy deletion block device based on location-sensitive hash

Country Status (1)

Country Link
CN (1) CN102722450B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8032529B2 (en) * 2007-04-12 2011-10-04 Cisco Technology, Inc. Enhanced bloom filters
CN101706825A (en) * 2009-12-10 2010-05-12 华中科技大学 Replicated data deleting method based on file content types
CN101963982A (en) * 2010-09-27 2011-02-02 清华大学 Method for managing metadata of redundancy deletion and storage system based on location sensitive Hash
CN102222085A (en) * 2011-05-17 2011-10-19 华中科技大学 Data de-duplication method based on combination of similarity and locality

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yin Yubing et al.: "A Distributed Redundancy-Deletion Storage System for Wide-Area Network Environments", ZTE Technology Journal *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014094421A1 (en) * 2012-12-21 2014-06-26 华为技术有限公司 Data processing method and virtual machine management platform
WO2014206242A1 (en) * 2013-06-25 2014-12-31 Tencent Technology (Shenzhen) Company Limited Systems and methods for data processing
US20150269206A1 (en) * 2013-06-25 2015-09-24 Tencent Technology (Shenzhen) Company Limited Systems and Methods for Data Processing
US10268715B2 (en) 2013-06-25 2019-04-23 Tencent Technology (Shenzhen) Company Limited Systems and methods for data processing
CN104102748A (en) * 2014-08-08 2014-10-15 中国联合网络通信集团有限公司 Method and device for file mapping and method and device for file recommendation
CN104102748B (en) * 2014-08-08 2017-12-22 中国联合网络通信集团有限公司 File Mapping method and device and file recommendation method and device
CN104965689A (en) * 2015-05-22 2015-10-07 浪潮电子信息产业股份有限公司 Hybrid parallel computing method and device for CPUs/GPUs
WO2020253406A1 (en) * 2019-06-17 2020-12-24 华为技术有限公司 Data processing method and device, and computer readable storage medium
US11797204B2 (en) 2019-06-17 2023-10-24 Huawei Technologies Co., Ltd. Data compression processing method and apparatus, and computer-readable storage medium
CN111737519A (en) * 2020-06-09 2020-10-02 北京奇艺世纪科技有限公司 Method and device for identifying robot account, electronic equipment and computer-readable storage medium
CN111737519B (en) * 2020-06-09 2023-10-03 北京奇艺世纪科技有限公司 Method and device for identifying robot account, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN102722450B (en) 2015-01-14

Similar Documents

Publication Publication Date Title
US11068455B2 (en) Mapper tree with super leaf nodes
CN101963982B (en) Method for managing metadata of redundancy deletion and storage system based on location sensitive Hash
US9639289B2 (en) Systems and methods for retaining and using data block signatures in data protection operations
US10031675B1 (en) Method and system for tiering data
US9047301B2 (en) Method for optimizing the memory usage and performance of data deduplication storage systems
US8725698B2 (en) Stub file prioritization in a data replication system
US8504515B2 (en) Stubbing systems and methods in a data replication environment
US8352422B2 (en) Data restore systems and methods in a replication environment
CN101777017B (en) Rapid recovery method of continuous data protection system
CN105069048A (en) Small file storage method, query method and device
CN106874348B (en) File storage and index method and device and file reading method
CN109445702B (en) block-level data deduplication storage system
CN102722450A (en) Storage method for redundancy deletion block device based on location-sensitive hash
CN106708427A (en) Storage method suitable for key value pair data
CN106445405B (en) Data access method and device for flash memory storage
CN107291889A (en) A kind of date storage method and system
CN102323958A (en) Data de-duplication method
US20180253252A1 (en) Storage system
CN106407224A (en) Method and device for file compaction in KV (Key-Value)-Store system
CN104965835B (en) A kind of file read/write method and device of distributed file system
CN102541982B (en) Method for organizing and accessing metadata file log
CN110427347A (en) Method, apparatus, memory node and the storage medium of data de-duplication
CN108595589A (en) A kind of efficient access method of magnanimity science data picture
CN102693315A (en) Method and device for removing URL (uniform resource locator) duplicate on basis of shared memory mapping
CN104424189A (en) Positioning resolving method and positioning resolving system based on cloud platform

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150114