CN105930101A - Weak fingerprint repeated data deletion mechanism based on flash memory solid-state disk - Google Patents

Info

Publication number
CN105930101A
Authority
CN
China
Prior art keywords
data
state disk
page
fingerprint
heavily
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610286135.8A
Other languages
Chinese (zh)
Inventor
肖侬
陈正国
陈志广
刘芳
陈微
欧洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201610286135.8A priority Critical patent/CN105930101A/en
Publication of CN105930101A publication Critical patent/CN105930101A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • G06F3/0641De-duplication techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0616Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]

Abstract

The present invention relates to a weak fingerprint repeated data deletion mechanism based on a flash memory solid-state disk. Oriented to online applications with repeated data, the mechanism implements a deduplication system in a solid-state disk by using hardware resources, and deduplication is performed through weak fingerprint calculation and byte direct comparison, so that the defects of the conventional deduplication technology and pre-hash deduplication technology are overcome, a higher deduplication throughput and deduplication rate are achieved, and overheads on the aspects of time, space, hardware resources and the like are minimized. Moreover, due to a low hardware overhead, parallelism in the solid-state disk can be fully used, and a plurality of deduplication parts perform deduplication in parallel, so that a write delay is greatly reduced. The mechanism is capable of combining with a flash memory solid-state disk properly, is capable of effectively eliminating repeated data, and improves the service life and performance of the solid-state disk.

Description

A weak-fingerprint data deduplication mechanism based on flash memory solid-state disks
Technical field
The present invention belongs to the field of data deduplication technology. It provides a weak-fingerprint data deduplication mechanism based on flash memory solid-state disks that deletes duplicate write data, thereby removing the load bottleneck when data is written to the solid-state disk and improving the disk's write performance and space utilization.
Background
With the rapid development of information technology, big data and cloud computing have become mainstream. The explosive growth of data and the continuous increase in computing power place ever higher demands on storage systems, which face challenges in both capacity and performance.
Data deduplication eliminates duplicate data in a storage system and has gradually come to be regarded as an effective technique for reducing storage overhead and relieving capacity pressure. In the big-data era a large amount of data is produced every year, yet many studies have shown that storage systems contain a large amount of duplicate data, whether they are primary storage systems, secondary backup storage systems, or high-performance computing data centers. Because deduplication is efficient and scalable, it is widely used both in underlying storage systems and in upper-layer applications to make storage more efficient. It has become an important component of large-scale storage systems, especially backup systems.
At the same time, new storage media represented by flash memory are developing rapidly: integration density keeps rising and prices keep falling, which relieves the performance pressure on storage systems. A flash memory solid-state disk is built from semiconductor chips and inherits the advantages of flash memory, such as high random-access performance, low power consumption, small size, and vibration resistance, so it has broad prospects. However, flash memory endures only a limited number of erase cycles, which limits its lifetime and hinders the development of flash memory solid-state disks. As a result, flash memory solid-state disks are not suitable for reliability-sensitive environments, and they cannot cope with the write-intensive workloads generated by cloud computing data centers. Fortunately, the data sets that data-center applications write to storage usually contain a large proportion of duplicate data. Appropriate use of deduplication to eliminate this redundant data offers a way to address the lifetime problem of flash memory solid-state disks.
Many researchers have tried to combine flash memory solid-state disks with deduplication by implementing a deduplication system inside the solid-state disk. Traditional deduplication, illustrated in Fig. 1, has four steps:
Step 1, data chunking: choose a chunking strategy, such as fixed-length chunking, chunking by file, or chunking by content, and split the data into chunks;
Step 2, fingerprint calculation: choose a hash function, usually SHA-1 or MD5, and compute the fingerprint of each data chunk; the fingerprint serves as the unique identifier of the chunk and points to the address of the chunk;
Step 3, fingerprint matching: compare the fingerprint of the chunk being deduplicated with the other fingerprints; if an identical fingerprint exists, the chunk is duplicate data, otherwise it is a unique chunk;
Step 4, unique chunk storage: store only the chunks judged unique in step 3, delete the duplicate chunks, and point the fingerprints of duplicate chunks to the same address.
For a large file, it is enough to save the fingerprints of its chunks in order; the file can then be recovered from the addresses that the fingerprints point to. However, traditional deduplication is ill-suited to use inside a solid-state disk. First, computing chunk fingerprints is expensive and reduces the write performance of the solid-state disk. Second, to speed up fingerprint matching, the fingerprints must be kept in the memory inside the solid-state disk; that memory is small and cannot hold a large number of fingerprints, which in turn lowers the deduplication ratio.
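The four traditional steps, and the recovery of a file from its ordered fingerprints, can be illustrated with a minimal Python sketch. It assumes fixed-length 4 KB chunks and SHA-1, and uses an in-memory dictionary as a stand-in for the chunk store; the names `dedup_store`, `restore`, and `CHUNK_SIZE` are hypothetical and do not come from the patent.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk length in bytes


def dedup_store(data: bytes, store: dict) -> list:
    """Chunk `data`, keep only unique chunks in `store`, return the ordered fingerprints."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]      # step 1: fixed-length chunking
        fingerprint = hashlib.sha1(chunk).digest()    # step 2: strong fingerprint
        if fingerprint not in store:                  # step 3: fingerprint matching
            store[fingerprint] = chunk                # step 4: store unique chunks only
        recipe.append(fingerprint)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from the ordered fingerprints."""
    return b"".join(store[fingerprint] for fingerprint in recipe)
```

In this sketch every chunk pays for a SHA-1 computation and every unique chunk adds a 20-byte key to the index, which are the two costs the text identifies as problematic inside a solid-state disk.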
To solve these two problems, researchers have proposed a pre-hashing deduplication technique (Pre-hashing), illustrated in Fig. 2. In online systems the degree of data duplication is not high and most data is unique, so for most data computing a fingerprint is unnecessary. The basic idea of pre-hashing is therefore to compute fingerprints only for data chunks that may be duplicates, reducing both the computing overhead and the metadata space overhead of deduplication. Its steps are as follows:
Step 1, data chunking: because the read/write granularity inside a solid-state disk is a page, fixed-length chunking by page size (for example 4 KB) is generally used for ease of management;
Step 2, weak fingerprint calculation: use a weak hash function, CRC32, to compute the weak fingerprint of every written data chunk and build a weak fingerprint table; compare the weak fingerprint of the chunk being written against the table; because CRC32 has a nonzero collision rate, two chunks with the same weak fingerprint are only possibly duplicates;
Step 3: if the chunk being written is a possible duplicate, compute its fingerprint with SHA-1 and compare the two fingerprints;
Step 4: if the two fingerprints are identical, the chunk is judged to be duplicate data and is deleted; otherwise it is a unique chunk.
Because weak fingerprints are cheap to compute and pre-hashing computes SHA-1 fingerprints only for possibly duplicate chunks, it avoids a large amount of fingerprint computation and improves deduplication throughput. In addition, because a weak fingerprint has few bits, the metadata space needed to deduplicate the same amount of data is smaller than for traditional deduplication. However, because pre-hashing computes fingerprints only for possibly duplicate chunks, it sacrifices much of the deduplication ratio. For example, in Fig. 2, suppose three identical data chunks M are written consecutively. When M is written the first time, only its weak fingerprint is computed, not its SHA-1 fingerprint, so the second M cannot be deleted, which lowers the deduplication ratio.
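The missed duplicate in the three-M example can be reproduced with a small Python sketch of pre-hashing. This is an illustration under assumptions, not the cited technique's actual implementation: the class `PreHashDedup` and its fields are hypothetical, and chunks are held in an in-memory list rather than flash.

```python
import hashlib
import zlib


class PreHashDedup:
    """Toy model of pre-hashing: CRC32 for every chunk, SHA-1 only on a CRC32 hit."""

    def __init__(self):
        self.weak_seen = set()   # CRC32 values of all chunks written so far
        self.strong_seen = set() # SHA-1 values, computed only for suspected duplicates
        self.stored = []         # chunks that were physically written

    def write(self, chunk: bytes) -> bool:
        """Return True if the chunk was removed as a duplicate."""
        weak = zlib.crc32(chunk)
        if weak not in self.weak_seen:
            # First time this weak fingerprint appears: no SHA-1 is computed.
            self.weak_seen.add(weak)
            self.stored.append(chunk)
            return False
        strong = hashlib.sha1(chunk).digest()
        if strong in self.strong_seen:
            return True
        # Suspected duplicate, but there is no earlier SHA-1 to compare against.
        self.strong_seen.add(strong)
        self.stored.append(chunk)
        return False


dedup = PreHashDedup()
m = b"M" * 4096
results = [dedup.write(m) for _ in range(3)]
print(results)            # [False, False, True]
print(len(dedup.stored))  # 2
```

Only the third copy of M is removed: the second copy triggers the SHA-1 calculation but has nothing to match, so two copies end up stored.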
In summary, traditional deduplication relies on a time-consuming fingerprint calculation to find duplicate data. Although it can extend the service life of a solid-state disk, the deduplication system becomes the disk's performance bottleneck, so traditional deduplication cannot be combined with solid-state disks. Pre-hashing reduces the fingerprint calculation overhead, raises deduplication throughput, and improves solid-state disk performance, but it lowers the deduplication ratio.
Summary of the invention
The technical problem to be solved by the present invention is to implement a data deduplication system inside a flash memory solid-state disk for online applications. The deduplication system eliminates duplicate write data to extend the lifetime of the solid-state disk while also solving the computing overhead problem of deduplication, reducing write latency, and improving the read/write performance of the solid-state disk. It also takes the deduplication ratio into account and deletes as much duplicate data as possible. Whether a written page is duplicate data is determined by a direct byte-level comparison between that page and a possibly identical page read from flash storage. The proposed weak-fingerprint deduplication technique integrates closely with the solid-state disk and improves its write performance and service life by deleting duplicate data.
The principle of the weak-fingerprint deduplication mechanism is summarized as follows. The first half is similar to pre-hashing: when a write request arrives, a weak hash function such as CRC computes the weak fingerprint of the data chunk. The weak fingerprints are used to build a deduplication prediction table, and comparing a chunk's weak fingerprint with the values in the table predicts which chunks may be duplicates. After that, instead of computing a SHA-1 fingerprint, the mechanism reads from the underlying flash memory the chunk that may duplicate the chunk being written and compares the two byte by byte to decide whether the chunk is a duplicate. Traditional deduplication and pre-hashing both use SHA-1 fingerprints to decide whether a chunk is a duplicate, which is computationally expensive; the weak-fingerprint mechanism uses byte comparison instead, so the whole process involves no complex fingerprint computation, which thoroughly solves the fingerprint computing overhead problem. Moreover, because the fingerprint calculation and comparison operations of this mechanism are simple and their hardware overhead is small, the parallelism inside the SSD can be used to run multiple deduplication processes that deduplicate data in parallel, greatly increasing deduplication throughput. In terms of space, the weak fingerprint that this mechanism stores is short, for example 32 bits, whereas a SHA-1 fingerprint occupies 160 bits, so the space overhead is small. In terms of deduplication ratio, the mechanism overcomes the drawback of pre-hashing: in theory all duplicate chunks can be identified, so the deduplication ratio is not reduced.
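The space argument (32-bit weak fingerprints versus 160-bit SHA-1 fingerprints) can be made concrete with a short calculation. The sketch below assumes each table entry holds one fingerprint plus a 4-byte physical address and that 1 MiB of controller memory is available; both figures and the function name `entries_that_fit` are illustrative assumptions, not values from the patent.

```python
def entries_that_fit(budget_bytes: int, fingerprint_bits: int, address_bytes: int = 4) -> int:
    """Number of (fingerprint, physical address) entries that fit in the memory budget."""
    entry_bytes = fingerprint_bits // 8 + address_bytes
    return budget_bytes // entry_bytes


budget = 1 << 20  # assumed 1 MiB of on-disk controller memory
print(entries_that_fit(budget, 32))   # 131072 entries with 32-bit weak fingerprints
print(entries_that_fit(budget, 160))  # 43690 entries with 160-bit SHA-1 fingerprints
```

Under these assumptions the same memory tracks about three times as many pages with weak fingerprints, which is why a compact table can cover a longer window of recent writes.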
The deduplication ratio achieved by the weak-fingerprint mechanism is affected mainly by two factors. The first is the size of the deduplication prediction table. The table is kept in memory, and memory inside a solid-state disk is limited, so the table should not be too large. Fortunately, many online applications exhibit value locality: duplicate data chunks usually arrive within a relatively short period of time. It is therefore enough to keep only the weak fingerprints of the most recent period to make deduplication predictions, which solves the space overhead problem. The second is the collision rate of the CRC function. When a collision occurs, the mechanism introduces an extra read operation, which affects solid-state disk performance. This problem can be addressed by adjusting the length of the CRC fingerprint to keep the collision rate within a small range; increasing the CRC length does not sharply increase the computing overhead.
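How the fingerprint length controls the collision rate can be estimated with a simple model. Assuming weak fingerprints are uniformly distributed (an idealization; CRC is not a perfect hash) and the prediction table holds N entries with distinct fingerprints, a unique incoming page falsely matches some entry, and so costs one extra flash read, with probability about N / 2^b for a b-bit fingerprint. The function name and the table size below are hypothetical.

```python
def false_match_probability(table_entries: int, fingerprint_bits: int) -> float:
    """Chance that a unique page matches some table entry, under a uniform-hash assumption."""
    return min(1.0, table_entries / 2 ** fingerprint_bits)


table_entries = 131072  # assumed prediction table size (2**17 entries)
for bits in (16, 32, 64):
    print(bits, false_match_probability(table_entries, bits))
```

For this table size the estimate is 1.0 at 16 bits, roughly 3.05e-05 at 32 bits, and negligible at 64 bits, which is consistent with the text's point that lengthening the CRC keeps extra reads rare.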
The present invention achieves the following beneficial effects:
1. A higher deduplication ratio. SSD memory capacity is limited; compared with traditional deduplication and pre-hashing, the data structure used by the weak-fingerprint mechanism is very compact, so a higher deduplication ratio can be achieved within limited deduplication metadata space, improving the lifetime and space utilization of the solid-state disk;
2. Better solid-state disk performance. Because the fingerprint computing overhead of the weak-fingerprint mechanism is small, and the mechanism can fully exploit the internal parallelism and low read latency of the solid-state disk, multiple deduplication processes can run in parallel. This greatly increases the throughput of the deduplication system and, by reducing write latency, improves the read/write performance of the solid-state disk.
The overhead of implementing the present invention is small in the following three respects:
1. Time overhead: the weak-fingerprint mechanism involves only weak fingerprint calculation and comparison operations, with no complex fingerprint computation, and it makes good use of the internal parallelism and low read latency of the solid-state disk, so the time overhead is small.
2. Space overhead: the weak fingerprint used by the mechanism has few bits and the data structure is compact, so the deduplication metadata occupies little space and the space overhead is small.
3. Hardware resource overhead: the mechanism is implemented inside the solid-state disk and uses hardware resources for calculation and comparison; because the computing overhead is small and the comparison operation is simple, the hardware resource overhead is small.
Brief description of the drawings
Fig. 1 is a schematic diagram of the principle of data deduplication;
Fig. 2 is a schematic diagram of the principle of the pre-hashing deduplication mechanism;
Fig. 3 is an architecture diagram of the weak-fingerprint deduplication mechanism based on a flash memory solid-state disk;
Fig. 4 is a schematic diagram of the principle of the weak-fingerprint deduplication mechanism.
Detailed description of the embodiments
Fig. 1 is a schematic diagram of the principle of data deduplication, comprising data chunking, fingerprint calculation, index table lookup, deletion of duplicate chunks, and storage of unique chunks.
Fig. 2 is a schematic diagram of the principle of the pre-hashing deduplication mechanism, comprising weak fingerprint calculation and SHA-1 fingerprint calculation; in the figure, three identical data chunks M are written consecutively and the second M cannot be deleted.
Fig. 3 is an architecture diagram of the weak-fingerprint deduplication mechanism based on a flash memory solid-state disk used by the present invention; the mechanism is implemented in the flash translation layer (Flash Translation Layer) inside the solid-state disk.
Fig. 4 is a schematic diagram of the principle of the weak-fingerprint deduplication mechanism. The execution process is as follows:
Step 1, data chunking: because the read/write granularity inside a solid-state disk is a page, the file is split into fixed-length chunks of page size (for example 4 KB);
Step 2, weak fingerprint calculation: the CRC function is a weak hash function; choose a CRC function, such as CRC32, according to the space available for the deduplication prediction table; for the data page being written, say m with logical address LogA_m, compute its weak fingerprint CRC32_m, which gives a preliminary identification of the page and is the basis for the duplicate check in the following steps;
Step 3: the flash translation layer builds the address mapping table and the deduplication prediction table; taking data page n as an example, the entries in the two tables have the forms (LogA_n, PhyA_n) and (CRC32_n, PhyA_n), where LogA_n and PhyA_n are the logical address and physical address of data page n; for the data page being written, the flash translation layer does not yet allocate a physical address;
Step 4, prediction table lookup: match the fingerprint computed in step 2 against the fingerprints in the existing prediction table to judge whether the page it represents may be a duplicate; to speed up matching, the prediction table is kept in the memory inside the solid-state disk, and because that memory is limited the size of the prediction table must be controlled;
Step 4.1: if no fingerprint matches, the data page is unique; go to step 6.2;
Step 4.2: if a fingerprint matches, the data page is a possible duplicate; take the physical address PhyA from the matching prediction entry and go to step 5;
Step 5: using the physical address PhyA obtained, read the data page stored at that address in flash memory;
Step 6: compare the page read with the page being written byte by byte; if they are identical, the data being written is a duplicate, go to step 6.1; otherwise it is unique data, go to step 6.2;
Step 6.1: discard the current write request and add the entry (LogA_m, PhyA) to the address mapping table for later data recovery; go to step 7;
Step 6.2: the flash translation layer allocates a physical address PhyA_m for the current write request and adds the entry (LogA_m, PhyA_m) to the address mapping table; if the deduplication prediction table already contains an entry with CRC32_m, replace it with (CRC32_m, PhyA_m); otherwise add the entry (CRC32_m, PhyA_m) for later deduplication; if the prediction table exceeds its predefined size, evict entries on a FIFO basis; go to step 7;
Step 7: end.
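The write path in steps 2 to 7 can be sketched in Python as follows. This is a minimal model under stated assumptions, not the patented implementation: flash is a plain list indexed by physical address, `zlib.crc32` stands in for the CRC hardware, the class and method names are hypothetical, a replaced prediction entry is treated as the newest for FIFO purposes, and garbage collection and reference counting are left out.

```python
import zlib
from collections import OrderedDict


class WeakFingerprintFTL:
    """Toy flash translation layer with weak-fingerprint deduplication."""

    def __init__(self, max_predictions: int = 1024):
        self.flash = []                  # physical address -> page bytes (stand-in for NAND)
        self.mapping = {}                # address mapping table: LogA -> PhyA
        self.prediction = OrderedDict()  # prediction table: CRC32 -> PhyA, oldest first
        self.max_predictions = max_predictions

    def write(self, logical_addr: int, page: bytes) -> bool:
        """Handle one page write; return True if it was removed as a duplicate."""
        crc = zlib.crc32(page)                    # step 2: weak fingerprint
        candidate = self.prediction.get(crc)      # step 4: prediction table lookup
        if candidate is not None:                 # step 4.2: possible duplicate
            stored = self.flash[candidate]        # step 5: read the candidate page
            if stored == page:                    # step 6: byte-by-byte comparison
                self.mapping[logical_addr] = candidate  # step 6.1: map to existing page
                return True
        # Step 6.2: unique page (no match, or a CRC collision with different bytes).
        physical_addr = len(self.flash)           # allocate the next free physical page
        self.flash.append(page)
        self.mapping[logical_addr] = physical_addr
        self.prediction.pop(crc, None)            # replace any entry with the same CRC32
        self.prediction[crc] = physical_addr
        if len(self.prediction) > self.max_predictions:
            self.prediction.popitem(last=False)   # FIFO eviction of the oldest entry
        return False

    def read(self, logical_addr: int) -> bytes:
        """Return the page currently mapped to a logical address."""
        return self.flash[self.mapping[logical_addr]]


ftl = WeakFingerprintFTL(max_predictions=2)
m = b"M" * 4096
print([ftl.write(addr, m) for addr in range(3)])  # [False, True, True]
print(len(ftl.flash))                             # 1
```

Unlike the pre-hashing sketch, the second copy of M is already removed, because the first write leaves a (CRC32, PhyA) entry that the second write can verify by reading the page back. The cost is one extra flash read whenever a weak fingerprint matches.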
Traditional deduplication has a large fingerprint computing overhead, which seriously hinders the performance of the solid-state disk. Pre-hashing avoids a large amount of fingerprint computation and improves deduplication throughput at the cost of a lower deduplication ratio, but it still inevitably performs fingerprint calculation; its hardware resource overhead is large, so multiple deduplication components cannot be built inside the solid-state disk, and it is not suitable for high-performance solid-state disks.
To address both situations, the present invention proposes a weak-fingerprint deduplication technique for online applications that contain duplicate data. The deduplication system is implemented inside the solid-state disk using hardware resources, and deduplication relies only on weak fingerprint calculation and direct byte comparison. This achieves higher deduplication throughput and a higher deduplication ratio while minimizing time, space, and hardware resource overhead. Moreover, because the hardware overhead is low, the internal parallelism of the solid-state disk can be fully exploited so that multiple deduplication components work in parallel, greatly reducing read/write access latency and improving the performance of the solid-state disk. Experimental tests based on the present invention show that the mechanism can reduce access latency by 14% under specific applications and eliminate more than 90% of duplicate data. These results show that the weak-fingerprint deduplication mechanism combines well with flash memory solid-state disks, eliminates duplicate data effectively, and improves the lifetime and performance of the solid-state disk.

Claims (1)

1. A weak-fingerprint data deduplication mechanism based on a flash memory solid-state disk, implemented in the flash translation layer inside the solid-state disk, characterized in that: whether a written page is duplicate data is determined by a direct byte-level comparison between that page and a possibly identical page read from flash storage; a weak-fingerprint deduplication technique is used, in which the low-overhead weak hash function CRC first computes the weak fingerprint of a data page, and matching weak fingerprints predicts whether the data is duplicate; because the CRC function has collisions, the prediction can only find possibly duplicate data pages, and duplicate data pages are then confirmed by direct byte comparison; the specific steps are:
Step 1, data chunking: because the read/write granularity inside a solid-state disk is a page, the file is split into fixed-length chunks of page size;
Step 2, weak fingerprint calculation: the CRC function is a weak hash function; choose a CRC function, such as CRC32, according to the space available for the deduplication prediction table; for the data page being written, say m with logical address LogA_m, compute its weak fingerprint CRC32_m, which gives a preliminary identification of the page and is the basis for the duplicate check in the following steps;
Step 3: the flash translation layer builds the address mapping table and the deduplication prediction table; taking data page n as an example, the entries in the two tables have the forms (LogA_n, PhyA_n) and (CRC32_n, PhyA_n), where LogA_n and PhyA_n are the logical address and physical address of data page n; for the data page being written, the flash translation layer does not yet allocate a physical address;
Step 4, prediction table lookup: match the fingerprint computed in step 2 against the fingerprints in the existing prediction table to judge whether the page it represents may be a duplicate;
Step 4.1: if no fingerprint matches, the data page is unique; go to step 6.2;
Step 4.2: if a fingerprint matches, the data page is a possible duplicate; take the physical address PhyA from the matching prediction entry and go to step 5;
Step 5: using the physical address PhyA obtained, read the data page stored at that address in flash memory;
Step 6: compare the page read with the page being written byte by byte; if they are identical, the data being written is a duplicate, go to step 6.1; otherwise it is unique data, go to step 6.2;
Step 6.1: discard the current write request and add the entry (LogA_m, PhyA) to the address mapping table for later data recovery; go to step 7;
Step 6.2: the flash translation layer allocates a physical address PhyA_m for the current write request and adds the entry (LogA_m, PhyA_m) to the address mapping table; if the deduplication prediction table already contains an entry with CRC32_m, replace it with (CRC32_m, PhyA_m); otherwise add the entry (CRC32_m, PhyA_m) for later deduplication; if the prediction table exceeds its predefined size, evict entries on a FIFO basis; go to step 7;
Step 7: end.
CN201610286135.8A 2016-05-04 2016-05-04 Weak fingerprint repeated data deletion mechanism based on flash memory solid-state disk Pending CN105930101A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610286135.8A CN105930101A (en) 2016-05-04 2016-05-04 Weak fingerprint repeated data deletion mechanism based on flash memory solid-state disk

Publications (1)

Publication Number Publication Date
CN105930101A true CN105930101A (en) 2016-09-07

Family

ID=56834297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610286135.8A Pending CN105930101A (en) 2016-05-04 2016-05-04 Weak fingerprint repeated data deletion mechanism based on flash memory solid-state disk

Country Status (1)

Country Link
CN (1) CN105930101A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514250A (en) * 2013-06-20 2014-01-15 易乐天 Method and system for deleting global repeating data and storage device
CN103955530A (en) * 2014-05-12 2014-07-30 暨南大学 Data reconstruction and optimization method of on-line repeating data deletion system
US20150112941A1 (en) * 2013-10-18 2015-04-23 Power-All Networks Limited Backup management system and method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHENGGUO CHEN, ZHIGUANG CHEN, NONG XIAO, FANG LIU: "NF-Dedupe: A Novel No-fingerprint Deduplication", 《COMPUTERS AND COMMUNICATION (ISCC), 2015 IEEE SYMPOSIUM ON》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111124939A (en) * 2018-10-31 2020-05-08 深信服科技股份有限公司 Data compression method and system based on full flash memory array
US10789003B1 (en) 2019-03-28 2020-09-29 Western Digital Technologies, Inc. Selective deduplication based on data storage device controller status and media characteristics
CN111090397A (en) * 2019-12-12 2020-05-01 苏州浪潮智能科技有限公司 Data deduplication method, system, equipment and computer readable storage medium
CN111090397B (en) * 2019-12-12 2021-10-22 苏州浪潮智能科技有限公司 Data deduplication method, system, equipment and computer readable storage medium
CN115993939A (en) * 2023-03-22 2023-04-21 陕西中安数联信息技术有限公司 Method and device for deleting repeated data of storage system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160907