CN102722583A - Hardware accelerating device for data de-duplication and method - Google Patents

Hardware accelerating device for data de-duplication and method

Info

Publication number
CN102722583A
Authority
CN
China
Prior art keywords
data
module
fingerprint
cutting
duplication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012101878809A
Other languages
Chinese (zh)
Inventor
张庆敏
张衡
胡刚
李天仁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUXI SOUL STORAGE TECHNOLOGY Co Ltd
Original Assignee
WUXI SOUL STORAGE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WUXI SOUL STORAGE TECHNOLOGY Co Ltd
Priority to CN2012101878809A
Publication of CN102722583A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a hardware accelerating device and method for data de-duplication. The device comprises a bus interface receiving module, an intelligent segmenting module, a fingerprint calculating module, a memory RAM (Random-Access Memory) module and a bus interface sending module. The bus interface receiving module receives the data stream to be processed and transmits it to the intelligent segmenting module. The intelligent segmenting module detects and calculates on the data stream and segments it according to a segmenting rule. The fingerprint calculating module hashes the data segmented by the intelligent segmenting module and generates the corresponding fingerprint hash value. The memory RAM module temporarily stores the segmenting information of the intelligent segmenting module and the fingerprint hash value. The bus interface sending module transmits the segmenting information and the fingerprint hash value stored in the memory RAM module to memory and notifies a processor to retrieve and query redundant blocks. In this technical scheme, data block segmentation and data block fingerprint calculation are performed by hardware, which reduces the resource consumption of the processor and leaves it more time to retrieve redundant blocks, thereby improving the overall performance of data de-duplication.

Description

Data de-duplication hardware accelerator and method
Technical field
The present invention relates to the field of data storage, and in particular to a hardware accelerator and method that use an FPGA to perform data de-duplication.
Background art
With the explosive growth of data, enterprises must bear the steadily rising cost of the storage needed for backup and disaster recovery. To relieve this pressure, current storage backup systems and remote disaster recovery systems generally adopt data de-duplication technology.
Data de-duplication, also called intelligent compression or single-instance storage, is a storage technology that automatically searches for duplicate data, keeps only one unique copy of identical data, and replaces the other duplicates with pointers to that single copy, thereby eliminating redundant data and reducing the required storage capacity. Data de-duplication not only optimizes storage space and improves storage efficiency, but also saves the network bandwidth and transmission time spent on storage or backup, which lowers an enterprise's management and time costs for data backup and storage.
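The single-copy-plus-pointers idea in the paragraph above can be illustrated with a short software sketch. It is only an illustration under assumptions: the class name DedupStore and its methods are hypothetical, blocks are assumed to be already segmented, and SHA-512 is used as the fingerprint purely for convenience.

```python
import hashlib


class DedupStore:
    """Keeps one copy of each distinct block; duplicates become references."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> the single stored copy of a block
        self.files = {}   # file name -> list of fingerprints (the "pointers")

    def put(self, name, blocks):
        refs = []
        for block in blocks:
            fp = hashlib.sha512(block).hexdigest()
            # Store the block only if this fingerprint has not been seen before.
            self.blocks.setdefault(fp, block)
            refs.append(fp)
        self.files[name] = refs

    def get(self, name):
        # Rebuild the file by following each pointer to its stored block.
        return b"".join(self.blocks[fp] for fp in self.files[name])


store = DedupStore()
store.put("a.bak", [b"header", b"payload", b"payload"])
store.put("b.bak", [b"header", b"other"])
print(len(store.blocks))  # number of unique blocks actually stored
```

Five blocks are written in total, but only the distinct ones occupy storage; each file is still fully recoverable from its list of fingerprints.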
According to the granularity of de-duplication, the technology can be divided into file-level, block-level, and byte-level data de-duplication. File-level de-duplication is fast, but its redundancy elimination rate is relatively low. Byte-level de-duplication achieves a relatively high redundancy elimination rate, but requires relatively long processing time. Block-level de-duplication combines the advantages of both, balancing the elimination rate against processing time.
Storage and backup products commonly use block-level de-duplication, whose most important parts are file data block segmentation, data block fingerprint calculation, and retrieval of identical data blocks. The way data blocks are segmented determines the de-duplication rate, while fingerprint calculation and identical-block retrieval determine overall performance.
Patent No. 200910136595.2 (application publication number CN101882141A) discloses a method and system for data de-duplication, which provides a technical scheme for deleting redundant data. However, it implements file data block segmentation, data block fingerprint calculation, identical-block lookup, and compression in software, which consumes a large amount of processor resources and degrades processor performance and computational efficiency.
Summary of the invention
In view of the above problems, the object of the present invention is to provide a data de-duplication hardware accelerator and method that reduce the load on the processor and improve the speed and efficiency of de-duplication.
To achieve this object, the present invention adopts the following technical scheme:
A data de-duplication hardware accelerator, comprising:
A bus interface receiving module: receives the data stream to be processed and transmits it to the intelligent segmenting module;
An intelligent segmenting module: detects and calculates on the data stream, and segments it according to a segmenting rule;
A fingerprint calculating module: performs a hash calculation on the data segmented by the intelligent segmenting module and generates the corresponding fingerprint hash value;
A memory RAM module: temporarily stores the segmenting information of the intelligent segmenting module and the fingerprint hash value;
A bus interface sending module: transmits the segmenting information and fingerprint hash value stored in the memory RAM module to memory, and notifies the processor to retrieve and query redundant blocks.
According to a preferred embodiment of the invention, the device further comprises a compression module, which compresses the data segmented by the intelligent segmenting module to further reduce the space the data occupies; the compressed data is temporarily stored in the memory RAM module.
According to a preferred embodiment of the invention, the device is based on an FPGA circuit design.
According to a preferred embodiment of the invention, the segmenting rule in the intelligent segmenting module is a content-based chunking method. This method determines data block boundaries with a continuously sliding window: the Rabin fingerprint algorithm calculates the fingerprint of the sliding window, and if the fingerprint satisfies a predetermined condition, the start position of the window is taken as the end of a data block. Sliding the window and calculating fingerprints in this way divides the data object into blocks.
According to a preferred embodiment of the invention, the fingerprint calculating module uses the Whirlpool algorithm.
The invention also discloses a data de-duplication method for the data de-duplication hardware accelerator, comprising the following steps:
The bus interface receiving module receives a command from the processor, reads the corresponding data from memory according to the given address information, and passes the data read to the intelligent segmenting module as a data stream;
The intelligent segmenting module performs the segmenting calculation on the received data stream and, according to the segmenting rule, cuts the continuous data stream into a number of data blocks;
The fingerprint calculating module performs a hash operation on the data blocks produced by the intelligent segmenting module and generates the corresponding fingerprint hash values;
The segmenting information of the intelligent segmenting module and the fingerprint hash values produced by the fingerprint calculating module are temporarily stored in the memory RAM module;
The bus interface sending module takes the information on the segmented data blocks out of the memory RAM module, delivers it to memory, and at the same time sends an interrupt to notify the processor to perform the retrieval operation.
According to a preferred embodiment of the invention, the data blocks produced by the intelligent segmenting module can be compressed by a compression module to further reduce the space the data occupies; the compressed data is temporarily stored in the memory RAM module.
According to a preferred embodiment of the invention, the segmenting rule is a content-based chunking method. This method determines data block boundaries with a continuously sliding window: the Rabin fingerprint algorithm calculates the fingerprint of the sliding window, and if the fingerprint satisfies a predetermined condition, the start position of the window is taken as the end of a data block. Sliding the window and calculating fingerprints in this way divides the data object into blocks.
According to a preferred embodiment of the invention, the hash operation uses the Whirlpool algorithm.
The technical scheme of the present invention performs data block segmentation and data block fingerprint calculation in hardware. Throughout the process, the processor only needs to tell the de-duplication hardware acceleration core where the data stream is stored and to specify where the information on the segmented data blocks should be placed; the segmentation of the whole data stream and the fingerprint calculation require no processor involvement. This reduces processor resource consumption and leaves the processor more time for redundant block retrieval, thereby improving the overall performance of data de-duplication. An FPGA (field-programmable gate array) performs the intelligent segmenting rule, the data block fingerprint calculation, and the compression algorithm in parallel, realizing a hardware acceleration core for data de-duplication and thus improving the overall performance of the de-duplication function.
The technical scheme of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
Description of drawings
The accompanying drawings provide a further understanding of the present invention and form part of the specification. Together with the embodiments of the invention they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is a schematic diagram of the structure of the data de-duplication hardware accelerator according to an embodiment of the invention;
Fig. 2 is a flow chart of the data de-duplication method according to an embodiment of the invention.
Embodiment
The preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the present invention and do not limit it.
As shown in Fig. 1, the data de-duplication hardware accelerator based on an FPGA circuit design comprises: a bus interface receiving module: receives the data stream to be processed and transmits it to the intelligent segmenting module;
An intelligent segmenting module: detects and calculates on the data stream, and segments it according to a segmenting rule;
A fingerprint calculating module: performs a hash calculation on the data segmented by the intelligent segmenting module and generates the corresponding fingerprint hash value;
A memory RAM module: temporarily stores the segmenting information of the intelligent segmenting module and the fingerprint hash value;
A bus interface sending module: transmits the segmenting information and fingerprint hash value stored in the memory RAM module to memory, and notifies the processor to retrieve and query redundant blocks.
If the data de-duplication hardware acceleration core supports compression of the segmented data, the compression module compresses the data segmented by the intelligent segmenting module to further reduce the space the data occupies; the compressed data is temporarily stored in the memory RAM module. Segmenting the data stream, hashing the data blocks, and compressing them proceed simultaneously, so when the segmentation of a data block is complete, its fingerprint hash value and compression result are complete as well, and these results are stored in the memory RAM module together with the segmenting information of the data block. The segmenting rule in the intelligent segmenting module is a content-based chunking method. This method determines data block boundaries with a continuously sliding window: the Rabin fingerprint algorithm calculates the fingerprint of the sliding window, and if the fingerprint satisfies a predetermined condition, the start position of the window is taken as the end of a data block. Sliding the window and calculating fingerprints in this way divides the data object into blocks. The fingerprint calculating module uses the Whirlpool algorithm.
As shown in Fig. 2, a data de-duplication method for the data de-duplication hardware accelerator proceeds as follows:
(1) the bus interface receiving module receives a notification from the processor, fetches the corresponding data from memory according to the given address information, and passes it to the intelligent segmenting module as a data stream;
(2) the intelligent segmenting module performs the segmenting calculation on the received data stream and, according to the selected segmenting rule, cuts the continuous data stream into a number of data blocks; the module passes on the data that has taken part in the calculation;
(3) the fingerprint calculating module performs a hash operation on the data blocks whose segmentation is complete and generates the corresponding fingerprint hash values;
(4) if the data de-duplication hardware acceleration core supports compression of the segmented data, the compression module, like the fingerprint calculating module, compresses the data blocks whose segmentation is complete;
(5) the bus interface sending module takes the information on the segmented data blocks out of the memory RAM module, delivers it to the corresponding location in memory, and at the same time sends an interrupt to notify the processor to perform the next retrieval operation.
Segmenting the data stream, hashing the data blocks, and compressing them proceed simultaneously, so when the segmentation of a data block is complete, its fingerprint hash value and compression result are complete as well, and these results are stored in the memory RAM module together with the segmenting information of the data block.
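Steps (1) to (5) can be mirrored in software to make the data flow concrete. The sketch below is an illustration under assumptions, not the hardware design: fixed-length segmentation keeps it short, SHA-512 stands in for Whirlpool (which Python's hashlib does not guarantee to provide), zlib stands in for the unspecified compression algorithm, and the names SegmentRecord, accelerate, and retrieve_redundant are hypothetical.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass
class SegmentRecord:
    offset: int        # where the block starts in the input stream
    length: int        # block length in bytes
    fingerprint: str   # fingerprint hash value of the block
    compressed: bytes  # compressed block (the optional compression step)


def accelerate(stream, block_size=4096):
    """Accelerator side: segment, fingerprint, and compress each block."""
    ram_buffer = []  # plays the role of the memory RAM module
    for offset in range(0, len(stream), block_size):
        block = stream[offset:offset + block_size]
        ram_buffer.append(SegmentRecord(
            offset=offset,
            length=len(block),
            fingerprint=hashlib.sha512(block).hexdigest(),
            compressed=zlib.compress(block),
        ))
    return ram_buffer


def retrieve_redundant(records, index):
    """Processor side: look up each fingerprint and report redundant blocks."""
    duplicates = []
    for rec in records:
        if rec.fingerprint in index:
            duplicates.append(rec.offset)       # block already known
        else:
            index[rec.fingerprint] = rec.offset  # first time this block is seen
    return duplicates


stream = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
records = accelerate(stream)
dups = retrieve_redundant(records, {})
print(dups)  # offsets of blocks whose fingerprint was already in the index
```

The split between the two functions reflects the division of labour in the text: everything in accelerate is work the hardware takes over, while the processor only performs the index lookup.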
The segmenting rule in the intelligent segmenting module can use different segmenting methods. This technical scheme uses content-based chunking, which determines data block boundaries with a continuously sliding window: the Rabin fingerprint algorithm calculates the fingerprint of the sliding window, and if the fingerprint satisfies a predetermined condition, the start position of the window is taken as the end of a data block. Continuously sliding the window and calculating fingerprints in this way divides the data object into blocks. Content-based chunking segments the data effectively and is convenient to implement in hardware. The segmenting rule can also be a fixed-length rule or a sliding-block method.
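The content-based chunking rule described above can be sketched in software as follows. This is a minimal illustration, not the patented circuit: it substitutes a simple polynomial rolling hash (Rabin-Karp style) for the Rabin fingerprint, and the window size, boundary mask, and minimum and maximum block sizes are assumed values that the source does not specify. As the text states, a block ends at the start position of the window whose fingerprint meets the condition.

```python
import random

WINDOW = 16          # bytes in the sliding window (assumed)
BASE = 257           # polynomial base of the rolling hash (assumed)
MOD = (1 << 31) - 1  # modulus of the rolling hash (assumed)
MASK = 0xFF          # condition: low 8 bits of the window hash are zero (assumed)
MIN_SIZE = 64        # smallest block the rule may produce (assumed)
MAX_SIZE = 1024      # a cut is forced once a block reaches this size (assumed)


def chunk_boundaries(data):
    """Return the end offset (exclusive) of every block in data."""
    cuts = []
    last = 0  # start of the block currently being built
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # take in the newest byte
        start = i - WINDOW + 1                      # first byte of the window
        size = start - last
        if size < MIN_SIZE:
            continue  # window not full yet, or block still too short
        if (h & MASK) == 0 or size >= MAX_SIZE:
            cuts.append(start)  # window start becomes the end of the block
            last = start
    if last < len(data):
        cuts.append(len(data))  # whatever remains forms the final block
    return cuts


def split(data):
    blocks, prev = [], 0
    for cut in chunk_boundaries(data):
        blocks.append(data[prev:cut])
        prev = cut
    return blocks


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
shifted = b"X" + original  # one byte inserted at the front
blocks_a = split(original)
blocks_b = split(shifted)
shared = set(blocks_a) & set(blocks_b)
print(len(blocks_a), len(blocks_b), len(shared))
```

Because the boundary test depends only on the bytes inside the window, inserting a byte near the start of a stream usually disturbs only the first block or two; later blocks keep the same content and therefore the same fingerprints, which fixed-length segmentation cannot offer.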
The fingerprint calculating module can use different algorithms to calculate the fingerprint of a data block. The fingerprint of a data block is usually obtained by applying a mathematical operation to the block's contents. Hash functions come closest to the ideal, for example MD5, SHA1, SHA-256, SHA-512, RabinHash and Whirlpool, but all of these hash functions are subject to collisions, that is, different data blocks may produce the same data fingerprint. To increase data safety and reduce the probability of a collision, the fingerprint calculating module can select an algorithm with a lower collision probability, such as Whirlpool or SHA-512, and can also compute the hash fingerprint of the same data block with two hash algorithms at the same time. This technical scheme uses the Whirlpool algorithm, which can be implemented efficiently, to calculate the fingerprint hash values of data blocks. Other hash algorithms, or other algorithms that achieve the object of the invention and can compute a data fingerprint, such as CRC32, all fall within the protection scope of the present invention.
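The option of fingerprinting the same block with two hash algorithms can be sketched as below. The pairing of SHA-512 and SHA-256 is an assumption chosen because both are named in the text and are always available in Python's hashlib; the source itself prefers Whirlpool, which hashlib provides only on some builds. The function name dual_fingerprint is hypothetical.

```python
import hashlib
import zlib


def dual_fingerprint(block):
    """Fingerprint a block with two independent hash algorithms.

    Two blocks are treated as identical only if both digests match, so a
    collision in one algorithm alone does not cause a false match.
    """
    return (hashlib.sha512(block).hexdigest(),
            hashlib.sha256(block).hexdigest())


fp_a = dual_fingerprint(b"block contents")
fp_b = dual_fingerprint(b"block contents")
fp_c = dual_fingerprint(b"other contents")
print(fp_a == fp_b, fp_a == fp_c)

# For comparison, CRC32 yields only a 32-bit value, so collisions are far
# more likely than with a 512-bit digest.
crc = zlib.crc32(b"block contents")
```

The cost of the second digest is extra computation and a longer fingerprint to store and compare, which is the kind of work the text proposes to move into hardware.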
Because data segmentation and fingerprint calculation are slow relative to the bus interface, the present invention can use multiple intelligent segmenting modules, fingerprint calculating modules, compression modules, and memory RAM modules to segment, fingerprint, and compress data blocks simultaneously, improving the overall computing performance of the de-duplication hardware acceleration core. The FPGA chip in the technical scheme of the present invention can be replaced by an ASIC chip or another platform.
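A software analogy for using several processing modules side by side is a worker pool in which each worker fingerprints and compresses one block at a time. This is only a sketch of the idea: the lane count, the use of threads, SHA-512, and zlib are all assumptions, and the function names are hypothetical.

```python
import hashlib
import zlib
from concurrent.futures import ThreadPoolExecutor


def process_block(block):
    """Work done by one lane: fingerprint and compress a single block."""
    return hashlib.sha512(block).hexdigest(), zlib.compress(block)


def process_parallel(blocks, lanes=4):
    """Spread blocks across several lanes; results keep the input order."""
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        return list(pool.map(process_block, blocks))


blocks = [bytes([i]) * 4096 for i in range(8)]
results = process_parallel(blocks)
print(len(results))
```

Keeping the results in input order matters because the segmenting information stored with each fingerprint must still describe where the block sits in the original stream.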
Finally, it should be noted that the above are only preferred embodiments of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, a person skilled in the art can still modify the technical schemes recorded in those embodiments or replace some of their technical features with equivalents. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims (9)

1. A data de-duplication hardware accelerator, characterized in that it comprises:
A bus interface receiving module: receives the data stream to be processed and transmits it to the intelligent segmenting module;
An intelligent segmenting module: detects and calculates on the data stream, and segments it according to a segmenting rule;
A fingerprint calculating module: performs a hash calculation on the data segmented by the intelligent segmenting module and generates the corresponding fingerprint hash value;
A memory RAM module: temporarily stores the segmenting information of the intelligent segmenting module and the fingerprint hash value;
A bus interface sending module: transmits the segmenting information and fingerprint hash value stored in the memory RAM module to memory, and notifies the processor to retrieve and query redundant blocks.
2. The data de-duplication hardware accelerator according to claim 1, characterized in that it further comprises a compression module, which compresses the data segmented by the intelligent segmenting module to further reduce the space the data occupies; the compressed data is temporarily stored in the memory RAM module.
3. The data de-duplication hardware accelerator according to claim 1 or 2, characterized in that the device is based on an FPGA circuit design.
4. The data de-duplication hardware accelerator according to claim 3, characterized in that the segmenting rule in the intelligent segmenting module is a content-based chunking method, which determines data block boundaries with a continuously sliding window: the Rabin fingerprint algorithm calculates the fingerprint of the sliding window, and if the fingerprint satisfies a predetermined condition, the start position of the window is taken as the end of a data block, so that sliding the window and calculating fingerprints divides the data object into blocks.
5. The data de-duplication hardware accelerator according to claim 3, characterized in that the fingerprint calculating module uses the Whirlpool algorithm.
6. A data de-duplication method for the data de-duplication hardware accelerator according to claims 1 to 5, characterized in that it comprises the following steps:
The bus interface receiving module receives a command from the processor, reads the corresponding data from memory according to the given address information, and passes the data read to the intelligent segmenting module as a data stream;
The intelligent segmenting module performs the segmenting calculation on the received data stream and, according to the segmenting rule, cuts the continuous data stream into a number of data blocks;
The fingerprint calculating module performs a hash operation on the data blocks produced by the intelligent segmenting module and generates the corresponding fingerprint hash values;
The segmenting information of the intelligent segmenting module and the fingerprint hash values produced by the fingerprint calculating module are temporarily stored in the memory RAM module;
The bus interface sending module takes the information on the segmented data blocks out of the memory RAM module, delivers it to memory, and at the same time sends an interrupt to notify the processor to perform the retrieval operation.
7. The data de-duplication method according to claim 6, characterized in that the data blocks produced by the intelligent segmenting module can be compressed by a compression module to further reduce the space the data occupies, and the compressed data is temporarily stored in the memory RAM module.
8. The data de-duplication method according to claim 7, characterized in that the segmenting rule is a content-based chunking method, which determines data block boundaries with a continuously sliding window: the Rabin fingerprint algorithm calculates the fingerprint of the sliding window, and if the fingerprint satisfies a predetermined condition, the start position of the window is taken as the end of a data block, so that sliding the window and calculating fingerprints divides the data object into blocks.
9. The data de-duplication method according to claim 8, characterized in that the hash operation uses the Whirlpool algorithm.
CN2012101878809A 2012-06-07 2012-06-07 Hardware accelerating device for data de-duplication and method Pending CN102722583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012101878809A CN102722583A (en) 2012-06-07 2012-06-07 Hardware accelerating device for data de-duplication and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012101878809A CN102722583A (en) 2012-06-07 2012-06-07 Hardware accelerating device for data de-duplication and method

Publications (1)

Publication Number Publication Date
CN102722583A true CN102722583A (en) 2012-10-10

Family

ID=46948344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012101878809A Pending CN102722583A (en) 2012-06-07 2012-06-07 Hardware accelerating device for data de-duplication and method

Country Status (1)

Country Link
CN (1) CN102722583A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102378973A (en) * 2009-03-30 2012-03-14 爱萨有限公司 System and method for data deduplication
CN102467571A (en) * 2010-11-17 2012-05-23 英业达股份有限公司 Data block partition method and addition method for data de-duplication
CN102253820A (en) * 2011-06-16 2011-11-23 华中科技大学 Stream type repetitive data detection method

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930004A (en) * 2012-10-29 2013-02-13 华为技术有限公司 Hash value storage method, device and chip
CN102930004B (en) * 2012-10-29 2015-07-08 华为技术有限公司 Hash value storage method, device and chip
CN103345449A (en) * 2013-06-19 2013-10-09 暨南大学 Method and system for prefetching fingerprints oriented to data de-duplication technology
CN103345449B (en) * 2013-06-19 2016-12-28 暨南大学 A kind of fingerprint forecasting method towards data de-duplication technology and system
US11775503B2 (en) 2013-10-16 2023-10-03 Netapp, Inc. Technique for global deduplication across datacenters with minimal coordination
CN105706041A (en) * 2013-10-16 2016-06-22 网络装置公司 Technique for global deduplication across datacenters with minimal coordination
US11301455B2 (en) 2013-10-16 2022-04-12 Netapp, Inc. Technique for global deduplication across datacenters with minimal coordination
US10685013B2 (en) 2013-10-16 2020-06-16 Netapp Inc. Technique for global deduplication across datacenters with minimal coordination
CN105706041B (en) * 2013-10-16 2019-07-19 Netapp股份有限公司 For carrying out the technology of global duplicate removal between the data center with minimum cooperation
CN104933010B (en) * 2014-03-18 2019-02-19 华为技术有限公司 A kind of data de-duplication method and device
CN104933010A (en) * 2014-03-18 2015-09-23 华为技术有限公司 Duplicated data deleting method and apparatus
CN104270454A (en) * 2014-10-14 2015-01-07 无锡云捷科技有限公司 CDN dynamic application acceleration method based on data transmission optimizing system
CN105931278A (en) * 2015-02-28 2016-09-07 阿尔特拉公司 Methods And Apparatus For Two-dimensional Block Bit-stream Compression And Decompression
CN105487819A (en) * 2015-11-30 2016-04-13 上海爱数信息技术股份有限公司 Task policy based memory level data quick storage method
CN105677238A (en) * 2015-12-28 2016-06-15 国云科技股份有限公司 Method for distributed storage based data deduplication on virtual machine system disk
CN106933701A (en) * 2015-12-30 2017-07-07 伊姆西公司 For the method and apparatus of data backup
US11334255B2 (en) 2015-12-30 2022-05-17 EMC IP Holding Company LLC Method and device for data replication
CN105653209A (en) * 2015-12-31 2016-06-08 浪潮(北京)电子信息产业有限公司 Object storage data transmitting method and device
US10437817B2 (en) 2016-04-19 2019-10-08 Huawei Technologies Co., Ltd. Concurrent segmentation using vector processing
CN107004031A (en) * 2016-04-19 2017-08-01 华为技术有限公司 Split while using Vector Processing
CN108415671B (en) * 2018-03-29 2021-04-27 上交所技术有限责任公司 Method and system for deleting repeated data facing green cloud computing
CN108415671A (en) * 2018-03-29 2018-08-17 上交所技术有限责任公司 Data de-duplication method and system oriented to green cloud computing
CN110083743A (en) * 2019-03-28 2019-08-02 哈尔滨工业大学(深圳) Fast similar-data detection method based on uniform sampling
CN112667144A (en) * 2019-10-16 2021-04-16 北京白山耘科技有限公司 Data block construction and comparison method, device, medium and equipment
CN112162973A (en) * 2020-09-17 2021-01-01 华中科技大学 Fingerprint collision avoidance, deduplication and recovery method, storage medium and deduplication system
CN113672619A (en) * 2021-08-17 2021-11-19 天津南大通用数据技术股份有限公司 Method for segmenting data more uniformly according to hash rule
CN113672619B (en) * 2021-08-17 2024-02-06 天津南大通用数据技术股份有限公司 Method for segmenting data according to hash rule to make data more uniform
CN114415955A (en) * 2022-01-05 2022-04-29 上海交通大学 Block granularity data deduplication system and method based on fingerprints
CN114415955B (en) * 2022-01-05 2024-04-09 上海交通大学 Fingerprint-based block granularity data deduplication system and method
CN115509763A (en) * 2022-10-31 2022-12-23 新华三信息技术有限公司 Fingerprint calculation method and device

Similar Documents

Publication Publication Date Title
CN102722583A (en) Hardware accelerating device for data de-duplication and method
CN101989929B (en) Disaster recovery data backup method and system
CN104932956B (en) Cloud disaster-tolerant backup method for big data
CN102782643B (en) Index searching using a Bloom filter
CN102323958A (en) Data de-duplication method
CN103116661B (en) Data processing method for a database
CN104932841A (en) Saving type duplicated data deleting method in cloud storage system
WO2017096532A1 (en) Data storage method and apparatus
CN101968796B (en) File-level variable-length data segmentation method executed bidirectionally and concurrently
WO2017020576A1 (en) Method and apparatus for file compaction in key-value storage system
CN113836084A (en) Data storage method, device and system
US20140222770A1 (en) De-duplication data bank
Zhang et al. Survey of research on big data storage
CN104735110A (en) Metadata management method and system
CN113535706A (en) Two-stage cuckoo filter and repeated data deleting method based on two-stage cuckoo filter
CN108415671B (en) Method and system for deleting repeated data facing green cloud computing
CN105487942A (en) Backup and remote copy method based on data deduplication
CN104317676A (en) Data backup disaster tolerance method
CN102810108A (en) Method for processing repeated data
CN105095027A (en) Data backup method and apparatus
CN105630810A (en) Method for uploading mass small files in distributed storage system
Kumar et al. Bucket based data deduplication technique for big data storage system
CN110618790B (en) Fog storage data redundancy elimination method based on repeated data deletion
CN113535705B (en) SFAD cuckoo filter and repeated data deleting method based on SFAD cuckoo filter
Kim et al. Design and implementation of binary file similarity evaluation system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C53 Correction of patent of invention or patent application
CB02 Change of applicant information

Address after: 214122 Jiangsu Province, Wuxi City District Road No. 18 Wuxi Zhenze National Software Park Building 6 layer A Taurus

Applicant after: SOUL Storage Technology (Wuxi) Co., Ltd.

Address before: 214122 Wuxi Road, Jiangsu District, a city in Wuxi province Wuxi No. 18 National Software Park Building 6 layer A Taurus

Applicant before: Wuxi SOUL Storage Technology Co., Ltd.

COR Change of bibliographic data

Free format text: CORRECT: APPLICANT; FROM: WUXI SOUL STORAGE TECHNOLOGY CO., LTD. TO: SOUL STORAGE TECHNOLOGY (WUXI) CO., LTD.

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB02 Change of applicant information

Address after: 214122 Jiangsu Province, Wuxi City District Road No. 18 Wuxi Zhenze National Software Park Building 6 layer A Taurus

Applicant after: WUXI SOUL DATA COMPUTING CO., LTD.

Address before: 214122 Jiangsu Province, Wuxi City District Road No. 18 Wuxi Zhenze National Software Park Building 6 layer A Taurus

Applicant before: SOUL Storage Technology (Wuxi) Co., Ltd.

COR Change of bibliographic data

Free format text: CORRECT: APPLICANT; FROM: SOUL STORAGE TECHNOLOGY (WUXI) CO., LTD. TO: SOUL DATA COMPUTING (WUXI) CO., LTD.

C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20121010