CN111177092A - Deduplication method and device based on erasure codes - Google Patents

Publication number
CN111177092A
Authority
CN
China
Prior art keywords
data
data block
stored
storage table
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911251209.4A
Other languages
Chinese (zh)
Inventor
唐聃
刘龙祥
蔡红亮
何磊
耿微
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu University of Information Technology
Original Assignee
Chengdu University of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-12-09
Filing date
2019-12-09
Publication date
2020-05-19
Application filed by Chengdu University of Information Technology
Priority to CN201911251209.4A
Publication of CN111177092A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1748 De-duplication implemented within the file system, e.g. based on file segments
    • G06F16/1752 De-duplication implemented within the file system, e.g. based on file segments based on file chunks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data deduplication method and device based on erasure codes. The method comprises the following steps: performing security processing on a data block to be stored by using a nonlinear Hash function to obtain a security data block; processing the security data block with an erasure code to obtain a storage value of the data block; judging whether the data block is a duplicate data block according to the storage value of the data block and a pre-stored data storage table; and processing the data block to be stored according to the judgment result.

Description

Deduplication method and device based on erasure codes
Technical Field
The invention relates to the technical field of data storage, in particular to a data de-duplication method and device based on erasure codes.
Background
In the 21st century, with the advent of the information age, management information systems (MIS) have been adopted across industries worldwide. An MIS strengthens information management by using computers and network communication to collect, collate, and process enterprise data, so that decision-makers can analyze the resulting information resources and improve the management level and performance of the enterprise. The data volume of modern enterprises grows exponentially, and the storage capacity they require has risen from tens of GB, through tens of TB, to several PB. The big-data era is no longer merely theoretical; it has arrived. Research indicates that nearly 60% of stored data is duplicated. Duplicate data not only wastes storage space but also reduces data processing speed and calculation accuracy. Reducing the number of copies of duplicate data blocks has therefore become an effective way to reduce the required storage capacity and save storage space.
Deduplication is a data reduction technique that can efficiently optimize storage capacity. IDC (International Data Corporation) defines deduplication as a technique that normalizes duplicate data into a single shared data object to improve storage capacity efficiency. The purpose of deduplication is to globally remove redundant data from a storage system, including redundancy both within files and between files, whereas conventional data compression can only remove redundant information inside a single file. Compared with conventional compression, deduplication reduces data much more effectively: for specific application data the deduplication ratio can reach 300:1 or even higher, whereas conventional data compression achieves only about 2:1.
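To put the two ratios quoted above in perspective, the following minimal sketch converts a reduction ratio into the fraction of raw capacity saved. The function name `space_saved` is an illustrative assumption and does not come from the patent.

```python
def space_saved(ratio: float) -> float:
    """Fraction of raw capacity saved at a reduction ratio of `ratio`:1."""
    if ratio < 1:
        raise ValueError("a reduction ratio is at least 1:1")
    return 1.0 - 1.0 / ratio


# Compare conventional compression (about 2:1) with the quoted 300:1 figure.
for ratio in (2, 300):
    print(f"{ratio}:1 -> {space_saved(ratio):.2%} of raw capacity saved")
```

At 2:1 half of the raw capacity is saved, while at 300:1 only one part in 300 still needs to be stored.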
The key to deduplication is detecting duplicate data, that is, determining whether a file, a data block, or even a byte in the storage system is duplicated; how efficiently duplicates are removed depends on how files are divided. There are currently two main types of deduplication. File-level deduplication detects identical files, including files at different locations that have different names but the same content, and thereby avoids storing the same file repeatedly. Block-level deduplication detects identical data blocks within files and ensures that each data block is stored only once.
Deduplication exploits identity and similarity both between files and within files; the finer the processing granularity, the more redundant data is removed. Today, duplicate data is generally detected with a Hash algorithm, and MD5 and SHA-1 are the Hash algorithms most widely used for this purpose. A Hash algorithm is generally applied in one of two ways: full file Hash and file blocking Hash.
Full file Hash is a method for finding duplicate data at file granularity. Because a storage system generally treats a file as the unit of an information set, the first deduplication approach to be considered compares files against one another. For files already stored in the storage system, their hash function values are first calculated (usually using MD5 or SHA-1) and organized into a hash function value library that is stored separately. Deduplication is only worthwhile when the application contains a large amount of duplicate data; otherwise, storing the hash function values of the files actually wastes storage space. When a new file arrives at the storage system, its hash function value is calculated and compared with the values already stored in the hash function value library. If the value is found in the library, the new file is judged to be identical to a stored file, and only a pointer to the stored file is needed in place of the new file. If the value is not found in the library, the file is judged not to be in the storage system; the file is stored and the hash function value library is updated with the new file's hash function value.
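The full file Hash procedure described above can be sketched as follows. This is a minimal in-memory illustration, not an implementation taken from the patent: the class name `FileStore`, its attributes, and the choice of SHA-1 via Python's `hashlib` are assumptions made for the example.

```python
import hashlib


class FileStore:
    """Toy whole-file deduplicating store (illustrative only)."""

    def __init__(self):
        self.digest_library = {}  # hash function value -> file id
        self.files = {}           # file id -> stored content
        self.names = {}           # file name -> file id (the "pointer")

    def put(self, name: str, content: bytes) -> bool:
        """Store a file; return True if new content was actually written."""
        digest = hashlib.sha1(content).hexdigest()
        is_new = digest not in self.digest_library
        if is_new:
            # Unknown value: store the file and update the library.
            file_id = len(self.files)
            self.files[file_id] = content
            self.digest_library[digest] = file_id
        # In both cases the name only points at the stored copy.
        self.names[name] = self.digest_library[digest]
        return is_new

    def get(self, name: str) -> bytes:
        return self.files[self.names[name]]


store = FileStore()
store.put("report.txt", b"quarterly figures")
store.put("copy-of-report.txt", b"quarterly figures")  # same content, new name
print(len(store.files), "file(s) physically stored")
```

Two names are registered but only one copy of the content is kept, which is the behaviour the paragraph describes for files with different names and identical content.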
File blocking Hash resembles data compression, and in particular dictionary-based compression algorithms. The file is first divided into data blocks, and a Hash value is then calculated for each block. The simplest way to divide a file is to use a fixed block size. Alternatively, variable-size data blocks can be delimited by a sliding window: a block boundary is created whenever the Hash value of the sliding window matches a reference value. In general, a Rabin fingerprint is used for this calculation, and the variation in block size can be bounded by setting lower and upper limits on the block size. Data blocks are stored in much the same way as in full file Hash, with identical blocks identified by linear block numbers. A fixed block size removes the need for a block partitioning algorithm, but it reduces the ability to detect identical content, because data shifted by an insertion or deletion no longer lines up with the block boundaries.
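The sliding-window partitioning described above can be sketched as follows. This is a simplified illustration under stated assumptions: a plain polynomial rolling hash stands in for a true Rabin fingerprint, and the constants `WINDOW`, `BASE`, `MOD`, `MASK`, `MIN_SIZE`, and `MAX_SIZE` are arbitrary demonstration values rather than parameters from the patent.

```python
import random
from typing import List

WINDOW = 16                    # bytes covered by the sliding window
BASE = 257                     # polynomial base of the rolling hash
MOD = (1 << 31) - 1            # modulus of the rolling hash
MASK = (1 << 6) - 1            # boundary when the low 6 bits are all zero
MIN_SIZE, MAX_SIZE = 32, 256   # lower and upper limits on the block size


def chunk(data: bytes) -> List[bytes]:
    """Split data at content-defined boundaries within the size limits."""
    chunks = []
    start = 0                          # first byte of the current block
    h = 0                              # hash of the current window
    top = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            # Window is full: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])    # trailing block may be short
    return chunks


# Demo: insert one byte at the front and count blocks shared with the original.
rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(4096))
shifted = b"X" + data
shared = set(chunk(data)) & set(chunk(shifted))
print(len(chunk(data)), "blocks;", len(shared), "also found after the shift")
```

Because each boundary depends only on the last `WINDOW` bytes, boundaries tend to realign after an insertion, which fixed-size blocks cannot do.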
Full file Hash has the advantage of fast calculation in ordinary environments, but it cannot detect identical data shared by different files and therefore cannot eliminate that redundancy. File blocking Hash can detect and delete identical data shared by different files, but the Hash index of the blocks must be saved, which consumes additional storage space. Both Hash-based approaches share a further disadvantage: they cannot guarantee the security of the data.
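As a rough illustration of the extra space taken by a block Hash index, the sketch below counts the digest bytes needed for a 32 MB file split into 32 KB blocks (the sizes used in the embodiment later in the description). The function name `index_overhead` is an assumption; the digest lengths are the standard 16 bytes for MD5 and 20 bytes for SHA-1, and any per-entry pointers or metadata are ignored.

```python
DIGEST_BYTES = {"MD5": 16, "SHA-1": 20}  # standard digest lengths in bytes


def index_overhead(data_bytes: int, block_bytes: int, digest_bytes: int) -> int:
    """Bytes of digests needed to index every block of the data."""
    blocks = -(-data_bytes // block_bytes)  # ceiling division
    return blocks * digest_bytes


file_size = 32 * 1024 * 1024   # 32 MB file
block_size = 32 * 1024         # 32 KB blocks
for name, size in DIGEST_BYTES.items():
    extra = index_overhead(file_size, block_size, size)
    print(f"{name}: {extra} bytes of digests ({extra / file_size:.4%} of the file)")
```

The overhead grows as blocks get smaller, so finer granularity trades index space for better duplicate detection.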
Disclosure of Invention
The technical problem solved by the scheme provided by the embodiment of the invention is that data security is low in existing data deduplication technology.
The deduplication method based on the erasure code provided by the embodiment of the invention comprises the following steps:
performing security processing on a data block to be stored by using a nonlinear Hash function to obtain a security data block;
processing the security data block with an erasure code to obtain a storage value of the data block;
judging whether the data block is a repeated data block or not according to the stored value of the data block and a pre-stored data storage table;
and correspondingly processing the data block needing to be stored according to the judgment result.
Preferably, the method further comprises the following steps:
reading data to be stored;
and segmenting the data to be stored according to a preset size to obtain N data blocks with the same size.
Preferably, the data storage table comprises index locations, data blocks and storage values.
Preferably, the determining whether the data block is a duplicate data block according to the storage value of the data block and a pre-stored data storage table includes:
traversing the stored values in the pre-stored data storage table, and determining whether the stored values of the data blocks are contained in the data storage table;
when the data storage table is determined to contain the storage value of the data block, judging that the data block is a repeated data block;
and when the data storage table is determined not to contain the storage value of the data block, judging that the data block is a non-repeated data block.
Preferably, the performing, according to the determination result, the corresponding processing on the data block to be stored includes:
when the data block is judged to be a repeated data block, discarding the data block, and recording the index position of the data block in the data storage table;
and when the data block is judged to be a non-repeated data block, storing the data block and a storage value thereof, and recording the index position of the data block in the data storage table.
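The judgment and processing steps listed above can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Entry`, `StorageTable`, `find`, and `process` are hypothetical, and the storage value is assumed to have been computed beforehand by the Hash and erasure code steps, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    index: int     # index position
    block: bytes   # data block
    value: bytes   # storage value


class StorageTable:
    """Data storage table holding index positions, blocks and storage values."""

    def __init__(self):
        self.entries: List[Entry] = []

    def find(self, value: bytes) -> Optional[Entry]:
        # Traverse the stored values and look for a match.
        for entry in self.entries:
            if entry.value == value:
                return entry
        return None

    def process(self, block: bytes, value: bytes) -> Tuple[int, bool]:
        """Return (index position, is_duplicate) for the given block."""
        entry = self.find(value)
        if entry is not None:
            # Duplicate: discard the block, keep only its index position.
            return entry.index, True
        # Non-duplicate: store the block and its value, record the index.
        entry = Entry(len(self.entries), block, value)
        self.entries.append(entry)
        return entry.index, False


table = StorageTable()
print(table.process(b"block-A", b"value-1"))
print(table.process(b"block-A", b"value-1"))
```

The linear traversal mirrors the wording of the method; a hash map keyed by storage value would give the same results with constant-time lookups.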
According to an embodiment of the present invention, a de-duplication apparatus based on erasure codes includes:
the safety processing module is used for carrying out safety processing on the data block needing to be stored by utilizing a nonlinear Hash function to obtain a safety data block;
the operation processing module is used for performing operation processing on the safety data block by using the erasure code to obtain a storage value of the data block;
the judging module is used for judging whether the data block is a repeated data block or not according to the stored value of the data block and a pre-stored data storage table;
and the processing module is used for correspondingly processing the data block needing to be stored according to the judgment result.
Preferably, the apparatus further comprises:
the reading module is used for reading data needing to be stored;
and the segmentation module is used for segmenting the data to be stored according to a preset size to obtain N data blocks with the same size.
Preferably, the data storage table comprises index locations, data blocks and storage values.
Preferably, the judging module includes:
the determining unit is used for traversing the stored values in the pre-stored data storage table and determining whether the stored values of the data blocks are contained in the data storage table;
and the judging unit is used for judging that the data block is a repeated data block when the data storage table is determined to contain the stored value of the data block, and judging that the data block is a non-repeated data block when the data storage table is determined not to contain the stored value of the data block.
Preferably, the processing module is specifically configured to discard the data block and record an index position of the data block in the data storage table when the data block is determined to be a duplicate data block, and store the data block and a storage value thereof and record an index position of the data block in the data storage table when the data block is determined to be a non-duplicate data block.
According to the scheme provided by the embodiment of the invention, the erasure code technology is utilized to prevent the data from being deleted by mistake, thereby ensuring the safety of the data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method for erasure code based deduplication provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of an erasure code based de-duplication apparatus according to an embodiment of the present invention;
FIG. 3 is a flowchart of an erasure code based deduplication method provided by an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings, and it should be understood that the preferred embodiments described below are only for the purpose of illustrating and explaining the present invention, and are not to be construed as limiting the present invention.
Fig. 1 is a flowchart of an erasure code-based data de-duplication method according to an embodiment of the present invention, as shown in fig. 1, including:
step S100: performing security processing on a data block to be stored by using a nonlinear Hash function to obtain a security data block;
step S110: processing the security data block with an erasure code to obtain a storage value of the data block;
step S120: judging whether the data block is a repeated data block or not according to the stored value of the data block and a pre-stored data storage table;
step S130: and correspondingly processing the data block needing to be stored according to the judgment result.
The method further includes: reading data to be stored; and segmenting the data to be stored according to a preset size to obtain N data blocks with the same size.
Wherein the data storage table includes an index position, a data block, and a storage value.
Wherein step S120 includes: traversing the storage values in the pre-stored data storage table, and determining whether the storage value of the data block is contained in the data storage table; when the data storage table is determined to contain the storage value of the data block, judging that the data block is a duplicate data block; and when the data storage table is determined not to contain the storage value of the data block, judging that the data block is a non-duplicate data block.
Wherein step S130 includes: when the data block is judged to be a duplicate data block, discarding the data block and recording the index position of the data block in the data storage table; and when the data block is judged to be a non-duplicate data block, storing the data block and its storage value, and recording the index position of the data block in the data storage table.
Fig. 2 is a schematic diagram of an erasure code-based data de-duplication apparatus according to an embodiment of the present invention, as shown in fig. 2, including: the device comprises a safety processing module, an operation processing module, a judgment module and a processing module.
The safety processing module is used for carrying out safety processing on the data block to be stored by utilizing a nonlinear Hash function to obtain a safety data block; the operation processing module is used for performing operation processing on the safety data block by using an erasure code to obtain a storage value of the data block; the judging module is used for judging whether the data block is a repeated data block according to the stored value of the data block and a pre-stored data storage table; and the processing module is used for correspondingly processing the data block needing to be stored according to the judgment result.
The apparatus further includes: the reading module is used for reading data needing to be stored; and the segmentation module is used for segmenting the data to be stored according to a preset size to obtain N data blocks with the same size.
Wherein the data storage table includes an index position, a data block, and a storage value.
Wherein, the judging module comprises: the determining unit is used for traversing the stored values in the pre-stored data storage table and determining whether the stored values of the data blocks are contained in the data storage table; and the judging unit is used for judging that the data block is a repeated data block when the data storage table is determined to contain the stored value of the data block, and judging that the data block is a non-repeated data block when the data storage table is determined not to contain the stored value of the data block. Specifically, the processing module is configured to discard the data block and record an index position of the data block in the data storage table when the data block is determined to be a duplicate data block, and store the data block and a storage value thereof and record an index position of the data block in the data storage table when the data block is determined to be a non-duplicate data block.
The method builds on data deduplication technology. The data is divided into n data blocks of fixed size, each data block is processed with a nonlinear hash function, and the processed data blocks are then processed with a binary Goppa code to obtain keys. Each calculated key is compared in turn with the data already in the database. If the data block already exists in the database, only its index position is saved and the data block itself is not stored again; otherwise, the data block is stored in the database and its index position is saved. When a file needs to be read, the data block index file is retrieved from the database according to the search content, the corresponding data blocks are looked up according to the index positions recorded in that index file, and the data blocks found are then reassembled into the original data file.
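The read path described above can be sketched as follows. The names `restore`, `block_store`, and `index_files`, and the representation of an index file as an ordered list of index positions, are assumptions for illustration; the patent does not specify the on-disk layout.

```python
from typing import Dict, List


def restore(index_file: List[int], block_store: Dict[int, bytes]) -> bytes:
    """Rebuild a file by looking up each recorded index position in order."""
    return b"".join(block_store[position] for position in index_file)


# Two unique blocks are stored once each; the file refers to them by position.
block_store = {0: b"AAAA", 1: b"BBBB"}
index_files = {"test": [0, 1, 0, 0]}

original = restore(index_files["test"], block_store)
print(original)  # b'AAAABBBBAAAAAAAA'
```

Four blocks are reconstructed from two stored blocks, which is why the index position must be recorded even when a duplicate block is discarded.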
It should be noted that, the data and the values related to the embodiments of the present invention may be determined according to actual needs, and are not limited herein.
FIG. 3 is a flowchart of a deduplication method based on erasure codes according to an embodiment of the present invention. As shown in FIG. 3, the method is illustrated with a file named test of about 32 MB, with the subscript i ranging from 1 to 1024, and includes:
Step 101: read the data to be stored; here, the file test is read.
Step 102: partition the file test into n data blocks of a fixed size.
Specifically, the file test read in step 101 is divided into 1024 blocks of 32 KB each, named n1, n2, n3 … n1024. The block size may be varied as required.
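Steps 101 and 102 can be sketched as follows. This is an illustrative sketch only: the function name `split_fixed` is an assumption, the 32 MB of zero bytes merely stands in for the contents of the file test, and the handling of a final short block is not addressed by the source.

```python
from typing import List

BLOCK_SIZE = 32 * 1024  # 32 KB per block, as in the example


def split_fixed(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut data into consecutive blocks of block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


data = bytes(32 * 1024 * 1024)  # stand-in for the 32 MB file "test"
blocks = split_fixed(data)      # n1 ... n1024 are blocks[0] ... blocks[1023]
print(len(blocks))              # 1024
```

If the file size is not a multiple of the block size, the last block returned here is shorter than the rest; a real system would need a padding or length-recording rule for that case.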
Step 103: process each of the 1024 data blocks from step 102 using a nonlinear Hash function.
Specifically, each of the 1024 data blocks from step 102 is processed with the configured nonlinear Hash function, which protects against attacks and facilitates the subsequent steps. The nonlinear Hash function is denoted H(x), and the processed data blocks are named m1, m2, m3 … m1024.
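The source does not say which nonlinear Hash function H(x) is used. As a hedged illustration of the property that matters for deduplication, namely that identical blocks always map to identical processed values, the sketch below uses HMAC-SHA256 from Python's standard library as a stand-in; the key `SECRET` and the choice of HMAC are assumptions, not part of the patent.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical key, for illustration only


def H(block: bytes, secret: bytes = SECRET) -> bytes:
    """Stand-in for the nonlinear Hash function H(x) of step 103."""
    return hmac.new(secret, block, hashlib.sha256).digest()


n1 = b"first block"
n2 = b"second block"
n3 = b"first block"        # same content as n1

m1, m2, m3 = H(n1), H(n2), H(n3)
print(m1 == m3, m1 == m2)  # True False
```

Because the mapping is deterministic, duplicate blocks remain detectable after processing, while the key keeps the processed values unpredictable to anyone who does not hold it.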
Step 104: compute a value for each data block from step 103 according to the duplicate-data calculation rule.
Specifically, m1, m2, m3 … m1024 from step 103 are processed with a binary Goppa code to obtain a corresponding value for each data block; these values are named key1, key2, key3 … key1024.
Step 105: process each data block according to the deduplication rule, using the value from step 104.
Specifically, the value obtained in step 104 is compared with the values already stored in the system to determine whether it is present. If the value of a data block equals a value in the system, the data block already exists in the system and is not stored again, which avoids duplicate storage: the index position corresponding to that value is recorded and the data block is discarded. (The index position records which stored block each piece of the split data corresponds to and is needed for subsequent data reconstruction.) If the value of a data block does not exist in the system, the data block and its value are stored, and the index position of the data block is recorded.
That is, key1, key2, key3 … key1024 obtained in step 104 are compared with the key values stored in the system. When keyi matches a stored key, the index position of the data block corresponding to that key in the system is recorded and the data block is discarded; when keyi does not exist in the system, the data block ni corresponding to keyi is stored together with keyi, and the index position of ni is recorded.
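Steps 101 to 105 can be put together in the following end-to-end sketch. It rests on loudly stated assumptions: SHA-256 stands in for the unspecified nonlinear Hash function H(x), and a single XOR parity byte (a trivial erasure code) stands in for the binary Goppa code, whose parameters the source does not give. All function names are hypothetical, and the 32 MB input is synthetic.

```python
import hashlib
from functools import reduce
from operator import xor
from typing import Dict, List, Tuple

BLOCK_SIZE = 32 * 1024  # 32 KB


def split_fixed(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def H(block: bytes) -> bytes:
    # ASSUMPTION: SHA-256 stands in for the nonlinear Hash function H(x).
    return hashlib.sha256(block).digest()


def storage_value(m: bytes) -> bytes:
    # ASSUMPTION: one XOR parity byte stands in for the binary Goppa code.
    return m + bytes([reduce(xor, m, 0)])


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE):
    table: Dict[bytes, Tuple[int, bytes]] = {}  # key_i -> (index position, n_i)
    index_file: List[int] = []                  # one index position per block
    for n_i in split_fixed(data, block_size):   # step 102
        key_i = storage_value(H(n_i))           # steps 103 and 104
        if key_i not in table:                  # step 105: non-duplicate
            table[key_i] = (len(table), n_i)
        index_file.append(table[key_i][0])      # always record the index
    return table, index_file


# Synthetic 32 MB "test" file made of two alternating 32 KB patterns.
pattern_a = bytes([1]) * BLOCK_SIZE
pattern_b = bytes([2]) * BLOCK_SIZE
test = (pattern_a + pattern_b) * 512

table, index_file = deduplicate(test)
print(len(index_file), "blocks read;", len(table), "blocks stored")
```

For this synthetic input, 1024 blocks are read and only 2 are stored. A dictionary lookup replaces the sequential comparison described in the text; the outcome is the same, but each lookup takes constant time on average.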
According to the scheme provided by the embodiment of the invention, the data is divided into n data blocks with fixed sizes, then each data block is calculated according to a certain rule to obtain a unique value key, and finally the value is compared with the key value of the existing data block in the original database, and if the key value exists in the original database, the data block is deleted; if the data block does not exist, the data block is stored in the database, and the safety of the data is ensured.
Although the present invention has been described in detail hereinabove, the present invention is not limited thereto, and various modifications can be made by those skilled in the art in light of the principle of the present invention. Thus, modifications made in accordance with the principles of the present invention should be understood to fall within the scope of the present invention.

Claims (10)

1. A deduplication method based on erasure codes is characterized by comprising the following steps:
performing security processing on a data block to be stored by using a nonlinear Hash function to obtain a security data block;
carrying out operation processing on the safety data block by using an erasure code to obtain a storage value of the data block;
judging whether the data block is a repeated data block or not according to the stored value of the data block and a pre-stored data storage table;
and correspondingly processing the data block needing to be stored according to the judgment result.
2. The method of claim 1, further comprising:
reading data to be stored;
and segmenting the data to be stored according to a preset size to obtain N data blocks with the same size.
3. The method of claim 1, wherein the data storage table comprises index locations, data blocks, and storage values.
4. The method according to claim 3, wherein the determining whether the data block is a duplicate data block according to the stored value of the data block and a pre-stored data storage table comprises:
traversing the stored values in the pre-stored data storage table, and determining whether the stored values of the data blocks are contained in the data storage table;
when the data storage table is determined to contain the storage value of the data block, judging that the data block is a repeated data block;
and when the data storage table is determined not to contain the storage value of the data block, judging that the data block is a non-repeated data block.
5. The method according to claim 4, wherein the performing the corresponding processing on the data block to be stored according to the determination result comprises:
when the data block is judged to be a repeated data block, discarding the data block, and recording the index position of the data block in the data storage table;
and when the data block is judged to be a non-repeated data block, storing the data block and a storage value thereof, and recording the index position of the data block in the data storage table.
6. An erasure code based de-duplication apparatus, comprising:
the safety processing module is used for carrying out safety processing on the data block needing to be stored by utilizing a nonlinear Hash function to obtain a safety data block;
the operation processing module is used for performing operation processing on the safety data block by using the erasure code to obtain a storage value of the data block;
the judging module is used for judging whether the data block is a repeated data block or not according to the stored value of the data block and a pre-stored data storage table;
and the processing module is used for correspondingly processing the data block needing to be stored according to the judgment result.
7. The apparatus of claim 6, further comprising:
the reading module is used for reading data needing to be stored;
and the segmentation module is used for segmenting the data to be stored according to a preset size to obtain N data blocks with the same size.
8. The apparatus of claim 6, wherein the data storage table comprises index locations, data blocks, and storage values.
9. The apparatus of claim 8, wherein the determining module comprises:
the determining unit is used for traversing the stored values in the pre-stored data storage table and determining whether the stored values of the data blocks are contained in the data storage table;
and the judging unit is used for judging that the data block is a repeated data block when the data storage table is determined to contain the stored value of the data block, and judging that the data block is a non-repeated data block when the data storage table is determined not to contain the stored value of the data block.
10. The apparatus according to claim 9, wherein the processing module is specifically configured to discard the data chunk and record an index position of the data chunk in the data storage table when the data chunk is determined to be a duplicate data chunk, and store the data chunk and a storage value thereof and record an index position of the data chunk in the data storage table when the data chunk is determined to be a non-duplicate data chunk.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911251209.4A CN111177092A (en) 2019-12-09 2019-12-09 Deduplication method and device based on erasure codes

Publications (1)

Publication Number Publication Date
CN111177092A true CN111177092A (en) 2020-05-19

Family

ID=70653835

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911251209.4A Pending CN111177092A (en) 2019-12-09 2019-12-09 Deduplication method and device based on erasure codes

Country Status (1)

Country Link
CN (1) CN111177092A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103177111A (en) * 2013-03-29 2013-06-26 西安理工大学 System and method for deleting repeating data
CN103561057A (en) * 2013-10-15 2014-02-05 深圳清华大学研究院 Data storage method based on distributed hash table and erasure codes
CN105824720A (en) * 2016-03-10 2016-08-03 中国人民解放军国防科学技术大学 Continuous data reading oriented data placement method of deduplication and erasure correcting combined system
CN109522283A (en) * 2018-10-30 2019-03-26 深圳先进技术研究院 A kind of data de-duplication method and system
CN110149198A (en) * 2019-04-29 2019-08-20 成都信息工程大学 A kind of autonomous system and method that safeguard protection and storage controllably are carried out to data

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DAN TANG, YA-QIANG WANG, HAO-PENG YANG: ""Array Erasure Codes with Preset Fault Tolerance Capability"" *
DI FAN, FENG XIAO, AND DAN TANG: ""A New Erasure Code Decoding Algorithm"" *
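唐聃; 舒红平: "A class of multi-fault-tolerant array erasure codes" *
朱江; 冀鸣; 杨志成; 张嘉贤; 曹雄: "Analysis of storage systems based on data deduplication technology" *
罗象宏; 舒继武: "A survey of erasure codes in storage systems" *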

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115993939A (en) * 2023-03-22 2023-04-21 陕西中安数联信息技术有限公司 Method and device for deleting repeated data of storage system
CN115993939B (en) * 2023-03-22 2023-06-09 陕西中安数联信息技术有限公司 Method and device for deleting repeated data of storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20200519