CN109213738A - Cloud storage file-level data de-duplication retrieval system and method - Google Patents
- Publication number
- CN109213738A (application CN201811384763.5A)
- Authority
- CN
- China
- Prior art keywords
- file
- information
- client
- name server
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a cloud storage file-level data de-duplication retrieval system and method. The characteristic information of stored files is kept on a fingerprint server. When a client applies to store a file, coarse filtering is performed first: the fingerprint server is searched, and if no file record with the same characteristics is found, the file is treated as a new file. If records are found, fine filtering follows: the files found are treated as comparison files, and the random points and characteristic intervals of each comparison file are selected in turn for precise comparison, to confirm whether the requested file already exists. If it does, the name server sets the metadata of the requested file to point to the metadata of the comparison file; if it does not, the file is stored and its characteristic information is recorded in the fingerprint server. Through the two steps of coarse and fine filtering, the invention can greatly reduce the storage of duplicate files. It offers high execution efficiency and a high de-duplication rate, and is suitable for big data and cloud storage environments.
Description
Technical field
The present invention relates to the fields of duplicate data deletion in computer storage, cloud storage, and retrieval, and in particular to a cloud storage file-level data de-duplication retrieval system and method.
Background
The rapid development of the Internet has produced massive amounts of data, and scenarios that require transmitting and storing such data are increasingly common. Against this background, data storage technology has developed rapidly, and data de-duplication and compression are technologies that can save a large amount of storage. Data de-duplication removes duplicates by identifying duplicate content and leaving a pointer at the corresponding storage location, thereby minimizing the data volume. At present only a few mainstream arrays offer de-duplication as an additional product feature. Duplicate data wastes valuable cloud resources and generates overhead. Reportedly, fewer than 5% of disk arrays actually support online de-duplication and compression, even though the space saved by de-duplication is considerable. Deleting duplicate data requires comparing files, and because a storage system holds a large number of files, comparison efficiency inevitably suffers. The file-level de-duplication method proposed by the present invention eliminates data redundancy, reduces the storage capacity required, and effectively solves the problem of file comparison efficiency.
Summary of the invention
The technical problem to be solved by the present invention is that, in the prior art, duplicate data in cloud space wastes valuable cloud resources and generates overhead, and that duplicate files need to be compared efficiently. To address this, a cloud storage file-level data de-duplication retrieval system and method are provided.
The technical solution adopted by the present invention to solve the technical problems is:
The present invention provides a cloud storage file-level data de-duplication retrieval system. The system comprises a client, a cloud storage platform, a fingerprint server, and a name server, the cloud storage platform consisting of multiple data nodes, wherein:
The data nodes are connected to the fingerprint server through the name server. The fingerprint server stores the characteristic information of the files held on the data nodes. The client sends requests to search for and filter files. During file filtering, a file is first coarsely filtered using its characteristic information. After coarse filtering, if the file still needs further confirmation, the name server generates a fine filtering task and hands it to a data node, which completes the second filtering.
Further, the characteristic information of the present invention comprises the local fingerprint, size, metadata pointer, and characteristic intervals of a file.
Further, the data in the fingerprint server of the present invention are fingerprinted using MD5 to eliminate redundant data blocks, and further de-duplication is then performed on the name server. The fingerprint is stored as a key-value pair: the key is the local fingerprint of the file, and the value is the size, metadata pointer, and characteristic intervals of the file.
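The key-value layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `FileRecord` and `FingerprintIndex`, the in-memory dictionary, and the representation of an interval as a `(start, end)` byte-offset pair are hypothetical and are not specified by the patent. Because several files may share one local fingerprint, each key maps to a list of records.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical representation: (start offset, end offset), end exclusive.
Interval = Tuple[int, int]


@dataclass
class FileRecord:
    """Value part of the key-value pair kept on the fingerprint server."""
    size: int
    metadata_pointer: str
    characteristic_intervals: List[Interval] = field(default_factory=list)


class FingerprintIndex:
    """In-memory stand-in for the fingerprint server's key-value store."""

    def __init__(self) -> None:
        # Key: local fingerprint; value: every file registered under it.
        self._records: Dict[str, List[FileRecord]] = defaultdict(list)

    def register(self, local_fingerprint: str, record: FileRecord) -> None:
        self._records[local_fingerprint].append(record)

    def lookup(self, local_fingerprint: str) -> List[FileRecord]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._records.get(local_fingerprint, []))
```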
Further, the local fingerprint of a file is the file signature obtained by hashing the head and the tail of the file. If the file is too small for the head and tail to be hashed separately, the entire file is used as the signature information.
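A minimal sketch of such a head-and-tail fingerprint, using Python's standard `hashlib`, might look like the following. The function name `local_fingerprint`, the default `part_size` of 64 KiB, the rule that a file shorter than two parts counts as "too small", and the choice to hash the whole small file and to concatenate the two digests are all assumptions made for illustration; the patent leaves these details open.

```python
import hashlib
import os


def local_fingerprint(path: str, part_size: int = 64 * 1024) -> str:
    """Return an MD5-based signature of the head and tail of a file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if size < 2 * part_size:
            # Assumed rule: the file is too small for separate head and
            # tail parts, so the signature covers the entire file.
            return hashlib.md5(f.read()).hexdigest()
        head = f.read(part_size)
        f.seek(size - part_size)
        tail = f.read(part_size)
    # Head and tail parts have the same size; their digests are merged
    # here by simple concatenation (an assumption).
    return hashlib.md5(head).hexdigest() + hashlib.md5(tail).hexdigest()
```

Two files that differ only in the middle produce the same local fingerprint under this scheme, which is why a match is treated only as a candidate that must still pass the fine filtering stage.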
Further, the characteristic intervals of a file are the intervals found to differ when the file to be uploaded is precisely compared with a similar file. A similar file is a file that has, in part or in whole, the same fingerprint and file size as the file to be uploaded.
Further, the name server of the present invention determines the number of random intervals from the file size and the number of characteristic intervals, and determines the positions of the random intervals from how the file is stored.
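One way to picture this selection is the sketch below. It assumes that the combined number of random and characteristic intervals grows in proportion to the file size, that random intervals have a fixed size, and that they avoid the characteristic intervals. The function name `plan_random_intervals`, the parameters `intervals_per_mib` and `interval_size`, and the uniform choice among aligned slots are hypothetical. The sketch also ignores the storage layout of the file, which the name server is said to take into account.

```python
import random
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end) byte offsets, end exclusive


def plan_random_intervals(
    file_size: int,
    characteristic_intervals: Sequence[Interval],
    intervals_per_mib: float = 1.0,   # assumed ratio, configurable
    interval_size: int = 4096,        # assumed fixed interval size
    rng: Optional[random.Random] = None,
) -> List[Interval]:
    if rng is None:
        rng = random.Random()
    # Total number of checked intervals is proportional to the file size.
    total = max(1, round(intervals_per_mib * file_size / (1024 * 1024)))
    wanted = max(0, total - len(characteristic_intervals))
    # Candidate slots are aligned to interval_size; drop every slot that
    # overlaps a characteristic interval.
    free = []
    for slot in range(file_size // interval_size):
        start, end = slot * interval_size, (slot + 1) * interval_size
        if not any(s < end and start < e for s, e in characteristic_intervals):
            free.append((start, end))
    return sorted(rng.sample(free, min(wanted, len(free))))
```

Passing a seeded `random.Random` makes the selection repeatable, which is convenient for testing; in practice the positions should not be predictable by the uploading client.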
Further, the data node of the present invention receives the comparison request and the comparison data passed on by the name server, performs the comparison over the specified comparison intervals, and reports the comparison result.
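The comparison performed on the data node can be sketched as a byte-for-byte check of each interval that stops at the first mismatch. The function name `compare_intervals`, the `(start, end)` interval representation, and the use of a seekable binary file object for the stored file are assumptions for illustration only.

```python
import io
from typing import BinaryIO, Optional, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end) byte offsets, end exclusive


def compare_intervals(
    stored: BinaryIO,
    intervals: Sequence[Interval],
    uploaded: Sequence[bytes],
) -> Tuple[bool, Optional[Interval]]:
    """Compare uploaded chunks with the stored file, interval by interval.

    Returns (True, None) when every interval matches, otherwise
    (False, first interval that failed).
    """
    for (start, end), chunk in zip(intervals, uploaded):
        stored.seek(start)
        if stored.read(end - start) != chunk:
            return False, (start, end)
    return True, None


# Usage with an in-memory stand-in for a stored file.
stored_file = io.BytesIO(b"abcdefghij")
ok, failed = compare_intervals(stored_file, [(0, 3), (5, 8)], [b"abc", b"fgh"])
```

The failed interval returned here is what the name server caches and later registers as a characteristic interval of the newly stored file.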
The present invention provides a cloud storage file-level data de-duplication retrieval method, which comprises the following steps:
S1. The client hashes the head and the tail of the selected file and obtains the file signature by MD5 fingerprint extraction; this signature serves as the local fingerprint of the file;
Because hash-based MD5 fingerprint extraction is fast and uses little CPU, the data in the fingerprint server are fingerprinted using MD5 to eliminate redundant data blocks, and further de-duplication is then performed on the name server. The fingerprint is stored as a key-value pair: the key is the local fingerprint of the file, and the value is the size, metadata pointer, and characteristic intervals of the file.
S2. The size and the file signature of the file to be uploaded are sent to the fingerprint server for coarse filtering of the file to be stored: the fingerprint server directly retrieves all files corresponding to the fingerprint, compiles statistics on them, and returns the resulting statistics to the client;
S3. The client receives the file information returned by the fingerprint server. If the number of files is 0, coarse filtering found no matching characteristic information in the fingerprint database, so the file to be uploaded is an entirely new file. The client sends a storage request carrying the local fingerprint of the file to the name server, which determines the storage location of the file and registers its characteristic information with the fingerprint server;
S4. If the number of files is not 0, coarse filtering matched the characteristic information of the file in the fingerprint database. The client enters a loop verification phase in which it sends file comparison requests one after another, each carrying a file metadata pointer and characteristic intervals, so that the file to be stored is further fine-filtered;
S5. The name server receives the verification request sent by the client, locates the file metadata by the file metadata pointer or index, and sets the number and distribution of random check intervals according to how the file is stored and its characteristic intervals. The sum of the number of random check intervals and the number of characteristic intervals should be proportional to the file size, with the ratio set as circumstances require. Characteristic intervals do not overlap random intervals. The size of a random interval is a fixed value, also set as circumstances require. The name server sends the computed random intervals to the client, and preparation for precise file comparison begins;
S6. The client sends the data of the characteristic intervals and random intervals to the name server. The name server forwards the data and the check intervals to a data node, which performs the precise comparison, and waits for the data node to return the check result;
S7. The data node obtains the check interval information and the check data and precisely compares the data within the check intervals. If the comparison succeeds, it returns a success flag; if the comparison fails, it returns a failure flag and sends the information of the first interval that failed to the name server;
S8. The name server tallies the comparison results. If the comparison succeeded completely, it adds a new file metadata entry that points to the file that matched completely, and returns to the client the information that the file was found and has been stored;
S9. If the comparison failed, the name server caches the information of the failed interval and asks the client to start the next file comparison;
S10. The client sends a new comparison request and the comparison steps above are repeated. If all comparisons have finished and the name server has not returned a success message, the client reports that comparison is complete and applies to store the file;
S11. After receiving the comparison-complete message and the storage application, the name server begins allocating a storage location for the new file and informs the client that it is ready; the client sends the file and the name server starts storing it;
S12. After the file has been stored, the failed intervals generated during comparison and held in the cache are taken as the relative characteristic intervals of the file. If some of these intervals intersect, only part of them is retained, so that the characteristic intervals are separate from one another. If there are too many characteristic intervals, a selection is made so that their number does not exceed the set range;
S13. The characteristic intervals, the local fingerprint, the file size, and the file metadata pointer are registered in the fingerprint server, and the client is notified that the file transfer is complete.
The beneficial effects of the present invention are as follows: through the two steps of coarse and fine filtering, the cloud storage file-level data de-duplication retrieval system and method of the present invention can greatly reduce the storage of duplicate files. The system executes efficiently, achieves a high de-duplication rate, can quickly report whether a file is a duplicate, and is well suited to mass data storage and cloud storage environments.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawings and embodiments. In the drawings:
Fig. 1 is a system block diagram of an embodiment of the present invention;
Fig. 2 is a method flowchart of an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
As shown in Fig. 1, the cloud storage file-level data de-duplication retrieval system of the embodiment of the present invention comprises a client, a cloud storage platform, a fingerprint server, and a name server, the cloud storage platform consisting of multiple data nodes, wherein:
The data nodes are connected to the fingerprint server through the name server. The fingerprint server stores the characteristic information of the files held on the data nodes. The client sends requests to search for and filter files. During file filtering, a file is first coarsely filtered using its characteristic information. After coarse filtering, if the file still needs further confirmation, the name server generates a fine filtering task and hands it to a data node, which completes the second filtering.
A fingerprint server is introduced to store the characteristic information of files, including the local fingerprint of the file, the file size, the relative characteristic intervals, the metadata pointer, and so on.
The present invention requires that the client be able to communicate with the fingerprint server. When submitting a file upload request, the client first computes the local fingerprint and the size of the file and hands them to the fingerprint server for lookup. The role of the fingerprint server is to perform a coarse-grained file comparison, which realizes the coarse filtering. The fingerprint server returns the comparison result to the client, and depending on that result the client sends the name server either a further comparison request or a file storage request. For fine filtering, the name server obtains from the client the metadata pointer and characteristic intervals of the possible duplicate file. It first queries how that file is stored, then selects random comparison intervals by jointly considering factors such as the file size, the storage partitioning, and the number of characteristic intervals, and returns them to the client. The client extracts the corresponding parts of the file according to the returned intervals and transmits them. On receipt, the name server forwards the data to a data node, which performs the comparison. The data node returns to the name server whether the comparison succeeded and the first interval that failed. The name server then tells the client whether the file is a duplicate and plans the next step, which completes the fine filtering.
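The coarse filtering stage and the client's subsequent decision can be illustrated with the short Python sketch below. It is a simplification under stated assumptions: the index is a plain dictionary keyed by local fingerprint, each record is a hypothetical `(size, metadata_pointer, characteristic_intervals)` tuple, and the function names `coarse_filter` and `next_request` are invented for this example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]
# Hypothetical record layout: (size, metadata pointer, characteristic intervals)
Record = Tuple[int, str, List[Interval]]


def coarse_filter(
    index: Dict[str, List[Record]], local_fingerprint: str, size: int
) -> List[Record]:
    """Fingerprint server side: same fingerprint first, then same size."""
    return [rec for rec in index.get(local_fingerprint, []) if rec[0] == size]


def next_request(candidates: List[Record]) -> str:
    """Client side: no candidates means a new file, otherwise fine filtering."""
    return "store" if not candidates else "compare"


index = {
    "fp-1": [(100, "meta/a", []), (200, "meta/b", [(10, 20)])],
}
candidates = coarse_filter(index, "fp-1", 200)
action = next_request(candidates)
```

With the sample index above, only the 200-byte record survives coarse filtering, so the client would go on to request a fine comparison against `meta/b`.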
The specific procedure of the method of the present invention is as follows:
Step 1. The client hashes the head and the tail of the selected file to obtain their hash signatures and merges the two. The head and tail parts have the same size, and the specific size can be set as circumstances require. If the file is too small, the hash signature of the entire file is obtained directly. The client caches the hash signature.
Step 2. The size and the file signature of the file to be uploaded are sent to the fingerprint server. The fingerprint server directly retrieves all files corresponding to the fingerprint and then compares file sizes. It counts the files whose fingerprint and size are both identical and returns that count to the client, together with information such as the file metadata index or pointer and the characteristic intervals.
Step 3. The client receives the information returned by the fingerprint server and first checks whether the number of files is 0. If it is 0, the file is an entirely new file. The client sends a storage request carrying the local fingerprint of the file to the name server, which determines the storage location of the file and registers its characteristic information with the fingerprint server.
Step 4. If the number of possible duplicate files received by the client is not 0, the client enters the loop verification phase. The client sends file comparison requests one after another, each carrying a file metadata pointer and characteristic intervals.
Step 5. The name server receives the verification request sent by the client, locates the file metadata by the file metadata pointer or index, and sets the number and distribution of random check intervals according to how the file is stored, its characteristic intervals, and similar factors. The sum of the number of random check intervals and the number of characteristic intervals should be proportional to the file size, and the ratio can be set as circumstances require. Characteristic intervals should, as far as possible, not overlap random intervals. The size of a random interval is a fixed value that can also be set as circumstances require. The name server sends the computed random intervals to the client, and preparation for precise file comparison begins.
Step 6. The client sends the data of the characteristic intervals and random intervals to the name server. The name server forwards the data and the check intervals to a data node, which performs the precise comparison, and waits for the data node to return the check result.
Step 7. The data node obtains the check interval information and the check data and precisely compares the data within the check intervals. If the comparison succeeds, it returns a success flag; if the comparison fails, it returns a failure flag and sends the information of the first interval that failed to the name server.
Step 8. The name server tallies the comparison results. If the comparison succeeded completely, it adds a new file metadata entry that points to the file that matched completely, and returns to the client the information that the file was found and has been stored.
Step 9. If the comparison failed, the name server caches the information of the failed interval and asks the client to start the next file comparison.
Step 10. The client sends a new comparison request and the comparison steps above are repeated. If all comparisons have finished and the name server has not returned a success message, the client reports that comparison is complete and applies to store the file.
Step 11. After receiving the comparison-complete message and the storage application, the name server begins allocating a storage location for the new file and informs the client that it is ready. The client sends the file and the name server starts storing it.
Step 12. After the file has been stored, the failed intervals generated during comparison and held in the cache are taken as the relative characteristic intervals of the file. Some of these intervals may intersect; in that case only part of them is retained, so that the characteristic intervals are separate from one another. If there are too many characteristic intervals, a selection is made so that their number does not exceed a certain range.
Step 13. The characteristic intervals, the local fingerprint, the file size, the file metadata pointer, and so on are registered in the fingerprint server, and the client is notified that the file transfer is complete.
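The clean-up of cached failed intervals described in Step 12 can be sketched as follows. The patent does not say which of several intersecting intervals to keep or how to choose when there are too many, so the policy here is an assumption: intervals are sorted, an interval that overlaps the previously kept one is dropped, and only the first `max_count` survivors are registered. The function name `select_characteristic_intervals` and the default limit of 8 are hypothetical.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # (start, end) byte offsets, end exclusive


def select_characteristic_intervals(
    failed: Iterable[Interval], max_count: int = 8
) -> List[Interval]:
    """Turn cached failed intervals into disjoint characteristic intervals."""
    kept: List[Interval] = []
    for start, end in sorted(set(failed)):
        if kept and start < kept[-1][1]:
            # Intersects the interval kept last: retain only one of them.
            continue
        kept.append((start, end))
    # Assumed selection policy: keep the earliest intervals up to the limit.
    return kept[:max_count]


cached = [(0, 10), (5, 15), (20, 30), (20, 30), (40, 50)]
selected = select_characteristic_intervals(cached, max_count=2)
```

Keeping the intervals disjoint means that each registered interval can later be compared independently, and capping their number bounds the amount of data a client must send during fine filtering.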
It should be understood that those of ordinary skill in the art may make modifications or variations in light of the above description, and all such modifications and variations shall fall within the scope of protection of the appended claims of the present invention.
Claims (8)
1. A cloud storage file-level data de-duplication retrieval system, characterized in that the system comprises a client, a cloud storage platform, a fingerprint server, and a name server, the cloud storage platform consisting of multiple data nodes, wherein:
the data nodes are connected to the fingerprint server through the name server; the fingerprint server stores the characteristic information of the files held on the data nodes; the client sends requests to search for and filter files; during file filtering, a file is coarsely filtered using its characteristic information; after coarse filtering, if the file still needs further confirmation, the name server generates a fine filtering task and hands it to a data node, which completes the second filtering.
2. The cloud storage file-level data de-duplication retrieval system according to claim 1, characterized in that the characteristic information comprises the local fingerprint, size, metadata pointer, and characteristic intervals of a file.
3. The cloud storage file-level data de-duplication retrieval system according to claim 1, characterized in that the data in the fingerprint server are fingerprinted using MD5 to eliminate redundant data blocks, and further de-duplication is then performed on the name server, wherein the fingerprint is stored as a key-value pair in which the key is the local fingerprint of the file and the value is the size, metadata pointer, and characteristic intervals of the file.
4. The cloud storage file-level data de-duplication retrieval system according to claim 2, characterized in that the local fingerprint of a file is the file signature obtained by hashing the head and the tail of the file; if the file is too small for the head and tail to be hashed, the entire file is used as the signature information.
5. The cloud storage file-level data de-duplication retrieval system according to claim 2, characterized in that the characteristic intervals of a file are the intervals found to differ when the file to be uploaded is precisely compared with a similar file, a similar file being a file that has, in part or in whole, the same fingerprint information and file size as the file to be uploaded.
6. The cloud storage file-level data de-duplication retrieval system according to claim 2, characterized in that the name server determines the number of random intervals from the file size and the number of characteristic intervals, and determines the positions of the random intervals from how the file is stored.
7. The cloud storage file-level data de-duplication retrieval system according to claim 1, characterized in that the data node receives the comparison request and the comparison data passed on by the name server, performs the comparison over the comparison intervals, and reports the comparison result.
8. A data de-duplication retrieval method using the cloud storage file-level data de-duplication retrieval system according to claim 1, characterized in that the method comprises the following steps:
S1. the client hashes the head and the tail of the selected file and obtains the file signature by MD5 fingerprint extraction, the signature serving as the local fingerprint of the file;
S2. the size and the file signature of the file to be uploaded are sent to the fingerprint server for coarse filtering of the file to be stored; the fingerprint server directly retrieves all files corresponding to the fingerprint, compiles statistics on them, and returns the resulting statistics to the client;
S3. the client receives the file information returned by the fingerprint server; if the number of files is 0, coarse filtering found no matching characteristic information in the fingerprint database and the file to be uploaded is an entirely new file; the client sends a storage request carrying the local fingerprint of the file to the name server, which determines the storage location of the file and registers its characteristic information with the fingerprint server;
S4. if the number of files is not 0, coarse filtering matched the characteristic information of the file in the fingerprint database; the client enters a loop verification phase in which it sends file comparison requests one after another, each carrying a file metadata pointer and characteristic intervals, so that the file to be stored is further fine-filtered;
S5. the name server receives the verification request sent by the client, locates the file metadata by the file metadata pointer or index, and sets the number and distribution of random check intervals according to how the file is stored and its characteristic intervals; the sum of the number of random check intervals and the number of characteristic intervals should be proportional to the file size, with the ratio set as circumstances require; characteristic intervals do not overlap random intervals; the size of a random interval is a fixed value, set as circumstances require; the name server sends the computed random intervals to the client, and preparation for precise file comparison begins;
S6. the client sends the data of the characteristic intervals and random intervals to the name server; the name server forwards the data and the check intervals to a data node of the cloud storage platform, which performs the precise comparison, and waits for the data node to return the check result;
S7. the data node obtains the check interval information and the check data and precisely compares the data within the check intervals; if the comparison succeeds, it returns a success flag; if the comparison fails, it returns a failure flag and sends the information of the first interval that failed to the name server;
S8. the name server tallies the comparison results; if the comparison succeeded completely, it adds a new file metadata entry that points to the file that matched completely, and returns to the client the information that the file was found and has been stored;
S9. if the comparison failed, the name server caches the information of the failed interval and asks the client to start the next file comparison;
S10. the client sends a new comparison request and the comparison steps above are repeated; if all comparisons have finished and the name server has not returned a success message, the client reports that comparison is complete and applies to store the file;
S11. after receiving the comparison-complete message and the storage application, the name server begins allocating a storage location for the new file and informs the client that it is ready; the client sends the file and the name server starts storing it;
S12. after the file has been stored, the failed intervals generated during comparison and held in the cache are taken as the relative characteristic intervals of the file; if some of these intervals intersect, only part of them is retained, so that the characteristic intervals are separate from one another; if there are too many characteristic intervals, a selection is made so that their number does not exceed the set range;
S13. the characteristic intervals, the local fingerprint, the file size, and the file metadata pointer are registered in the fingerprint server, and the client is notified that the file transfer is complete.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811384763.5A CN109213738B (en) | 2018-11-20 | 2018-11-20 | Cloud storage file-level repeated data deletion retrieval system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811384763.5A CN109213738B (en) | 2018-11-20 | 2018-11-20 | Cloud storage file-level repeated data deletion retrieval system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109213738A (en) | 2019-01-15 |
CN109213738B (en) | 2022-01-25 |
Family
Family ID: 64993843
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811384763.5A (Active, CN109213738B) | 2018-11-20 | 2018-11-20 | Cloud storage file-level repeated data deletion retrieval system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109213738B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110096483A (en) * | 2019-05-08 | 2019-08-06 | 北京奇艺世纪科技有限公司 | A kind of duplicate file detection method, terminal and server |
CN110636141A (en) * | 2019-10-17 | 2019-12-31 | 中国人民解放军陆军工程大学 | Multi-cloud storage system based on cloud and mist cooperation and management method thereof |
CN111177082A (en) * | 2019-12-03 | 2020-05-19 | 世强先进(深圳)科技股份有限公司 | PDF file duplicate removal storage method and system |
CN111294613A (en) * | 2020-02-20 | 2020-06-16 | 北京奇艺世纪科技有限公司 | Video processing method, client and server |
CN112347060A (en) * | 2020-10-19 | 2021-02-09 | 北京天融信网络安全技术有限公司 | Data storage method, device and equipment of desktop cloud system and readable storage medium |
CN112631514A (en) * | 2020-12-17 | 2021-04-09 | 龙存科技(北京)股份有限公司 | File duplicate removal method and system applied to cloud disk system |
WO2021164171A1 (en) * | 2020-02-17 | 2021-08-26 | 平安科技(深圳)有限公司 | Method and apparatus for processing data in knowledge base, and computer device and storage medium |
CN113362046A (en) * | 2021-08-10 | 2021-09-07 | 北京开科唯识技术股份有限公司 | Control method and device for preventing salary generation errors |
Non-Patent Citations (2)
Title |
---|
SUBASHINI BALACHANDRAN: "Sequence of Hashes Compression in Data De-duplication", 《DATA COMPRESSION CONFERENCE (DCC 2008)》 * |
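Jia Zhikai et al.: "A Parallel Hierarchical Data Deduplication Technique" (一种并行层次化的重复数据删除技术), 《计算机研究与发展》 (Journal of Computer Research and Development) * |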
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110096483A (en) * | 2019-05-08 | 2019-08-06 | 北京奇艺世纪科技有限公司 | A kind of duplicate file detection method, terminal and server |
CN110096483B (en) * | 2019-05-08 | 2021-04-30 | 北京奇艺世纪科技有限公司 | Duplicate file detection method, terminal and server |
CN110636141A (en) * | 2019-10-17 | 2019-12-31 | 中国人民解放军陆军工程大学 | Multi-cloud storage system based on cloud and mist cooperation and management method thereof |
CN111177082A (en) * | 2019-12-03 | 2020-05-19 | 世强先进(深圳)科技股份有限公司 | PDF file duplicate removal storage method and system |
WO2021164171A1 (en) * | 2020-02-17 | 2021-08-26 | 平安科技(深圳)有限公司 | Method and apparatus for processing data in knowledge base, and computer device and storage medium |
CN111294613A (en) * | 2020-02-20 | 2020-06-16 | 北京奇艺世纪科技有限公司 | Video processing method, client and server |
CN112347060A (en) * | 2020-10-19 | 2021-02-09 | 北京天融信网络安全技术有限公司 | Data storage method, device and equipment of desktop cloud system and readable storage medium |
CN112347060B (en) * | 2020-10-19 | 2023-09-26 | 北京天融信网络安全技术有限公司 | Data storage method, device and equipment of desktop cloud system and readable storage medium |
CN112631514A (en) * | 2020-12-17 | 2021-04-09 | 龙存科技(北京)股份有限公司 | File duplicate removal method and system applied to cloud disk system |
CN113362046A (en) * | 2021-08-10 | 2021-09-07 | 北京开科唯识技术股份有限公司 | Control method and device for preventing salary generation errors |
Also Published As
Publication number | Publication date |
---|---|
CN109213738B (en) | 2022-01-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109213738A (en) | A kind of cloud storage file-level data de-duplication searching system and method | |
EP3251031B1 (en) | Techniques for compact data storage of network traffic and efficient search thereof | |
CN109656999B (en) | Method, device, storage medium and apparatus for synchronizing large data volume data | |
US8949561B2 (en) | Systems, methods, and computer program products providing change logging in a deduplication process | |
CN107194006A (en) | A kind of video features structural management method | |
CN106598785A (en) | File system backup and restoration method and device | |
CN110188103A (en) | Data account checking method, device, equipment and storage medium | |
US20150066877A1 (en) | Segment combining for deduplication | |
WO2021237467A1 (en) | File uploading method, file downloading method and file management apparatus | |
CN110019873B (en) | Face data processing method, device and equipment | |
CN109669795A (en) | Crash info processing method and processing device | |
CN104965835B (en) | A kind of file read/write method and device of distributed file system | |
CN109271545A (en) | A kind of characteristic key method and device, storage medium and computer equipment | |
CN111522791B (en) | Distributed file repeated data deleting system and method | |
CN105072608B (en) | A kind of method and device of administrative authentication token | |
CN108241639B (en) | A kind of data duplicate removal method | |
CN107181773A (en) | Data storage and data managing method, the equipment of distributed memory system | |
US20140222771A1 (en) | Management device and management method | |
Du et al. | Deduplicated disk image evidence acquisition and forensically-sound reconstruction | |
CN109189813B (en) | Data sharing method and device | |
CN107169065B (en) | Method and device for removing specific content | |
WO2021163856A1 (en) | Content pushing method and apparatus, and server and storage medium | |
CN112988684A (en) | Method and system for extracting and de-duplicating electronic official document data based on Hash algorithm | |
CN106126375B (en) | A kind of each version restoration methods of YAFFS2 file based on Hash | |
CN109688176A (en) | A kind of file synchronisation method and terminal, the network equipment, storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||