CN108011956A - Distributed storage method based on file content cryptographic Hash - Google Patents
- Publication number: CN108011956A
- Application number: CN201711274018.0A
- Authority: CN (China)
- Prior art keywords: file, cryptographic hash, document, storage, client
- Legal status: Withdrawn (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/06—Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/32—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
- H04L9/3236—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
- H04L9/3239—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A distributed storage method based on the hash value of file content, comprising: Step 1, a client computes a first hash value of the content of the file to be uploaded using a hash algorithm, and sends a write request carrying the first hash value to a file application server; Step 2, the file application server receives the write request carrying the first hash value from the client and looks up the mapping table in the file record database; Step 3, the file application server computes a second hash value of the file content using the hash algorithm and compares the second hash value with the first hash value; Step 4, the file application server looks up the mapping table in the file storage database; if a mapping record for the file already exists, it returns to the client a result indicating that the file has already been uploaded; if no mapping record for the file exists, it writes the file into the file storage database. The invention has the following advantages: it saves storage space, effectively reduces machine load, and greatly improves the throughput of the file storage system.
Description
Technical field
The present invention relates to the technical field of computer storage, and in particular to a distributed storage method based on the hash value of file content.
Background art
In Internet applications, file storage is a commonly used service module. Pictures, audio, video, Excel files, and PDF documents all rely on a file storage service, which shows how widely used and important the file storage service module is in Internet applications. Traditional file storage methods create storage directories according to the upload date of the file, so the storage locations of files are relatively concentrated. This puts heavy read and write pressure on the disk and a high load on the file storage server, which results in low file storage throughput. At read or write peaks, the system easily reaches its bottleneck.
Summary of the invention
In view of the above technical deficiencies, the object of the present invention is to provide a distributed storage method based on the hash value of file content that solves the above technical problem.
To solve the above technical problem, the distributed storage method based on the hash value of file content provided by the present invention includes:
Step 1, the client computes a first hash value of the content of the file to be uploaded using a hash algorithm, and sends a write request carrying the first hash value to the file application server;
Step 2, the file application server receives the write request carrying the first hash value from the client and looks up the mapping table in the file record database; if a mapping record for the file already exists, the server returns to the client a result indicating that the file has already been uploaded, together with the access address of the file in the file storage database; if no mapping record for the file exists, the server allows the client to upload the file to the file application server;
Step 3, the file application server computes a second hash value of the file content using the hash algorithm and compares the second hash value with the first hash value; if the two values differ, the server returns to the client a result indicating that the second hash value differs from the first hash value, and the process returns to step 1; if the two values are identical, the process proceeds to step 4;
Step 4, the file application server looks up the mapping table in the file storage database; if a mapping record for the file already exists, the server returns to the client a result indicating that the file has already been uploaded; if no mapping record for the file exists, the server writes the file into the file storage database.
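Steps 2 to 4 above can be sketched as a small in-memory server. This is a minimal sketch under stated assumptions, not the patent's implementation: the class name, method names, status strings, and the `/files/<hash>` address format are hypothetical, SHA-1 is used because the embodiment names it, and a single dictionary stands in for both the file record database and the file storage database.

```python
import hashlib


class FileApplicationServer:
    """In-memory stand-in for the file application server (hypothetical names)."""

    def __init__(self):
        self.records = {}  # hash value -> access address (stands in for the mapping table)
        self.blobs = {}    # hash value -> file bytes (stands in for the stored files)

    def request_write(self, first_hash):
        """Step 2: consult the mapping table before any content is transferred."""
        if first_hash in self.records:
            return {"status": "already_uploaded", "address": self.records[first_hash]}
        return {"status": "upload_allowed"}

    def upload(self, first_hash, content):
        """Steps 3 and 4: verify the client's hash, then store the file only once."""
        second_hash = hashlib.sha1(content).hexdigest()
        if second_hash != first_hash:
            # The client is expected to start again from step 1.
            return {"status": "hash_mismatch"}
        if second_hash in self.records:
            return {"status": "already_uploaded", "address": self.records[second_hash]}
        self.blobs[second_hash] = content
        self.records[second_hash] = "/files/" + second_hash  # assumed address format
        return {"status": "stored", "address": self.records[second_hash]}
```

The second lookup inside `upload` mirrors step 4: another client may have stored the same content between the write request and the upload, so the record is checked again before writing.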
Step 4 includes:
Step 4.1, the file application server computes a writable storage directory in the file storage database;
Step 4.2, the file application server writes the file to the storage directory and returns the result to the client.
Step 4.1 includes:
Step 4.1.1, the file application server computes a writable file storage server from the first 2 characters of the second hash value or the first hash value of the file, together with the IDs and storage space status of the file storage servers in the file storage database;
Step 4.1.2, the file application server computes, from the first 3 to 6 characters of the second hash value or the first hash value of the file, the storage directory on that file storage server into which the file is written.
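The source does not say how the hash characters are turned into a server and a directory, so the following is only one possible reading. It assumes a hexadecimal digest, picks a server by taking the first two characters modulo the number of servers with enough free space, and reads "3 to 6" as the third through sixth characters, split into a two-level directory. The `StorageServer` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StorageServer:
    server_id: int
    free_bytes: int


def pick_server(hash_hex, servers, file_size):
    """Step 4.1.1 (assumed mapping): first 2 hex characters modulo the candidate count."""
    candidates = sorted(
        (s for s in servers if s.free_bytes >= file_size),
        key=lambda s: s.server_id,
    )
    if not candidates:
        raise RuntimeError("no storage server has enough free space")
    index = int(hash_hex[:2], 16) % len(candidates)
    return candidates[index]


def pick_directory(hash_hex):
    """Step 4.1.2 (assumed layout): characters 3-4 and 5-6 form a two-level directory."""
    return f"{hash_hex[2:4]}/{hash_hex[4:6]}"
```

Because the candidate list depends on free space at write time, the same hash could map to a different server later. Saving the chosen server ID and directory in the database, as the method does, keeps reads independent of that recomputation.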
In step 4.2, the file is written to the storage directory; if the write fails, a write-failure result is returned to the client; if the write succeeds, a write-success result is returned to the client.
In step 4.2, the file is written to the storage directory with its hash value as the file name.
In step 4.2, when the file is written to the storage directory, the maximum number of write attempts is three.
In step 4.2, the write-success result includes the hash value of the file content, the ID of the file storage server, the storage directory, the save-success result, and the access address of the file.
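The write rules for step 4.2 (hash value as file name, at most three attempts, a result carrying the hash, server ID, directory, and access address) can be illustrated as follows. This is a sketch under assumptions: the function name, the injected `write` callable, the result field names, and the address format are all hypothetical, and only `OSError` is treated as a retryable failure.

```python
def store_with_retries(write, hash_hex, server_id, directory, content, max_attempts=3):
    """Write the file under `directory`, named by its hash, trying at most three times."""
    path = f"{directory}/{hash_hex}"  # the hash value is the file name
    for attempt in range(1, max_attempts + 1):
        try:
            write(path, content)  # `write` is any callable that raises OSError on failure
        except OSError:
            continue  # try again until the attempt budget is used up
        return {
            "success": True,
            "hash": hash_hex,
            "server_id": server_id,
            "directory": directory,
            "address": f"/{server_id}/{path}",  # assumed access address format
            "attempts": attempt,
        }
    return {"success": False, "hash": hash_hex, "attempts": max_attempts}
```

Passing the writer in as a callable keeps the retry policy separate from the storage backend, so the same loop works for a local disk or a remote storage server.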
The client uploads the file to the file application server through an HTTP interface.
The distributed storage method of the present invention based on the hash value of file content has the following advantages:
1) Because storage is keyed on the hash value of the file content, an identical file uploaded multiple times is stored only once, which saves storage space;
2) Because the server and storage directory are computed from the hash value of the file content, the storage servers and locations are more dispersed. Files are therefore stored efficiently and read pressure is also spread out, which effectively reduces machine load and greatly improves the throughput of the file storage system.
3) The storage server and storage directory computed from the hash value are both saved in the database, so the file storage system is easy to scale.
4) The client accesses a file by its hash value, which helps the system quickly locate the server and storage directory and speeds up file reads.
5) The hash value of the file content is computed in advance on the client, so an identical large file does not need to be transmitted again, which saves bandwidth and transmission cost.
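The dispersion claimed in advantage 2 can be illustrated by comparing a date-based layout with a hash-prefix layout. This is an illustrative sketch only: the sample file contents, the example date, and both helper names are made up, and the two-character prefix is just one way to bucket a SHA-1 digest.

```python
import hashlib
from collections import Counter


def date_directory(upload_date):
    """Traditional layout: one directory per upload date."""
    return upload_date.replace("-", "/")


def hash_prefix(content):
    """Hash-based layout: the first two hex characters of the SHA-1 digest."""
    return hashlib.sha1(content).hexdigest()[:2]


# 1000 distinct files, all uploaded on the same (arbitrary) day.
files = [f"file-{i}".encode() for i in range(1000)]
by_date = Counter(date_directory("2024-01-15") for _ in files)
by_hash = Counter(hash_prefix(content) for content in files)

print(len(by_date), "bucket(s) under the date layout")
print(len(by_hash), "bucket(s) under the hash-prefix layout")
```

Every same-day upload lands in a single date directory, while the hash prefixes spread the same files over many buckets, which is the effect the method relies on to spread read and write pressure.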
Detailed description
The distributed storage method of the present invention based on the hash value of file content includes the following steps:
1. The client selects the file to upload and computes the SHA-1 value (hash value) of the content of that file;
2. The client sends the SHA-1 value of the file content to the file application server as a query; based on this SHA-1 value, the file application server queries the file record database of uploaded files to determine whether the file has already been uploaded;
3. If the file has already been uploaded, the file application server directly returns the already-uploaded result and the access address of the file;
4. If the file has not been uploaded, the file application server tells the client so, and the client transfers the SHA-1 value of the file and the file content to the file application server through the HTTP interface;
5. The file application server computes the SHA-1 value of the file content from the uploaded content and compares it with the SHA-1 value transmitted by the client;
6. If the SHA-1 values are inconsistent, a result indicating that the file SHA-1 values are inconsistent is returned directly to the client;
7. If the SHA-1 values are consistent, the server queries the file storage database to determine whether a file with this SHA-1 value has already been uploaded; if so, it directly returns the already-uploaded result to the client;
8. If not, the file storage server on which the uploaded file should be stored is computed from the first 2 characters of the SHA-1 value of the uploaded file content, together with the list of file storage servers and their storage space sizes in the file storage database;
9. The storage directory of the file on that file storage server is computed from characters 3 to 6 of the SHA-1 value of the file content;
10. Once the storage directory is obtained, the uploaded file is saved under that directory with its SHA-1 value as the file name;
11. If saving the uploaded file fails, up to three attempts are made; if all attempts fail, a failure result is returned to the client.
12. If saving the uploaded file succeeds, the file application server saves the SHA-1 value of the uploaded file, the file storage server ID, and the storage directory to the database, and returns the save-success result together with the access address of the uploaded file to the client.
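The client side of the steps above (hash first, query, upload only when needed) can be sketched as below. This is a minimal sketch under assumptions: `query` and `upload` are hypothetical callables standing in for the requests to the file application server, and the chunk size is an arbitrary choice that lets large files be hashed without loading them fully into memory.

```python
import hashlib
import io


def sha1_of_stream(stream, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a binary stream, reading it in chunks."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def upload_if_needed(stream, query, upload):
    """Steps 1 to 4: send the hash first and transfer content only if the server lacks it."""
    sha1 = sha1_of_stream(stream)
    address = query(sha1)  # hypothetical: returns an access address, or None if unknown
    if address is not None:
        return address  # already uploaded, so no file content is transferred
    stream.seek(0)  # rewind before sending the content
    return upload(sha1, stream.read())  # hypothetical: returns the new access address


# Usage with an in-memory file:
example_hash = sha1_of_stream(io.BytesIO(b"example"))
```

Sending the 40-character digest before the content is what lets an identical large file skip the transfer entirely on a repeat upload.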
The preferred embodiments of the invention have been described above, but the invention is not limited to these embodiments. Those skilled in the art may make various equivalent modifications or substitutions without departing from the spirit of the invention, and all such equivalent modifications or substitutions fall within the scope of the present application.
Claims (8)
1. A distributed storage method based on the hash value of file content, characterized in that it includes the following steps:
Step 1, the client computes a first hash value of the content of the file to be uploaded using a hash algorithm, and sends a write request carrying the first hash value to the file application server;
Step 2, the file application server receives the write request carrying the first hash value from the client and looks up the mapping table in the file record database; if a mapping record for the file already exists, the server returns to the client a result indicating that the file has already been uploaded, together with the access address of the file in the file storage database; if no mapping record for the file exists, the client uploads the file to the file application server;
Step 3, the file application server computes a second hash value of the file content using the hash algorithm and compares the second hash value with the first hash value; if the two values differ, the server returns to the client a result indicating that the second hash value differs from the first hash value, and the process returns to step 1; if the two values are identical, the process proceeds to step 4;
Step 4, the file application server looks up the mapping table in the file storage database; if a mapping record for the file already exists, the server returns to the client a result indicating that the file has already been uploaded; if no mapping record for the file exists, the server writes the file into the file storage database.
2. The distributed storage method based on the hash value of file content according to claim 1, characterized in that step 4 includes:
Step 4.1, the file application server computes a writable storage directory in the file storage database;
Step 4.2, the file application server writes the file to the storage directory and returns the result to the client.
3. The distributed storage method based on the hash value of file content according to claim 2, characterized in that step 4.1 includes:
Step 4.1.1, the file application server computes a writable file storage server from the first 2 characters of the second hash value or the first hash value of the file, together with the IDs and storage space status of the file storage servers in the file storage database;
Step 4.1.2, the file application server computes, from the first 3 to 6 characters of the second hash value or the first hash value of the file, the storage directory on that file storage server into which the file is written.
4. The distributed storage method based on the hash value of file content according to claim 2, characterized in that, in step 4.2, the file is written to the storage directory; if the write fails, a write-failure result is returned to the client; if the write succeeds, a write-success result is returned to the client.
5. The distributed storage method based on the hash value of file content according to claim 4, characterized in that, in step 4.2, the file is written to the storage directory with its hash value as the file name.
6. The distributed storage method based on the hash value of file content according to claim 4, characterized in that, in step 4.2, when the file is written to the storage directory, the maximum number of write attempts is three.
7. The distributed storage method based on the hash value of file content according to claim 4, characterized in that, in step 4.2, the write-success result includes the hash value of the file content, the ID of the file storage server, the storage directory, the save-success result, and the access address of the file.
8. The distributed storage method based on the hash value of file content according to claim 1, characterized in that, in step 2, the client uploads the file to the file application server through an HTTP interface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711274018.0A CN108011956A (en) | 2017-12-06 | 2017-12-06 | Distributed storage method based on file content cryptographic Hash |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711274018.0A CN108011956A (en) | 2017-12-06 | 2017-12-06 | Distributed storage method based on file content cryptographic Hash |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108011956A true CN108011956A (en) | 2018-05-08 |
Family ID: 62056839
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711274018.0A (Withdrawn; published as CN108011956A) | 2017-12-06 | 2017-12-06 | Distributed storage method based on file content cryptographic Hash |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108011956A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117407372A (en) * | 2023-10-18 | 2024-01-16 | 北京安证通信息科技股份有限公司 | Method and system for removing duplicate of uploaded file |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101534322A (en) * | 2009-04-13 | 2009-09-16 | 腾讯科技(深圳)有限公司 | File upload system and file upload method |
US7680998B1 (en) * | 2007-06-01 | 2010-03-16 | Emc Corporation | Journaled data backup during server quiescence or unavailability |
CN102622366A (en) * | 2011-01-28 | 2012-08-01 | 阿里巴巴集团控股有限公司 | Similar picture identification method and similar picture identification device |
CN104067259A (en) * | 2012-04-16 | 2014-09-24 | 惠普发展公司,有限责任合伙企业 | File upload based on hash value comparison |
CN106446001A (en) * | 2016-07-29 | 2017-02-22 | 北京北信源软件股份有限公司 | Method and system for storing files in computer storage mediums |
Non-Patent Citations (1)
Title |
---|
微软公司: "《面向.NET的Web应用程序设计》", 29 February 2004, 北京:高等教育出版社 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20180508 |