CN105678189A - Encrypted data file storage and retrieval system and method - Google Patents
- Publication number
- CN105678189A (CN201610025930.1A)
- Authority
- CN
- China
- Prior art keywords
- data file
- cloud storage
- storage system
- encryption
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6209—Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/134—Distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2141—Access rights, e.g. capability lists, access control lists, access tables, access matrices
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Storage Device Security (AREA)
Abstract
The invention discloses a method for storing and retrieving encrypted data files. Content metadata is first extracted from a data file; the data file is then encrypted, and the resulting encrypted data file is stored in the storage devices of a cloud storage system. The file global identifier of the data file in its encrypted state is added to the content metadata, which is then stored in the content metadata database of the cloud storage system. When encrypted data files stored in the cloud storage system are retrieved, the content metadata database is searched with an inverted index method to obtain the file global identifiers that match the search keyword, and the attribute information and content information of the encrypted data files corresponding to those identifiers are listed as the retrieval result. Because the content metadata is extracted before encryption and carries the file global identifier of the encrypted file, encrypted data files in the cloud storage system can be retrieved through that identifier, which keeps retrieval convenient while preserving the security and privacy of data files in a cloud storage environment.
Description
Technical field
The present invention relates to the field of information security technology, and specifically to a system and method for storing and retrieving encrypted data files based on a cloud storage system.
Background
Compared with traditional ways of storing data files, cloud storage technology has many advantages:
(1) Low cost. Under the traditional approach, users must buy infrastructure such as large numbers of servers and hard disks, and must also upgrade the equipment regularly. In a cloud storage environment users no longer need to buy this infrastructure, which saves the purchase cost and also reduces maintenance expenses.
(2) Good scalability. Small and medium-sized enterprises find it hard to estimate at an early stage how much storage capacity they will need. Cloud storage solves this problem well: an enterprise can initially buy just enough capacity for its current needs, and when business and data volume grow, storage capacity can be increased dynamically without affecting existing data.
(3) Automatic data replication. For data safety, many users back up their data, but backup is often tedious and raises its own problems of protecting the security and integrity of the backup data. Cloud storage providers generally keep two or more copies of each data file, which ensures high availability of the data file and frees users from worrying about backups.
(4) Automatic failover. When a traditional storage system is upgraded, data must be moved from the old storage to other storage servers and then migrated back once the new storage server is online. This interrupts service and brings a risk of data loss. These problems do not arise in a cloud storage environment: when the system detects an abnormality, service is automatically switched to an available redundant storage cluster without affecting normal service or losing data.
Although cloud storage has many advantages, it also has shortcomings. The most prominent is a growing concern among users: their data is stored in a cloud storage environment managed and controlled by others, so its content may be leaked, causing losses to individuals and companies. The current solution to this kind of problem is to store data in the cloud storage system in encrypted form.
Storing data files in encrypted form protects their privacy and security, but it also creates a problem. In many scenarios users need to retrieve data files according to specific content, and if the data files are encrypted, retrieval becomes impossible or slow.
Summary of the invention
The present invention provides a system and method for storing and retrieving encrypted data files. It solves the problems that encrypted data files are difficult and slow to retrieve, so that information about the required encrypted data files can be retrieved quickly while the data files remain encrypted.
To achieve the above object, the present invention provides an encrypted data file storage and retrieval system, characterized in that the system comprises:
A cloud storage system comprising a server end and storage devices. The server end includes a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module. The metadata management system connects to and manages a content metadata database, a system metadata database, and a storage metadata database. The storage devices store data files, which include encrypted data files and plaintext data files;
A client comprising a content metadata extraction module and a data file encryption module.
An encrypted data file storage and retrieval method, characterized in that the method comprises:
After the client or the cloud storage system server end extracts the content metadata of a data file, the data file is encrypted to generate an encrypted data file. The encrypted data file and its corresponding content metadata are stored in the storage devices of the cloud storage system and in the content metadata database of the server end, respectively. The content metadata comprises the attribute information and content information of the data file and the file global identifier of the data file in its encrypted state;
When encrypted data files stored in the cloud storage system are retrieved, the encrypted data file retrieval module of the server end uses an inverted index method to search the content metadata database of the server end for the file global identifiers, in the encrypted state, of the data files that match the search keyword, and returns the attribute information and content information of the encrypted data files corresponding to those file global identifiers as the retrieval result.
The method in which the client extracts the content metadata of the data file and then encrypts it to generate the encrypted data file comprises:
The client extracts the content metadata of the data file;
The client encrypts the data file from which the content metadata has been extracted, generating the encrypted data file;
The client uploads the encrypted data file and its corresponding content metadata to the cloud storage system server end.
The method in which the cloud storage system server end extracts the content metadata of the data file and then encrypts it to generate the encrypted data file comprises:
The client uploads the data file to the cloud storage system server end;
The cloud storage system server end extracts the content metadata of the data file;
The cloud storage system server end encrypts the data file from which the content metadata has been extracted, generating the encrypted data file.
Extracting the content metadata of the data file comprises: the content metadata extraction module of the client or of the cloud storage system server end performs a preliminary analysis of the data file content according to the characteristics of the data file, extracts the attribute information and content information that reflect those characteristics, and adds the file global identifier of the encrypted data file to the content metadata.
After the content metadata of the data file has been extracted, the client can modify the content metadata stored in the content metadata database of the cloud storage system server end.
The cloud storage system server end stores the encrypted data file in a distributed manner in the storage devices of the cloud storage system, and stores the content metadata in the content metadata database of the cloud storage system.
Retrieving the encrypted data files stored at the cloud storage system server end comprises:
The client sends a retrieval request containing a search keyword, and the cloud storage system analyzes the retrieval request and checks the legitimacy of the search keyword content in it;
The information retrieval module of the cloud storage system runs a matching query against the content metadata database using the inverted index method, and obtains, as the retrieval result, the file global identifiers (in the encrypted state) of the data files that match the search keyword, together with the attribute information and content information of the data files corresponding to those file global identifiers;
The information retrieval module sorts the retrieval result and sends it to the client.
Based on the retrieval result, the client may optionally download the encrypted data files corresponding to the file global identifiers listed in the result;
If the encrypted data file was encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, and the client decrypts it;
If the encrypted data file was encrypted at the cloud storage system server end, the server end decrypts the encrypted data file and then passes it to the client.
The retrieval method for encrypted data files further comprises an optimization of the inverted index method, which comprises:
Through vertical segmentation and horizontal movement, the zero elements of the inverted index matrix of the data file content metadata are moved to the bottom and the right side of the matrix;
Then, through block clustering, the original high-dimensional sparse matrix is converted into several low-dimensional dense matrices;
When content metadata is retrieved, the low-dimensional matrices of the optimized sparse matrix are delivered to different processing units in the cloud storage system and processed in parallel.
Compared with prior-art techniques for storing and retrieving encrypted data files, the encrypted data file storage and retrieval system and method of the present invention have the following advantages. The invention establishes a novel content metadata structure that lets users retrieve encrypted data files from multiple angles and aspects, ensuring the security and privacy of data files in a cloud storage environment while keeping data file retrieval convenient;
In the present invention, data files are always saved in the cloud storage system in encrypted form, so even if an encrypted data file is obtained, it cannot be read without the decryption key and its content is not leaked;
The present invention designs a novel inverted index method suited to content metadata retrieval. Based on the keyword information the user provides at the client, the corresponding encrypted data files can be retrieved quickly in the cloud storage system, which ensures the efficiency and precision of encrypted data file retrieval and solves the problem that such retrieval is difficult or slow in big data environments such as cloud storage;
The present invention applies equally to the retrieval of encrypted data files and plaintext data files in a cloud storage system, and in both cases achieves fast retrieval and return of retrieval results.
Brief description of the drawings
Fig. 1 is a flow diagram of the encrypted data file storage and retrieval method of the present invention;
Fig. 2 is a diagram of the relationship among the three kinds of metadata;
Fig. 3 is a diagram of the content metadata structure;
Fig. 4 is a diagram of the storage metadata structure;
Fig. 5 is a method flow chart of an embodiment of the encrypted data file storage and retrieval method based on a cloud storage system;
Fig. 6 is a diagram of the retrieval model for encrypted data files;
Fig. 7 is a schematic diagram of the inverted index of content metadata;
Fig. 8 is a schematic diagram of the matrix representation of the inverted index of content metadata;
Fig. 9 is a schematic diagram of the matrix partitioning of the inverted index of content metadata and its parallel processing.
Detailed description of the invention
Specific embodiments of the present invention are further described below with reference to the accompanying drawings.
The present invention discloses a system and method for storing and retrieving encrypted data files based on a cloud storage system. The attribute information and content information of the original data file, together with the file global identifier of the encrypted data file, are stored in a content metadata database, so that encrypted data files can be retrieved without decrypting the original data file.
The technical principle of the present invention is as follows. (1) A dedicated data file content metadata structure is designed; the content metadata includes the file global identifier (FGID) of the data file in its encrypted state. Before the client or the cloud storage system server end encrypts a data file, the content metadata of the data file is extracted automatically and stored in the content metadata database of the cloud storage system server end, which provides the basis for retrieving encrypted data files. (2) The data file may be encrypted either at the client or at the server end of the cloud storage system, and is then stored in the storage devices of the cloud storage system in a distributed manner, which ensures the security, privacy, high availability, and data integrity of the data file. (3) A novel inverted index method ensures the retrieval speed of massive content metadata in the cloud storage system, thereby achieving fast retrieval of encrypted data files. The invention overcomes the defect of conventional approaches, in which encrypted data files are difficult to retrieve or can only be retrieved slowly after decryption.
The encrypted data file storage and retrieval system based on a cloud storage system disclosed by the invention comprises a cloud storage system and a client.
The cloud storage system comprises a server end and storage devices. The server end includes a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module. The metadata management system connects to and manages a content metadata database, a system metadata database, and a storage metadata database. The storage devices store data files, which include encrypted data files and plaintext data files. The client comprises a content metadata extraction module and a data file encryption module.
As shown in Fig. 1, an encrypted data file storage and retrieval method based on a cloud storage system is disclosed. The method comprises the following steps:
S100, the client authenticates the user through the authentication interface.
User authentication covers access control and the provision of user identity information. Access control restricts access by unauthorized users and forms the first line of defense for data security in a cloud storage environment. User information provision: during the later extraction of data file content metadata, user-related information and the access permission settings for the data must both be obtained from the user authentication information.
User identity authentication in the cloud storage system keeps unauthorized users out and refuses their operations, which protects the security of the cloud storage system. It also supplies the attributes needed for the later extraction or generation of the related metadata, and the information needed for access control on data files. In addition, it establishes the identity of each legitimate user and limits the scope of that user's data access.
During authentication, the system determines whether to run the file storage flow or the file retrieval flow: for file storage, jump to S200; for file retrieval, jump to S300.
S200, the file storage flow is run on the user's data file: the data file is encrypted and stored in the cloud storage system.
S300, the file retrieval flow is run on encrypted data files: the encrypted data files are retrieved directly in the cloud storage system without being decrypted.
Data files include structured data files, semi-structured data files, and unstructured data files. Structured data files are the various traditional database files. Unstructured data files are document files, picture files, audio files, video files, and the like. A semi-structured data file is an irregular database file, that is, a database file with unstructured data embedded in it.
As shown in Fig. 2, the present invention divides the metadata in a cloud storage environment into three classes: system metadata, storage metadata, and content metadata. The three classes are kept in different metadata databases in the cloud, and the three classes of metadata belonging to one encrypted data file are associated through the FGID of that data file. The FGID is both the primary key and the foreign key of the tables holding the three classes of metadata. It uniquely identifies the encrypted data file, is determined by the content of the encrypted data file, and can be used to check the data integrity of the encrypted data file. Each FGID is 128 bits long, so it can represent 2^128 files.
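The role of the FGID as a shared key can be illustrated with a minimal sketch. The table names, column names, and sample values below are assumptions made for illustration only, and a single in-memory SQLite database stands in for the three separate metadata databases the text describes.

```python
import sqlite3

# Hypothetical schema: table names, column names and values are illustrative
# assumptions. One in-memory database stands in for three separate databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE system_metadata (
    fgid TEXT PRIMARY KEY, directory TEXT, file_name TEXT);
CREATE TABLE storage_metadata (
    fgid TEXT PRIMARY KEY REFERENCES system_metadata(fgid),
    size_bytes INTEGER, owner_id TEXT);
CREATE TABLE content_metadata (
    fgid TEXT PRIMARY KEY REFERENCES system_metadata(fgid),
    keywords TEXT);
""")

# A 128-bit FGID written as 32 hexadecimal digits (made-up value).
fgid = "0123456789abcdef0123456789abcdef"
conn.execute("INSERT INTO system_metadata VALUES (?, ?, ?)",
             (fgid, "/docs", "report.enc"))
conn.execute("INSERT INTO storage_metadata VALUES (?, ?, ?)",
             (fgid, 2048, "user-1"))
conn.execute("INSERT INTO content_metadata VALUES (?, ?)",
             (fgid, "cloud,storage"))

# All three classes of metadata for one encrypted file, joined on the FGID.
row = conn.execute("""
    SELECT s.file_name, st.size_bytes, c.keywords
    FROM system_metadata AS s
    JOIN storage_metadata AS st ON st.fgid = s.fgid
    JOIN content_metadata AS c ON c.fgid = s.fgid
    WHERE s.fgid = ?""", (fgid,)).fetchone()
```

Here `row` holds one record per encrypted file that combines fields from all three metadata classes, which is the association the FGID is described as providing.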
System metadata includes the directory information and directory path names of the cloud storage system, the data file names under each directory, the file global identifier (FGID) of each encrypted data file, and information such as the attributes of data files and directories. It is generated automatically by the system after the user's encrypted data file has been stored.
As shown in Fig. 3, content metadata is the key to data file retrieval and reflects the characteristics of the data file. It comprises the attribute information and content information of the stored data file in its plaintext state, together with the file global identifier (FGID) of the data file in its encrypted state. Attribute information includes the file name, creation time, file creator, modification time, modifier, version information, file type, and so on; it gives an overall picture of the data file. Content information includes the data file content summary, keywords, file aliases, file tags, remarks, purpose, content organization structure, compression method, encoding format, and other content characteristics of the data file. The FGID item is added automatically by the cloud storage system after the encrypted data file has been stored.
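The content metadata structure described above can be sketched as a small set of record types. This is only one possible representation: the class names, field names, and field types are assumptions, and only the list of items comes from the text.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional

@dataclass
class AttributeInfo:
    # Attribute information items listed in the description.
    file_name: str
    created_at: str
    creator: str
    modified_at: str = ""
    modifier: str = ""
    version: str = ""
    file_type: str = ""

@dataclass
class ContentInfo:
    # Content information items listed in the description.
    summary: str = ""
    keywords: List[str] = field(default_factory=list)
    aliases: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    remarks: str = ""
    purpose: str = ""
    structure: str = ""
    compression: str = ""
    encoding: str = ""

@dataclass
class ContentMetadata:
    attributes: AttributeInfo
    content: ContentInfo
    # Unknown at extraction time; filled in once the encrypted file is stored.
    fgid: Optional[str] = None

meta = ContentMetadata(
    attributes=AttributeInfo("report.docx", "2016-01-15", "alice"),
    content=ContentInfo(summary="Quarterly report", keywords=["cloud", "storage"]),
)
meta.fgid = "0123456789abcdef0123456789abcdef"  # made-up FGID for illustration
record = asdict(meta)  # plain dict, ready to be written to a metadata database
```

Keeping `fgid` optional mirrors the order of operations in the text: the attribute and content information come from the plaintext file, while the FGID can only be added after encryption.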
As shown in Fig. 4, storage metadata includes the basic information and storage information of the encrypted data file, together with the file global identifier (FGID) of the encrypted data file. Basic information includes the data file size, user ID, permitted operation types, replication factor, security attributes, and so on. Storage information includes the block ID content address list of the encrypted data file, the block size list, and the base address of the mapping table between block physical addresses and content addresses. The user ID is the ID of the file's owner; permitted operation types include read, write, modify, and so on; the replication factor is the number of backup copies of the data file; and a content address is the hash function value of the corresponding data block. Storage metadata is generated automatically by the system after the user's encrypted data file has been stored.
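As an illustration of how block content addresses and block sizes might be derived, the sketch below splits an encrypted file into fixed-size blocks and hashes each block. The text does not name the hash function or the block size, so SHA-256 and the tiny block size used here are assumptions, as are all class, field, and function names; the physical address mapping table is omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class StorageMetadata:
    fgid: str
    size_bytes: int
    owner_id: str
    operations: List[str]        # permitted operation types, e.g. read/write
    replication_factor: int      # number of backup copies
    block_addresses: List[str]   # content address = hash of each block
    block_sizes: List[int]

def describe_blocks(ciphertext: bytes, block_size: int) -> Tuple[List[str], List[int]]:
    """Split the encrypted file into blocks and compute one content address per block."""
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext), block_size)]
    # SHA-256 is an assumed choice; the description only says "hash function value".
    addresses = [hashlib.sha256(b).hexdigest() for b in blocks]
    sizes = [len(b) for b in blocks]
    return addresses, sizes

ciphertext = b"abcdefghij"  # stand-in for real encrypted bytes
addresses, sizes = describe_blocks(ciphertext, block_size=4)
storage_meta = StorageMetadata(
    fgid="0123456789abcdef0123456789abcdef",  # made-up FGID
    size_bytes=len(ciphertext),
    owner_id="user-1",
    operations=["read", "write", "modify"],
    replication_factor=2,
    block_addresses=addresses,
    block_sizes=sizes,
)
```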
In the metadata management system of the cloud storage system server end, the three classes of metadata of a data file are linked together through its file global identifier (FGID). The relevant content metadata is extracted automatically by the system, and an authorized user can manually modify the content metadata in the content metadata database from the client over the network.
Encryption of data files: depending on where the data file is encrypted, encryption is divided into client encryption and server-end encryption. The two modes suit different scenarios: for data files with relatively high security requirements, encryption at the cloud storage system server end may be chosen; for data files with very high security requirements, client encryption may be chosen.
Storage and management of metadata: system metadata, storage metadata, and content metadata are stored in three dedicated metadata databases in the cloud storage system. These databases support mass data storage and efficient retrieval, can serve the concurrent requests of a very large number of users, and provide automatic failure recovery and data backup, which ensures the security and high availability of the metadata.
Storage and management of encrypted data files: the cloud storage system uses distributed storage technology to store encrypted data files in the storage devices that the cloud storage system allocates to the user's virtual machine. The cloud storage system is built on a distributed high-availability storage system whose total capacity can be scaled horizontally by adding nodes. Several physical data blocks are merged into one larger logical storage space to reduce data management overhead.
Retrieval of encrypted data files based on content metadata first performs a preliminary analysis of the user's input to narrow the search range and determine which metadata files the retrieval may involve, which speeds up retrieval in the massive content metadata database of the cloud storage system. The retrieval operation does not require decrypting the encrypted data files, and the final retrieval result is an encrypted data file or a list of encrypted data files.
Fig. 5 shows an embodiment of the encrypted data file storage and retrieval method based on a cloud storage system. Its file storage flow (S200) comprises the following steps:
S201, after the client completes user authentication, the client determines whether the data file is to be encrypted at the client. If so, client encryption is used and the flow jumps to S202; if not, server encryption (at the cloud storage system server end) is used and the flow jumps to S205.
S202, the content metadata extraction module of the client automatically extracts the content metadata of the data file. Extraction of content metadata comprises automatic extraction and manual modification.
Automatic extraction: the content metadata extraction module of the client performs a preliminary analysis of the file content according to the characteristics of the file and extracts the attribute information and content information described above, which reflect the characteristics of the data file.
Manual modification: the user is the owner of the data file and has a fuller understanding of its kind, attributes, purpose, and characteristics. For some special data files, the user can manually modify, from the client over the network, the content metadata that the system extracted automatically. This describes the data file more accurately and improves the accuracy and efficiency of retrieval.
Here, the content metadata of the data file is extracted automatically by the content metadata extraction module of the client before the data file is encrypted, and is sent to the content metadata database of the cloud storage system server end to be saved. An authorized user can edit the content metadata stored in the content metadata database so that it better matches the user's retrieval habits, which improves the accuracy and efficiency of the user's retrieval.
S203, the encryption module of the client encrypts the data file from which the content metadata has been extracted, using the user's private key, a symmetric key, or another encryption algorithm, and generates the encrypted data file.
S204, the client uploads the encrypted data file and its corresponding content metadata to the server end of the cloud storage system, then jumps to S208.
S205, the client uploads the data file in plaintext form to the server end of the cloud storage system.
S206, the content metadata extraction module of the cloud storage system server end automatically extracts the content metadata of the data file. Extraction of content metadata comprises automatic extraction and manual modification.
Automatic extraction: the content metadata extraction module of the server end performs a preliminary analysis of the file content according to the characteristics of the file and extracts the attribute information and content information that reflect the characteristics of the data file.
Manual modification: the user is the owner of the data file and has a fuller understanding of its kind, attributes, purpose, and characteristics. For some special data files, the user can manually modify, from the client over the network, the content metadata that the system extracted automatically. This describes the data file more accurately and improves the accuracy and efficiency of retrieval.
Here, the content metadata of the data file is extracted automatically by the content metadata extraction module of the cloud storage system server end before the data file is encrypted, and is saved in the content metadata database of the cloud storage system server end. An authorized user can edit the content metadata stored in the content metadata database so that it better matches the user's retrieval habits, which improves the accuracy and efficiency of the user's retrieval.
S207, the encryption module of the cloud storage system server end encrypts the data file from which the content metadata has been extracted, using the user's private key, a symmetric key, or another encryption algorithm, and generates the encrypted data file. Encrypting the data file ensures its security. The flow then jumps to S208.
S208, the cloud storage system server end stores the encrypted data file, whether it was generated by client encryption or by server-end encryption, in the corresponding storage devices of the cloud storage system.
After the server end of the cloud storage system obtains the user's encrypted data file, it stores the encrypted data file in the storage devices corresponding to the user's virtual machine.
Each cloud tenant (user) uses the cloud storage system in units of user virtual machines.
At the same time, the encryption module of the cloud storage system uses the MD5 algorithm to generate the file global identifier (FGID) of the encrypted data file and adds it to the content metadata of that data file. The FGID is the unique identifier of the encrypted data file, and the data integrity of the encrypted data file can also be checked against it.
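FGID generation with MD5, and the integrity check it enables, can be sketched with Python's standard `hashlib` module. The function names are illustrative assumptions; the text only states that MD5 is applied to the encrypted data file.

```python
import hashlib

def make_fgid(encrypted_bytes: bytes) -> str:
    """Return the MD5 digest of the encrypted file as 32 hex digits (128 bits)."""
    return hashlib.md5(encrypted_bytes).hexdigest()

def verify_integrity(encrypted_bytes: bytes, fgid: str) -> bool:
    """Recompute the digest and compare it with the stored FGID."""
    return make_fgid(encrypted_bytes) == fgid

ciphertext = b"\x8f\x02\x17 example encrypted bytes"  # stand-in for real ciphertext
fgid = make_fgid(ciphertext)
intact = verify_integrity(ciphertext, fgid)            # unchanged file
tampered = verify_integrity(ciphertext + b"x", fgid)   # modified file
```

Because the identifier is computed from the encrypted bytes, any change to the stored file produces a different digest, which is what allows the FGID to double as an integrity check.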
After the encrypted data file has been stored, the cloud storage system automatically generates its system metadata and storage metadata from the directory information, storage location information, and file global identifier (FGID) of the stored encrypted data file.
S209, the content metadata extracted before encryption is stored in the content metadata database of the cloud storage system.
Further, after the content metadata has been stored in the content metadata database of the cloud storage system, the content metadata database can be updated: an authorized user can modify the content metadata in the content metadata database from the client over the network, so that the user can retrieve encrypted data files more accurately.
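The storage flow S202 to S209 can be summarized in a short sketch: extract content metadata from the plaintext, encrypt, derive the FGID from the ciphertext, then store the ciphertext and the metadata separately. Everything named here is hypothetical. In particular, `toy_encrypt` is a reversible XOR placeholder and not a real cipher, keyword extraction is reduced to splitting text on whitespace, and plain dictionaries stand in for the storage devices and the content metadata database.

```python
import hashlib
from typing import Callable, Dict

def toy_encrypt(data: bytes) -> bytes:
    """Placeholder only: XOR with a fixed byte. NOT secure; a real system
    would call a proper encryption algorithm here."""
    return bytes(b ^ 0x5A for b in data)

def store_encrypted_file(plaintext: bytes, file_name: str,
                         encrypt: Callable[[bytes], bytes],
                         file_store: Dict[str, bytes],
                         content_db: Dict[str, dict]) -> str:
    # S202 / S206: extract content metadata while the file is still plaintext.
    words = plaintext.decode("utf-8", "ignore").lower().split()
    metadata = {"file_name": file_name, "keywords": sorted(set(words))}
    # S203 / S207: encrypt only after the metadata has been extracted.
    ciphertext = encrypt(plaintext)
    # S208: derive the FGID from the encrypted bytes and store the ciphertext.
    fgid = hashlib.md5(ciphertext).hexdigest()
    file_store[fgid] = ciphertext
    # S209: add the FGID to the content metadata and store it.
    metadata["fgid"] = fgid
    content_db[fgid] = metadata
    return fgid

files: Dict[str, bytes] = {}
content_db: Dict[str, dict] = {}
fgid = store_encrypted_file(b"cloud storage report", "report.txt",
                            toy_encrypt, files, content_db)
```

Passing the encryption routine in as a parameter reflects the two modes in the text: the same steps apply whether the routine runs at the client or at the server end.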
Building on the encrypted data file storage process of S200, retrieval of encrypted data files can be realized. Fig. 6 shows the retrieval model for encrypted data files based on content metadata. Its general flow is:
(1) The client uploads the retrieval content to the information retrieval module of the cloud storage system.
(2) The information retrieval module uses the inverted index method to search the content metadata database managed by the metadata management system for information matching the retrieval content.
(3) The metadata management system sends the file global identifiers (FGIDs) and corresponding content metadata of the encrypted data files found in step (2) to the information retrieval module.
(4) The information retrieval module sorts the results returned by the metadata management system and sends the sorted retrieval result list to the client.
(5) At the client, the user selects the file to download from the retrieval result list, and the FGID of the selected file is sent to the information retrieval module.
(6) Using the FGID of the file, the information retrieval module looks up the storage location information of the corresponding encrypted data file in the storage metadata database managed by the metadata management system.
(7) The metadata management system sends the storage location information found in step (6) to the distributed storage system.
(8) The distributed storage system fetches the encrypted data file according to its storage location information and sends it to the encryption/decryption module.
(9) The encryption/decryption module decrypts the encrypted data file and passes it to the client, which completes the retrieval of the encrypted data file.
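The search part of this flow, steps (1) to (4), can be sketched with a plain inverted index over content metadata keywords. This is a basic textbook inverted index under assumed data shapes, not the optimized index of the invention; the function names, the dictionary layout, and the ranking rule (total keyword occurrences, ties broken by FGID) are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def build_inverted_index(content_db: Dict[str, dict]) -> Dict[str, Dict[str, int]]:
    """Map each keyword to {FGID: number of occurrences in that file's metadata}."""
    index: Dict[str, Dict[str, int]] = defaultdict(dict)
    for fgid, meta in content_db.items():
        for keyword in meta["keywords"]:
            index[keyword][fgid] = index[keyword].get(fgid, 0) + 1
    return index

def search(index: Dict[str, Dict[str, int]], content_db: Dict[str, dict],
           query: str) -> List[Tuple[str, str]]:
    """Return (FGID, file name) pairs ranked by total keyword occurrences."""
    scores: Dict[str, int] = defaultdict(int)
    for term in query.lower().split():
        for fgid, count in index.get(term, {}).items():
            scores[fgid] += count
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return [(f, content_db[f]["file_name"]) for f in ranked]

# Hypothetical content metadata, keyed by shortened stand-in FGIDs.
db = {
    "ID1": {"file_name": "a.doc", "keywords": ["cloud", "cloud", "storage"]},
    "ID2": {"file_name": "b.doc", "keywords": ["cloud"]},
    "ID3": {"file_name": "c.doc", "keywords": ["audio"]},
}
idx = build_inverted_index(db)
results = search(idx, db, "cloud storage")
```

Only metadata is touched during the search; the encrypted files themselves are fetched later by FGID, which is why no decryption is needed to produce the result list.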
In Fig. 6, if the data file is encrypted at the server end, the encryption/decryption module of the model resides at the server end of the cloud storage system; if the data file is encrypted at the client, the encryption/decryption module resides at the client.
Fig. 5 shows an embodiment of the encrypted data file storage and retrieval method based on the cloud storage system, in which the file retrieval flow (S300) comprises the following steps:
S301: The client receives a query request through the query interface; the query request includes a search keyword. The client uploads the query request containing the search keyword to the cloud storage system.
The information retrieval module of the cloud storage system analyzes the query request submitted by the client and determines whether the content of the search keyword in the query request is legitimate.
S302, data retrieval: The information retrieval module of the cloud storage system performs a matching query on the content metadata library using a novel inverted index retrieval algorithm, and returns the file global identifiers (FGIDs) and part of the content metadata of the encrypted data files that satisfy the query.
The novel inverted index method is an improved inverted index method suited to fast retrieval over the massive content metadata held in the content metadata library of the cloud storage system.
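For orientation, a conventional (unimproved) inverted index over content metadata can be sketched in Python as below. This is not the patent's improved method. The function names, the sample metadata, and the choice to rank results by occurrence count are assumptions for illustration; the text states that results are sorted but does not give the criterion.

```python
from collections import defaultdict


def build_inverted_index(content_metadata):
    """Map each keyword to {FGID: occurrence count}."""
    index = defaultdict(dict)
    for fgid, keywords in content_metadata.items():
        for word in keywords:
            index[word][fgid] = index[word].get(fgid, 0) + 1
    return index


def lookup(index, keyword):
    """Return the FGIDs whose metadata contains the keyword, most frequent first."""
    postings = index.get(keyword, {})  # .get avoids creating an entry for unknown keywords
    return sorted(postings, key=lambda fgid: (-postings[fgid], fgid))


# Hypothetical content metadata: FGID -> keywords extracted from the file.
metadata = {
    "FGID-1": ["cloud", "storage", "cloud"],
    "FGID-2": ["storage", "encryption"],
    "FGID-3": ["cloud"],
}
index = build_inverted_index(metadata)
print(lookup(index, "cloud"))    # FGID-1 (2 occurrences) is listed before FGID-3 (1)
print(lookup(index, "missing"))  # empty list, i.e. the retrieval found nothing
```

Because the index is keyed by keyword, a query costs one dictionary lookup plus a sort of the matching FGIDs, regardless of how many files are stored.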
Fig. 7, taken together with Fig. 8, shows the inverted index of data file content metadata disclosed in this invention and its matrix representation. Keyword 1, keyword 2, ... denote the content metadata items in the content metadata library; ID1, ID2, ..., IDn denote the file global identifiers (FGIDs) of the encrypted data files; and each row-column entry in Fig. 8 gives the number of times a keyword occurs in a data file. As Fig. 8 shows, the inverted index matrix of data file content metadata is a sparse matrix. To speed up keyword retrieval over the massive content metadata library of the cloud storage system, the matrix is optimized as follows: through vertical segmentation and horizontal shifting, the zero elements are moved to the bottom and right of the matrix; block clustering then converts the original high-dimensional sparse matrix into a set of low-dimensional dense matrices. During content metadata retrieval, these low-dimensional matrices are dispatched to different processing units of the cloud storage system and processed in parallel, which can greatly increase keyword retrieval speed over the massive content metadata library. The principle is shown in Fig. 9, where M1, M2, ..., Mn denote the low-dimensional dense matrices and P1, P2, ..., Pn denote the parallel processing units of the cloud storage system.
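The idea of splitting the sparse keyword-by-FGID matrix into smaller, denser blocks and searching them in parallel can be sketched in Python as follows. The patent does not spell out the segmentation or clustering rules, so this is a simplified stand-in under stated assumptions: rows are ordered by their number of non-zero entries, grouped into fixed-size blocks, and all-zero columns are dropped from each block. The function names, the block size, the sample matrix, and the use of a thread pool in place of processing units P1..Pn are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_blocks(keywords, fgids, matrix, rows_per_block=2):
    """Split a keyword x FGID count matrix into small blocks without all-zero columns."""
    # Order rows so that keywords with many non-zero entries come first (assumed rule).
    order = sorted(range(len(keywords)), key=lambda r: -sum(1 for v in matrix[r] if v))
    blocks = []
    for start in range(0, len(order), rows_per_block):
        rows = order[start:start + rows_per_block]
        # Keep only the columns that hold a non-zero value somewhere in this block.
        cols = [c for c in range(len(fgids)) if any(matrix[r][c] for r in rows)]
        blocks.append({
            "keywords": [keywords[r] for r in rows],
            "fgids": [fgids[c] for c in cols],
            "counts": [[matrix[r][c] for c in cols] for r in rows],
        })
    return blocks


def search_block(block, keyword):
    """Look the keyword up inside one block; return {FGID: count} for non-zero entries."""
    if keyword not in block["keywords"]:
        return {}
    row = block["counts"][block["keywords"].index(keyword)]
    return {f: n for f, n in zip(block["fgids"], row) if n}


def parallel_search(blocks, keyword):
    # One worker thread per block stands in for the processing units P1..Pn.
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        partials = list(pool.map(search_block, blocks, [keyword] * len(blocks)))
    merged = {}
    for part in partials:
        merged.update(part)
    return merged


keywords = ["cloud", "key", "index", "audit"]
fgids = ["ID1", "ID2", "ID3", "ID4"]
matrix = [
    [2, 0, 1, 0],  # cloud
    [0, 0, 0, 3],  # key
    [1, 1, 0, 0],  # index
    [0, 0, 0, 1],  # audit
]
blocks = make_blocks(keywords, fgids, matrix)
print(parallel_search(blocks, "cloud"))
```

With this sample data the 4 x 4 matrix becomes one 2 x 3 block and one 2 x 1 block, so each worker scans far fewer zero entries than a scan of the full matrix would.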
S303: The cloud storage system determines whether the retrieval succeeded. If so, it proceeds to step S304; if not, it sends the client a message stating that no matching encrypted data file was found, and proceeds to S305.
S304: The cloud storage system locates the corresponding encrypted data file according to its file global identifier (FGID).
At the same time, the information retrieval module of the cloud storage system sorts the results returned by the retrieval in S302 and sends the sorted retrieval results to the client.
S305: The client receives the retrieval results and outputs them to the user. If the results include the file global identifier of an encrypted data file that matches the retrieval content, a match exists, and the user can obtain the required encrypted data file information at the client from the results.
In addition, based on the retrieval results, the client may choose to download the encrypted data file corresponding to a file global identifier listed in the results.
If the encrypted data file was originally encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, which decrypts it. If the encrypted data file was originally encrypted at the server end of the cloud storage system, the cloud storage system decrypts it and then passes it to the client.
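The two delivery paths can be sketched in Python as below. The function names, the `encrypted_at` flag, the keys, and the XOR placeholder are assumptions for illustration; the patent does not name a cipher, and XOR is used only so the sketch runs with the standard library.

```python
def xor_cipher(data: bytes, key: int) -> bytes:
    """Placeholder symmetric transform (XOR); not secure, it is its own inverse."""
    return bytes(b ^ key for b in data)


def deliver(encrypted: bytes, encrypted_at: str, server_key: int):
    """Return (payload sent to the client, whether the client still has to decrypt)."""
    if encrypted_at == "client":
        # File was encrypted at the client: forward the ciphertext unchanged.
        return encrypted, True
    if encrypted_at == "server":
        # File was encrypted at the server end: decrypt there before sending.
        return xor_cipher(encrypted, server_key), False
    raise ValueError(f"unknown encryption side: {encrypted_at!r}")


client_key, server_key = 0x21, 0x42  # hypothetical keys
from_client = xor_cipher(b"report", client_key)
from_server = xor_cipher(b"ledger", server_key)

payload_a, needs_a = deliver(from_client, "client", server_key)
result_a = xor_cipher(payload_a, client_key) if needs_a else payload_a

payload_b, needs_b = deliver(from_server, "server", server_key)
result_b = xor_cipher(payload_b, client_key) if needs_b else payload_b

print(result_a, result_b)
```

In the client-encrypted path the server only ever handles ciphertext in this sketch, which is why the decrypt step is deferred to the client.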
Although the present invention has been described in detail through the preferred embodiments above, the description should not be regarded as limiting the invention. Various modifications and substitutions will be apparent to those skilled in the art after reading the foregoing. The scope of protection of the invention is therefore defined by the appended claims.
Claims (10)
1. An encrypted data file storage and retrieval system, characterized in that the system comprises:
a cloud storage system comprising a server end and a storage device; the server end comprises a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module; the metadata management system is connected to and manages a content metadata library, a system metadata library, and a storage metadata library; the storage device is used to store data files, which comprise encrypted data files and plaintext data files;
a client comprising a content metadata extraction module and a data file encryption module.
2. An encrypted data file storage and retrieval method, characterized in that the method comprises:
after the client or the server end of the cloud storage system extracts the content metadata of a data file, encrypting the data file to generate an encrypted data file, and storing the encrypted data file and the corresponding content metadata in the storage device of the cloud storage system and in the content metadata library of the server end, respectively; the content metadata comprises attribute information and content information of the data file and the file global identifier of the data file in its encrypted state;
when retrieving an encrypted data file stored in the cloud storage system, the encrypted data file retrieval module of the server end uses an inverted index method to retrieve, from the content metadata library of the server end, the file global identifier, in the encrypted state, of the data file that matches the search keyword, and returns as the retrieval result the attribute information and content information of the encrypted data file corresponding to that file global identifier.
3. The encrypted data file storage and retrieval method of claim 2, characterized in that generating the encrypted data file by encryption after the client extracts the content metadata of the data file comprises:
the client extracts the content metadata of the data file;
the client encrypts the data file from which the content metadata has been extracted, generating the encrypted data file;
the client uploads the encrypted data file and the corresponding content metadata to the server end of the cloud storage system.
4. The encrypted data file storage and retrieval method of claim 2, characterized in that generating the encrypted data file by encryption after the server end of the cloud storage system extracts the content metadata of the data file comprises:
the client uploads the data file to the server end of the cloud storage system;
the server end of the cloud storage system extracts the content metadata of the data file;
the server end of the cloud storage system encrypts the data file from which the content metadata has been extracted, generating the encrypted data file.
5. The encrypted data file storage and retrieval method of claim 2, 3, or 4, characterized in that extracting the content metadata of the data file comprises: the content metadata extraction module of the client or of the server end of the cloud storage system performs a preliminary analysis of the data file content according to the characteristics of the data file, extracts the attribute information and content information that reflect those characteristics, and adds the file global identifier of the encrypted data file to the content metadata.
6. The encrypted data file storage and retrieval method of claim 2, 3, or 4, characterized in that, after the content metadata of the data file has been extracted, the client can modify the content metadata stored in the content metadata library of the server end of the cloud storage system.
7. The encrypted data file storage and retrieval method of claim 2, 3, or 4, characterized in that the server end of the cloud storage system stores the encrypted data files in a distributed manner on the storage device of the cloud storage system and stores the content metadata in the content metadata library of the cloud storage system.
8. The encrypted data file storage and retrieval method of claim 2, characterized in that retrieving the encrypted data file stored at the server end of the cloud storage system comprises:
the client sends a retrieval request containing a search keyword, and the cloud storage system analyzes the retrieval request and determines whether the content of the search keyword in the retrieval request is legitimate;
the information retrieval module of the cloud storage system performs a matching query on the content metadata library using the inverted index method and obtains, as the retrieval result, the file global identifier, in the encrypted state, of the data file matching the search keyword, together with the attribute information and content information of the data file corresponding to that file global identifier;
the information retrieval module sorts the retrieval results and sends them to the client.
9. The encrypted data file storage and retrieval method of claim 8, characterized in that the client, based on the retrieval results, may choose to download the encrypted data file corresponding to a file global identifier listed in the retrieval results;
if the encrypted data file was encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, which decrypts it;
if the encrypted data file was encrypted at the server end of the cloud storage system, the server end of the cloud storage system decrypts the encrypted data file and then passes it to the client.
10. The encrypted data file storage and retrieval method of claim 2, 8, or 9, characterized in that the retrieval method for encrypted data files further comprises an optimization of the inverted index method, the optimization comprising:
through vertical segmentation and horizontal shifting, moving the zero elements of the inverted index matrix of data file content metadata to the bottom and right of the matrix;
then, through block clustering, converting the original high-dimensional sparse matrix into several low-dimensional dense matrices;
during content metadata retrieval, dispatching the several low-dimensional matrices of the optimized sparse matrix to different processing units of the cloud storage system for parallel processing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610025930.1A CN105678189B (en) | 2016-01-15 | 2016-01-15 | Data file encryption storage and retrieval system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610025930.1A CN105678189B (en) | 2016-01-15 | 2016-01-15 | Data file encryption storage and retrieval system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105678189A true CN105678189A (en) | 2016-06-15 |
CN105678189B CN105678189B (en) | 2018-10-23 |
Family
ID=56300884
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610025930.1A Active CN105678189B (en) | 2016-01-15 | 2016-01-15 | Data file encryption storage and retrieval system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105678189B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770462A (en) * | 2008-12-30 | 2010-07-07 | 日电(中国)有限公司 | Device for ciphertext index and search and method thereof |
CN102024054A (en) * | 2010-12-10 | 2011-04-20 | 中国科学院软件研究所 | Ciphertext cloud-storage oriented document retrieval method and system |
US20120016843A1 (en) * | 2003-05-22 | 2012-01-19 | Carmenso Data Limited Liability Company | Information Source Agent Systems and Methods for Backing Up Files To a Repository Using File Identicality |
CN103442057A (en) * | 2013-08-27 | 2013-12-11 | 玉林师范学院 | Cloud storage system based on user collaboration cloud |
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106131013A (en) * | 2016-07-06 | 2016-11-16 | 杨炳 | A kind of protecting data encryption system |
CN106302472A (en) * | 2016-08-09 | 2017-01-04 | 厦门乐享新网络科技有限公司 | The hidden method of information and device |
CN106302472B (en) * | 2016-08-09 | 2019-12-24 | 厦门乐享新网络科技有限公司 | Information hiding method and device |
CN106302449A (en) * | 2016-08-15 | 2017-01-04 | 中国科学院信息工程研究所 | A kind of ciphertext storage cloud service method open with searching ciphertext and system |
CN106302449B (en) * | 2016-08-15 | 2019-10-11 | 中国科学院信息工程研究所 | A kind of storage of ciphertext and the open cloud service method of searching ciphertext and system |
CN109923549B (en) * | 2016-08-24 | 2023-11-07 | 罗伯特·博世有限公司 | Searchable symmetric encryption system and method for processing inverted index |
CN109923549A (en) * | 2016-08-24 | 2019-06-21 | 罗伯特·博世有限公司 | Processing inverted index can search for symmetric encryption system and method |
CN108268558A (en) * | 2017-01-03 | 2018-07-10 | 中移(苏州)软件技术有限公司 | A kind of method and apparatus of data analysis |
CN106649880A (en) * | 2017-01-09 | 2017-05-10 | 北京中电普华信息技术有限公司 | Electric power statistical management system and method |
CN107291851B (en) * | 2017-06-06 | 2020-11-06 | 南京搜文信息技术有限公司 | Ciphertext index construction method based on attribute encryption and query method thereof |
CN107291851A (en) * | 2017-06-06 | 2017-10-24 | 南京搜文信息技术有限公司 | Ciphertext index building method and its querying method based on encryption attribute |
CN110771190A (en) * | 2017-06-22 | 2020-02-07 | 森特里克斯信息安全技术有限公司 | Controlling access to data |
CN107704768A (en) * | 2017-09-14 | 2018-02-16 | 上海海事大学 | A kind of multiple key classification safety search method of ciphertext |
CN111492354A (en) * | 2017-11-14 | 2020-08-04 | 斯诺弗雷克公司 | Database metadata in immutable storage |
CN108984627A (en) * | 2018-06-20 | 2018-12-11 | 顺丰科技有限公司 | Searching method, system, equipment and the storage medium of encrypted document based on Elasticsearch |
CN108897859A (en) * | 2018-06-29 | 2018-11-27 | 郑州云海信息技术有限公司 | A kind of metadata retrieval method, apparatus, equipment and computer readable storage medium |
CN109284290A (en) * | 2018-09-20 | 2019-01-29 | 佛山科学技术学院 | A kind of method for reading data based on distributed storage space |
CN109284290B (en) * | 2018-09-20 | 2022-04-26 | 佛山科学技术学院 | Data reading method based on distributed storage space |
CN109542895A (en) * | 2018-10-25 | 2019-03-29 | 北京开普云信息科技有限公司 | A kind of method for managing resource and system based on the customized extension of metadata |
CN110929302A (en) * | 2019-10-31 | 2020-03-27 | 东南大学 | Data security encryption storage method and storage device |
CN110929302B (en) * | 2019-10-31 | 2022-08-26 | 东南大学 | Data security encryption storage method and storage device |
US11188707B1 (en) | 2020-05-08 | 2021-11-30 | Bold Limited | Systems and methods for creating enhanced documents for perfect automated parsing |
US11537727B2 (en) | 2020-05-08 | 2022-12-27 | Bold Limited | Systems and methods for creating enhanced documents for perfect automated parsing |
US11281783B2 (en) | 2020-05-08 | 2022-03-22 | Bold Limited | Systems and methods for creating enhanced documents for perfect automated parsing |
EP3929797A1 (en) * | 2020-05-08 | 2021-12-29 | BOLD Limited | Systems and methods for creating enhanced documents for perfect automated parsing |
WO2021225687A1 (en) * | 2020-05-08 | 2021-11-11 | Bold Limited | Systems and methods for creating enhanced documents for perfect automated parsing |
EP3929798A1 (en) * | 2020-05-08 | 2021-12-29 | BOLD Limited | Systems and methods for creating enhanced documents for perfect automated parsing |
US11436377B2 (en) * | 2020-06-26 | 2022-09-06 | Ncr Corporation | Secure workload image distribution and management |
CN112052219A (en) * | 2020-08-05 | 2020-12-08 | 中国建设银行股份有限公司 | File storage and retrieval method and device, electronic equipment and readable storage medium |
CN112702379A (en) * | 2020-08-20 | 2021-04-23 | 纬领(青岛)网络安全研究院有限公司 | Full-secret search research for big data security |
CN112233666A (en) * | 2020-10-22 | 2021-01-15 | 中国科学院信息工程研究所 | Method and system for storing and retrieving Chinese voice ciphertext in cloud storage environment |
CN112417473A (en) * | 2020-11-20 | 2021-02-26 | 季速漫 | Big data security management system |
CN112733180A (en) * | 2021-04-06 | 2021-04-30 | 北京神州泰岳智能数据技术有限公司 | Data query method and device and electronic equipment |
CN113434877A (en) * | 2021-06-23 | 2021-09-24 | 平安国际智慧城市科技股份有限公司 | Method, device, equipment and storage medium for encrypting and decrypting user input data |
CN113254982B (en) * | 2021-07-13 | 2021-10-01 | 深圳市洞见智慧科技有限公司 | Secret track query method and system supporting keyword query |
CN113254982A (en) * | 2021-07-13 | 2021-08-13 | 深圳市洞见智慧科技有限公司 | Secret track query method and system supporting keyword query |
CN113642026A (en) * | 2021-08-31 | 2021-11-12 | 立信(重庆)数据科技股份有限公司 | Method and device for inquiring event processing data on block chain |
Also Published As
Publication number | Publication date |
---|---|
CN105678189B (en) | 2018-10-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||