CN105678189A - Encrypted data file storage and retrieval system and method - Google Patents

Info

Publication number
CN105678189A
Authority
CN
China
Prior art keywords
data file
cloud storage
storage system
encryption
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610025930.1A
Other languages
Chinese (zh)
Other versions
CN105678189B (en
Inventor
韩德志
毕坤
戴永涛
陈付梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University
Priority to CN201610025930.1A
Publication of CN105678189A
Application granted
Publication of CN105678189B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6209 Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141 Access rights, e.g. capability lists, access control lists, access tables, access matrices

Abstract

The invention discloses an encrypted data file storage and retrieval method. After content metadata is extracted from a data file, the data file is encrypted to generate an encrypted data file, which is stored in the storage devices of a cloud storage system. The file global identifier of the data file in its encrypted state is added to the content metadata, which is then stored in the content metadata database of the cloud storage system. When an encrypted data file stored in the cloud storage system is retrieved, the content metadata database is searched by an inverted index method to obtain the file global identifiers that match the search keyword, and the attribute information and content information of the encrypted data files corresponding to those identifiers are listed as the retrieval result. Because the content metadata is extracted before the data file is encrypted and carries the file global identifier of the encrypted file, encrypted data files stored in the cloud storage system can be retrieved through that identifier, which keeps retrieval convenient while protecting the security and privacy of data files in the cloud storage environment.

Description

Encrypted data file storage and retrieval system and method
Technical field
The present invention relates to the field of information security technology, and specifically to an encrypted data file storage and retrieval system and method based on a cloud storage system.
Background
Compared with traditional data file storage, cloud storage technology has many advantages:
(1) Low cost. Under the traditional approach, users must purchase a large amount of infrastructure such as servers and hard disks, and must also upgrade the equipment regularly. In a cloud storage environment, users no longer need to purchase this infrastructure, which saves the purchase cost on the one hand and reduces maintenance expenses on the other.
(2) Good scalability. Small and medium-sized enterprises find it difficult to estimate early on how much storage capacity they will need. Cloud storage solves this problem well: an enterprise can initially purchase only the capacity that meets its current needs and, as its business and data volume grow, dynamically add storage capacity without affecting existing data.
(3) Automatic data replication. Many users back up their data for safety, but backups are often cumbersome and raise their own problems of backup security and integrity protection. Cloud storage providers generally keep two or more copies of each data file, which ensures high availability of the data file and frees users from the worry of data backup.
(4) Automatic failover. When a traditional storage system is upgraded, data must be moved from the old storage to other storage servers and then migrated back once the new storage server is online. This can interrupt service and also carries a risk of data loss. These problems do not arise in a cloud storage environment: when the system detects an abnormality, it can automatically switch service to an available redundant storage cluster without affecting normal service or losing data.
Although cloud storage has many advantages, it also has shortcomings. The most prominent is a concern shared by a growing number of users: data stored in a cloud storage environment that is managed and controlled by others may be leaked, causing losses to individuals and companies. The current solution to this problem is to store data in the cloud storage system in encrypted form.
Encrypting data files before storage protects their privacy and security, but it also creates a problem: in many scenarios users need to retrieve data files according to specific content, and once the data files are encrypted, retrieval becomes either impossible or slow.
Summary of the invention
The present invention provides an encrypted data file storage and retrieval system and method that solve the problems of difficult and slow retrieval of encrypted data files, so that the required encrypted data file information can be retrieved quickly while the data files remain in an encrypted state.
To achieve the above object, the present invention provides an encrypted data file storage and retrieval system, characterized in that the system comprises:
A cloud storage system, which comprises a server side and storage devices. The server side includes a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module. The metadata management system connects to and manages a content metadata database, a system metadata database, and a storage metadata database. The storage devices store data files, which include encrypted data files and plaintext data files;
A client, which comprises a content metadata extraction module and a data file encryption module.
An encrypted data file storage and retrieval method, characterized in that the method comprises:
The client or the server side of the cloud storage system extracts the content metadata of a data file and then encrypts the data file to generate an encrypted data file. The encrypted data file and the corresponding content metadata are stored in the storage devices of the cloud storage system and in the content metadata database on the server side, respectively. The content metadata comprises the attribute information and content information of the data file and the file global identifier of the data file in its encrypted state;
When encrypted data files stored in the cloud storage system are retrieved, the retrieval module on the server side uses an inverted index method to search the content metadata database on the server side for the file global identifiers of the encrypted data files that match the search keyword, and returns the attribute information and content information of the encrypted data files corresponding to those file global identifiers as the retrieval result.
Where the client extracts the content metadata of the data file and then encrypts the file, generating the encrypted data file comprises:
The client extracts the content metadata of the data file;
The client encrypts the data file whose content metadata has been extracted, generating the encrypted data file;
The client uploads the encrypted data file and the corresponding content metadata to the server side of the cloud storage system.
Where the server side of the cloud storage system extracts the content metadata of the data file and then encrypts the file, generating the encrypted data file comprises:
The client uploads the data file to the server side of the cloud storage system;
The server side of the cloud storage system extracts the content metadata of the data file;
The server side of the cloud storage system encrypts the data file whose content metadata has been extracted, generating the encrypted data file.
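The two storage paths can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_encrypt` is a placeholder XOR stream cipher (not secure, and not the cipher of the invention), `extract_content_metadata` is a hypothetical keyword extractor, and `store_file` only records which side performs each step.

```python
import hashlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Placeholder XOR stream cipher keyed through SHA-256.

    NOT secure; it only stands in for a real encryption algorithm.
    Applying it twice with the same key restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def extract_content_metadata(name: str, data: bytes) -> dict:
    """Hypothetical extractor: file name plus the distinct words as keywords."""
    words = data.decode("utf-8", errors="ignore").split()
    keywords = sorted({w.lower() for w in words if w.isalpha()})
    return {"file_name": name, "keywords": keywords}


def store_file(name: str, data: bytes, key: bytes, encrypt_at_client: bool):
    """Run one of the two paths; metadata is always extracted before encryption."""
    steps = []
    side = "client" if encrypt_at_client else "server"
    if not encrypt_at_client:
        steps.append("client: upload plaintext file")
    metadata = extract_content_metadata(name, data)
    steps.append(f"{side}: extract content metadata")
    ciphertext = toy_encrypt(data, key)
    steps.append(f"{side}: encrypt file")
    if encrypt_at_client:
        steps.append("client: upload encrypted file and metadata")
    return ciphertext, metadata, steps
```

In both paths the keywords are taken from the plaintext, which is what later allows retrieval without decrypting the stored file.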
Extracting the content metadata of the data file comprises: the content metadata extraction module of the client or of the server side of the cloud storage system performs a preliminary analysis of the data file content according to the characteristics of the data file, extracts the attribute information and content information that reflect those characteristics, and adds the file global identifier of the encrypted data file to the content metadata.
After the content metadata of the data file has been extracted, the client can modify the content metadata stored in the content metadata database on the server side of the cloud storage system.
The server side of the cloud storage system stores the encrypted data file in a distributed manner across the storage devices of the cloud storage system, and stores the content metadata in the content metadata database of the cloud storage system.
Retrieving the encrypted data files stored on the server side of the cloud storage system comprises:
The client sends a retrieval request containing a search keyword, and the cloud storage system analyzes the retrieval request and determines whether the content of the search keyword is valid;
The information retrieval module of the cloud storage system performs a matching query on the content metadata database using the inverted index method, and obtains, as the retrieval result, the file global identifiers of the encrypted data files that match the search keyword together with the attribute information and content information of the corresponding data files;
The information retrieval module sorts the retrieval result and sends it to the client.
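The three retrieval steps (validate the keyword, look it up in an inverted index, sort the hits) can be sketched as follows. This is a plain inverted index, not the optimized variant of the invention, and the validity rule (non-empty, alphanumeric terms) and the ranking by keyword count are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(metadata_db: dict) -> dict:
    """Map each keyword to {fgid: occurrence count} over all content metadata."""
    index = defaultdict(dict)
    for fgid, meta in metadata_db.items():
        for keyword in meta["keywords"]:
            keyword = keyword.lower()
            index[keyword][fgid] = index[keyword].get(fgid, 0) + 1
    return index


def search(index: dict, metadata_db: dict, query: str) -> list:
    """Validate the query, match it against the index, and rank the hits."""
    terms = [t.lower() for t in query.split()]
    if not terms or not all(t.isalnum() for t in terms):
        raise ValueError("invalid search keyword")
    scores = defaultdict(int)
    for term in terms:
        for fgid, count in index.get(term, {}).items():
            scores[fgid] += count
    # Highest score first; ties are broken by FGID so the order is stable.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return [(fgid, metadata_db[fgid]["file_name"]) for fgid in ranked]
```

Only the content metadata database is consulted; the encrypted files themselves are never read or decrypted during the search.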
According to the retrieval result, the client can choose to download the encrypted data file corresponding to a file global identifier listed in the result;
If the data file was encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, which decrypts it;
If the data file was encrypted at the server side of the cloud storage system, the server side decrypts the encrypted data file and then passes it to the client.
The retrieval method for encrypted data files also comprises an optimization of the inverted index method, which comprises:
Through vertical splitting and horizontal shifting, the zero elements of the inverted index matrix of the data file content metadata are moved to the bottom and right of the matrix;
Then, through block clustering, the original high-dimensional sparse matrix is converted into several low-dimensional dense matrices;
When content metadata is retrieved, the low-dimensional matrices of the optimized sparse matrix are sent to different processing units in the cloud storage system for parallel processing.
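The text does not spell out the splitting and shifting procedure, so the following is only one possible reading, offered as a sketch: rows and columns are reordered by their number of nonzero entries so that empty rows sink to the bottom and empty columns move to the right, after which the all-zero border is trimmed. The function names `compact` and `dense_block` are hypothetical, and the sketch yields a single dense block rather than the several blocks produced by the block clustering step.

```python
def compact(matrix):
    """Reorder rows and columns so that nonzero entries gather at the top left.

    Returns the reordered matrix plus the row and column permutations, which
    are needed to map positions back to keywords and file identifiers.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_order = sorted(
        range(n_rows), key=lambda r: -sum(1 for v in matrix[r] if v)
    )
    col_order = sorted(
        range(n_cols),
        key=lambda c: -sum(1 for r in range(n_rows) if matrix[r][c]),
    )
    reordered = [[matrix[r][c] for c in col_order] for r in row_order]
    return reordered, row_order, col_order


def dense_block(reordered):
    """Trim all-zero rows at the bottom and all-zero columns at the right."""
    rows = [row for row in reordered if any(row)]
    width = max(
        (i + 1 for row in rows for i, v in enumerate(row) if v), default=0
    )
    return [row[:width] for row in rows]
```

For a 3 x 3 matrix with an empty row and an empty column, the trimmed result is a 2 x 2 block, which is the sense in which a sparse high-dimensional matrix becomes a smaller, denser one.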
Compared with prior-art storage and retrieval techniques for encrypted data files, the encrypted data file storage and retrieval system and method of the present invention have the following advantages. The invention establishes a novel content metadata structure that lets users retrieve encrypted data files from multiple angles and aspects, ensuring the security and privacy of data files in the cloud storage environment while keeping retrieval convenient;
In the present invention, all data files are saved in the cloud storage system in encrypted form, so that even if an encrypted data file is obtained, its content is not leaked without the decryption key;
The invention designs a novel inverted index method suited to content metadata retrieval, which can quickly retrieve the corresponding encrypted data files in the cloud storage system according to the keyword information the user provides at the client. This ensures the efficiency and precision of encrypted data file retrieval and solves the problem that such retrieval is difficult or slow in big data environments such as cloud storage;
The invention applies equally to the retrieval of encrypted data files and plaintext data files in the cloud storage system, and in both cases achieves fast retrieval and return of the retrieval result.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the encrypted data file storage and retrieval method of the present invention;
Fig. 2 is a diagram of the relationship among the three kinds of metadata;
Fig. 3 is a diagram of the content metadata structure;
Fig. 4 is a diagram of the storage metadata structure;
Fig. 5 is a flow chart of an embodiment of the encrypted data file storage and retrieval method based on a cloud storage system;
Fig. 6 is a diagram of the retrieval model for encrypted data files;
Fig. 7 is a schematic diagram of the inverted index of the content metadata;
Fig. 8 is a schematic diagram of the matrix representation of the inverted index of the content metadata;
Fig. 9 is a schematic diagram of the partitioning of the inverted index matrix of the content metadata and its parallel processing.
Detailed description of the embodiments
Specific embodiments of the present invention are further described below with reference to the accompanying drawings.
The present invention discloses an encrypted data file storage and retrieval system and method based on a cloud storage system. The attribute information and content information of the original data file, together with the file global identifier of the encrypted data file, are stored in a content metadata database, so that retrieval of encrypted data files can be completed without decrypting the original data files.
The technical principle of the present invention is as follows: (1) A special data file content metadata structure is designed, in which the content metadata includes the file global identifier (FGID) of the data file in its encrypted state. Before the client or the server side of the cloud storage system encrypts a data file, the content metadata of the data file is extracted automatically and stored in the content metadata database on the server side of the cloud storage system, providing the basis for retrieving the encrypted data file. (2) A data file can be encrypted either at the client or at the server side of the cloud storage system, and is then stored in a distributed manner in the storage devices of the cloud storage system, ensuring the security, privacy, high availability, and integrity of the data file. (3) A novel inverted index method ensures the retrieval speed over massive content metadata in the cloud storage system, thereby achieving fast retrieval of encrypted data files. The invention overcomes the defect of conventional approaches, in which encrypted data files are difficult to retrieve or are slow to retrieve because they must first be decrypted.
The encrypted data file storage and retrieval system based on a cloud storage system disclosed by the present invention comprises a cloud storage system and a client.
The cloud storage system comprises a server side and storage devices. The server side includes a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module. The metadata management system connects to and manages a content metadata database, a system metadata database, and a storage metadata database. The storage devices store data files, which include encrypted data files and plaintext data files. The client comprises a content metadata extraction module and a data file encryption module.
As shown in Fig. 1, an encrypted data file storage and retrieval method based on a cloud storage system is disclosed, comprising the following steps:
S100: The client authenticates the user through an authentication interface.
User authentication covers access control and the provision of user identity information. Access control restricts access by unauthorized users and is the first line of defense for data security in the cloud storage environment. User information provision: during the subsequent extraction of data file content metadata, user-related information and the access permission settings for the data must both be obtained from the user authentication information.
User identity authentication in the cloud storage system controls access by unauthorized users and rejects their operations, which protects the cloud storage system by keeping unauthorized users out. It also supplies the attributes needed for the subsequent extraction or generation of the relevant metadata and the information needed for access control over data files. In addition, it establishes the identity of legitimate users and limits the scope of their data access.
During authentication, it is determined whether the file storage flow or the file retrieval flow is to be carried out: for the file storage flow, go to S200; for the file retrieval flow, go to S300.
S200: The file storage flow is carried out on the user's data file, which is encrypted and stored in the cloud storage system.
S300: The file retrieval flow is carried out on encrypted data files, which are retrieved directly in the cloud storage system without being decrypted.
Data files include structured, semi-structured, and unstructured data files. Structured data files are the various traditional database files; unstructured data files are document files, picture files, audio files, video files, and the like; semi-structured data files are irregular database files, that is, database files with unstructured data embedded in them.
As shown in Fig. 2, the present invention divides the metadata in the cloud storage environment into three classes: system metadata, storage metadata, and content metadata. The three classes are kept in different metadata databases in the cloud, and the three classes of metadata of each encrypted data file are associated through the FGID of that data file. The FGID is the primary key, and also the foreign key, of the tables holding the three classes of metadata. It uniquely identifies an encrypted data file and is determined by the content of the encrypted data file, so it can also be used to check the data integrity of the encrypted data file. Each FGID is 128 bits long, which means it can represent 2^128 files.
System metadata includes the directory information and directory path names of the cloud storage system, the data file names under each directory, the file global identifier (FGID) of each encrypted data file, and information such as the attributes of data files and directories. It is generated automatically by the system after the user's encrypted data file is stored.
As shown in Fig. 3, content metadata is the key to data file retrieval and reflects the characteristics of the data file. It includes the attribute information and content information of the stored data file in its plaintext state, and the file global identifier (FGID) of the data file in its encrypted state. Attribute information includes the file name, creation time, creator, modification time, modifier, version information, file type, and so on; it gives an overall picture of the data file. Content information comprises characteristic information about the content of the data file: content summary, keywords, file aliases, file tags, remarks, purpose, content organization structure, compression method, and encoding format. The FGID item is added automatically by the cloud storage system after the encrypted data file has been stored.
As shown in Fig. 4, storage metadata includes the basic information and storage information of the encrypted data file and its file global identifier (FGID). Basic information includes the data file size, user ID, permitted operation types, replication factor, security attributes, and so on. Storage information includes the block ID and content address list of the encrypted data file, the block size list, and the base address of the mapping table between block physical addresses and content addresses. The user ID is the ID of the owner of the file; permitted operation types include read, write, modify, and so on; the replication factor is the number of backup copies of the data file; and the content address is the hash function value of the data block. Storage metadata is generated automatically by the system after the user's encrypted data file is stored.
In the metadata management system on the server side of the cloud storage system, the three classes of metadata of a data file are linked together by its file global identifier (FGID). The content metadata is extracted automatically by the system, and an authorized user can manually modify the content metadata in the content metadata database from the client over the network.
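As an illustration of how one FGID can tie the three classes of metadata together, the sketch below uses three SQLite tables keyed by `fgid`. The table layout, column names, and sample values are assumptions chosen for the example, and a single in-memory database stands in for the three separate metadata databases described in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE system_metadata  (fgid TEXT PRIMARY KEY, directory TEXT, file_name TEXT);
    CREATE TABLE storage_metadata (fgid TEXT PRIMARY KEY, size_bytes INTEGER, replicas INTEGER);
    CREATE TABLE content_metadata (fgid TEXT PRIMARY KEY, keywords TEXT, summary TEXT);
    """
)

# Hypothetical 128-bit FGID written as 32 hexadecimal digits.
fgid = "0123456789abcdef0123456789abcdef"
conn.execute(
    "INSERT INTO system_metadata VALUES (?, ?, ?)", (fgid, "/docs", "manifest.txt")
)
conn.execute("INSERT INTO storage_metadata VALUES (?, ?, ?)", (fgid, 4096, 2))
conn.execute(
    "INSERT INTO content_metadata VALUES (?, ?, ?)",
    (fgid, "cargo,port", "Cargo manifest"),
)

# The FGID is the only join key needed to assemble the full record of a file.
row = conn.execute(
    """
    SELECT s.file_name, st.replicas, c.keywords
    FROM system_metadata s
    JOIN storage_metadata st ON st.fgid = s.fgid
    JOIN content_metadata c  ON c.fgid  = s.fgid
    WHERE s.fgid = ?
    """,
    (fgid,),
).fetchone()
```

A search hit in the content metadata therefore leads directly to the storage location and directory entry of the same encrypted file.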
Encryption of data files: according to where the data file is encrypted, encryption is divided into client encryption and server-side encryption. The two modes suit different scenarios: server-side encryption in the cloud storage system may be chosen for data files with relatively high security requirements, and client encryption may be chosen for data files with very high security requirements.
Storage and management of metadata: the system metadata, storage metadata, and content metadata are stored in three specific metadata databases in the cloud storage system. These databases support mass data storage and efficient retrieval, can serve concurrent requests from a very large number of users, and provide automatic fault recovery and data backup, ensuring the security and high availability of the metadata.
Storage and management of encrypted data files: the cloud storage system uses distributed storage technology to store encrypted data files in the storage devices that it allocates to the user's virtual machine. The cloud storage system is based on a distributed, highly available storage system whose total capacity can be scaled horizontally by adding nodes; merging several physical data blocks into one larger logical storage space reduces data management overhead.
Retrieval of encrypted data files based on content metadata: the user's input is first analyzed to narrow the search range and determine which metadata files the retrieval may involve, which speeds up retrieval over the massive content metadata database in the cloud storage system. The retrieval operation does not require decrypting the encrypted data files, and the final retrieval result is an encrypted data file or a list of encrypted data files.
As shown in Fig. 5, in an embodiment of the encrypted data file storage and retrieval method based on a cloud storage system, the file storage flow (S200) comprises the following steps:
S201: After the user has been authenticated, the client determines whether the data file is to be encrypted at the client. If so, client encryption is used and the flow goes to S202; if not, server-side encryption (at the server side of the cloud storage system) is used and the flow goes to S205.
S202: The content metadata extraction module of the client automatically extracts the content metadata of the data file. Extraction of content metadata comprises automatic extraction and manual modification.
Automatic extraction: the content metadata extraction module of the client performs a preliminary analysis of the file content according to the characteristics of the file and extracts the attribute information and content information described above, which reflect the characteristics of the data file.
Manual modification: as the owner of the data file, the user has a fuller understanding of its kind, attributes, purpose, and features. For certain special data files, the user can manually modify, from the client over the network, the content metadata that the system extracted automatically. This describes the data file more accurately and can improve the accuracy and efficiency of retrieval.
Here, the content metadata of the data file is extracted automatically by the content metadata extraction module of the client before the data file is encrypted, and is sent to the content metadata database on the server side of the cloud storage system to be saved. An authorized user can edit the content metadata stored in the content metadata database so that it better matches the user's retrieval habits, which can improve retrieval accuracy and efficiency.
S203: The encryption module of the client encrypts the data file: the data file whose content metadata has been extracted is encrypted with the user's private key, with symmetric-key encryption, or with another encryption algorithm, generating the encrypted data file.
S204: The client uploads the encrypted data file and the corresponding content metadata to the server side of the cloud storage system, and the flow goes to S208.
S205: The client uploads the data file in plaintext form to the server side of the cloud storage system.
S206: The content metadata extraction module on the server side of the cloud storage system automatically extracts the content metadata of the data file. Extraction of content metadata comprises automatic extraction and manual modification.
Automatic extraction: the content metadata extraction module on the server side of the cloud storage system performs a preliminary analysis of the file content according to the characteristics of the file and extracts the attribute information and content information that reflect the characteristics of the data file.
Manual modification: as the owner of the data file, the user has a fuller understanding of its kind, attributes, purpose, and features. For certain special data files, the user can manually modify, from the client over the network, the content metadata that the system extracted automatically. This describes the data file more accurately and can improve the accuracy and efficiency of retrieval.
Here, the content metadata of the data file is extracted automatically by the content metadata extraction module on the server side of the cloud storage system before the data file is encrypted, and is saved in the content metadata database on the server side of the cloud storage system. An authorized user can edit the content metadata stored in the content metadata database so that it better matches the user's retrieval habits, which can improve retrieval accuracy and efficiency.
S207: The encryption module on the server side of the cloud storage system encrypts the data file whose content metadata has been extracted with the user's private key, with symmetric-key encryption, or with another encryption algorithm, generating the encrypted data file. Encrypting the data file ensures its security. The flow then goes to S208.
S208: The server side of the cloud storage system stores the encrypted data file, whether generated by client encryption or by server-side encryption, in the corresponding storage device of the cloud storage system.
After the server side of the cloud storage system obtains the user's encrypted data file, it stores the encrypted data file in the storage device corresponding to the user's virtual machine.
Each cloud tenant (user) uses the cloud storage system in units of user virtual machines.
At the same time, the encryption module of the cloud storage system uses the MD5 algorithm to generate the file global identifier (FGID) of the encrypted data file and adds it to the content metadata of that data file. The FGID is the unique identifier of the encrypted data file and can also be used to check its data integrity.
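A minimal sketch of FGID generation and the integrity check, using the MD5 function from Python's standard `hashlib` module. The function names `make_fgid` and `verify_integrity` are hypothetical; the text specifies only that MD5 is applied to the encrypted data file.

```python
import hashlib


def make_fgid(encrypted: bytes) -> str:
    """Return the MD5 digest of the encrypted file as 32 hexadecimal digits."""
    return hashlib.md5(encrypted).hexdigest()


def verify_integrity(encrypted: bytes, fgid: str) -> bool:
    """Recompute the digest and compare it with the stored FGID."""
    return make_fgid(encrypted) == fgid
```

An MD5 digest is 128 bits long, which matches the stated FGID length. MD5 is no longer considered collision resistant, so a check of this kind detects accidental corruption rather than deliberate tampering.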
After the encrypted data file has been stored, the cloud storage system automatically generates its system metadata and storage metadata from the directory information, the storage location information, and the FGID of the encrypted data file.
S209: The content metadata extracted before encryption is stored in the content metadata database of the cloud storage system.
Further, after the content metadata has been stored in the content metadata database of the cloud storage system, the database can be updated: an authorized user can modify the content metadata in it from the client over the network, so that encrypted data files can be retrieved more accurately.
Building on the encrypted data file storage process of S200, retrieval of encrypted data files can be implemented. Fig. 6 shows the retrieval model for encrypted data files based on content metadata. Its general flow is as follows: (1) the client uploads the retrieval content to the information retrieval module of the cloud storage system; (2) the information retrieval module uses the inverted index method to query the content metadata database managed by the metadata management system for information matching the retrieval content; (3) the metadata management system sends the file global identifier (FGID) and corresponding content metadata of each matching encrypted data file found in step (2) to the information retrieval module; (4) the information retrieval module sorts the results returned by the metadata management system and sends the sorted result list to the client; (5) at the client, the user selects the file to download from the result list, and the FGID of the selected file is sent to the information retrieval module; (6) the information retrieval module uses the FGID to look up the corresponding storage location information of the encrypted data file in the storage metadata database managed by the metadata management system; (7) the metadata management system sends the storage location information found in step (6) to the distributed storage system; (8) the distributed storage system fetches the encrypted data file according to the storage location information and sends it to the encryption/decryption module; (9) the encryption/decryption module decrypts the encrypted data file and passes it to the client, which completes the retrieval of the encrypted data file.
In Fig. 6, if the data file is encrypted at the server end, the encryption/decryption module of the model resides at the server end of the cloud storage system; if the data file is encrypted at the client, the encryption/decryption module resides at the client.
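The download half of this flow, steps (5) to (9), together with the choice of where decryption happens, might look roughly like the sketch below. All names (`StoredFile`, `download`, `xor_cipher`) are hypothetical, and the XOR routine is only a placeholder standing in for a real cipher, which the text does not name.

```python
from dataclasses import dataclass


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher (NOT secure); XOR with the same key is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass
class StoredFile:
    location: str      # storage location kept in the storage metadata database
    encrypted_at: str  # "client" or "server" (assumed field)


def download(fgid, storage_metadata, blobs, server_key):
    """Return (payload, needs_client_decryption) for the selected FGID."""
    entry = storage_metadata[fgid]      # step (6): FGID -> storage location
    ciphertext = blobs[entry.location]  # steps (7)-(8): fetch from distributed storage
    if entry.encrypted_at == "server":
        # Step (9) with a server-side module: decrypt before sending.
        return xor_cipher(ciphertext, server_key), False
    # Client-side module: send the ciphertext as is; the client decrypts it.
    return ciphertext, True


server_key, client_key = b"srv", b"cli"
blobs = {
    "node1/a": xor_cipher(b"server-encrypted file", server_key),
    "node2/b": xor_cipher(b"client-encrypted file", client_key),
}
storage_metadata = {
    "FGID-1": StoredFile("node1/a", "server"),
    "FGID-2": StoredFile("node2/b", "client"),
}

plain1, pending1 = download("FGID-1", storage_metadata, blobs, server_key)
payload2, pending2 = download("FGID-2", storage_metadata, blobs, server_key)
plain2 = xor_cipher(payload2, client_key) if pending2 else payload2
```

In the client-encrypted case the server never needs the client key, so the file stays opaque to the cloud storage system end to end.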
Fig. 5 shows an embodiment of the encrypted data file storage and retrieval method based on the cloud storage system, in which the file retrieval flow (S300) comprises the following steps:
S301: The client receives a query request, which includes search keywords, through the query interface, and uploads the query request to the cloud storage system.
The information retrieval module of the cloud storage system analyzes the query request submitted by the client and determines whether the content of the search keywords in the request is legitimate.
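The text does not say what makes a search keyword legitimate. As one possible reading, the sketch below rejects empty queries, overly long keywords, and keywords containing characters outside a small allowed set. The limits, the allowed character set, and the function name `validate_query` are all assumptions made for illustration.

```python
import re

MAX_KEYWORDS = 8         # assumed limit, not taken from the patent
MAX_KEYWORD_LENGTH = 64  # assumed limit, not taken from the patent
_ALLOWED = re.compile(r"^[\w\-]+$")  # word characters and hyphens only (assumed rule)


def validate_query(raw_query: str) -> list:
    """Split a query into keywords and reject obviously invalid input."""
    keywords = raw_query.split()
    if not keywords:
        raise ValueError("query contains no search keyword")
    if len(keywords) > MAX_KEYWORDS:
        raise ValueError("too many search keywords")
    for word in keywords:
        if len(word) > MAX_KEYWORD_LENGTH or not _ALLOWED.match(word):
            raise ValueError(f"invalid search keyword: {word!r}")
    # Normalize case so that matching against the index is case-insensitive.
    return [word.lower() for word in keywords]


accepted = validate_query("Encrypted  Storage")
```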
S302, data retrieval: The information retrieval module of the cloud storage system runs a matching query against the content metadata database using a novel inverted-file retrieval algorithm, and returns the file global identifier (FGID) and part of the content metadata of each encrypted data file that satisfies the query.
The novel inverted index method is an improved inverted index method that is suited to fast retrieval over the massive volume of content metadata held in the content metadata database of the cloud storage system.
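For orientation, a plain (unoptimized) inverted index over content metadata can be sketched as below. This is a textbook inverted index, not the improved method of the invention, and the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def build_inverted_index(content_metadata):
    """Map each keyword to {FGID: occurrence count}."""
    index = defaultdict(dict)
    for fgid, keywords in content_metadata.items():
        for keyword, count in Counter(keywords).items():
            index[keyword][fgid] = count
    return index


def search(index, keywords):
    """Return the FGIDs whose content metadata contains every keyword."""
    postings = [set(index.get(k, {})) for k in keywords]
    return set.intersection(*postings) if postings else set()


content_metadata = {
    "ID1": ["cloud", "storage", "cloud"],
    "ID2": ["encryption", "storage"],
    "ID3": ["cloud", "encryption"],
}
index = build_inverted_index(content_metadata)
hits = search(index, ["cloud", "storage"])
```

Because the index is built from content metadata rather than from file bodies, the query never touches the encrypted files themselves; only FGIDs come back.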
Figs. 7 and 8 show the inverted index of data file content metadata disclosed in the present invention and its matrix representation. Keyword 1, keyword 2, ... denote content metadata items in the content metadata database; ID1, ID2, ..., IDn denote the file global identifiers (FGIDs) of encrypted data files; and each row-column entry in Fig. 8 gives the number of times a given keyword occurs in a data file. As Fig. 8 shows, the inverted index matrix of data file content metadata is a sparse matrix. To speed up keyword retrieval over the massive content metadata database of the cloud storage system, the matrix is optimized as follows: vertical splitting and horizontal shifting move the zero elements to the bottom and right of the matrix, and block clustering then converts the original high-dimensional sparse matrix into a set of low-dimensional dense matrices. During content metadata retrieval, these low-dimensional matrices are dispatched to different processing units of the cloud storage system and processed in parallel, which can greatly increase keyword retrieval speed over the massive content metadata database. The principle is shown in Fig. 9, where M1, M2, ..., Mn denote the low-dimensional dense matrices and P1, P2, ..., Pn denote the parallel processing units of the cloud storage system.
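The idea of cutting a sparse keyword-by-file matrix into smaller, denser blocks that are searched in parallel can be illustrated with the simplified sketch below. It is not the splitting, shifting, and block clustering procedure of Figs. 8 and 9: it merely chunks rows and drops all-zero columns per chunk, and worker threads stand in for the processing units P1, ..., Pn. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(matrix, keywords, fgids, rows_per_block):
    """Cut the matrix into row blocks and drop each block's all-zero columns."""
    blocks = []
    for start in range(0, len(matrix), rows_per_block):
        rows = matrix[start:start + rows_per_block]
        keep = [c for c in range(len(fgids)) if any(row[c] for row in rows)]
        blocks.append({
            "keywords": keywords[start:start + rows_per_block],
            "fgids": [fgids[c] for c in keep],
            "counts": [[row[c] for c in keep] for row in rows],
        })
    return blocks


def lookup(block, keyword):
    """Search one block; return {FGID: count} for the keyword, or {}."""
    if keyword not in block["keywords"]:
        return {}
    row = block["counts"][block["keywords"].index(keyword)]
    return {f: n for f, n in zip(block["fgids"], row) if n}


def parallel_lookup(blocks, keyword):
    """Hand each block to its own worker and merge the partial results."""
    merged = {}
    with ThreadPoolExecutor(max_workers=max(1, len(blocks))) as pool:
        for partial in pool.map(lambda b: lookup(b, keyword), blocks):
            merged.update(partial)
    return merged


keywords = ["cloud", "storage", "encryption", "index"]
fgids = ["ID1", "ID2", "ID3", "ID4"]
matrix = [
    [2, 0, 0, 0],  # cloud
    [1, 1, 0, 0],  # storage
    [0, 0, 3, 0],  # encryption
    [0, 0, 1, 2],  # index
]
blocks = split_into_blocks(matrix, keywords, fgids, rows_per_block=2)
result = parallel_lookup(blocks, "index")
```

In this toy case the 4x4 matrix with ten zero entries becomes two 2x2 blocks with one zero each, so every worker scans far fewer empty cells.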
S303: The cloud storage system determines whether the retrieval succeeded. If so, it proceeds to step S304; if not, it sends the client a message stating that no matching encrypted data file was found, and proceeds to S305.
S304: The cloud storage system locates the corresponding encrypted data files according to their file global identifiers (FGIDs).
At the same time, the information retrieval module of the cloud storage system sorts the results returned by the retrieval in S302 and sends the sorted retrieval results to the client.
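The sorting criterion is not specified in the text. One simple possibility, assumed here purely for illustration, is to order matching files by how often the search keywords occur in their content metadata; the function name `rank_results` and the sample index are hypothetical.

```python
def rank_results(index, keywords, fgids):
    """Order matching FGIDs by total keyword occurrences, highest first."""
    def score(fgid):
        return sum(index.get(k, {}).get(fgid, 0) for k in keywords)
    # Ties are broken by FGID so that the order is deterministic.
    return sorted(fgids, key=lambda f: (-score(f), f))


index = {
    "cloud": {"ID1": 2, "ID3": 1},
    "storage": {"ID1": 1, "ID2": 4},
}
ranked = rank_results(index, ["cloud", "storage"], {"ID1", "ID2", "ID3"})
```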
S305: The client receives the retrieval results and presents them to the user. If the results include the file global identifiers of encrypted data files that match the retrieval content, the retrieval has succeeded, and the user can obtain the required encrypted data file information at the client from the results.
In addition, based on the retrieval results, the client can choose to download the encrypted data file corresponding to any file global identifier listed in the results.
If the data file was originally encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, which decrypts it. If the data file was originally encrypted at the server end of the cloud storage system, the cloud storage system decrypts it and then passes it to the client.
Although the present invention has been described in detail through the preferred embodiments above, the description should not be regarded as limiting the invention. Various modifications and substitutions will be apparent to those skilled in the art after reading the foregoing. The scope of protection of the present invention is therefore defined by the appended claims.

Claims (10)

1. An encrypted data file storage and retrieval system, characterized in that the system comprises:
a cloud storage system, comprising a server end and a storage device; the server end comprises a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module; the metadata management system connects to and manages a content metadata database, a system metadata database, and a storage metadata database; the storage device is used to store data files, the data files comprising encrypted data files and plaintext data files;
a client, comprising a content metadata extraction module and a data file encryption module.
2. An encrypted data file storage and retrieval method, characterized in that the method comprises:
extracting the content metadata of a data file at the client or at the cloud storage system server end and then encrypting the data file to generate an encrypted data file, the encrypted data file and its corresponding content metadata being stored, respectively, in the storage device of the cloud storage system and in the content metadata database at the server end; the content metadata comprises attribute information and content information of the data file and the file global identifier of the data file in its encrypted state;
when retrieving encrypted data files stored in the cloud storage system, the encrypted data file retrieval module at the server end uses an inverted index method to search the content metadata database at the server end for the file global identifiers, in the encrypted state, of data files matching the search keywords, and returns as the retrieval result the attribute information and content information of the encrypted data files corresponding to those file global identifiers.
3. The encrypted data file storage and retrieval method as claimed in claim 2, characterized in that the step in which the client extracts the content metadata of the data file and then encrypts it to generate the encrypted data file comprises:
the client extracts the content metadata of the data file;
the client encrypts the data file from which the content metadata has been extracted, generating the encrypted data file;
the client uploads the encrypted data file and its corresponding content metadata to the cloud storage system server end.
4. The encrypted data file storage and retrieval method as claimed in claim 2, characterized in that the step in which the cloud storage system server end extracts the content metadata of the data file and then encrypts it to generate the encrypted data file comprises:
the client uploads the data file to the cloud storage system server end;
the cloud storage system server end extracts the content metadata of the data file;
the cloud storage system server end encrypts the data file from which the content metadata has been extracted, generating the encrypted data file.
5. The encrypted data file storage and retrieval method as claimed in claim 2, 3 or 4, characterized in that extracting the content metadata of the data file comprises: the content metadata extraction module of the client or of the cloud storage system server end performs a preliminary analysis of the data file content according to the characteristics of the data file, extracts attribute information and content information that reflect the characteristics of the data file, and adds the file global identifier of the encrypted data file to the content metadata.
6. The encrypted data file storage and retrieval method as claimed in claim 2, 3 or 4, characterized in that, after the content metadata of the data file has been extracted, the client can modify the content metadata stored in the content metadata database at the cloud storage system server end.
7. The encrypted data file storage and retrieval method as claimed in claim 2, 3 or 4, characterized in that the cloud storage system server end stores the encrypted data files in distributed fashion on the storage devices of the cloud storage system, and stores the content metadata in the content metadata database of the cloud storage system.
8. The encrypted data file storage and retrieval method as claimed in claim 2, characterized in that retrieving the encrypted data files stored at the cloud storage system server end comprises:
the client sends a retrieval request containing search keywords, and the cloud storage system analyzes the retrieval request and determines the legitimacy of the content of the search keywords in the request;
the information retrieval module of the cloud storage system runs a matching query against the content metadata database using the inverted index method, and obtains as the retrieval result the file global identifiers, in the encrypted state, of the data files matching the search keywords, together with the attribute information and content information of the data files corresponding to those file global identifiers;
the information retrieval module sorts the retrieval results and then sends them to the client.
9. The encrypted data file storage and retrieval method as claimed in claim 8, characterized in that, based on the retrieval results, the client can choose to download the encrypted data file corresponding to a file global identifier listed in the retrieval results;
if the data file was encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, which decrypts it;
if the data file was encrypted at the cloud storage system server end, the cloud storage system server end decrypts the encrypted data file and then passes it to the client.
10. The encrypted data file storage and retrieval method as claimed in claim 2, 8 or 9, characterized in that the retrieval method for encrypted data files further comprises an optimization of the inverted index method, the optimization comprising:
through vertical splitting and horizontal shifting, moving the zero elements of the inverted index matrix of data file content metadata to the bottom and right of the matrix;
then, through block clustering, converting the original high-dimensional sparse matrix into several low-dimensional dense matrices;
during content metadata retrieval, dispatching the several low-dimensional matrices of the optimized sparse matrix to different processing units of the cloud storage system for parallel processing.
CN201610025930.1A 2016-01-15 2016-01-15 Encrypted data file storage and retrieval system and method Active CN105678189B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610025930.1A CN105678189B (en) 2016-01-15 2016-01-15 Data file encryption storage and retrieval system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610025930.1A CN105678189B (en) 2016-01-15 2016-01-15 Data file encryption storage and retrieval system and method

Publications (2)

Publication Number Publication Date
CN105678189A true CN105678189A (en) 2016-06-15
CN105678189B CN105678189B (en) 2018-10-23

Family

ID=56300884

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610025930.1A Active CN105678189B (en) 2016-01-15 2016-01-15 Encrypted data file storage and retrieval system and method

Country Status (1)

Country Link
CN (1) CN105678189B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101770462A (en) * 2008-12-30 2010-07-07 日电(中国)有限公司 Device for ciphertext index and search and method thereof
CN102024054A (en) * 2010-12-10 2011-04-20 中国科学院软件研究所 Ciphertext cloud-storage oriented document retrieval method and system
US20120016843A1 (en) * 2003-05-22 2012-01-19 Carmenso Data Limited Liability Company Information Source Agent Systems and Methods for Backing Up Files To a Repository Using File Identicality
CN103442057A (en) * 2013-08-27 2013-12-11 玉林师范学院 Cloud storage system based on user collaboration cloud

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106131013A (en) * 2016-07-06 2016-11-16 杨炳 A kind of protecting data encryption system
CN106302472A (en) * 2016-08-09 2017-01-04 厦门乐享新网络科技有限公司 The hidden method of information and device
CN106302472B (en) * 2016-08-09 2019-12-24 厦门乐享新网络科技有限公司 Information hiding method and device
CN106302449A (en) * 2016-08-15 2017-01-04 中国科学院信息工程研究所 A kind of ciphertext storage cloud service method open with searching ciphertext and system
CN106302449B (en) * 2016-08-15 2019-10-11 中国科学院信息工程研究所 A kind of storage of ciphertext and the open cloud service method of searching ciphertext and system
CN109923549B (en) * 2016-08-24 2023-11-07 罗伯特·博世有限公司 Searchable symmetric encryption system and method for processing inverted index
CN109923549A (en) * 2016-08-24 2019-06-21 罗伯特·博世有限公司 Processing inverted index can search for symmetric encryption system and method
CN108268558A (en) * 2017-01-03 2018-07-10 中移(苏州)软件技术有限公司 A kind of method and apparatus of data analysis
CN106649880A (en) * 2017-01-09 2017-05-10 北京中电普华信息技术有限公司 Electric power statistical management system and method
CN107291851B (en) * 2017-06-06 2020-11-06 南京搜文信息技术有限公司 Ciphertext index construction method based on attribute encryption and query method thereof
CN107291851A (en) * 2017-06-06 2017-10-24 南京搜文信息技术有限公司 Ciphertext index building method and its querying method based on encryption attribute
CN110771190A (en) * 2017-06-22 2020-02-07 森特里克斯信息安全技术有限公司 Controlling access to data
CN107704768A (en) * 2017-09-14 2018-02-16 上海海事大学 A kind of multiple key classification safety search method of ciphertext
CN111492354A (en) * 2017-11-14 2020-08-04 斯诺弗雷克公司 Database metadata in immutable storage
CN108984627A (en) * 2018-06-20 2018-12-11 顺丰科技有限公司 Searching method, system, equipment and the storage medium of encrypted document based on Elasticsearch
CN108897859A (en) * 2018-06-29 2018-11-27 郑州云海信息技术有限公司 A kind of metadata retrieval method, apparatus, equipment and computer readable storage medium
CN109284290A (en) * 2018-09-20 2019-01-29 佛山科学技术学院 A kind of method for reading data based on distributed storage space
CN109284290B (en) * 2018-09-20 2022-04-26 佛山科学技术学院 Data reading method based on distributed storage space
CN109542895A (en) * 2018-10-25 2019-03-29 北京开普云信息科技有限公司 A kind of method for managing resource and system based on the customized extension of metadata
CN110929302A (en) * 2019-10-31 2020-03-27 东南大学 Data security encryption storage method and storage device
CN110929302B (en) * 2019-10-31 2022-08-26 东南大学 Data security encryption storage method and storage device
US11188707B1 (en) 2020-05-08 2021-11-30 Bold Limited Systems and methods for creating enhanced documents for perfect automated parsing
US11537727B2 (en) 2020-05-08 2022-12-27 Bold Limited Systems and methods for creating enhanced documents for perfect automated parsing
US11281783B2 (en) 2020-05-08 2022-03-22 Bold Limited Systems and methods for creating enhanced documents for perfect automated parsing
EP3929798A1 (en) * 2020-05-08 2021-12-29 BOLD Limited Systems and methods for creating enhanced documents for perfect automated parsing
WO2021225687A1 (en) * 2020-05-08 2021-11-11 Bold Limited Systems and methods for creating enhanced documents for perfect automated parsing
EP3929797A1 (en) * 2020-05-08 2021-12-29 BOLD Limited Systems and methods for creating enhanced documents for perfect automated parsing
US11436377B2 (en) * 2020-06-26 2022-09-06 Ncr Corporation Secure workload image distribution and management
CN112052219A (en) * 2020-08-05 2020-12-08 中国建设银行股份有限公司 File storage and retrieval method and device, electronic equipment and readable storage medium
CN112702379A (en) * 2020-08-20 2021-04-23 纬领(青岛)网络安全研究院有限公司 Full-secret search research for big data security
CN112233666A (en) * 2020-10-22 2021-01-15 中国科学院信息工程研究所 Method and system for storing and retrieving Chinese voice ciphertext in cloud storage environment
CN112417473A (en) * 2020-11-20 2021-02-26 季速漫 Big data security management system
CN112733180A (en) * 2021-04-06 2021-04-30 北京神州泰岳智能数据技术有限公司 Data query method and device and electronic equipment
CN113434877A (en) * 2021-06-23 2021-09-24 平安国际智慧城市科技股份有限公司 Method, device, equipment and storage medium for encrypting and decrypting user input data
CN113254982B (en) * 2021-07-13 2021-10-01 深圳市洞见智慧科技有限公司 Secret track query method and system supporting keyword query
CN113254982A (en) * 2021-07-13 2021-08-13 深圳市洞见智慧科技有限公司 Secret track query method and system supporting keyword query
CN113642026A (en) * 2021-08-31 2021-11-12 立信(重庆)数据科技股份有限公司 Method and device for inquiring event processing data on block chain

Also Published As

Publication number Publication date
CN105678189B (en) 2018-10-23

Similar Documents

Publication Publication Date Title
CN105678189A (en) Encrypted data file storage and retrieval system and method
US10223544B1 (en) Content aware hierarchical encryption for secure storage systems
CN110647497A (en) HDFS-based high-performance file storage and management system
JP5822452B2 (en) Storage service providing apparatus, system, service providing method, and service providing program
CN103544261B (en) A kind of magnanimity structuring daily record data global index's management method and device
WO2018201583A1 (en) File management method and system, electronic device, and medium
CN102024054A (en) Ciphertext cloud-storage oriented document retrieval method and system
CN104765848A (en) Symmetrical searchable encryption method for supporting result high-efficiency sequencing in hybrid cloud storage
JP2008517354A (en) A computer with a method of building an encrypted database index for database table search
WO2011023134A1 (en) Method and system for managing distributed storage system through virtual file system
CN103955537A (en) Method and system for designing searchable encrypted cloud disc with fuzzy semantics
CN102457555A (en) Security system and method for distributed storage
CN103812939A (en) Big data storage system
US11256662B2 (en) Distributed ledger system
CN104408111A (en) Method and device for deleting duplicate data
CN103970889A (en) Security cloud disc for Chinese and English keyword fuzzy search
CN104778192A (en) Representing directory structure in content-addressable storage systems
JP5236129B2 (en) Storage service providing apparatus, system, service providing method, and service providing program
CN103366008A (en) Resource searching method and device
CN103414555A (en) Array key management method based on IO block encryption
CN116069729B (en) Intelligent document packaging method, system and medium
Cao Design of digital library service platform based on cloud computing
JP5174255B2 (en) Storage service providing apparatus, system, service providing method, and service providing program
WO2014114987A1 (en) Personal device encryption
JP6033370B2 (en) Storage service providing apparatus, system, service providing method, and service providing program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant