CN105678189B - Data file encryption storage and retrieval system and method - Google Patents

Data file encryption storage and retrieval system and method

Info

Publication number
CN105678189B
Authority
CN
China
Prior art keywords
data file
encryption
cloud storage
storage system
content metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610025930.1A
Other languages
Chinese (zh)
Other versions
CN105678189A (en)
Inventor
韩德志
毕坤
戴永涛
陈付梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University
Priority to CN201610025930.1A
Publication of CN105678189A
Application granted
Publication of CN105678189B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6209 Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141 Access rights, e.g. capability lists, access control lists, access tables, access matrices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses an encrypted data file storage and retrieval method, comprising: extracting content metadata from a data file and then encrypting the file to generate an encrypted data file, which is stored in a storage device of a cloud storage system; adding the file global identifier of the data file in its encrypted state to the content metadata and storing the content metadata in the content metadata library of the cloud storage system; and, when encrypted data files stored in the cloud storage system are retrieved, searching the content metadata library by an inverted index method to obtain the file global identifiers that match the search keyword, and listing the attribute information and content information of the encrypted data files corresponding to those identifiers as the retrieval result. Because the content metadata is extracted before the data file is encrypted and the file global identifier of the encrypted file is added to it, the encrypted data files stored in the cloud storage system can be retrieved by file global identifier. This ensures the security and privacy of data files in the cloud storage environment while keeping data file retrieval convenient.

Description

Data file encryption storage and retrieval system and method
Technical field
The present invention relates to the field of information security technology, and in particular to an encrypted data file storage and retrieval system and method based on a cloud storage system.
Background
Compared with traditional data file storage, cloud storage technology has many advantages:
(1) Low cost. In the traditional approach, users must buy a large amount of infrastructure such as servers and hard disks and must also upgrade the equipment periodically. In a cloud storage environment, users no longer need to buy this infrastructure, which saves the purchase cost and also reduces maintenance expenses.
(2) Good scalability. Small and medium-sized enterprises find it difficult to estimate in advance how much storage capacity they will need. Cloud storage solves this problem well: an enterprise can initially buy only the capacity that meets its current needs and, as the business and data volume grow, increase capacity dynamically without affecting existing data.
(3) Automatic data replication. For data safety, many users back up their data, but backup is often cumbersome and raises its own problems of protecting the security and integrity of the backup data. Cloud storage providers generally keep two or more copies of each data file, which fully ensures high availability of the data file and frees users from the burden of data backup.
(4) Automatic failover. When a traditional storage system is upgraded, data must be migrated from the old storage to another storage server and then migrated back once the new storage server is online. This can interrupt service and also brings a risk of data loss. These problems do not exist in a cloud storage environment: when the system detects an abnormality, it automatically switches service to an available redundant storage cluster, without affecting normal service and without losing data.
Although cloud storage has many advantages, it also has shortcomings. The most prominent is a concern shared by more and more users: data stored in a cloud storage environment that is managed and controlled by others may be leaked, causing losses to individuals and companies. The current solution to this kind of problem is to store data in the cloud storage system in encrypted form.
Storing data files in encrypted form protects their privacy and security, but it also introduces a problem: in many scenarios users need to retrieve data files according to specific content, and if the data files are encrypted, retrieval becomes impossible or slow.
Summary of the invention
The present invention provides an encrypted data file storage and retrieval system and method that solve the problems of difficult and slow retrieval of encrypted data files: the required encrypted data file information can be retrieved quickly while the data files remain in an encrypted state.
To achieve the above object, the present invention provides an encrypted data file storage and retrieval system, characterized in that the system comprises:
A cloud storage system, comprising a server end and a storage device. The server end comprises a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module. The metadata management system connects to and manages a content metadata library, a system metadata library, and a storage metadata library. The storage device stores data files, which include encrypted data files and plaintext data files;
A client, comprising a content metadata extraction module and a data file encryption module.
An encrypted data file storage and retrieval method, characterized in that the method comprises:
The client or the cloud storage system server end extracts the content metadata of a data file and then encrypts the file to generate an encrypted data file. The encrypted data file and its corresponding content metadata are stored in the storage device of the cloud storage system and in the content metadata library at the server end, respectively. The content metadata comprises the attribute information and content information of the data file and the file global identifier of the data file in its encrypted state;
When encrypted data files stored in the cloud storage system are retrieved, the encrypted data file retrieval module at the server end searches the content metadata library at the server end by an inverted index method, obtains the file global identifiers (in the encrypted state) of the data files that match the search keyword, and returns the attribute information and content information of the corresponding encrypted data files as the retrieval result.
The method in which the client extracts the content metadata of the data file and then encrypts the file to generate the encrypted data file comprises:
The client extracts the content metadata of the data file;
The client encrypts the data file whose content metadata has been extracted, generating the encrypted data file;
The client uploads the encrypted data file and the corresponding content metadata to the cloud storage system server end.
The method in which the cloud storage system server end extracts the content metadata of the data file and then encrypts the file to generate the encrypted data file comprises:
The client uploads the data file to the cloud storage system server end;
The cloud storage system server end extracts the content metadata of the data file;
The cloud storage system server end encrypts the data file whose content metadata has been extracted, generating the encrypted data file.
Extracting the content metadata of the data file comprises: the content metadata extraction module of the client or of the cloud storage system server end performs a preliminary analysis of the data file content according to the characteristics of the data file, extracts the attribute information and content information that reflect those characteristics, and adds the file global identifier of the encrypted data file to the content metadata.
After the content metadata of the data file has been extracted, the client can modify the content metadata stored in the content metadata library at the cloud storage system server end.
The cloud storage system server end stores the encrypted data file in a distributed manner in the storage device of the cloud storage system and stores the content metadata in the content metadata library of the cloud storage system.
Retrieving the encrypted data files stored at the cloud storage system server end comprises:
The client sends a retrieval request containing a search keyword; the cloud storage system parses the retrieval request and checks the legitimacy of the search keyword content in the request;
The information retrieval module of the cloud storage system performs a matching query on the content metadata library by the inverted index method and obtains, as the retrieval result, the file global identifiers (in the encrypted state) of the data files that match the search keyword, together with the attribute information and content information of the corresponding data files;
The information retrieval module sorts the retrieval result and sends it to the client.
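The keyword lookup described in the steps above can be illustrated with a minimal sketch. It assumes the content metadata library is a plain in-memory mapping from file global identifier to a metadata record, and it ranks results by the number of matched query terms. The function names, the record layout, and the ranking rule are hypothetical choices made for illustration; the patent does not specify them.

```python
from collections import defaultdict


def build_inverted_index(metadata_library):
    """Map each lower-cased term to the set of FGIDs whose content metadata contains it."""
    index = defaultdict(set)
    for fgid, record in metadata_library.items():
        for field_value in record.values():
            for term in str(field_value).lower().split():
                index[term].add(fgid)
    return index


def search(index, metadata_library, query):
    """Return (fgid, metadata) pairs, ranked by how many query terms each file matched."""
    hits = defaultdict(int)
    for term in query.lower().split():
        for fgid in index.get(term, ()):
            hits[fgid] += 1
    # Assumed ranking rule: more matched terms first, FGID as a tie-breaker.
    ranked = sorted(hits, key=lambda f: (-hits[f], f))
    return [(fgid, metadata_library[fgid]) for fgid in ranked]


# Hypothetical content metadata records; no file is decrypted during the search.
library = {
    "a1": {"file_name": "voyage report", "keywords": "shipping schedule"},
    "b2": {"file_name": "budget", "keywords": "shipping cost report"},
}
index = build_inverted_index(library)
results = search(index, library, "shipping cost")
```

Only the plaintext metadata and the identifiers are touched here, which is the property the method relies on: the encrypted files themselves stay untouched in the storage device.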
According to the retrieval result, the client may choose to download the encrypted data file corresponding to a file global identifier listed in the retrieval result;
If the encrypted data file was encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, which decrypts it;
If the encrypted data file was encrypted at the cloud storage system server end, the cloud storage system server end decrypts the encrypted data file and then passes it to the client.
The retrieval method for encrypted data files further comprises an optimization of the inverted index method, which comprises:
By vertical segmentation and horizontal movement, moving the zero elements of the inverted index matrix of the data file content metadata to the bottom and right of the matrix;
By block clustering, converting the original high-dimensional sparse matrix into several low-dimensional dense matrices;
When content metadata is retrieved, sending the low-dimensional matrices obtained from the optimized sparse matrix to different processing units in the cloud storage system for parallel processing.
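One possible reading of this optimization is sketched below; it is not the patented procedure. The sketch treats the inverted index as a binary term-by-file matrix, approximates the reordering by sorting rows and columns by their non-zero counts, approximates block clustering by cutting the columns into fixed-width blocks and dropping all-zero rows, and uses a thread pool to stand in for the separate processing units. All names and the block width are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def reorder(matrix, terms, fgids):
    """Sort rows and columns by descending non-zero count so zeros drift to the bottom and right."""
    row_order = sorted(range(len(terms)), key=lambda r: -sum(matrix[r]))
    col_order = sorted(range(len(fgids)), key=lambda c: -sum(row[c] for row in matrix))
    new_matrix = [[matrix[r][c] for c in col_order] for r in row_order]
    return new_matrix, [terms[r] for r in row_order], [fgids[c] for c in col_order]


def split_blocks(matrix, terms, fgids, width):
    """Cut the matrix into column blocks, keeping only rows with a non-zero entry in the block."""
    blocks = []
    for start in range(0, len(fgids), width):
        cols = fgids[start:start + width]
        rows = {}
        for term, row in zip(terms, matrix):
            part = row[start:start + width]
            if any(part):
                rows[term] = part
        blocks.append((cols, rows))
    return blocks


def query_block(block, term):
    """Return the FGIDs in one block whose column has a 1 in the row of the given term."""
    cols, rows = block
    return [fgid for fgid, bit in zip(cols, rows.get(term, [])) if bit]


def parallel_query(blocks, term):
    """Query every block concurrently and merge the partial results."""
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda b: query_block(b, term), blocks)
    return sorted(f for part in parts for f in part)


# Hypothetical 3-term by 4-file inverted index matrix.
terms = ["cloud", "ship", "tax"]
fgids = ["f1", "f2", "f3", "f4"]
matrix = [
    [0, 1, 0, 1],  # cloud
    [1, 1, 0, 1],  # ship
    [0, 0, 0, 1],  # tax
]
new_matrix, new_terms, new_fgids = reorder(matrix, terms, fgids)
blocks = split_blocks(new_matrix, new_terms, new_fgids, width=2)
```

After reordering, the densest rows and columns sit in the top-left corner, so the later blocks hold few rows and each worker scans a small, mostly non-zero sub-matrix.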
Compared with prior-art techniques for storing and retrieving encrypted data files, the encrypted data file storage and retrieval system and method of the present invention have the following advantages. The invention establishes a novel content metadata structure that lets users retrieve encrypted data files from multiple angles and aspects, ensuring the security and privacy of data files in the cloud storage environment while keeping data file retrieval convenient;
In the present invention, all data files are stored in the cloud storage system in encrypted form; even if an encrypted data file is obtained, its content is not disclosed without the decryption key;
The present invention designs a novel inverted index method suited to content metadata retrieval, which can quickly retrieve the corresponding encrypted data files in the cloud storage system according to the keyword information the user provides at the client. This ensures the efficiency and precision of encrypted data file retrieval and solves the problem that encrypted data files are difficult or slow to retrieve in big data environments such as cloud storage;
The present invention applies equally to the retrieval of encrypted data files and plaintext data files in a cloud storage system, achieving fast retrieval and return of retrieval results.
Brief description of the drawings
Fig. 1 is a flow diagram of the encrypted data file storage and retrieval method of the present invention;
Fig. 2 is a diagram of the relationship among the three kinds of metadata;
Fig. 3 is a diagram of the content metadata structure;
Fig. 4 is a diagram of the storage metadata structure;
Fig. 5 is a flow chart of an embodiment of the encrypted data file storage and retrieval method based on a cloud storage system;
Fig. 6 is a diagram of the retrieval model for encrypted data files;
Fig. 7 is a schematic diagram of the inverted index of content metadata;
Fig. 8 is a schematic diagram of the matrix representation of the inverted index of content metadata;
Fig. 9 is a schematic diagram of the partitioning of the inverted index matrix of content metadata and its parallel processing.
Detailed description
Specific embodiments of the present invention are further described below with reference to the accompanying drawings.
The present invention discloses an encrypted data file storage and retrieval system and method based on a cloud storage system. The attribute information and content information of the original data file, together with the file global identifier of the encrypted data file, are stored in the content metadata library, so that encrypted data files can be retrieved without decrypting the original data file.
The technical principle of the present invention is as follows. (1) A special data file content metadata structure is designed, in which the content metadata includes the file global identifier (FGID) of the data file in its encrypted state. Before the client or the cloud storage system server end encrypts the data file, the content metadata of the data file is extracted automatically and stored in the content metadata database at the cloud storage system server end, providing the basis for retrieving encrypted data files. (2) The data file may be encrypted either at the client or at the server end of the cloud storage system, and is then stored in the storage device of the cloud storage system in a distributed manner, ensuring the security, privacy, high availability, and data integrity of the data file. (3) A novel inverted index method ensures the retrieval speed over massive content metadata in the cloud storage system, thereby achieving fast retrieval of encrypted data files. The invention overcomes the drawbacks of conventional encrypted data files, which are difficult to retrieve or slow to retrieve after decryption.
The encrypted data file storage and retrieval system based on a cloud storage system disclosed by the present invention comprises a cloud storage system and a client.
The cloud storage system comprises a server end and a storage device. The server end comprises a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module. The metadata management system connects to and manages a content metadata library, a system metadata library, and a storage metadata library. The storage device stores data files, which include encrypted data files and plaintext data files. The client comprises a content metadata extraction module and a data file encryption module.
As shown in Fig. 1, an encrypted data file storage and retrieval method based on a cloud storage system is disclosed. The method comprises the following steps:
S100: The client authenticates the user through an authentication interface.
User authentication covers access control and the provision of user identity information. Access control restricts access by illegitimate users and is the first line of defense for data security in the cloud storage environment. User information provision: the user-related information needed for the subsequent extraction of data file content metadata, and the access permission settings for the data, are both obtained from the user authentication information.
User identity authentication in the cloud storage system controls access by illegitimate users and rejects their operations, which ensures the security of the cloud storage system and keeps illegitimate users out. It also provides the corresponding attributes for the subsequent extraction or generation of related metadata and supplies the information needed for access control of data files. In addition, it establishes the identity of legitimate users and limits the scope of their data access.
While authentication is performed, it is determined whether the file storage flow or the file retrieval flow is to be carried out: for the file storage flow, go to S200; for the file retrieval flow, go to S300.
S200: The file storage flow is carried out on the user data file: the user data file is encrypted and stored in the cloud storage system.
S300: The file retrieval flow is carried out on encrypted data files: the encrypted data files are retrieved directly in the cloud storage system without being decrypted.
Data files include structured, semi-structured, and unstructured data files. Structured data files are traditional database files of various kinds. Unstructured data files are document files, picture files, audio files, video files, and the like. Semi-structured data files are irregular database files, that is, database files in which unstructured data is embedded.
As shown in Fig. 2, the present invention divides the metadata in the cloud storage environment into three classes: system metadata, storage metadata, and content metadata. The three classes are stored in separate metadata libraries in the cloud, and the three classes of metadata of each encrypted data file are associated through the FGID of that data file. The FGID is both the primary key and a foreign key of the tables that hold the three classes of metadata. It uniquely identifies the encrypted data file and is determined by the content of the encrypted data file, so the FGID can also be used to check the data integrity of the encrypted data file. Each FGID is 128 bits long, so 2^128 files can be represented.
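The role of the FGID as the shared key of the three metadata libraries can be sketched as follows. The three libraries are modelled as plain dictionaries, and the FGID is assumed to be written as 32 hexadecimal characters (128 bits). The names `register` and `lookup` and the record contents are hypothetical.

```python
FGID_BITS = 128
FGID_SPACE = 2 ** FGID_BITS  # number of distinct identifiers a 128-bit FGID can take

# One dictionary per metadata library; the FGID is the key in all three.
system_library = {}
storage_library = {}
content_library = {}


def register(fgid, system, storage, content):
    """Store the three classes of metadata for one encrypted file under the same FGID."""
    if len(bytes.fromhex(fgid)) * 8 != FGID_BITS:
        raise ValueError("FGID must be 128 bits (32 hex characters)")
    system_library[fgid] = system
    storage_library[fgid] = storage
    content_library[fgid] = content


def lookup(fgid):
    """Join the three libraries on the FGID, as a primary-key/foreign-key join would."""
    return {
        "system": system_library[fgid],
        "storage": storage_library[fgid],
        "content": content_library[fgid],
    }


example_fgid = "0f" * 16  # hypothetical identifier
register(
    example_fgid,
    system={"directory": "/tenant-a/reports"},
    storage={"replication_factor": 2},
    content={"file_name": "q1.pdf", "keywords": ["quarterly"]},
)
joined = lookup(example_fgid)
```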
System metadata includes the directory information and directory path names of the cloud storage system, the names and file global identifiers (FGIDs) of the encrypted data files in each directory, and the attributes of data files and directories. It is generated automatically by the system after the user's encrypted data file has been stored.
As shown in Fig. 3, content metadata is the key to retrieving data files. It reflects the characteristics of the data file and comprises the attribute information and content information of the stored data file in its plaintext state, plus the file global identifier (FGID) of the data file in its encrypted state. Attribute information includes the file name, file creator, creation time, modification time, modifier, version information, file type, and so on; it gives an overall picture of the data file. Content information includes a summary of the data file content, keywords, file aliases, file labels, remarks, purpose, content organization structure, compression method, encoding format, and other content characteristics of the data file. The FGID of the data file in its encrypted state is added automatically by the cloud storage system after the encrypted data file has been stored.
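A content metadata record of this shape might be modelled as below. This is only a sketch: the field names are assumptions, only a few of the listed attribute and content fields are included, and the `fgid` field starts empty because the identifier is added only after the encrypted file has been stored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentMetadata:
    # Attribute information, taken from the file in its plaintext state
    file_name: str
    creator: str
    file_type: str
    # Content information
    summary: str = ""
    keywords: List[str] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)
    # FGID of the encrypted file; unknown until the encrypted file has been stored
    fgid: Optional[str] = None

    def searchable_terms(self):
        """Collect the lower-cased words that an index over this record could use."""
        text = " ".join([self.file_name, self.summary, *self.keywords, *self.labels])
        return set(text.lower().split())


meta = ContentMetadata("Report.pdf", "han", "pdf", summary="Port traffic", keywords=["shipping"])
terms = meta.searchable_terms()
```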
As shown in Fig. 4, storage metadata comprises the basic information and storage information of the encrypted data file and its file global identifier (FGID). Basic information includes the data file size, user ID, permitted operation types, replication factor, security attributes, and so on. Storage information includes the block IDs of the encrypted data file, the corresponding content address list, the block size list, the physical block addresses, and the base address of the content address mapping table. The user ID is the ID of the file's owner; permitted operation types include read, write, modify, and so on; the replication factor is the number of backup copies of the data file; and the content address is the hash function value of the data block. Storage metadata is generated automatically by the system after the user's encrypted data file has been stored.
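The block-level part of the storage metadata can be illustrated with a small sketch. The text does not name the hash function used for content addresses, so SHA-256 is an assumption here, as are the tiny block size, the field names, and the function name `build_storage_metadata`. Physical addresses and the mapping table base address are omitted.

```python
import hashlib


def build_storage_metadata(fgid, ciphertext, user_id, block_size=4, replication_factor=2):
    """Split an encrypted file into blocks and record per-block size and content address."""
    blocks = [ciphertext[i:i + block_size] for i in range(0, len(ciphertext), block_size)]
    return {
        "fgid": fgid,
        # Basic information
        "file_size": len(ciphertext),
        "user_id": user_id,
        "replication_factor": replication_factor,
        # Storage information
        "block_ids": list(range(len(blocks))),
        "block_sizes": [len(b) for b in blocks],
        # Content address = hash value of the block (SHA-256 is an assumed choice)
        "content_addresses": [hashlib.sha256(b).hexdigest() for b in blocks],
    }


storage_meta = build_storage_metadata("0f" * 16, b"0123456789", user_id="u1")
```

Because a content address depends only on the bytes of the block, two identical blocks receive the same address.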
In the metadata management system at the cloud storage system server end, the three classes of metadata of a data file are linked together through its file global identifier (FGID). The content metadata is extracted automatically by the system, and authorized users can manually modify the content metadata in the content metadata library from the client over the network.
Encryption of data files: depending on where the data file is encrypted, encryption is either client-side or server-side. The two modes suit different scenarios: data files with relatively high security requirements may be encrypted at the cloud storage system server end, and data files with very high security requirements may be encrypted at the client.
Storage and management of metadata: system metadata, storage metadata, and content metadata are stored in three dedicated metadata databases in the cloud storage system. These databases support mass data storage and efficient retrieval, can serve concurrent requests from a large number of users, and provide automatic fault recovery and data backup, ensuring the security and high availability of the metadata.
Storage and management of encrypted data files: the cloud storage system uses distributed storage technology to store encrypted data files in the storage devices that it allocates to the user's virtual machine. The cloud storage system is a distributed, highly available storage system whose total capacity can be scaled out by adding nodes. Merging several physical data blocks into a larger logical storage space reduces data management overhead.
Retrieval of encrypted data files based on content metadata: the user's input is first analyzed to narrow the search range and determine which metadata files the retrieval may involve, which speeds up retrieval over the massive content metadata database in the cloud storage system. The retrieval operation does not need to decrypt the encrypted data files; its final result is an encrypted data file or a list of encrypted data files.
Fig. 5 shows an embodiment of the encrypted data file storage and retrieval method based on a cloud storage system, in which the file storage flow (S200) comprises the following steps:
S201: After the client has authenticated the user, the client determines whether the data file is to be encrypted at the client. If so, client-side encryption is performed and the flow goes to S202; if not, server-side encryption (at the cloud storage system server end) is performed and the flow goes to S205.
S202: The content metadata extraction module of the client automatically extracts the content metadata of the data file. Extraction of content metadata comprises automatic extraction and manual modification.
Automatic extraction: the content metadata extraction module of the client performs a preliminary analysis of the file content according to the characteristics of the file and extracts the attribute information and content information described above, which reflect the characteristics of the data file.
Manual modification: as the owner of the data file, the user has a fairly complete understanding of its type, attributes, purpose, and characteristics. For certain special data files, the user can manually modify, from the client over the network, the content metadata that the system extracted automatically. This describes the data file more accurately and can improve the accuracy and efficiency of retrieval.
Here, the content metadata of the data file is extracted automatically by the content metadata extraction module of the client before the data file is encrypted and is sent to the content metadata library at the cloud storage system server end for storage. Authorized users can edit the content metadata stored in the content metadata library so that it better matches their retrieval habits, which can improve retrieval accuracy and efficiency.
S203: The encryption module of the client encrypts the data file whose content metadata has been extracted, using the user's private key, symmetric-key encryption, or another encryption algorithm, and generates the encrypted data file.
S204: The client uploads the encrypted data file and the corresponding content metadata to the server end of the cloud storage system; go to S208.
S205: The client uploads the data file to the server end of the cloud storage system in plaintext form.
S206: The content metadata extraction module at the cloud storage system server end automatically extracts the content metadata of the data file. Extraction of content metadata comprises automatic extraction and manual modification.
Automatic extraction: the content metadata extraction module performs a preliminary analysis of the file content according to the characteristics of the file and extracts the attribute information and content information that reflect the characteristics of the data file.
Manual modification: as the owner of the data file, the user has a fairly complete understanding of its type, attributes, purpose, and characteristics. For certain special data files, the user can manually modify, from the client over the network, the content metadata that the system extracted automatically. This describes the data file more accurately and can improve the accuracy and efficiency of retrieval.
Here, the content metadata of the data file is extracted automatically by the content metadata extraction module at the cloud storage system server end before the data file is encrypted and is stored in the content metadata library at the cloud storage system server end. Authorized users can edit the content metadata stored in the content metadata library so that it better matches their retrieval habits, which can improve retrieval accuracy and efficiency.
S207: The encryption module at the cloud storage system server end encrypts the data file whose content metadata has been extracted, using the user's private key, symmetric-key encryption, or another encryption algorithm, and generates the encrypted data file. Encryption ensures the security of the data file. Go to S208.
S208: The cloud storage system server end stores the encrypted data file generated by client-side or server-side encryption in the corresponding storage device of the cloud storage system.
After the server end of the cloud storage system obtains the user's encrypted data file, it stores the encrypted data file in the storage device corresponding to the user's virtual machine.
Each cloud tenant (user) uses the cloud storage system through a user virtual machine.
At the same time, the encryption module of the cloud storage system uses the MD5 algorithm to generate the file global identifier (FGID) of the encrypted data file and adds it to the content metadata of the data file. The FGID is the unique identifier of the encrypted data file and can also be used to check the data integrity of the encrypted data file.
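Deriving the identifier from the encrypted content with MD5 and reusing it as an integrity check could look like the following sketch, which relies on Python's standard `hashlib` module. The function names are hypothetical, and the hexadecimal encoding of the 128-bit digest is an assumption.

```python
import hashlib


def generate_fgid(ciphertext: bytes) -> str:
    """Return the 128-bit MD5 digest of the encrypted file as 32 hex characters."""
    return hashlib.md5(ciphertext).hexdigest()


def verify_integrity(ciphertext: bytes, fgid: str) -> bool:
    """Recompute the digest and compare it with the stored FGID."""
    return generate_fgid(ciphertext) == fgid


encrypted_file = b"\x8f\x02\x7a\x41 example ciphertext bytes"
fgid = generate_fgid(encrypted_file)
intact = verify_integrity(encrypted_file, fgid)
tampered = verify_integrity(encrypted_file + b"\x00", fgid)
```

Because the digest is computed over the ciphertext, the identifier can be checked without the decryption key. MD5 is no longer regarded as collision resistant, so this check detects accidental corruption rather than deliberate forgery.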
After the encrypted data file has been stored, the cloud storage system automatically generates its system metadata and storage metadata from the directory information, the storage location information, and the file global identifier (FGID) of the stored encrypted data file.
S209: The content metadata extracted before encryption is stored in the content metadata library of the cloud storage system.
Further, after the content metadata has been stored in the content metadata library of the cloud storage system, the content metadata database can be updated: authorized users can modify the content metadata in the content metadata library from the client over the network, which makes retrieval of encrypted data files more accurate.
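The ordering of the storage flow (extract metadata first, encrypt second, then store both under one FGID) can be sketched end to end. Everything below is illustrative: `toy_cipher` is a byte-wise XOR used only as a placeholder for a real encryption algorithm and offers no security, the keyword extraction is a crude stand-in for the content metadata extraction module, and the `encrypted_at` field is a hypothetical way to remember where encryption took place.

```python
import hashlib


def extract_content_metadata(plaintext: bytes, attributes: dict) -> dict:
    """Crude stand-in for the extraction module: keep the attributes and the distinct words."""
    words = plaintext.decode("utf-8", errors="ignore").lower().split()
    return {"attributes": dict(attributes), "keywords": sorted(set(words))}


def store_file(plaintext, attributes, encrypt, encrypted_at, cloud):
    """Run the storage flow and return the FGID of the stored encrypted file."""
    content_meta = extract_content_metadata(plaintext, attributes)  # S202 or S206, before encryption
    ciphertext = encrypt(plaintext)                                 # S203 or S207
    fgid = hashlib.md5(ciphertext).hexdigest()                      # FGID from the encrypted content
    cloud["storage"][fgid] = ciphertext                             # S208
    content_meta["fgid"] = fgid
    content_meta["encrypted_at"] = encrypted_at                     # hypothetical bookkeeping field
    cloud["content_metadata"][fgid] = content_meta                  # S209
    return fgid


def toy_cipher(data: bytes) -> bytes:
    """Placeholder only: XOR with a fixed byte. NOT real encryption."""
    return bytes(b ^ 0x5A for b in data)


cloud = {"storage": {}, "content_metadata": {}}
original = b"cargo manifest for voyage 12"
fgid = store_file(original, {"file_name": "manifest.txt"}, toy_cipher, "client", cloud)
```

The keywords are taken from the plaintext before `encrypt` is called, which is why the stored ciphertext never needs to be opened for a later keyword search.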
On the basis of the encrypted data file storage flow S200, retrieval of encrypted data files can be implemented. Fig. 6 shows the encrypted data file retrieval model based on content metadata. Its general flow is:
1. The client uploads the retrieval content to the information retrieval module of the cloud storage system.
2. The information retrieval module uses the inverted index method to query the content metadata library managed by the metadata management system for information matching the retrieval content.
3. The metadata management system sends the file global identifiers (FGIDs) of the encrypted data files found in step 2 that match the retrieval content, together with the corresponding content metadata, to the information retrieval module.
4. The information retrieval module ranks the results returned by the metadata management system and sends the ranked retrieval result list to the client.
5. At the client, the user selects the file to download from the retrieval result list, and the file global identifier (FGID) of the selected file is sent to the information retrieval module.
6. Using the file's FGID, the information retrieval module looks up the storage location information of the corresponding encrypted data file in the storage metadata library managed by the metadata management system.
7. The metadata management system sends the storage location information found in step 6 to the distributed storage system.
8. The distributed storage system retrieves the encrypted data file according to its storage location information and passes it to the encryption/decryption module.
9. The encryption/decryption module decrypts the encrypted data file and passes it to the client, which completes the retrieval of the encrypted data file.
In Fig. 6, if the encrypted data file was encrypted at the server end, the encryption/decryption module in the model is located at the server end of the cloud storage system; if the encrypted data file was encrypted at the client, the encryption/decryption module in the model is located at the client.
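The lookup path of the Fig. 6 model (matching a keyword, returning FGIDs with content metadata, then resolving a chosen FGID to a storage location and fetching the encrypted file) can be sketched roughly as follows. The class and method names, the in-memory dictionaries, and the sample data are assumptions for illustration; ranking and decryption are left out of this sketch.

```python
class MetadataManagementSystem:
    """Manages the content metadata library and the storage metadata library."""

    def __init__(self):
        self.inverted_index = {}    # keyword -> list of FGIDs
        self.content_library = {}   # FGID -> content metadata
        self.storage_library = {}   # FGID -> storage location

    def register(self, fgid, content_metadata, location):
        """Record a stored file's metadata and index its keywords."""
        self.content_library[fgid] = content_metadata
        self.storage_library[fgid] = location
        for keyword in content_metadata["keywords"]:
            self.inverted_index.setdefault(keyword, []).append(fgid)

    def match(self, keyword):
        """Matching query: FGIDs and content metadata for the keyword."""
        return [(fgid, self.content_library[fgid])
                for fgid in self.inverted_index.get(keyword, [])]

    def locate(self, fgid):
        """Storage metadata lookup: location of the encrypted file."""
        return self.storage_library[fgid]


class DistributedStorage:
    def __init__(self):
        self.blobs = {}             # storage location -> encrypted bytes

    def fetch(self, location):
        """Read the encrypted file at the given location."""
        return self.blobs[location]


mms = MetadataManagementSystem()
storage = DistributedStorage()
storage.blobs["node-1:/a.enc"] = b"\x8f\x02\x19"
mms.register("fgid-a", {"title": "Port report", "keywords": ["port", "cargo"]},
             "node-1:/a.enc")

results = mms.match("cargo")                          # keyword query
chosen_fgid = results[0][0]                           # user picks a result
encrypted = storage.fetch(mms.locate(chosen_fgid))    # locate, then fetch
```

The search itself touches only content metadata; the encrypted file is read only after the user has chosen an FGID.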
Fig. 5 shows an embodiment of the encrypted data file storage and retrieval method based on the cloud storage system, in which the file retrieval flow (S300) comprises the following steps:
S301: The client receives a query request through the query interface; the query request contains a search keyword. The client uploads the query request containing the search keyword to the cloud storage system.
The information retrieval module of the cloud storage system analyzes the query request submitted by the client and determines whether the content of the search keyword contained in the query request is legitimate.
S302, data retrieval: The information retrieval module of the cloud storage system performs a matching query on the content metadata library using a novel inverted index retrieval algorithm, and returns the file global identifiers (FGIDs) and partial content metadata information of the encrypted data files that satisfy the query.
The novel inverted index method is an improved inverted index method suited to fast retrieval of the massive content metadata information in the content metadata library of the cloud storage system.
Fig. 7, taken together with Fig. 8, shows the inverted index of data file content metadata disclosed by the present invention and its matrix representation. Keyword 1, keyword 2, ... denote content metadata items in the content metadata library; ID1, ID2, ..., IDn denote the file global identifiers (FGIDs) of encrypted data files; and each row-column intersection in Fig. 8 gives the number of times a given keyword occurs in a given data file. As Fig. 8 shows, the inverted index matrix of data file content metadata is a sparse matrix. To speed up keyword retrieval in the massive content metadata library of the cloud storage system, the inverted index matrix is optimized as follows: by vertical segmentation and horizontal movement, the zero elements of the matrix are moved to the bottom and the right of the matrix, and block clustering is then used to convert the original high-dimensional sparse matrix into a set of low-dimensional dense matrices. When content metadata is retrieved, the low-dimensional matrices obtained from the optimized sparse matrix are sent to different processing units in the cloud storage system for parallel processing, which can greatly increase the keyword retrieval speed in the massive content metadata library. The principle is shown in Fig. 9, where M1, M2, ..., Mn denote the low-dimensional dense matrices and P1, P2, ..., Pn denote the parallel processing units in the cloud storage system.
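The text does not give the exact segmentation, movement, or clustering rules, so the following is only a simplified stand-in for the idea. It assumes that rows and columns are reordered by descending non-zero count (so zeros drift toward the bottom and right), that blocks are fixed-size row groups with their all-zero columns dropped, and that worker threads play the role of the processing units P1, ..., Pn. All function names and the sample matrix are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def reorder(matrix, keywords, fgids):
    """Sort rows and columns by descending non-zero count so zeros drift to the bottom and right."""
    rows = sorted(range(len(keywords)),
                  key=lambda r: -sum(1 for v in matrix[r] if v))
    cols = sorted(range(len(fgids)),
                  key=lambda c: -sum(1 for row in matrix if row[c]))
    new_matrix = [[matrix[r][c] for c in cols] for r in rows]
    return new_matrix, [keywords[r] for r in rows], [fgids[c] for c in cols]


def split_into_blocks(matrix, keywords, fgids, rows_per_block):
    """Cut the matrix into row blocks and drop columns that are all zero inside each block."""
    blocks = []
    for start in range(0, len(keywords), rows_per_block):
        rows = matrix[start:start + rows_per_block]
        keep = [c for c in range(len(fgids)) if any(row[c] for row in rows)]
        blocks.append({
            "keywords": keywords[start:start + rows_per_block],
            "fgids": [fgids[c] for c in keep],
            "cells": [[row[c] for c in keep] for row in rows],
        })
    return blocks


def search_block(block, keyword):
    """Return {FGID: count} for one block, or an empty dict if the keyword is not in it."""
    if keyword not in block["keywords"]:
        return {}
    row = block["cells"][block["keywords"].index(keyword)]
    return {fgid: n for fgid, n in zip(block["fgids"], row) if n}


def parallel_search(blocks, keyword):
    """Search every block on its own worker thread and merge the partial results."""
    hits = {}
    with ThreadPoolExecutor(max_workers=max(1, len(blocks))) as pool:
        for partial in pool.map(lambda b: search_block(b, keyword), blocks):
            hits.update(partial)
    return hits


# Rows are keywords, columns are FGIDs, cells are occurrence counts (as in Fig. 8).
keywords = ["cloud", "encryption", "ship", "port"]
fgids = ["ID1", "ID2", "ID3", "ID4"]
matrix = [
    [2, 0, 0, 1],
    [0, 3, 0, 0],
    [0, 0, 0, 4],
    [0, 0, 1, 0],
]

m, kw, ids = reorder(matrix, keywords, fgids)
blocks = split_into_blocks(m, kw, ids, rows_per_block=2)
result = parallel_search(blocks, "cloud")
```

With this sample data the 4x4 sparse matrix becomes two smaller blocks (2x3 and 2x2), and each keyword lives in exactly one block, so a query only does real work on one worker while the others return immediately.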
S303, cloud storage system judge whether to retrieve successfully, if so, step S304 is jumped to, if otherwise, exporting without right It answers the information of the retrieval result of data file encryption to be sent to client, and jumps to S305.
S304, cloud storage system are according to the file global identifier of data file encryption(FGID)Position respective encrypted data File.
The result that the information searching module of cloud storage system returns to S302 retrievals simultaneously is ranked up, and is sent to client Retrieval result after sequence.
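The ranking criterion is not specified in the text. The sketch below assumes, purely for illustration, that results are ordered by how often the keyword occurs in each file (the cell value of the inverted index matrix), with ties broken by FGID; `rank_results` and the sample data are hypothetical.

```python
def rank_results(hits, content_library):
    """Order hits by descending occurrence count, breaking ties by FGID."""
    ordered = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [
        {"fgid": fgid, "count": count, "metadata": content_library.get(fgid, {})}
        for fgid, count in ordered
    ]


hits = {"ID3": 1, "ID1": 4, "ID2": 1}      # FGID -> keyword occurrence count
content_library = {"ID1": {"title": "Port report"}}
ranked = rank_results(hits, content_library)
```

Each entry carries the FGID alongside its content metadata, so the client can display the list and later send back the FGID of the file the user selects.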
S305, client receive retrieval result, and client exports retrieval result to user, if it includes to meet inspection to obtain The retrieval result of the file global identifier of the data file encryption of rope content illustrates there is the retrieval result for meeting retrieval content, Required data file encryption information can be obtained according to retrieval result in client by user.
In addition, client according to retrieval result, also may be selected to download file global identifier pair listed in retrieval result The data file encryption answered.
It is originally that cloud storage system directly passes to data file encryption if client is encrypted if data file encryption Subscription client is decrypted by client.It is encrypted in cloud storage system server end if data file encryption is originally, Pass to client after data file encryption being decrypted by cloud storage system.
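The routing rule above (decrypt wherever the file was encrypted) can be sketched as follows. The `encrypted_at` flag, the function names, and the keys are assumptions for this example, and `toy_cipher` is a deliberately trivial XOR placeholder, not a secure cipher; a real system would substitute the actual encryption algorithm.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    fgid: str
    ciphertext: bytes
    encrypted_at: str   # "client" or "server" (hypothetical flag)


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder XOR transform, NOT secure; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def serve_download(stored: StoredFile, server_key: bytes):
    """Server side: decrypt only if the server performed the encryption.

    Returns (payload, client_must_decrypt).
    """
    if stored.encrypted_at == "server":
        return toy_cipher(stored.ciphertext, server_key), False
    return stored.ciphertext, True


def client_receive(payload: bytes, client_must_decrypt: bool, client_key: bytes) -> bytes:
    """Client side: decrypt only files that the client itself encrypted."""
    return toy_cipher(payload, client_key) if client_must_decrypt else payload


plaintext = b"cargo manifest"
server_key, client_key = b"srv", b"cli"

server_file = StoredFile("fgid-s", toy_cipher(plaintext, server_key), "server")
client_file = StoredFile("fgid-c", toy_cipher(plaintext, client_key), "client")

from_server = client_receive(*serve_download(server_file, server_key), client_key)
from_client = client_receive(*serve_download(client_file, server_key), client_key)
```

In the client-encrypted case the server never handles the key or the plaintext; it only forwards the stored ciphertext.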
Although the content of the present invention has been described in detail through the preferred embodiments above, it should be understood that the above description is not to be regarded as limiting the present invention. Various modifications and substitutions will be apparent to those skilled in the art after reading the above. Therefore, the scope of protection of the present invention shall be defined by the appended claims.

Claims (10)

1. An encrypted data file storage and retrieval system, characterized in that the system comprises:
a cloud storage system comprising a server end and a storage device; the server end comprises a user identity authentication module, a content metadata extraction module, a metadata management system, a data file encryption module, and an information retrieval module; the metadata management system is connected to and manages a content metadata library, a system metadata library, and a storage metadata library; the storage device is used to store data files, the data files including encrypted data files and plaintext data files;
a client comprising a content metadata extraction module and a data file encryption module;
wherein the client or the server end of the cloud storage system extracts the content metadata of a data file and then encrypts the data file to generate an encrypted data file, and the encrypted data file and the corresponding content metadata are stored respectively in the storage device of the cloud storage system and in the content metadata library at the server end; the content metadata comprises attribute information and content information of the data file and the file global identifier of the data file in its encrypted state; when an encrypted data file stored in the cloud storage system is retrieved, the encrypted data file retrieval module at the server end uses an inverted index method to search the content metadata library at the server end, obtains the file global identifier, in the encrypted state, of the data file matching the search keyword, and returns as the retrieval result the attribute information and content information of the encrypted data file corresponding to that file global identifier.
2. An encrypted data file storage and retrieval method, characterized in that the method comprises:
the client or the server end of the cloud storage system extracts the content metadata of a data file and then encrypts the data file to generate an encrypted data file, and the encrypted data file and the corresponding content metadata are stored respectively in the storage device of the cloud storage system and in the content metadata library at the server end; the content metadata comprises attribute information and content information of the data file and the file global identifier of the data file in its encrypted state;
when an encrypted data file stored in the cloud storage system is retrieved, the encrypted data file retrieval module at the server end uses an inverted index method to search the content metadata library at the server end, obtains the file global identifier, in the encrypted state, of the data file matching the search keyword, and returns as the retrieval result the attribute information and content information of the encrypted data file corresponding to that file global identifier.
3. The encrypted data file storage and retrieval method of claim 2, characterized in that the method by which the client extracts the content metadata of the data file and then encrypts the data file to generate the encrypted data file comprises:
the client extracts the content metadata of the data file;
the client encrypts the data file whose content metadata has been extracted, generating the encrypted data file;
the client uploads the encrypted data file and the corresponding content metadata to the server end of the cloud storage system.
4. The encrypted data file storage and retrieval method of claim 2, characterized in that the method by which the server end of the cloud storage system extracts the content metadata of the data file and then encrypts the data file to generate the encrypted data file comprises:
the client uploads the data file to the server end of the cloud storage system;
the server end of the cloud storage system extracts the content metadata of the data file;
the server end of the cloud storage system encrypts the data file whose content metadata has been extracted, generating the encrypted data file.
5. The encrypted data file storage and retrieval method of claim 2, 3, or 4, characterized in that extracting the content metadata of the data file comprises: the content metadata extraction module of the client or of the server end of the cloud storage system performs a preliminary analysis of the data file content according to the characteristics of the data file, extracts the attribute information and content information that reflect the characteristics of the data file, and adds the file global identifier of the encrypted data file to the content metadata.
6. The encrypted data file storage and retrieval method of claim 2, 3, or 4, characterized in that, after the content metadata of the data file has been extracted, the client can modify the content metadata stored in the content metadata library at the server end of the cloud storage system.
7. The encrypted data file storage and retrieval method of claim 2, 3, or 4, characterized in that the server end of the cloud storage system stores the encrypted data file in distributed fashion in the storage devices of the cloud storage system, and stores the content metadata in the content metadata library of the cloud storage system.
8. The encrypted data file storage and retrieval method of claim 2, characterized in that retrieving the encrypted data file stored at the server end of the cloud storage system comprises:
the client sends a retrieval request containing a search keyword, and the cloud storage system analyzes the retrieval request to determine the legitimacy of the search keyword content in the request;
the information retrieval module of the cloud storage system performs a matching query on the content metadata library using the inverted index method, and obtains as the retrieval result the file global identifier, in the encrypted state, of the data file matching the search keyword, together with the attribute information and content information of the data file corresponding to that file global identifier;
the information retrieval module ranks the retrieval results and sends them to the client.
9. The encrypted data file storage and retrieval method of claim 8, characterized in that the client, according to the retrieval result, may choose to download the encrypted data file corresponding to a file global identifier listed in the retrieval result;
if the encrypted data file was encrypted at the client, the cloud storage system passes the encrypted data file directly to the user's client, which decrypts it;
if the encrypted data file was encrypted at the server end of the cloud storage system, the server end of the cloud storage system decrypts the encrypted data file and then passes it to the client.
10. The encrypted data file storage and retrieval method of claim 2, 8, or 9, characterized in that the retrieval method for the encrypted data file further comprises an optimization method for the inverted index method, the optimization method comprising:
by vertical segmentation and horizontal movement, moving the zero elements of the inverted index matrix of data file content metadata to the bottom and the right of the matrix;
using block clustering to convert the original high-dimensional sparse matrix into several low-dimensional dense matrices;
when content metadata is retrieved, sending the several low-dimensional matrices obtained from the optimized sparse matrix to different processing units in the cloud storage system for parallel processing.
CN201610025930.1A 2016-01-15 2016-01-15 Data file encryption storage and retrieval system and method Active CN105678189B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610025930.1A CN105678189B (en) 2016-01-15 2016-01-15 Data file encryption storage and retrieval system and method

Publications (2)

Publication Number Publication Date
CN105678189A CN105678189A (en) 2016-06-15
CN105678189B true CN105678189B (en) 2018-10-23

Family

ID=56300884

Country Status (1)

Country Link
CN (1) CN105678189B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101770462A (en) * 2008-12-30 2010-07-07 日电(中国)有限公司 Device for ciphertext index and search and method thereof
CN102024054A (en) * 2010-12-10 2011-04-20 中国科学院软件研究所 Ciphertext cloud-storage oriented document retrieval method and system
CN103442057A (en) * 2013-08-27 2013-12-11 玉林师范学院 Cloud storage system based on user collaboration cloud

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9678967B2 (en) * 2003-05-22 2017-06-13 Callahan Cellular L.L.C. Information source agent systems and methods for distributed data storage and management using content signatures

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant