CN108363689B - Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud - Google Patents
- Publication number
- CN108363689B (application number CN201810122376.8A)
- Authority
- CN
- China
- Prior art keywords
- document
- vector
- retrieval
- keyword
- cloud server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/06—Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computer Security & Cryptography (AREA)
- Databases & Information Systems (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- General Business, Economics & Management (AREA)
- Computer Hardware Design (AREA)
- Business, Economics & Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Storage Device Security (AREA)
Abstract
The invention discloses a privacy protection multi-keyword Top-k ciphertext retrieval method and system for a hybrid cloud, which mainly address the problem of low retrieval efficiency. The scheme is as follows: the data providing end uses the correlation among keywords to generate a keyword dictionary sequence through clustering; for each document it generates a high-dimensional document vector and a low-dimensional document filtering vector, then outsources the ciphertext documents and the encrypted document vectors to an untrusted public cloud server and stores the plaintext document filtering vectors on a trusted private cloud server. During retrieval, the private cloud server computes the candidate document set, and the public cloud server then computes the Top-k documents of the retrieval result. Because related keywords are grouped together in the keyword dictionary sequence, the filtering effect of the private cloud server improves and the candidate document set shrinks. The method has a simple flow, is secure and easy to implement, and achieves efficient multi-keyword ciphertext retrieval in a hybrid cloud environment with low computational overhead.
Description
Technical Field
The invention relates to user data privacy protection, in particular to a hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method and system.
Background
The idea of delivering IT resources as services is becoming increasingly popular, and IT shows a trend toward "everything as a service" (XaaS), which has become a core concept of cloud computing. However, while cloud computing develops vigorously, cloud security is becoming a widespread concern. In a cloud environment, users cannot directly control data placed on a remote Cloud Server (CS), so they fear that their outsourced data may be illegally acquired or abused by the cloud service provider, especially sensitive data with high privacy requirements such as electronic medical records, bank transaction data, and user mail. Although cloud service providers claim to offer security countermeasures against privacy disclosure, such as access control, firewalls, and intrusion detection, user concerns about data security remain a major obstacle to the further development of cloud computing.
A common way to protect data privacy is to encrypt data before outsourcing it to a public cloud server, but this severely restricts how the outsourced data can be used. In information retrieval research, conventional multi-keyword retrieval mainly targets plaintext data and cannot be applied directly to ciphertext retrieval. Downloading all encrypted data from the cloud and decrypting it locally is clearly impractical and wasteful. Designing a ciphertext retrieval mechanism with privacy protection in a cloud environment is therefore a challenging problem, and it has become one of the hot topics in cloud computing research in recent years.
Most prior-art methods assume a public cloud service by default and, on the assumption that the public cloud behaves according to the semi-honest model, propose a series of multi-keyword ciphertext retrieval methods for encrypted cloud environments. These methods, however, suffer from one or more problems such as low retrieval efficiency, inaccurate retrieval results, and complex index tree construction.
To address these problems, Chinese patent application No. 201710181664.6 discloses a fast multi-keyword semantic ranked search method that protects data privacy in cloud computing. It adds a private cloud server, creates a document vector and a corresponding identification vector for each document, outsources the encrypted document vectors to a public cloud server, and stores the plaintext identification vectors on the private cloud server. The private cloud server performs a preliminary filtering of the document set, which reduces the number of document vectors that must be scored against the retrieval vector and thus lowers the retrieval cost. How to improve the filtering effect of the private cloud server therefore plays an important role in improving the efficiency of privacy protection multi-keyword ciphertext retrieval in a hybrid cloud.
Disclosure of Invention
Purpose of the invention: to address the problems in the prior art, the invention provides a privacy protection multi-keyword Top-k ciphertext retrieval method and system for a hybrid cloud.
Technical scheme: the hybrid-cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method comprises the following steps:
(1) the data providing end extracts a keyword set from the provided document set and generates a keyword dictionary sequence through clustering and partitioning; generating a corresponding plaintext document vector for each document in the document set according to the keyword dictionary sequence, and blocking the plaintext document vectors according to the blocking condition of the keyword dictionary sequence to form document filtering vectors; encrypting the plaintext document vector to form an encrypted document vector, and encrypting each document in the document set to form an encrypted document set; finally, the document filtering vector is transmitted to a private cloud server, and the encrypted document vector and the encrypted document set are transmitted to a public cloud server;
(2) the data retrieval end generates retrieval vectors according to a plurality of keywords provided by a user, generates a retrieval trapdoor by adopting a security algorithm after normalization, and transmits the retrieval trapdoor and the number k of documents to be retrieved by the user to a public cloud server; generating a retrieval filtering vector for a plurality of keywords provided by a user according to the blocking condition of the keywords in the keyword dictionary sequence, and transmitting the retrieval filtering vector to a private cloud server;
(3) the private cloud server performs an AND operation between the received retrieval filtering vector and the document filtering vector of each document; if the resulting vector is not all zeros, the corresponding document number is added to the candidate document set; the candidate document set is then transmitted to the public cloud server;
(4) the public cloud server, based on the received candidate document set, retrieval trapdoor, and number k of documents to retrieve, computes the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, selects from the candidate document set the k ciphertext documents most relevant to the keywords provided by the user according to these inner products, and returns the k ciphertext documents to the data retrieval end;
(5) the data retrieval end decrypts the received k ciphertext documents to obtain the k most relevant plaintext documents.
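Step (4) amounts to a ranked selection over the candidate set. The following is a minimal sketch of that selection only, assuming plaintext vectors as stand-ins for the encrypted document vectors and the trapdoor (the secure inner product is meant to yield the same scores). The function name `top_k` and the toy data are hypothetical and not part of the patent.

```python
import heapq

def top_k(candidate_ids, doc_vectors, query_vector, k):
    """Return the k candidate ids with the highest inner-product score, best first."""
    def score(doc_id):
        return sum(d * q for d, q in zip(doc_vectors[doc_id], query_vector))
    return heapq.nlargest(k, candidate_ids, key=score)

# Hypothetical toy data: plaintext stand-ins for encrypted vectors and trapdoor.
doc_vectors = {"d1": [2, 0, 3], "d2": [0, 5, 0], "d3": [1, 1, 3], "d4": [1, 4, 0]}
query_vector = [1, 0, 1]
candidate_ids = ["d1", "d3", "d4"]   # d2 is assumed to have been filtered out earlier

print(top_k(candidate_ids, doc_vectors, query_vector, k=2))   # ['d1', 'd3']
```

Only the documents in the candidate set are scored, which is where the saving on the public cloud side comes from.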
Further, the step (1) specifically comprises:
(1-1) the data providing end extracts keywords from the provided document set DS to obtain a keyword set $\{w_1, w_2, \ldots, w_n\}$;
(1-2) the keywords in the keyword set are clustered according to their correlation to obtain several sub-clusters $\{c_1, c_2, \ldots, c_t\}$;
(1-3) each sub-cluster is taken as a block, giving t blocks $b_1, b_2, \ldots, b_t$, and a keyword dictionary sequence $W = \{w(b_1,1), w(b_1,2), \ldots, w(b_2,1), w(b_2,2), \ldots, w(b_t,1), w(b_t,2), \ldots\}$ is generated from the blocks, where $w(b_j,x)$ denotes the x-th keyword belonging to block $b_j$ and the keywords within each block are unordered; block $b_j = \{w(b_j,x) \mid 0 < x \le |b_j|\}$;
(1-4) using the TF-IDF algorithm and the vector space model, a corresponding plaintext document vector $V_i$ is generated for each document $D_i$ in the document set $DS = \{D_1, D_2, \ldots, D_m\}$ according to the positions of the keywords in the keyword dictionary sequence, and is normalized; the length of $V_i$ is n, and each element takes the term frequency (TF) value in document $D_i$ of the keyword corresponding to that position;
(1-5) according to the blocking of the keyword dictionary sequence, the plaintext document vector $V_i$ is divided into t blocks whose boundaries coincide with those of the keyword dictionary sequence, giving the document filtering vector $DF_i = \{b_1, b_2, \ldots, b_t\}$ of each document $D_i$; if all elements of $V_i$ at the positions of the keywords in block $b_j$ are 0, the value of $b_j$ is 0, otherwise it is 1; $DF_i$ is a t-dimensional vector whose elements each take the value 0 or 1;
(1-6) an encryption key $SK = (S, M_1, M_2, k_f)$ is generated, where S is a random vector whose elements each take the value 0 or 1, $M_1$ and $M_2$ are two $n \times n$ invertible matrices, n is the length of the keyword dictionary sequence, and $k_f$ is the document encryption key;
(1-7) each plaintext document vector $V_i$ is encrypted with the generated encryption key using the secure KNN technique to obtain the corresponding encrypted document vector, where $V_i$ is split into $V_i'$ and $V_i''$: when the j-th element of the random vector S satisfies $S[j] = 0$, $V_i'[j] + V_i''[j] = V_i[j]$; when $S[j] = 1$, $V_i'[j] = V_i''[j] = V_i[j]$;
(1-8) each document in the document set DS is encrypted with a symmetric encryption algorithm to obtain the encrypted document set $ES = \{e_1, e_2, \ldots, e_m\}$;
(1-9) the document filtering vectors are transmitted to the private cloud server for storage, and the encrypted document vectors and the encrypted document set are transmitted to the public cloud server for storage.
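Steps (1-3) to (1-5) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the clusters are given by hand rather than produced by a clustering algorithm, raw term counts stand in for the TF values, and normalization is skipped because it does not change which elements are zero. All function names and the sample keywords are hypothetical.

```python
from collections import Counter

def build_dictionary(clusters):
    """Flatten clusters into a keyword dictionary sequence and record block boundaries."""
    dictionary, blocks = [], []
    for cluster in clusters:
        start = len(dictionary)
        dictionary.extend(cluster)
        blocks.append((start, len(dictionary)))   # half-open range [start, end)
    return dictionary, blocks

def document_vector(tokens, dictionary):
    """Raw term counts, one entry per dictionary position (assumption: not normalized)."""
    counts = Counter(tokens)
    return [counts[w] for w in dictionary]

def filter_vector(vector, blocks):
    """One bit per block: 1 if any entry inside the block is non-zero, else 0."""
    return [int(any(vector[start:end])) for start, end in blocks]

# Hypothetical clusters standing in for the output of step (1-2).
clusters = [["cloud", "server"], ["cipher", "key", "trapdoor"], ["heart", "doctor"]]
dictionary, blocks = build_dictionary(clusters)

v = document_vector(["cloud", "key", "key", "server"], dictionary)
print(v)                          # [1, 1, 0, 2, 0, 0, 0]
print(filter_vector(v, blocks))   # [1, 1, 0]
```

The filter vector has one entry per block (t = 3 here) while the document vector has one entry per keyword (n = 7), which is why the filter is cheap to store and compare.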
Further, the step (2) specifically comprises:
(2-1) the data retrieval end generates a retrieval vector Q from the keywords $\{w_1, w_2, \ldots, w_x\}$ provided by the user, using the TF-IDF algorithm and the vector space model, and normalizes it; the j-th element of Q, $Q[j]$, is the inverse document frequency (IDF) value of the j-th keyword $w_j$ in the document set DS provided by the data providing end;
(2-2) a retrieval trapdoor is generated from the retrieval vector Q using the secure KNN algorithm, where Q is split into $Q'$ and $Q''$: when the j-th bit of the random vector in the encryption key generated by the data providing end satisfies $S[j] = 0$, $Q'[j] = Q''[j] = Q[j]$; when $S[j] = 1$, $Q'[j] + Q''[j] = Q[j]$;
(2-3) a retrieval filtering vector $QF = \{b_1, b_2, \ldots, b_t\}$ is generated according to the blocking of the keywords in the keyword dictionary sequence; QF is a t-dimensional vector whose elements each take the value 0 or 1: if all elements of the retrieval vector Q corresponding to the keywords of block $b_j$ are 0, then $QF[j] = 0$, otherwise $QF[j] = 1$;
(2-4) the retrieval trapdoor and the number k of documents the user wants to retrieve are transmitted to the public cloud server, and the retrieval filtering vector is transmitted to the private cloud server.
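Steps (2-1) and (2-3) can be sketched in the same way. The dictionary, block boundaries, and IDF values below are hypothetical sample data, and normalization of Q is omitted because it does not affect which blocks are non-zero.

```python
def retrieval_vector(query_keywords, dictionary, idf):
    """Q[j] is the IDF of the j-th dictionary keyword if the user asked for it, else 0."""
    wanted = set(query_keywords)
    return [idf[w] if w in wanted else 0.0 for w in dictionary]

def retrieval_filter_vector(q, blocks):
    """QF[j] = 0 if every element of Q inside block j is 0, otherwise 1."""
    return [int(any(x != 0 for x in q[start:end])) for start, end in blocks]

# Hypothetical dictionary sequence with three blocks and made-up IDF values.
dictionary = ["cloud", "server", "cipher", "key", "trapdoor", "heart", "doctor"]
blocks = [(0, 2), (2, 5), (5, 7)]
idf = {"cloud": 0.4, "server": 0.7, "cipher": 1.2, "key": 0.9,
       "trapdoor": 1.6, "heart": 1.1, "doctor": 1.3}

q = retrieval_vector(["key", "trapdoor"], dictionary, idf)
print(q)                                    # [0.0, 0.0, 0.0, 0.9, 1.6, 0.0, 0.0]
print(retrieval_filter_vector(q, blocks))   # [0, 1, 0]
```

Because both query keywords fall into the same block, only one bit of QF is set, which is the situation the clustered dictionary is designed to produce.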
Further, the step (3) specifically comprises:
(3-1) the private cloud server performs an AND operation between the received retrieval filtering vector QF and the document filtering vector $DF_i$ of each document; if the vector $QF \,\&\, DF_i$ is not all zeros, the document number $d_i$ corresponding to $DF_i$ is added to the candidate document set, giving the candidate document set $CDS = \{d_1, d_2, \ldots\}$;
(3-2) the candidate document set CDS is sent to the public cloud server.
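The filtering in step (3-1) is a bitwise test per document. A minimal sketch, assuming the filter vectors are held as Python lists of 0/1 values keyed by a document number; the function name and sample data are hypothetical.

```python
def candidate_set(qf, doc_filters):
    """Document numbers whose filter vector shares at least one set bit with QF."""
    return [doc_id for doc_id, df in doc_filters.items()
            if any(q & d for q, d in zip(qf, df))]

# Hypothetical filter vectors stored on the private cloud server (t = 3 blocks).
doc_filters = {"d1": [1, 1, 0], "d2": [1, 0, 1], "d3": [0, 1, 1]}
qf = [0, 1, 0]

print(candidate_set(qf, doc_filters))   # ['d1', 'd3']
```

The same test could be done on integers with one `&` per document if the filter vectors were packed into bitmasks; the list form is used here only to mirror the notation of the text.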
The hybrid-cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval system comprises a data providing end, a data retrieval end, a private cloud server and a public cloud server, wherein:
the data providing end is used for extracting a keyword set from the provided document set and generating a keyword dictionary sequence through clustering and partitioning; generating a corresponding plaintext document vector for each document in the document set according to the keyword dictionary sequence, and blocking the plaintext document vectors according to the blocking condition of the keyword dictionary sequence to form document filtering vectors; encrypting the plaintext document vector to form an encrypted document vector, and encrypting each document in the document set to form an encrypted document set; transmitting the document filtering vector to a private cloud server, and transmitting the encrypted document vector and the encrypted document set to a public cloud server;
the data retrieval end is used for generating retrieval vectors according to a plurality of keywords provided by a user, generating a retrieval trapdoor by adopting a security algorithm after normalization, and transmitting the retrieval trapdoor and the number k of documents to be retrieved by the user to the public cloud server; generating a retrieval filtering vector for a plurality of keywords provided by a user according to the blocking condition of the keywords in the keyword dictionary sequence, and transmitting the retrieval filtering vector to the private cloud server;
the private cloud server is used for performing an AND operation between the received retrieval filtering vector and the document filtering vector of each document, adding the corresponding document number to the candidate document set if the resulting vector is not all zeros, and transmitting the candidate document set to the public cloud server;
the public cloud server is used for computing, based on the received candidate document set, retrieval trapdoor, and number k of documents to retrieve, the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, selecting from the candidate document set the k ciphertext documents most relevant to the keywords provided by the user according to these inner products, and returning the k ciphertext documents to the data retrieval end;
and the data retrieval end is also used for decrypting the received k ciphertext documents to obtain the most relevant k plaintext documents.
Further, the data providing end specifically includes:
a keyword extraction module for extracting keywords from the provided document set DS to obtain a keyword set $\{w_1, w_2, \ldots, w_n\}$;
a clustering module for clustering the keywords in the keyword set according to their correlation to obtain several sub-clusters $\{c_1, c_2, \ldots, c_t\}$;
a keyword dictionary generating module for taking each sub-cluster as a block, giving t blocks $b_1, b_2, \ldots, b_t$, and generating from the blocks a keyword dictionary sequence $W = \{w(b_1,1), w(b_1,2), \ldots, w(b_2,1), w(b_2,2), \ldots, w(b_t,1), w(b_t,2), \ldots\}$, where $w(b_j,x)$ denotes the x-th keyword belonging to block $b_j$ and the keywords within each block are unordered; block $b_j = \{w(b_j,x) \mid 0 < x \le |b_j|\}$;
a plaintext document vector generation module for generating, using the TF-IDF algorithm and the vector space model, a corresponding plaintext document vector $V_i$ for each document $D_i$ in the document set $DS = \{D_1, D_2, \ldots, D_m\}$ according to the positions of the keywords in the keyword dictionary sequence, and normalizing it; the length of $V_i$ is n, and each element takes the term frequency (TF) value in document $D_i$ of the keyword corresponding to that position;
a document filtering vector generation module for dividing the plaintext document vector $V_i$ into t blocks according to the blocking of the keyword dictionary sequence, with the same block boundaries as the keyword dictionary sequence, to obtain the document filtering vector $DF_i = \{b_1, b_2, \ldots, b_t\}$ of each document $D_i$; if all elements of $V_i$ at the positions of the keywords in block $b_j$ are 0, the value of $b_j$ is 0, otherwise it is 1; $DF_i$ is a t-dimensional vector whose elements each take the value 0 or 1;
a key generation module for generating an encryption key $SK = (S, M_1, M_2, k_f)$, where S is a random vector whose elements each take the value 0 or 1, $M_1$ and $M_2$ are two $n \times n$ invertible matrices, n is the length of the keyword dictionary sequence, and $k_f$ is the document encryption key;
a document vector encryption module for encrypting each plaintext document vector $V_i$ with the generated encryption key using the secure KNN technique to obtain the corresponding encrypted document vector, where $V_i$ is split into $V_i'$ and $V_i''$: when the j-th element of the random vector S satisfies $S[j] = 0$, $V_i'[j] + V_i''[j] = V_i[j]$; when $S[j] = 1$, $V_i'[j] = V_i''[j] = V_i[j]$;
a document encryption module for encrypting each document in the document set DS with a symmetric encryption algorithm to obtain the encrypted document set $ES = \{e_1, e_2, \ldots, e_m\}$;
a transmission module for transmitting the document filtering vectors to the private cloud server for storage, and transmitting the encrypted document vectors and the encrypted document set to the public cloud server for storage.
Further, the data retrieval end specifically includes:
a retrieval vector generation module for generating a retrieval vector Q from the keywords $\{w_1, w_2, \ldots, w_x\}$ provided by the user, using the TF-IDF algorithm and the vector space model, and normalizing it; the j-th element of Q, $Q[j]$, is the inverse document frequency (IDF) value of the j-th keyword $w_j$ in the document set DS provided by the data providing end;
a retrieval trapdoor generation module for generating a retrieval trapdoor from the retrieval vector Q using the secure KNN algorithm, where Q is split into $Q'$ and $Q''$: when the j-th bit of the random vector in the encryption key generated by the data providing end satisfies $S[j] = 0$, $Q'[j] = Q''[j] = Q[j]$; when $S[j] = 1$, $Q'[j] + Q''[j] = Q[j]$;
a retrieval filtering vector generation module for generating a retrieval filtering vector $QF = \{b_1, b_2, \ldots, b_t\}$ according to the blocking of the keywords in the keyword dictionary sequence; QF is a t-dimensional vector whose elements each take the value 0 or 1: if all elements of the retrieval vector Q corresponding to the keywords of block $b_j$ are 0, then $QF[j] = 0$, otherwise $QF[j] = 1$;
a transmission module for transmitting the retrieval trapdoor and the number k of documents the user wants to retrieve to the public cloud server, and transmitting the retrieval filtering vector to the private cloud server.
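The splitting rules for the document vector and the retrieval vector can be exercised end to end in a small sketch. One important assumption: the text here does not reproduce the formulas for the encrypted document vector or the trapdoor, so the sketch uses the construction commonly given for secure KNN, namely $(M_1^{T}V', M_2^{T}V'')$ for documents and $(M_1^{-1}Q', M_2^{-1}Q'')$ for queries. The document key $k_f$ is left out, exact fractions replace floating point so the equality can be checked exactly, and all function names are hypothetical.

```python
import random
from fractions import Fraction

def invert(m):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def random_invertible(n, rng):
    """Draw random integer matrices until one is invertible; return it with its inverse."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        try:
            return m, invert(m)
        except ValueError:
            continue

def keygen(n, rng):
    """SK without k_f: split indicator S and two invertible matrices (with inverses)."""
    s = [rng.randint(0, 1) for _ in range(n)]
    return s, random_invertible(n, rng), random_invertible(n, rng)

def split(vec, s, split_when, rng):
    """Random additive shares where S[j] == split_when, identical copies elsewhere."""
    a, b = [], []
    for x, bit in zip(vec, s):
        if bit == split_when:
            r = Fraction(rng.randint(-9, 9))
            a.append(r)
            b.append(x - r)
        else:
            a.append(x)
            b.append(x)
    return a, b

def encrypt_document_vector(v, key, rng):
    s, (m1, _), (m2, _) = key
    v1, v2 = split(v, s, 0, rng)            # documents are split where S[j] == 0
    return mat_vec(transpose(m1), v1), mat_vec(transpose(m2), v2)

def make_trapdoor(q, key, rng):
    s, (_, m1_inv), (_, m2_inv) = key
    q1, q2 = split(q, s, 1, rng)            # queries are split where S[j] == 1
    return mat_vec(m1_inv, q1), mat_vec(m2_inv, q2)

def secure_inner_product(enc_v, trapdoor):
    return sum(x * y for half_v, half_q in zip(enc_v, trapdoor)
               for x, y in zip(half_v, half_q))

rng = random.Random(2018)
key = keygen(4, rng)
v, q = [1, 0, 2, 3], [0, 1, 1, 2]
result = secure_inner_product(encrypt_document_vector(v, key, rng), make_trapdoor(q, key, rng))
print(result == sum(a * b for a, b in zip(v, q)))   # True
```

A real deployment would use floating-point or fixed-point matrices of dimension n and would keep S, $M_1$, and $M_2$ away from both cloud servers; the sketch only demonstrates that the score survives encryption.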
Further, the private cloud server specifically includes:
an operation module for performing an AND operation between the received retrieval filtering vector QF and the document filtering vector $DF_i$ of each document; if the vector $QF \,\&\, DF_i$ is not all zeros, the document number $d_i$ corresponding to $DF_i$ is added to the candidate document set, giving the candidate document set $CDS = \{d_1, d_2, \ldots\}$;
a transmission module for sending the candidate document set CDS to the public cloud server.
Beneficial effects: compared with the prior art, the invention has the following notable advantages:
1. High security
The invention performs multi-keyword ciphertext retrieval in an untrusted public cloud environment and computes secure inner products using the secure KNN technique, under which the inner product of two encrypted vectors equals the inner product of the two corresponding plaintext vectors. The public cloud therefore never needs to decrypt the retrieval trapdoor, the encrypted document vectors, or the encrypted documents. On the public cloud side the entire process operates on ciphertext and still produces the Top-k result. The secure KNN technique thus allows the Top-k retrieval result to be computed from multiple keywords while protecting the data owner's data privacy, and it has been widely applied in multi-keyword ciphertext retrieval.
2. High accuracy
When the data retrieval end submits the keywords of interest, the hybrid-cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method proceeds in two steps: first, the private cloud server generates the candidate document set CDS; then the public cloud server searches the candidate document set for the Top-k results most relevant to the keywords of interest. When the private cloud generates the candidate document set, any document $D_i$ in the whole document set DS that contains one or more of the keywords of interest provided by the data user is added to the candidate document set, so a document that qualifies for the Top-k result is never missing from the candidate document set. Once the public cloud server receives the candidate document set from the private cloud server, it obtains the Top-k result strictly from the inner products between the encrypted document vector of each candidate document and the retrieval trapdoor. Working together, the private cloud server and the public cloud server therefore rank the retrieval results exactly and return the Top-k documents to the data retrieval end.
3. High retrieval efficiency
Existing searchable encryption methods, which rely mainly on secure KNN computation, TF-IDF, and the vector space model, are not efficient. To address this, the hybrid-cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method adds a trusted private cloud server and generates document filtering vectors by blocking the document vectors; the document filtering vectors are uploaded to the private cloud server. Because the document filtering vectors have low dimension, the private cloud server can derive the candidate document set from the retrieval filtering vector provided by the data user with little computation, quickly filtering out a large number of irrelevant documents (the filtered-out documents cannot be part of the final Top-k result). The candidate document set is much smaller than the original document set, so the public cloud server only needs to compute inner products for a small number of encrypted vectors, which greatly reduces its computational overhead. In addition, because the keywords of interest entered by a user are often related, the keywords are not placed at random positions in the keyword dictionary sequence: to improve the filtering effect of the private cloud server and further shrink the candidate document set, the keywords are clustered by correlation into several sub-clusters, and the keywords of each sub-cluster are placed in the same block of the keyword dictionary sequence.
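The claim that encrypted and plaintext inner products coincide can be verified in a few lines. The derivation below is a sketch that assumes the construction usually given for secure KNN, in which the encrypted document vector is $(M_1^{T}V_i', M_2^{T}V_i'')$ and the trapdoor is $(M_1^{-1}Q', M_2^{-1}Q'')$; these two formulas are an assumption, since they are not reproduced in this text. The splitting rules are those of steps (1-7) and (2-2).

```latex
\begin{aligned}
\left(M_1^{T}V_i'\right)^{T}\left(M_1^{-1}Q'\right)
  + \left(M_2^{T}V_i''\right)^{T}\left(M_2^{-1}Q''\right)
  &= {V_i'}^{T} M_1 M_1^{-1} Q' + {V_i''}^{T} M_2 M_2^{-1} Q'' \\
  &= {V_i'}^{T} Q' + {V_i''}^{T} Q'' \\
  &= \sum_{j:\,S[j]=0} \left(V_i'[j] + V_i''[j]\right) Q[j]
     \;+\; \sum_{j:\,S[j]=1} V_i[j] \left(Q'[j] + Q''[j]\right) \\
  &= \sum_{j=1}^{n} V_i[j]\, Q[j] \;=\; V_i^{T} Q .
\end{aligned}
```

In the third line, positions with $S[j]=0$ use $Q'[j]=Q''[j]=Q[j]$ and positions with $S[j]=1$ use $V_i'[j]=V_i''[j]=V_i[j]$, so each position contributes exactly $V_i[j]\,Q[j]$.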
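The effect of grouping related keywords into the same block can be seen on a toy example. The sketch below compares a clustered dictionary layout with a scattered one for the same documents and query; the keywords, documents, and both layouts are hypothetical, and the function works directly on keyword sets rather than on TF vectors.

```python
def candidates(query, docs, blocks):
    """Ids of documents that share at least one set block bit with the query."""
    def bits(words):
        return [int(any(w in words for w in block)) for block in blocks]
    qf = bits(set(query))
    return [doc_id for doc_id, words in docs.items()
            if any(q & d for q, d in zip(qf, bits(set(words))))]

docs = {
    "d1": ["cipher", "key"],
    "d2": ["heart", "doctor"],
    "d3": ["nurse", "heart"],
    "d4": ["trapdoor", "cipher"],
}
query = ["cipher", "key"]

# Same six keywords, two blocks each; only the placement differs.
clustered = [["cipher", "key", "trapdoor"], ["heart", "doctor", "nurse"]]
scattered = [["cipher", "heart", "nurse"], ["key", "doctor", "trapdoor"]]

print(candidates(query, docs, clustered))   # ['d1', 'd4']
print(candidates(query, docs, scattered))   # ['d1', 'd2', 'd3', 'd4']
```

With the clustered layout the query sets a single block bit and the two medical documents are filtered out; with the scattered layout the query touches both blocks and nothing is filtered. In both layouts every document that actually contains a query keyword stays in the candidate set.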
Drawings
FIG. 1 is an architecture diagram of a hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method provided by the present invention;
FIG. 2 is a schematic flow chart of a hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method provided by the invention;
FIG. 3 is a schematic diagram of a keyword dictionary sequence constructed by clustering keywords into 10 small clusters, where the corresponding keyword dictionary sequence is 10 blocks, and each block contains an indefinite number of keywords and has the same number of keywords as the corresponding small clusters;
FIG. 4 is a schematic illustration of a document vector and a document filter vector, and a retrieval vector and a retrieval filter vector before normalization processing;
FIG. 5 is a schematic diagram of a retrieval process, in which first a private cloud server obtains a candidate document set by an AND operation between a document filter vector and a retrieval filter vector, and then sends the candidate document set to a public cloud server; and the public cloud server obtains the Top-k document by calculating the relevancy score between the document vector and the retrieval vector in the candidate document set. For simplicity of drawing, the document vector and the retrieval vector are not subjected to normalization and encryption processes.
Detailed Description
Example 1
The embodiment provides a hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method, as shown in fig. 1 and fig. 2, including the following steps:
(1) the data providing end extracts a keyword set from the provided document set and generates a keyword dictionary sequence through clustering and partitioning; generating a corresponding plaintext document vector for each document in the document set according to the keyword dictionary sequence, and blocking the plaintext document vectors according to the blocking condition of the keyword dictionary sequence to form document filtering vectors; encrypting the plaintext document vector to form an encrypted document vector, and encrypting each document in the document set to form an encrypted document set; and finally, transmitting the document filtering vector to a private cloud server, and transmitting the encrypted document vector and the encrypted document set to a public cloud server.
The method specifically comprises the following steps:
(1-1) extracting keywords from the provided document set DS by the data providing terminal to obtain a keyword set { w1,w2,…,wn};
(1-2) clustering the keywords in the keyword set according to the correlation relationship to obtain a plurality of clustering sub-clusters { c1,c2,…,ct};
(1-3) taking each sub-cluster as a block, thereby obtaining t blocks, b1,b2,…,btAnd generating a keyword dictionary sequence W ═ { W (b) according to the blocks1,1),w(b1,2),…,w(b2,1),w(b2,2),…,w(bt,1),w(bt2), … }, wherein w (b)jX) denotes belonging to the partition bjThe xth keyword in each block is unordered; block bj={w(bj,x)|0<x≤|bjL }; because of the clustering property of keywords, keywords having strong correlation in a dictionary sequence of keywords are clustered in the same block. For example, in fig. 3, the keyword sets are clustered together into 10 small natural clusters, and the number of keywords in each cluster is not fixed, so that the keyword dictionary includes 10 keyword blocks, the number of keywords included in each keyword block is the same as the number of keywords included in the corresponding cluster, and then a keyword dictionary sequence is generated according to the blocks;
(1-4) adopting TF-IDF algorithm and space vector model according to keyword wordsThe position of the keyword in the dictionary sequence is the document set DS ═ D1,D2,…,DmEvery document D iniGenerating a corresponding plaintext document vector ViAnd carrying out normalization processing; wherein, ViIs n, each bit takes the value of the key word corresponding to the bit in the document DiThe word frequency TF value of (1);
(1-5) according to the block situation of the keyword dictionary sequence, the plaintext document vector ViDividing the document into t blocks, wherein the block boundaries are the same as those of the keyword dictionary sequence, and obtaining each document DiDocument filter vector DF ofi={b1,b2,…,bt}; wherein, if ViMiddle block bjAll the positions of the corresponding keywords are taken as 0, and b isjThe value of the block is 0, otherwise bjThe value of a block is 1, DFiIs a vector with 0/1 values of each bit of the t dimension; for example, in fig. 4, a specific example of a document vector and a corresponding document filter vector is given, and in view of drawing simplicity, the document vector is not normalized, and the document is DiCorresponding document vector is ViThe document filtering vector DF is formed according to the position of the block boundary in the keyword dictionary sequenceiAs shown in fig. 4;
(1-6) generating an encryption key SK = (S, M_1, M_2, k_f); wherein S is an n-dimensional random column vector in which each bit takes the value 0 or 1, M_1 and M_2 are two n×n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key; SK is provided only to DO and DU for use and is kept secret from CS;
(1-7) encrypting each plaintext document vector V_i with the generated encryption key by the secure KNN technique to obtain the corresponding encrypted document vector; wherein, when the j-th element S[j] of the random vector S is 0, V_i′[j] + V_i″[j] = V_i[j], and when S[j] = 1, V_i′[j] = V_i″[j] = V_i[j];
(1-8) encrypting each document in the document set DS with a symmetric encryption algorithm to obtain an encrypted document set ES = {e_1, e_2, …, e_m};
(1-9) transmitting the document filtering vectors to the private cloud server for storage, and transmitting the encrypted document vectors and the encrypted document set to the public cloud server for storage.
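Steps (1-3) to (1-5) can be illustrated with a minimal Python sketch. It assumes the keyword clusters are already available as lists of strings and that each document is a list of tokens; the function names `build_dictionary`, `document_vector` and `filter_vector`, the sample keywords, and the use of Euclidean normalization are illustrative assumptions and are not taken from the patent.

```python
import math
from collections import Counter

def build_dictionary(clusters):
    """Flatten keyword clusters into a dictionary sequence W plus block boundaries."""
    dictionary, boundaries = [], []
    for cluster in clusters:
        start = len(dictionary)
        dictionary.extend(cluster)
        boundaries.append((start, len(dictionary)))  # half-open range [start, end)
    return dictionary, boundaries

def document_vector(tokens, dictionary):
    """TF vector laid out in dictionary order, scaled to unit length (assumed normalization)."""
    counts = Counter(tokens)
    tf = [counts[w] / len(tokens) if tokens else 0.0 for w in dictionary]
    norm = math.sqrt(sum(x * x for x in tf))
    return [x / norm for x in tf] if norm else tf

def filter_vector(vector, boundaries):
    """DF_i[j] is 1 when block b_j has at least one non-zero entry, otherwise 0."""
    return [int(any(vector[p] != 0 for p in range(s, e))) for s, e in boundaries]

# Hypothetical clusters and document, used only for illustration.
clusters = [["cloud", "server"], ["cipher", "key", "encrypt"], ["tree"]]
W, bounds = build_dictionary(clusters)
V = document_vector("cloud server cloud storage".split(), W)
DF = filter_vector(V, bounds)
print(W)       # ['cloud', 'server', 'cipher', 'key', 'encrypt', 'tree']
print(bounds)  # [(0, 2), (2, 5), (5, 6)]
print(DF)      # [1, 0, 0]
```

Because the filtering vector has only t bits per document rather than n values, it is much smaller than the document vector, which is what makes the pre-filtering step on the private cloud server cheap.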
(2) The data retrieval end generates retrieval vectors according to a plurality of keywords provided by a user, generates a retrieval trapdoor by adopting a security algorithm after normalization, and transmits the retrieval trapdoor and the number k of documents to be retrieved by the user to a public cloud server; and generating a retrieval filtering vector for a plurality of keywords provided by the user according to the blocking condition of the keywords in the keyword dictionary sequence, and transmitting the retrieval filtering vector to the private cloud server.
The method specifically comprises the following steps:
(2-1) the data retrieval end generates a retrieval vector Q from the plurality of keywords {w_1, w_2, …, w_x} provided by the user, adopting the TF-IDF algorithm and the vector space model, and normalizes it; wherein the j-th element of Q, Q[j], is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
(2-2) generating a retrieval trapdoor from the retrieval vector Q by the secure KNN algorithm; wherein, when the j-th bit S[j] of the random vector in the encryption key generated by the data providing end is 0, Q′[j] = Q″[j] = Q[j], and when S[j] = 1, Q′[j] + Q″[j] = Q[j];
(2-3) generating a retrieval filtering vector QF = {b_1, b_2, …, b_t} according to the blocks of the keyword dictionary sequence in which the keywords fall; QF is a t-dimensional vector in which each bit takes the value 0 or 1: if all positions in the retrieval vector Q corresponding to the keywords of block b_j are 0, then QF[j] = 0, and otherwise QF[j] = 1;
(2-4) transmitting the retrieval trapdoor and the number k of documents to be retrieved by the user to the public cloud server, and transmitting the retrieval filtering vector to the private cloud server.
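The complementary splitting rules of steps (1-7) and (2-2) are what let the public cloud server recover the plaintext inner product without seeing V_i or Q. The following Python sketch checks this on a toy example. The splitting rules follow the text above; the final forms {M_1^T V′, M_2^T V″} for the encrypted document vector and {M_1^{-1} Q′, M_2^{-1} Q″} for the trapdoor are the commonly used secure KNN construction and are an assumption here, since the formulas are not legible in this text. Exact fractions, the helper names, and the sample vectors are also illustrative choices.

```python
import random
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inverse over exact fractions; returns None if the matrix is singular."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def random_invertible(n, rng):
    """Draw random integer matrices until one is invertible; return it with its inverse."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        inv = invert(m)
        if inv is not None:
            return m, inv

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def split(vec, S, split_when, rng):
    """Random additive shares where S[j] == split_when, identical copies elsewhere."""
    first, second = [], []
    for x, s in zip(vec, S):
        if s == split_when:
            r = Fraction(rng.randint(-9, 9), 4)
            first.append(r)
            second.append(x - r)
        else:
            first.append(x)
            second.append(x)
    return first, second

rng = random.Random(7)
n = 4
S = [rng.randint(0, 1) for _ in range(n)]
M1, M1_inv = random_invertible(n, rng)
M2, M2_inv = random_invertible(n, rng)

V = [Fraction(1, 2), Fraction(0), Fraction(1, 4), Fraction(0)]  # toy document vector
Q = [Fraction(3), Fraction(0), Fraction(2), Fraction(1)]        # toy retrieval vector

V1, V2 = split(V, S, split_when=0, rng=rng)  # document shares: split where S[j] == 0
Q1, Q2 = split(Q, S, split_when=1, rng=rng)  # query shares: split where S[j] == 1

enc_doc = (mat_vec(transpose(M1), V1), mat_vec(transpose(M2), V2))  # assumed form
trapdoor = (mat_vec(M1_inv, Q1), mat_vec(M2_inv, Q2))               # assumed form

score = dot(enc_doc[0], trapdoor[0]) + dot(enc_doc[1], trapdoor[1])
print(score == dot(V, Q))  # True
```

At every position exactly one of the two vectors is split into additive shares while the other is copied, so the two share products always sum to V_i[j]·Q[j]; the matrices cancel because (M^T a)·(M^{-1} b) = a·b.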
(3) The private cloud server performs an AND operation between the received retrieval filtering vector and the document filtering vector of each document; if the bits of the resulting vector are not all 0, the corresponding document number is added to the candidate document set, and the candidate document set is transmitted to the public cloud server.
The method specifically comprises the following steps:
(3-1) the private cloud server performs an AND operation between the received retrieval filtering vector QF and the document filtering vector DF_i of each document; if the bits of the vector obtained from QF & DF_i are not all 0, the document number Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
(3-2) sending the candidate document set CDS to the public cloud server. Fig. 5 shows a specific query example: by performing the AND operation between the retrieval filtering vector and each document filtering vector, the private cloud server finds the document numbers whose operation results are not all 0, obtains the candidate document set CDS = {Did_1, Did_5, Did_6}, and then sends the CDS to the public cloud server.
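The filtering performed by the private cloud server in step (3) can be sketched in a few lines of Python. The sketch assumes that filtering vectors are stored as lists of 0/1 integers keyed by document number, and that a block counts as hit whenever a query keyword belongs to it (that is, the keyword's IDF value is non-zero); the function names and the sample data are hypothetical.

```python
def query_filter_vector(query_keywords, blocks):
    """QF[j] is 1 when at least one query keyword falls in block b_j."""
    wanted = set(query_keywords)
    return [int(any(w in wanted for w in block)) for block in blocks]

def candidate_documents(qf, doc_filters):
    """Return the numbers of documents whose DF_i shares at least one set bit with QF."""
    return [doc_id for doc_id, df in doc_filters.items()
            if any(q & d for q, d in zip(qf, df))]

# Hypothetical keyword blocks and stored document filtering vectors.
blocks = [["cloud", "server"], ["cipher", "key"], ["tree", "index"]]
doc_filters = {1: [1, 0, 0], 2: [0, 0, 1], 3: [1, 1, 0], 4: [0, 1, 1]}

QF = query_filter_vector(["key", "cloud"], blocks)
CDS = candidate_documents(QF, doc_filters)
print(QF)   # [1, 1, 0]
print(CDS)  # [1, 3, 4]
```

Document 2 is dropped because it contains no keyword from any block touched by the query, so its relevance score would be 0 and it does not need to be scored on the public cloud server.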
(4) The public cloud server respectively calculates a security inner product between an encrypted document vector corresponding to each document in the candidate document set and the retrieval trapdoor according to the received candidate document set, the retrieval trapdoor and the number k of the retrieved documents, selects k ciphertext documents most relevant to the keywords provided by the user in the candidate document set according to the security inner product, and returns the k ciphertext documents to the data retrieval end.
For example, in Fig. 5 the public cloud server receives the candidate document set CDS = {Did_1, Did_5, Did_6} sent by the private cloud server. The search space for the Top-k documents is then no longer the whole document set DS = {D_1, D_2, …, D_10} but the candidate document set {D_1, D_5, D_6}; that is, the search space shrinks from the original 10 documents to 3, so only 3 inner product computations between vectors are needed. The public cloud server computes, one by one for the documents in the candidate document set, the dot product between the encrypted document vector of each document and the encrypted retrieval vector, obtains 3 relevance scores, selects the encrypted documents corresponding to the k largest scores, and returns them to the data retrieval end.
(5) The data retrieval end decrypts the received k ciphertext documents with the symmetric key to obtain the k most relevant plaintext documents.
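The Top-k selection over the candidate set in step (4) might look like the following Python sketch. It assumes each encrypted document vector and the trapdoor are stored as a pair of numeric lists and that the relevance score is the sum of the two pairwise dot products; the name `top_k`, the data layout and the toy numbers are assumptions made for illustration.

```python
import heapq

def top_k(candidate_ids, encrypted_index, trapdoor, k):
    """Score only the candidate documents and return the k highest-scoring document numbers."""
    t1, t2 = trapdoor

    def score(doc_id):
        d1, d2 = encrypted_index[doc_id]
        return (sum(a * b for a, b in zip(d1, t1))
                + sum(a * b for a, b in zip(d2, t2)))

    return heapq.nlargest(k, candidate_ids, key=score)

# Toy encrypted index; document 9 is stored but is not in the candidate set.
encrypted_index = {
    1: ([1, 2], [0, 1]),
    5: ([3, 0], [1, 1]),
    6: ([0, 1], [2, 0]),
    9: ([9, 9], [9, 9]),
}
trapdoor = ([1, 1], [2, 1])

result = top_k([1, 5, 6], encrypted_index, trapdoor, k=2)
print(result)  # [5, 6]
```

Only the three candidates are scored (4, 6 and 5 for documents 1, 5 and 6), so document 9 is never touched even though it is stored on the server.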
Example 2
This embodiment provides a hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval system, which comprises a data providing end, a data retrieval end, a private cloud server and a public cloud server, wherein:
the data providing end is used for extracting a keyword set from the provided document set and generating a keyword dictionary sequence through clustering and partitioning; generating a corresponding plaintext document vector for each document in the document set according to the keyword dictionary sequence, and blocking the plaintext document vectors according to the blocking condition of the keyword dictionary sequence to form document filtering vectors; encrypting the plaintext document vector to form an encrypted document vector, and encrypting each document in the document set to form an encrypted document set; transmitting the document filtering vector to a private cloud server, and transmitting the encrypted document vector and the encrypted document set to a public cloud server;
the data retrieval end is used for generating retrieval vectors according to a plurality of keywords provided by a user, generating a retrieval trapdoor by adopting a security algorithm after normalization, and transmitting the retrieval trapdoor and the number k of documents to be retrieved by the user to the public cloud server; generating a retrieval filtering vector for a plurality of keywords provided by a user according to the blocking condition of the keywords in the keyword dictionary sequence, and transmitting the retrieval filtering vector to the private cloud server;
the private cloud server is used for respectively carrying out AND operation on the received retrieval filtering vector and the document filtering vector of each document, if all bits of the vector obtained by the operation are not all 0, adding the corresponding document number to the candidate document set, and transmitting the candidate document set to the public cloud server;
the public cloud server is used for respectively calculating a security inner product between an encrypted document vector corresponding to each document in the candidate document set and the retrieval trapdoor according to the received candidate document set, the retrieval trapdoor and the number k of the retrieved documents, selecting k ciphertext documents most relevant to the keywords provided by the user in the candidate document set according to the security inner product, and returning the k ciphertext documents to the data retrieval end;
and the data retrieval end is also used for decrypting the received k ciphertext documents to obtain the most relevant k plaintext documents.
Further, the data providing end specifically includes:
a keyword extraction module for extracting keywords from the provided document set DS to obtain a keyword set {w_1, w_2, …, w_n};
a clustering module for clustering the keywords in the keyword set according to their correlation to obtain a plurality of sub-clusters {c_1, c_2, …, c_t};
a keyword dictionary generation module for taking each sub-cluster as a block, thereby obtaining t blocks b_1, b_2, …, b_t, and generating a keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …} according to the blocks; wherein w(b_j,x) denotes the x-th keyword belonging to block b_j, and the keywords within each block are unordered; block b_j = {w(b_j,x) | 0 < x ≤ |b_j|};
a plaintext document vector generation module for adopting the TF-IDF algorithm and the vector space model and, according to the positions of the keywords in the keyword dictionary sequence, generating a corresponding plaintext document vector V_i for every document D_i in the document set DS = {D_1, D_2, …, D_m} and normalizing it; wherein the length of V_i is n, and each element takes the term frequency (TF) value, in document D_i, of the keyword corresponding to that position;
a document filtering vector generation module for dividing the plaintext document vector V_i into t blocks according to the blocking of the keyword dictionary sequence, with the same block boundaries as the keyword dictionary sequence, and obtaining the document filtering vector DF_i = {b_1, b_2, …, b_t} of each document D_i; wherein, if all positions of V_i corresponding to the keywords of block b_j are 0, the value of b_j is 0, and otherwise the value of b_j is 1, so that DF_i is a t-dimensional vector in which each bit takes the value 0 or 1;
a key generation module for generating an encryption key SK = (S, M_1, M_2, k_f); wherein S is a random vector in which each bit takes the value 0 or 1, M_1 and M_2 are two n×n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
a document vector encryption module for encrypting each plaintext document vector V_i with the generated encryption key by the secure KNN technique to obtain the corresponding encrypted document vector; wherein, when the j-th element S[j] of the random vector S is 0, V_i′[j] + V_i″[j] = V_i[j], and when S[j] = 1, V_i′[j] = V_i″[j] = V_i[j];
a document encryption module for encrypting each document in the document set DS with a symmetric encryption algorithm to obtain an encrypted document set ES = {e_1, e_2, …, e_m};
a transmission module for transmitting the document filtering vectors to the private cloud server for storage, and transmitting the encrypted document vectors and the encrypted document set to the public cloud server for storage.
Further, the data retrieval end specifically includes:
a retrieval vector generation module for generating a retrieval vector Q from the plurality of keywords {w_1, w_2, …, w_x} provided by the user, adopting the TF-IDF algorithm and the vector space model, and normalizing it; wherein the j-th element of Q, Q[j], is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
a retrieval trapdoor generation module for generating a retrieval trapdoor from the retrieval vector Q by the secure KNN algorithm; wherein, when the j-th bit S[j] of the random vector in the encryption key generated by the data providing end is 0, Q′[j] = Q″[j] = Q[j], and when S[j] = 1, Q′[j] + Q″[j] = Q[j];
a retrieval filtering vector generation module for generating a retrieval filtering vector QF = {b_1, b_2, …, b_t} according to the blocks of the keyword dictionary sequence in which the keywords fall; QF is a t-dimensional vector in which each bit takes the value 0 or 1: if all positions in the retrieval vector Q corresponding to the keywords of block b_j are 0, then QF[j] = 0, and otherwise QF[j] = 1;
a transmission module for transmitting the retrieval trapdoor and the number k of documents to be retrieved by the user to the public cloud server, and transmitting the retrieval filtering vector to the private cloud server.
Further, the private cloud server specifically includes:
an operation module for performing an AND operation between the received retrieval filtering vector QF and the document filtering vector DF_i of each document; if the bits of the vector obtained from QF & DF_i are not all 0, the document number Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
a transmission module for sending the candidate document set CDS to the public cloud server.
The system corresponds to the method of embodiment 1 one to one, and other parts are not described again, so that reference may be made to embodiment 1.
The above disclosure is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the scope of the present invention, therefore, the appended claims are to be accorded the full scope of the invention.
Claims (8)
1. A hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method, characterized by comprising the following steps:
(1) the data providing end extracts a keyword set from the provided document set and generates a keyword dictionary sequence through clustering and partitioning; generating a corresponding plaintext document vector for each document in the document set according to the keyword dictionary sequence, and blocking the plaintext document vectors according to the blocking condition of the keyword dictionary sequence to form document filtering vectors; encrypting the plaintext document vector to form an encrypted document vector, and encrypting each document in the document set to form an encrypted document set; finally, the document filtering vector is transmitted to a private cloud server, and the encrypted document vector and the encrypted document set are transmitted to a public cloud server;
(2) the data retrieval end generates retrieval vectors according to a plurality of keywords provided by a user, generates a retrieval trapdoor by adopting a security algorithm after normalization, and transmits the retrieval trapdoor and the number k of documents to be retrieved by the user to a public cloud server; generating a retrieval filtering vector for a plurality of keywords provided by a user according to the blocking condition of the keywords in the keyword dictionary sequence, and transmitting the retrieval filtering vector to a private cloud server;
(3) the private cloud server respectively performs AND operation on the received retrieval filtering vector and the document filtering vector of each document, if all bits of the vector obtained through operation are not all 0, the corresponding document number is added to the candidate document set, and the candidate document set is transmitted to the public cloud server;
(4) the public cloud server respectively calculates a security inner product between an encrypted document vector corresponding to each document in the candidate document set and the retrieval trapdoor according to the received candidate document set, the retrieval trapdoor and the number k of the retrieved documents, selects k ciphertext documents most relevant to the keywords provided by the user in the candidate document set according to the security inner product, and returns the k ciphertext documents to the data retrieval end;
(5) and the data retrieval end decrypts the received k ciphertext documents to obtain the most relevant k plaintext documents.
2. The hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method according to claim 1, wherein: the step (1) specifically comprises the following steps:
(1-1) the data providing end extracts keywords from the provided document set DS to obtain a keyword set {w_1, w_2, …, w_n}, where n is the number of keywords;
(1-2) clustering the keywords in the keyword set according to their correlation to obtain a plurality of sub-clusters {c_1, c_2, …, c_t};
(1-3) taking each sub-cluster as a block, thereby obtaining t blocks b_1, b_2, …, b_t, and generating a keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …} according to the blocks, wherein w(b_j,x) denotes the x-th keyword belonging to block b_j, and the keywords within each block are unordered; block b_j = {w(b_j,x) | 0 < x ≤ |b_j|};
(1-4) adopting the TF-IDF algorithm and the vector space model and, according to the positions of the keywords in the keyword dictionary sequence, generating a corresponding plaintext document vector V_i for every document D_i in the document set DS = {D_1, D_2, …, D_m} and normalizing it; where m is the number of documents in the document set DS, the length of V_i is n, and each element takes the term frequency (TF) value, in document D_i, of the keyword corresponding to that position;
(1-5) dividing the plaintext document vector V_i into t blocks according to the blocking of the keyword dictionary sequence, with the same block boundaries as the keyword dictionary sequence, and obtaining the document filtering vector DF_i = {b_1, b_2, …, b_t} of each document D_i; wherein, if all positions of V_i corresponding to the keywords of block b_j are 0, the value of b_j is 0, and otherwise the value of b_j is 1, so that DF_i is a t-dimensional vector in which each bit takes the value 0 or 1;
(1-6) generating an encryption key SK = (S, M_1, M_2, k_f); wherein S is a random vector in which each bit takes the value 0 or 1, M_1 and M_2 are two n×n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
(1-7) encrypting each plaintext document vector V_i with the generated encryption key by the secure KNN technique to obtain the corresponding encrypted document vector; wherein, when the j-th element S[j] of the random vector S is 0, V_i′[j] + V_i″[j] = V_i[j], and when S[j] = 1, V_i′[j] = V_i″[j] = V_i[j];
(1-8) encrypting each document in the document set DS with a symmetric encryption algorithm to obtain an encrypted document set ES = {e_1, e_2, …, e_m};
(1-9) transmitting the document filtering vectors to the private cloud server for storage, and transmitting the encrypted document vectors and the encrypted document set to the public cloud server for storage.
3. The hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method according to claim 1, wherein: the step (2) specifically comprises the following steps:
(2-1) the data retrieval end generates a retrieval vector Q from the plurality of keywords {w_1, w_2, …, w_x} provided by the user, adopting the TF-IDF algorithm and the vector space model, and normalizes it; wherein the j-th element of Q, Q[j], is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
(2-2) generating a retrieval trapdoor from the retrieval vector Q by the secure KNN algorithm; wherein, when the j-th bit S[j] of the random vector in the encryption key generated by the data providing end is 0, Q′[j] = Q″[j] = Q[j], and when S[j] = 1, Q′[j] + Q″[j] = Q[j];
(2-3) generating a retrieval filtering vector QF = {b_1, b_2, …, b_t} according to the blocks of the keyword dictionary sequence in which the keywords fall; QF is a t-dimensional vector in which each bit takes the value 0 or 1: if all positions in the retrieval vector Q corresponding to the keywords of block b_j are 0, then QF[j] = 0, and otherwise QF[j] = 1;
(2-4) transmitting the retrieval trapdoor and the number k of documents to be retrieved by the user to the public cloud server, and transmitting the retrieval filtering vector to the private cloud server.
4. The hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval method according to claim 1, wherein: the step (3) specifically comprises the following steps:
(3-1) the private cloud server performs an AND operation between the received retrieval filtering vector QF and the document filtering vector DF_i of each document; if the bits of the vector obtained from QF & DF_i are not all 0, the document number Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
(3-2) sending the candidate document set CDS to the public cloud server.
5. A hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval system, characterized by comprising a data providing end, a data retrieval end, a private cloud server and a public cloud server, wherein:
the data providing end is used for extracting a keyword set from the provided document set and generating a keyword dictionary sequence through clustering and partitioning; generating a corresponding plaintext document vector for each document in the document set according to the keyword dictionary sequence, and blocking the plaintext document vectors according to the blocking condition of the keyword dictionary sequence to form document filtering vectors; encrypting the plaintext document vector to form an encrypted document vector, and encrypting each document in the document set to form an encrypted document set; transmitting the document filtering vector to a private cloud server, and transmitting the encrypted document vector and the encrypted document set to a public cloud server;
the data retrieval end is used for generating retrieval vectors according to a plurality of keywords provided by a user, generating a retrieval trapdoor by adopting a security algorithm after normalization, and transmitting the retrieval trapdoor and the number k of documents to be retrieved by the user to the public cloud server; generating a retrieval filtering vector for a plurality of keywords provided by a user according to the blocking condition of the keywords in the keyword dictionary sequence, and transmitting the retrieval filtering vector to the private cloud server;
the private cloud server is used for respectively carrying out AND operation on the received retrieval filtering vector and the document filtering vector of each document, if all bits of the vector obtained by the operation are not all 0, adding the corresponding document number to the candidate document set, and transmitting the candidate document set to the public cloud server;
the public cloud server is used for respectively calculating a security inner product between an encrypted document vector corresponding to each document in the candidate document set and the retrieval trapdoor according to the received candidate document set, the retrieval trapdoor and the number k of the retrieved documents, selecting k ciphertext documents most relevant to the keywords provided by the user in the candidate document set according to the security inner product, and returning the k ciphertext documents to the data retrieval end;
and the data retrieval end is also used for decrypting the received k ciphertext documents to obtain the most relevant k plaintext documents.
6. The hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval system according to claim 5, wherein: the data providing end specifically comprises:
a keyword extraction module for extracting keywords from the provided document set DS to obtain a keyword set {w_1, w_2, …, w_n}, where n is the number of keywords;
a clustering module for clustering the keywords in the keyword set according to their correlation to obtain a plurality of sub-clusters {c_1, c_2, …, c_t};
a keyword dictionary generation module for taking each sub-cluster as a block, thereby obtaining t blocks b_1, b_2, …, b_t, and generating a keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …} according to the blocks; wherein w(b_j,x) denotes the x-th keyword belonging to block b_j, and the keywords within each block are unordered; block b_j = {w(b_j,x) | 0 < x ≤ |b_j|};
a plaintext document vector generation module for adopting the TF-IDF algorithm and the vector space model and, according to the positions of the keywords in the keyword dictionary sequence, generating a corresponding plaintext document vector V_i for every document D_i in the document set DS = {D_1, D_2, …, D_m} and normalizing it; where m is the number of documents in the document set DS, the length of V_i is n, and each element takes the term frequency (TF) value, in document D_i, of the keyword corresponding to that position;
a document filtering vector generation module for dividing the plaintext document vector V_i into t blocks according to the blocking of the keyword dictionary sequence, with the same block boundaries as the keyword dictionary sequence, and obtaining the document filtering vector DF_i = {b_1, b_2, …, b_t} of each document D_i; wherein, if all positions of V_i corresponding to the keywords of block b_j are 0, the value of b_j is 0, and otherwise the value of b_j is 1, so that DF_i is a t-dimensional vector in which each bit takes the value 0 or 1;
a key generation module for generating an encryption key SK = (S, M_1, M_2, k_f); wherein S is a random vector in which each bit takes the value 0 or 1, M_1 and M_2 are two n×n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
a document vector encryption module for encrypting each plaintext document vector V_i with the generated encryption key by the secure KNN technique to obtain the corresponding encrypted document vector; wherein, when the j-th element S[j] of the random vector S is 0, V_i′[j] + V_i″[j] = V_i[j], and when S[j] = 1, V_i′[j] = V_i″[j] = V_i[j];
a document encryption module for encrypting each document in the document set DS with a symmetric encryption algorithm to obtain an encrypted document set ES = {e_1, e_2, …, e_m};
a transmission module for transmitting the document filtering vectors to the private cloud server for storage, and transmitting the encrypted document vectors and the encrypted document set to the public cloud server for storage.
7. The hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval system according to claim 5, wherein: the data retrieval end specifically comprises:
a retrieval vector generation module for generating a retrieval vector Q from the plurality of keywords {w_1, w_2, …, w_x} provided by the user, adopting the TF-IDF algorithm and the vector space model, and normalizing it; wherein the j-th element of Q, Q[j], is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
a retrieval trapdoor generation module for generating a retrieval trapdoor from the retrieval vector Q by the secure KNN algorithm; wherein, when the j-th bit S[j] of the random vector in the encryption key generated by the data providing end is 0, Q′[j] = Q″[j] = Q[j], and when S[j] = 1, Q′[j] + Q″[j] = Q[j];
a retrieval filtering vector generation module for generating a retrieval filtering vector QF = {b_1, b_2, …, b_t} according to the blocks of the keyword dictionary sequence in which the keywords fall; QF is a t-dimensional vector in which each bit takes the value 0 or 1: if all positions in the retrieval vector Q corresponding to the keywords of block b_j are 0, then QF[j] = 0, and otherwise QF[j] = 1;
a transmission module for transmitting the retrieval trapdoor and the number k of documents to be retrieved by the user to the public cloud server, and transmitting the retrieval filtering vector to the private cloud server.
8. The hybrid cloud-oriented privacy protection multi-keyword Top-k ciphertext retrieval system according to claim 5, wherein: the private cloud server specifically includes:
an operation module for performing an AND operation between the received retrieval filtering vector QF and the document filtering vector DF_i of each document; if the bits of the vector obtained from QF & DF_i are not all 0, the document number Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
a transmission module for sending the candidate document set CDS to the public cloud server.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810122376.8A CN108363689B (en) | 2018-02-07 | 2018-02-07 | Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108363689A (en) | 2018-08-03 |
CN108363689B (en) | 2021-03-19 |
Family
ID=63005057
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810122376.8A Active CN108363689B (en) | 2018-02-07 | 2018-02-07 | Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108363689B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109194666B (en) * | 2018-09-18 | 2021-06-01 | 东北大学 | LBS-based security kNN query method |
CN109271485B (en) * | 2018-09-19 | 2022-03-08 | 南京邮电大学 | Cloud environment encrypted document sequencing and searching method supporting semantics |
CN109739945B (en) * | 2018-12-13 | 2022-11-08 | 南京邮电大学 | Multi-keyword ciphertext sorting and searching method based on mixed index |
CN110727951B (en) * | 2019-10-14 | 2021-08-27 | 桂林电子科技大学 | Lightweight outsourcing file multi-keyword retrieval method and system with privacy protection function |
CN110895611B (en) * | 2019-11-26 | 2021-04-02 | 支付宝(杭州)信息技术有限公司 | Data query method, device, equipment and system based on privacy information protection |
CN112597268B (en) * | 2020-12-22 | 2022-09-20 | 南京邮电大学 | Retrieval filtering threshold value selection method for cloud environment ciphertext retrieval efficiency optimization |
CN114189391B (en) * | 2022-02-14 | 2022-04-29 | 浙江易天云网信息科技有限公司 | Privacy data control and management method suitable for hybrid cloud |
CN116521743A (en) * | 2023-06-27 | 2023-08-01 | 北京中科江南信息技术股份有限公司 | Ciphertext retrieval method and device, storage medium and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104765848A (en) * | 2015-04-17 | 2015-07-08 | 中国人民解放军空军航空大学 | Symmetrical searchable encryption method for supporting result high-efficiency sequencing in hybrid cloud storage |
CN105681280A (en) * | 2015-12-29 | 2016-06-15 | 西安电子科技大学 | Searchable encryption method based on Chinese in cloud environment |
CN106815350A (en) * | 2017-01-19 | 2017-06-09 | 安徽大学 | Dynamic ciphertext multi-key word searches for method generally in a kind of cloud environment |
CN107634829A (en) * | 2017-09-12 | 2018-01-26 | 南京理工大学 | Encrypted electronic medical records system and encryption method can search for based on attribute |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012049679A (en) * | 2010-08-25 | 2012-03-08 | Sony Corp | Terminal apparatus, server, data processing system, data processing method and program |
- 2018-02-07: CN application CN201810122376.8A, patent CN108363689B, status Active
Non-Patent Citations (2)
Title |
---|
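A survey of data security and privacy protection technologies for public cloud storage services; Li Hui et al.; Journal of Computer Research and Development (《计算机研究与发展》); 20140731; full text * |
A survey of multi-keyword ranked ciphertext retrieval for cloud environments; Dai Hua et al.; Computer Science (《计算机科学》); 20190131; full text * |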
Also Published As
Publication number | Publication date |
---|---|
CN108363689A (en) | 2018-08-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108363689B (en) | Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud | |
US11567950B2 (en) | System and method for confidentiality-preserving rank-ordered search | |
Yuan et al. | SEISA: Secure and efficient encrypted image search with access control | |
Zhang et al. | SE-PPFM: A searchable encryption scheme supporting privacy-preserving fuzzy multikeyword in cloud systems | |
US8819408B2 (en) | Document processing method and system | |
US9197613B2 (en) | Document processing method and system | |
CN108959567B (en) | Safe retrieval method suitable for large-scale images in cloud environment | |
Dai et al. | A privacy-preserving multi-keyword ranked search over encrypted data in hybrid clouds | |
CN115314295B (en) | Block chain-based searchable encryption technical method | |
Al Sibahee et al. | Efficient encrypted image retrieval in IoT-cloud with multi-user authentication | |
Boucenna et al. | Secure inverted index based search over encrypted cloud data with user access rights management | |
Gong et al. | A privacy-preserving image retrieval method based on improved bovw model in cloud environment | |
CN113779597B (en) | Method, device, equipment and medium for storing and similar searching of encrypted document | |
CN109740378B (en) | Security pair index structure resisting keyword privacy disclosure and retrieval method thereof | |
Ren et al. | Privacy-preserving ranked multi-keyword search leveraging polynomial function in cloud computing | |
EP2775420A1 (en) | Semantic search over encrypted data | |
Zhang et al. | A verifiable and dynamic multi-keyword ranked search scheme over encrypted cloud data with accuracy improvement | |
Li et al. | Paillier-based fuzzy multi-keyword searchable encryption scheme with order-preserving | |
CN111966778B (en) | Multi-keyword ciphertext sorting and searching method based on keyword grouping reverse index | |
CN113158245A (en) | Method, system, equipment and readable storage medium for searching document | |
Manasrah et al. | A privacy-preserving multi-keyword search approach in cloud computing | |
Huang et al. | Efficient privacy-preserving content-based image retrieval in the cloud | |
Salmani et al. | Leakless privacy-preserving multi-keyword ranked search over encrypted cloud data | |
CN113158209A (en) | Top-k query why-not problem processing method for protecting privacy | |
Xu et al. | Achieving fine-grained multi-keyword ranked search over encrypted cloud data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20180803; Assignee: NUPT INSTITUTE OF BIG DATA RESEARCH AT YANCHENG; Assignor: NANJING University OF POSTS AND TELECOMMUNICATIONS; Contract record no.: X2021980013920; Denomination of invention: Hybrid cloud oriented privacy protection multi keyword Top-k ciphertext retrieval method and system; Granted publication date: 20210319; License type: Common License; Record date: 20211202 |