CN108363689A - Privacy-preserving multi-keyword Top-k ciphertext retrieval method and system for hybrid clouds - Google Patents

Publication number
CN108363689A
Authority
CN
China
Prior art keywords
document
vector
keyword
retrieval
cloud server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810122376.8A
Other languages
Chinese (zh)
Other versions
CN108363689B (en
Inventor
戴华
朱向洋
杨庚
白双杰
史经启
孙彦珺
王敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN201810122376.8A priority Critical patent/CN108363689B/en
Publication of CN108363689A publication Critical patent/CN108363689A/en
Application granted granted Critical
Publication of CN108363689B publication Critical patent/CN108363689B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/93Document management systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147Distances to closest patterns, e.g. nearest neighbour classification
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/06Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Abstract

The invention discloses a privacy-preserving multi-keyword Top-k ciphertext retrieval method and system for hybrid clouds, which mainly addresses the problem of low retrieval efficiency. The scheme is as follows: the data provider uses the correlation between keywords to generate a keyword dictionary sequence through a clustering strategy; for each document it generates a high-dimensional document vector and a low-dimensional document filter vector, then outsources the ciphertext documents and the encrypted document vectors to the untrusted public cloud server and stores the plaintext document filter vectors on the trusted private cloud server. During retrieval, the private cloud server first computes the candidate document set, and the public cloud server then computes the Top-k documents of the retrieval result. Because related keywords are grouped together in the keyword dictionary sequence, the filtering performed by the private cloud server is more effective and the candidate document set is smaller. The invention has a simple flow, is highly secure and easy to implement, and can realize efficient multi-keyword ciphertext retrieval in a hybrid cloud environment with low computational overhead.

Description

Privacy-preserving multi-keyword Top-k ciphertext retrieval method and system for hybrid clouds
Technical field
The present invention relates to privacy protection of user data, and in particular to a privacy-preserving multi-keyword Top-k ciphertext retrieval method and system for hybrid clouds.
Background
The idea of delivering IT resources as services is becoming increasingly popular, showing a trend toward "everything as a service" (X as a Service, XaaS), and "service" has become the core concept of cloud computing. However, as cloud computing flourishes, cloud security has also become a widely discussed problem. In a cloud environment, because users cannot directly control data placed on a remote cloud server (Cloud Server, CS), they worry that their outsourced data may be illegally obtained or abused by the cloud service provider, especially sensitive data with high privacy requirements such as electronic medical records, bank transaction data and user mail. Although cloud service providers state that they provide security countermeasures against privacy leakage, such as access control, firewalls and intrusion detection, users' concern about data security is undoubtedly a main obstacle to the further development of cloud computing.
A common way to protect data privacy is to encrypt the data before outsourcing it to the public cloud server, but this severely restricts the use of the outsourced data. In information retrieval research, existing keyword-based retrieval mainly targets plaintext data and cannot be applied directly to ciphertext retrieval. Downloading all encrypted data from the cloud and decrypting it locally is clearly impractical and wasteful. Therefore, studying and solving ciphertext retrieval with privacy protection in cloud environments is a challenge, and it has become one of the hot topics in cloud computing research in recent years.
Most prior-art methods assume by default that a public cloud service is used and that the public cloud provides service according to the "semi-honest" model; based on this assumption, a series of multi-keyword ciphertext retrieval methods for encrypted cloud environments have been proposed. These methods, however, suffer from one or more problems such as low retrieval efficiency, inaccurate retrieval results, and complex index tree structures.
To address these problems, Chinese invention patent application No. 201710181664.6 discloses a fast multi-keyword semantic ranked search method that protects data privacy in cloud computing. By adding a private cloud server, it creates a corresponding identification vector while creating the document vector for each document, outsources the encrypted document vectors to the public cloud server, and stores the plaintext identification vectors on the private cloud server. The private cloud server performs a preliminary filtering of the document set, which reduces the number of document vectors whose relevance score with the retrieval vector must be computed and thus reduces the retrieval overhead. However, because keywords are distributed randomly in the keyword dictionary of that method, the private cloud server filters poorly, so relevance scores between a large number of document vectors and the retrieval vector still have to be computed in the public cloud. Improving the filtering effect of the private cloud server therefore plays an important role in improving the efficiency of privacy-preserving multi-keyword ciphertext retrieval in a hybrid cloud.
Summary of the invention
Object of the invention: in view of the problems in the prior art, the present invention provides a privacy-preserving multi-keyword Top-k ciphertext retrieval method and system for hybrid clouds, which can effectively protect the privacy of user data, improve the efficiency of multi-keyword ciphertext retrieval, and realize accurate Top-k retrieval.
Technical solution: the privacy-preserving multi-keyword Top-k ciphertext retrieval method for hybrid clouds according to the present invention comprises:
(1) The data provider extracts a keyword set from the provided document set and generates a keyword dictionary sequence by clustering and blocking; according to the keyword dictionary sequence, it generates a corresponding plaintext document vector for each document in the document set, and partitions the plaintext document vector according to the blocks of the keyword dictionary sequence to form a document filter vector; it then encrypts the plaintext document vectors to form encrypted document vectors and encrypts each document in the document set to form an encrypted document set; finally, it transmits the document filter vectors to the private cloud server and transmits the encrypted document vectors and the encrypted document set to the public cloud server;
(2) The data retrieval end generates a retrieval vector from the multiple keywords provided by the user, normalizes it, and uses a secure algorithm to generate a retrieval trapdoor, which is transmitted to the public cloud server together with the number k of documents the user wants to retrieve; it also generates a retrieval filter vector from the keywords provided by the user according to the blocks of the keyword dictionary sequence, and transmits it to the private cloud server;
(3) The private cloud server performs a bitwise AND between the received retrieval filter vector and the document filter vector of each document; if the bits of the resulting vector are not all 0, the corresponding document ID is added to the candidate document set; the candidate document set is then transmitted to the public cloud server;
(4) According to the received candidate document set, retrieval trapdoor and number k of documents to retrieve, the public cloud server computes the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, selects from the candidate document set, according to the secure inner products, the k ciphertext documents most relevant to the keywords provided by the user, and returns these k ciphertext documents to the data retrieval end;
(5) The data retrieval end decrypts the k received ciphertext documents to obtain the k most relevant plaintext documents.
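The ranking in step (4) can be sketched as follows. This is a minimal illustration, not the patented implementation: the vectors are plaintext integers standing in for the encrypted document vectors and the retrieval trapdoor, and the names `public_cloud_top_k`, `doc_vectors` and the sample data are hypothetical.

```python
import heapq

def inner_product(u, v):
    """Plain inner product; stands in for the secure inner product of step (4)."""
    return sum(a * b for a, b in zip(u, v))

def public_cloud_top_k(candidate_ids, doc_vectors, query_vector, k):
    """Score only the candidate documents and return the IDs of the k best."""
    scored = ((inner_product(doc_vectors[d], query_vector), d) for d in candidate_ids)
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]

# Hypothetical data: d3 was already filtered out by the private cloud server.
doc_vectors = {
    "d1": [3, 0, 1, 0],
    "d2": [1, 1, 0, 0],
    "d3": [0, 0, 0, 5],
    "d4": [2, 2, 2, 0],
}
query_vector = [1, 1, 0, 0]
result = public_cloud_top_k(["d1", "d2", "d4"], doc_vectors, query_vector, k=2)
print(result)
```

Only the candidates are scored, so the cost of this step grows with the size of the candidate document set rather than with the size of the whole document set.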
Further, step (1) specifically comprises:
(1-1) The data provider extracts keywords from the provided document set DS to obtain the keyword set {w_1, w_2, …, w_n};
(1-2) The keywords in the keyword set are clustered according to their correlation to obtain several sub-clusters {c_1, c_2, …, c_t};
(1-3) Each sub-cluster is taken as one block, giving t blocks b_1, b_2, …, b_t, and the keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …} is generated from the blocks, where w(b_j, x) denotes the x-th keyword in block b_j and the keywords within a block are unordered; block b_j = {w(b_j, x) | 0 < x ≤ |b_j|};
(1-4) Using the TF-IDF algorithm and the vector space model, a corresponding plaintext document vector V_i is generated for each document D_i in the document set DS = {D_1, D_2, …, D_m} according to the positions of the keywords in the keyword dictionary sequence, and is normalized; V_i has dimension n, and the value at each position is the term frequency (TF) of the corresponding keyword in document D_i;
(1-5) According to the blocks of the keyword dictionary sequence, the plaintext document vector V_i is divided into t blocks whose boundaries coincide with the block boundaries of the keyword dictionary sequence, giving the document filter vector DF_i = {b_1, b_2, …, b_t} of each document D_i; if the values of V_i at the positions of all keywords of block b_j are all 0, the value of block b_j is 0, otherwise the value of block b_j is 1; DF_i is a t-dimensional vector whose entries are 0 or 1;
(1-6) An encryption key SK(S, M_1, M_2, k_f) is generated, where S is a random vector whose entries are 0 or 1, M_1 and M_2 are two n × n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
(1-7) Each plaintext document vector V_i is encrypted with the generated encryption key using the secure kNN technique to obtain the corresponding encrypted document vector, where V_i′ and V_i″ are obtained by splitting V_i according to the random vector S: when the j-th element S[j] = 0, V_i′[j] + V_i″[j] = V_i[j], and when S[j] = 1, V_i′[j] = V_i″[j] = V_i[j];
(1-8) Each document in the document set DS is encrypted with a symmetric encryption algorithm to obtain the encrypted document set ES = {e_1, e_2, …, e_m};
(1-9) The document filter vectors are transmitted to the private cloud server for storage, and the encrypted document vectors and the encrypted document set are transmitted to the public cloud server for storage.
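Steps (1-3) and (1-5) can be illustrated with a short sketch. It assumes the clusters are already available (the clustering algorithm itself is not specified here), and the function names, keywords and weights are hypothetical.

```python
def build_dictionary(clusters):
    """Flatten keyword clusters into a dictionary sequence plus block boundaries."""
    sequence, blocks = [], []
    for cluster in clusters:
        start = len(sequence)
        sequence.extend(cluster)
        blocks.append((start, len(sequence)))  # half-open range [start, end)
    return sequence, blocks

def document_filter_vector(doc_vector, blocks):
    """One bit per block: 1 if any keyword position in the block is non-zero."""
    return [1 if any(doc_vector[start:end]) else 0 for start, end in blocks]

# Hypothetical clusters of related keywords, one cluster per block.
clusters = [["heart", "cardiac"], ["bank", "loan", "credit"], ["cloud"]]
W, blocks = build_dictionary(clusters)

# Hypothetical plaintext document vector aligned with W (n = 6).
V = [0.5, 0.0, 0.0, 0.0, 0.0, 0.5]
DF = document_filter_vector(V, blocks)
print(W, blocks, DF)
```

The filter vector has only t entries (here 3) against n entries for the document vector (here 6), which is what keeps the private cloud server's work small.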
Further, step (2) specifically comprises:
(2-1) From the multiple keywords {w_1, w_2, …, w_x} provided by the user, the data retrieval end generates a retrieval vector Q using the TF-IDF algorithm and the vector space model, and normalizes it; the j-th element Q[j] of Q is the inverse document frequency (IDF) of the j-th keyword w_j in the document set DS provided by the data provider;
(2-2) Based on the retrieval vector Q, a retrieval trapdoor is generated using the secure kNN algorithm, where Q′ and Q″ are obtained by splitting Q: when the j-th element S[j] of the random vector in the encryption key generated by the data provider is 0, Q′[j] = Q″[j] = Q[j], and when S[j] = 1, Q′[j] + Q″[j] = Q[j];
(2-3) A retrieval filter vector QF = {b_1, b_2, …, b_t} is generated according to the blocks of the keyword dictionary sequence; QF is a t-dimensional vector whose entries are 0 or 1; if the values in the retrieval vector Q at the positions of all keywords of block b_j are all 0, then QF[j] = 0, otherwise QF[j] = 1;
(2-4) The retrieval trapdoor and the number k of documents the user wants to retrieve are transmitted to the public cloud server, and the retrieval filter vector is transmitted to the private cloud server.
Further, step (3) specifically comprises:
(3-1) The private cloud server performs a bitwise AND between the received retrieval filter vector QF and the document filter vector DF_i of each document; if the bits of the vector obtained from QF & DF_i are not all 0, the document ID d_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
(3-2) The candidate document set CDS is sent to the public cloud server.
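The filtering of step (3-1) amounts to a bitwise AND followed by a non-zero test, as in the sketch below. The function name and the sample filter vectors are hypothetical.

```python
def candidate_documents(QF, document_filters):
    """Return IDs of documents whose filter vector shares a set bit with QF."""
    candidates = []
    for doc_id, DF in document_filters.items():
        # QF & DF_i, position by position; keep the document if any bit survives.
        if any(q & d for q, d in zip(QF, DF)):
            candidates.append(doc_id)
    return candidates

# Hypothetical plaintext filter vectors stored on the private cloud server (t = 3).
document_filters = {
    "d1": [1, 0, 1],
    "d2": [0, 1, 0],
    "d3": [0, 0, 1],
}
QF = [0, 0, 1]
CDS = candidate_documents(QF, document_filters)
print(CDS)
```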
The privacy-preserving multi-keyword Top-k ciphertext retrieval system for hybrid clouds according to the present invention comprises a data provider, a data retrieval end, a private cloud server and a public cloud server, wherein:
The data provider is configured to extract a keyword set from the provided document set and generate a keyword dictionary sequence by clustering and blocking; to generate, according to the keyword dictionary sequence, a corresponding plaintext document vector for each document in the document set, and to partition the plaintext document vector according to the blocks of the keyword dictionary sequence to form a document filter vector; to encrypt the plaintext document vectors to form encrypted document vectors and encrypt each document in the document set to form an encrypted document set; and to transmit the document filter vectors to the private cloud server and the encrypted document vectors and the encrypted document set to the public cloud server;
The data retrieval end is configured to generate a retrieval vector from the multiple keywords provided by the user, normalize it, and use a secure algorithm to generate a retrieval trapdoor, which is transmitted to the public cloud server together with the number k of documents the user wants to retrieve; and to generate a retrieval filter vector from the keywords provided by the user according to the blocks of the keyword dictionary sequence, and transmit it to the private cloud server;
The private cloud server is configured to perform a bitwise AND between the received retrieval filter vector and the document filter vector of each document; if the bits of the resulting vector are not all 0, the corresponding document ID is added to the candidate document set; the candidate document set is then transmitted to the public cloud server;
The public cloud server is configured to compute, according to the received candidate document set, retrieval trapdoor and number k of documents to retrieve, the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, to select from the candidate document set, according to the secure inner products, the k ciphertext documents most relevant to the keywords provided by the user, and to return these k ciphertext documents to the data retrieval end;
The data retrieval end is further configured to decrypt the k received ciphertext documents to obtain the k most relevant plaintext documents.
Further, the data provider specifically comprises:
A keyword extraction module, configured to extract keywords from the provided document set DS to obtain the keyword set {w_1, w_2, …, w_n};
A clustering module, configured to cluster the keywords in the keyword set according to their correlation to obtain several sub-clusters {c_1, c_2, …, c_t};
A keyword dictionary generation module, configured to take each sub-cluster as one block, giving t blocks b_1, b_2, …, b_t, and to generate from the blocks the keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …}, where w(b_j, x) denotes the x-th keyword in block b_j and the keywords within a block are unordered; block b_j = {w(b_j, x) | 0 < x ≤ |b_j|};
A plaintext document vector generation module, configured to use the TF-IDF algorithm and the vector space model to generate, according to the positions of the keywords in the keyword dictionary sequence, a corresponding plaintext document vector V_i for each document D_i in the document set DS = {D_1, D_2, …, D_m}, and to normalize it; V_i has dimension n, and the value at each position is the term frequency (TF) of the corresponding keyword in document D_i;
A document filter vector generation module, configured to divide the plaintext document vector V_i into t blocks according to the blocks of the keyword dictionary sequence, with block boundaries that coincide with those of the keyword dictionary sequence, giving the document filter vector DF_i = {b_1, b_2, …, b_t} of each document D_i; if the values of V_i at the positions of all keywords of block b_j are all 0, the value of block b_j is 0, otherwise the value of block b_j is 1; DF_i is a t-dimensional vector whose entries are 0 or 1;
A key generation module, configured to generate the encryption key SK(S, M_1, M_2, k_f), where S is a random vector whose entries are 0 or 1, M_1 and M_2 are two n × n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
A document vector encryption module, configured to encrypt each plaintext document vector V_i with the generated encryption key using the secure kNN technique to obtain the corresponding encrypted document vector, where V_i′ and V_i″ are obtained by splitting V_i according to the random vector S: when the j-th element S[j] = 0, V_i′[j] + V_i″[j] = V_i[j], and when S[j] = 1, V_i′[j] = V_i″[j] = V_i[j];
A document encryption module, configured to encrypt each document in the document set DS with a symmetric encryption algorithm to obtain the encrypted document set ES = {e_1, e_2, …, e_m};
A transmission module, configured to transmit the document filter vectors to the private cloud server for storage and to transmit the encrypted document vectors and the encrypted document set to the public cloud server for storage.
Further, the data retrieval end specifically comprises:
A retrieval vector generation module, configured to generate a retrieval vector Q from the multiple keywords {w_1, w_2, …, w_x} provided by the user using the TF-IDF algorithm and the vector space model, and to normalize it; the j-th element Q[j] of Q is the inverse document frequency (IDF) of the j-th keyword w_j in the document set DS provided by the data provider;
A retrieval trapdoor generation module, configured to generate a retrieval trapdoor from the retrieval vector Q using the secure kNN algorithm, where Q′ and Q″ are obtained by splitting Q: when the j-th element S[j] of the random vector in the encryption key generated by the data provider is 0, Q′[j] = Q″[j] = Q[j], and when S[j] = 1, Q′[j] + Q″[j] = Q[j];
A retrieval filter vector generation module, configured to generate a retrieval filter vector QF = {b_1, b_2, …, b_t} according to the blocks of the keyword dictionary sequence; QF is a t-dimensional vector whose entries are 0 or 1; if the values in the retrieval vector Q at the positions of all keywords of block b_j are all 0, then QF[j] = 0, otherwise QF[j] = 1;
A transmission module, configured to transmit the retrieval trapdoor and the number k of documents the user wants to retrieve to the public cloud server, and to transmit the retrieval filter vector to the private cloud server.
Further, the private cloud server specifically comprises:
An AND operation module, configured to perform a bitwise AND between the received retrieval filter vector QF and the document filter vector DF_i of each document; if the bits of the vector obtained from QF & DF_i are not all 0, the document ID d_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
A transmission module, configured to send the candidate document set CDS to the public cloud server.
Beneficial effects: compared with the prior art, the notable advantages of the present invention are:
1. High security
The present invention performs ciphertext retrieval by multiple keywords in an untrusted public cloud environment. Secure inner products are computed with the secure kNN technique, under which the inner product of two encrypted vectors equals the inner product of the two corresponding plaintext vectors. In the public cloud environment there is no need to decrypt the retrieval trapdoor or the encrypted document vectors, let alone the encrypted documents. On the public cloud side the whole process operates on ciphertext and finally yields the Top-k results. The secure kNN technique therefore makes it possible to compute Top-k retrieval results by multiple keywords while protecting the data privacy of the data owner. The secure kNN technique has been widely used in the field of multi-keyword ciphertext retrieval.
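The inner-product property can be worked through as follows. The expressions for the encrypted document vector and the trapdoor are not reproduced in this text, so the derivation assumes the form commonly used with the secure kNN technique, in which the split document vectors are multiplied by the transposes of M_1 and M_2 and the split retrieval vectors by their inverses; that form is an assumption, not a quotation from the claims.

```latex
\begin{aligned}
\tilde{V}_i &= \{ M_1^{\mathsf T} V_i',\; M_2^{\mathsf T} V_i'' \}, \qquad
\tilde{Q} = \{ M_1^{-1} Q',\; M_2^{-1} Q'' \} \quad \text{(assumed form)} \\
\tilde{V}_i \cdot \tilde{Q}
  &= (M_1^{\mathsf T} V_i')^{\mathsf T} (M_1^{-1} Q')
   + (M_2^{\mathsf T} V_i'')^{\mathsf T} (M_2^{-1} Q'') \\
  &= (V_i')^{\mathsf T} M_1 M_1^{-1} Q' + (V_i'')^{\mathsf T} M_2 M_2^{-1} Q'' \\
  &= V_i' \cdot Q' + V_i'' \cdot Q'' \\
  &= \sum_{j:\, S[j]=0} \bigl( V_i'[j] + V_i''[j] \bigr)\, Q[j]
   + \sum_{j:\, S[j]=1} V_i[j]\, \bigl( Q'[j] + Q''[j] \bigr) \\
  &= V_i \cdot Q
\end{aligned}
```

The last two lines use the splitting rules for S: where S[j] = 0 the document entry is split and the query entry is copied, and where S[j] = 1 the roles are reversed, so every position contributes V_i[j] Q[j].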
2. High accuracy
In the privacy-preserving multi-keyword Top-k ciphertext retrieval method for hybrid clouds provided by the invention, retrieval with the keywords of interest provided by the data retrieval end proceeds in two steps: the private cloud server first generates the candidate document set CDS, and the public cloud server then searches the candidate document set for the Top-k results most relevant to the keywords of interest. When the private cloud generates the candidate document set, any document D_i in the whole document set DS that contains one or more of the keywords of interest provided by the data user is added to the candidate document set, so no document that qualifies for the Top-k can be missing from the candidate document set. When the public cloud server receives the candidate document set from the private cloud server, it obtains the Top-k results strictly from the inner products between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set. The cooperation of the private cloud server and the public cloud server can therefore rank the retrieval results accurately and return the Top-k documents to the data retrieval end as the retrieval result.
3. High retrieval efficiency
The privacy-preserving multi-keyword Top-k ciphertext retrieval method for hybrid clouds provided by the invention addresses the low efficiency of current searchable encryption methods built on techniques such as secure kNN computation, TF-IDF and the vector space model. By adding a trusted private cloud server, it provides a way to generate document filter vectors by partitioning document vectors into blocks, and uploads the document filter vectors to the private cloud server. Because the dimension of the document filter vectors is small, the private cloud server can obtain the candidate document set from the retrieval filter vector provided by the data user with little computation, quickly filtering out a large number of irrelevant documents (the filtered-out documents cannot be in the final Top-k results). The candidate document set is much smaller than the original document set, so the public cloud server only needs to compute inner products between a small number of encrypted vectors, which greatly reduces its computational overhead. In addition, considering that the keywords of interest entered by a user are often related, and in order to improve the filtering effect of the private cloud server and further compress the candidate document set, the keywords in the keyword dictionary sequence are not placed at random; they are clustered according to keyword correlation into multiple sub-clusters, and the keywords of each sub-cluster are located in the same block of the keyword dictionary sequence. The document filter vectors and retrieval filter vectors generated in this way are better at compressing the candidate document set. With fewer documents in the candidate document set, the public cloud server needs to compute fewer inner products between encrypted vectors, so the computational overhead naturally falls and the efficiency of multi-keyword ciphertext retrieval improves.
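A toy comparison illustrates why block placement matters. The four documents, the query and both block layouts below are hypothetical and chosen only to show the mechanism; they are not measurements from the invention.

```python
def filter_vector(present, blocks):
    """One bit per block: 1 if the block contains any keyword in `present`."""
    return [1 if any(w in present for w in block) else 0 for block in blocks]

def count_candidates(query, docs, blocks):
    """Number of documents whose filter vector shares a set bit with the query's."""
    QF = filter_vector(query, blocks)
    return sum(
        1
        for words in docs.values()
        if any(q & d for q, d in zip(QF, filter_vector(words, blocks)))
    )

# Hypothetical documents, each reduced to the set of keywords it contains.
docs = {
    "d1": {"heart", "cardiac"},
    "d2": {"bank", "loan"},
    "d3": {"heart"},
    "d4": {"loan"},
}
query = {"heart", "cardiac"}  # related keywords, as a user would typically enter

clustered = [["heart", "cardiac"], ["bank", "loan"]]   # related keywords share a block
scattered = [["heart", "bank"], ["cardiac", "loan"]]   # same keywords, mixed blocks

n_clustered = count_candidates(query, docs, clustered)
n_scattered = count_candidates(query, docs, scattered)
print(n_clustered, n_scattered)
```

With the clustered layout the query sets a single filter bit and only the two medical documents survive; with the scattered layout the same query sets both bits and every document becomes a candidate.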
Brief description of the drawings
Fig. 1 is an architecture diagram of the privacy-preserving multi-keyword Top-k ciphertext retrieval method for hybrid clouds provided by the invention;
Fig. 2 is a flow diagram of the privacy-preserving multi-keyword Top-k ciphertext retrieval method for hybrid clouds provided by the invention;
Fig. 3 is a schematic diagram of constructing the keyword dictionary sequence by keyword clustering; in the figure the keyword set is clustered into 10 small clusters, the corresponding keyword dictionary sequence has 10 blocks, and each block contains a variable number of keywords equal to the number of keywords in the corresponding small cluster;
Fig. 4 is a schematic diagram of a document vector and its document filter vector, and a retrieval vector and its retrieval filter vector, before normalization;
Fig. 5 is a schematic diagram of the retrieval flow, in which the private cloud server first obtains the candidate document set by a bitwise AND between the document filter vectors and the retrieval filter vector and then sends the candidate document set to the public cloud server; within the candidate document set, the public cloud server obtains the Top-k documents by computing the relevance scores between the document vectors and the retrieval vector. To keep the figure simple, the document vectors and the retrieval vector are shown without normalization or encryption.
Detailed description
Embodiment 1
This embodiment provides a privacy-preserving multi-keyword Top-k ciphertext retrieval method for hybrid clouds which, as shown in Fig. 1 and Fig. 2, comprises the following steps:
(1) The data provider extracts a keyword set from the provided document set and generates a keyword dictionary sequence by clustering and blocking; according to the keyword dictionary sequence, it generates a corresponding plaintext document vector for each document in the document set, and partitions the plaintext document vector according to the blocks of the keyword dictionary sequence to form a document filter vector; it then encrypts the plaintext document vectors to form encrypted document vectors and encrypts each document in the document set to form an encrypted document set; finally, it transmits the document filter vectors to the private cloud server and transmits the encrypted document vectors and the encrypted document set to the public cloud server.
This step specifically comprises:
(1-1) data provide end and extract keyword from the document sets DS of offer, obtain keyword set { w1,w2,…, wn};
Keyword root in keyword set is carried out the operation that clusters by (1-2) according to correlativity, obtains several cluster Cluster { c1,c2,…,ct};
(1-3) using each submanifold as a piecemeal, to obtain t piecemeal, respectively b1,b2,…,bt, further according to point Block generates keyword dictionary sequence W={ w (b1,1),w(b1,2),…,w(b2,1),w(b2,2),…,w(bt,1),w(bt, 2) ... }, wherein w (bj, x) and it indicates to belong to piecemeal bjIn x-th of keyword, each keyword in the block is unordered;Piecemeal bj= {w(bj,x)|0<x≤|bj|};The characteristic because keyword clusters, the strong keyword of correlation is gathered in keyword dictionary sequence In same piece.Such as in Fig. 3, keyword set clusters into altogether 10 small natural cluster, and the keyword quantity in each cluster is not Fixed, then keyword dictionary includes 10 keyword blocks, the keyword quantity for including in each keyword block and includes in corresponding cluster Keyword quantity is identical, then according to these blocks, generates keyword dictionary sequence;
(1-4) Using the TF-IDF algorithm and the vector space model, a corresponding plaintext document vector V_i is generated for each document D_i in the document set DS = {D_1, D_2, …, D_m} according to the positions of the keywords in the keyword dictionary sequence, and is then normalized. V_i has dimension n, and each element is the term frequency (TF) value in document D_i of the keyword at that position;
(1-5) Following the block partition of the keyword dictionary sequence, the plaintext document vector V_i is divided into t blocks whose boundaries coincide with the block boundaries of the keyword dictionary sequence, giving the document filter vector DF_i = {b_1, b_2, …, b_t} of each document D_i. If the values of V_i at the positions of all keywords in block b_j are 0, the value of b_j is 0; otherwise the value of b_j is 1. DF_i is thus a t-dimensional vector whose elements are each 0 or 1. For example, Fig. 4 gives a concrete document vector and its corresponding document filter vector (to keep the figure simple, the document vector is not normalized): for document D_i with document vector V_i, the document filter vector DF_i is formed according to the block boundaries in the keyword dictionary sequence, as shown in Fig. 4;
(1-6) An encryption key SK(S, M_1, M_2, k_f) is generated, where S is a random n-dimensional column vector whose elements are each 0 or 1, M_1 and M_2 are two n × n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key. SK is made available only to the DO and the DU and is kept secret from the CS.
(1-7) Each plaintext document vector V_i is encrypted with the generated encryption key using the secure kNN technique, obtaining the corresponding encrypted document vector. In this process V_i is split into V_i' and V_i'' under the control of the random vector S: when the j-th element S[j] = 0, V_i'[j] + V_i''[j] = V_i[j]; when S[j] = 1, V_i'[j] = V_i''[j] = V_i[j];
(1-8) Each document in the document set DS is encrypted with a symmetric encryption algorithm, giving the encrypted document set ES = {e_1, e_2, …, e_m};
(1-9) The document filter vectors are transmitted to the private cloud server for storage, and the encrypted document vectors and the encrypted document set are transmitted to the public cloud server for storage.
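Steps (1-3) to (1-5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the clusters are supplied as ready-made lists (the clustering algorithm itself is not shown), TF is taken as a raw occurrence count, the normalization is assumed to be Euclidean (L2), and the names `build_dictionary`, `document_vector` and `filter_vector` are hypothetical.

```python
import math
from collections import Counter


def build_dictionary(clusters):
    """Concatenate keyword clusters into the dictionary sequence W.

    Returns W and one (start, end) index range per block, end exclusive.
    """
    dictionary, boundaries = [], []
    for cluster in clusters:
        start = len(dictionary)
        dictionary.extend(cluster)
        boundaries.append((start, len(dictionary)))
    return dictionary, boundaries


def document_vector(tokens, dictionary):
    """TF vector over W, scaled to unit Euclidean length (assumed normalization)."""
    counts = Counter(tokens)
    vec = [float(counts[w]) for w in dictionary]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def filter_vector(vec, boundaries):
    """One bit per block: 1 if any position inside the block is non-zero."""
    return [int(any(vec[i] != 0 for i in range(s, e))) for s, e in boundaries]


# Hypothetical clusters and document, for illustration only.
clusters = [["cloud", "server"], ["cipher", "key", "trapdoor"], ["doctor"]]
W, bounds = build_dictionary(clusters)
V = document_vector("cloud key key server cloud".split(), W)
DF = filter_vector(V, bounds)
```

Because the block boundaries are recorded while W is built, the same `bounds` list can be reused when a retrieval filter vector is derived from a retrieval vector.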
(2) The data retrieval end generates a retrieval vector from the multiple keywords provided by the user, normalizes it, and uses a secure algorithm to generate a retrieval trapdoor, which it transmits to the public cloud server together with the number k of documents the user wants to retrieve. Following the block partition of the keywords in the keyword dictionary sequence, it also generates a retrieval filter vector from the keywords provided by the user and transmits it to the private cloud server.
This step specifically includes:
(2-1) From the multiple keywords {w_1, w_2, …, w_x} provided by the user, the data retrieval end generates the retrieval vector Q using the TF-IDF algorithm and the vector space model, and normalizes it. The j-th element Q[j] of Q is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
(2-2) Based on the retrieval vector Q, the retrieval trapdoor is generated using the secure kNN algorithm. In this process Q is split into Q' and Q'' under the control of the random vector S in the encryption key generated by the data providing end: when the j-th element S[j] = 0, Q'[j] = Q''[j] = Q[j]; when S[j] = 1, Q'[j] + Q''[j] = Q[j];
(2-3) Following the block partition of the keywords in the keyword dictionary sequence, the retrieval filter vector QF = {b_1, b_2, …, b_t} is generated. QF is a t-dimensional vector whose elements are each 0 or 1: if the values of Q at the positions of all keywords in block b_j are 0, then QF[j] = 0; otherwise QF[j] = 1;
(2-4) The retrieval trapdoor and the number k of documents the user wants to retrieve are transmitted to the public cloud server, and the retrieval filter vector is transmitted to the private cloud server.
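Steps (2-1) and (2-3) can be sketched in Python as follows. The sketch rests on several assumptions that the text does not fix: IDF is computed as log(m / df), positions of keywords that the user did not query are set to 0, the normalization is Euclidean (L2), and each document is represented simply as a set of keywords. The function names and sample data are hypothetical.

```python
import math


def idf(keyword, documents):
    """Inverse document frequency log(m / df); 0.0 if no document has the keyword."""
    df = sum(1 for doc in documents if keyword in doc)
    return math.log(len(documents) / df) if df else 0.0


def query_vector(query_keywords, dictionary, documents):
    """IDF value at the position of each queried keyword, 0 elsewhere, L2-normalized."""
    wanted = set(query_keywords)
    vec = [idf(w, documents) if w in wanted else 0.0 for w in dictionary]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def query_filter(vec, boundaries):
    """One bit per block: 1 if the retrieval vector is non-zero inside the block."""
    return [int(any(vec[i] != 0 for i in range(s, e))) for s, e in boundaries]


# Hypothetical dictionary sequence, block boundaries and document set.
W = ["cloud", "server", "cipher", "key", "trapdoor", "doctor"]
bounds = [(0, 2), (2, 5), (5, 6)]
docs = [{"cloud", "server"}, {"cloud", "key"}, {"doctor"}, {"key", "cipher"}]

Q = query_vector(["key", "doctor"], W, docs)
QF = query_filter(Q, bounds)
```

Note that with this IDF variant a keyword occurring in every document has IDF 0 and therefore does not set its block bit; a different IDF smoothing would change that behaviour.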
(3) The private cloud server performs an AND operation between the received retrieval filter vector and the document filter vector of each document; if the resulting vector is not all zeros, the corresponding document ID is added to the candidate document set. The candidate document set is then transmitted to the public cloud server.
This step specifically includes:
(3-1) The private cloud server performs an AND operation between the received retrieval filter vector QF and the document filter vector DF_i of each document; if the vector obtained from QF & DF_i is not all zeros, the document ID Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
(3-2) The candidate document set CDS is sent to the public cloud server. Fig. 5 gives a concrete query example: the private cloud server ANDs the retrieval filter vector with the document filter vectors, finds the document IDs whose filter vectors yield a result that is not all zeros, and thereby obtains the candidate document set CDS = {Did_1, Did_5, Did_6}, which it then sends to the public cloud server.
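The filtering of step (3-1) can be sketched in a few lines of Python. The function name `candidate_set` and the filter values below are hypothetical; they are not the values shown in Fig. 5.

```python
def candidate_set(query_filter, doc_filters):
    """Return the IDs of documents whose filter vector shares a set bit with QF."""
    candidates = []
    for doc_id, doc_filter in doc_filters.items():
        # QF & DF_i is "not all zeros" exactly when some position is 1 in both.
        if any(q & d for q, d in zip(query_filter, doc_filter)):
            candidates.append(doc_id)
    return candidates


QF = [0, 1, 1, 0]
doc_filters = {
    "Did1": [1, 1, 0, 0],
    "Did2": [1, 0, 0, 1],
    "Did3": [0, 0, 1, 0],
    "Did4": [0, 0, 0, 0],
}
CDS = candidate_set(QF, doc_filters)
```

Only the 0/1 block bits are inspected here, so the private cloud server never needs the document vectors themselves to build the candidate document set.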
(4) Based on the received candidate document set, retrieval trapdoor and number k of documents to retrieve, the public cloud server computes the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, selects by secure inner product the k ciphertext documents in the candidate document set that are most relevant to the keywords provided by the user, and returns these k ciphertext documents to the data retrieval end.
For example, in Fig. 5 the public cloud server receives the candidate document set CDS = {Did_1, Did_5, Did_6} sent by the private cloud server. The search space for the Top-k documents is then no longer the full set DS = {D_1, D_2, …, D_10} but the candidate documents {D_1, D_5, D_6}; it shrinks from the original 10 documents to 3, so only 3 inner products need to be computed. The server computes, for each document in the candidate document set in turn, the dot product between its encrypted document vector and the encrypted retrieval vector, obtaining 3 relevance scores, then selects the encrypted documents corresponding to the k largest scores and returns them to the data retrieval end.
(5) The data retrieval end decrypts the k received ciphertext documents with the symmetric key, obtaining the k most relevant plaintext documents.
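The ranking performed by the public cloud server in step (4) amounts to scoring only the candidate documents and keeping the k best. A minimal sketch follows. It assumes that each encrypted document vector and the trapdoor are stored as pairs of vectors and that the secure inner product is the sum of the two component dot products, as in common secure kNN schemes; the names and numbers are hypothetical and are not the values of Fig. 5.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def secure_score(enc_doc, trapdoor):
    """Assumed secure inner product: sum of the dot products of the two halves."""
    return dot(enc_doc[0], trapdoor[0]) + dot(enc_doc[1], trapdoor[1])


def top_k(candidates, enc_vectors, trapdoor, k):
    """Score only the candidate documents and return the k highest-scoring IDs."""
    return heapq.nlargest(
        k, candidates, key=lambda doc_id: secure_score(enc_vectors[doc_id], trapdoor)
    )


# Hypothetical encrypted vectors; Did2 is stored but is not a candidate.
enc_vectors = {
    "Did1": ([3, 1], [1, 1]),
    "Did2": ([9, 9], [9, 9]),
    "Did5": ([1, 1], [0, 4]),
    "Did6": ([2, 0], [5, 0]),
}
trapdoor = ([1, 0], [0, 2])
result = top_k(["Did1", "Did5", "Did6"], enc_vectors, trapdoor, 2)
```

Documents outside the candidate document set are never scored, which is where the saving described for Fig. 5 comes from.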
Embodiment 2
This embodiment provides a privacy-preserving multi-keyword Top-k ciphertext retrieval system for a hybrid cloud, comprising a data providing end, a data retrieval end, a private cloud server and a public cloud server, wherein:
The data providing end is configured to extract a keyword set from the provided document set and generate a keyword dictionary sequence by clustering the keywords and partitioning them into blocks; to generate, based on the keyword dictionary sequence, a corresponding plaintext document vector for each document in the document set and, following the block partition of the keyword dictionary sequence, partition each plaintext document vector into blocks to form a document filter vector; to encrypt the plaintext document vectors to form encrypted document vectors and encrypt each document in the document set to form an encrypted document set; and to transmit the document filter vectors to the private cloud server and the encrypted document vectors and the encrypted document set to the public cloud server;
The data retrieval end is configured to generate a retrieval vector from the multiple keywords provided by the user, normalize it, and use a secure algorithm to generate a retrieval trapdoor, which it transmits to the public cloud server together with the number k of documents the user wants to retrieve; and, following the block partition of the keywords in the keyword dictionary sequence, to generate a retrieval filter vector from the keywords provided by the user and transmit it to the private cloud server;
The private cloud server is configured to perform an AND operation between the received retrieval filter vector and the document filter vector of each document, to add the corresponding document ID to the candidate document set if the resulting vector is not all zeros, and to transmit the candidate document set to the public cloud server;
The public cloud server is configured to compute, based on the received candidate document set, retrieval trapdoor and number k of documents to retrieve, the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, to select by secure inner product the k ciphertext documents in the candidate document set that are most relevant to the keywords provided by the user, and to return these k ciphertext documents to the data retrieval end;
The data retrieval end is further configured to decrypt the k received ciphertext documents, obtaining the k most relevant plaintext documents.
Further, the data providing end specifically includes:
A keyword extraction module, configured to extract keywords from the provided document set DS, obtaining the keyword set {w_1, w_2, …, w_n};
A clustering module, configured to cluster the keywords in the keyword set according to their correlation, yielding several clusters {c_1, c_2, …, c_t};
A keyword dictionary generation module, configured to take each cluster as one block, giving t blocks b_1, b_2, …, b_t, and to generate from the blocks the keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …}, where w(b_j,x) denotes the x-th keyword of block b_j and the keywords within a block are unordered; block b_j = {w(b_j,x) | 0 < x ≤ |b_j|};
A plaintext document vector generation module, configured to generate, using the TF-IDF algorithm and the vector space model, a corresponding plaintext document vector V_i for each document D_i in the document set DS = {D_1, D_2, …, D_m} according to the positions of the keywords in the keyword dictionary sequence, and to normalize it; V_i has dimension n, and each element is the term frequency (TF) value in document D_i of the keyword at that position;
A document filter vector generation module, configured to divide the plaintext document vector V_i into t blocks following the block partition of the keyword dictionary sequence, with block boundaries that coincide with those of the keyword dictionary sequence, giving the document filter vector DF_i = {b_1, b_2, …, b_t} of each document D_i; if the values of V_i at the positions of all keywords in block b_j are 0, the value of b_j is 0, otherwise the value of b_j is 1; DF_i is a t-dimensional vector whose elements are each 0 or 1;
A key generation module, configured to generate the encryption key SK(S, M_1, M_2, k_f), where S is a random vector whose elements are each 0 or 1, M_1 and M_2 are two n × n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
A document vector encryption module, configured to encrypt each plaintext document vector V_i with the generated encryption key using the secure kNN technique, obtaining the corresponding encrypted document vector; in this process V_i is split into V_i' and V_i'' under the control of the random vector S: when the j-th element S[j] = 0, V_i'[j] + V_i''[j] = V_i[j]; when S[j] = 1, V_i'[j] = V_i''[j] = V_i[j];
A document encryption module, configured to encrypt each document in the document set DS with a symmetric encryption algorithm, obtaining the encrypted document set ES = {e_1, e_2, …, e_m};
A transmission module, configured to transmit the document filter vectors to the private cloud server for storage, and to transmit the encrypted document vectors and the encrypted document set to the public cloud server for storage.
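The key generation and document vector encryption modules can be illustrated with a small end-to-end Python sketch. The exact formula of the encrypted document vector is not reproduced in the text above, so the sketch assumes the form commonly used in secure kNN schemes: the document is stored as (M_1^T V_i', M_2^T V_i'') and the trapdoor as (M_1^{-1} Q', M_2^{-1} Q''). Exact `Fraction` arithmetic and small integer matrices are used only so that the result can be checked exactly with the standard library; the document encryption key k_f is omitted, and all function names are hypothetical.

```python
import random
from fractions import Fraction


def inverse(matrix):
    """Exact Gauss-Jordan inverse; raises ValueError for a singular matrix."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    rows = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
            for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[n:] for row in rows]


def random_invertible(n, rng):
    """Draw small integer matrices until one is invertible; return (M, M^-1)."""
    while True:
        m = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        try:
            return m, inverse(m)
        except ValueError:
            pass


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def split(vec, s, split_bit, rng):
    """Positions with S[j] == split_bit get two random shares summing to vec[j];
    every other position is copied unchanged into both shares."""
    first, second = [], []
    for x, bit in zip(vec, s):
        if bit == split_bit:
            share = Fraction(rng.randint(-50, 50), 7)
            first.append(share)
            second.append(x - share)
        else:
            first.append(x)
            second.append(x)
    return first, second


def keygen(n, rng):
    """Return (S, (M1, M1^-1), (M2, M2^-1)); k_f is left out of this sketch."""
    s = [rng.randint(0, 1) for _ in range(n)]
    return s, random_invertible(n, rng), random_invertible(n, rng)


def encrypt_document(v, key, rng):
    s, (m1, _), (m2, _) = key
    v1, v2 = split(v, s, 0, rng)  # documents are split where S[j] = 0
    return mat_vec(transpose(m1), v1), mat_vec(transpose(m2), v2)


def make_trapdoor(q, key, rng):
    s, (_, m1_inv), (_, m2_inv) = key
    q1, q2 = split(q, s, 1, rng)  # queries are split where S[j] = 1
    return mat_vec(m1_inv, q1), mat_vec(m2_inv, q2)


def secure_inner_product(enc_doc, trapdoor):
    return (sum(a * b for a, b in zip(enc_doc[0], trapdoor[0]))
            + sum(a * b for a, b in zip(enc_doc[1], trapdoor[1])))


rng = random.Random(2018)
key = keygen(4, rng)
V = [Fraction(1, 2), Fraction(0), Fraction(1, 3), Fraction(2)]
Q = [Fraction(1), Fraction(3, 4), Fraction(0), Fraction(1, 5)]
plain = sum(a * b for a, b in zip(V, Q))
secure = secure_inner_product(encrypt_document(V, key, rng), make_trapdoor(Q, key, rng))
```

Under these assumptions `secure` equals `plain`, so a server holding only the encrypted vectors can rank documents by relevance without seeing V or Q. A practical implementation would more likely use floating-point matrices from a numerical library and compare scores with a tolerance.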
Further, the data retrieval end specifically includes:
A retrieval vector generation module, configured to generate the retrieval vector Q from the multiple keywords {w_1, w_2, …, w_x} provided by the user using the TF-IDF algorithm and the vector space model, and to normalize it; the j-th element Q[j] of Q is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
A retrieval trapdoor generation module, configured to generate the retrieval trapdoor from the retrieval vector Q using the secure kNN algorithm; in this process Q is split into Q' and Q'' under the control of the random vector S in the encryption key generated by the data providing end: when the j-th element S[j] = 0, Q'[j] = Q''[j] = Q[j]; when S[j] = 1, Q'[j] + Q''[j] = Q[j];
A retrieval filter vector generation module, configured to generate the retrieval filter vector QF = {b_1, b_2, …, b_t} following the block partition of the keywords in the keyword dictionary sequence; QF is a t-dimensional vector whose elements are each 0 or 1: if the values of Q at the positions of all keywords in block b_j are 0, then QF[j] = 0, otherwise QF[j] = 1;
A transmission module, configured to transmit the retrieval trapdoor and the number k of documents the user wants to retrieve to the public cloud server, and to transmit the retrieval filter vector to the private cloud server.
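Why the two complementary splitting rules preserve relevance scores can be shown with a short derivation. It assumes, as is common in secure kNN schemes but not legible in the text above, that the encrypted document vector is (M_1^T V_i', M_2^T V_i'') and the retrieval trapdoor is (M_1^{-1} Q', M_2^{-1} Q''); the symbols \tilde{V}_i and \tilde{Q} for the encrypted vector and the trapdoor are introduced here for illustration only.

```latex
\begin{aligned}
\tilde{V}_i \cdot \tilde{Q}
  &= \left(M_1^{T} V_i'\right)^{T}\left(M_1^{-1} Q'\right)
   + \left(M_2^{T} V_i''\right)^{T}\left(M_2^{-1} Q''\right) \\
  &= V_i'^{\,T} Q' + V_i''^{\,T} Q'' \\
  &= \sum_{j:\,S[j]=0} \left(V_i'[j] + V_i''[j]\right) Q[j]
   + \sum_{j:\,S[j]=1} V_i[j]\left(Q'[j] + Q''[j]\right) \\
  &= \sum_{j=1}^{n} V_i[j]\, Q[j] \;=\; V_i \cdot Q
\end{aligned}
```

At positions with S[j] = 0 the query value is copied into both halves and the document shares sum to V_i[j]; at positions with S[j] = 1 the roles are reversed. Either way the term contributes V_i[j] Q[j], so the secure inner product equals the plaintext relevance score.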
Further, the private cloud server specifically includes:
An AND operation module, configured to perform an AND operation between the received retrieval filter vector QF and the document filter vector DF_i of each document; if the vector obtained from QF & DF_i is not all zeros, the document ID Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
A transmission module, configured to send the candidate document set CDS to the public cloud server.
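One possible storage choice for the AND operation module, offered here as an assumption rather than something the text prescribes, is to pack each t-bit filter vector into a single integer so that QF & DF_i becomes one machine-level AND. The names `pack` and `candidates_from_masks` are hypothetical.

```python
def pack(bits):
    """Pack a 0/1 list into an int; element j of the list becomes bit j."""
    mask = 0
    for j, bit in enumerate(bits):
        if bit:
            mask |= 1 << j
    return mask


def candidates_from_masks(qf_mask, df_masks):
    """A document is a candidate when its mask shares at least one bit with QF."""
    return [doc_id for doc_id, df_mask in df_masks.items() if qf_mask & df_mask]


# Hypothetical filter vectors, packed once when they are received.
df_masks = {"Did1": pack([1, 1, 0]), "Did2": pack([1, 0, 0]), "Did3": pack([0, 0, 1])}
CDS = candidates_from_masks(pack([0, 1, 1]), df_masks)
```

The result is the same as an element-wise AND followed by an all-zeros test; only the representation differs.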
This system corresponds one-to-one with the method of Embodiment 1; the remaining details are not repeated here and can be found in Embodiment 1.
The above discloses only preferred embodiments of the present invention and does not limit the scope of its claims; equivalent changes made in accordance with the claims of the present invention therefore remain within the scope of the present invention.

Claims (8)

1. A privacy-preserving multi-keyword Top-k ciphertext retrieval method for a hybrid cloud, characterized in that the method comprises:
(1) the data providing end extracts a keyword set from the provided document set and generates a keyword dictionary sequence by clustering the keywords and partitioning them into blocks; based on the keyword dictionary sequence, it generates a corresponding plaintext document vector for each document in the document set and, following the block partition of the keyword dictionary sequence, partitions each plaintext document vector into blocks to form a document filter vector; it then encrypts the plaintext document vectors to form encrypted document vectors and encrypts each document in the document set to form an encrypted document set; finally, it transmits the document filter vectors to the private cloud server, and transmits the encrypted document vectors and the encrypted document set to the public cloud server;
(2) the data retrieval end generates a retrieval vector from the multiple keywords provided by the user, normalizes it, and uses a secure algorithm to generate a retrieval trapdoor, which it transmits to the public cloud server together with the number k of documents the user wants to retrieve; following the block partition of the keywords in the keyword dictionary sequence, it also generates a retrieval filter vector from the keywords provided by the user and transmits it to the private cloud server;
(3) the private cloud server performs an AND operation between the received retrieval filter vector and the document filter vector of each document; if the resulting vector is not all zeros, the corresponding document ID is added to the candidate document set; the candidate document set is then transmitted to the public cloud server;
(4) based on the received candidate document set, retrieval trapdoor and number k of documents to retrieve, the public cloud server computes the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, selects by secure inner product the k ciphertext documents in the candidate document set that are most relevant to the keywords provided by the user, and returns these k ciphertext documents to the data retrieval end;
(5) the data retrieval end decrypts the k received ciphertext documents, obtaining the k most relevant plaintext documents.
2. The privacy-preserving multi-keyword Top-k ciphertext retrieval method for a hybrid cloud according to claim 1, characterized in that step (1) specifically comprises:
(1-1) the data providing end extracts keywords from the provided document set DS, obtaining the keyword set {w_1, w_2, …, w_n};
(1-2) the keywords in the keyword set are clustered according to their correlation, yielding several clusters {c_1, c_2, …, c_t};
(1-3) each cluster is taken as one block, giving t blocks b_1, b_2, …, b_t, and the keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …} is generated from the blocks, where w(b_j,x) denotes the x-th keyword of block b_j and the keywords within a block are unordered; block b_j = {w(b_j,x) | 0 < x ≤ |b_j|};
(1-4) using the TF-IDF algorithm and the vector space model, a corresponding plaintext document vector V_i is generated for each document D_i in the document set DS = {D_1, D_2, …, D_m} according to the positions of the keywords in the keyword dictionary sequence, and is then normalized; V_i has dimension n, and each element is the term frequency (TF) value in document D_i of the keyword at that position;
(1-5) following the block partition of the keyword dictionary sequence, the plaintext document vector V_i is divided into t blocks whose boundaries coincide with the block boundaries of the keyword dictionary sequence, giving the document filter vector DF_i = {b_1, b_2, …, b_t} of each document D_i; if the values of V_i at the positions of all keywords in block b_j are 0, the value of b_j is 0, otherwise the value of b_j is 1; DF_i is a t-dimensional vector whose elements are each 0 or 1;
(1-6) an encryption key SK(S, M_1, M_2, k_f) is generated, where S is a random vector whose elements are each 0 or 1, M_1 and M_2 are two n × n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
(1-7) each plaintext document vector V_i is encrypted with the generated encryption key using the secure kNN technique, obtaining the corresponding encrypted document vector; in this process V_i is split into V_i' and V_i'' under the control of the random vector S: when the j-th element S[j] = 0, V_i'[j] + V_i''[j] = V_i[j]; when S[j] = 1, V_i'[j] = V_i''[j] = V_i[j];
(1-8) each document in the document set DS is encrypted with a symmetric encryption algorithm, giving the encrypted document set ES = {e_1, e_2, …, e_m};
(1-9) the document filter vectors are transmitted to the private cloud server for storage, and the encrypted document vectors and the encrypted document set are transmitted to the public cloud server for storage.
3. The privacy-preserving multi-keyword Top-k ciphertext retrieval method for a hybrid cloud according to claim 1, characterized in that step (2) specifically comprises:
(2-1) from the multiple keywords {w_1, w_2, …, w_x} provided by the user, the data retrieval end generates the retrieval vector Q using the TF-IDF algorithm and the vector space model, and normalizes it; the j-th element Q[j] of Q is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
(2-2) based on the retrieval vector Q, the retrieval trapdoor is generated using the secure kNN algorithm; in this process Q is split into Q' and Q'' under the control of the random vector S in the encryption key generated by the data providing end: when the j-th element S[j] = 0, Q'[j] = Q''[j] = Q[j]; when S[j] = 1, Q'[j] + Q''[j] = Q[j];
(2-3) following the block partition of the keywords in the keyword dictionary sequence, the retrieval filter vector QF = {b_1, b_2, …, b_t} is generated; QF is a t-dimensional vector whose elements are each 0 or 1: if the values of Q at the positions of all keywords in block b_j are 0, then QF[j] = 0, otherwise QF[j] = 1;
(2-4) the retrieval trapdoor and the number k of documents the user wants to retrieve are transmitted to the public cloud server, and the retrieval filter vector is transmitted to the private cloud server.
4. The privacy-preserving multi-keyword Top-k ciphertext retrieval method for a hybrid cloud according to claim 1, characterized in that step (3) specifically comprises:
(3-1) the private cloud server performs an AND operation between the received retrieval filter vector QF and the document filter vector DF_i of each document; if the vector obtained from QF & DF_i is not all zeros, the document ID Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
(3-2) the candidate document set CDS is sent to the public cloud server.
5. A privacy-preserving multi-keyword Top-k ciphertext retrieval system for a hybrid cloud, characterized in that the system comprises a data providing end, a data retrieval end, a private cloud server and a public cloud server, wherein:
the data providing end is configured to extract a keyword set from the provided document set and generate a keyword dictionary sequence by clustering the keywords and partitioning them into blocks; to generate, based on the keyword dictionary sequence, a corresponding plaintext document vector for each document in the document set and, following the block partition of the keyword dictionary sequence, partition each plaintext document vector into blocks to form a document filter vector; to encrypt the plaintext document vectors to form encrypted document vectors and encrypt each document in the document set to form an encrypted document set; and to transmit the document filter vectors to the private cloud server and the encrypted document vectors and the encrypted document set to the public cloud server;
the data retrieval end is configured to generate a retrieval vector from the multiple keywords provided by the user, normalize it, and use a secure algorithm to generate a retrieval trapdoor, which it transmits to the public cloud server together with the number k of documents the user wants to retrieve; and, following the block partition of the keywords in the keyword dictionary sequence, to generate a retrieval filter vector from the keywords provided by the user and transmit it to the private cloud server;
the private cloud server is configured to perform an AND operation between the received retrieval filter vector and the document filter vector of each document, to add the corresponding document ID to the candidate document set if the resulting vector is not all zeros, and to transmit the candidate document set to the public cloud server;
the public cloud server is configured to compute, based on the received candidate document set, retrieval trapdoor and number k of documents to retrieve, the secure inner product between the retrieval trapdoor and the encrypted document vector of each document in the candidate document set, to select by secure inner product the k ciphertext documents in the candidate document set that are most relevant to the keywords provided by the user, and to return these k ciphertext documents to the data retrieval end;
the data retrieval end is further configured to decrypt the k received ciphertext documents, obtaining the k most relevant plaintext documents.
6. The privacy-preserving multi-keyword Top-k ciphertext retrieval system for a hybrid cloud according to claim 5, characterized in that the data providing end specifically comprises:
a keyword extraction module, configured to extract keywords from the provided document set DS, obtaining the keyword set {w_1, w_2, …, w_n};
a clustering module, configured to cluster the keywords in the keyword set according to their correlation, yielding several clusters {c_1, c_2, …, c_t};
a keyword dictionary generation module, configured to take each cluster as one block, giving t blocks b_1, b_2, …, b_t, and to generate from the blocks the keyword dictionary sequence W = {w(b_1,1), w(b_1,2), …, w(b_2,1), w(b_2,2), …, w(b_t,1), w(b_t,2), …}, where w(b_j,x) denotes the x-th keyword of block b_j and the keywords within a block are unordered; block b_j = {w(b_j,x) | 0 < x ≤ |b_j|};
a plaintext document vector generation module, configured to generate, using the TF-IDF algorithm and the vector space model, a corresponding plaintext document vector V_i for each document D_i in the document set DS = {D_1, D_2, …, D_m} according to the positions of the keywords in the keyword dictionary sequence, and to normalize it; V_i has dimension n, and each element is the term frequency (TF) value in document D_i of the keyword at that position;
a document filter vector generation module, configured to divide the plaintext document vector V_i into t blocks following the block partition of the keyword dictionary sequence, with block boundaries that coincide with those of the keyword dictionary sequence, giving the document filter vector DF_i = {b_1, b_2, …, b_t} of each document D_i; if the values of V_i at the positions of all keywords in block b_j are 0, the value of b_j is 0, otherwise the value of b_j is 1; DF_i is a t-dimensional vector whose elements are each 0 or 1;
a key generation module, configured to generate the encryption key SK(S, M_1, M_2, k_f), where S is a random vector whose elements are each 0 or 1, M_1 and M_2 are two n × n invertible matrices, n is the length of the keyword dictionary sequence, and k_f is the document encryption key;
a document vector encryption module, configured to encrypt each plaintext document vector V_i with the generated encryption key using the secure kNN technique, obtaining the corresponding encrypted document vector; in this process V_i is split into V_i' and V_i'' under the control of the random vector S: when the j-th element S[j] = 0, V_i'[j] + V_i''[j] = V_i[j]; when S[j] = 1, V_i'[j] = V_i''[j] = V_i[j];
a document encryption module, configured to encrypt each document in the document set DS with a symmetric encryption algorithm, obtaining the encrypted document set ES = {e_1, e_2, …, e_m};
a transmission module, configured to transmit the document filter vectors to the private cloud server for storage, and to transmit the encrypted document vectors and the encrypted document set to the public cloud server for storage.
7. The privacy-preserving multi-keyword Top-k ciphertext retrieval system for a hybrid cloud according to claim 5, characterized in that the data retrieval end specifically comprises:
a retrieval vector generation module, configured to generate the retrieval vector Q from the multiple keywords {w_1, w_2, …, w_x} provided by the user using the TF-IDF algorithm and the vector space model, and to normalize it; the j-th element Q[j] of Q is the inverse document frequency (IDF) value of the j-th keyword w_j in the document set DS provided by the data providing end;
a retrieval trapdoor generation module, configured to generate the retrieval trapdoor from the retrieval vector Q using the secure kNN algorithm; in this process Q is split into Q' and Q'' under the control of the random vector S in the encryption key generated by the data providing end: when the j-th element S[j] = 0, Q'[j] = Q''[j] = Q[j]; when S[j] = 1, Q'[j] + Q''[j] = Q[j];
a retrieval filter vector generation module, configured to generate the retrieval filter vector QF = {b_1, b_2, …, b_t} following the block partition of the keywords in the keyword dictionary sequence; QF is a t-dimensional vector whose elements are each 0 or 1: if the values of Q at the positions of all keywords in block b_j are 0, then QF[j] = 0, otherwise QF[j] = 1;
a transmission module, configured to transmit the retrieval trapdoor and the number k of documents the user wants to retrieve to the public cloud server, and to transmit the retrieval filter vector to the private cloud server.
8. The privacy-preserving multi-keyword Top-k ciphertext retrieval system for a hybrid cloud according to claim 5, characterized in that the private cloud server specifically comprises:
an AND operation module, configured to perform an AND operation between the received retrieval filter vector QF and the document filter vector DF_i of each document; if the vector obtained from QF & DF_i is not all zeros, the document ID Did_i corresponding to DF_i is added to the candidate document set, giving the candidate document set CDS = {d_1, d_2, …};
a transmission module, configured to send the candidate document set CDS to the public cloud server.
CN201810122376.8A 2018-02-07 2018-02-07 Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud Active CN108363689B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810122376.8A CN108363689B (en) 2018-02-07 2018-02-07 Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810122376.8A CN108363689B (en) 2018-02-07 2018-02-07 Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud

Publications (2)

Publication Number Publication Date
CN108363689A true CN108363689A (en) 2018-08-03
CN108363689B CN108363689B (en) 2021-03-19

Family

ID=63005057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810122376.8A Active CN108363689B (en) 2018-02-07 2018-02-07 Privacy protection multi-keyword Top-k ciphertext retrieval method and system facing hybrid cloud

Country Status (1)

Country Link
CN (1) CN108363689B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109194666A (en) * 2018-09-18 2019-01-11 东北大学 A kind of safe kNN querying method based on LBS
CN109271485A (en) * 2018-09-19 2019-01-25 南京邮电大学 It is a kind of to support semantic cloud environment encrypted document ordering searching method
CN109739945A (en) * 2018-12-13 2019-05-10 南京邮电大学 A kind of multi-key word ciphertext ordering searching method based on hybrid index
CN110727951A (en) * 2019-10-14 2020-01-24 桂林电子科技大学 Lightweight outsourcing file multi-keyword retrieval method and system with privacy protection function
CN112597268A (en) * 2020-12-22 2021-04-02 南京邮电大学 Retrieval filtering threshold value selection method for cloud environment ciphertext retrieval efficiency optimization
WO2021103708A1 (en) * 2019-11-26 2021-06-03 支付宝(杭州)信息技术有限公司 Data query method, apparatus, device and system based on privacy information protection
CN114189391A (en) * 2022-02-14 2022-03-15 浙江易天云网信息科技有限公司 Privacy data control and management method suitable for hybrid cloud
CN116521743A (en) * 2023-06-27 2023-08-01 北京中科江南信息技术股份有限公司 Ciphertext retrieval method and device, storage medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120054485A1 (en) * 2010-08-25 2012-03-01 Sony Corporation Terminal device, server, data processing system, data processing method, and program
CN104765848A (en) * 2015-04-17 2015-07-08 中国人民解放军空军航空大学 Symmetrical searchable encryption method for supporting result high-efficiency sequencing in hybrid cloud storage
CN105681280A (en) * 2015-12-29 2016-06-15 西安电子科技大学 Searchable encryption method based on Chinese in cloud environment
CN106815350A (en) * 2017-01-19 2017-06-09 安徽大学 Dynamic ciphertext multi-key word searches for method generally in a kind of cloud environment
CN107634829A (en) * 2017-09-12 2018-01-26 南京理工大学 Encrypted electronic medical records system and encryption method can search for based on attribute

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
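Dai Hua et al.: "A survey of multi-keyword ranked ciphertext retrieval in cloud environments", Computer Science (《计算机科学》) *
Li Hui et al.: "A survey of data security and privacy protection technologies for public cloud storage services", Journal of Computer Research and Development (《计算机研究与发展》) *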

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109194666A (en) * 2018-09-18 2019-01-11 东北大学 A secure kNN query method based on LBS
CN109194666B (en) * 2018-09-18 2021-06-01 东北大学 LBS-based security kNN query method
CN109271485A (en) * 2018-09-19 2019-01-25 南京邮电大学 A semantics-supporting ranked search method for encrypted documents in a cloud environment
CN109739945A (en) * 2018-12-13 2019-05-10 南京邮电大学 A multi-keyword ciphertext ranked search method based on a hybrid index
CN109739945B (en) * 2018-12-13 2022-11-08 南京邮电大学 Multi-keyword ciphertext sorting and searching method based on mixed index
CN110727951A (en) * 2019-10-14 2020-01-24 桂林电子科技大学 Lightweight outsourcing file multi-keyword retrieval method and system with privacy protection function
WO2021103708A1 (en) * 2019-11-26 2021-06-03 支付宝(杭州)信息技术有限公司 Data query method, apparatus, device and system based on privacy information protection
CN112597268A (en) * 2020-12-22 2021-04-02 南京邮电大学 Retrieval filtering threshold value selection method for cloud environment ciphertext retrieval efficiency optimization
CN112597268B (en) * 2020-12-22 2022-09-20 南京邮电大学 Retrieval filtering threshold value selection method for cloud environment ciphertext retrieval efficiency optimization
CN114189391A (en) * 2022-02-14 2022-03-15 浙江易天云网信息科技有限公司 Privacy data control and management method suitable for hybrid cloud
CN116521743A (en) * 2023-06-27 2023-08-01 北京中科江南信息技术股份有限公司 Ciphertext retrieval method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN108363689B (en) 2021-03-19

Similar Documents

Publication Publication Date Title
CN108363689A (en) Hybrid-cloud-oriented privacy-preserving multi-keyword Top-k ciphertext retrieval method and system
Zhang et al. PIC: Enable large-scale privacy preserving content-based image search on cloud
US10013574B2 (en) Method and apparatus for secure storage and retrieval of encrypted files in public cloud-computing platforms
CN104408177B (en) Cipher text retrieval method based on cloud document system
US9313232B2 (en) System and method for data mining and security policy management
Chuah et al. Privacy-aware bedtree based solution for fuzzy multi-keyword search over encrypted data
Sadeghian et al. SQL injection is still alive: a study on SQL injection signature evasion techniques
CN106407447A (en) Simhash-based fuzzy sequencing searching method for encrypted cloud data
US11595435B2 (en) Methods and systems for detecting phishing emails using feature extraction and machine learning
CN106326360A (en) Fuzzy multi-keyword retrieval method of encrypted data in cloud environment
Su et al. Privacy-preserving top-k spatial keyword queries in untrusted cloud environments
CN108959567A (en) A secure retrieval method suitable for large-scale images in a cloud environment
CN106972927A (en) An encryption method and system for different security levels
CN115314295B (en) Block chain-based searchable encryption technical method
US20130246338A1 (en) System and method for indexing a capture system
Boucenna et al. Secure inverted index based search over encrypted cloud data with user access rights management
CN109739945A (en) A multi-keyword ciphertext ranked search method based on a hybrid index
CN109740378B (en) Security pair index structure resisting keyword privacy disclosure and retrieval method thereof
Wang et al. A modified homomorphic encryption method for multiple keywords retrieval
Abduljabbar et al. Secure biometric image retrieval in IoT-cloud
Gupta et al. A learning oriented DLP system based on classification model
CN108829714A (en) A multi-keyword fuzzy search method for ciphertext data
CN112966086A (en) Verifiable fuzzy search method based on position sensitive hash function
Hussain et al. A novel method for preserving privacy in big-data mining
Yao et al. Topic-based rank search with verifiable social data outsourcing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20180803

Assignee: NUPT INSTITUTE OF BIG DATA RESEARCH AT YANCHENG

Assignor: Nanjing University of Posts and Telecommunications

Contract record no.: X2021980013920

Denomination of invention: Hybrid-cloud-oriented privacy-preserving multi-keyword Top-k ciphertext retrieval method and system

Granted publication date: 20210319

License type: Common License

Record date: 20211202
