CN109063509A - Searchable encryption method based on keyword semantic ranking - Google Patents
Searchable encryption method based on keyword semantic ranking
- Publication number
- CN109063509A (application number CN201810890114.6A)
- Authority
- CN
- China
- Prior art keywords
- document
- vector
- keyword
- cloud server
- owned cloud
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6227—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2107—File encryption
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2141—Access rights, e.g. capability lists, access control lists, access tables, access matrices
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Computer Hardware Design (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Storage Device Security (AREA)
Abstract
The invention discloses a searchable encryption method based on keyword semantic ranking, comprising the following steps: the data owner generates an encryption key and sends it to the authorized user, extracts keywords from the plaintext document set, constructs document tag vectors from the keywords and sends them to the private cloud server, and creates a secure index tree and sends it to the public cloud server; the authorized user sends the query tag vector generated from the search keywords to the private cloud server and sends the generated query trapdoor to the public cloud server; the private cloud server matches the query tag vector against the document tag vectors and sends the set of secure index identifiers that may satisfy the user's query to the public cloud server; the public cloud server computes document similarity, ranks the results, and returns the ciphertext documents with the highest similarity to the authorized user. By implementing a multi-keyword semantic ranking method, the invention effectively improves retrieval efficiency while keeping the data private.
Description
Technical field
The present invention relates to the field of cloud security, and in particular to a searchable encryption method based on keyword semantic ranking.
Background art
With the rapid development of cloud computing, more and more users outsource their data to cloud servers. While providing high-quality data storage services, however, cloud servers also pose a serious threat to the privacy of user data. To keep sensitive data from leaking, data owners usually encrypt it before uploading it to a cloud server, and search techniques for encrypted data have emerged as a result. Once data is stored on the cloud server in encrypted form, neither the cloud server administrator nor an external attacker can learn its true content, which protects user privacy. This in turn makes information retrieval over ciphertext data very challenging; in particular, how to provide efficient and secure search over ciphertext in an untrusted environment has attracted widespread attention.
When searchable encryption was first proposed, each word in a document was encrypted with a symmetric key using a two-layer structure, but this full-text search approach was too inefficient. Keyword-based public-key searchable encryption was proposed later: any user holding the public key can store data on the server, while only the user holding the private key can search for keywords. Because this construction is based on bilinear pairings, however, its retrieval efficiency is very low, and it is unsuitable for cloud computing environments with large-scale data. To serve search requests better, fuzzy keyword search schemes were proposed, which build fuzzy keyword sets to tolerate spelling errors and format inconsistencies in user input; their drawback is that they cannot find documents that are semantically related to the keywords.
Although some encryption methods in existing research support retrieval modes such as semantic fuzzy search, multi-keyword search, parallel search, and similarity retrieval, they cannot bring semantic similarity into the scoring of documents.
Summary of the invention
The object of the present invention is to provide a searchable encryption method based on keyword semantic ranking. Building on keyword-based semantic fuzzy search, the method extracts keywords, constructs index vectors, and uses keyword semantics as document features to realize multi-keyword ranked ciphertext search, so that users can quickly find the most relevant data; this reduces network traffic overhead and improves search accuracy. In addition, the method uses a hybrid cloud model, that is, the system model contains one public cloud server and one private cloud server, which makes the fullest use of server resources, reduces the computing overhead of the terminal, and gives the model a higher level of security.
To achieve the above object, the present invention provides a searchable encryption method based on keyword semantic ranking, comprising the following steps:
S1. The data owner extracts keywords from the plaintext document set to obtain a keyword set, generates a key SK for encrypting the plaintext document set, and sends SK to the authorized user.
S2. The data owner constructs the document tag vectors and creates the secure index tree from the keyword set, then sends the document tag vectors to the private cloud server and the secure index tree to the public cloud server.
S3. The data owner encrypts the plaintext document set with the key SK to obtain the ciphertext document set, and sends the ciphertext document set to the public cloud server.
S4. The authorized user inputs the keyword set to be searched, derives the query tag vector and the trapdoor from it, and then sends the query tag vector to the private cloud server and the trapdoor to the public cloud server.
S5. The private cloud server matches the query tag vector sent by the authorized user against the document tag vectors sent by the data owner to obtain the set of candidate index identifiers of documents that may contain the keywords, and then sends this candidate index identifier set to the public cloud server.
S6. The public cloud server receives the candidate index identifier set sent by the private cloud server. Based on this set and the secure index tree sent by the data owner, it computes the similarity score of each document from the corresponding encrypted document index vector and the trapdoor sent by the authorized user, ranks the scores, and returns the top k ciphertext documents to the authorized user. Each leaf node of the secure index tree corresponds to one document index vector, and the value in each dimension of the document index vector stored in a leaf node is the TF value of the keyword corresponding to that dimension. Here k is the number of ciphertext documents required by the authorized user, and the TF value is the frequency with which a given keyword occurs in a document.
S7. The authorized user uses the key SK to decrypt the top k ciphertext documents returned by the public cloud server and obtains the corresponding plaintext documents.
Preferably, step S1 further comprises the following steps:
S1.1. The data owner extracts keywords from the plaintext document set F = (f1, f2, …, fm) to obtain the keyword set W = (w1, w2, …, wn), where w1, w2, …, wn are the extracted keywords.
S1.2. The data owner randomly generates an n-dimensional vector S; the value in each dimension of S is generated at random and is either 1 or 0.
S1.3. The data owner randomly generates two invertible n × n matrices M1 and M2; every entry of both matrices is generated at random, and n is the number of keywords in the keyword set.
S1.4. The data owner randomly selects two keys sk1 and sk2. The key SK for encrypting the plaintext document set F is a five-tuple, written {S, M1, M2, sk1, sk2}.
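Steps S1.2 to S1.4 can be sketched as follows. This is only an illustration under stated assumptions: the text says that M1 and M2 are random and invertible but not how they are produced, so the sketch builds each one as the product of two random unit-triangular integer matrices (which guarantees determinant 1); the 16-byte length of sk1 and sk2, the entry range, and all function names are hypothetical.

```python
import secrets


def random_unit_triangular(n, lower, bound=3):
    """Random integer triangular matrix with ones on the diagonal (determinant 1)."""
    m = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            if r == c:
                m[r][c] = 1
            elif (r > c) == lower:
                m[r][c] = secrets.randbelow(2 * bound + 1) - bound
    return m


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def random_invertible(n):
    # Assumption: a lower-times-upper product is one simple way to guarantee invertibility.
    return matmul(random_unit_triangular(n, True), random_unit_triangular(n, False))


def det(m):
    """Cofactor-expansion determinant; only meant for small sanity checks."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def generate_key(n):
    """Return the five-tuple SK = (S, M1, M2, sk1, sk2) for n keywords."""
    s = [secrets.randbelow(2) for _ in range(n)]   # split indicator bits (S1.2)
    m1 = random_invertible(n)                      # S1.3
    m2 = random_invertible(n)
    sk1 = secrets.token_bytes(16)                  # S1.4, key length is an assumption
    sk2 = secrets.token_bytes(16)
    return s, m1, m2, sk1, sk2
```

The `det` helper is included only so that the invertibility of a generated matrix can be checked for small n; it is far too slow for realistic keyword counts.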
Preferably, step S2 further comprises the following steps:
S2.1. Construct the document tag vector B: for each document fi in the plaintext document set F, first generate an n-dimensional document vector D = (D1, D2, …, Di, …, Dn), where i ∈ {1, 2, …, n} and n is the number of keywords in the keyword set. Each dimension of the document vector corresponds to one keyword in the keyword set: Di is set to the TF value of the corresponding keyword in the current plaintext document, or to 0 if the document does not contain that keyword. The document vector D is then divided into u blocks; if a block is all zeros, its tag value is bbi = 0, otherwise bbi = 1. This yields the document tag vector B = (bb1, bb2, …, bbi, …, bbu), where i ∈ {1, 2, …, u}.
S2.2. Construct the secure index tree I: each leaf node of the secure index tree I corresponds to an n-dimensional document index vector V = (V1, V2, …, Vi, …, Vn). One leaf node is generated for each document fi in the plaintext document set F; since F contains m documents, the secure index tree I has m leaf nodes. Each leaf node stores the document index vector V of its document: if the document corresponding to the leaf node contains the keyword, then Vi in V is 1, otherwise it is 0. Each intermediate node v of the secure index tree I stores an n-dimensional document index vector V = (Vv[1], Vv[2], …, Vv[i], …, Vv[n]): if the corresponding entry of the document index vector stored in the left or right child of v is not 0, then Vv[i] = 1, otherwise Vv[i] = 0, where i is the position of keyword wi in the keyword set W. Vv[i] = 1 therefore means that there is at least one path from the intermediate node v to some leaf node containing keyword wi.
S2.3. The data owner encrypts the document index vector V in each leaf node of the secure index tree I. The document index vector V = (V1, V2, …, Vi, …, Vn) in each leaf node is split into two random vectors {V', V''}, using the n-dimensional vector S randomly generated by the data owner as the split indicator: if the j-th entry of S is 0, V'[j] and V''[j] are both set equal to V[j]; if the j-th entry of S is 1, V'[j] and V''[j] are set at random such that their sum equals V[j]. V' and V'' are then each encrypted with the key SK, using the two invertible n × n matrices M1 and M2 randomly generated by the data owner, which yields the encrypted document index vector [formulas missing from this text]. The encrypted document index vector is stored in the leaf node in place of V, and the original V is deleted.
S2.4. The data owner sends the secure index tree I to the public cloud server and the document tag vector B to the private cloud server.
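The construction of the document vector D and the tag vector B in step S2.1 can be illustrated with a short sketch. Two points are assumptions, since the text does not fix them: TF is taken as occurrences divided by document length, and the u blocks are taken as contiguous, equally sized slices of D. The function names and the sample keywords are hypothetical.

```python
def document_vector(doc_tokens, keywords):
    """D[i] is the term frequency of keywords[i] in one document, or 0 if absent."""
    total = len(doc_tokens)
    if total == 0:
        return [0.0] * len(keywords)
    return [doc_tokens.count(word) / total for word in keywords]


def tag_vector(d, u):
    """Divide D into u contiguous blocks; a block's tag is 0 only if it is all zeros."""
    size = -(-len(d) // u)  # ceiling division, so trailing blocks may be shorter or empty
    return [int(any(x != 0 for x in d[b * size:(b + 1) * size])) for b in range(u)]


# Hypothetical keyword set with n = 6, divided into u = 3 blocks of two keywords each.
keywords = ["cloud", "key", "index", "tree", "query", "rank"]
d = document_vector("cloud key cloud data".split(), keywords)
b = tag_vector(d, u=3)
```

Only the first block contains non-zero TF values here, so the private cloud server would learn just which blocks of the keyword set the document touches, not the TF values themselves.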
Preferably, step S4 further comprises the following steps:
S4.1. The authorized user inputs the keyword set to be searched, W' = (w'1, w'2, …, w'n), and generates an n-dimensional query tag vector Q = (Q1, Q2, …, Qi, …, Qn) for the search request. Each dimension of Q corresponds to one keyword in the keyword set, that is, Qi corresponds to keyword wi in W, where i ∈ {1, 2, …, n}. If keyword wi is in the keyword set W' to be searched, Qi is set to the IDF value of the corresponding keyword in the document set; otherwise Qi is set to 0. The IDF value is obtained by dividing the number of documents that contain a given word by the total number of documents in the document set.
S4.2. Generate the trapdoor: the query tag vector Q is split into two random vectors {Q', Q''}. If the j-th entry of S is 0, Q'j and Q''j are set at random such that their sum equals Qj; if the j-th entry of S is 1, Q'j and Q''j are both set equal to Qj. The query tag vector Q is then encrypted with the key SK to generate the trapdoor T. Since Q has been split into the two random vectors Q' and Q'', encrypting Q means encrypting each of the two vectors with SK, where M1 and M2 are the two invertible n × n matrices randomly generated by the data owner [the formulas for T and for the two encrypted vectors are missing from this text]. The authorized user sends the trapdoor T to the public cloud server and the query tag vector Q to the private cloud server.
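The query tag vector of step S4.1 can be sketched as below. The sketch follows the ratio stated in the text (documents containing the word divided by all documents) rather than the logarithmic form often used elsewhere; the function names and the toy document set are hypothetical.

```python
def idf(word, documents):
    """Ratio used in the text: documents containing the word / all documents."""
    containing = sum(1 for doc in documents if word in doc)
    return containing / len(documents)


def query_vector(search_keywords, keywords, documents):
    """Q[i] is the IDF value of keywords[i] if it is being searched for, else 0."""
    wanted = set(search_keywords)
    return [idf(word, documents) if word in wanted else 0.0 for word in keywords]


# Toy document set, each document given as a list of tokens.
documents = [["cloud", "key"], ["cloud"], ["tree"], ["rank", "tree"]]
keywords = ["cloud", "key", "tree", "rank"]
q = query_vector(["cloud", "rank"], keywords, documents)
```

Keywords that are not part of the search request keep a zero entry, which is what later lets the private cloud server skip blocks that the query does not touch.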
Preferably, step S5 further comprises the following steps:
S5.1. After receiving the query tag vector Q sent by the authorized user, the private cloud server matches Q in turn against each document tag vector B sent by the data owner and checks whether the corresponding value in B is 0. If it is 0, the keyword set provided by the data owner does not contain the keyword being searched for; if it is 1, the corresponding document index vector is recorded.
S5.2. From the recorded document index vectors, the private cloud server obtains the candidate index identifier set SID = {…, sidi, …, sidj, …, sidz, …} containing the keywords to be searched, where i, j, z ∈ {1, 2, …, n}, and then sends SID to the public cloud server.
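A minimal sketch of the filtering in steps S5.1 and S5.2 is given below. The text does not say how positions of Q map to blocks of B or how several queried keywords are combined, so the sketch assumes contiguous, equally sized blocks and keeps a document when at least one queried block has tag value 1. The function name and the identifiers d1 to d3 are hypothetical.

```python
def candidate_ids(q, tag_vectors, u):
    """Return identifiers of documents whose tag vector may match the query.

    q            -- n-dimensional query tag vector
    tag_vectors  -- mapping from index identifier to its u-dimensional tag vector B
    u            -- number of blocks the n keyword positions are divided into
    """
    size = -(-len(q) // u)  # ceiling division, same block layout on both sides
    query_blocks = {i // size for i, value in enumerate(q) if value != 0}
    return [sid for sid, b in tag_vectors.items()
            if any(b[block] == 1 for block in query_blocks)]


tags = {"d1": [1, 0, 0], "d2": [0, 1, 0], "d3": [0, 0, 1]}
sid = candidate_ids([0.5, 0, 0, 0, 0, 0.25], tags, u=3)
```

The query touches the first and third blocks, so d2 is filtered out before the public cloud server computes any similarity scores.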
Preferably, step S6 further comprises the following:
Using the candidate index identifier set SID = {…, sidi, …, sidj, …, sidz, …} sent by the private cloud server, the public cloud server finds the corresponding encrypted document index vectors sent by the data owner and, together with the trapdoor T sent by the authorized user, computes the similarity score SC of each document. The formula for the document similarity score SC is as follows:
[formula missing from this text]
where i ∈ (1, 2, …, n).
The public cloud server ranks the computed document similarity scores and returns the top k ciphertext documents to the authorized user, where k is the number of ciphertext documents required by the authorized user.
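The score formula itself does not survive in this text. One plausible reading, offered only as an assumption, is the usual secure kNN construction, in which the index halves are multiplied by the transposes of M1 and M2 and the query halves by their inverses. Under that assumption the score reduces to the plain inner product of the document index vector and the query tag vector:

```latex
\begin{align*}
\tilde{V} &= \{\, M_1^{T} V',\; M_2^{T} V'' \,\}, \qquad
T = \{\, M_1^{-1} Q',\; M_2^{-1} Q'' \,\} \\
SC &= (M_1^{T} V')^{T} (M_1^{-1} Q') + (M_2^{T} V'')^{T} (M_2^{-1} Q'') \\
   &= {V'}^{T} Q' + {V''}^{T} Q'' \\
   &= \sum_{i=1}^{n} V_i \, Q_i
\end{align*}
```

The last step uses the complementary splitting rules: where the j-th entry of S is 0, both index halves equal V[j] and the query halves sum to Q[j]; where it is 1, both query halves equal Q[j] and the index halves sum to V[j]. Either way the j-th contribution is V[j] times Q[j].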
Compared with the prior art, the beneficial effects of the invention are as follows. The invention extracts the keywords that the authorized user wants to search for, constructs an encrypted searchable index tree, and introduces semantic similarity into the scoring of documents. As a result, when the authorized user wants to find documents semantically related to the search keywords but cannot come up with the exact keywords, the invention can match the documents with the highest semantic relevance and return them to the authorized user. This realizes keyword semantic ranked search and effectively improves the authorized user's retrieval efficiency. Because the invention matches the document tag vectors against the authorized user's query tag vector, a large number of irrelevant documents are filtered out. While guaranteeing security, this searchable encryption method based on keyword semantic ranking reduces index creation time, effectively improves the user's retrieval efficiency, and returns semantically ranked search results that better match what the authorized user is looking for. The invention also uses a hybrid cloud model; because the private cloud offers higher security, the method can make the fullest use of server resources and, while achieving efficient retrieval, ensures the security of the data without leaking any keyword-related information.
Brief description of the drawings
Fig. 1 is a flow chart of the searchable encryption method based on keyword semantic ranking of the present invention.
Detailed description of the embodiments
The present invention is further explained below through a detailed description of a preferred embodiment with reference to the drawing.
As shown in Fig. 1, the searchable encryption method based on keyword semantic ranking of the present invention constructs a searchable index tree, which markedly improves performance in semantic fuzzy ranking of keywords and greatly increases the retrieval efficiency of authorized users. The method comprises the following steps:
Step S1. System initialization: the data owner extracts keywords from the plaintext document set F = (f1, f2, …, fm) to obtain the keyword set W = (w1, w2, …, wn), generates a key SK for encrypting the plaintext document set, and sends SK to the authorized user, where w1, w2, …, wn are the extracted keywords.
Step S2. The data owner constructs the document tag vector B and creates the secure index tree I from the keyword set W = (w1, w2, …, wn), then sends the document tag vector B to the private cloud server and the secure index tree I to the public cloud server.
Step S3. The data owner encrypts the plaintext document set F = (f1, f2, …, fm) with the key SK to obtain the ciphertext document set C = (c1, c2, …, cm), and sends the ciphertext document set to the public cloud server.
Step S4. The authorized user inputs the keyword set to be searched, W' = (w'1, w'2, …, w'n), derives the query tag vector Q and the trapdoor T from it, and then sends Q to the private cloud server and T to the public cloud server, where w'1, w'2, …, w'n are the keywords to be searched.
Step S5. The private cloud server matches the query tag vector Q sent by the authorized user against the document tag vectors B sent by the data owner to obtain the candidate index identifier set SID of documents that may contain the keywords, and then sends SID to the public cloud server.
Step S6. The public cloud server receives the candidate index identifier set SID = {…, sidi, …, sidj, …, sidz, …} sent by the private cloud server. Based on SID and the secure index tree I sent by the data owner, it computes the similarity score of each document from the corresponding encrypted document index vector and the trapdoor T sent by the authorized user, ranks the scores, and returns the top k ciphertext documents to the authorized user. Each leaf node of the secure index tree I corresponds to one document index vector, and the value in each dimension of the document index vector stored in a leaf node is the TF value of the keyword corresponding to that dimension. Here k is the number of ciphertext documents required by the authorized user, and the TF value is the frequency with which a given keyword occurs in a document.
Step S7. The authorized user uses the key SK to decrypt the top k ciphertext documents returned by the public cloud server and obtains the corresponding plaintext documents.
By way of example, in step S1 the data owner generates the key SK for encrypting the plaintext document set as follows:
Step S1.1. The data owner extracts keywords from the plaintext document set F = (f1, f2, …, fm) to obtain the keyword set W = (w1, w2, …, wn).
Step S1.2. The data owner randomly generates an n-dimensional vector S; the value in each dimension of S is generated at random and is either 1 or 0.
Step S1.3. The data owner randomly generates two invertible n × n matrices M1 and M2; every entry of both matrices is also generated at random.
Step S1.4. The data owner randomly selects two keys sk1 and sk2.
The key SK for encrypting the plaintext document set is therefore a five-tuple, written {S, M1, M2, sk1, sk2}, where n is the number of keywords in the keyword set.
By way of example, step S2 specifically comprises the following process:
Step S2.1. Construct the document tag vector B: for each document fi in the plaintext document set, first generate an n-dimensional document vector D = (D1, D2, …, Di, …, Dn), where i ∈ {1, 2, …, n} and n is the number of keywords in the keyword set. Each dimension of the document vector corresponds to one keyword in the keyword set: Di is set to the TF value of the corresponding keyword in the current plaintext document, or to 0 if the document does not contain that keyword. The document vector D is then divided into u blocks; if a block is all zeros, its tag value is bbi = 0, otherwise bbi = 1. This yields the document tag vector B = (bb1, bb2, …, bbi, …, bbu), where i ∈ {1, 2, …, u}.
Step S2.2. Construct the secure index tree I: each leaf node of the secure index tree I corresponds to an n-dimensional document index vector V = (V1, V2, …, Vi, …, Vn). One leaf node is generated for each document fi in the plaintext document set F; because F contains m documents, the secure index tree I has m leaf nodes. Each leaf node stores the document index vector V of its document: if the document corresponding to the leaf node contains the keyword, then Vi in V is 1, otherwise it is 0. Each intermediate node v of the secure index tree stores an n-dimensional document index vector V = (Vv[1], Vv[2], …, Vv[i], …, Vv[n]): if the corresponding entry of the document index vector stored in the left or right child of v is not 0, then Vv[i] = 1, otherwise Vv[i] = 0, where i is the position of keyword wi in the keyword set W. Vv[i] = 1 therefore means that there is at least one path from the intermediate node v to some leaf node containing keyword wi.
Step S2.3. The data owner encrypts the document index vector V in each leaf node of the secure index tree I. The document index vector V = (V1, V2, …, Vi, …, Vn) in each leaf node is split into two random vectors {V', V''}. The splitting procedure is as follows: the n-dimensional vector S randomly generated by the data owner serves as the split indicator; if the j-th entry of S is 0, V'[j] and V''[j] are both set equal to V[j]; if the j-th entry of S is 1, V'[j] and V''[j] are set at random, subject to their sum being equal to V[j]. V is then encrypted with the key SK: since V has been split into two random vectors, V' and V'' are each encrypted, using the two invertible n × n matrices M1 and M2 randomly generated by the data owner, which yields the encrypted document index vector [formulas missing from this text]. The encrypted document index vector is stored in the leaf node in place of V, and the original V is deleted.
Step S2.4. The data owner sends the secure index tree I to the public cloud server and the document tag vector B to the private cloud server.
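The tree of step S2.2 can be sketched on plaintext 0/1 vectors, before the leaf encryption of step S2.3. The text does not specify the shape of the tree, so this sketch assumes a simple bottom-up pairing of nodes; each intermediate node stores the entry-wise OR of its children, which matches the rule that Vv[i] = 1 when some leaf below v contains keyword wi. The `Node` class and `build_tree` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    vector: List[int]                 # 0/1 entry per keyword
    doc_id: Optional[str] = None      # set only on leaf nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(leaf_vectors):
    """Build the index tree bottom-up from a non-empty {doc_id: 0/1 vector} mapping."""
    level = [Node(vector=list(v), doc_id=d) for d, v in leaf_vectors.items()]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            left = level[i]
            right = level[i + 1] if i + 1 < len(level) else None
            if right is None:
                merged = list(left.vector)  # odd node out: parent copies its only child
            else:
                merged = [int(a or b) for a, b in zip(left.vector, right.vector)]
            parents.append(Node(vector=merged, left=left, right=right))
        level = parents
    return level[0]


root = build_tree({"f1": [1, 0, 0], "f2": [0, 1, 0], "f3": [0, 0, 0]})
```

A search can then skip any subtree whose node vector has a 0 in every queried position, since no leaf below it can contain those keywords.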
By way of example, step S4 specifically comprises the following process:
Step S4.1. The authorized user inputs the keyword set to be searched, W' = (w'1, w'2, …, w'n), and generates an n-dimensional query tag vector Q = (Q1, Q2, …, Qi, …, Qn) for the search request. Each dimension of Q corresponds to one keyword in the keyword set, that is, Qi corresponds to keyword wi in W, where i ∈ {1, 2, …, n}. If keyword wi is in the keyword set W' to be searched, Qi is set to the IDF value of the corresponding keyword in the document set; otherwise Qi is set to 0. The IDF value is obtained by dividing the number of documents that contain a given word by the total number of documents in the document set.
Step S4.2. Generate the trapdoor: the query tag vector Q is split into two random vectors {Q', Q''}. If the j-th entry of S is 0, Q'j and Q''j are set at random such that their sum equals Qj; if the j-th entry of S is 1, Q'j and Q''j are both set equal to Qj. The query tag vector Q is then encrypted with the key SK. Since Q has been split into the two random vectors Q' and Q'', encrypting Q means encrypting each of the two vectors with SK, which generates the trapdoor T, where M1 and M2 are the two invertible n × n matrices randomly generated by the data owner [the formulas for T and for the two encrypted vectors are missing from this text]. The authorized user sends the trapdoor T to the public cloud server and the query tag vector Q to the private cloud server.
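The splitting rules for the index vector (step S2.3) and the query vector (step S4.2) are complementary, and their effect can be checked numerically. The following is a minimal sketch, not the patented implementation: the matrix formulas are missing from this text, so it assumes the usual secure kNN form in which the index halves are multiplied by the transposes of M1 and M2 and the query halves by their inverses. All function names, the sample matrices, and the range of the random shares are hypothetical.

```python
import secrets
from fractions import Fraction


def split(vec, s, random_when):
    """Split vec into two shares: random shares summing to vec[j] where
    s[j] == random_when, and two copies of vec[j] elsewhere."""
    first, second = [], []
    for value, bit in zip(vec, s):
        if bit == random_when:
            r = Fraction(secrets.randbelow(21) - 10)  # illustrative range only
            first.append(r)
            second.append(value - r)
        else:
            first.append(value)
            second.append(value)
    return first, second


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse(m):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def encrypt_index(v, s, m1, m2):
    v1, v2 = split(v, s, random_when=1)   # index: random split where S[j] == 1
    return matvec(transpose(m1), v1), matvec(transpose(m2), v2)


def make_trapdoor(q, s, m1, m2):
    q1, q2 = split(q, s, random_when=0)   # query: random split where S[j] == 0
    return matvec(inverse(m1), q1), matvec(inverse(m2), q2)


def score(enc_index, trapdoor):
    """Inner product of the two encrypted pairs; equals V . Q under the assumed form."""
    return sum(a * b for half in zip(enc_index, trapdoor) for a, b in zip(*half))


s = [1, 0, 1]
m1 = [[2, 1, 0], [0, 1, 0], [1, 0, 1]]
m2 = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]
v = [Fraction(1, 2), Fraction(0), Fraction(1, 4)]
q = [Fraction(1, 3), Fraction(1, 5), Fraction(0)]
sc = score(encrypt_index(v, s, m1, m2), make_trapdoor(q, s, m1, m2))
```

Exact fractions are used so that the comparison with the plaintext inner product is not blurred by floating-point rounding; a practical system would more likely use floating-point matrices.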
By way of example, step S5 specifically comprises the following process:
Step S5.1. After receiving the query tag vector Q sent by the authorized user, the private cloud server matches Q in turn against each document tag vector B sent by the data owner and checks whether the corresponding value in B is 0. If it is 0, the keyword set provided by the data owner does not contain the keyword being searched for; if it is 1, the corresponding document index vector is recorded.
Step S5.2. From the recorded document index vectors, the private cloud server obtains the candidate index identifier set SID = {…, sidi, …, sidj, …, sidz, …} containing the keywords to be searched, where i, j, z ∈ {1, 2, …, n}, and then sends SID to the public cloud server.
By way of example, step S6 specifically comprises the following process:
Using the candidate index identifier set SID = {…, sidi, …, sidj, …, sidz, …} sent by the private cloud server, the public cloud server finds the corresponding encrypted document index vectors sent by the data owner and, together with the trapdoor T sent by the authorized user, computes the similarity score SC of each document. The formula for the document similarity score SC is as follows:
[formula missing from this text]
where i ∈ (1, 2, …, n).
The public cloud server ranks the computed document similarity scores and returns the top k ciphertext documents to the authorized user, where k is the number of ciphertext documents required by the authorized user.
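The final ranking step can be sketched with the standard library. The sketch assumes that the similarity scores have already been computed and are held in a mapping from ciphertext document identifier to score; the function name and the identifiers c1 to c3 are hypothetical.

```python
import heapq


def top_k(scores, k):
    """Return the identifiers of the k highest-scoring documents, best first."""
    best = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
    return [doc_id for doc_id, _ in best]


ranked = top_k({"c1": 0.2, "c2": 0.9, "c3": 0.5}, k=2)
```

Using `heapq.nlargest` avoids sorting the full candidate list when k is much smaller than the number of candidates.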
To evaluate the performance of the searchable encryption model based on keyword semantic ranking, the method proposed by the present invention was compared with several traditional public key encryption methods that support multi-keyword retrieval. The comparison between the encryption method proposed by the present invention and the conventional public key encryption methods, obtained from experimental results, is shown in Table 1 below:
Table 1. Comparison of the characteristics of several encryption methods
As can be seen from Table 1, if higher accuracy of the authorized user's search results is required, additional storage overhead is hard to avoid. When the authorized user wants more accurate search results, the searchable encryption method based on keyword semantic ranking proposed by the present invention, compared with traditional methods that support multi-keyword search, not only realizes multi-keyword semantic ranked search but also, by building a searchable index tree, further improves the authorized user's search efficiency, considerably reduces the computation and storage overhead of the index, and ensures the accuracy of the authorized user's search results.
Although the present invention has been described in detail through the preferred embodiment above, it should be understood that the above description is not to be regarded as limiting the invention. Various modifications and substitutions will be apparent to those skilled in the art after reading the above. The scope of protection of the present invention should therefore be defined by the appended claims.
Claims (6)
1. A searchable encryption method based on keyword semantic ranking, characterized in that the method comprises the following steps:
S1. The data owner extracts keywords from the plaintext document set to obtain a keyword set, generates a key SK for encrypting the plaintext document set, and sends SK to the authorized user.
S2. The data owner constructs the document tag vectors and creates the secure index tree from the keyword set, then sends the document tag vectors to the private cloud server and the secure index tree to the public cloud server.
S3. The data owner encrypts the plaintext document set with the key SK to obtain the ciphertext document set, and sends the ciphertext document set to the public cloud server.
S4. The authorized user inputs the keyword set to be searched, derives the query tag vector and the trapdoor from it, and then sends the query tag vector to the private cloud server and the trapdoor to the public cloud server.
S5. The private cloud server matches the query tag vector sent by the authorized user against the document tag vectors sent by the data owner to obtain the set of candidate index identifiers of documents that may contain the keywords, and then sends this candidate index identifier set to the public cloud server.
S6. The public cloud server receives the candidate index identifier set sent by the private cloud server. Based on this set and the secure index tree sent by the data owner, it computes the similarity score of each document from the corresponding encrypted document index vector and the trapdoor sent by the authorized user, ranks the scores, and returns the top k ciphertext documents to the authorized user. Each leaf node of the secure index tree corresponds to one document index vector, and the value in each dimension of the document index vector stored in a leaf node is the TF value of the keyword corresponding to that dimension. Here k is the number of ciphertext documents required by the authorized user, and the TF value is the frequency with which a given keyword occurs in a document.
S7. The authorized user uses the key SK to decrypt the top k ciphertext documents returned by the public cloud server and obtains the corresponding plaintext documents.
2. The searchable encryption method based on keyword semantic ranking according to claim 1, characterized in that step S1 further comprises the following steps:
S1.1, the data owner extracts keywords from the plaintext document collection F = (f_1, f_2, …, f_m) to obtain the keyword set W = (w_1, w_2, …, w_n), where w_1, w_2, …, w_n are the individual extracted keywords;
S1.2, the data owner randomly generates an n-dimensional vector S; the value of each dimension of S is randomly generated and can only be 1 or 0;
S1.3, the data owner randomly generates two n × n invertible matrices M_1 and M_2; every entry of the two matrices is also randomly generated, and n is the number of keywords in the keyword set;
S1.4, the data owner randomly selects two keys sk_1 and sk_2; the key SK for encrypting the plaintext document collection F is a five-tuple, expressed as {S, M_1, M_2, sk_1, sk_2}.
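As an illustration of steps S1.2 to S1.4, the following Python sketch assembles a key of the form {S, M1, M2, sk1, sk2}. It is a minimal sketch under stated assumptions, not the patented implementation: the function names, the integer range of the matrix entries, the 16-byte length of `sk1` and `sk2`, and the seedable `random.Random` generator (which is not cryptographically secure) are hypothetical choices made for readability.

```python
import random
import secrets
from fractions import Fraction


def is_invertible(matrix):
    """Return True if the square matrix has full rank (exact Gaussian elimination)."""
    n = len(matrix)
    rows = [[Fraction(x) for x in row] for row in matrix]
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return False
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return True


def random_invertible_matrix(n, rng):
    """Draw random integer matrices until one is invertible (entry range is an assumption)."""
    while True:
        candidate = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
        if is_invertible(candidate):
            return candidate


def generate_key(n, seed=None):
    """Build SK = {S, M1, M2, sk1, sk2} for a keyword set of size n."""
    rng = random.Random(seed)  # illustrative only; not a cryptographic generator
    return {
        "S": [rng.randint(0, 1) for _ in range(n)],   # splitting indicator, entries 0 or 1
        "M1": random_invertible_matrix(n, rng),
        "M2": random_invertible_matrix(n, rng),
        "sk1": secrets.token_bytes(16),               # key length is an assumption
        "sk2": secrets.token_bytes(16),
    }
```

Calling `generate_key(4, seed=1)` would return a dictionary whose `S` has four 0/1 entries and whose `M1` and `M2` are 4 × 4 invertible matrices.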
3. The searchable encryption method based on keyword semantic ranking according to claim 2, characterized in that step S2 further comprises the following steps:
S2.1, constructing the document marking vector B: for each document f_i in the plaintext document collection F, first generate an n-dimensional document vector D = (D_1, D_2, …, D_i, …, D_n), where i ∈ {1, 2, …, n} and n is the number of keywords in the keyword set; the value D_i in the document vector D is set to the TF value of the corresponding keyword in the current plaintext document collection F, and if the plaintext document collection F does not contain that keyword, D_i is set to 0; each dimension of the document marking vector corresponds to one keyword in the keyword set; the document vector D is divided into u blocks, and if every entry of a block is 0 the mark value bb_i = 0, otherwise bb_i = 1, which yields the document marking vector B = (bb_1, bb_2, …, bb_i, …, bb_u), where i ∈ {1, 2, …, u};
S2.2, constructing the secure index tree I: each leaf node of the secure index tree I corresponds to an n-dimensional document index vector V = (V_1, V_2, …, V_i, …, V_n); one leaf node is generated for each document f_i in the plaintext document collection F, and since the collection F contains m documents, the secure index tree I has m leaf nodes; each leaf node stores the document index vector V of its document, in which V_i is 1 if the document f_i corresponding to the leaf node contains the keyword and 0 otherwise; each intermediate node v of the secure index tree I stores an n-dimensional document index vector V = (V_v[1], V_v[2], …, V_v[i], …, V_v[n]); if the document index vector stored in the left child or the right child of intermediate node v is not equal to 0, then V_v[i] = 1, otherwise V_v[i] = 0, where i is the position of keyword w_i in the keyword set W; V_v[i] = 1 indicates that there is at least one path from intermediate node v to some leaf node that contains keyword w_i;
S2.3, the data owner encrypts the document index vectors V in the leaf nodes of the secure index tree I: each V_i of the document index vector V = (V_1, V_2, …, V_i, …, V_n) in a leaf node is split into two random vectors {V_i', V_i''}; the splitting method is as follows: the n-dimensional vector S randomly generated by the data owner serves as the splitting indicator; if the j-th dimension of S is 0, V_i'[j] and V_i''[j] are both set equal to V_i[j]; if the j-th dimension of S is 1, V_i'[j] and V_i''[j] are set randomly such that their sum equals V_i[j]; V_i' and V_i'' are then encrypted with the key SK to obtain the encrypted document index vector, which is stored in the leaf node of the document index vector V, and the corresponding V_i is deleted; because V is split into two random vectors, encryption with the key SK produces one result for V_i' and one result for V_i'', where M_1 and M_2 are the two n × n invertible matrices randomly generated by the data owner;
S2.4, the data owner sends the secure index tree I to the public cloud server and the document marking vector B to the private cloud server.
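Steps S2.1 and S2.2 can be sketched in Python as follows. This is a hedged illustration only: the claim does not say how the u blocks are laid out or how the tree is shaped, so the sketch assumes equal-sized contiguous blocks (u divides n) and a binary tree built by pairing neighbouring nodes bottom-up. The function names and the dictionary node layout are hypothetical, and the vectors are shown in plaintext, before the encryption of step S2.3.

```python
def marking_vector(doc_vector, u):
    """Divide doc_vector into u equal blocks; a block is marked 1 if any entry is non-zero."""
    n = len(doc_vector)
    if u <= 0 or n % u != 0:
        raise ValueError("this sketch assumes that u divides n")
    size = n // u
    return [1 if any(x != 0 for x in doc_vector[b * size:(b + 1) * size]) else 0
            for b in range(u)]


def build_index_tree(leaf_vectors):
    """Pair nodes bottom-up; entry i of an internal node is 1 if either child has a non-zero entry i."""
    if not leaf_vectors:
        raise ValueError("at least one document is required")
    level = [{"vector": list(v), "left": None, "right": None, "doc_id": i}
             for i, v in enumerate(leaf_vectors)]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            left = level[i]
            right = level[i + 1] if i + 1 < len(level) else None  # odd node out has no sibling
            right_vec = right["vector"] if right else [0] * len(left["vector"])
            vector = [1 if (a != 0 or b != 0) else 0
                      for a, b in zip(left["vector"], right_vec)]
            parents.append({"vector": vector, "left": left, "right": right, "doc_id": None})
        level = parents
    return level[0]
```

For example, `marking_vector([0, 0, 0.5, 0, 0, 0], 3)` marks only the middle block, and the root of a tree built from three leaves carries a 1 in every position where at least one leaf is non-zero, so a search can skip any subtree whose node vector has 0 for every queried keyword.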
4. The searchable encryption method based on keyword semantic ranking according to claim 3, characterized in that step S4 further comprises the following steps:
S4.1, the authorized user inputs the set of keywords to be searched W' = (w'_1, w'_2, …, w'_n) and generates an n-dimensional query marking vector Q = (Q_1, Q_2, …, Q_i, …, Q_n) for the search request; each dimension of the query marking vector Q corresponds to one keyword in the keyword set, that is, Q_i corresponds to keyword w_i in W, where i ∈ {1, 2, …, n}; if keyword w_i is in the set of keywords to be searched W', Q_i is set to the IDF value of the corresponding keyword in the document collection, otherwise Q_i is set to 0; the IDF value is obtained by dividing the number of documents containing a given word by the total number of documents in the document collection;
S4.2, generating the trapdoor: the query marking vector Q is split into two random vectors {Q', Q''}; if the j-th dimension of S is 0, Q'_j and Q''_j are set randomly such that their sum equals Q_j; if the j-th dimension of S is 1, Q'_j and Q''_j are both set equal to Q_j; the query marking vector Q is encrypted with the key SK to generate the trapdoor T; since Q has been split into the two random vectors Q' and Q'', encrypting the query marking vector Q means encrypting the two split vectors separately with the key SK, where M_1 and M_2 are the two n × n invertible matrices randomly generated by the data owner; the authorized user sends the trapdoor T to the public cloud server and the query marking vector Q to the private cloud server.
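The construction of the query marking vector in step S4.1 and the splitting rule in step S4.2 can be sketched as below. This is a minimal sketch under assumptions: the IDF values are taken as precomputed input (the hypothetical `idf` dictionary), the random share is drawn uniformly from [-1, 1], and the function names are illustrative. The final matrix encryption with M1 and M2 is left out of this sketch.

```python
import random


def query_vector(keyword_set, searched, idf):
    """Q_i is the IDF value of keyword w_i if w_i is being searched, otherwise 0."""
    return [idf[w] if w in searched else 0.0 for w in keyword_set]


def split_query(q, s, rng):
    """Split Q into (Q', Q'') using the indicator S, following step S4.2."""
    q1, q2 = [], []
    for qj, sj in zip(q, s):
        if sj == 0:
            # S[j] == 0: random shares whose sum equals Q[j].
            share = rng.uniform(-1.0, 1.0)
            q1.append(share)
            q2.append(qj - share)
        else:
            # S[j] == 1: both halves carry the same value as Q[j].
            q1.append(qj)
            q2.append(qj)
    return q1, q2
```

With the keyword set `["cloud", "index", "key", "search"]` and a search for `{"index", "search"}`, only the second and fourth entries of Q are non-zero. Note that the index vectors in step S2.3 use the opposite rule (random shares where S[j] is 1), which is what lets the two splits cancel when inner products are taken.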
5. The searchable encryption method based on keyword semantic ranking according to claim 4, characterized in that step S5 further comprises the following steps:
S5.1, after receiving the query marking vector Q sent by the authorized user, the private cloud server takes each entry of the query marking vector Q in turn and checks whether the corresponding value in the document marking vector B sent by the data owner is 0; if it is 0, the keyword set provided by the data owner does not contain the keyword being searched; if it is 1, the corresponding document index vector is recorded;
S5.2, by recording the corresponding document index vectors, the private cloud server obtains the set of candidate index identifiers that may contain the keywords being searched, SID = {…, sid_i, …, sid_j, …, sid_z, …}, where i, j, z ∈ {1, 2, …, n}, and then sends the candidate index identifier set SID = {…, sid_i, …, sid_j, …, sid_z, …} to the public cloud server.
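The filtering performed by the private cloud server in steps S5.1 and S5.2 might look like the sketch below. The claim does not spell out how an n-dimensional query entry maps to a u-dimensional marking vector, so this sketch assumes equal-sized contiguous blocks and treats a document as a candidate when at least one queried keyword falls into a block marked 1. The function name and the dictionary keyed by document identifier are hypothetical.

```python
def candidate_ids(query, marking_vectors, u):
    """Return identifiers of documents whose marking vector has a 1 in a queried block."""
    n = len(query)
    if u <= 0 or n % u != 0:
        raise ValueError("this sketch assumes that u divides n")
    size = n // u
    # Blocks that contain at least one queried keyword (non-zero entry of Q).
    queried_blocks = {j // size for j, qj in enumerate(query) if qj != 0}
    return [doc_id for doc_id, b in sorted(marking_vectors.items())
            if any(b[block] == 1 for block in queried_blocks)]
```

With marking vectors `{"d1": [1, 0], "d2": [0, 1], "d3": [0, 0]}` and a query whose only non-zero entry lies in the first block, only `d1` survives, so the public cloud server would score one document instead of three.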
6. The searchable encryption method based on keyword semantic ranking according to claim 5, characterized in that step S6 further comprises:
According to the candidate index identifier set SID = {…, sid_i, …, sid_j, …, sid_z, …} sent by the private cloud server, the public cloud server finds the corresponding encrypted document index vectors sent by the data owner and, using them together with the trapdoor T sent by the authorized user, calculates the similarity score SC of each document; the document similarity score SC is calculated as follows:
where i ∈ (1, 2, …, n);
The public cloud server ranks the calculated document similarity scores and returns the top k ciphertext documents to the authorized user, where k is the number of ciphertext documents that meets the authorized user's needs.
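The score formula itself is not reproduced in this text, so the following Python sketch should be read as an assumption rather than as the claimed formula. It uses the inner-product-preserving construction commonly called secure kNN: the two index shares are multiplied by the transposes of M1 and M2, the two query shares by their inverses, and the score is the sum of the two inner products, which equals the plaintext inner product of the index vector and the query vector. All names, matrices, shares and document values are hypothetical; exact fractions are used so that the equality can be checked without rounding.

```python
from fractions import Fraction
import heapq


def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def inverse(m):
    """Exact Gauss-Jordan inverse; assumes m is square and invertible."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def split(vec, s, split_bit, shares):
    """Where s[j] == split_bit, split vec[j] into two shares that sum to it; else copy it."""
    first, second = [], []
    for value, bit, share in zip(vec, s, shares):
        if bit == split_bit:
            first.append(share)
            second.append(value - share)
        else:
            first.append(value)
            second.append(value)
    return first, second


def encrypt_index(v, key, shares):
    v1, v2 = split(v, key["S"], 1, shares)  # index vectors split where S[j] == 1
    return mat_vec(transpose(key["M1"]), v1), mat_vec(transpose(key["M2"]), v2)


def make_trapdoor(q, key, shares):
    q1, q2 = split(q, key["S"], 0, shares)  # query vectors split where S[j] == 0
    return mat_vec(inverse(key["M1"]), q1), mat_vec(inverse(key["M2"]), q2)


def score(enc_index, trapdoor):
    return dot(enc_index[0], trapdoor[0]) + dot(enc_index[1], trapdoor[1])


def top_k(scores, k):
    return [doc for doc, _ in heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])]


F = Fraction
key = {"S": [1, 0, 1],
       "M1": [[2, 1, 0], [0, 1, 1], [1, 0, 1]],
       "M2": [[1, 2, 0], [0, 1, 0], [3, 0, 1]]}
docs = {"d1": [F(1, 2), F(0), F(1, 4)],
        "d2": [F(0), F(3, 4), F(0)],
        "d3": [F(1, 4), F(1, 4), F(1, 2)]}
query = [F(1, 2), F(0), F(1, 3)]
index_shares = [F(1, 7), F(2, 7), F(3, 7)]    # fixed stand-ins for random shares
query_shares = [F(5, 3), F(-2, 3), F(1, 9)]
trapdoor = make_trapdoor(query, key, query_shares)
scores = {d: score(encrypt_index(v, key, index_shares), trapdoor) for d, v in docs.items()}
```

In this toy run the encrypted scores match the plaintext inner products exactly, and `top_k(scores, 2)` ranks `d1` ahead of `d3`. The server computing `score` never sees S, M1 or M2, which is the point of the construction.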
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810890114.6A CN109063509A (en) | 2018-08-07 | 2018-08-07 | A searchable encryption method based on keyword semantic ranking |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810890114.6A CN109063509A (en) | 2018-08-07 | 2018-08-07 | A searchable encryption method based on keyword semantic ranking |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109063509A (en) | 2018-12-21 |
Family
ID=64832170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810890114.6A Pending CN109063509A (en) | A searchable encryption method based on keyword semantic ranking | 2018-08-07 | 2018-08-07 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109063509A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951411A (en) * | 2017-03-24 | 2017-07-14 | 福州大学 | Fast multi-keyword semantic ranked search method protecting data privacy in cloud computing |
CN106997384A (en) * | 2017-03-24 | 2017-08-01 | 福州大学 | Semantic fuzzy searchable encryption method with verifiable ranking |
CN108171071A (en) * | 2017-12-01 | 2018-06-15 | 南京邮电大学 | Rankable multi-keyword ciphertext retrieval method for cloud computing |
Non-Patent Citations (1)
Title |
---|
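Yang Yang et al.: "Fast multi-keyword semantic ranked search scheme protecting data privacy in cloud computing", Chinese Journal of Computers (计算机学报) * |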
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110457574A (en) * | 2019-07-05 | 2019-11-15 | 深圳壹账通智能科技有限公司 | Information recommendation method, device and storage medium based on data comparison |
CN110851481A (en) * | 2019-11-08 | 2020-02-28 | 青岛大学 | Searchable encryption method, device, equipment and readable storage medium |
CN110851481B (en) * | 2019-11-08 | 2022-06-28 | 青岛大学 | Searchable encryption method, device and equipment and readable storage medium |
CN113094573A (en) * | 2020-01-09 | 2021-07-09 | 中移(上海)信息通信科技有限公司 | Multi-keyword sequencing searchable encryption method, device, equipment and storage medium |
CN111431705A (en) * | 2020-03-06 | 2020-07-17 | 电子科技大学 | Reverse password firewall method suitable for searchable encryption |
CN111431705B (en) * | 2020-03-06 | 2021-08-06 | 电子科技大学 | Reverse password firewall method suitable for searchable encryption |
CN113821704B (en) * | 2020-06-18 | 2024-01-16 | 华为云计算技术有限公司 | Method, device, electronic equipment and storage medium for constructing index |
CN113821704A (en) * | 2020-06-18 | 2021-12-21 | 华为技术有限公司 | Method and device for constructing index, electronic equipment and storage medium |
CN111859421A (en) * | 2020-07-08 | 2020-10-30 | 中国软件与技术服务股份有限公司 | Multi-keyword ciphertext storage and retrieval method and system based on word vector |
CN111756777B (en) * | 2020-08-28 | 2020-11-17 | 腾讯科技(深圳)有限公司 | Data transmission method, data processing device, data processing apparatus, and computer storage medium |
CN111756777A (en) * | 2020-08-28 | 2020-10-09 | 腾讯科技(深圳)有限公司 | Data transmission method, data processing device, data processing apparatus, and computer storage medium |
CN112257455A (en) * | 2020-10-21 | 2021-01-22 | 西安电子科技大学 | Semantic-understanding ciphertext space keyword retrieval method and system |
CN112257455B (en) * | 2020-10-21 | 2024-04-30 | 西安电子科技大学 | Semantic understanding ciphertext space keyword retrieval method and system |
CN112328626A (en) * | 2020-10-28 | 2021-02-05 | 浙江工商大学 | Searchable encryption method facing cloud environment and supporting fuzzy keyword sequencing |
CN112328626B (en) * | 2020-10-28 | 2022-09-30 | 浙江工商大学 | Searchable encryption method facing cloud environment and supporting fuzzy keyword sequencing |
CN112272188A (en) * | 2020-11-02 | 2021-01-26 | 重庆邮电大学 | Searchable encryption method for protecting data privacy of e-commerce platform |
CN112272188B (en) * | 2020-11-02 | 2022-03-11 | 重庆邮电大学 | Searchable encryption method for protecting data privacy of e-commerce platform |
CN113779597A (en) * | 2021-08-19 | 2021-12-10 | 深圳技术大学 | Method, device, equipment and medium for storing and similar retrieving of encrypted document |
CN113779597B (en) * | 2021-08-19 | 2023-08-18 | 深圳技术大学 | Method, device, equipment and medium for storing and similar searching of encrypted document |
CN114398650A (en) * | 2021-12-16 | 2022-04-26 | 西安电子科技大学 | Searchable encryption system and method supporting multi-keyword subset retrieval |
CN114417109A (en) * | 2021-12-29 | 2022-04-29 | 电子科技大学广东电子信息工程研究院 | Ciphertext searching method, device and system based on security gateway |
CN114417109B (en) * | 2021-12-29 | 2024-05-17 | 电子科技大学广东电子信息工程研究院 | Ciphertext searching method, device and system based on security gateway |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109063509A (en) | A searchable encryption method based on keyword semantic ranking | |
Chen et al. | An efficient privacy-preserving ranked keyword search method | |
CN106951411B (en) | Fast multi-keyword semantic ranked search method protecting data privacy in cloud computing | |
Fu et al. | Enabling central keyword-based semantic extension search over encrypted outsourced data | |
Cao et al. | Privacy-preserving multi-keyword ranked search over encrypted cloud data | |
Wang et al. | Achieving usable and privacy-assured similarity search over outsourced cloud data | |
CN106997384B (en) | Semantic fuzzy searchable encryption method capable of verifying sequencing | |
Sun et al. | Privacy-preserving multi-keyword text search in the cloud supporting similarity-based ranking | |
Murugesan et al. | Providing privacy through plausibly deniable search | |
Guo et al. | Secure multi-keyword ranked search over encrypted cloud data for multiple data owners | |
Wang et al. | Privacy-preserving ranked multi-keyword fuzzy search on cloud encrypted data supporting range query | |
CN109739945B (en) | Multi-keyword ciphertext sorting and searching method based on mixed index | |
Li et al. | Enabling efficient fuzzy keyword search over encrypted data in cloud computing | |
Yu et al. | Privacy-preserving multikeyword similarity search over outsourced cloud data | |
Boucenna et al. | Secure inverted index based search over encrypted cloud data with user access rights management | |
Yang et al. | Cloud information retrieval: Model description and scheme design | |
CN108549701A (en) | Semantic extension search method and system for encrypted outsourced data in a cloud environment | |
CN115495792B (en) | Fuzzy keyword searchable encryption method and system with privacy protection function | |
Jivane | Time efficient privacy-preserving multi-keyword ranked search over encrypted cloud data | |
Raghavendra et al. | Split keyword fuzzy and synonym search over encrypted cloud data | |
Wang et al. | Fault-tolerant Verifiable Keyword Symmetric Searchable Encryption in Hybrid Cloud. | |
CN111966778B (en) | Multi-keyword ciphertext sorting and searching method based on keyword grouping reverse index | |
Manasrah et al. | A privacy-preserving multi-keyword search approach in cloud computing | |
Nepolean et al. | Privacy preserving ranked keyword search over encrypted cloud data | |
Li et al. | Diverse multi-keyword ranked search over encrypted cloud data supporting range query |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181221 |