CN108710698B - Multi-keyword fuzzy query method based on ciphertext under cloud environment - Google Patents
- Publication number
- CN108710698B (application CN201810501660.6A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- index
- vector
- request
- query
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Storage Device Security (AREA)
Abstract
The invention provides a ciphertext-based multi-keyword fuzzy query method in a cloud environment. By constructing vectors and matrices and introducing wildcards, the method solves the multi-keyword fuzzy query problem, removes the need for a predefined dictionary, and supports efficient updating and deletion of file indexes. It also handles multiple keywords in a single round of operations, which reduces the number of operations. Moreover, the method achieves higher precision and recall and provides flexible, semantically rich AND/OR queries, that is, it supports both logical AND queries and logical OR queries among the keywords. In addition, because the matrices in the scheme are encrypted with the kNN algorithm, the scheme provided by the invention has good security.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a ciphertext-based multi-keyword fuzzy query method in a cloud environment.
Background
The advent and development of cloud computing have realized the long-held vision of computing as infrastructure. Cloud computing is an Internet-based computing model in which shared software and hardware resources and information are provided on demand to computers and other devices on a network. However, while it effectively relieves the constraint of limited user resources, cloud computing also introduces new security problems that cannot be ignored.
To protect sensitive private data stored on a cloud server from being leaked, a user must encrypt the data before storing it on the cloud platform. Existing keyword-based searchable ciphertext fuzzy query techniques tolerate minor errors and inconsistent forms in user input, which greatly improves system usability and the user's search experience.
However, the existing solutions have the following disadvantages: (1) updating and deletion are inflexible: existing schemes need a predefined dictionary containing the possible erroneous forms of each keyword, so querying and updating are inefficient; (2) operation efficiency is low: existing methods that support multi-keyword queries need multiple rounds of operations to obtain the result; (3) flexible AND/OR queries are not supported; (4) existing schemes cannot resist chosen-plaintext attacks, so their security is low.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a ciphertext-based multi-keyword fuzzy query method in a cloud environment. The method introduces wildcards to solve the multi-keyword fuzzy query problem, removes the need for a predefined dictionary, and supports efficient updating and deletion of file indexes. It also handles multiple keywords in a single round of operations, which reduces the number of operations. Moreover, the method achieves higher precision and recall and provides flexible AND/OR queries, that is, it supports both logical AND queries and logical OR queries among the keywords.
The invention provides a ciphertext-based multi-keyword fuzzy query method in a cloud environment, which comprises the following steps:
step S1: the local server encrypts a plaintext data set according to a preset data encryption algorithm to obtain a ciphertext data set;
step S2: the local server constructs the initial index key words corresponding to the plaintext data set into index vectors, constructs the index vectors into an index matrix, encrypts the index matrix by adopting a preset matrix encryption algorithm to obtain an encrypted index matrix, and sends the ciphertext data set and the encrypted index matrix to the cloud server;
step S3: constructing query keywords in a query request into a request vector by a user, constructing the request vector into a request matrix, encrypting the request matrix by adopting a preset matrix encryption algorithm to obtain an encrypted request matrix, and sending the encrypted request matrix to a cloud server;
step S4: the cloud server calculates the product of the encryption index matrix and the encryption request matrix according to the encryption request matrix, determines a target data ciphertext according to the product result, and sends the target data ciphertext back to the user;
step S5: the user decrypts the target data ciphertext through the data encryption algorithm of the step S1 to obtain a target data plaintext;
each initial index key corresponds to an index vector;
each row element of the index matrix corresponds to an index vector;
the query keyword in the query request comprises a wildcard character;
the query keywords in the query request are connected through a logical operator;
the logical operators include "AND" and "OR".
In the method provided by the invention, the initial index keywords are constructed into index vectors, the index vectors are assembled into an index matrix, and the index matrix is encrypted to obtain an encrypted index matrix; the query keywords are constructed into request vectors, the request vectors are assembled into a request matrix, and the request matrix is encrypted to obtain an encrypted request matrix; the target data ciphertext is then determined by calculating the product of the encrypted index matrix and the encrypted request matrix. With the vectors and matrices constructed in this way, the data ciphertext corresponding to an index vector is a target data ciphertext only when the product of that index vector and a request vector is an integer. Because the query keywords in the scheme may contain wildcards, and the effect of each wildcard is taken into account when the request vector is constructed, the fuzzy query function can be realized. A wildcard can stand for any uncertain English letter; for example, "unsure" and "insure" are both correct query results when the query keyword is "*sure". Because the query keywords in a query request are related by a logical operation (namely logical AND or logical OR), the invention can also realize logical queries. In the scheme of the invention, keywords are of two types: exact keywords and fuzzy keywords. An initial index keyword can only be an exact keyword, whereas a query keyword can be either an exact keyword or a fuzzy keyword containing wildcards.
In the prior art, given query keywords W1, W2, and W3, it is necessary to first search for the file set D(W1) containing W1, then search within D(W1) for the file set D(W1 ∩ W2) that also contains W2, and finally search within D(W1 ∩ W2) for the file set D(W1 ∩ W2 ∩ W3) that also contains W3. The file set containing W1, W2, and W3 is thus obtained through three rounds of operations, which is equivalent to executing the query algorithm three times. The scheme of the invention supports logical operation relationships and can realize logical queries, so the query algorithm needs to be executed only once to obtain the target file set D(W1 ∩ W2 ∩ W3).
Further, the preset matrix encryption algorithm is a kNN algorithm;
the key of the preset matrix encryption algorithm is Sk, and the key Sk at least comprises: a prime number sequence table P, a completion alphabet S and a reversible encryption matrix M;
the prime number sequence table $P = \{P_1, \ldots, P_i, \ldots, P_L\}$, where each element $P_i$ is a randomly generated prime number;
the completion alphabet $S = \{S_1, \ldots, S_i, \ldots, S_L\}$, where each element $S_i$ is an English letter randomly selected from the dictionary alphabet of the 26 English letters;
the reversible encryption matrix $M = \{M_1, M_2, s\}$, where $M_1$ and $M_2$ are both invertible matrices of size $d \times d$, the corresponding inverse matrices are denoted $M_1^{-1}$ and $M_2^{-1}$, and $s$ is a one-dimensional matrix with $d$ columns whose elements take the value 0 or 1 and are generated by a pseudo-random number function;
$d \geq 26$, $L$ is greater than or equal to the number of letters in the longest initial index keyword, and $d$ and $L$ are positive integers.
The existing publication "Secure kNN Computation on Encrypted Databases" (W. K. Wong, D. W.-L. Cheung, B. Kao, and N. Mamoulis, Proc. of SIGMOD, 2009) discloses a method of using the kNN algorithm for encryption processing.
The prime number sequence table is formed by arranging randomly generated prime numbers in ascending order. The value of L is determined by the longest keyword in the initial index and is greater than or equal to the total number of letters in that keyword. For correctness of the scheme, d should be greater than or equal to 26; for security, d is preferably greater than or equal to 128. The English letters in the dictionary alphabet are not case-sensitive. The letters in the completion alphabet are in random order, which further increases the security of the scheme.
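The key components described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the prime range, and the use of Python's `random` module (which is not cryptographically secure) are assumptions, and the invertible matrices $M_1$ and $M_2$ are left out for brevity.

```python
import random
import string

def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small toy primes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def generate_key(L: int, d: int, seed: int = 0):
    """Return (P, S, s): prime table, completion alphabet, split vector."""
    assert d >= 26 and L >= 1
    rng = random.Random(seed)  # assumption: stand-in for a secure generator
    primes = set()
    while len(primes) < L:     # L distinct random primes
        candidate = rng.randrange(2, 10_000)
        if is_prime(candidate):
            primes.add(candidate)
    P = sorted(primes)         # arranged in ascending order
    S = [rng.choice(string.ascii_lowercase) for _ in range(L)]
    s = [rng.randrange(2) for _ in range(d)]  # d bits, each 0 or 1
    return P, S, s

P, S, s = generate_key(L=5, d=26)
```

Distinct primes are used here because the later integer test relies on each position of the keyword having its own prime factor.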
Further, the d-dimensional index vector $P_{i,k}$ corresponding to each initial index keyword $W_{i,k}$ is constructed according to the following steps:
Step S201: pad the initial index keyword $W_{i,k}$ to the uniform length L;
determine the number of letters $\|W_{i,k}\|$ in the initial index keyword $W_{i,k}$: when $\|W_{i,k}\| < L$, sequentially select $L - \|W_{i,k}\|$ letters from the completion alphabet S and append them to $W_{i,k}$ so that $\|W_{i,k}\| = L$; when $\|W_{i,k}\| \geq L$, directly execute step S202;
Step S202: initialize a d-dimensional index vector $P_{i,k}$ and set all of its elements to 1, so that $P_{i,k}[l] = 1$, $l \in [1, d]$, where l is an integer;
where $P_{i,k}[l]$ denotes the element at the l-th position of the index vector $P_{i,k}$;
Step S203: select the fill positions in the index vector $P_{i,k}$ for the letters of the initial index keyword $W_{i,k}$;
in turn, the $pos_l$-th position of the index vector $P_{i,k}$ is determined as the fill position of the l-th letter of the initial index keyword $W_{i,k}$;
where $pos_l = F_{kf}(W_{i,k}[l])$, $F_{kf}()$ is a pseudo-random number function, and $W_{i,k}[l]$ denotes the l-th letter of the initial index keyword $W_{i,k}$, $l \in [1, L]$;
Step S204: update the value $P_{i,k}[pos_l]$ at the $pos_l$-th position of the index vector $P_{i,k}$;
where $P_{i,k}[pos_l] = P_{i,k}[pos_l] / P_l$;
the initial value of $P_{i,k}[pos_l]$ is the value set in step S202, $P_l$ is the l-th prime number in the prime number sequence table P, and $l \in [1, L]$;
Step S205: randomly select a natural number $\alpha$ and replace every element of the index vector $P_{i,k}$ that still has the initial value 1 with $\alpha$, obtaining the final index vector $P_{i,k}$.
When constructing the index vector, the first step is to pad all the initial index keywords to a uniform length; an initial index keyword that is too short is padded by selecting letters from the completion alphabet and appending them directly after its original letters. The general idea of the vector construction process is "choose the position first, then calculate the value": step S203 chooses the position in the index vector for each letter of the initial index keyword, and step S204 determines the value filled in at that position. Step S205 improves the security of the scheme: even if two data items in the plaintext data set contain the same initial index keyword, the index vectors of that keyword for the two items are different, a property technically referred to as "index indistinguishability".
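Steps S201 to S205 can be illustrated with a short sketch using exact fractions. The toy prime table, the completion alphabet, and the seeded shuffle standing in for $F_{kf}$ are all assumptions made for illustration; positions are 0-based, and padding is assumed to take the first letters of the completion alphabet.

```python
import random
import string
from fractions import Fraction

D = 26                      # vector dimension d (the text requires d >= 26)
PRIMES = [2, 3, 5, 7, 11]   # toy prime table P, so L = 5
COMPLETION = "abxyz"        # toy completion alphabet S (hypothetical values)
L = len(PRIMES)

# Stand-in for F_kf: a keyed, injective map from letters to positions 0..D-1.
POS = dict(zip(string.ascii_lowercase,
               random.Random("kf-demo").sample(range(D), 26)))

def pad(word: str) -> str:
    """S201: append completion letters until the keyword has L letters."""
    word = word.lower()
    return word + COMPLETION[: L - len(word)]

def index_vector(word: str, rng: random.Random) -> list:
    vec = [Fraction(1)] * D                  # S202: all ones
    for l, letter in enumerate(pad(word)):   # S203: choose the position
        vec[POS[letter]] /= PRIMES[l]        # S204: divide by the l-th prime
    alpha = rng.randrange(2, 1000)           # S205: random natural number
    return [Fraction(alpha) if v == 1 else v for v in vec]

v = index_vector("hello", random.Random(1))
```

Because the letter "l" occupies the third and fourth positions of "hello", its slot accumulates both primes and ends up holding 1/(5 × 7).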
Further, the set of index vectors $P_{i,k}$ is constructed into an index matrix $A_i$ according to the following steps, and the index matrix $A_i$ is encrypted by the kNN algorithm:
Step S211: place each index vector $P_{i,k}$ of the index vector set, in turn, in the k-th row of the index matrix $A_i$, obtaining an index matrix $A_i$ of size $m_i \times d$, $k \in [1, m_i]$;
Step S212: generate two matrices $A_i'$ and $A_i''$ of size $m_i \times d$ from the index matrix $A_i$ according to the following rule:
for each row $A_i[k][*]$ of the index matrix $A_i$: if $s[j] = 0$, then $A_i'[k][j] = A_i''[k][j] = A_i[k][j]$; if $s[j] = 1$, then $A_i'[k][j] + A_i''[k][j] = A_i[k][j]$;
where $s[j]$ denotes the j-th bit of the one-dimensional matrix $s$ in the key Sk, $A_i[k][j]$ denotes the element in row k, column j of the index matrix $A_i$, $A_i'[k][j]$ denotes the element in row k, column j of the matrix $A_i'$, $A_i''[k][j]$ denotes the element in row k, column j of the matrix $A_i''$, and $j \in [1, d]$;
Step S213: multiply the matrices $A_i'$ and $A_i''$ by $M_1$ and $M_2$, respectively, to obtain the encrypted index matrix $I_i$;
where $I_i = (I_i', I_i'') = (A_i' \cdot M_1, A_i'' \cdot M_2)$, and "$\cdot$" denotes the multiplication of two matrices.
According to the scheme of the invention, the index vector is constructed into the index matrix, and then the index matrix is encrypted by adopting a kNN algorithm to obtain the encrypted index matrix. Therefore, the index contents transmitted in the data interaction process in the scheme are all encrypted contents, and the safety of the scheme is further improved.
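The splitting and multiplication of steps S212 and S213 can be sketched as below. The helper names and the tiny 3 × 3 matrices are hypothetical and chosen only to keep the example readable; the scheme itself requires d ≥ 26 and randomly generated invertible matrices.

```python
import random
from fractions import Fraction

def split_index(A, s, rng):
    """S212: split A into (A1, A2) under the bit vector s."""
    A1, A2 = [], []
    for row in A:
        r1, r2 = [], []
        for j, value in enumerate(row):
            if s[j] == 0:              # both copies keep the value
                r1.append(value)
                r2.append(value)
            else:                      # two random shares that sum to it
                share = Fraction(rng.randrange(-50, 50))
                r1.append(share)
                r2.append(value - share)
        A1.append(r1)
        A2.append(r2)
    return A1, A2

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def encrypt_index(A, s, M1, M2, rng):
    """S213: I = (A1 * M1, A2 * M2)."""
    A1, A2 = split_index(A, s, rng)
    return matmul(A1, M1), matmul(A2, M2)

A = [[Fraction(1, 2), Fraction(7), Fraction(1, 3)]]   # one toy index vector
s = [0, 1, 0]
M1 = [[1, 2, 0], [0, 1, 0], [0, 0, 1]]                # invertible (det = 1)
M2 = [[1, 0, 0], [3, 1, 0], [0, 0, 1]]                # invertible (det = 1)
I1, I2 = encrypt_index(A, s, M1, M2, random.Random(0))
```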
Further, the d-dimensional request vector $q_{j,k}$ corresponding to each query keyword is constructed according to the following steps:
Step S301: pad the query keyword to the uniform length L;
determine the number of letters in the query keyword: when it is less than L, sequentially select the missing number of letters from the completion alphabet S and append them after the original letters of the query keyword so that its length equals L; when it is not less than L, directly execute step S302;
Step S302: initialize a d-dimensional request vector $q_{j,k}$ and set all of its elements to 1, so that $q_{j,k}[l] = 1$, $l \in [1, d]$, where l is an integer;
where $q_{j,k}[l]$ denotes the element at the l-th position of the request vector $q_{j,k}$;
Step S303: select the fill positions in the request vector $q_{j,k}$ for the letters of the query keyword, and update the value $q_{j,k}[pos_l]$ at the $pos_l$-th position of the request vector $q_{j,k}$;
when the l-th character of the query keyword is a letter, the $pos_l$-th position of the request vector $q_{j,k}$ is determined as the fill position of that letter, and $q_{j,k}[pos_l] = q_{j,k}[pos_l] \times P_l$; when the l-th character of the query keyword is the wildcard "*", the values at all positions of the request vector $q_{j,k}$ are multiplied by $P_l$; the final request vector $q_{j,k}$ is thereby obtained;
where $pos_l$ is obtained by applying $F_{kf}()$ to the l-th letter of the query keyword, $F_{kf}()$ is a pseudo-random number function, $l \in [1, L]$, the initial value of $q_{j,k}[pos_l]$ is the value set in step S302, and $P_l$ is the l-th prime number in the prime number sequence table P.
Because the invention targets query keywords containing the wildcard "*", the key to the scheme is how to construct a query keyword containing the wildcard "*" into a request vector. The first step, as when constructing the index vector, is to pad all keywords to a uniform length. The second step distinguishes the letters and the wildcard "*" in the keyword and processes them accordingly: letters are handled with the same "choose the position first, then calculate the value" idea used for the index vector, whereas a wildcard "*" is handled by multiplying the values at all positions of the request vector by a prime number. In this way, the effect of the wildcard is reflected in the request vector, which realizes the multi-keyword fuzzy query function.
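The treatment of letters and wildcards in steps S301 to S303 can be sketched as follows. As before, the toy prime table, the completion alphabet, and the seeded shuffle standing in for $F_{kf}$ are assumptions made for illustration only.

```python
import random
import string

D = 26                      # vector dimension d
PRIMES = [2, 3, 5, 7, 11]   # toy prime table P, so L = 5
COMPLETION = "abxyz"        # toy completion alphabet S (hypothetical values)
L = len(PRIMES)

# Stand-in for F_kf: a keyed, injective map from letters to positions 0..D-1.
POS = dict(zip(string.ascii_lowercase,
               random.Random("kf-demo").sample(range(D), 26)))

def pad(word: str) -> str:
    """S301: append completion letters until the keyword has L characters."""
    word = word.lower()
    return word + COMPLETION[: L - len(word)]

def request_vector(pattern: str) -> list:
    vec = [1] * D                               # S302: all ones
    for l, ch in enumerate(pad(pattern)):       # S303
        if ch == "*":
            # Wildcard: the l-th prime goes into every position.
            vec = [v * PRIMES[l] for v in vec]
        else:
            # Letter: the l-th prime goes only into the letter's position.
            vec[POS[ch]] *= PRIMES[l]
    return vec

q = request_vector("he*lo")
```

For "he*lo" the wildcard sits in the third position, so every entry carries the factor 5; the slot for "h" holds 2 × 5 and the slot for "l" holds 7 × 5.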
Further, the request vector $q_{j,k}$ is constructed into a request matrix $Q_{j,k}$ according to the following steps, and the request matrix set formed by the request matrices $Q_{j,k}$ is encrypted by the kNN algorithm:
Step S311: select a random number e, where $e \in [2, d]$ and e is an integer;
Step S312: in turn, place $q_{j,k}[l]$ in $Q_{j,k}[l][num]$ and fill in the elements of $Q_{j,k}[l][*]$ other than $Q_{j,k}[l][num]$, so that each request vector $q_{j,k}$ is expanded into a request matrix $Q_{j,k}$ of size $d \times e$;
where $q_{j,k}[l]$ denotes the element at the l-th position of the request vector $q_{j,k}$, $Q_{j,k}[l][num]$ denotes the element in row l, column num of the matrix $Q_{j,k}$, num is a random number with $num \in [1, e]$, l is an integer with $l \in [1, d]$, and the sum of the elements of $Q_{j,k}[l][*]$ other than $Q_{j,k}[l][num]$ is $\gamma_l$, with $\gamma_l = t_l \times Q_{j,k}[l][num]$;
where $t_l = 0$, or $t_l + 1$ is a prime number and …; $Q_{j,k}[l][*]$ denotes all elements of row l of the request matrix $Q_{j,k}$;
Step S314: place the rows of each request matrix $Q_{j,k}$ of the request matrix set, in turn, into an intermediate matrix, thereby obtaining an intermediate matrix of size $d \times (n_j \times e)$;
where k is an integer and $k \in [1, n_j]$;
Step S315: generate two matrices of size $d \times (n_j \times e)$ from the intermediate matrix according to the following rule:
for each column of the intermediate matrix: if $s[i] = 0$, the elements in row i, column k of the two generated matrices sum to the element in row i, column k of the intermediate matrix; if $s[i] = 1$, both elements are equal to the element in row i, column k of the intermediate matrix;
where $s[i]$ denotes the i-th bit of the one-dimensional matrix $s$ in the key Sk, the elements referred to are those in row i, column k of the intermediate matrix and of the two generated matrices, and $i \in [1, d]$;
Step S316: multiply the inverse matrices $M_1^{-1}$ and $M_2^{-1}$ corresponding to $M_1$ and $M_2$ by the two generated matrices, respectively, to obtain the encrypted request matrix TQ;
In the scheme of the invention, the request vector is expanded into the request matrix mainly due to the consideration of scheme safety. And then, the request matrix is encrypted by adopting a kNN algorithm, so that the safety of the scheme is further improved.
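The expansion of a request vector into a $d \times e$ matrix (step S312) can be sketched as below. Each row hides $q[l]$ in a random column and fills the rest so that the row sums to $(1 + t_l) \times q[l]$. The extra condition on $t_l$ is not recoverable from the text, so this sketch assumes that $t_l + 1$ is drawn from primes outside the prime table P; the parameter `blind_primes` and the filler range are hypothetical.

```python
import random

def expand_request_vector(q, e, blind_primes, rng):
    """Expand a d-dimensional request vector into a d x e matrix.

    Row l holds q[l] in a random column; the other e - 1 entries are
    integers whose sum is t_l * q[l], so the row sums to (1 + t_l) * q[l].
    """
    assert e >= 2
    Q = []
    for value in q:
        # t_l is 0, or t_l + 1 is one of the assumed blinding primes.
        t = rng.choice([0] + [p - 1 for p in blind_primes])
        fillers = [rng.randrange(-100, 100) for _ in range(e - 2)]
        fillers.append(t * value - sum(fillers))   # force the filler sum
        num = rng.randrange(e)                     # random column for q[l]
        Q.append(fillers[:num] + [value] + fillers[num:])
    return Q

q = [10, 15, 35, 5, 5]      # a toy 5-dimensional request vector
Q = expand_request_vector(q, e=3, blind_primes=[13, 17], rng=random.Random(0))
```

Summing the e entries of a row recovers $q[l]$ up to the factor $1 + t_l$, which is why the cloud server later sums every e elements of the product.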
Further, in step S4 the product of the encrypted index matrix and the encrypted request matrix is calculated, and the target data ciphertext is determined from the product result, as follows:
where the product matrix is the product of the encrypted index matrix and the encrypted request matrix and has size $m_i \times (n_j \times e)$;
Step S402: sum every e elements in each row of the product matrix to obtain a result matrix of size $m_i \times n_j$;
for each row $k_1$ of the product matrix: whenever $k_2 \% e = 0$, calculate the sum of the e elements ending at column $k_2$, and take the sums, in order, as the elements of the $k_1$-th row of the result matrix, obtaining a result matrix of size $m_i \times n_j$;
where $k_1$ is an integer with $k_1 \in [1, m_i]$, and $k_2$ is an integer with $k_2 \in [1, n_j \times e]$;
Step S403: according to the result matrix, judge whether the data ciphertext corresponding to the encrypted index matrix $I_i$ is a target data ciphertext:
when the query request contains the logical operator "AND", if at least one element in each row of the result matrix corresponding to the encrypted index matrix is an integer, the data ciphertext corresponding to the encrypted index matrix is a target data ciphertext;
when the query request contains the logical operator "OR", if at least one row of the result matrix corresponding to the encrypted index matrix contains an element that is an integer, the data ciphertext corresponding to the encrypted index matrix is a target data ciphertext.
In the scheme of the invention, the cloud server does not need to know the specific contents of the data plaintext, the initial index, or the query keywords; it can judge whether a data ciphertext is a target data ciphertext merely by calculating the product of the encrypted index matrix and the encrypted request matrix, so there is no risk of disclosure and the whole process has good security. Meanwhile, with the matrices constructed by this method, the result of the logical "AND" or "OR" query expressed by the query keywords can be obtained from the multiplication result, which makes querying more convenient and flexible.
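The grouping of step S402 and the integer test can be sketched as follows. The function names are hypothetical, the product matrix is a hand-made toy rather than a real product of encrypted matrices, and only the "OR" rule of step S403 is shown.

```python
from fractions import Fraction

def group_sums(R, e):
    """S402: sum every e consecutive elements in each row of R."""
    return [[sum(row[c:c + e]) for c in range(0, len(row), e)] for row in R]

def matches_or(result):
    """OR rule of S403: some element of the result matrix is an integer."""
    return any(Fraction(x).denominator == 1 for row in result for x in row)

# Toy product matrix with m_i = 2 index keywords, n_j = 2 query keywords, e = 2.
R = [[Fraction(1, 2), Fraction(1, 2), Fraction(1, 3), 1],
     [1, 2, Fraction(1, 7), 0]]
result = group_sums(R, e=2)
```

Exact rational arithmetic is used here because an integer test on floating-point values would be unreliable.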
In the scheme of the invention, the principle of judging the search result by whether the elements of the result matrix are integers is as follows:
assuming that the index vector is P, the request vector is q, and the dimensions are d, the dot product of the vectors is represented as:
P·q = P[1]×q[1] + P[2]×q[2] + ... + P[i]×q[i] + ... + P[d]×q[d]
where P [ i ] and q [ i ] represent the value of the ith bit of the corresponding vector.
According to the scheme of the invention, the value at each position of the index vector P is calculated by the formula $P_{i,k}[pos_l] = P_{i,k}[pos_l] / P_l$, so it is either a fraction (such as $\frac{1}{P_1 \times P_2}$, where $P_1$ and $P_2$ are prime numbers; the numerator must be 1 because the initial value at each position of the vector is 1) or a random integer (the random natural number determined in step S205). The value at each position of the request vector q is calculated by the formula $q_{j,k}[pos_l] = q_{j,k}[pos_l] \times P_l$, so it can only be an integer (for example $P_1 \times P_2 \times P_3$, where $P_1$, $P_2$, and $P_3$ are all prime numbers). Thus, if $P[i] \times q[i]$ is an integer, either an integer is multiplied by an integer, or the values at the corresponding position of the two vectors cancel. For example, $\frac{1}{P_1 \times P_2} \times (P_1 \times P_2 \times P_3) = P_3$ is an integer: the numerator $P_1 \times P_2 \times P_3$ decomposes into the prime factor set $\{P_1, P_2, P_3\}$, which contains the prime factor set $\{P_1, P_2\}$ of the denominator $P_1 \times P_2$, that is, $\{P_1, P_2\} \subseteq \{P_1, P_2, P_3\}$. Moreover, the prime factorization of an integer is unique.
According to the formula $pos_l = F_{kf}(W_{i,k}[l])$ and the corresponding formula for the query keyword, suppose that the l-th letter of the initial index keyword $W_{i,k}$ is $W_{i,k}[l]$ and consider the l-th letter of the query keyword. If the two letters are identical, the positions $pos_l$ to which they are mapped in the vector $P_{i,k}$ and the vector $q_{j,k}$ are the same; if the two letters are different, the positions $pos_l$ to which they are mapped in the vector $P_{i,k}$ and the vector $q_{j,k}$ are different. For example, the third letter and the fourth letter "l" of the keyword "hello" map to the same position in the vector, whereas the first letter "h" and the third letter "l" map to different positions. It can be concluded that, for both the vector $P_{i,k}$ and the vector $q_{j,k}$, under the same pseudo-random function $F_{kf}()$ the same letter is always mapped to the same position in the vector, and different letters are mapped to different positions.
In the scheme of the invention, the following two cases may occur at the l-th position of the initial index keyword $W_{i,k}$ and the query keyword:
(1) The l-th letters of the initial index keyword $W_{i,k}$ and the query keyword are the same (including the case in which the l-th character of the query keyword is the wildcard "*", since the wildcard "*" may stand for any of the 26 letters): the denominator of the value at the $pos_l$-th position of the vector $P_{i,k}$ must contain the prime factor $P_l$, the value at the $pos_l$-th position of the vector $q_{j,k}$ must contain the prime factor $P_l$, and the $pos_l$-th position of $P_{i,k}$ and the $pos_l$-th position of $q_{j,k}$ have the same $pos_l$ value.
(2) The l-th letters of the initial index keyword $W_{i,k}$ and the query keyword are different (in which case the l-th character of the query keyword cannot be a wildcard): the denominator of the value at the $pos_l$-th position of the vector $P_{i,k}$ must contain the prime factor $P_l$, and the value at the $pos_l$-th position of the vector $q_{j,k}$ must contain the prime factor $P_l$, but the two $pos_l$ values are not the same. Let $v_l = F_{kf}(W_{i,k}[l])$ be the position for the index letter; the query letter is mapped to a different position. Then the denominator of the value at the $v_l$-th position of the vector $P_{i,k}$ contains the prime factor $P_l$, whereas the value at the $v_l$-th position of the vector $q_{j,k}$ does not contain $P_l$, because $P_l$ appears at the other position of $q_{j,k}$. The denominator in $P_{i,k}$ therefore cannot be cancelled by the value at the corresponding position of $q_{j,k}$, and the product $P[pos_l] \times q[pos_l]$ must be a fraction.
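The two cases can be checked with a small worked example. It assumes the toy values $d = 26$, $L = 5$, prime table $P = \{2, 3, 5, 7, 11\}$, and an index keyword "hello" that needs no padding; $\alpha$ is the random natural number of step S205, which fills the 22 positions not used by "h", "e", "l", and "o".

```latex
% Matching query "he*lo": the wildcard in position 3 multiplies every entry by P_3 = 5.
\begin{aligned}
P_{hello} \cdot q_{he*lo}
  &= \underbrace{\tfrac{1}{2}(2 \cdot 5)}_{h}
   + \underbrace{\tfrac{1}{3}(3 \cdot 5)}_{e}
   + \underbrace{\tfrac{1}{5 \cdot 7}(7 \cdot 5)}_{l}
   + \underbrace{\tfrac{1}{11}(11 \cdot 5)}_{o}
   + 22 \cdot (\alpha \cdot 5) \\
  &= 5 + 5 + 1 + 5 + 110\alpha = 16 + 110\alpha \in \mathbb{Z}.
\end{aligned}

% Non-matching query "help*": the slot for "l" holds 5 * 11 in q but 1/(5 * 7) in P.
\begin{aligned}
P_{hello}[pos_l] \times q_{help*}[pos_l]
  = \tfrac{1}{5 \cdot 7}(5 \cdot 11) = \tfrac{11}{7} \notin \mathbb{Z}.
\end{aligned}
```

In the second query the prime 7 belongs to the fourth letter "p", which is mapped to a different position, so the factor 7 in the denominator for "l" is never cancelled.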
The solution of the invention does not hold if the following two situations occur:
Case one: the initial index keyword $W_{i,k}$ and the query keyword match, but the dot product of the index vector P and the request vector q is not an integer.
Case two: the initial index keyword $W_{i,k}$ and the query keyword do not match, but the dot product of the index vector P and the request vector q is an integer.
The following shows, by proof by contradiction, that neither case one nor case two can occur:
Disproof of case one:
If the dot product of the index vector P and the request vector q is not an integer, then there exists some $P[i] \times q[i]$ that is a fraction, and the prime factor set of $q[i]$ does not contain the prime factor set of the denominator of $P[i]$. This means that the prime $P_l$ corresponding to the l-th letter of the initial index keyword $W_{i,k}$ is mapped to position x of the vector P, whereas the prime $P_l$ corresponding to the l-th letter of the query keyword is mapped to a position other than x. By the property of the pseudo-random function, different mapped positions indicate that the l-th letter of the initial index keyword $W_{i,k}$ differs from the l-th letter of the query keyword, which contradicts the premise that the initial index keyword $W_{i,k}$ and the query keyword match. Therefore, case one cannot occur.
Disproof of case two:
If the dot product of the index vector P and the request vector q is an integer, then every $P[i] \times q[i]$ is an integer. Each $P[i] \times q[i]$ may or may not be 0. When the value is not 0, the prime factor set of $q[i]$ must contain the prime factor set of the denominator of $P[i]$, that is, every prime factor of the denominator of $P[i]$ appears among the prime factors of $q[i]$. By the property of the pseudo-random function, each letter of the query keyword must then be identical to the letter at the corresponding position of the initial index keyword $W_{i,k}$, that is, the initial index keyword and the query keyword match, which contradicts the premise that the initial index keyword $W_{i,k}$ and the query keyword do not match. Thus, case two cannot occur either.
In summary, neither case one nor case two is likely to occur, and thus the solution provided by the present invention is true.
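The conclusion can be checked on plaintext vectors with a short sketch that builds an index vector and a request vector and tests whether their dot product is an integer. It leaves out the kNN encryption layer entirely, and the toy prime table, completion alphabet, fixed $\alpha$, and seeded shuffle standing in for $F_{kf}$ are assumptions made for illustration.

```python
import random
import string
from fractions import Fraction

D = 26
PRIMES = [2, 3, 5, 7, 11]   # toy prime table P (distinct primes), L = 5
COMPLETION = "abxyz"        # toy completion alphabet S (hypothetical values)
L = len(PRIMES)
# Stand-in for F_kf: keyed, injective map from letters to positions 0..D-1.
POS = dict(zip(string.ascii_lowercase,
               random.Random("kf-demo").sample(range(D), 26)))

def pad(word):
    word = word.lower()
    return word + COMPLETION[: L - len(word)]

def index_vector(word, alpha=7):
    vec = [Fraction(1)] * D
    for l, letter in enumerate(pad(word)):
        vec[POS[letter]] /= PRIMES[l]
    # Untouched positions get the (here fixed) natural number alpha.
    return [Fraction(alpha) if v == 1 else v for v in vec]

def request_vector(pattern):
    vec = [1] * D
    for l, ch in enumerate(pad(pattern)):
        if ch == "*":
            vec = [v * PRIMES[l] for v in vec]
        else:
            vec[POS[ch]] *= PRIMES[l]
    return vec

def is_match(word, pattern):
    """True when the dot product of the two vectors is an integer."""
    dot = sum(p * q for p, q in zip(index_vector(word), request_vector(pattern)))
    return dot.denominator == 1

print(is_match("hello", "he*lo"), is_match("hello", "help*"))
```

With these toy values the first query matches and the second does not, in line with the two cases discussed above.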
Advantageous effects
The invention provides a ciphertext-based multi-keyword fuzzy query method in a cloud environment. By constructing vectors and matrices and introducing wildcards, the method solves the multi-keyword fuzzy query problem, removes the need for a predefined dictionary, and supports efficient updating and deletion of file indexes. It also handles multiple keywords in a single round of operations, which reduces the number of operations. Moreover, the method achieves higher precision and recall and provides flexible AND/OR queries, that is, it supports both logical AND queries and logical OR queries among the keywords. In addition, because the matrices in the scheme are encrypted with the kNN algorithm, the scheme provided by the invention has good security.
Drawings
FIG. 1 is a schematic diagram illustrating steps of a ciphertext-based multi-keyword fuzzy query method in a cloud environment according to the present invention;
FIG. 2 is a diagram illustrating an initialized 26-dimensional index vector for understanding step S202 according to an embodiment of the present invention;
FIG. 3 is the vector $P_{hello}$ for understanding step S204 in an embodiment of the present invention;
FIG. 4 is the vector $P_{key}$ for understanding step S204 in an embodiment of the present invention;
FIG. 5 is a final index vector for understanding step S205 in an embodiment of the present invention;
FIG. 6 is an index matrix $A_i$ for understanding step S211 in an embodiment of the present invention;
FIG. 7 shows the initial request vectors $q_{w*r*d}$ and $q_{k*y}$ for understanding step S302 in an embodiment of the present invention;
FIG. 8 is the vector $P_{w*r*d}$ for understanding step S303 in an embodiment of the present invention;
FIG. 9 is the vector $P_{k*yAB}$ for understanding step S303 in an embodiment of the present invention;
FIG. 10 is a 5-dimensional vector q for understanding the contents of step S312 and step S313 in an embodiment of the present invention;
FIG. 11 is the matrix Q before filling, for understanding the contents of step S312 and step S313 in an embodiment of the present invention;
FIG. 12 is the matrix Q after filling, for understanding the contents of step S312 and step S313 in an embodiment of the present invention;
FIG. 13 is the intermediate matrix for understanding the content of step S314 in an embodiment of the present invention;
FIG. 14 shows an index vector and a request vector for understanding the contents of step S403 in an embodiment of the present invention.
Detailed Description
In order to better understand the contents of the present invention, the following embodiments are further described.
As shown in FIG. 1, the present invention provides a ciphertext-based multi-keyword fuzzy query method in a cloud environment, including: step S1: the local server encrypts a plaintext data set according to a preset data encryption algorithm to obtain a ciphertext data set; step S2: the local server constructs the initial index keywords corresponding to the plaintext data set into index vectors, constructs the index vectors into an index matrix, encrypts the index matrix with a preset matrix encryption algorithm to obtain an encrypted index matrix, and sends the ciphertext data set and the encrypted index matrix to the cloud server; step S3: the user constructs the query keywords in a query request into request vectors, constructs the request vectors into a request matrix, encrypts the request matrix with the preset matrix encryption algorithm to obtain an encrypted request matrix, and sends the encrypted request matrix to the cloud server; step S4: the cloud server calculates the product of the encrypted index matrix and the encrypted request matrix, determines the target data ciphertext according to the product result, and sends the target data ciphertext back to the user; step S5: the user decrypts the target data ciphertext with the data encryption algorithm of step S1 to obtain the target data plaintext; each initial index keyword corresponds to an index vector; each row of the index matrix corresponds to an index vector; the query keywords in the query request contain wildcards; the query keywords in the query request are connected by logical operators; the logical operators include "AND" and "OR".
In the method provided by the invention, the initial index keywords are constructed into index vectors and then into an index matrix, which is encrypted to obtain the encrypted index matrix; the query keywords are constructed into request vectors and then into a request matrix, which is encrypted to obtain the encrypted request matrix; and the target data ciphertext is determined by calculating the product of the encrypted index matrix and the encrypted request matrix. With vectors and matrices constructed in this way, the data ciphertext corresponding to an index vector is a target data ciphertext only when the product of that index vector and the request vector is an integer. Meanwhile, the query keywords in the scheme contain wildcards; the influence of the wildcards is taken into account when the request vector is constructed, and the request vector is processed accordingly, so that fuzzy query can be realized. A wildcard can stand for any unspecified English letter; for example, when the query keyword is "**sure", both "unsure" and "insure" are correct query results. Because the query keywords in the query request are joined by logical operations (logical AND or logical OR), the invention can also realize logical queries.
In this embodiment, the preset matrix encryption algorithm is a kNN algorithm. The key of the preset matrix encryption algorithm is $Sk$, which comprises at least a prime number sequence table $P$, a completion alphabet $S$ and an invertible encryption matrix $M$. The prime number sequence table is $P = \{P_1, \ldots, P_i, \ldots, P_L\}$, where each element $P_i$ is a randomly generated prime number. The completion alphabet is $S = \{S_1, \ldots, S_i, \ldots, S_L\}$, where each element $S_i$ is an English letter randomly selected from the dictionary alphabet of the 26 English letters. The invertible encryption matrix is $M = \{M_1, M_2, s\}$, where $M_1$ and $M_2$ are both invertible matrices of size $d \times d$, their inverse matrices are denoted $M_1^{-1}$ and $M_2^{-1}$, and $s$ is a one-dimensional matrix with $d$ columns whose elements take the value 0 or 1 and are generated by a pseudo-random number function. Here $d \geq 26$, $L$ is greater than or equal to the number of letters in the longest initial index keyword, and $d$ and $L$ are positive integers.
The existing document "Secure kNN Computation on Encrypted Databases" (W. K. Wong, D. W.-L. Cheung, B. Kao, and N. Mamoulis, Proc. of SIGMOD, 2009) discloses a method of using the kNN algorithm for encryption processing. The prime number sequence table consists of randomly generated prime numbers arranged in ascending order. The value of $L$ is determined by the longest keyword in the initial index and is greater than or equal to the total number of letters in that keyword. For the correctness of the scheme, $d$ should be greater than or equal to 26; for security, $d$ is preferably greater than or equal to 128. The English letters in the dictionary alphabet are not case-sensitive.
In the present embodiment, the $d$-dimensional index vector $P_{i,k}$ corresponding to each initial index keyword $W_{i,k}$ is constructed according to the following steps:
Step S201: padding the initial index keyword $W_{i,k}$ to a uniform length $L$. The number of letters $\|W_{i,k}\|$ of the initial index keyword is determined: when $\|W_{i,k}\| < L$, $L - \|W_{i,k}\|$ letters are selected in order from the completion alphabet $S$ and appended to $W_{i,k}$ so that $\|W_{i,k}\| = L$; when $\|W_{i,k}\| \geq L$, step S202 is executed directly;
Step S202: initializing a $d$-dimensional index vector $P_{i,k}$ and setting all of its elements to 1, so that $P_{i,k}[l] = 1$, $l \in [1, d]$, $l$ an integer; wherein $P_{i,k}[l]$ denotes the element at position $l$ of the index vector $P_{i,k}$;
Step S203: selecting the filling positions in the index vector $P_{i,k}$ for the letters of the initial index keyword $W_{i,k}$: position $pos_l$ of the index vector is determined in turn as the filling position of the $l$-th letter of $W_{i,k}$; wherein $pos_l = F_{kf}(W_{i,k}[l])$, $F_{kf}()$ is a pseudo-random number function, and $W_{i,k}[l]$ denotes the $l$-th letter of $W_{i,k}$, $l \in [1, L]$;
Step S204: updating the value $P_{i,k}[pos_l]$ at position $pos_l$ of the index vector: $P_{i,k}[pos_l] = P_{i,k}[pos_l] / P_l$; wherein the initial value of $P_{i,k}[pos_l]$ is $P_{i,k}[l]$, and $P_l$ is the $l$-th prime number in the prime number sequence table $P$, $l \in [1, L]$;
Step S205: randomly selecting a natural number $\alpha$ and replacing every value of the index vector $P_{i,k}$ that is still the initial value 1 with $\alpha$ to obtain the final index vector $P_{i,k}$.
When constructing the index vectors, the first step is to pad all initial index keywords to a uniform length: a keyword that is too short is padded by selecting letters from the completion alphabet and appending them directly after its original letters. The general idea of the vector construction is "choose the position first, then calculate the value": step S203 chooses the position in the index vector for each letter of the initial index keyword, and step S204 determines the value filled in at that position. Step S205 improves the security of the scheme: even if two data items in the plaintext data set contain the same initial index keyword, the index vectors of that keyword for the two data items differ. This property is technically referred to as "index indistinguishability".
Specifically, assume that the plaintext data set is $D = \{D_1, D_2\}$, the initial index corresponding to $D$ is $W = \{W_1, W_2\}$, $d = 26$, the prime number sequence table is $P = \{3, 5, 7, 11, 13\}$ and the completion alphabet is S = {'A', 'B', 'C', 'D', 'E'}, where data $D_1$ corresponds to the index $W_1$ = {"hello", "key"} and data $D_2$ corresponds to the index $W_2$. $L = 5$, the number of letters in the longest keyword, "hello". The index vectors corresponding to the keywords "hello" and "key" are constructed as follows:
According to step S201, the keyword "hello" does not need to be padded and the keyword "key" does, which gives $W_1$ = {"hello", "keyAB"};
According to step S202, two 26-dimensional index vectors are initialized, and all elements of the index vectors are set to 1, as shown in FIG. 2;
According to step S203, for the keyword "hello", $W_{hello}[1]$ = 'h', $W_{hello}[2]$ = 'e', $W_{hello}[3]$ = 'l', $W_{hello}[4]$ = 'l', $W_{hello}[5]$ = 'o'; a pseudo-random number function $F_{kf}()$ is selected and the filling position corresponding to each letter is calculated. The filling positions of the keyword "keyAB" are calculated in the same way. The filling positions finally obtained for the keywords "hello" and "keyAB" are shown in the following table:
“hello” | ‘h’ | ‘e’ | ‘l’ | ‘l’ | ‘o’ |
pos | 9 | 13 | 11 | 11 | 7 |
“keyAB” | ‘k’ | ‘e’ | ‘y’ | ‘A’ | ‘B’ |
pos | 8 | 13 | 10 | 5 | 15 |
According to step S204, for the filling positions obtained from the table above, the specific value filled at each position is calculated by the formula $P_{i,k}[pos_l] = P_{i,k}[pos_l] / P_l$:
For the keyword "hello", the constructed vector $P_{hello}$ is shown in FIG. 3:
① $pos_{'h'} = 9$ (from the table above) and, in the prime number table $P = \{3, 5, 7, 11, 13\}$, $P_1 = 3$; then $P_{hello}[9] = P_{hello}[9] / P_1 = 1/3$ (meaning that the value at position 9 of the vector $P_{hello}$ representing the keyword "hello" is updated to $1/3$);
For the keyword "keyAB", the constructed vector $P_{key}$ is shown in FIG. 4:
In step S205, the random number $\alpha$ is set to 0, and every position in the index vectors that still holds the initial value 1 is replaced with $\alpha$; the resulting final index vectors are shown in FIG. 5.
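Steps S201 to S205 can be traced with a short sketch for the two example keywords. It is only an illustration under stated assumptions: the dictionary `POS` copies the filling positions from the table above and stands in for the keyed pseudo-random function $F_{kf}$, and the names `build_index_vector`, `PRIMES`, `PAD` and `POS` are hypothetical. Exact fractions are used so that values such as 1/3 are not rounded.

```python
from fractions import Fraction

PRIMES = [3, 5, 7, 11, 13]     # prime number sequence table P (from the example)
PAD = "ABCDE"                  # completion alphabet S (from the example)
D, L = 26, 5                   # vector dimension d and uniform keyword length L
# Filling positions copied from the table in the text; they stand in for F_kf.
POS = {"h": 9, "e": 13, "l": 11, "o": 7, "k": 8, "y": 10, "A": 5, "B": 15}


def build_index_vector(word, alpha=0):
    """Return the index vector as a list with 1-based positions (slot 0 unused)."""
    padded = word + PAD[: L - len(word)]            # S201: pad to length L
    vec = [Fraction(1)] * (D + 1)                   # S202: all elements start at 1
    for l, letter in enumerate(padded, start=1):
        pos = POS[letter]                           # S203: choose the position
        vec[pos] = vec[pos] / PRIMES[l - 1]         # S204: divide by P_l
    for pos in range(1, D + 1):                     # S205: untouched positions -> alpha
        if vec[pos] == 1:
            vec[pos] = Fraction(alpha)
    return vec


p_hello = build_index_vector("hello")
p_key = build_index_vector("key")
print(p_hello[9], p_hello[11], p_key[5])   # prints: 1/3 1/77 1/11
```

Position 11 of `p_hello` is divided twice, by $P_3 = 7$ and by $P_4 = 11$, because the letter 'l' occurs at the third and fourth place of "hello".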
In the present embodiment, the index vector set $P_i$ composed of the index vectors $P_{i,k}$ is constructed into an index matrix $A_i$, and the index matrix $A_i$ is encrypted by the kNN algorithm, according to the following steps:
Step S211: sequentially placing each $P_{i,k}$ of the index vector set in the $k$-th row of the index matrix $A_i$ to obtain an index matrix $A_i$ of size $m_i \times d$, $k \in [1, m_i]$;
Step S212: generating two matrices $A_i'$ and $A_i''$ of size $m_i \times d$ from the index matrix $A_i$ according to the following rule: for each row $A_i[k][*]$ of the index matrix $A_i$: if $s[j] = 0$, then $A_i'[k][j] = A_i''[k][j] = A_i[k][j]$; if $s[j] = 1$, then $A_i'[k][j] + A_i''[k][j] = A_i[k][j]$; wherein $s[j]$ denotes the $j$-th bit of the one-dimensional matrix $s$ in the key $Sk$, $A_i[k][j]$ denotes the element in row $k$ and column $j$ of the index matrix $A_i$, $A_i'[k][j]$ denotes the element in row $k$ and column $j$ of the matrix $A_i'$, $A_i''[k][j]$ denotes the element in row $k$ and column $j$ of the matrix $A_i''$, and $j \in [1, d]$;
Step S213: multiplying the matrices $A_i'$ and $A_i''$ by $M_1$ and $M_2$, respectively, to obtain the encrypted index matrix $I_i$; wherein $I_i = (I_i', I_i'') = (A_i' \cdot M_1, A_i'' \cdot M_2)$ and $\cdot$ denotes the multiplication of two matrices.
In the scheme of the invention, an index vector set is constructed into an index matrix, and then the index matrix is encrypted by adopting a kNN algorithm to obtain an encrypted index matrix. Therefore, the index contents transmitted in the data interaction process in the scheme are all encrypted contents, and the safety of the scheme is further improved.
Continuing the example above, according to step S211 the vectors of $P_1 = \{P_{hello}, P_{key}\}$ are placed in a $2 \times 26$ matrix $A_1$: $P_{hello}$ is placed in the first row of $A_1$ and $P_{key}$ in the second row. The resulting index matrix $A_1$ is shown in FIG. 6.
To facilitate understanding of the splitting in step S212, a one-dimensional example is described below. Let $s = \{1, 0, 0, 1\}$ and $A = [1, 2, 3, 4]$, which is split into $A' = [a_1', a_2', a_3', a_4']$ and $A'' = [a_1'', a_2'', a_3'', a_4'']$.
① $s[1] = 1$, so the split condition $A[1] = A'[1] + A''[1]$ must be satisfied, for example $1 = 4 + (-3)$; then $A' = [4, a_2', a_3', a_4']$, $A'' = [-3, a_2'', a_3'', a_4'']$.
② $s[2] = 0$, so the condition $A'[2] = A''[2] = A[2]$ must be satisfied; then $A' = [4, 2, a_3', a_4']$, $A'' = [-3, 2, a_3'', a_4'']$.
③ $s[3] = 0$, so the condition $A'[3] = A''[3] = A[3]$ must be satisfied; then $A' = [4, 2, 3, a_4']$, $A'' = [-3, 2, 3, a_4'']$.
④ $s[4] = 1$, so the split condition $A[4] = A'[4] + A''[4]$ must be satisfied, for example $4 = 0 + 4$; then $A' = [4, 2, 3, 0]$, $A'' = [-3, 2, 3, 4]$.
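The splitting rule of step S212 can be sketched for a single row as follows. This is an illustrative sketch only: the function name `split_index_row`, the range of the random shares and the fixed seed are assumptions, not part of the scheme as described.

```python
import random


def split_index_row(row, s, rng):
    """Split one row of the index matrix into two rows according to the bit vector s."""
    first, second = [], []
    for value, bit in zip(row, s):
        if bit == 0:
            # s[j] = 0: both parts copy the original element.
            first.append(value)
            second.append(value)
        else:
            # s[j] = 1: the two parts are random shares that sum to the original element.
            share = rng.randint(-9, 9)
            first.append(share)
            second.append(value - share)
    return first, second


rng = random.Random(0)          # fixed seed, used only to make the sketch repeatable
s = [1, 0, 0, 1]
a = [1, 2, 3, 4]
a_first, a_second = split_index_row(a, s, rng)
print(a_first, a_second)
```

Whatever shares are drawn, positions 2 and 3 of both parts equal the original values, and positions 1 and 4 of the two parts add up to 1 and 4 respectively.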
In the present embodiment, the $d$-dimensional request vector $q_{j,k}$ corresponding to each query keyword is constructed according to the following steps:
Step S301: padding the query keyword to a uniform length $L$. The number of letters of the query keyword is determined: when it is less than $L$, the missing letters are selected in order from the completion alphabet $S$ and appended after the original letters of the query keyword so that its length equals $L$; otherwise step S302 is executed directly;
Step S302: initializing a $d$-dimensional request vector $q_{j,k}$ and setting all of its elements to 1, so that $q_{j,k}[l] = 1$, $l \in [1, d]$, $l$ an integer; wherein $q_{j,k}[l]$ denotes the element at position $l$ of the request vector $q_{j,k}$;
Step S303: selecting the filling positions in the request vector $q_{j,k}$ for the letters of the query keyword and updating the values of the request vector: when the $l$-th character of the query keyword is not the wildcard "*", position $pos_l$ of the request vector is determined as the filling position of the $l$-th letter of the query keyword, and $q_{j,k}[pos_l] = q_{j,k}[pos_l] \times P_l$; when the $l$-th character of the query keyword is the wildcard "*", the values at all positions of the request vector $q_{j,k}$ are multiplied by $P_l$; the final request vector $q_{j,k}$ is thereby obtained; wherein $pos_l$ is obtained by applying the pseudo-random number function $F_{kf}()$ to the $l$-th letter of the query keyword, $l \in [1, L]$, the initial value of $q_{j,k}[pos_l]$ is $q_{j,k}[l]$, and $P_l$ is the $l$-th prime number in the prime number sequence table $P$.
Because the present invention is directed to query keywords containing the wildcard "*", the key to the scheme is how to construct such query keywords into request vectors. The first step is the same as for the index vectors: all keywords are padded to a uniform length. In the second step, letters and wildcards "*" in the keyword are distinguished and processed accordingly: letters are processed with the same overall idea as for the index vectors, "choose the position first, then calculate the value", whereas a wildcard "*" is handled by multiplying the values at all positions of the request vector by a prime number. In this way the influence of the wildcard is reflected in the request vector, which realizes multi-keyword fuzzy query.
According to step S302, the 26-dimensional initial request vectors $q_{w*r*d}$ and $q_{k*y}$ are obtained, as shown in FIG. 7.
According to step S303, the filling positions obtained for the letters of the keywords "w*r*d" and "k*yAB" are shown in the following table:
“w*r*d” | ‘w’ | ‘r’ | ‘d’ | |
pos | 6 | 14 | 12 | |
“k*yAB” | ‘k’ | ‘y’ | ‘A’ | ‘B’ |
pos | 8 | 10 | 5 | 15 |
For the keyword "w*r*d", the constructed vector $q_{w*r*d}$ is shown in FIG. 8:
① $pos_{'w'} = 6$ (from the table above) and, in the prime number table $P = \{3, 5, 7, 11, 13\}$, $P_1 = 3$; then $q_{w*r*d}[6] = q_{w*r*d}[6] \times P_1 = 1 \times 3 = 3$ (meaning that the value at position 6 of the vector $q_{w*r*d}$ representing the keyword "w*r*d" is updated to 3);
② The second character of "w*r*d" is "*" and, in the prime number table $P = \{3, 5, 7, 11, 13\}$, $P_2 = 5$; every position of the vector $q_{w*r*d}$, from position 1 to position 26, is multiplied by $P_2$;
④ The fourth character of "w*r*d" is "*" and, in the prime number table $P = \{3, 5, 7, 11, 13\}$, $P_4 = 11$; every position of the vector $q_{w*r*d}$, from position 1 to position 26, is multiplied by $P_4$.
For the keyword "k*yAB", the constructed vector $q_{k*y}$ is shown in FIG. 9:
① $pos_{'k'} = 8$ (from the table above) and, in the prime number table $P = \{3, 5, 7, 11, 13\}$, $P_1 = 3$; then $q_{k*y}[8] = q_{k*y}[8] \times P_1 = 1 \times 3 = 3$ (meaning that the value at position 8 of the vector $q_{k*y}$ representing the keyword "k*y" is updated to 3);
② The second character of "k*yAB" is "*" and, in the prime number table $P = \{3, 5, 7, 11, 13\}$, $P_2 = 5$; every position of the vector $q_{k*y}$, from position 1 to position 26, is multiplied by $P_2$;
④ $pos_{'A'} = 5$ and, in the prime number table $P = \{3, 5, 7, 11, 13\}$, $P_4 = 11$; then $q_{k*y}[5] = q_{k*y}[5] \times P_4$, that is, the value at position 5 is multiplied by 11 ('A' is the fourth character of "k*yAB", so the fourth prime $P_4 = 11$ of the prime number table is used);
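The request vector construction of steps S301 to S303 can be sketched for the two example keywords as follows. The sketch rests on assumptions: `POS` copies the filling positions from the table above in place of $F_{kf}$, the names are hypothetical, and the characters are processed strictly in order, so a position touched by a letter also accumulates the prime of every wildcard in the keyword.

```python
PRIMES = [3, 5, 7, 11, 13]     # prime number sequence table P (from the example)
PAD = "ABCDE"                  # completion alphabet S (from the example)
D, L = 26, 5
# Filling positions copied from the table in the text; they stand in for F_kf.
POS = {"w": 6, "r": 14, "d": 12, "k": 8, "y": 10, "A": 5, "B": 15}


def build_request_vector(pattern):
    """Return the request vector as a list with 1-based positions (slot 0 unused)."""
    padded = pattern + PAD[: L - len(pattern)]      # S301: pad to length L
    vec = [1] * (D + 1)                             # S302: all elements start at 1
    for l, ch in enumerate(padded, start=1):        # S303
        prime = PRIMES[l - 1]
        if ch == "*":
            # Wildcard: every position is multiplied by P_l.
            for pos in range(1, D + 1):
                vec[pos] *= prime
        else:
            # Letter: only its filling position is multiplied by P_l.
            vec[POS[ch]] *= prime
    return vec


q_wrd = build_request_vector("w*r*d")
q_ky = build_request_vector("k*y")
print(q_wrd[6], q_wrd[14], q_wrd[12], q_wrd[1])   # prints: 165 385 715 55
```

In `q_wrd`, position 6 ends at 3 × 5 × 11 = 165: the factor 3 comes from the letter 'w' and the factors 5 and 11 from the two wildcards, which also leave every otherwise untouched position at 55.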
In the present embodiment, the request vectors $q_{j,k}$ are constructed into request matrices $Q_{j,k}$, and the request matrix set $Q_j$ composed of the request matrices $Q_{j,k}$ is encrypted by the kNN algorithm, according to the following steps:
Step S311: selecting a random number $e$, where $e \in [2, d]$ and $e$ is an integer;
Step S312: sequentially placing $q_{j,k}[l]$ in $Q_{j,k}[l][num]$ and filling the elements of $Q_{j,k}[l][*]$ other than $Q_{j,k}[l][num]$, so that each request vector $q_{j,k}$ is expanded into a request matrix $Q_{j,k}$ of size $d \times e$; wherein $q_{j,k}[l]$ denotes the element at position $l$ of the request vector $q_{j,k}$, $Q_{j,k}[l][num]$ denotes the element in row $l$ and column $num$ of the matrix $Q_{j,k}$, $num$ is a random number with $num \in [1, e]$, $l$ is an integer with $l \in [1, d]$, and $Q_{j,k}[l][*]$ denotes all elements of row $l$ of the request matrix $Q_{j,k}$; the sum of the elements of $Q_{j,k}[l][*]$ other than $Q_{j,k}[l][num]$ is $\gamma_l$, with $\gamma_l = t_l \times Q_{j,k}[l][num]$, where $t_l = 0$ or $t_l + 1$ is a prime number;
Step S314: placing the elements of each row of every request matrix $Q_{j,k}$ of the request matrix set in turn into an intermediate matrix, thereby obtaining an intermediate matrix of size $d \times (n_j \times e)$; wherein $k$ is an integer and $k \in [1, n_j]$;
Step S315: generating two matrices of size $d \times (n_j \times e)$ from the intermediate matrix according to the following rule: for each column $k$ of the intermediate matrix: if $s[i] = 0$, the elements in row $i$ and column $k$ of the two matrices sum to the element in row $i$ and column $k$ of the intermediate matrix; if $s[i] = 1$, the elements in row $i$ and column $k$ of the two matrices both equal that element; wherein $s[i]$ denotes the $i$-th bit of the one-dimensional matrix $s$ in the key $Sk$ and $i \in [1, d]$;
Step S316: multiplying the inverse matrices $M_1^{-1}$ and $M_2^{-1}$ of $M_1$ and $M_2$ by the two matrices, respectively, to obtain the encrypted request matrix $T_Q$; wherein $\cdot$ denotes the multiplication of two matrices.
In the scheme of the invention, the request vector is expanded into the request matrix mainly due to the consideration of scheme safety. And then, the request matrix is encrypted by adopting a kNN algorithm, so that the safety of the scheme is further improved.
To illustrate how a request vector is expanded into a request matrix in steps S312 and S313, assume that there is a 5-dimensional vector $q$ (as shown in FIG. 10), take $e = 3$, and let $l$ start from 1 and increase by 1:
when $l = 1$, $num = 1$ is randomly selected and $q[1]$ is placed in $Q[1][1]$;
when $l = 2$, $num = 2$ is randomly selected and $q[2]$ is placed in $Q[2][2]$;
when $l = 3$, $num = 3$ is randomly selected and $q[3]$ is placed in $Q[3][3]$;
when $l = 4$, $num = 2$ is randomly selected and $q[4]$ is placed in $Q[4][2]$;
when $l = 5$, $num = 3$ is randomly selected and $q[5]$ is placed in $Q[5][3]$;
resulting in the corresponding matrix $Q$ (as shown in FIG. 11).
Then the other positions of the matrix $Q$ are filled to obtain the expanded matrix $Q$ (as shown in FIG. 12):
① When $l = 1$, the elements other than $Q[1][1]$ are filled. Select $t_1 = 0$; then $\gamma_1 = t_1 Q[1][1] = 0$. The elements other than $Q[1][1]$ are $Q[1][2]$ and $Q[1][3]$, so the condition $\gamma_1 = Q[1][2] + Q[1][3] = 0$ must be satisfied; take $Q[1][2] = -2$, $Q[1][3] = 2$.
② When $l = 2$, the elements other than $Q[2][2]$ are filled. Select $t_2 = 1$ (because $t_2 + 1 = 2$ is a prime number); then $\gamma_2 = t_2 Q[2][2] = 11$. The elements other than $Q[2][2]$ are $Q[2][1]$ and $Q[2][3]$, so the condition $\gamma_2 = Q[2][1] + Q[2][3] = 11$ must be satisfied; take $Q[2][1] = 1$, $Q[2][3] = 10$.
③ When $l = 3$, the elements other than $Q[3][3]$ are filled. Select $t_3 = 0$; then $\gamma_3 = t_3 Q[3][3] = 0$. The elements other than $Q[3][3]$ are $Q[3][1]$ and $Q[3][2]$, so the condition $\gamma_3 = Q[3][1] + Q[3][2] = 0$ must be satisfied; take $Q[3][1] = -5$, $Q[3][2] = 5$.
④ When $l = 4$, the elements other than $Q[4][2]$ are filled. Select $t_4 = 16$ (because $t_4 + 1 = 17$ is a prime number); then $\gamma_4 = t_4 Q[4][2] = 16 \times 5 = 80$. The elements other than $Q[4][2]$ are $Q[4][1]$ and $Q[4][3]$, so the condition $\gamma_4 = Q[4][1] + Q[4][3] = 80$ must be satisfied; take $Q[4][1] = 0$, $Q[4][3] = 80$.
⑤ When $l = 5$, the elements other than $Q[5][3]$ are filled. Select $t_5 = 16$ (because $t_5 + 1 = 17$ is a prime number); then $\gamma_5 = t_5 Q[5][3] = 16 \times 13 = 208$. The elements other than $Q[5][3]$ are $Q[5][1]$ and $Q[5][2]$, so the condition $\gamma_5 = Q[5][1] + Q[5][2] = 208$ must be satisfied; take $Q[5][1] = 200$, $Q[5][2] = 8$.
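The expansion just walked through can be sketched in a few lines. This is an illustration under assumptions: the function name `expand_request_vector`, the range of the random fillers, the candidate values for $t_l$ and the fixed seed are hypothetical, and the entries 3 and 7 of the example vector are assumed (only 11, 5 and 13 can be read off the worked example).

```python
import random


def expand_request_vector(q, e, t_choices, rng):
    """Expand q into a len(q) x e matrix; also return the chosen columns and t values."""
    matrix, cols, ts = [], [], []
    for value in q:
        num = rng.randrange(e)                 # 0-based column that keeps q[l]
        t = rng.choice(t_choices)              # t_l = 0, or t_l + 1 is prime
        gamma = t * value                      # required sum of the other e - 1 entries
        fillers = [rng.randint(-20, 20) for _ in range(e - 2)]
        fillers.append(gamma - sum(fillers))   # last filler makes the sum exactly gamma
        matrix.append(fillers[:num] + [value] + fillers[num:])
        cols.append(num)
        ts.append(t)
    return matrix, cols, ts


rng = random.Random(42)                        # fixed seed so the sketch is repeatable
q = [3, 11, 7, 5, 13]                          # assumed 5-dimensional request vector
Q, cols, ts = expand_request_vector(q, e=3, t_choices=[0, 1, 16], rng=rng)
for row, value, t in zip(Q, q, ts):
    # Each row sums to q[l] * (1 + t_l), so summing a row later recovers a multiple of q[l].
    assert sum(row) == value * (1 + t)
```

The row-sum property checked at the end is what makes the later "sum every $e$ elements" step meaningful: the random fillers cancel out except for the factor $1 + t_l$.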
To facilitate understanding of step S314, suppose the request matrix set is $Q_1 = \{Q_{w*r*d}, Q_{k*y}\}$. In the intermediate matrix, shown in FIG. 13, $Q_{w*r*d}$ is placed in columns 1 to 3 and $Q_{k*y}$ is placed in columns 4 to 6. $Q_{w*r*d}$ and $Q_{k*y}$ are both $26 \times 3$ matrices, so the intermediate matrix is a $26 \times (2 \times 3)$, that is, a $26 \times 6$ matrix.
In step S4, the product of the encrypted index matrix and the encrypted request matrix is calculated and the target data ciphertext is determined from the product as follows:
Step S401: calculating the product matrix, which is the product of the encrypted index matrix and the encrypted request matrix and has size $m_i \times (n_j \times e)$;
Step S402: summing every $e$ elements in each row of the product matrix to obtain a result matrix of size $m_i \times n_j$: for each row $k_1$ of the product matrix, whenever $k_2 \% e = 0$, the $e$ elements ending at column $k_2$ are summed, and the sums are taken in turn as the elements of row $k_1$ of the result matrix; wherein $k_1$ is an integer with $k_1 \in [1, m_i]$ and $k_2$ is an integer with $k_2 \in [1, n_j \times e]$;
Step S403: judging from the result matrix whether the data ciphertext corresponding to the encrypted index matrix $I_i$ is the target data ciphertext: when the query request contains the logical operator "AND", the data ciphertext corresponding to the encrypted index matrix is the target data ciphertext if every column of the corresponding result matrix contains at least one integer element; when the query request contains the logical operator "OR", the data ciphertext corresponding to the encrypted index matrix is the target data ciphertext if at least one column of the corresponding result matrix contains an integer element.
In the scheme of the invention, the cloud server does not need the specific contents of the data plaintext, the initial index or the query keywords; it can judge whether a data ciphertext is the target data ciphertext merely by calculating the product of the encrypted index matrix and the encrypted request matrix, so there is no risk of disclosure and the whole process has good security. Meanwhile, with the matrices constructed by the method, the logical "AND" or "OR" query result over the query keywords can be read from the multiplication result, which makes the query more convenient and flexible.
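That the product of the encrypted matrices equals the product of the plaintext matrices can be checked numerically. The following is a minimal sketch under assumptions, not the scheme's implementation: the request-side split rule (split where $s[i] = 0$, copy where $s[i] = 1$) is assumed as the complement of step S212, the small matrices `A` and `Q` are made-up data, and all function names are hypothetical. Exact fractions avoid rounding in the inverse.

```python
from fractions import Fraction
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(m):
    """Gauss-Jordan inverse over exact fractions (m is assumed invertible)."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def random_invertible(n, rng):
    # Product of unit lower- and upper-triangular matrices: determinant 1, always invertible.
    lower = [[1 if i == j else (rng.randint(-3, 3) if j < i else 0) for j in range(n)]
             for i in range(n)]
    upper = [[1 if i == j else (rng.randint(-3, 3) if j > i else 0) for j in range(n)]
             for i in range(n)]
    return matmul(lower, upper)


def split(matrix, s, split_bit, by_column, rng):
    """Split matrix into two parts; entries governed by a bit equal to split_bit are shared."""
    first = [row[:] for row in matrix]
    second = [row[:] for row in matrix]
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if s[c if by_column else r] == split_bit:
                share = rng.randint(-5, 5)
                first[r][c], second[r][c] = share, value - share
    return first, second


rng = random.Random(1)
d, s = 4, [1, 0, 0, 1]
A = [[Fraction(1, 3), 0, Fraction(1, 5), 0], [0, Fraction(1, 7), 0, 0]]   # made-up index matrix
Q = [[3, 1], [7, 2], [5, 4], [1, 6]]                                      # made-up intermediate matrix
M1, M2 = random_invertible(d, rng), random_invertible(d, rng)
A1, A2 = split(A, s, 1, True, rng)      # S212: split columns where s[j] = 1
Q1, Q2 = split(Q, s, 0, False, rng)     # assumed S315: split rows where s[i] = 0
I1, I2 = matmul(A1, M1), matmul(A2, M2)                      # S213
T1, T2 = matmul(inverse(M1), Q1), matmul(inverse(M2), Q2)    # S316
R = [[x + y for x, y in zip(r1, r2)]
     for r1, r2 in zip(matmul(I1, T1), matmul(I2, T2))]      # S401
assert R == matmul(A, Q)
```

The final assertion holds because $M_1 M_1^{-1}$ and $M_2 M_2^{-1}$ cancel, and at every position exactly one of the two operands was split into shares while the other was copied.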
To facilitate understanding of step S402, assume that there is a product matrix of size $2 \times (2 \times 3)$. The rule "when $k_2 \% e = 0$, calculate the sum" can be understood simply as: sum every $e$ elements in each row of the matrix. In this example, every 3 elements are summed.
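The grouping in step S402 amounts to summing consecutive blocks of $e$ entries in each row. A minimal sketch, with a made-up $2 \times (2 \times 3)$ matrix and the hypothetical function name `sum_groups`:

```python
def sum_groups(product_matrix, e):
    """Sum every e consecutive elements of each row (step S402)."""
    return [[sum(row[c:c + e]) for c in range(0, len(row), e)]
            for row in product_matrix]


product = [[1, 2, 3, 4, 5, 6],
           [0, 1, 0, 2, 0, 3]]          # made-up 2 x (2 x 3) product matrix
result = sum_groups(product, 3)
print(result)                           # prints: [[6, 15], [1, 5]]
```

Each row of 6 entries collapses to 2 entries, one per query keyword, so the result has size $2 \times 2$.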
To facilitate understanding of step S403, assume that the index vectors and the request vectors are as shown in FIG. 14. Then:
Because no file contains both "w*r*d" and "k*y", the result matrices of the two files do not satisfy the requirement that every column contains at least one integer, so there is no query result that conforms to the logical "AND";
Because both files contain "k*y", the result matrices of the two files both satisfy the requirement that at least one column contains an integer, so there are query results that conform to the logical "OR".
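The matching principle and the AND/OR decision can be tried out on plaintext vectors. The sketch below leaves out the encryption and the expansion by the factors $1 + t_l$, and it rests on assumptions: `pos` is a hypothetical stand-in for $F_{kf}$ (the case-insensitive alphabet position), the contents of the second file are assumed to be the single keyword "key", and all names are illustrative.

```python
from fractions import Fraction

PRIMES = [3, 5, 7, 11, 13]          # prime number sequence table P from the example
PAD = "ABCDE"                       # completion alphabet S from the example
L, D = 5, 26


def pos(letter):
    # Hypothetical stand-in for F_kf: 0-based alphabet position, case-insensitive.
    return ord(letter.lower()) - ord("a")


def pad(word):
    return word + PAD[: L - len(word)]


def index_vector(word, alpha=0):
    vec = [Fraction(1)] * D
    for l, ch in enumerate(pad(word)):
        vec[pos(ch)] /= PRIMES[l]
    return [Fraction(alpha) if v == 1 else v for v in vec]


def request_vector(pattern):
    vec = [1] * D
    for l, ch in enumerate(pad(pattern)):
        targets = range(D) if ch == "*" else [pos(ch)]
        for p in targets:
            vec[p] *= PRIMES[l]
    return vec


def result_matrix(index_words, query_patterns):
    # One row per index keyword, one column per query keyword.
    rows = []
    for w in index_words:
        iv = index_vector(w)
        rows.append([sum(a * b for a, b in zip(iv, request_vector(q)))
                     for q in query_patterns])
    return rows


def matches(result, operator):
    # A column "hits" if any of its entries is an integer.
    hits = [any(Fraction(row[c]).denominator == 1 for row in result)
            for c in range(len(result[0]))]
    return all(hits) if operator == "AND" else any(hits)


queries = ["w*r*d", "k*y"]
file1 = result_matrix(["hello", "key"], queries)
file2 = result_matrix(["key"], queries)          # contents of file 2 are assumed
print(matches(file1, "AND"), matches(file1, "OR"))   # prints: False True
```

In `file1`, the entry for the index keyword "key" and the query "k*y" is the integer 21 because every prime in a denominator is cancelled by the matching letter or wildcard in the request vector; all other entries keep a prime in the denominator and are not integers.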
In summary, the invention provides a ciphertext-based multi-keyword fuzzy query method in a cloud environment. By constructing vectors and matrices, the method solves the problem of multi-keyword fuzzy query with wildcards, eliminates the need for a preset dictionary, and provides efficient updating and deletion of file indexes; it also supports queries with multiple keywords in a single round of operation, which reduces the number of operations; moreover, the method has higher precision and recall and provides very flexible AND/OR queries, that is, it supports both logical AND queries and logical OR queries among the keywords. In addition, because the matrix encryption of the kNN algorithm is adopted, the scheme provided by the invention has good security.
The above description is only exemplary of the present invention and should not be taken as limiting the invention, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (5)
1. A ciphertext-based multi-keyword fuzzy query method in a cloud environment is characterized by comprising the following steps:
step S1: the local server encrypts a plaintext data set according to a preset data encryption algorithm to obtain a ciphertext data set;
step S2: the local server constructs the initial index key words corresponding to the plaintext data set into index vectors, constructs the index vectors into an index matrix, encrypts the index matrix by adopting a preset matrix encryption algorithm to obtain an encrypted index matrix, and sends the ciphertext data set and the encrypted index matrix to the cloud server;
step S3: constructing query keywords in a query request into a request vector by a user, constructing the request vector into a request matrix, encrypting the request matrix by adopting a preset matrix encryption algorithm to obtain an encrypted request matrix, and sending the encrypted request matrix to a cloud server;
step S4: the cloud server calculates the product of the encryption index matrix and the encryption request matrix according to the encryption request matrix, determines a target data ciphertext according to the product result, and sends the target data ciphertext back to the user;
step S5: the user decrypts the target data ciphertext through the data encryption algorithm of the step S1 to obtain a target data plaintext;
each initial index key corresponds to an index vector;
each row element of the index matrix corresponds to an index vector;
the query keyword in the query request comprises a wildcard character;
the query keywords in the query request are connected through a logical operator;
the logical operators comprise "AND" and "OR";
the preset matrix encryption algorithm is a kNN algorithm;
the key of the preset matrix encryption algorithm is Sk, and the key Sk at least comprises: a prime number sequence table P, a completion alphabet S and a reversible encryption matrix M;
the prime number sequence table is $P = \{P_1, \ldots, P_i, \ldots, P_L\}$, wherein each element $P_i$ is a randomly generated prime number;
the completion alphabet is $S = \{S_1, \ldots, S_i, \ldots, S_L\}$, wherein each element $S_i$ is an English letter randomly selected from the dictionary alphabet of the 26 English letters;
the invertible encryption matrix is $M = \{M_1, M_2, s\}$, wherein $M_1$ and $M_2$ are both invertible matrices of size $d \times d$, the inverse matrices corresponding to $M_1$ and $M_2$ are denoted $M_1^{-1}$ and $M_2^{-1}$ respectively, and $s$ is a one-dimensional matrix with $d$ columns whose elements take the value 0 or 1 and are generated by a pseudo-random number function;
wherein $d \geq 26$, $L$ is greater than or equal to the number of letters contained in the longest initial index keyword, and $d$ and $L$ are positive integers;
constructing the $d$-dimensional index vector $P_{i,k}$ corresponding to each initial index keyword $W_{i,k}$ according to the following steps:
step S201: padding the initial index keyword $W_{i,k}$ to a uniform length $L$;
determining the number of letters $\|W_{i,k}\|$ of the initial index keyword $W_{i,k}$: when $\|W_{i,k}\| < L$, sequentially selecting $L - \|W_{i,k}\|$ letters from the completion alphabet $S$ and appending them to the initial index keyword $W_{i,k}$ so that $\|W_{i,k}\| = L$; when $\|W_{i,k}\| \geq L$, directly executing step S202;
step S202: initializing a $d$-dimensional index vector $P_{i,k}$ and setting all elements of the index vector $P_{i,k}$ to 1, so that $P_{i,k}[l] = 1$, $l \in [1, d]$, $l$ being an integer;
wherein $P_{i,k}[l]$ denotes the element at position $l$ of the index vector $P_{i,k}$;
step S203: selecting the filling positions in the index vector $P_{i,k}$ for the letters contained in the initial index keyword $W_{i,k}$;
sequentially determining position $pos_l$ of the index vector $P_{i,k}$ as the filling position of the $l$-th letter contained in the initial index keyword $W_{i,k}$;
wherein $pos_l = F_{kf}(W_{i,k}[l])$, $F_{kf}()$ is a pseudo-random number function, and $W_{i,k}[l]$ denotes the $l$-th letter contained in the initial index keyword $W_{i,k}$, $l \in [1, L]$;
step S204: updating the value $P_{i,k}[pos_l]$ at position $pos_l$ of the index vector $P_{i,k}$;
wherein $P_{i,k}[pos_l] = P_{i,k}[pos_l] / P_l$;
the initial value of $P_{i,k}[pos_l]$ is $P_{i,k}[l]$, and $P_l$ is the $l$-th prime number in the prime number sequence table $P$, $l \in [1, L]$;
step S205: randomly selecting a natural number $\alpha$ and replacing every value of the index vector $P_{i,k}$ that is still the initial value 1 with $\alpha$ to obtain the final index vector $P_{i,k}$.
2. The method of claim 1, wherein the index vector set $P_i$ composed of the index vectors $P_{i,k}$ is constructed into an index matrix $A_i$, and the index matrix $A_i$ is encrypted by the kNN algorithm, according to the following steps:
step S211: sequentially placing each $P_{i,k}$ of the index vector set in the $k$-th row of the index matrix $A_i$ to obtain an index matrix $A_i$ of size $m_i \times d$, $k \in [1, m_i]$;
step S212: generating two matrices $A_i'$ and $A_i''$ of size $m_i \times d$ from the index matrix $A_i$ according to the following rule:
for each row $A_i[k][*]$ of the index matrix $A_i$: if $s[j] = 0$, then $A_i'[k][j] = A_i''[k][j] = A_i[k][j]$; if $s[j] = 1$, then $A_i'[k][j] + A_i''[k][j] = A_i[k][j]$;
wherein $s[j]$ denotes the $j$-th bit of the one-dimensional matrix $s$ in the key $Sk$, $A_i[k][j]$ denotes the element in row $k$ and column $j$ of the index matrix $A_i$, $A_i'[k][j]$ denotes the element in row $k$ and column $j$ of the matrix $A_i'$, $A_i''[k][j]$ denotes the element in row $k$ and column $j$ of the matrix $A_i''$, and $j \in [1, d]$;
step S213: multiplying the matrices $A_i'$ and $A_i''$ by said $M_1$ and $M_2$, respectively, to obtain the encrypted index matrix $I_i$;
wherein $I_i = (I_i', I_i'') = (A_i' \cdot M_1, A_i'' \cdot M_2)$, $\cdot$ denotes the multiplication of two matrices, and $I_i'$ and $I_i''$ denote intermediate variables, $I_i' = A_i' \cdot M_1$, $I_i'' = A_i'' \cdot M_2$.
3. The method of claim 2, wherein the $d$-dimensional request vector $q_{j,k}$ corresponding to each query keyword is constructed according to the following steps:
step S301: padding the query keyword to a uniform length $L$;
determining the number of letters contained in the query keyword: when it is less than $L$, sequentially selecting the missing letters from the completion alphabet $S$ and appending them after the original letters of the query keyword so that its length equals $L$; otherwise directly executing step S302;
step S302: initializing a $d$-dimensional request vector $q_{j,k}$ and setting all elements of the request vector $q_{j,k}$ to 1, so that $q_{j,k}[l] = 1$, $l \in [1, d]$, $l$ being an integer;
wherein $q_{j,k}[l]$ denotes the element at position $l$ of the request vector $q_{j,k}$;
step S303: selecting the filling positions in the request vector $q_{j,k}$ for the letters contained in the query keyword and updating the values of the request vector $q_{j,k}$;
when the $l$-th character of the query keyword is not the wildcard "*", determining position $pos_l$ of the request vector $q_{j,k}$ as the filling position of the $l$-th letter contained in the query keyword, and letting $q_{j,k}[pos_l] = q_{j,k}[pos_l] \times P_l$; when the $l$-th character of the query keyword is the wildcard "*", multiplying the values at all positions of the request vector $q_{j,k}$ by $P_l$; thereby obtaining the final request vector $q_{j,k}$.
4. A method according to claim 3, characterized in that the request vector q is formed according to the following stepsj,kConstructed as a request matrix Qj,kAnd the request matrix Q is subjected to the matching through the kNN algorithmj,kStructured request matrix setAnd (3) carrying out encryption processing:
step S311: selecting a random number e, wherein e belongs to [2, d ], and e is an integer;
step S312: sequentially reacting q withj,k[l]Put in Qj,k[l][num]In, and is filled with Qj,k[l][*]Middle removing Qj,k[l][num]Other than the elements, each request vector qj,kAll expanded to a request matrix Q of size dXej,k;
Wherein Q isj,k[l][num]Representation matrix Qj,kColumn num of (1), num is a random number and num belongs to [1, e ]],lIs an integer and is e [1, d ∈ ]],Qj,k[l][*]Middle removing Qj,k[l][num]The sum of other elements than gammalSo that gamma isl=tl×Qj,k[l][num];
Wherein t isl0 or tl+1 is a prime number andQj,k[l][*]representing a request matrix Qj,kAll elements of line l;
Step S314: placing the elements of each row of every request matrix $Q_{j,k}$ in the request matrix set, in turn, into an intermediate matrix, thereby obtaining an intermediate matrix of size $d \times (n_j \times e)$;
wherein $k$ is an integer and $k \in [1, n_j]$;
Step S315: generating, from the intermediate matrix, two matrices each of size $d \times (n_j \times e)$, according to the following rule:
for each column of the intermediate matrix, the corresponding entries of the two generated matrices are determined row by row from the key bit $s[i]$: one assignment applies if $s[i] = 0$, and the other applies if $s[i] = 1$;
wherein $s[i]$ denotes the $i$-th bit of the one-dimensional matrix $s$ in the key Sk, the remaining symbols denote the elements in row $i$, column $k$ of the intermediate matrix and of each of the two generated matrices, and $i \in [1, d]$;
Step S316: multiplying the inverse matrices $M_1^{-1}$ and $M_2^{-1}$ of $M_1$ and $M_2$ with the two generated matrices, respectively, to obtain the encrypted request matrix TQ;
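The effect of steps S315 and S316 can be illustrated with the split-and-multiply pattern of the secure kNN technique. The Python sketch below is a toy under explicit assumptions, not the claimed rule: it assumes a query entry is split into two random shares when $s[i] = 0$ and copied when $s[i] = 1$, that the index side uses the complementary rule and is multiplied by $M_1$ and $M_2$, and it hard-codes small invertible matrices `M1` and `M2` with their inverses in place of real keys. All function names are hypothetical.

```python
import random
from typing import List, Tuple

Matrix = List[List[int]]


def split(vec: List[int], s: List[int], split_bit: int,
          rng: random.Random) -> Tuple[List[int], List[int]]:
    """Split vec into two shares: random shares where s[i] == split_bit, copies elsewhere."""
    a, b = [], []
    for v, bit in zip(vec, s):
        if bit == split_bit:
            r = rng.randint(-9, 9)
            a.append(r)
            b.append(v - r)      # the two shares add up to the original entry
        else:
            a.append(v)
            b.append(v)          # both shares equal the original entry
    return a, b


def mat_vec(m: Matrix, v: List[int]) -> List[int]:
    """Matrix times column vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def vec_mat(v: List[int], m: Matrix) -> List[int]:
    """Row vector times matrix."""
    return [sum(v[k] * m[k][j] for k in range(len(v))) for j in range(len(m[0]))]


def dot(a: List[int], b: List[int]) -> int:
    return sum(x * y for x, y in zip(a, b))


# Toy key material (assumed values): bit vector s and two invertible matrices.
S_BITS = [0, 1, 0]
M1 = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
M1_INV = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]
M2 = [[1, 0, 0], [2, 1, 0], [0, 3, 1]]
M2_INV = [[1, 0, 0], [-2, 1, 0], [6, -3, 1]]


def encrypt_query(q: List[int], rng: random.Random):
    q1, q2 = split(q, S_BITS, 0, rng)    # assumed: split where s[i] == 0
    return mat_vec(M1_INV, q1), mat_vec(M2_INV, q2)


def encrypt_index(p: List[int], rng: random.Random):
    p1, p2 = split(p, S_BITS, 1, rng)    # assumed complementary rule
    return vec_mat(p1, M1), vec_mat(p2, M2)


def encrypted_product(enc_p, enc_q) -> int:
    return dot(enc_p[0], enc_q[0]) + dot(enc_p[1], enc_q[1])


rng = random.Random(7)
p, q = [2, 0, 5], [3, 4, 1]
print(encrypted_product(encrypt_index(p, rng), encrypt_query(q, rng)), dot(p, q))  # 11 11
```

Under these assumptions the key matrices cancel in the product and the random shares recombine, so the product of the encrypted vectors equals the plaintext inner product.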
5. The method according to claim 4, wherein in step S4 the product of the encrypted index matrix and the encrypted request matrix is calculated, and the target data ciphertext is determined from the product as follows:
wherein the product of the encrypted index matrix and the encrypted request matrix is a matrix of size $m_i \times (n_j \times e)$;
Step S402: summing every $e$ elements in each row of the product matrix to obtain a result matrix of size $m_i \times n_j$;
for each row $k_1$ of the product matrix: whenever $k_2 \bmod e = 0$, the sum of the $e$ elements ending at column $k_2$ is calculated, and the sums are taken in turn as the elements of row $k_1$ of the result matrix, thereby obtaining a result matrix of size $m_i \times n_j$;
wherein $k_1$ is an integer with $k_1 \in [1, m_i]$, and $k_2$ is an integer with $k_2 \in [1, n_j \times e]$;
Step S403: judging, according to the result matrix, whether the data ciphertext corresponding to the encrypted index matrix $I_i$ is the target data ciphertext:
when the query request contains the logical operator 'AND', if at least one element in each row of the result matrix corresponding to the encrypted index matrix is an integer, the data ciphertext corresponding to that encrypted index matrix is the target data ciphertext;
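The 'AND' test of step S403 can be expressed as a small Python predicate. This is a sketch under the assumption that the result matrix holds exact rationals (`fractions.Fraction`); with floating-point values a tolerance-based integer check would be needed instead. The name `is_target_and` and the sample matrices are hypothetical.

```python
from fractions import Fraction
from typing import List


def is_target_and(result: List[List[Fraction]]) -> bool:
    """True when the matrix is non-empty and every row holds at least one integer."""
    return bool(result) and all(
        any(x.denominator == 1 for x in row) for row in result
    )


match = [[Fraction(6), Fraction(1, 2)], [Fraction(3, 7), Fraction(2)]]
miss = [[Fraction(6), Fraction(1, 2)], [Fraction(3, 7), Fraction(5, 3)]]
print(is_target_and(match), is_target_and(miss))  # True False
```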
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810501660.6A CN108710698B (en) | 2018-05-23 | 2018-05-23 | Multi-keyword fuzzy query method based on ciphertext under cloud environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108710698A (en) | 2018-10-26 |
CN108710698B (en) | 2021-10-15 |
Family
ID=63868538
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810501660.6A Active CN108710698B (en) | 2018-05-23 | 2018-05-23 | Multi-keyword fuzzy query method based on ciphertext under cloud environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108710698B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10984052B2 (en) * | 2018-11-19 | 2021-04-20 | Beijing Jingdong Shangke Information Technology Co., Ltd. | System and method for multiple-character wildcard search over encrypted data |
CN113987144A (en) * | 2021-10-18 | 2022-01-28 | 深圳前海微众银行股份有限公司 | Query method and device for space text |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102314580A (en) * | 2011-09-20 | 2012-01-11 | 西安交通大学 | Vector and matrix operation-based calculation-supported encryption method |
US8521759B2 (en) * | 2011-05-23 | 2013-08-27 | Rovi Technologies Corporation | Text-based fuzzy search |
CN105681280A (en) * | 2015-12-29 | 2016-06-15 | 西安电子科技大学 | Searchable encryption method based on Chinese in cloud environment |
CN106326360A (en) * | 2016-08-10 | 2017-01-11 | 武汉科技大学 | Fuzzy multi-keyword retrieval method of encrypted data in cloud environment |
WO2017222407A1 (en) * | 2016-06-22 | 2017-12-28 | Autonomous Non-Profit Organization For Higher Education "Skolkovo Institute Of Science And Technology" | Two-mode encryption scheme allowing comparison-based indexing |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9101253B2 (en) * | 2012-01-11 | 2015-08-11 | Quirky, Inc. | Apparatus and methods for removing material from a surface |
Non-Patent Citations (1)
Title |
---|
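Wang Yunling (王贇玲); "Searchable Encryption Scheme Based on Fuzzy Keywords in Cloud Computing" (云计算中基于模糊关键字的可搜索加密方案); China Master's Theses Full-text Database, Information Science and Technology; 2016-04-15; pp. I136-280 * |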
Also Published As
Publication number | Publication date |
---|---|
CN108710698A (en) | 2018-10-26 |
Similar Documents
Publication | Title |
---|---|
US10936744B1 (en) | Mathematical method for performing homomorphic operations |
US10812252B2 (en) | String matching in encrypted data |
US5963642A (en) | Method and apparatus for secure storage of data |
US9503432B2 (en) | Secure linkage of databases |
CN107209787B (en) | Improving searching ability of special encrypted data |
US8271796B2 (en) | Apparatus for secure computation of string comparators |
US7519835B2 (en) | Encrypted table indexes and searching encrypted tables |
US7093137B1 (en) | Database management apparatus and encrypting/decrypting system |
US10282448B2 (en) | System and method for searching a symmetrically encrypted database for conjunctive keywords |
US8533489B2 (en) | Searchable symmetric encryption with dynamic updating |
CN110110163A (en) | Safe substring search is with filtering enciphered data |
US20180294952A1 (en) | Method for operating a distributed key-value store |
WO2024077948A1 (en) | Private query method, apparatus and system, and storage medium |
CN109492410B (en) | Data searchable encryption and keyword search method, system, terminal and equipment |
CN108710698B (en) | Multi-keyword fuzzy query method based on ciphertext under cloud environment |
CN103607420A (en) | Safe electronic medical system for cloud storage |
US11829503B2 (en) | Term-based encrypted retrieval privacy |
CN114531220A (en) | Efficient fault-tolerant dynamic phrase searching method based on forward privacy and backward privacy |
Rane et al. | Multi-user multi-keyword privacy preserving ranked based search over encrypted cloud data |
CN108595554B (en) | Multi-attribute range query method based on cloud environment |
CN112100649A (en) | Multi-keyword searchable encryption method and system supporting Boolean access control strategy |
WO2022099893A1 (en) | Data query method, apparatus and system, and data set processing method |
CN113626645A (en) | Hierarchical optimization efficient ciphertext fuzzy retrieval method and related equipment |
JP2001101055A (en) | Data base managing device, data base system, enciphering device and recording medium |
EP2775420A1 (en) | Semantic search over encrypted data |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |