CN109165226B - Searchable encryption method for ciphertext large data set - Google Patents

Info

Publication number
CN109165226B
Authority
CN
China
Prior art keywords
data
file
index
block
ciphertext
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811194140.1A
Other languages
Chinese (zh)
Other versions
CN109165226A (en)
Inventor
周福才
贾强
秦诗悦
张宗烨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/602: Providing cryptographic facilities or services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The invention provides a searchable encryption method for large-scale ciphertext data sets and relates to the technical field of the Internet. The method comprises the following steps. The data owner completes the file uploading process: the original file set F is preprocessed, the ciphertext data is divided equally into N parts and uploaded to the data servers, and the encrypted index is uploaded to an index server S_I. The data owner completes the keyword search process: a search token τ_w for a keyword w is issued to the index server S_I; according to τ_w and the security index DB, S_I computes the data server on which the data for w resides, and the ciphertext data is returned to the data owner. The data owner completes the file downloading process: the data owner downloads the ciphertext data set corresponding to the keyword w and decrypts it with the key to obtain the data file set. The invention optimizes the data structure of the security index and adopts indirect addressing, so that the search time complexity remains within an acceptable range even when the security index is very large.

Description

Searchable encryption method for ciphertext large data set
Technical Field
The invention relates to the technical field of internet, in particular to a searchable encryption method for a ciphertext large-scale data set.
Background
With the rapid development of cloud computing, the cloud storage technology is widely applied, and users gradually migrate data to a cloud server to avoid local huge storage overhead and cumbersome data management and obtain more convenient services. The openness and sharing of the cloud itself also pose a significant challenge to the security of data stored in a distributed environment. In order to ensure data security and user privacy, data is generally stored in a cloud server in a form of a ciphertext. However, after the plaintext data is encrypted into the ciphertext, although confidentiality and security of the data are guaranteed, original characteristics of a plurality of plaintext data are lost, so that keyword search on the ciphertext becomes a difficult problem. The Searchable Encryption (SE) technology is a cryptology primitive developed in recent years and supporting keyword search on a ciphertext, which saves a large amount of computing and network overhead for users, and makes full use of distributed storage and computing power in a cloud environment to search keywords on the ciphertext. With the development of cloud computing, under the application scenarios of massive users and massive data, providing a safe, flexible and efficient SE mechanism will be one of the targets that researchers pursue to the utmost.
In a searchable encryption scheme, the user first encrypts the data with an encryption algorithm and stores the ciphertext on a cloud server. To search, the user sends a keyword trapdoor to the cloud server; the server tests each file against the received trapdoor, and a successful match indicates that the file contains the keyword. Finally, the server returns the matching file ciphertexts, and the user only needs to decrypt the returned files. In terms of security, the cloud server learns nothing about the searched keyword or the plaintext beyond the access pattern, the search pattern, the file ciphertexts, the ciphertext sizes, the number of files, and similar leakage.
Although most indexes in current symmetric searchable encryption schemes achieve optimal search time in theory, their performance on large data sets is not ideal. I/O latency, storage utilization, and distributed storage of the data set all reduce the practical performance of symmetric searchable encryption. For large-scale data sets the constructed security index becomes very large, and searching by matching keywords sequentially against the security index is a major reason for low search efficiency in practice.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a searchable encryption method for ciphertext large data sets, which optimizes the storage structure of the security index by the idea of performing indirect addressing on indexes in a hierarchical manner in the security index generation algorithm, so that a good time complexity is still maintained when the security index is too large.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows: a searchable encryption method for ciphertext large data sets comprises the following specific steps:
Step 1: the data owner completes the file uploading process at the client. The data owner first preprocesses the original file set F; preprocessing comprises generating ciphertext data by symmetric encryption, performing semantic analysis on the original file set, extracting keywords, constructing an inverted index over the keywords, and generating the security index DB. After preprocessing, the data owner divides the ciphertext data equally into N parts, uploads the N parts to the data servers
Figure BDA0001828304830000023
and uploads the encrypted index to the index server S_I;
Step 2: the data owner completes the keyword search process; the process comprises searching and file updating;
The file updating process comprises file addition and file deletion. After files are added or deleted, a search for the keyword w becomes a search over D + D_add - D_del, where D is the dictionary that excludes file additions and deletions, D_add is the dictionary for added files, and D_del is the dictionary for deleted files; the search results of the three parts are merged and the resulting ciphertext data set is returned;
In the search process, the data owner sends a search request for the keyword w to the index server S_I and provides S_I with the search token τ_w of w; according to τ_w and the security index DB, S_I computes the data server on which w resides
Figure BDA0001828304830000024
where 1 ≤ ν ≤ N;
Figure BDA0001828304830000025
returns to the data owner the ciphertext data set corresponding to τ_w
Figure BDA0001828304830000026
where
Figure BDA00018283048300000210
is the total number of ciphertext data items;
Step 3: the data owner completes the file downloading process at the client. During file download, the data owner downloads the ciphertext data set corresponding to the keyword w
Figure BDA0001828304830000027
decrypts it using the key
Figure BDA0001828304830000028
and obtains the data file set containing w
Figure BDA0001828304830000029
Step 1 comprises the following substeps:
Step 1.1: the data owner generates a key K at the client through an initialization algorithm, and then generates ciphertext data c by symmetric encryption;
Step 1.1.1: input a security parameter k, where k ∈ {0,1}^k;
Step 1.1.2: use a pseudo-random number generator
Figure BDA0001828304830000021
to generate 3 random numbers K_1, K_2, K_3 as keys for the pseudo-random function PRF;
where the pseudo-random function PRF is represented as
PRF: {0,1}^k × {0,1}^* → {0,1}^k
Step 1.1.2.1: input the client key K ∈ {0,1}^k and the keyword w ∈ {0,1}^*, and output the encryption keys K_1 ∈ {0,1}^k and K_2 ∈ {0,1}^k generated for w;
Step 1.1.2.2: compute K_3 ← SKE.Gen(1^k) to obtain the key of the symmetric encryption algorithm, which is used to encrypt the original file set F; SKE = (Gen, Enc, Dec) is a symmetric encryption scheme, where Gen denotes the key generation algorithm, Enc the encryption algorithm, and Dec the decryption algorithm;
Step 1.1.3: output K = (K_1, K_2, K_3) as the key;
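The key generation of step 1.1 can be sketched as follows. This is a minimal illustration, not the patented construction: HMAC-SHA256 stands in for the PRF, the function names `setup`, `prf` and `keyword_keys` are hypothetical, and the one-byte prefixes used to separate K_1 from K_2 are an assumption that the text does not specify.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def setup(k_bits: int = 256) -> Tuple[bytes, bytes]:
    """Return (K, K3): a master PRF key and a file-encryption key of k bits each."""
    n = k_bits // 8
    return secrets.token_bytes(n), secrets.token_bytes(n)


def prf(key: bytes, msg: bytes) -> bytes:
    """PRF: {0,1}^k x {0,1}^* -> {0,1}^k, instantiated here with HMAC-SHA256 (assumption)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def keyword_keys(K: bytes, w: str) -> Tuple[bytes, bytes]:
    """Derive the per-keyword keys (K1, K2); the 0x01/0x02 prefixes are an assumed domain separation."""
    data = w.encode("utf-8")
    return prf(K, b"\x01" + data), prf(K, b"\x02" + data)


K, K3 = setup()
K1, K2 = keyword_keys(K, "cloud")
```

Because the derivation is deterministic, the client can recompute K_1 and K_2 for any keyword from K alone and does not need to store per-keyword state.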
Step 1.2: file encryption: input the original file set F and encrypt F into ciphertext data c using a symmetric encryption algorithm;
Step 1.2.1: input the original file set F;
Step 1.2.2: for each file F_η in F, execute
Figure BDA0001828304830000032
where 0 < η ≤ |F|, to generate the ciphertext c_η, c_η ∈ c;
Step 1.2.3: divide c into N parts and send them to the data servers
Figure BDA0001828304830000033
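Step 1.2 can be illustrated with the sketch below. The cipher is a toy stand-in for SKE.Enc (a SHA-256 counter keystream XORed with the plaintext) chosen only so the example runs with the standard library; it is not suitable for real use and is not the cipher of the patent. The names `toy_encrypt` and `split_evenly` are hypothetical.

```python
import hashlib
import secrets
from typing import List


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stand-in for SKE.Enc: XOR with a SHA-256 counter keystream under a fresh nonce."""
    nonce = secrets.token_bytes(16)
    stream = bytearray()
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return nonce + body


def split_evenly(items: List[bytes], n: int) -> List[List[bytes]]:
    """Split items into n consecutive parts whose sizes differ by at most one."""
    q, r = divmod(len(items), n)
    parts, start = [], 0
    for i in range(n):
        size = q + (1 if i < r else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


K3 = secrets.token_bytes(32)
files = [b"file-%d" % i for i in range(7)]
c = [toy_encrypt(K3, f) for f in files]   # ciphertext set c
parts = split_evenly(c, 3)                # one part per data server, N = 3
```

With 7 ciphertexts and N = 3, the parts hold 3, 2 and 2 ciphertexts; the text only requires an equal division, so how a remainder is distributed is a choice made here.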
Step 1.3: the data owner performs semantic analysis on F, extracts the keywords w, constructs an inverted index over w, and generates the security index DB. After DB is classified, an array A and a block Block storing the block information are obtained; a list L is created, the block and the encryption tag generated with K are stored in L, and L is uploaded to S_I; D ← Create(L) is executed to generate a dictionary D, and K, D and A are output. The specific steps are as follows:
Step 1.3.1: generate an array A for storing the data of the first blocking of the inverted index and a list L for storing the pointers of the second blocking;
Step 1.3.2: for each keyword w, derive the keys K_1 and K_2 from the client key K and w using the PRF (see step 1.1.2.1);
Step 1.3.3: determine the security index DB and the security index blocking parameters B and b, and divide DB(w) into three classes, Small, Medium and Large, according to the inverted index length |DB(w)| of the keyword w:
Figure BDA0001828304830000031
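The classification formula itself is an image that is not reproduced here. The sketch below therefore uses thresholds inferred from the bounds stated in steps 1.3.3.1 to 1.3.3.3 (Small fits in one block of size b, Medium needs at most b blocks of size B, Large needs at most Bb blocks of size B); these thresholds and the function name `classify` are assumptions.

```python
def classify(n: int, B: int, b: int) -> str:
    """Classify an inverted-index list of length n = |DB(w)| as Small, Medium or Large.

    Thresholds are inferred from the step descriptions, not taken from the formula image.
    """
    if n <= 0:
        raise ValueError("the inverted index list must be non-empty")
    num_blocks = -(-n // B)  # ceil(n / B), the number of B-sized blocks
    if n <= b:
        return "Small"
    if num_blocks <= b:
        return "Medium"
    if num_blocks <= B * b:
        return "Large"
    raise ValueError("list too long for two levels of indirection")


for length in (2, 8, 9, 32):
    print(length, classify(length, B=4, b=2))
```

With B = 4 and b = 2, lengths up to 2 are Small, lengths 3 to 8 are Medium, and lengths 9 to 32 are Large under these assumed thresholds.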
Step 1.3.3.1: when the security index
Figure BDA00018283048300000311
take the number of blocks Num_BS = 1, that is, no blocking is needed; when |DB(w)| < b, pad DB(w) with random data up to size b, and denote the block as Block_S; execute
Figure BDA0001828304830000034
L ← (α, β), and upload L to S_I;
Step 1.3.3.2: when the security index
Figure BDA00018283048300000310
take the number of blocks
Figure BDA0001828304830000035
with Num_BM ≤ b; when the last block is smaller than B, pad it to size B;
for each block BM_i, 1 ≤ i ≤ Num_BM, compute its tag using symmetric encryption
Figure BDA0001828304830000036
store
Figure BDA0001828304830000037
at a random location in S_I, denote its pointer as
Figure BDA0001828304830000038
and obtain the pair
Figure BDA0001828304830000039
where 1 ≤ i ≤ Num_BM; this process in effect performs one indirect addressing operation;
create an array A and write
Figure BDA00018283048300000416
into A; partition A by size b, and take the number of partitions
Figure BDA00018283048300000417
since Num_BM ≤ b, Num_bM = 1; denote this block as Block_M; if the block is smaller than b, pad it with random data up to size b; execute
Figure BDA0001828304830000041
Figure BDA0001828304830000042
Figure BDA0001828304830000043
L ← (α, β)
and upload L to S_I;
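The Medium case (one level of indirect addressing) can be sketched as follows. Encryption of blocks and tags is omitted, a fixed marker `PAD` replaces the random padding described in the text, and the server storage is modelled as a Python list; `chunk_and_pad` and `build_medium` are hypothetical names used only for illustration.

```python
import random
from typing import List, Optional

PAD = -1  # assumed padding marker; the text pads with random data instead


def chunk_and_pad(items: List[int], size: int) -> List[List[int]]:
    """Cut a non-empty list into blocks of `size`, padding the last block to full size."""
    blocks = [list(items[i:i + size]) for i in range(0, len(items), size)]
    blocks[-1] += [PAD] * (size - len(blocks[-1]))
    return blocks


def build_medium(ids: List[int], B: int, b: int,
                 store: List[Optional[List[int]]],
                 rng: random.Random) -> List[int]:
    """Place B-sized blocks at random free slots of `store`; return the b-sized pointer block."""
    if not ids:
        raise ValueError("empty list")
    blocks = chunk_and_pad(ids, B)
    if len(blocks) > b:
        raise ValueError("too many blocks for the Medium class")
    free = [i for i, v in enumerate(store) if v is None]
    slots = rng.sample(free, len(blocks))      # random placement on the index server
    for slot, block in zip(slots, blocks):
        store[slot] = block
    return slots + [PAD] * (b - len(slots))    # pointer array A, padded to size b


rng = random.Random(0)
store: List[Optional[List[int]]] = [None] * 8
block_m = build_medium([10, 11, 12, 13, 14], B=2, b=4, store=store, rng=rng)
```

Here `block_m` plays the role of Block_M: it holds Num_BM pointers, and following each pointer into `store` yields one B-sized block of file identifiers.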
Step 1.3.3.3: when the security index
Figure BDA00018283048300000418
take the number of blocks
Figure BDA00018283048300000419
with b < Num_BL ≤ Bb;
after the array A has been computed, the Num_BL entries in A are further partitioned into blocks of size B, which gives a second level of indirect addressing; the number of blocks is
Figure BDA0001828304830000044
with Num_BL' ≤ b; if the last block is smaller than B, pad it with random numbers up to size B;
for each block BL_j', 1 ≤ j ≤ Num_BL', compute its tag using a symmetric encryption algorithm
Figure BDA0001828304830000045
store
Figure BDA0001828304830000046
at a random location in S_I, denote its pointer as
Figure BDA0001828304830000047
and obtain the pair
Figure BDA0001828304830000048
where 1 ≤ j ≤ Num_BL';
create an array A and write
Figure BDA0001828304830000049
into A; partition A by size b, and take the number of partitions
Figure BDA00018283048300000410
since Num_BL' ≤ b, Num_bM = 1; denote this block as Block_L; if the block is smaller than b, pad it with random data up to size b; execute
Figure BDA00018283048300000411
Figure BDA00018283048300000412
Figure BDA00018283048300000413
Figure BDA00018283048300000414
Figure BDA00018283048300000415
L ← (α, β)
and upload L to S_I;
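The Large case adds a second level of indirect addressing on top of the pointer array. The sketch below illustrates the layout under stated assumptions: no encryption, a fixed `PAD` marker instead of random padding, non-negative integer file identifiers, and a Python list as the server storage. The helper names `store_blocks`, `build_large` and `read_large` are hypothetical.

```python
import random
from typing import List, Optional

PAD = -1  # assumed padding marker; identifiers and slot numbers are non-negative


def store_blocks(items: List[int], size: int,
                 store: List[Optional[List[int]]],
                 rng: random.Random) -> List[int]:
    """Cut items into padded blocks of `size`, place each at a random free slot, return the slots."""
    slots = []
    for i in range(0, len(items), size):
        block = list(items[i:i + size])
        block += [PAD] * (size - len(block))
        free = [j for j, v in enumerate(store) if v is None]
        slot = rng.choice(free)
        store[slot] = block
        slots.append(slot)
    return slots


def build_large(ids: List[int], B: int, b: int,
                store: List[Optional[List[int]]],
                rng: random.Random) -> List[int]:
    """Two-level layout: data blocks, then blocks of pointers, then one top block of size b."""
    num_bl = -(-len(ids) // B)        # Num_BL = ceil(|DB(w)| / B)
    if not (b < num_bl <= B * b):
        raise ValueError("list length is outside the Large class")
    level1 = store_blocks(ids, B, store, rng)      # pointers to data blocks
    level2 = store_blocks(level1, B, store, rng)   # Num_BL' pointers to pointer blocks
    return level2 + [PAD] * (b - len(level2))      # Block_L, padded to size b


def read_large(top: List[int], store: List[Optional[List[int]]]) -> List[int]:
    """Follow both pointer levels and return the identifiers in their original order."""
    ids = []
    for p2 in top:
        if p2 == PAD:
            continue
        for p1 in store[p2]:
            if p1 != PAD:
                ids.extend(x for x in store[p1] if x != PAD)
    return ids


rng = random.Random(1)
store: List[Optional[List[int]]] = [None] * 16
block_l = build_large(list(range(100, 120)), B=3, b=3, store=store, rng=rng)
```

For 20 identifiers with B = 3 and b = 3 this stores 7 data blocks and 3 pointer blocks, so a read touches 1 + 3 + 7 blocks instead of scanning the whole index.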
Step 1.3.4: after L is uploaded to S_I, execute D ← Create(L) to generate the dictionary D;
Step 1.3.5: output K, D and A;
Step 1.4: generate the trapdoor τ_w corresponding to the keyword w, which mainly comprises the following steps:
Step 1.4.1: input the search keyword w;
Step 1.4.2: execute
Figure BDA0001828304830000051
where τ_1, τ_2 ∈ τ_w;
Step 1.4.3: upload (τ_1, τ_2) to S_I as the search token for w.
Step 2 comprises the following substeps:
Step 2.1: input (τ_1, τ_2) and DB;
Step 2.2: execute
Figure BDA0001828304830000052
to obtain the classification of DB(w);
Step 2.2.1: when
Figure BDA00018283048300000523
execute
Figure BDA0001828304830000053
Step 2.2.2: when
Figure BDA0001828304830000054
execute
Figure BDA0001828304830000055
Figure BDA0001828304830000056
Figure BDA0001828304830000057
Step 2.2.3: when
Figure BDA0001828304830000058
execute
Figure BDA0001828304830000059
Figure BDA00018283048300000510
Figure BDA00018283048300000511
Figure BDA00018283048300000512
Figure BDA00018283048300000524
Figure BDA00018283048300000513
Step 2.2.4: output the ciphertext data set corresponding to (τ_1, τ_2)
Figure BDA00018283048300000514
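Since the search formulas are images that are not reproduced, the following is only a structural sketch of step 2.2: the server looks up the token in the dictionary, learns the class, and follows zero, one or two levels of pointers. Decryption with τ_2 is omitted, and the dictionary layout `D[τ_1] = (class, block)`, the `PAD` marker and the function name `search` are assumptions.

```python
from typing import Dict, List, Tuple

PAD = -1  # assumed padding marker


def search(tau1: str,
           D: Dict[str, Tuple[str, List[int]]],
           store: List[List[int]]) -> List[int]:
    """Resolve a token to file identifiers by following 0, 1 or 2 levels of pointers."""
    if tau1 not in D:
        return []
    kind, block = D[tau1]
    depth = {"Small": 0, "Medium": 1, "Large": 2}[kind]
    current = [x for x in block if x != PAD]
    for _ in range(depth):
        nxt: List[int] = []
        for ptr in current:
            nxt.extend(x for x in store[ptr] if x != PAD)
        current = nxt
    return current


# Hand-built example: slots 0 and 1 hold data blocks, slot 2 holds a pointer block.
store = [[7, 8], [9, PAD], [0, 1]]
D = {
    "s": ("Small", [5, PAD]),
    "m": ("Medium", [0, 1, PAD]),
    "l": ("Large", [2, PAD]),
}
print(search("m", D, store))
```

The number of blocks read depends on the length of the matching list and on B and b, not on the total size of the security index.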
The step 3 comprises the following steps:
step 3.1: file decryption, input of ciphertext data set
Figure BDA00018283048300000515
Using a symmetric encryption algorithm
Figure BDA00018283048300000516
Reverting to a fileset containing w
Figure BDA00018283048300000517
Step 3.1.1: inputting ciphertext data
Figure BDA00018283048300000518
Step 3.1.2: for ciphertext
Figure BDA00018283048300000519
Execute
Figure BDA00018283048300000520
Restoring corresponding data
Figure BDA00018283048300000521
Step 3.1.3: outputting a data file set comprising w
Figure BDA00018283048300000522
File addition in step 2: at the client, input the original file set F_add to be added and execute Enc_K(F_add) to generate the ciphertext c_add, which is uploaded to S_F; input the keyword set W_add to be added and the inverted index set DB(W_add); execute Enc_K(W_add, DB(W_add)) to generate L_add, which is uploaded to S_I; output K, D_add, A_add;
File deletion: at the client, input the original file set F_del to be deleted, extract the keyword set W_del of F_del, and generate the inverted index set DB(W_del); execute Enc_K(W_del, DB(W_del)) to generate L_del, which is uploaded to S_I; output K, D_del, A_del.
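The way a search combines D, D_add and D_del after updates can be sketched as follows. For clarity the three dictionaries map a plaintext keyword directly to file identifiers, which is a simplification of the encrypted structures; the function name `search_with_updates` is hypothetical.

```python
from typing import Dict, List


def search_with_updates(w: str,
                        D: Dict[str, List[int]],
                        D_add: Dict[str, List[int]],
                        D_del: Dict[str, List[int]]) -> List[int]:
    """Return the result for w over D + D_add - D_del, keeping first-seen order."""
    deleted = set(D_del.get(w, []))
    seen = set()
    result = []
    for fid in D.get(w, []) + D_add.get(w, []):
        if fid not in deleted and fid not in seen:
            seen.add(fid)
            result.append(fid)
    return result


D = {"cloud": [1, 2, 3]}
D_add = {"cloud": [4, 2]}
D_del = {"cloud": [3]}
print(search_with_updates("cloud", D, D_add, D_del))
```

Keeping additions and deletions in separate dictionaries means the original index D does not have to be rebuilt on every update, at the cost of three lookups per search.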
The beneficial effects of the above technical solution are as follows: the invention provides a searchable encryption method for large ciphertext data sets that optimizes the data structure of the security index by partitioning it into blocks and, depending on the size of the security index, addresses it directly or indirectly during keyword search. This overcomes the drawback of traditional searchable encryption schemes, which must traverse the whole security index. Once the security index grows beyond a certain threshold, the search time no longer increases linearly but sub-linearly, which improves keyword search efficiency.
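The list lengths that each class can hold follow from the bounds stated in steps 1.3.3.1 to 1.3.3.3. The derivation below is a reading of those bounds, not a formula taken from the patent; in particular the Small bound is inferred from the padding rule for |DB(w)| < b.

```latex
\begin{align*}
\text{Small:}  &\quad |DB(w)| \le b \\
\text{Medium:} &\quad Num_{BM} = \left\lceil \frac{|DB(w)|}{B} \right\rceil \le b
                \;\Rightarrow\; |DB(w)| \le Bb \\
\text{Large:}  &\quad Num_{BL} = \left\lceil \frac{|DB(w)|}{B} \right\rceil \le Bb
                \;\Rightarrow\; |DB(w)| \le B^{2}b
\end{align*}
```

For example, with the illustrative values B = 100 and b = 10, a Small list holds up to 10 entries, a Medium list up to 1,000, and a Large list up to 100,000.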
Drawings
Fig. 1 is a schematic diagram of the system model of a searchable encryption method for large ciphertext data sets according to an embodiment of the present invention;
Fig. 2 is a diagram illustrating the keyword-document inverted index structure according to an embodiment of the present invention;
Fig. 3 is a graph of the relationship between search time and security index size for a searchable encryption method for large ciphertext data sets according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
As shown in fig. 1, the method of the present embodiment is as follows.
The searchable encryption method for large ciphertext data sets involves three types of entities: the data owner (holding the original file set, the security index, the keyword trapdoors and the key), the index server (holding the security index), and the data servers (holding the encrypted data set). First, the data owner encrypts the original file set locally and uploads the encrypted files and the security index to the data servers and the index server respectively. To perform a keyword search, the data owner sends a keyword search request to the index server; the index server then uses the security index to find the data server holding the ciphertext that corresponds to the search keyword; finally, the data server returns the search results to the data owner.
The method comprises a key generation algorithm Setup(k), a file encryption algorithm Enc_K(F), a security index generation algorithm Enc_K(W, DB(W)), a trapdoor generation algorithm SToken_K(w), a search algorithm Search((τ_1, τ_2), I), an update algorithm Update_K(add, del), and a file decryption algorithm Dec_K(c). The specific steps are as follows:
Step 1: the data owner completes the file uploading process at the client. The data owner first preprocesses the original file set F; preprocessing comprises generating ciphertext data by symmetric encryption, performing semantic analysis on the original file set, extracting keywords, constructing an inverted index over the keywords, and generating the security index DB. After preprocessing, the data owner divides the ciphertext data equally into N parts, uploads the N parts to the data servers
Figure BDA0001828304830000072
and uploads the encrypted index to the index server S_I;
Step 2: the data owner completes the keyword search process; the process comprises searching and file updating;
The file updating process comprises file addition and file deletion. After files are added or deleted, a search for the keyword w becomes a search over D + D_add - D_del, where D is the dictionary that excludes file additions and deletions, D_add is the dictionary for added files, and D_del is the dictionary for deleted files; the search results of the three parts are merged and the resulting ciphertext data set is returned;
In the search process, the data owner sends a search request for the keyword w to the index server S_I and provides S_I with the search token τ_w of w; according to τ_w and the security index DB, S_I computes the data server on which w resides
Figure BDA0001828304830000073
where 1 ≤ ν ≤ N;
Figure BDA0001828304830000074
returns to the data owner the ciphertext data set corresponding to τ_w
Figure BDA0001828304830000075
where
Figure BDA0001828304830000076
is the total number of ciphertext data items;
Step 3: the data owner completes the file downloading process at the client. During file download, the data owner downloads the ciphertext data set corresponding to the keyword w
Figure BDA0001828304830000077
decrypts it using the key
Figure BDA0001828304830000078
and obtains the data file set containing w
Figure BDA0001828304830000079
Step 1 comprises the following substeps:
Step 1.1: the data owner generates a key K at the client through an initialization algorithm, and then generates ciphertext data c by symmetric encryption;
Step 1.1.1: input a security parameter k, where k ∈ {0,1}^k;
Step 1.1.2: use a pseudo-random number generator
Figure BDA0001828304830000071
to generate 3 random numbers K_1, K_2, K_3 as keys for the pseudo-random function PRF;
where the pseudo-random function PRF is represented as PRF: {0,1}^k × {0,1}^* → {0,1}^k;
Step 1.1.2.1: input the client key K ∈ {0,1}^k and the keyword w ∈ {0,1}^*, and output the encryption keys K_1 ∈ {0,1}^k and K_2 ∈ {0,1}^k generated for w;
Step 1.1.2.2: compute K_3 ← SKE.Gen(1^k) to obtain the key of the symmetric encryption algorithm, which is used to encrypt the original file set F; SKE = (Gen, Enc, Dec) is a symmetric encryption scheme, where Gen denotes the key generation algorithm, Enc the encryption algorithm, and Dec the decryption algorithm;
Step 1.1.3: output K = (K_1, K_2, K_3) as the key;
In the embodiment, no third party is involved other than the client (the data owner) and the servers (the index server and the data servers); the client key is generated at the client by the initialization algorithm, so no key distribution process is required. However, if the client key is lost, the client can no longer interact with the servers and cannot retrieve previously uploaded documents, and if the key is stolen the client's data can be compromised.
Step 1.2: file encryption: input the original file set F and encrypt F into ciphertext data c using a symmetric encryption algorithm;
Step 1.2.1: input the original file set F;
Step 1.2.2: for each file F_η in F, execute
Figure BDA0001828304830000081
where 0 < η ≤ |F|, to generate the ciphertext c_η, c_η ∈ c;
Step 1.2.3: divide c into N parts and send them to the data servers
Figure BDA0001828304830000082
Step 1.3: the data owner performs semantic analysis on F, extracts the keywords w, constructs an inverted index over w, and generates the security index DB. After DB is classified, an array A and a block Block storing the block information are obtained; a list L is created, the block and the encryption tag generated with K are stored in L, and L is uploaded to S_I; D ← Create(L) is executed to generate a dictionary D, and K, D and A are output;
In this embodiment, for the global search scenario, a study of the security index generation process of traditional searchable encryption schemes shows that the search time of a traditional scheme is dominated by traversal of the security index. Reducing the time complexity therefore requires reducing the traversal time, and one way to do so is to optimize the storage structure of the security index. The traditional security index generation algorithm is modified so that each ciphertext encrypts several identifiers. Specifically, a block size B is fixed; when the result list is constructed, B identifiers are processed at a time, the last block of identifiers is padded to the same length, each block is encapsulated into a ciphertext d, and the same tag is used. The search process is otherwise unchanged, except that the server decrypts and parses the results block by block rather than one identifier at a time.
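Packing B identifiers into one fixed-length block, so that every block has the same size before encryption, can be sketched as follows. The 32-bit big-endian encoding, the sentinel value used for padding and the function names are assumptions made for illustration; the encryption of the packed block into the ciphertext d is omitted.

```python
import struct
from typing import List

SENTINEL = 0xFFFFFFFF  # assumed padding value; real identifiers must be smaller


def pack_block(ids: List[int], B: int) -> bytes:
    """Encode up to B identifiers as a fixed-length block of 4*B bytes."""
    if len(ids) > B:
        raise ValueError("a block holds at most B identifiers")
    padded = list(ids) + [SENTINEL] * (B - len(ids))
    return struct.pack(">%dI" % B, *padded)


def unpack_block(blob: bytes) -> List[int]:
    """Decode a block and drop the padding entries."""
    n = len(blob) // 4
    return [x for x in struct.unpack(">%dI" % n, blob) if x != SENTINEL]


full = pack_block([1, 2, 3, 4], B=4)
last = pack_block([5, 6], B=4)   # the final, partially filled block
```

Both blocks are 16 bytes long, so the length of a block does not reveal how many real identifiers it contains.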
To reduce the time needed to search the security index, an additional level of indexing is added: the inverted index is partitioned into blocks of size B, and a tag is extracted from each block for use in searching. If the information is divided into t blocks in total, one lookup finds the block in which the keyword is located, and the corresponding file information can then be found. This is the first blocking; the blocked data is then blocked again by size b. As in the first blocking, the tags of the blocks extracted at this stage are stored in L. As shown in Fig. 2, the specific steps are as follows:
Step 1.3.1: generate an array A for storing the data of the first blocking of the inverted index and a list L for storing the pointers of the second blocking;
Step 1.3.2: for each keyword w, derive the keys K_1 and K_2 from the client key K and w using the PRF (see step 1.1.2.1);
Step 1.3.3: determine the security index DB and the security index blocking parameters B and b, and divide DB(w) into three classes, Small, Medium and Large, according to the inverted index length |DB(w)| of the keyword w:
Figure BDA00018283048300000913
Step 1.3.3.1: when the security index
Figure BDA0001828304830000091
take the number of blocks Num_BS = 1, that is, no blocking is needed; when |DB(w)| < b, pad DB(w) with random data up to size b, and denote the block as Block_S; execute
Figure BDA0001828304830000092
and upload L to S_I;
Step 1.3.3.2: when the security index
Figure BDA0001828304830000093
take the number of blocks
Figure BDA0001828304830000094
with Num_BM ≤ b; when the last block is smaller than B, pad it to size B;
for each block BM_i, 1 ≤ i ≤ Num_BM, compute its tag using symmetric encryption
Figure BDA00018283048300000915
store
Figure BDA00018283048300000914
at a random location in S_I, denote its pointer as
Figure BDA0001828304830000095
and obtain the pair
Figure BDA0001828304830000096
where 1 ≤ i ≤ Num_BM; this process in effect performs one indirect addressing operation;
create an array A and write
Figure BDA0001828304830000097
into A; partition A by size b, and take the number of partitions
Figure BDA0001828304830000098
since Num_BM ≤ b, Num_bM = 1; denote this block as Block_M; if the block is smaller than b, pad it with random data up to size b; execute
Figure BDA0001828304830000099
Figure BDA00018283048300000910
Figure BDA00018283048300000911
L ← (α, β)
and upload L to S_I;
Step 1.3.3.3: when the security index
Figure BDA00018283048300000912
take the number of blocks
Figure BDA00018283048300000916
with b < Num_BL ≤ Bb;
after the array A has been computed, the Num_BL entries in A are further partitioned into blocks of size B, which gives a second level of indirect addressing; the number of blocks is
Figure BDA00018283048300000917
with Num_BL' ≤ b; if the last block is smaller than B, pad it with random numbers up to size B;
for each block BL_j', 1 ≤ j ≤ Num_BL', compute its tag using a symmetric encryption algorithm
Figure BDA00018283048300000919
store
Figure BDA00018283048300000918
at a random location in S_I, denote its pointer as
Figure BDA0001828304830000101
and obtain the pair
Figure BDA0001828304830000102
where 1 ≤ j ≤ Num_BL';
create an array A and write
Figure BDA0001828304830000103
into A; partition A by size b, and take the number of partitions
Figure BDA0001828304830000104
since Num_BL' ≤ b, Num_bM = 1; denote this block as Block_L; if the block is smaller than b, pad it with random data up to size b; execute
Figure BDA0001828304830000105
Figure BDA0001828304830000106
Figure BDA0001828304830000107
Figure BDA0001828304830000108
Figure BDA0001828304830000109
L ← (α, β)
and upload L to S_I;
Step 1.3.4: after L is uploaded to S_I, execute D ← Create(L) to generate the dictionary D;
Step 1.3.5: output K, D and A;
Step 1.4: generate the trapdoor τ_w corresponding to the keyword w, which mainly comprises the following steps:
Step 1.4.1: input the search keyword w;
Step 1.4.2: execute
Figure BDA00018283048300001010
where τ_1, τ_2 ∈ τ_w;
Step 1.4.3: upload (τ_1, τ_2) to S_I as the search token for w.
Step 2 comprises the following substeps:
Step 2.1: input (τ_1, τ_2) and DB;
Step 2.2: execute
Figure BDA00018283048300001011
to obtain the classification of DB(w);
Step 2.2.1: when
Figure BDA00018283048300001012
execute
Figure BDA00018283048300001013
Step 2.2.2: when
Figure BDA00018283048300001014
execute
Figure BDA00018283048300001015
Figure BDA00018283048300001016
Figure BDA00018283048300001017
Step 2.2.3: when
Figure BDA00018283048300001018
execute
Figure BDA0001828304830000111
Figure BDA0001828304830000112
Figure BDA0001828304830000113
Figure BDA0001828304830000114
Figure BDA0001828304830000115
Figure BDA0001828304830000116
Step 2.2.4: output the ciphertext data set corresponding to (τ_1, τ_2)
Figure BDA00018283048300001115
The step 3 comprises the following steps:
step 3.1: file decryption, input of ciphertext data set
Figure BDA0001828304830000117
Using a symmetric encryption algorithm
Figure BDA0001828304830000118
Reverting to a fileset containing w
Figure BDA0001828304830000119
Step 3.1.1: inputting ciphertext data
Figure BDA00018283048300001110
Step 3.1.2: for ciphertext
Figure BDA00018283048300001111
Execute
Figure BDA00018283048300001112
Restoring corresponding data
Figure BDA00018283048300001113
Step 3.1.3: outputting a data file set comprising w
Figure BDA00018283048300001114
File addition in step 2: at the client, input the original file set F_add to be added, execute Enc_K(F_add) to generate the ciphertext c_add, and upload it to S_F; input the keyword set W_add to be added and the inverted index set DB(W_add); execute Enc_K(W_add, DB(W_add)) to generate L_add and upload it to S_I; output K, D_add, A_add.
File deletion: at the client, input the original file set F_del to be deleted, extract the keyword set W_del of F_del, and generate the inverted index set DB(W_del); execute Enc_K(W_del, DB(W_del)) to generate L_del and upload it to S_I; output K, D_del, A_del.
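How a search might combine the base, added, and deleted dictionaries can be sketched in Python. This is an illustration under assumptions: the indexes are plaintext mappings from keyword to file identifiers, "+" is read as set union and "−" as set difference, and the name `search_with_updates` is hypothetical.

```python
from typing import Dict, Set

Index = Dict[str, Set[str]]


def search_with_updates(w: str, base: Index, added: Index, deleted: Index) -> Set[str]:
    """Merge the three partial results for keyword w: (base ∪ added) minus deleted."""
    hits = base.get(w, set()) | added.get(w, set())
    return hits - deleted.get(w, set())
```

Keeping additions and deletions in separate dictionaries means the base index does not have to be rebuilt on every update; only the merge at query time changes.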
In this embodiment, keyword search performance is studied by measuring the search time over secure indexes of different sizes. The size of the secure index reflects, to a certain extent, the number of mapping pairs between keywords and file information: the larger the secure index, the more mapping pairs it contains.
This embodiment uses part of the news data generated from June to July 2012, provided by a laboratory, together with the public mail data of a company. With the same set of search keywords, search queries are run against secure indexes of 5 sizes, as shown in Table 1:
TABLE 1 Secure index classification
Classification    A      B      C      D      E
Size (Kb)         100    200    300    400    500
In this embodiment, the traditional scheme applies no storage-structure processing to the secure index and traverses it directly during search; in the scheme proposed by the invention, the secure index is organized into a block structure once before it is stored, and the search is completed through this block storage structure. Comparing the keyword search times across the five secure index sizes shows that, relative to the traditional method of traversing the whole secure index, the method has no obvious advantage when the secure index is small, and its search time may even be slightly higher; as the secure index grows, however, the advantage of the scheme gradually appears, and its search time becomes increasingly shorter than that of the traditional method, as shown in Fig. 3.
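The block-structure processing mentioned above can be sketched in Python as splitting a keyword's identifier list into fixed-size blocks and padding the last one with random data. This is a simplified illustration: the helper name `split_into_blocks` and the 8-byte identifier length are assumptions, and the encryption of each block is omitted.

```python
import secrets
from typing import List, Sequence


def split_into_blocks(ids: Sequence[bytes], block_size: int, id_len: int = 8) -> List[List[bytes]]:
    """Split identifiers into blocks of block_size entries, padding the last block.

    Padding entries are random bytes of length id_len (an assumed fixed
    identifier length), so every block has the same size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = [list(ids[i:i + block_size]) for i in range(0, len(ids), block_size)]
    if not blocks:
        blocks = [[]]  # an empty list still yields one fully padded block
    missing = block_size - len(blocks[-1])
    blocks[-1].extend(secrets.token_bytes(id_len) for _ in range(missing))
    return blocks
```

Equal-sized blocks are what allow a pointer to identify a block by position alone; in this plaintext toy the padding is distinguishable only by bookkeeping, whereas in the described scheme the blocks are stored encrypted.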
The invention optimizes the data structure of the secure index by partitioning it into blocks and, depending on the size of the secure index, uses direct or indirect addressing during keyword search, thereby overcoming the drawback of traditional searchable encryption schemes that the whole secure index must be traversed. Once the secure index grows beyond a certain threshold, the search time no longer increases linearly but sub-linearly, which improves keyword search efficiency.
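The choice between direct and indirect addressing can be sketched as a small Python classifier. The thresholds are inferred from the block-count conditions stated in the claims (at most b blocks of size B for the medium case, more than b and at most Bb blocks for the large case); the exact classification formula is a figure not reproduced in this text, so the boundaries and the function name `classify` should be read as assumptions.

```python
from typing import Tuple


def classify(n: int, b: int, B: int) -> Tuple[str, int, int]:
    """Return (class, levels of indirection, number of B-sized blocks).

    n is the inverted-index length |DB(w)|; b and B are the blocking
    parameters. The boundaries are inferred, not quoted from the patent.
    """
    if n <= b:
        return ("Small", 0, 1)        # fits in one padded block, no blocking
    num_blocks = -(-n // B)           # ceiling division without floats
    if num_blocks <= b:
        return ("Medium", 1, num_blocks)
    if num_blocks <= B * b:
        return ("Large", 2, num_blocks)
    raise ValueError("index length exceeds the two-level capacity")
```

With b = 4 and B = 8, for example, an inverted index of 20 entries needs three blocks and one level of pointers, while 100 entries need 13 blocks and a second pointer level.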
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions and scope of the present invention as defined in the appended claims.

Claims (3)

1. A searchable encryption method for ciphertext large-scale data sets is characterized in that: the method comprises the following steps:
step 1: the data owner completes the file uploading process at the client; the data owner first preprocesses the original file set F, where the preprocessing comprises generating ciphertext data using symmetric encryption, performing semantic analysis on the original file set, extracting keywords, constructing an inverted index for the keywords, and generating a secure index DB; after file preprocessing, the data owner divides the ciphertext data equally into N parts and uploads the N parts to the data servers
Figure FDA0002906818970000011
and uploads the encrypted index to the index server S_I; this specifically comprises the following steps:
step 1.1: a data owner generates a secret key K at a client through an initialization algorithm, and then ciphertext data c are generated through symmetric encryption;
step 1.1.1: input a security parameter k, where k ∈ {0,1}^k;
Step 1.1.2: use the pseudo-random number generator
Figure FDA0002906818970000012
to generate 3 random numbers K1, K2, K3 as keys of the pseudo-random function PRF;
where the pseudo-random function PRF is expressed as PRF: {0,1}^k × {0,1}^* → {0,1}^k;
Step 1.1.2.1: input the client key K ∈ {0,1}^k and the keyword w ∈ {0,1}^*, and output the encryption keys K1 ∈ {0,1}^k and K2 ∈ {0,1}^k generated for w;
Step 1.1.2.2: compute K3 ← SKE.Gen(1^k) to obtain the key of the symmetric encryption algorithm, which is used to encrypt the original file set F; SKE = (Gen, Enc, Dec) is a symmetric encryption scheme, where Gen denotes the key generation algorithm, Enc the encryption algorithm, and Dec the decryption algorithm;
step 1.1.3: output K = (K1, K2, K3) as the key;
step 1.2: encrypting a file, namely inputting an original file set F, and encrypting the F into ciphertext data c by using a symmetric encryption algorithm;
step 1.2.1: inputting an original file set F;
step 1.2.2: for each file F_η in F, execute
Figure FDA0002906818970000013
where 0 < η ≤ |F|, to generate the ciphertext c_η, with c_η ∈ c;
Step 1.2.3: divide c into N parts and send them to the data servers
Figure FDA0002906818970000014
Step 1.3: the data owner performs semantic analysis on F, extracts the keywords w, constructs an inverted index for w, and generates the secure index DB; after DB is classified, an array A and a block Block storing the block information are obtained; a list L is created, Block and the encrypted label generated with K are stored in L, and L is uploaded to S_I; D ← Create(L) is executed to generate a dictionary D, and K, D and A are output; the specific steps are as follows:
step 1.3.1: generate an array A to store the data of the first-level blocking of the inverted index, and a list L to store the pointers of the second-level blocking;
step 1.3.2: for each keyword w, apply the PRF to K and w to generate K1 and K2, as in step 1.1.2.1;
Step 1.3.3: determine the secure index DB and the secure index blocking parameters b and B, and classify DB(w) as Small, Medium, or Large according to the inverted index length |DB(w)| of the keyword w:
Figure FDA0002906818970000021
step 1.3.3.1: when the secure index satisfies
Figure FDA0002906818970000022
the number of blocks is Num_BS = 1, i.e., no blocking operation is needed; when |DB(w)| < b, pad DB(w) with random data up to size b; denote this block as Block_S; execute
Figure FDA00029068189700000211
L ← (α, β), and upload L to S_I;
Step 1.3.3.2: when the secure index satisfies
Figure FDA0002906818970000023
take the number of blocks
Figure FDA0002906818970000024
with Num_BM ≤ b; if the last block is smaller than size B, pad it to size B;
for each block BM_i, 1 ≤ i ≤ Num_BM, compute its label using symmetric encryption
Figure FDA00029068189700000212
store
Figure FDA00029068189700000213
at a random location in S_I, denote its pointer as
Figure FDA00029068189700000214
and obtain the two-tuple
Figure FDA00029068189700000215
where 1 ≤ i ≤ Num_BM; this process in effect performs an indirect addressing operation;
create an array A and write
Figure FDA00029068189700000216
into A; partition A by size b, taking the number of partitions
Figure FDA0002906818970000025
since Num_BM ≤ b, Num_bM = 1; denote this block as Block_M; if the block is smaller than b, pad it with random data to size b; execute
Figure FDA0002906818970000026
Figure FDA0002906818970000027
Figure FDA0002906818970000028
L ← (α, β)
and upload L to S_I;
Step 1.3.3.3: when the secure index satisfies
Figure FDA0002906818970000029
take the number of blocks
Figure FDA00029068189700000210
with b < Num_BL ≤ Bb;
after the array A is computed, the Num_BL entries in A are further partitioned into blocks of size B, giving second-level indirect addressing; the number of blocks is
Figure FDA0002906818970000031
with Num_BL' ≤ b; if the last block is smaller than size B, pad it with random data to size B;
for each block BL_j', 1 ≤ j ≤ Num_BL', compute its label using a symmetric encryption algorithm
Figure FDA0002906818970000032
store
Figure FDA0002906818970000033
at a random location in S_I, denote its pointer as
Figure FDA0002906818970000034
and obtain the two-tuple
Figure FDA0002906818970000035
where 1 ≤ j ≤ Num_BL';
create an array A and write
Figure FDA0002906818970000036
into A; partition A by size b, taking the number of partitions
Figure FDA0002906818970000037
since Num_BL' ≤ b, Num_bM = 1; denote this block as Block_L; if the block is smaller than b, pad it with random data to size b; execute
Figure FDA0002906818970000038
Figure FDA0002906818970000039
Figure FDA00029068189700000310
Figure FDA00029068189700000311
Figure FDA00029068189700000312
L ← (α, β)
and upload L to S_I;
Step 1.3.4: after L is uploaded to S_I, execute D ← Create(L) to generate the dictionary D;
step 1.3.5: outputting K, D and A;
step 1.4: generate the trapdoor τ_w corresponding to the keyword w, which mainly comprises the following steps:
step 1.4.1: inputting a search keyword w;
step 1.4.2: execute
Figure FDA00029068189700000313
where τ1, τ2 ∈ τ_w;
Step 1.4.3: upload (τ1, τ2) to S_I as the search token for w;
Step 2: the data owner completes the keyword search process; the process comprises searching and file updating;
the file updating process comprises file addition and file deletion; after files are added or deleted, the search for the keyword w becomes a search over D + D_add − D_del, where D is the dictionary without file additions and deletions, D_add is the dictionary for added files, and D_del is the dictionary for deleted files; the search results of the three parts are merged and the resulting ciphertext data set is returned;
in the search process, the data owner sends a search request for the keyword w to the index server S_I and provides S_I with the search token τ_w of w; S_I uses τ_w and the secure index DB to compute the data server where w is located
Figure FDA00029068189700000314
where 1 ≤ ν ≤ N;
Figure FDA0002906818970000041
returns to the data owner the ciphertext data set corresponding to τ_w
Figure FDA0002906818970000042
where
Figure FDA0002906818970000043
is the total number of ciphertext data items;
the method specifically comprises the following steps:
step 2.1: input (τ1, τ2) and DB;
step 2.2: execute
Figure FDA0002906818970000044
to obtain the classification of DB(w);
step 2.2.1: when
Figure FDA0002906818970000045
holds, execute
Figure FDA0002906818970000046
c_res ← Get(S_F; Block_S);
Step 2.2.2: when
Figure FDA0002906818970000047
holds, execute
Figure FDA0002906818970000048
Figure FDA0002906818970000049
Figure FDA00029068189700000410
Step 2.2.3: when
Figure FDA00029068189700000411
holds, execute
Figure FDA00029068189700000412
Figure FDA00029068189700000413
Figure FDA00029068189700000414
Figure FDA00029068189700000415
Figure FDA00029068189700000416
Figure FDA00029068189700000417
Step 2.2.4: output the ciphertext data set corresponding to (τ1, τ2)
Figure FDA00029068189700000418
step 3: the data owner completes the file downloading process at the client; during file download, the data owner downloads the ciphertext data set corresponding to the keyword w
Figure FDA00029068189700000419
decrypts it using the secret key
Figure FDA00029068189700000420
and obtains the data file set containing w
Figure FDA00029068189700000421
2. The searchable encryption method for the large ciphertext data set according to claim 1, wherein: the step 3 comprises the following steps:
step 3.1: file decryption: input the ciphertext data set
Figure FDA00029068189700000422
and use the symmetric encryption algorithm
Figure FDA00029068189700000423
to restore the file set containing w
Figure FDA00029068189700000424
Step 3.1.1: inputting ciphertext data
Figure FDA00029068189700000425
Step 3.1.2: for each ciphertext
Figure FDA00029068189700000426
execute
Figure FDA00029068189700000427
to restore the corresponding data
Figure FDA00029068189700000428
Step 3.1.3: outputting a data file set comprising w
Figure FDA00029068189700000429
3. The searchable encryption method for the large ciphertext data set according to claim 1, wherein: file addition: at the client, input the original file set F_add to be added, execute Enc_K(F_add) to generate the ciphertext c_add, and upload it to S_F; input the keyword set W_add to be added and the inverted index set DB(W_add); execute Enc_K(W_add, DB(W_add)) to generate L_add and upload it to S_I; output K, D_add, A_add;
file deletion: at the client, input the original file set F_del to be deleted, extract the keyword set W_del of F_del, and generate the inverted index set DB(W_del); execute Enc_K(W_del, DB(W_del)) to generate L_del and upload it to S_I; output K, D_del, A_del.
CN201811194140.1A 2018-10-15 2018-10-15 Searchable encryption method for ciphertext large data set Active CN109165226B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811194140.1A CN109165226B (en) 2018-10-15 2018-10-15 Searchable encryption method for ciphertext large data set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811194140.1A CN109165226B (en) 2018-10-15 2018-10-15 Searchable encryption method for ciphertext large data set

Publications (2)

Publication Number Publication Date
CN109165226A CN109165226A (en) 2019-01-08
CN109165226B true CN109165226B (en) 2021-03-02

Family

ID=64878239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811194140.1A Active CN109165226B (en) 2018-10-15 2018-10-15 Searchable encryption method for ciphertext large data set

Country Status (1)

Country Link
CN (1) CN109165226B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113468575B (en) * 2021-07-22 2023-09-19 东北大学 System and method for retrieving encrypted streaming data supporting access mode hiding

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416037A (en) * 2018-03-14 2018-08-17 安徽大学 Centric keyword cipher text searching method based on two-stage index in cloud environment
CN108494768A (en) * 2018-03-22 2018-09-04 深圳大学 A kind of cipher text searching method and system for supporting access control

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9740879B2 (en) * 2014-10-29 2017-08-22 Sap Se Searchable encryption with secure and efficient updates

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
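Integrity-verifiable conjunctive keyword searchable encryption in cloud storage; Yuxi Li et al.; International Journal of Information Security; 20171108; full text *
Multi-keyword fuzzy ciphertext search method; Wang Kaixuan et al.; Journal of Computer Research and Development (《计算机研究与发展》); 20171231; full text *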

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant