CN110674524A - Mixed ciphertext indexing method and system - Google Patents

Publication number
CN110674524A
Authority
CN
China
Prior art keywords
data
ciphertext
barrel
interval
division
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910940962.8A
Other languages
Chinese (zh)
Inventor
翟建军
邢亚君
陈青民
孟铭
郑敏波
彭海龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing An Xin Tian Xing Technology Co Ltd
Original Assignee
Beijing An Xin Tian Xing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing An Xin Tian Xing Technology Co Ltd
Priority to CN201910940962.8A
Publication of CN110674524A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2425 Iterative querying; Query formulation based on the results of a preceding query

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioethics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a mixed ciphertext indexing method and system. The method comprises: acquiring ciphertext data stored in a database; performing bucket partitioning on the ciphertext data having the same attribute to obtain bucket-partitioned data; and building a B+ tree for the bucket-partitioned data, with data pointers stored in the leaf nodes. The mixed ciphertext indexing method and system can improve indexing speed.

Description

Mixed ciphertext indexing method and system
Technical Field
The invention relates to the field of indexing, and in particular to a mixed ciphertext indexing method and system.
Background
Database encryption is the last line of defense for database security; it addresses the insecurity of storing plaintext data directly in the database. At the same time, encryption introduces a new problem: it markedly reduces the efficiency of data queries. In the existing bucket-partition ciphertext indexing method, the initial query result returned to the client contains records that do not satisfy the query condition. The in-bucket data that the server returns to the client has a linear structure, and because the data is encrypted it has lost its original partial order, so the client can only search it sequentially. Sequential search is inefficient, especially when the amount of data is large.
Disclosure of Invention
The invention aims to provide a mixed ciphertext indexing method and system that improve indexing speed.
To achieve this purpose, the invention provides the following scheme:
A mixed ciphertext indexing method, comprising:
acquiring ciphertext data stored in a database;
performing bucket partitioning on the ciphertext data having the same attribute to obtain bucket-partitioned data;
building a B+ tree for the bucket-partitioned data, and storing data pointers in the leaf nodes.
Optionally, performing bucket partitioning on the ciphertext data having the same attribute to obtain bucket-partitioned data specifically comprises:
determining partition points for the ciphertext data having the same attribute, with minimizing the total number of false detection elements as the objective function; the total number of false detection elements is the sum of the numbers of false detection elements over all bucket intervals; the number of false detection elements is the number of data items, other than the target data, in the bucket interval retrieved during indexing;
and performing bucket partitioning on the ciphertext data of the corresponding attribute according to the partition points.
Optionally, determining partition points for the ciphertext data having the same attribute, with minimizing the total number of false detection elements as the objective function, specifically comprises:
calculating the maximum and minimum values of the ciphertext data to be partitioned, and determining the interval in which the ciphertext data lies;
adding a partition point to be inserted into the interval in which the ciphertext data lies;
calculating the total number of false detection elements of the bucket intervals formed when the partition point to be inserted is placed at different positions;
determining the position of the partition point to be inserted at which the total number of false detection elements is smallest, to obtain the insertion position of the partition point to be inserted;
judging whether the number of bucket intervals formed after the partition point is inserted is greater than or equal to a preset maximum threshold, to obtain a judgment result;
if the judgment result is yes, determination of the insertion positions is complete, and all insertion positions are recorded;
and if the judgment result is no, returning to the step of adding a partition point to be inserted into the interval in which the ciphertext data lies.
Optionally, the number of false detection elements corresponding to any bucket interval is calculated as follows:
$$BC(\alpha, \beta) = (\beta - \alpha + 1) \times \sum_{t=\alpha}^{\beta} f_t$$
where $BC(\alpha, \beta)$ is the number of false detection elements in the bucket interval whose boundary points are the attribute values $v_\alpha$ and $v_\beta$; $f_t$ is the frequency with which an attribute value $v_t$ occurs within that bucket interval; and $\alpha$, $t$ and $\beta$ are the sequence numbers of the attribute values $v_\alpha$, $v_t$ and $v_\beta$, respectively.
A mixed ciphertext indexing system, comprising:
an acquisition module, configured to acquire the ciphertext data stored in a database;
a bucket partitioning module, configured to perform bucket partitioning on the ciphertext data having the same attribute to obtain bucket-partitioned data;
and a tree structure building module, configured to build a B+ tree for the bucket-partitioned data and store data pointers in the leaf nodes.
Optionally, the bucket partitioning module includes:
a partition point determining submodule, configured to determine partition points for the ciphertext data having the same attribute, with minimizing the total number of false detection elements as the objective function; the total number of false detection elements is the sum of the numbers of false detection elements over all bucket intervals; the number of false detection elements is the number of data items, other than the target data, in the bucket interval retrieved during indexing;
and a partitioning submodule, configured to perform bucket partitioning on the ciphertext data of the corresponding attribute according to the partition points.
Optionally, the partition point determining submodule specifically includes:
a ciphertext interval determining unit, configured to calculate the maximum and minimum values of the ciphertext data to be partitioned and determine the interval in which the ciphertext data lies;
a partition point adding unit, configured to add a partition point to be inserted into the interval in which the ciphertext data lies;
a false detection total calculating unit, configured to calculate the total number of false detection elements of the bucket intervals formed when the partition point to be inserted is placed at different positions;
an insertion position determining unit, configured to determine the position of the partition point to be inserted at which the total number of false detection elements is smallest, to obtain the insertion position of the partition point to be inserted;
a judging unit, configured to judge whether the number of bucket intervals formed after the partition point is inserted is greater than or equal to a preset maximum threshold, to obtain a judgment result;
an insertion position determination completion unit, configured to complete determination of the insertion positions and record all insertion positions if the judgment result is yes;
and a returning unit, configured to return to the step of adding a partition point to be inserted into the interval in which the ciphertext data lies if the judgment result is no.
Optionally, the number of false detection elements corresponding to any bucket interval is calculated as follows:
$$BC(\alpha, \beta) = (\beta - \alpha + 1) \times \sum_{t=\alpha}^{\beta} f_t$$
where $BC(\alpha, \beta)$ is the number of false detection elements in the bucket interval whose boundary points are the attribute values $v_\alpha$ and $v_\beta$; $f_t$ is the frequency with which an attribute value $v_t$ occurs within that bucket interval; and $\alpha$, $t$ and $\beta$ are the sequence numbers of the attribute values $v_\alpha$, $v_t$ and $v_\beta$, respectively.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects: the mixed ciphertext indexing method and system build the index with a B+ tree on top of bucket partitioning, replacing the sequential search used for indexing in the prior art. Building the index with a B+ tree on top of bucket partitioning enables fast indexing and improves indexing speed. In addition, the invention determines the partition points of the bucket partition with the goal of minimizing the total number of false detection elements, which optimizes the bucket partitioning process, effectively reduces the total number of false detection elements during indexing, achieves the best query hit rate, lowers query cost, and balances the security of the ciphertext index against query efficiency.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the mixed ciphertext indexing method according to embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of bucket partitioning when N is 4;
FIG. 3 is a structure diagram of the B+ tree built for the names in Table 2.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Example 1:
FIG. 1 is a flowchart of the mixed ciphertext indexing method according to embodiment 1 of the present invention.
Referring to FIG. 1, the mixed ciphertext indexing method includes:
Step 101: acquire the ciphertext data stored in a database.
Step 102: perform bucket partitioning on the ciphertext data having the same attribute to obtain bucket-partitioned data.
The purpose of bucket partitioning is to run one coarse query over the ciphertext data and filter out irrelevant records. When data is stored, a bucket number is attached to the encrypted data to indicate that its plaintext value falls within a certain interval. At query time, the server first uses the bucket number to locate the ciphertext data partition containing the queried data and then performs the next stage of the query, which narrows the query range and improves query efficiency.
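The coarse query described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the bucket boundaries `BOUNDS`, the bucket numbers `BUCKET_IDS`, the sample rows and all function names are hypothetical, and the `enc(...)` strings merely stand in for encrypted tuples (no encryption is performed).

```python
from bisect import bisect_right

# Hypothetical bucket boundaries over a plaintext score domain 0..100:
# bucket i covers the half-open interval [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS = [0, 60, 80, 101]
# Hypothetical opaque bucket numbers, one per interval.
BUCKET_IDS = [7, 2, 5]

def bucket_of(value):
    """Return the bucket number whose interval contains the plaintext value."""
    return BUCKET_IDS[bisect_right(BOUNDS, value) - 1]

def buckets_for_range(low, high):
    """Bucket numbers that a range query [low, high] has to fetch."""
    first = bisect_right(BOUNDS, low) - 1
    last = bisect_right(BOUNDS, high) - 1
    return {BUCKET_IDS[i] for i in range(first, last + 1)}

# Server-side table of (etuple, bucket number) pairs. The etuple strings are
# placeholders for encrypted tuples; the server never sees the plaintext.
rows = [("enc(alice,91)", bucket_of(91)),
        ("enc(bob,58)", bucket_of(58)),
        ("enc(carol,79)", bucket_of(79))]

def coarse_query(low, high):
    """Server-side coarse filter: keep rows whose bucket number matches."""
    wanted = buckets_for_range(low, high)
    return [etuple for etuple, bucket in rows if bucket in wanted]
```

For example, `coarse_query(70, 90)` returns both `enc(alice,91)` and `enc(carol,79)`: the score 91 lies outside the queried range but shares a bucket with values inside it, which is exactly the kind of falsely detected record that the later optimization tries to reduce.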
In practical applications, a plaintext relation $R(A_1, A_2, \ldots, A_m)$ usually contains multiple sensitive attributes, so the corresponding ciphertext relation $R^s(etuple, A_1^s, A_2^s, \ldots, A_m^s)$ also contains multiple ciphertext index columns. Here etuple is the ciphertext string obtained by encrypting a plaintext tuple, and $A_1^s$ is the index column of attribute $A_1$ in relation $R$.
Table 1 is a student information table, and a storage model of the ciphertext index database is established using the table as an example.
TABLE 1 student information Table
Figure BDA0002222864790000051
Here id, name and score are the student's number, name and score, respectively. The table is encrypted and indexed using bucket partitioning; the format of the index table is shown in Table 2.
TABLE 2 bucket index Table
Figure BDA0002222864790000052
The attribute String in the index table is the string obtained by encrypting the corresponding tuple. Index-id, Index-name and Index-score are the index columns of the corresponding attributes. The numbers in the table are the bucket numbers assigned to the corresponding data. Bucket numbers may be assigned as collision-free random numbers.
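One way to obtain collision-free random bucket numbers is to draw them without replacement, as in the sketch below. The function name `assign_bucket_numbers` and the `id_space` parameter are assumptions made for illustration; the text does not specify how the random numbers are generated.

```python
import random

def assign_bucket_numbers(num_buckets, id_space=1000, seed=None):
    """Draw one distinct random number per bucket interval.

    id_space is a hypothetical upper bound on bucket numbers; it must be at
    least num_buckets, otherwise random.sample raises ValueError.
    """
    rng = random.Random(seed)
    # Sampling without replacement guarantees that no two buckets collide.
    return rng.sample(range(id_space), num_buckets)
```

Because the numbers are drawn in random order, a bucket number reveals nothing about where its interval lies in the plaintext domain. In a real deployment a cryptographically strong source such as `random.SystemRandom` would be a more cautious choice than the default generator.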
For different attributes, different partitioning strategies can be adopted as circumstances require. When the security requirement limits the number of buckets, the query efficiency of the ciphertext index depends on the bucket partitioning function. Bucket partitioning is defined on a non-negative integer attribute domain, and all queries are assumed to be equally likely.
Existing bucket partitioning techniques essentially partition the ciphertext data of an attribute uniformly over the interval it spans, that is, every bucket has the same interval size. To improve the query hit rate of bucket partitioning and reduce query cost, the invention optimizes the bucket partitioning algorithm. The goal of the optimization is to reduce the number of false detection elements, so it is first necessary to know which parameters of the bucket partitioning strategy that number depends on. Let a bucket interval B contain N attribute values $V = \{v_1, v_2, \ldots, v_N\}$, and let the frequencies of occurrence of the attribute values be $F = \{f_1, f_2, \ldots, f_N\}$, where $f_t$ ($1 \le t \le N$) is the frequency with which attribute value $v_t$ occurs within the bucket interval. $Q_k$ denotes the set of all queries whose query range has size k, and $q[l, p]$ is a query in $Q_k$ satisfying $p - l + 1 = k$; the number of queries of size k associated with the bucket interval is $N + k - 1$. As shown in FIG. 2, when k = 2 and N = 4, the number of range queries of size 2 associated with the bucket interval is $N + k - 1 = 5$, namely $q_1$, $q_2$, $q_3$, $q_4$ and $q_5$.
When some, but not all, of the element values in the bucket interval satisfy a range query, every attribute value carrying that index number is retrieved, because all values in the same bucket interval share one index number. The query result therefore contains false detection tuples: an attribute value $v_i$ that does not satisfy the query causes $f_i$ false detection attribute values to appear in the ciphertext query result set. Take $q_2$ as an example: when query $q_2$ is executed, the bucket interval contains not only $v_1$ and $v_2$, which lie within $q_2$, but also $v_3$ and $v_4$, which are returned as false detections. The number of false detection elements when k = 2 and N = 4 can be calculated in this way, as shown in Table 3.
TABLE 3 wrong detection element number table
Figure BDA0002222864790000061
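The counting behind Table 3 can be checked by brute force. The sketch below is an illustration under stated assumptions: attribute values occupy consecutive positions 1..N, a query q[l, p] overlaps the bucket when l runs from 2 - k to N, and every value of the bucket outside [l, p] counts as falsely detected. The function name and the sample frequencies are hypothetical.

```python
def false_detections(freqs, k):
    """Total false detections over all size-k range queries touching a bucket.

    freqs[t - 1] is the frequency f_t of the t-th attribute value in the bucket.
    """
    n = len(freqs)
    total = 0
    # q[l, p] with p = l + k - 1 overlaps positions 1..n for l in 2 - k .. n,
    # which gives the n + k - 1 queries mentioned in the text.
    for l in range(2 - k, n + 1):
        p = l + k - 1
        # Every value of the bucket outside [l, p] is returned but not wanted.
        total += sum(f for t, f in enumerate(freqs, start=1) if not l <= t <= p)
    return total
```

Under this enumeration each attribute value is covered by exactly k of the N + k - 1 queries, so the total comes out as (N - 1) × F, where F is the sum of the frequencies. For instance, frequencies `[3, 1, 4, 2]` give 30 for both k = 2 and k = 3, which agrees with the text's observation that the total depends on N and F but not on k.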
When the query range size k is 2 and the number N of attribute values in the bucket interval is 4, the total number of false detection elements is as follows:
Figure BDA0002222864790000062
Similarly, when N is 5, the total number of false detection elements is as follows:
Figure BDA0002222864790000071
By induction, it can be shown that when the bucket interval contains N attribute values, the total number of false detection elements is as given in formula (3):
Figure BDA0002222864790000072
Therefore, the total number of false detection elements depends only on the number N of attribute values contained in the bucket interval and on the sum F of their frequencies of occurrence; it does not depend on the size k of the range query. When the number of bucket sub-intervals is limited, the optimal bucket partitioning strategy is the one that minimizes the total number N × F of false detection tuples and thus gives the highest query hit rate. The goal of the optimization algorithm is to minimize
$$\sum_{j=1}^{M} N_{B_j} \times F_{B_j}$$
where M is the upper limit on the number of buckets (partition sub-intervals) and $N_{B_j} \times F_{B_j}$ is the total number of false detection attribute values in the j-th bucket interval $B_j$. Let the attribute value set be $V = \{v_1, v_2, \ldots, v_n\}$ ($v_1 < \cdots < v_n$), where each attribute value $v_t$ ($1 \le t \le n$) appears at least once in the table.
From the above, an optimized bucket partitioning process can be obtained. Step 102 then specifically includes:
determining partition points for the ciphertext data having the same attribute, with minimizing the total number of false detection elements as the objective function; the total number of false detection elements is the sum of the numbers of false detection elements over all bucket intervals; the number of false detection elements is the number of data items, other than the target data, in the bucket interval retrieved during indexing; and performing bucket partitioning on the ciphertext data of the corresponding attribute according to the partition points.
In the above step, determining partition points for the ciphertext data having the same attribute, with minimizing the total number of false detection elements as the objective function, proceeds as follows:
A. Calculate the maximum and minimum values of the ciphertext data to be partitioned, and determine the interval in which the ciphertext data lies.
B. Add a partition point to be inserted into the interval in which the ciphertext data lies.
C. Calculate the total number of false detection elements of the bucket intervals formed when the partition point to be inserted is placed at different positions. No new partition point is placed at the position of an existing one, so that two or more partition points are never inserted at the same position.
The number of false detection elements corresponding to any bucket interval is calculated as follows:
$$BC(\alpha, \beta) = (\beta - \alpha + 1) \times \sum_{t=\alpha}^{\beta} f_t = (\beta - \alpha + 1) \times F_{\alpha\beta}$$
where $BC(\alpha, \beta)$ is the number of false detection elements in the bucket interval whose boundary points are the attribute values $v_\alpha$ and $v_\beta$; $f_t$ is the frequency with which an attribute value $v_t$ occurs within that bucket interval; and $\alpha$, $t$ and $\beta$ are the sequence numbers of the attribute values $v_\alpha$, $v_t$ and $v_\beta$, respectively. $F_{\alpha\beta}$ is the sum of the frequencies of occurrence of all attribute values in the bucket interval whose boundary points are $v_\alpha$ and $v_\beta$.
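The per-bucket cost can be computed in constant time from prefix sums of the frequencies. The sketch below assumes that BC(α, β) takes the N × F form stated earlier in the text, that is, the number of attribute values in the bucket multiplied by the sum of their frequencies; the helper name `make_bucket_cost` is hypothetical.

```python
from itertools import accumulate

def make_bucket_cost(freqs):
    """Return a function bc(alpha, beta) using 1-based sequence numbers.

    Assumes BC(alpha, beta) = (beta - alpha + 1) * sum(f_alpha .. f_beta),
    the N x F form described in the text.
    """
    # prefix[t] holds f_1 + ... + f_t, with prefix[0] = 0.
    prefix = [0] + list(accumulate(freqs))

    def bc(alpha, beta):
        width = beta - alpha + 1                       # N for this bucket
        freq_sum = prefix[beta] - prefix[alpha - 1]    # F for this bucket
        return width * freq_sum

    return bc
```

With frequencies `[3, 1, 4, 2]`, a single bucket over all four values costs 4 × 10 = 40, while the bucket covering only the second and third values costs 2 × 5 = 10.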
D. Determine the position of the partition point to be inserted at which the total number of false detection elements is smallest, to obtain the insertion position of the partition point. Once obtained, the insertion position is stored in the set of insertion positions.
To find the minimum total number of false detection elements, the total can be computed by taking an assumed insertion position of the current partition point as a boundary point. The formula is: $MOB(1, n, M) = \min\left[MOB(1, i, j) + MOB(i+1, n, M-j)\right]$.
Here n is the number of attribute values; M is the total number of bucket intervals formed after the current partition point is inserted; i is the sequence number of the attribute value at an assumed insertion position of the current partition point; j is the number of bucket intervals formed by attribute values 1 to i; and $M-j$ is the number of bucket intervals formed by the remaining $n-i$ attribute values. $MOB(1, n, M)$ is the minimum total number of false detection elements obtained when attribute values 1 to n are divided into M bucket intervals. $MOB(1, i, j)$ is the number of false detection elements obtained when attribute values 1 to i are divided into j bucket intervals, and $MOB(i+1, n, M-j)$ is the number obtained when attribute values i+1 to n are divided into $M-j$ bucket intervals.
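The MOB recurrence can be evaluated with memoized recursion. The sketch below is one possible reading, not the patent's own code: it assumes the N × F bucket cost, and it restricts the right-hand part to a single bucket (the case $M - j = 1$), which is a common simplification that still reaches the minimum because any optimal partition ends in some last bucket. The function name is hypothetical.

```python
from functools import lru_cache
from itertools import accumulate

def optimal_partition_cost(freqs, m_buckets):
    """Minimum total cost of splitting values 1..n into exactly m_buckets."""
    n = len(freqs)
    if not 1 <= m_buckets <= n:
        raise ValueError("need 1 <= m_buckets <= number of attribute values")
    prefix = [0] + list(accumulate(freqs))

    def bc(a, b):
        # Assumed bucket cost: number of values times sum of frequencies.
        return (b - a + 1) * (prefix[b] - prefix[a - 1])

    @lru_cache(maxsize=None)
    def mob(a, b, m):
        if m == 1:
            return bc(a, b)
        # i is the last sequence number of the left part, which must hold at
        # least m - 1 values; the right part i + 1 .. b forms one bucket.
        return min(mob(a, i, m - 1) + bc(i + 1, b)
                   for i in range(a + m - 2, b))

    return mob(1, n, m_buckets)
```

For frequencies `[3, 1, 4, 2]` the minimum cost falls from 40 with one bucket to 20 with two, 14 with three and 10 with four, showing how each additional bucket reduces false detections at the price of revealing more about the data distribution.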
E. Judge whether the number of bucket intervals formed after the partition point is inserted is greater than or equal to a preset maximum threshold, to obtain a judgment result.
F. If the judgment result is yes, determination of the insertion positions is complete, and all insertion positions are recorded.
G. If the judgment result is no, return to step B.
After the ciphertext data of the corresponding attribute has been partitioned into buckets according to the partition points, the end points and the number of each bucket interval are recorded.
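Steps A to G can be read as a loop that inserts one partition point at a time. The sketch below follows that reading under several assumptions: the attribute values are represented by their sequence numbers 1..n (so step A reduces to taking the interval 1..n), a partition point at position s separates value s from value s + 1, and the bucket cost has the N × F form. The function name `greedy_partition` is hypothetical.

```python
from itertools import accumulate

def greedy_partition(freqs, max_buckets):
    """Insert partition points one at a time until max_buckets intervals exist.

    Returns the sorted list of chosen partition positions.
    """
    n = len(freqs)
    prefix = [0] + list(accumulate(freqs))

    def bc(a, b):
        # Assumed bucket cost: number of values times sum of frequencies.
        return (b - a + 1) * (prefix[b] - prefix[a - 1])

    def total_cost(splits):
        edges = [0] + sorted(splits) + [n]
        return sum(bc(lo + 1, hi) for lo, hi in zip(edges, edges[1:]))

    splits = set()
    # Step E: stop once the number of bucket intervals reaches the threshold.
    while len(splits) + 1 < max_buckets:
        # Step C: only positions without an existing partition point qualify.
        candidates = [s for s in range(1, n) if s not in splits]
        if not candidates:
            break  # every possible position is already used
        # Step D: keep the position that gives the smallest total cost.
        best = min(candidates, key=lambda s: total_cost(splits | {s}))
        splits.add(best)
    return sorted(splits)
```

For frequencies `[3, 1, 4, 2]` and a threshold of three buckets, the loop first places a point after the second value and then after the third. A point-by-point choice like this need not always reach the minimum defined by the MOB formula, although it does in this small example.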
Step 103: build a B+ tree for the bucket-partitioned data, and store data pointers in the leaf nodes.
In the original bucket-partition ciphertext retrieval method, the server returns all the data in a bucket, so many ciphertext records are returned to the client for filtering. In practical applications the data volume is very large, and this processing increases the client's load and lengthens its response time.
Therefore, the data in the bucket is searched on the server side until the records satisfying the condition are found and returned to the client, which reduces the amount of data the client must process. To speed up indexing of the data in a bucket, after the bucket data is encrypted, a B+ tree is built over the ciphertext.
The B+ tree is a dynamic multi-level index structure in which data pointers are stored only in the leaf nodes; the structure of a leaf node therefore differs from that of an internal node. If the lookup field is a key field, then for each value of the lookup field there is an entry in a leaf node together with a pointer to the record (or to the block containing the record). For a non-key lookup field, the pointer points to a block that contains pointers to the data file records, which creates an additional layer of indirection.
The leaf nodes of the B+ tree are typically linked together to provide ordered access to the records by lookup field. These leaf nodes are analogous to the first-level (base) index, and the internal nodes of the B+ tree correspond to the other levels of a multi-level index. Some lookup field values from the leaf nodes are repeated in the internal nodes of the B+ tree to guide the search.
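The structure described above can be sketched as a bulk-loaded B+ tree. This is a simplified illustration under stated assumptions: the keys are any comparable index values (plain integers stand in here, since the text does not specify how ciphertext keys are ordered), the tree is built once from sorted input, and insertion, deletion and minimum-occupancy rules are omitted. All class and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

class Leaf:
    def __init__(self, keys, pointers):
        self.keys = keys          # sorted lookup-field values
        self.pointers = pointers  # data pointers, stored only in leaves
        self.next = None          # link to the next leaf for ordered scans

class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # separator keys repeated from the leaves
        self.children = children  # len(children) == len(keys) + 1

def smallest_key(node):
    """Smallest key stored in the subtree rooted at node."""
    while isinstance(node, Internal):
        node = node.children[0]
    return node.keys[0]

def build(entries, order=3):
    """Bulk-load a tree from (key, pointer) pairs sorted by unique key."""
    if not entries:
        raise ValueError("entries must not be empty")
    leaves = [Leaf([k for k, _ in entries[i:i + order]],
                   [p for _, p in entries[i:i + order]])
              for i in range(0, len(entries), order)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right         # chain the leaves together
    level = leaves
    while len(level) > 1:
        groups = [level[i:i + order] for i in range(0, len(level), order)]
        # Separator i is the smallest key found under child i + 1.
        level = [Internal([smallest_key(c) for c in g[1:]], g) for g in groups]
    return level[0]

def search(root, key):
    """Descend from the root to a leaf and return the pointer, or None."""
    node = root
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.pointers[i]
    return None
```

A lookup touches one node per level instead of scanning every record in the bucket, and following the `next` links from the leftmost leaf visits all entries in key order.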
FIG. 3 is a structure diagram of the B+ tree built for the names in Table 2.
Referring to FIG. 3, when the value being searched is less than or equal to the left value of the current node, the search descends to the lower left; when it is greater than or equal to the right value of the current node, the search descends to the lower right; this continues until a leaf node is reached.
As can be seen from FIG. 3, using the mixed index built from bucket partitioning and a B+ tree for operations such as querying and inserting data in the ciphertext database improves the hit rate of the server's ciphertext queries. The more accurate the server-side query, the lower the network bandwidth load between server and client and the lower the client's cost for decryption and query processing, which shortens the client's response time.
Example 2:
This embodiment provides a mixed ciphertext indexing system, comprising:
an acquisition module, configured to acquire the ciphertext data stored in a database;
a bucket partitioning module, configured to perform bucket partitioning on the ciphertext data having the same attribute to obtain bucket-partitioned data;
and a tree structure building module, configured to build a B+ tree for the bucket-partitioned data and store data pointers in the leaf nodes.
Optionally, the bucket partitioning module includes:
a partition point determining submodule, configured to determine partition points for the ciphertext data having the same attribute, with minimizing the total number of false detection elements as the objective function; the total number of false detection elements is the sum of the numbers of false detection elements over all bucket intervals; the number of false detection elements is the number of data items, other than the target data, in the bucket interval retrieved during indexing;
and a partitioning submodule, configured to perform bucket partitioning on the ciphertext data of the corresponding attribute according to the partition points.
Optionally, the partition point determining submodule specifically includes:
a ciphertext interval determining unit, configured to calculate the maximum and minimum values of the ciphertext data to be partitioned and determine the interval in which the ciphertext data lies;
a partition point adding unit, configured to add a partition point to be inserted into the interval in which the ciphertext data lies;
a false detection total calculating unit, configured to calculate the total number of false detection elements of the bucket intervals formed when the partition point to be inserted is placed at different positions;
an insertion position determining unit, configured to determine the position of the partition point to be inserted at which the total number of false detection elements is smallest, to obtain the insertion position of the partition point to be inserted;
a judging unit, configured to judge whether the number of bucket intervals formed after the partition point is inserted is greater than or equal to a preset maximum threshold, to obtain a judgment result;
an insertion position determination completion unit, configured to complete determination of the insertion positions and record all insertion positions if the judgment result is yes;
and a returning unit, configured to return to the step of adding a partition point to be inserted into the interval in which the ciphertext data lies if the judgment result is no.
Optionally, the number of false detection elements corresponding to any bucket interval is calculated as follows:
$$BC(\alpha, \beta) = (\beta - \alpha + 1) \times \sum_{t=\alpha}^{\beta} f_t$$
where $BC(\alpha, \beta)$ is the number of false detection elements in the bucket interval whose boundary points are the attribute values $v_\alpha$ and $v_\beta$; $f_t$ is the frequency with which an attribute value $v_t$ occurs within that bucket interval; and $\alpha$, $t$ and $\beta$ are the sequence numbers of the attribute values $v_\alpha$, $v_t$ and $v_\beta$, respectively.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects:
1. When a ciphertext index is built using bucket partitioning, the fewer the buckets, the lower the query hit rate; the more the buckets, the higher the query hit rate, but the greater the information leakage, and when the plaintext space of the indexed attribute values is relatively small, data security can be greatly reduced. The core of building a ciphertext index with bucket partitioning therefore lies in determining the bucket partitioning function, and the invention provides an optimal bucket partitioning strategy for a given number of buckets.
2. In the original bucket-partition ciphertext indexing method, the in-bucket data that the server returns to the client has a linear structure, and because the data is encrypted it has lost its original partial order, so the client can only search it sequentially. Sequential search is inefficient, especially when the amount of data is large. Introducing a B+ tree structure inside the bucket enables fast indexing and improves the indexing speed of the data in the bucket.
The embodiments in this description are described progressively; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be cross-referenced. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief; for the relevant points, refer to the description of the method.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and core concept of the present invention; meanwhile, a person skilled in the art may, following the idea of the present invention, change the specific embodiments and the scope of application. In view of the above, the contents of this description should not be construed as limiting the invention.

Claims (8)

1. A mixed ciphertext indexing method, comprising:
acquiring ciphertext data stored in a database;
performing bucket division on the ciphertext data having the same attribute to obtain bucket-divided data;
building a B+ tree for the bucket-divided data, and storing data pointers in leaf nodes.
2. The mixed ciphertext indexing method according to claim 1, wherein performing bucket division on the ciphertext data having the same attribute to obtain bucket-divided data specifically comprises:
determining division points for the ciphertext data having the same attribute, taking minimisation of the total number of false detection elements as the objective function; the total number of false detection elements is the sum of the numbers of false detection elements corresponding to all bucket intervals; the number of false detection elements is the number of data items other than the target data in the bucket interval obtained during indexing;
and performing bucket division on the ciphertext data of the corresponding attribute according to the division points.
3. The mixed ciphertext indexing method according to claim 2, wherein determining division points for the ciphertext data having the same attribute, taking minimisation of the total number of false detection elements as the objective function, specifically comprises:
calculating the maximum value and the minimum value of the ciphertext data to be divided, and determining the interval in which the ciphertext data lies;
adding a division point to be inserted to the interval in which the ciphertext data lies;
calculating the total number of false detection elements of the bucket intervals obtained when the division point to be inserted is placed at different positions;
determining the position of the division point to be inserted at which the total number of false detection elements is minimal, to obtain the insertion position of the division point to be inserted;
judging whether the number of bucket intervals obtained after inserting the division point to be inserted is greater than or equal to a maximum preset threshold, to obtain a judgment result;
if the judgment result is yes, completing the determination of the insertion positions and recording all insertion positions;
and if the judgment result is no, returning to the step of adding a division point to be inserted to the interval in which the ciphertext data lies.
4. The mixed ciphertext indexing method according to claim 3, wherein the number of false detection elements corresponding to any bucket interval is calculated by the formula:
Figure FDA0002222864780000011
where BC(α, β) denotes the number of false detection elements in the bucket interval whose boundary points are the attribute values v_α and v_β; f_t is the frequency with which an attribute value v_t occurs within the bucket interval bounded by v_α and v_β; and α, t and β are the index numbers of the attribute values v_α, v_t and v_β, respectively.
5. A mixed ciphertext indexing system, comprising:
an acquisition module, configured to acquire ciphertext data stored in a database;
a bucket division module, configured to perform bucket division on the ciphertext data having the same attribute to obtain bucket-divided data;
and a tree structure establishing module, configured to build a B+ tree for the bucket-divided data and store data pointers in leaf nodes.
6. The mixed ciphertext indexing system according to claim 5, wherein the bucket division module comprises:
a division point determining submodule, configured to determine division points for the ciphertext data having the same attribute, taking minimisation of the total number of false detection elements as the objective function; the total number of false detection elements is the sum of the numbers of false detection elements corresponding to all bucket intervals; the number of false detection elements is the number of data items other than the target data in the bucket interval obtained during indexing;
and a division submodule, configured to perform bucket division on the ciphertext data of the corresponding attribute according to the division points.
7. The mixed ciphertext indexing system according to claim 6, wherein the division point determining submodule specifically comprises:
a ciphertext interval determining unit, configured to calculate the maximum value and the minimum value of the ciphertext data to be divided and determine the interval in which the ciphertext data lies;
a division point adding unit, configured to add a division point to be inserted to the interval in which the ciphertext data lies;
a false detection element total calculating unit, configured to calculate the total number of false detection elements of the bucket intervals obtained when the division point to be inserted is placed at different positions;
an insertion position determining unit, configured to determine the position of the division point to be inserted at which the total number of false detection elements is minimal, to obtain the insertion position of the division point to be inserted;
a judging unit, configured to judge whether the number of bucket intervals obtained after inserting the division point to be inserted is greater than or equal to a maximum preset threshold, to obtain a judgment result;
an insertion position determination completion unit, configured to complete the determination of the insertion positions and record all insertion positions if the judgment result is yes;
and a returning unit, configured to return to the step of adding a division point to be inserted to the interval in which the ciphertext data lies if the judgment result is no.
8. The mixed ciphertext indexing system according to claim 7, wherein the number of false detection elements corresponding to any bucket interval is calculated by the formula:
Figure FDA0002222864780000031
where BC(α, β) denotes the number of false detection elements in the bucket interval whose boundary points are the attribute values v_α and v_β; f_t is the frequency with which an attribute value v_t occurs within the bucket interval bounded by v_α and v_β; and α, t and β are the index numbers of the attribute values v_α, v_t and v_β, respectively.
CN201910940962.8A 2019-09-30 2019-09-30 Mixed ciphertext indexing method and system Pending CN110674524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910940962.8A CN110674524A (en) 2019-09-30 2019-09-30 Mixed ciphertext indexing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910940962.8A CN110674524A (en) 2019-09-30 2019-09-30 Mixed ciphertext indexing method and system

Publications (1)

Publication Number Publication Date
CN110674524A true CN110674524A (en) 2020-01-10

Family

ID=69078777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910940962.8A Pending CN110674524A (en) 2019-09-30 2019-09-30 Mixed ciphertext indexing method and system

Country Status (1)

Country Link
CN (1) CN110674524A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111680062A (en) * 2020-05-15 2020-09-18 江西师范大学 Safe multi-target data object query method and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200110