CN115858579A - Data warehouse star connection query method, system and medium based on differential privacy - Google Patents


Info

Publication number
CN115858579A
Authority
CN
China
Prior art keywords
query
noise
interval
dimension
data warehouse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211337791.8A
Other languages
Chinese (zh)
Inventor
张亮
吴志刚
曹晓光
许斌
吴世山
赵力文
张海威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Shiping Information & Technology Co ltd
Original Assignee
Hangzhou Shiping Information & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-10-28
Filing date
2022-10-28
Publication date
2023-03-28
Application filed by Hangzhou Shiping Information & Technology Co ltd
Priority to CN202211337791.8A
Publication of CN115858579A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a data warehouse star connection query method, system and medium based on differential privacy. In the method, a server receives a star connection query, extracts its query predicates, and adds noise to the query interval ranges specified by those predicates; the aggregation function is evaluated over the noisy query intervals to obtain a perturbed query result; and the perturbed query result is returned to the data analyst. For differentially private star connection queries, the invention perturbs the query first and answers it afterwards, which reduces the global sensitivity of the star connection query from the query side and lessens the effect of noise on the query result. Because the query interval is perturbed separately on each dimension, the global sensitivity of the star connection query operation is fundamentally reduced. Even when the method is extended to n dimension tables, the introduced noise does not grow exponentially with the number of dimension tables. Because the perturbations can be performed in parallel, the computational overhead is reduced.

Description

Data warehouse star connection query method, system and medium based on differential privacy
Technical Field
The invention belongs to the technical field of data collection and analysis, and particularly relates to a data warehouse star connection query method, system and medium based on differential privacy.
Background
With the rapid development of artificial intelligence and big data technology, data collection and analysis have become much easier, and service providers use collected data to improve service quality and build more personalized tools. The star connection query is a typical application in relational databases: a data analyst submits a query to a trusted server that collects the data, and the server returns a query result computed from the collected data. However, an untrusted data analyst may infer users' private information from multiple responses and thereby threaten the property and safety of individuals. The main reason is that the server answers queries from individuals' raw data, so individuals lose control over their private data. With differential privacy, the server perturbs the query result before responding to the analyst. At present, differential privacy research focuses on aggregation queries and similar tasks, and very little work addresses star connection queries.
In recent years, star connection (star join) queries in data warehouses have been widely used in OLAP (On-Line Analytical Processing), and answering star connection queries while protecting user privacy has become a new direction in differential privacy research. Current differential privacy schemes for connection queries generally comprise three steps: the server computes the global sensitivity of the connection query operation, adds noise to the true query result, with the noise calibrated to the computed global sensitivity and the privacy cost ε, and finally returns the perturbed query result. However, data is highly sensitive to the star connection query operation: deleting or adding a single record may change the result of the connection operation by an amount that grows exponentially with the number of dimension tables, and the global sensitivity is $O(N^{n-1})$, where N is the threshold of each dimension table and n is the number of dimension tables [1]. Current differential privacy schemes for star connection queries therefore generally suffer from excessive global sensitivity, which leaves the query results with low usability.
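To make this scaling concrete, the sketch below computes the Laplace noise scale (sensitivity divided by ε) that a result-perturbation baseline would need if its global sensitivity were exactly $N^{n-1}$. The function name, the constant factor of 1 hidden by the O-notation, and the sample values N = 100 and ε = 1 are illustrative assumptions, not values from the patent.

```python
def baseline_noise_scale(N: int, n: int, epsilon: float) -> float:
    """Laplace scale b = sensitivity / epsilon for a global sensitivity of N**(n-1).

    Assumption: the constant hidden by O(N^(n-1)) is taken to be 1.
    """
    if N < 1 or n < 1 or epsilon <= 0:
        raise ValueError("N and n must be positive integers and epsilon must be > 0")
    return N ** (n - 1) / epsilon


if __name__ == "__main__":
    # Hypothetical values: threshold N = 100, privacy cost epsilon = 1.
    for n in (2, 3, 4):
        print(f"n={n}: noise scale {baseline_noise_scale(100, n, 1.0):,.0f}")
```

Each additional dimension table multiplies the scale by N in this model, which is the growth the background section objects to.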
To address excessive global sensitivity, current differential privacy schemes focus on designing alternative sensitivity measures that approximate the global sensitivity, such as local sensitivity, smooth sensitivity, elastic sensitivity and residual sensitivity. However, each of these has drawbacks. Local sensitivity depends on the data, and noise calibrated to local sensitivity does not satisfy differential privacy. Smooth sensitivity, elastic sensitivity and residual sensitivity are all upper bounds on the local sensitivity: smooth sensitivity yields the smallest value but is expensive to compute, while elastic sensitivity and residual sensitivity are cheaper to compute but yield larger values. Approximating the global sensitivity therefore does not fundamentally reduce the influence of the data on connection queries.
In addition, a star connection joins one fact table with several dimension tables. As the number of dimension tables grows, the influence of the data on the connection query operation grows, the sensitivity and hence the noise increase, and the effect on the query result grows exponentially. Moreover, to answer a star connection query the server first joins the dimension tables and adds noise only after obtaining the query result, so additional dimension tables can incur high computational overhead.
The above schemes consider only how the sensitivity is computed and cannot solve the problems caused by adding dimension tables.
[1] Wei Dong and Ke Yi. A nearly instance-optimal differentially private mechanism for conjunctive queries. In PODS 2022.
Disclosure of Invention
The present invention aims to solve the above problems in the prior art by providing a data warehouse star connection query method, system and medium based on differential privacy that reduce the global sensitivity of the star connection query from the query side and lessen the effect of noise on the query result, thereby both improving query accuracy and extending star connection queries to multiple dimension tables.
To achieve this purpose, the invention adopts the following technical solutions:
In a first aspect, a data warehouse star connection query method based on differential privacy is provided, comprising:
the server receives a star connection query, extracts the query predicates, and adds noise to the query interval ranges specified by the query predicates;
the aggregation function is evaluated over the noisy query intervals to obtain a perturbed query result;
and the perturbed query result is returned to the data analyst.
Preferably, suppose a data warehouse $D = \{d_1, d_2, \ldots, d_n\}$ contains n dimension tables, and the threshold of each dimension table is set to N. Noise is added separately according to the number k of dimensions involved in the query, where k ≤ n; since a change in one record affects N intervals, the global sensitivity of each dimension is N. Suppose the query predicate of a star connection query q involves all dimensions, with intervals $q = \{d_1:[l_1,h_1],\ d_2:[l_2,h_2],\ \ldots,\ d_n:[l_n,h_n]\}$, where $[l_i,h_i]$ denotes the query interval on the i-th dimension, $i \in \{1,\ldots,n\}$.
Preferably, the ways of adding noise to the query interval ranges specified by the query predicates include:
noise at both interval endpoints: Laplace noise is added to the two endpoints of each query interval;
noise on the interval width: one endpoint of each query interval is fixed, and Laplace noise is added to the interval width;
interval decomposition with noise: a dimension tree is built on each dimension according to the concept hierarchy, each query interval is decomposed into several subintervals according to the dimension tree before perturbation, and noise is added to each subinterval separately.
Further, as a preferred form of noise addition, when noise is added at both interval endpoints the privacy cost is divided into two parts, one for the left endpoint and one for the right endpoint;
the perturbed query is $q' = \{d'_1:[l'_1,h'_1],\ d'_2:[l'_2,h'_2],\ \ldots,\ d'_n:[l'_n,h'_n]\}$,
where
[Formula image BDA0003915800010000031: not reproduced in the text]
is the perturbed query interval on the i-th dimension.
Further, as a preferred form of noise addition, when noise is added to the interval width the perturbed query is $q' = \{d'_1:[l'_1,h'_1],\ d'_2:[l'_2,h'_2],\ \ldots,\ d'_n:[l'_n,h'_n]\}$,
where $[l'_i,h'_i] = [\,l_i,\ l_i + (|h_i - l_i + 1| + \mathrm{Lap}(N/\varepsilon))\,]$ is the perturbed query interval on the i-th dimension.
Furthermore, as a preferred form of noise addition, for interval decomposition with noise, assume the branching factor of the dimension tree is b, so that the tree height is $t = \log_b N$. The query predicate of the star connection query q is $\{d_1:[l_1,h_1],\ d_2:[l_2,h_2],\ \ldots,\ d_n:[l_n,h_n]\}$. For the query interval $[l_i,h_i]$ on the i-th dimension, $i \in \{1,\ldots,n\}$, the interval is decomposed according to the tree structure on that dimension into $[l_i,h_i] = \{[l_{i1},h_{i1}],\ \ldots,\ [l_{it},h_{it}]\}$, and noise perturbation of each subinterval yields $[l'_i,h'_i] = \{[l'_{i1},h'_{i1}],\ \ldots,\ [l'_{it},h'_{it}]\}$, where $[l'_{ij},h'_{ij}]$, $j \in \{1,\ldots,t\}$, is a perturbed subinterval.
Further, the perturbed subinterval $[l'_{ij},h'_{ij}]$ is calculated in either of the following two ways:
[Formula image BDA0003915800010000032: not reproduced in the text]
or
[Formula image BDA0003915800010000041: not reproduced in the text]
In a second aspect, a data warehouse star connection query system based on differential privacy is provided, comprising:
a noise adding module, configured to extract the query predicates after the server receives a star connection query and to add noise to the query interval ranges specified by the query predicates;
a query result perturbation module, configured to evaluate the aggregation function over the noisy query intervals to obtain a perturbed query result;
and a response module, configured to return the perturbed query result to the data analyst.
Preferably, the ways in which the noise adding module adds noise to the query interval ranges specified by the query predicates include:
noise at both interval endpoints: Laplace noise is added to the two endpoints of each query interval;
noise on the interval width: one endpoint of each query interval is fixed, and Laplace noise is added to the interval width;
interval decomposition with noise: a dimension tree is built on each dimension according to the concept hierarchy, each query interval is decomposed into several subintervals according to the dimension tree before perturbation, and noise is added to each subinterval separately.
In a third aspect, a computer-readable storage medium is provided, which stores a computer program that, when executed by a processor, implements the steps of the differential privacy-based data warehouse star connection query method.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a differential privacy scheme for star connection queries in a data warehouse that addresses usability, efficiency and scalability together. For differentially private star connection queries it perturbs the query first and answers it afterwards, which reduces the global sensitivity of the star connection query from the query side and lessens the effect of noise on the query result. Because the query interval is perturbed separately on each dimension, the global sensitivity of the star connection query operation is fundamentally reduced. In addition, even when the method is extended to n dimension tables, the introduced noise does not grow exponentially with the number of dimension tables. Because each dimension's query interval is perturbed independently before the query is answered, the perturbations can run in parallel, which reduces computational overhead. The scheme improves query accuracy and also extends to star connection queries over multiple dimension tables.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The following drawings show only some embodiments of the present invention, and those skilled in the art can obtain other related drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario of the data warehouse star connection query method based on differential privacy;
FIG. 2 is a flowchart of the data warehouse star connection query method based on differential privacy.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. Based on these embodiments, those skilled in the art can obtain other embodiments without creative effort.
Since data is highly sensitive to connection query operations, deleting or adding a single record may have an unbounded impact on the result of a connection operation. Current differential privacy schemes for star connection queries in data warehouses therefore generally suffer from excessive global sensitivity, which makes the query results less usable. In addition, existing differential privacy schemes for star connection queries cannot balance usability, efficiency and scalability. The invention provides a differential privacy scheme for star connection queries in a data warehouse that addresses all three: it reduces the global sensitivity of the star connection query from the query side and lessens the effect of noise on the query result, so that it both improves query accuracy and extends star connection queries to multiple dimension tables. Existing star connection query schemes for data warehouses mainly have the following defects:
1. The usability of the query results is low. Data is highly sensitive to the star connection query operation, and even an approximate upper bound on the local sensitivity is large, so excessive noise is introduced directly into the query result.
2. The scalability is poor. A star connection in a data warehouse generally joins several dimension tables, and in existing schemes the sensitivity grows exponentially with the number of dimension tables, so their use is limited once the number of dimension tables reaches n.
3. The computation is time-consuming. Existing differential privacy schemes first join the dimension tables and then add noise, and joining many dimension tables incurs high computational overhead.
To address these defects of the prior art, the invention provides a query method that is well designed, secure and efficient: noise is added to the query, and the noisy query is then answered. The resulting sensitivity is several orders of magnitude lower than in existing schemes, and even when the method is extended to n dimension tables the introduced noise does not grow exponentially with the number of dimension tables. Because noise is added to the query before it is answered, the predicates of the connection query can be perturbed in parallel, which reduces computational overhead.
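A minimal sketch of the parallel perturbation mentioned above, assuming each dimension's interval can be perturbed by an independent function call. The names `perturb_width` and `perturb_query_parallel`, the thread pool, and the per-dimension seeding are hypothetical choices made for illustration; the patent does not prescribe an execution model. The per-dimension step uses the interval-width formula stated in the disclosure.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Tuple

Interval = Tuple[float, float]


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_width(interval: Interval, N: int, epsilon: float,
                  rng: random.Random) -> Interval:
    """Keep the left endpoint and add Lap(N/epsilon) to the interval width."""
    low, high = interval
    return (low, low + abs(high - low + 1) + laplace(N / epsilon, rng))


def perturb_query_parallel(query: Dict[str, Interval], N: int, epsilon: float,
                           seed: int = 0) -> Dict[str, Interval]:
    """Perturb every dimension's interval independently in a thread pool."""
    dims = sorted(query)

    def work(indexed_dim):
        index, dim = indexed_dim
        # Assumption: a fixed seed per dimension, only to make the demo repeatable.
        # A real deployment would need an unpredictable randomness source.
        rng = random.Random(seed + index)
        return dim, perturb_width(query[dim], N, epsilon, rng)

    with ThreadPoolExecutor() as pool:
        return dict(pool.map(work, enumerate(dims)))
```

Because no dimension's perturbation reads another dimension's result, the work items need no coordination; the same structure would carry over to processes or separate workers.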
Referring to FIG. 2, the data warehouse star connection query method based on differential privacy in an embodiment of the present invention includes the following steps:
the server receives a star connection query, extracts the query predicates, and adds noise to the query interval ranges specified by the query predicates;
the aggregation function is evaluated over the noisy query intervals to obtain a perturbed query result;
and the perturbed query result is returned to the data analyst.
FIG. 1 shows an application scenario of the method: an untrusted data analyst submits a star connection query q to a server, and the server answers the query with added noise based on the collected data warehouse D. The core task is for the server to answer the star connection query in a differentially private manner, that is, to design a noise addition scheme that satisfies differential privacy. The method adds noise to the query and then answers it, and provides three ways of adding noise to the query. It consists mainly of the following two processes:
a. add noise to the query, i.e., q' = q + noise;
b. aggregate over the noisy query, i.e., q'(D).
Suppose the data warehouse $D = \{d_1, d_2, \ldots, d_n\}$ contains n dimension tables, and the threshold of each dimension table is set to N. Noise is added separately according to the number k (k ≤ n) of dimensions involved in the query; since a change in one record affects N intervals, the global sensitivity of each dimension is N. Suppose the query predicate of a star connection query q involves all dimensions, with intervals $q = \{d_1:[l_1,h_1],\ d_2:[l_2,h_2],\ \ldots,\ d_n:[l_n,h_n]\}$, where $[l_i,h_i]$ denotes the query interval on the i-th dimension, $i \in \{1,\ldots,n\}$.
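The notation above can be mirrored in a small data model. This is a sketch under stated assumptions: the class and function names are hypothetical, and the threshold N is interpreted here as bounding each dimension's attribute domain to the integers 1..N, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Interval:
    """Closed query interval [low, high] on one dimension."""
    low: int
    high: int

    def __post_init__(self) -> None:
        if self.low > self.high:
            raise ValueError(f"empty interval [{self.low}, {self.high}]")


def check_query(query: Dict[str, Interval], dimensions: List[str], N: int) -> int:
    """Validate a star connection query and return k, the number of dimensions it uses."""
    for name, interval in query.items():
        if name not in dimensions:
            raise KeyError(f"unknown dimension table: {name}")
        # Assumption: N bounds the attribute domain of every dimension to 1..N.
        if interval.low < 1 or interval.high > N:
            raise ValueError(f"{name}: interval outside the assumed domain 1..{N}")
    return len(query)  # k <= n, since every key names a known dimension


# Hypothetical warehouse with n = 3 dimension tables and a query over k = 2 of them.
dimensions = ["date", "region", "product"]
query = {"date": Interval(10, 20), "region": Interval(1, 3)}
k = check_query(query, dimensions, N=100)
```

Keeping one interval per dimension name makes the later per-dimension perturbation a simple map over the dictionary.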
For process a, three different ways of adding noise to the query intervals are designed.
1. Noise at both interval endpoints
Laplace noise is added to the two endpoints of each query interval, and the privacy cost is divided into two parts, one for the left endpoint and one for the right endpoint.
That is, the perturbed query is $q' = \{d'_1:[l'_1,h'_1],\ d'_2:[l'_2,h'_2],\ \ldots,\ d'_n:[l'_n,h'_n]\}$,
where
[Formula image BDA0003915800010000071: not reproduced in the text]
is the perturbed query interval on the i-th dimension.
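The formula for this mode survives only as an image reference, so the following is a sketch under explicit assumptions rather than the patent's exact expression: the privacy cost ε is split evenly, each endpoint receives independent Laplace noise with scale N/(ε/2) = 2N/ε, and the endpoints are reordered if the noise makes them cross. The function names are hypothetical.

```python
import random
from typing import Tuple


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_endpoints(low: float, high: float, N: int, epsilon: float,
                      rng: random.Random) -> Tuple[float, float]:
    """Add independent Laplace noise to both endpoints of [low, high]."""
    # Assumption: an even split, epsilon/2 per endpoint, with per-dimension sensitivity N.
    scale = N / (epsilon / 2.0)
    noisy_low = low + laplace(scale, rng)
    noisy_high = high + laplace(scale, rng)
    # Assumption: reorder crossed endpoints; this is post-processing of the noisy values.
    return (min(noisy_low, noisy_high), max(noisy_low, noisy_high))
```

Reordering touches only the already-noisy values, so it does not consume additional privacy cost; whether the original scheme clamps, reorders or rejects crossed endpoints is not recoverable from the text.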
2. Noise on the interval width
One endpoint of each query interval is fixed, and Laplace noise is added to the interval width.
The perturbed query is $q' = \{d'_1:[l'_1,h'_1],\ d'_2:[l'_2,h'_2],\ \ldots,\ d'_n:[l'_n,h'_n]\}$,
where $[l'_i,h'_i] = [\,l_i,\ l_i + (|h_i - l_i + 1| + \mathrm{Lap}(N/\varepsilon))\,]$ is the perturbed query interval on the i-th dimension.
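The width formula translates directly into code. The sketch below follows the expression as written, with the left endpoint fixed and Lap(N/ε) added to the width |h − l + 1|; the function names and the use of Python's `random` module are illustrative assumptions.

```python
import random
from typing import Tuple


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_width(low: float, high: float, N: int, epsilon: float,
                  rng: random.Random) -> Tuple[float, float]:
    """Return [low, low + (|high - low + 1| + Lap(N / epsilon))]."""
    noisy_width = abs(high - low + 1) + laplace(N / epsilon, rng)
    return (low, low + noisy_width)
```

Note that a large negative noise draw can make the noisy width negative, which yields an empty interval; how that case is handled is not specified in the text, so the sketch returns the raw value and leaves the decision to the caller.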
3. Interval decomposition with noise
A dimension tree is built on each dimension according to the concept hierarchy, each query interval is decomposed into several subintervals according to the dimension tree before perturbation, and noise is added to each subinterval separately.
Assuming the branching factor of the dimension tree is b, the tree height is $t = \log_b N$.
The query predicate of q is $\{d_1:[l_1,h_1],\ d_2:[l_2,h_2],\ \ldots,\ d_n:[l_n,h_n]\}$. Taking the query interval $[l_i,h_i]$ on the i-th dimension, $i \in \{1,\ldots,n\}$, as an example, $[l_i,h_i]$ can be decomposed according to the tree structure on the i-th dimension into $[l_i,h_i] = \{[l_{i1},h_{i1}],\ \ldots,\ [l_{it},h_{it}]\}$, and noise perturbation of each subinterval yields $[l'_i,h'_i] = \{[l'_{i1},h'_{i1}],\ \ldots,\ [l'_{it},h'_{it}]\}$. Here $[l'_{ij},h'_{ij}]$, $j \in \{1,\ldots,t\}$, is a perturbed subinterval.
$[l'_{ij},h'_{ij}]$ can be calculated in either of two ways:
[Formula image BDA0003915800010000072: not reproduced in the text]
or
[Formula image BDA0003915800010000073: not reproduced in the text]
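Because the two subinterval formulas are available only as image references, the sketch below illustrates the decomposition step alone and leaves the noise formulas out. It assumes a complete b-ary dimension tree over the integer domain 0..b^t − 1 and uses the standard greedy decomposition of a range into aligned tree nodes; the patent's concept-hierarchy tree may be shaped differently, and the number of pieces returned here depends on the interval and need not equal t. The function names are hypothetical.

```python
from typing import List, Tuple


def tree_height(N: int, b: int) -> int:
    """Smallest t with b**t >= N, i.e. t = ceil(log_b N), computed without floats."""
    t, capacity = 0, 1
    while capacity < N:
        capacity *= b
        t += 1
    return t


def decompose(low: int, high: int, b: int, t: int) -> List[Tuple[int, int]]:
    """Split [low, high] into blocks that are nodes of a complete b-ary tree."""
    if not 0 <= low <= high < b ** t:
        raise ValueError("interval must lie inside the domain 0..b**t - 1")
    pieces = []
    pos = low
    while pos <= high:
        size = 1
        # Grow the block while it stays aligned to the tree and inside the interval.
        while size < b ** t and pos % (size * b) == 0 and pos + size * b - 1 <= high:
            size *= b
        pieces.append((pos, pos + size - 1))
        pos += size
    return pieces
```

For a binary tree over the domain 0..7, `decompose(1, 6, 2, 3)` splits the interval into two leaves and two height-1 nodes; each returned block could then be perturbed separately by whichever of the two formulas applies.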
Finally, process b is performed. Any of the three noise addition methods above yields the perturbed query q'; the aggregation function in the query is then evaluated to obtain the final query result q'(D), which is returned to the data analyst.
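Processes a and b together can be sketched as follows. This is an illustrative toy, not the patent's implementation: the fact table is assumed to be already joined with its dimension attributes and held in memory as dictionaries, the aggregation function is assumed to be COUNT, and the interval-width mode is used for process a. All names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_query(query: Dict[str, Interval], N: int, epsilon: float,
                  rng: random.Random) -> Dict[str, Interval]:
    """Process a: q' = q + noise, using the interval-width formula on each dimension."""
    noisy = {}
    for dim, (low, high) in query.items():
        noisy[dim] = (low, low + abs(high - low + 1) + laplace(N / epsilon, rng))
    return noisy


def answer_count(rows: List[Dict[str, float]],
                 noisy_query: Dict[str, Interval]) -> int:
    """Process b: evaluate COUNT over the rows selected by the perturbed intervals."""
    return sum(
        all(low <= row[dim] <= high for dim, (low, high) in noisy_query.items())
        for row in rows
    )
```

The aggregation itself adds no noise: the result is perturbed only because the intervals that select the rows were perturbed beforehand, which is the ordering the method describes.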
Another embodiment of the present invention provides a data warehouse star connection query system based on differential privacy, comprising:
a noise adding module, configured to extract the query predicates after the server receives a star connection query and to add noise to the query interval ranges specified by the query predicates;
a query result perturbation module, configured to evaluate the aggregation function over the noisy query intervals to obtain a perturbed query result;
and a response module, configured to return the perturbed query result to the data analyst.
In a possible implementation, the ways in which the noise adding module adds noise to the query interval ranges specified by the query predicates include:
noise at both interval endpoints: Laplace noise is added to the two endpoints of each query interval;
noise on the interval width: one endpoint of each query interval is fixed, and Laplace noise is added to the interval width;
interval decomposition with noise: a dimension tree is built on each dimension according to the concept hierarchy, each query interval is decomposed into several subintervals according to the dimension tree before perturbation, and noise is added to each subinterval separately.
For differentially private star connection queries, the invention perturbs the query first and answers it afterwards. The key points of the technical solution are embodied at least in the following aspects:
1. To address the excessive global sensitivity of current differential privacy schemes for star connection queries, the invention perturbs the query itself in order to reduce the influence of sensitivity.
2. Each dimension is treated independently: noise is added to the query interval of each dimension involved in the star connection query, and three different ways of adding noise to the query intervals are designed.
3. Perturbation is considered for both representations of an interval, by its endpoints and by its width.
4. A tree structure is used to decompose the interval before noise is added, which further improves query accuracy.
Another embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the differential privacy-based data warehouse star connection query method.
The computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to carry out the differential privacy-based data warehouse star connection query method of the present invention.
The terminal may be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server, and may also comprise a processor and a memory. The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, discrete gate or transistor logic, discrete hardware components, and so on. The memory may be used to store computer programs and/or modules, and the processor implements the functions of the differential privacy-based data warehouse star connection query system by running or executing the computer programs and/or modules stored in the memory and by calling the data stored in the memory.
While the invention has been described in further detail with reference to specific preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A data warehouse star connection query method based on differential privacy, characterized by comprising the following steps:
the server receives a star connection query, extracts the query predicates, and adds noise to the query interval ranges specified by the query predicates;
the aggregation function is evaluated over the noisy query intervals to obtain a perturbed query result;
and the perturbed query result is returned to the data analyst.
2. The differential privacy-based data warehouse star connection query method according to claim 1, characterized in that: suppose a data warehouse $D = \{d_1, d_2, \ldots, d_n\}$ contains n dimension tables, and the threshold of each dimension table is set to N; noise is added separately according to the number k of dimensions involved in the query, where k ≤ n, and since a change in one record affects N intervals, the global sensitivity of each dimension is N; suppose the query predicate of a star connection query q involves all dimensions, with intervals $q = \{d_1:[l_1,h_1],\ d_2:[l_2,h_2],\ \ldots,\ d_n:[l_n,h_n]\}$, where $[l_i,h_i]$ denotes the query interval on the i-th dimension, $i \in \{1,\ldots,n\}$.
3. The differential privacy-based data warehouse star connection query method according to claim 2, wherein the ways of adding noise to the query interval ranges specified by the query predicates include:
noise at both interval endpoints: Laplace noise is added to the two endpoints of each query interval;
noise on the interval width: one endpoint of each query interval is fixed, and Laplace noise is added to the interval width;
interval decomposition with noise: a dimension tree is built on each dimension according to the concept hierarchy, each query interval is decomposed into several subintervals according to the dimension tree before perturbation, and noise is added to each subinterval separately.
4. The differential privacy-based data warehouse star connection query method according to claim 3, wherein, when noise is added at both interval endpoints, the privacy cost is divided into two parts, one for the left endpoint and one for the right endpoint;
the perturbed query is $q' = \{d'_1:[l'_1,h'_1],\ d'_2:[l'_2,h'_2],\ \ldots,\ d'_n:[l'_n,h'_n]\}$,
where
[Formula image FDA0003915800000000011: not reproduced in the text]
is the perturbed query interval on the i-th dimension.
5. The differential privacy-based data warehouse star connection query method according to claim 3, wherein, when noise is added to the interval width, the perturbed query is $q' = \{d'_1:[l'_1,h'_1],\ d'_2:[l'_2,h'_2],\ \ldots,\ d'_n:[l'_n,h'_n]\}$,
where $[l'_i,h'_i] = [\,l_i,\ l_i + (|h_i - l_i + 1| + \mathrm{Lap}(N/\varepsilon))\,]$ is the perturbed query interval on the i-th dimension.
6. The differential privacy-based data warehouse star connection query method according to claim 3, wherein, for interval decomposition with noise, the branching factor of the dimension tree is assumed to be b, so that the tree height is $t = \log_b N$; the query predicate of the star connection query q is $\{d_1:[l_1,h_1],\ d_2:[l_2,h_2],\ \ldots,\ d_n:[l_n,h_n]\}$; for the query interval $[l_i,h_i]$ on the i-th dimension, $i \in \{1,\ldots,n\}$, the interval is decomposed according to the tree structure on that dimension into $[l_i,h_i] = \{[l_{i1},h_{i1}],\ \ldots,\ [l_{it},h_{it}]\}$, and noise perturbation of each subinterval yields $[l'_i,h'_i] = \{[l'_{i1},h'_{i1}],\ \ldots,\ [l'_{it},h'_{it}]\}$, where $[l'_{ij},h'_{ij}]$, $j \in \{1,\ldots,t\}$, is a perturbed subinterval.
7. The differential privacy-based data warehouse star connection query method according to claim 6, wherein the perturbed subinterval $[l'_{ij},h'_{ij}]$ is calculated in either of the following two ways:
[Formula image FDA0003915800000000021: not reproduced in the text]
or
[Formula image FDA0003915800000000022: not reproduced in the text]
8. A data warehouse star connection query system based on differential privacy, characterized by comprising:
a noise adding module, configured to extract the query predicates after the server receives a star connection query and to add noise to the query interval ranges specified by the query predicates;
a query result perturbation module, configured to evaluate the aggregation function over the noisy query intervals to obtain a perturbed query result;
and a response module, configured to return the perturbed query result to the data analyst.
9. The differential privacy-based data warehouse star connection query system according to claim 8, wherein the ways in which the noise adding module adds noise to the query interval ranges specified by the query predicates include:
noise at both interval endpoints: Laplace noise is added to the two endpoints of each query interval;
noise on the interval width: one endpoint of each query interval is fixed, and Laplace noise is added to the interval width;
interval decomposition with noise: a dimension tree is built on each dimension according to the concept hierarchy, each query interval is decomposed into several subintervals according to the dimension tree before perturbation, and noise is added to each subinterval separately.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the differential privacy-based data warehouse star connection query method according to any one of claims 1 to 7.
CN202211337791.8A 2022-10-28 2022-10-28 Data warehouse star connection query method, system and medium based on differential privacy Pending CN115858579A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211337791.8A CN115858579A (en) 2022-10-28 2022-10-28 Data warehouse star connection query method, system and medium based on differential privacy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211337791.8A CN115858579A (en) 2022-10-28 2022-10-28 Data warehouse star connection query method, system and medium based on differential privacy

Publications (1)

Publication Number Publication Date
CN115858579A true CN115858579A (en) 2023-03-28

Family

ID=85662039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211337791.8A Pending CN115858579A (en) 2022-10-28 2022-10-28 Data warehouse star connection query method, system and medium based on differential privacy

Country Status (1)

Country Link
CN (1) CN115858579A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116451278A (en) * 2023-06-19 2023-07-18 杭州世平信息科技有限公司 Star-connection workload query privacy protection method, system, equipment and medium
CN117633902A (en) * 2024-01-25 2024-03-01 杭州世平信息科技有限公司 OLAP star-type connection workload query differential privacy protection method and system

Similar Documents

Publication Publication Date Title
Li et al. Skyline community search in multi-valued networks
CN115858579A (en) Data warehouse star connection query method, system and medium based on differential privacy
Zhang et al. Bed-tree: an all-purpose index structure for string similarity search based on edit distance
CN108509547B (en) Information management method, information management system and electronic equipment
US8744197B2 (en) Identifying information related to a particular entity from electronic sources, using dimensional reduction and quantum clustering
US7650330B1 (en) Information extraction from a database
JP4814570B2 (en) Resistant to ambiguous duplication
US9747349B2 (en) System and method for distributing queries to a group of databases and expediting data access
Wang et al. LogUAD: Log unsupervised anomaly detection based on Word2Vec
Li et al. Bursty event detection from microblog: a distributed and incremental approach
Wang et al. Synthesizing mapping relationships using table corpus
US20190080006A1 (en) Computing features of structured data
Li et al. Anonymization by local recoding in data with attribute hierarchical taxonomies
WO2021047373A1 (en) Big data-based column data processing method, apparatus, and medium
US20130318091A1 (en) Systems, methods, and computer program products for fast and scalable proximal search for search queries
Alassi et al. Effectiveness of template detection on noise reduction and websites summarization
Welch et al. Fast and accurate incremental entity resolution relative to an entity knowledge base
Gagliardelli et al. Bigdedup: a big data integration toolkit for duplicate detection in industrial scenarios
Mandl et al. Preference analytics in EXASolution
Kathare et al. A comprehensive study of Elasticsearch
Yin et al. An industrial dynamic skyline based similarity joins for multidimensional big data applications
US20060122963A1 (en) System and method for performing a data uniqueness check in a sorted data set
CN116451278A (en) Star-connection workload query privacy protection method, system, equipment and medium
Liu et al. PAIRPQ: an efficient path index for regular path queries on knowledge graphs
Ganguly et al. Deterministic k-set structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination