CN118096441A - Power data aggregation system and method based on w event-level local differential privacy
- Publication number
- CN118096441A; CN202410519315.0A
- Authority
- CN
- China
- Prior art keywords
- data
- power consumption
- disturbance
- preset
- moment
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Abstract
The application belongs to the technical field of information security, and particularly relates to a power data aggregation system and method based on w event-level local differential privacy. The system comprises a user side and a data aggregation end. The user side is used for disturbing the original power consumption data of each user at each moment, using the pre-calculated privacy budget and relaxation factor required for power consumption data disturbance at each of the preset w moments, to obtain final disturbance data, and for sending the final disturbance data to the data aggregation end. The data aggregation end is used for determining, based on the final disturbance data, the total power consumption of all users at each of the preset w moments, and for fitting these totals to obtain a fitting curve of the users' total power consumption. By combining a privacy parameter segmentation mechanism, the application realizes power data aggregation based on differential privacy, effectively fits the total power consumption of the users over a period of time without revealing user information, and ensures the security of the real data.
Description
Technical Field
The invention belongs to the technical field of information security, and particularly relates to a power data aggregation system and method based on w event-level local differential privacy.
Background
With energy problems becoming increasingly prominent, smart grids have emerged to support sustainable development. As a cyber-physical system, the smart grid combines the traditional power grid with advanced information and communication technology, such as advanced metering systems, to provide higher reliability, flexibility, sustainability and safety, and is regarded as one of the important trends in next-generation power grid development. The smart meter, as the terminal of the smart grid, monitors a user's electricity consumption in real time and collects the user's power consumption data. Service providers then apply big data analysis to the massive collected data to analyze user electricity consumption behavior accurately. However, the collection and analysis of smart grid data poses a serious threat to user privacy. An attacker can use power consumption data to infer private information such as consumption patterns, household composition and appliance usage, and leakage of this information can cause economic loss or even endanger the user's life; for example, an attacker may commit theft or harm people when the user's electricity consumption is low. Therefore, how to analyze electricity data accurately and efficiently while protecting the privacy of user power consumption data is an important problem.
To protect the privacy of user power consumption data, researchers have proposed different schemes, which focus mainly on encryption, anonymization and random perturbation. Encryption can recover accurate statistical properties, but its computational overhead is large. Anonymization reduces the computational cost, but when an attacker has some background knowledge, anonymized data can be attacked and private information revealed. Random perturbation addresses this problem because it can resist background knowledge attacks. The most widely used random perturbation method at present is differential privacy, which defines the level of privacy protection rigorously; even an attacker with a large amount of background knowledge cannot obtain information about the real data.
Differential privacy includes centralized differential privacy and local differential privacy. In centralized differential privacy, users send their data to a trusted data aggregation end, which perturbs the statistics to satisfy differential privacy and releases the perturbed data. In practice, however, no data aggregation end is completely trusted, and if the data aggregation end is attacked, the users' real data can be revealed. Kasiviswanathan et al. therefore proposed the concept of local differential privacy (LDP). In the LDP model, each user perturbs the data locally so that it satisfies LDP and sends the perturbed data to the data aggregation end, so that user data is not revealed even if the data aggregation end is untrusted. Wang et al. proposed a mechanism under the LDP model that perturbs data into different intervals according to given probabilities and can handle the aggregation of users' one-dimensional and multidimensional data, but it cannot process time-series stream data.
For data stream publishing, Wang Teng et al. [Wang Teng, Yang Xinyu, Ren Xue, et al. A data-adaptive privacy protection mechanism for data stream publishing [J]. Scientia Sinica Informationis, 2021, 51(7): 1199-1216] proposed the concept of w-event-level ε-local differential privacy, under which the data at the w moments preceding any moment can be aggregated while the perturbation of the w-dimensional data satisfies ε-local differential privacy. However, ε-LDP provides strict privacy protection at the cost of some data utility. Introducing δ > 0 allows the perturbation to leak privacy with probability δ; this relaxes the privacy protection of the algorithm, improving data availability while still protecting user privacy with high probability. This relaxed local differential privacy is formally defined as follows: given a random perturbation algorithm A with domain Dom(A) and range Range(A), if for any two different tuples t, t' ∈ Dom(A) and any subset S_A ⊆ Range(A) the perturbation probabilities satisfy $\Pr[A(t) \in S_A] \le e^{\varepsilon}\,\Pr[A(t') \in S_A] + \delta$, then A satisfies (ε, δ)-local differential privacy, also known as relaxed local differential privacy. Here ε > 0 is the privacy budget, whose value determines the degree of privacy protection: the smaller ε is, the stronger the privacy protection of algorithm A, and the larger ε is, the weaker it is. 0 ≤ δ < 1 is the relaxation factor, which represents the probability that $\Pr[A(t) \in S_A]$ exceeds $e^{\varepsilon}\,\Pr[A(t') \in S_A]$; that is, algorithm A causes no privacy leakage with probability 1 − δ. The value of δ is generally smaller than the reciprocal of the number of data items to be counted.
Bassily et al. showed that the mechanism has smaller error bounds when 0 < δ < 1, so the (ε, δ)-LDP model is more practical.
Based on the (ε, δ)-LDP model, Dwork et al. proposed the Gaussian mechanism, which injects zero-mean noise directly into the data: each user reports the data plus a random number drawn from a Gaussian distribution to the collector, and the collector computes an unbiased estimate of the mean. Later, the optimal Gaussian mechanism proposed by Balle et al. reduced the estimation error of the Gaussian mechanism by giving a way to calculate the minimum variance of the Gaussian noise. In addition, Wang et al. designed a mean estimation mechanism for numerical data that perturbs each value, according to given probabilities, to one of two fixed values; the collector directly averages the perturbed values to obtain an unbiased estimate. However, the above mechanisms all operate on numerical data and have difficulty handling streaming data. User power consumption data is time-series stream data. To capture its temporal characteristics, the w-event-level (ε, δ)-local differential privacy model reflects the similarity of power consumption data at adjacent moments more accurately than the plain (ε, δ)-local differential privacy model, and this similarity can be exploited to improve the accuracy of aggregation. However, no existing solution realizes power data aggregation under the w-event-level (ε, δ)-local differential privacy model.
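The Gaussian mechanism described above can be illustrated with a minimal sketch. It assumes the classic noise calibration sigma = sqrt(2 ln(1.25/δ)) · Δ / ε, whose standard analysis holds for 0 < ε < 1, and data normalised to [-1, 1] so that the local sensitivity Δ is 2. The function names and the synthetic data are hypothetical and are not taken from the patent.

```python
import math
import random


def gaussian_sigma(epsilon: float, delta: float, sensitivity: float) -> float:
    """Noise scale of the classic Gaussian mechanism (analysis valid for 0 < epsilon < 1)."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def perturb(value: float, sigma: float, rng: random.Random) -> float:
    """User side: add zero-mean Gaussian noise to a single value."""
    return value + rng.gauss(0.0, sigma)


def estimate_mean(reports: list[float]) -> float:
    """Collector side: the plain average is unbiased because the noise has mean 0."""
    return sum(reports) / len(reports)


rng = random.Random(0)
# Hypothetical normalised readings in [-1, 1]; any two inputs differ by at most 2.
true_values = [rng.uniform(-1.0, 1.0) for _ in range(50_000)]
sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=2.0)
reports = [perturb(v, sigma, rng) for v in true_values]
true_mean = estimate_mean(true_values)
estimated_mean = estimate_mean(reports)
```

With this many users the averaged noise largely cancels, which is why the collector can recover the mean even though each individual report is dominated by noise.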
Disclosure of Invention
In order to overcome the problems in the related art, the present application provides a system and a method for aggregating power data based on w event-level local differential privacy.
According to a first aspect of an embodiment of the present application, there is provided a power data aggregation system based on w event level local differential privacy, including: a user side and a data aggregation end;
The user side is used for disturbing the original power consumption data of each user at each moment, using the pre-calculated privacy budget and relaxation factor required for power consumption data disturbance at each of the preset w moments, to obtain final disturbance data, and for sending the final disturbance data to the data aggregation end;
The data aggregation end is used for determining total power consumption of all users at each moment in preset w moments based on the final disturbance data, and fitting the total power consumption of all users at each moment in preset w moments to obtain a user power consumption total fitting curve;
the privacy budget and the relaxation factor required by the power consumption data disturbance at each moment in the preset w moments are determined by using the preset overall privacy budget and the preset overall relaxation factor.
Preferably, the user side includes:
The determining unit is used for determining the original power consumption data to be formatted by utilizing the original power consumption data of each user at each moment;
The formatting unit is used for formatting the original power consumption data to be formatted to obtain formatted power consumption data;
The disturbance unit is used for disturbing the formatted power consumption data by utilizing privacy budget and relaxation factors required by the disturbance of the power consumption data at each moment in the preset w moments to obtain disturbed power consumption data;
and the recovery unit is used for recovering the disturbed electricity consumption data to an original interval to obtain recovered disturbance data, wherein the recovered disturbance data is the final disturbance data.
Preferably, the determining unit is specifically configured to:
select, from the original power consumption data of each user at each moment, the power consumption data at the w most recent moments up to the preset deadline of power consumption data aggregation, as the original power consumption data to be formatted.
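The window selection performed by the determining unit can be sketched as follows. The sketch assumes 0-based moment indices and a per-user list of readings; the function name and the sample values are hypothetical. The selected moments are T-w+1 through T, matching the index range used later in the text.

```python
def select_window(readings: list[float], deadline: int, w: int) -> list[float]:
    """Return the readings at moments deadline-w+1 .. deadline (0-based, inclusive)."""
    if w <= 0 or deadline - w + 1 < 0 or deadline >= len(readings):
        raise ValueError("window does not fit inside the recorded readings")
    return readings[deadline - w + 1 : deadline + 1]


readings = [1.2, 0.8, 0.5, 2.1, 3.4, 2.9, 1.7]  # hypothetical readings of one user
window = select_window(readings, deadline=5, w=3)
```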
Preferably, the disturbance unit includes:
The first calculation module is used for calculating a first disturbance probability and a second disturbance probability by utilizing privacy budget and relaxation factors required by disturbance of the power consumption data at each moment in the preset w moments;
the second calculation module is used for calculating a first segmentation parameter and a second segmentation parameter by using the first disturbance probability and the second disturbance probability;
The third calculation module is used for calculating a first end point of the segmentation interval and a second end point of the segmentation interval by using the first segmentation parameter and the second segmentation parameter;
a first obtaining module, configured to perturb the formatted power consumption data uniformly at random to a point within a first interval with the second disturbance probability, and uniformly at random to a point within a second interval with the first disturbance probability, to obtain the disturbed power consumption data;
wherein the intervals are expressed in terms of l(x), the first end point of the segment interval for input x; r(x), the second end point of the segment interval for input x; and the interval endpoint C, where x is the formatted power consumption data of the i-th user at the t-th moment.
Preferably, the second computing module is specifically configured to:
Calculating to obtain a first segmentation parameter by using the first disturbance probability and the second disturbance probability;
Calculating to obtain an interval endpoint by using the first segmentation parameter, the first disturbance probability and the second disturbance probability;
and calculating to obtain a second segmentation parameter by using the interval end point and the first segmentation parameter.
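The source does not reproduce the formulas for the two disturbance probabilities, the segmentation parameters or the interval endpoint, so they are not reconstructed here. As a point of comparison only, the following sketch implements the well-known ε-LDP piecewise mechanism of Wang et al. for inputs in [-1, 1], which has the same two-interval structure; it is not the (ε, δ) variant of this application. The function name is hypothetical.

```python
import math
import random


def piecewise_perturb(x: float, epsilon: float, rng: random.Random) -> float:
    """Perturb x in [-1, 1] with the epsilon-LDP piecewise mechanism; output lies in [-C, C]."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("input must be normalised to [-1, 1]")
    e_half = math.exp(epsilon / 2.0)
    c = (e_half + 1.0) / (e_half - 1.0)            # interval endpoint C
    left = (c + 1.0) / 2.0 * x - (c - 1.0) / 2.0   # l(x)
    right = left + c - 1.0                         # r(x)
    if rng.random() < e_half / (e_half + 1.0):
        # High-probability branch: uniform point inside [l(x), r(x)].
        return rng.uniform(left, right)
    # Low-probability branch: uniform point in [-C, l(x)) U (r(x), C].
    left_len = left + c      # length of [-C, l(x))
    right_len = c - right    # length of (r(x), C]
    u = rng.uniform(0.0, left_len + right_len)
    return -c + u if u < left_len else right + (u - left_len)


rng = random.Random(1)
samples = [piecewise_perturb(0.5, 2.0, rng) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
```

The sample mean stays close to the input because the mechanism is unbiased, which is the property that lets the aggregation end sum or average the reports directly.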
Preferably, the data aggregation end includes:
The second acquisition module is used for summing, over all users, the final disturbance data at each of the w moments up to the preset deadline of power consumption data aggregation, to obtain the total power consumption of all users at each of the preset w moments;
and the fitting module is used for fitting the total power consumption of all the users at each moment in the preset w moments by utilizing a least square method based on a preset fitting initial function to obtain a fitting curve of the total power consumption of the users.
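The two aggregation steps above can be sketched in a few lines. The source does not state the preset fitting initial function, so this sketch assumes a straight line y = slope · t + intercept and solves the least squares problem in closed form; the function names and the sample reports are hypothetical.

```python
def total_per_moment(reports: list[list[float]]) -> list[float]:
    """reports[i][k] is user i's final disturbance data at the k-th of the w moments."""
    w = len(reports[0])
    return [sum(user[k] for user in reports) for k in range(w)]


def fit_line(ts: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = slope * t + intercept (closed form)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


reports = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.5, 2.5, 3.5],
    [2.0, 2.0, 2.0, 2.0],
]
totals = total_per_moment(reports)
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], totals)
```

A different preset function, such as a higher-order polynomial or a periodic model, would replace `fit_line` while the summation step stays the same.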
Preferably, the user side further includes:
The computing unit is used for computing, according to zero-concentrated differential privacy, the privacy budget and the relaxation factor required for power consumption data disturbance at each of the preset w moments from the preset overall privacy budget and the preset overall relaxation factor.
Preferably, the computing unit is specifically configured to:
Let the preset overall privacy budget be ε and the preset overall relaxation factor be δ, and let each of the preset w moments have its own privacy budget and relaxation factor required for power consumption data disturbance. Let A_0 denote the power consumption data aggregation mechanism that disturbs data over the preset w moments using w-event-level (ε, δ)-local differential privacy, and let {A_1, A_2, …, A_t, …, A_w} denote the power consumption data mechanisms that disturb data at each of the w moments using local differential privacy with the per-moment privacy budget and relaxation factor;
When mechanism A_0 satisfies zero-concentrated differential privacy, its parameter is determined by ε and δ; when mechanisms {A_1, A_2, …, A_t, …, A_w} all satisfy zero-concentrated differential privacy, the parameter of A_t is determined by the per-moment privacy budget and relaxation factor;
According to the composition theorem of zero-concentrated differential privacy, since mechanism A_0 is the sequential composition of mechanisms {A_1, A_2, …, A_t, …, A_w}, the parameter of A_0 is the sum of the parameters of A_1 to A_w. Equating the two expressions for the parameter of A_0 yields the per-moment privacy budget and the per-moment relaxation factor.
Preferably, the calculation formula of the formatted electricity consumption data includes:
In the above formula, i ∈ [1, n], where n is the total number of users; t ∈ [T−w+1, T], where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; [a, b] is the interval of the original power consumption data of each user at each moment; the formula maps the original power consumption data of the i-th user at the t-th moment to the formatted power consumption data of the i-th user at the t-th moment.
Preferably, the formula of the first disturbance probability includes:
the second disturbance probability calculation formula comprises:
In the above formulas, the inputs are the privacy budget and the relaxation factor required for power consumption data disturbance, q is the first disturbance probability, and p is the second disturbance probability.
Preferably, the calculation formula of the first segmentation parameter includes:
the formula for calculating the interval endpoint comprises:
the calculation formula of the second segmentation parameter comprises:
a formula for calculating a first endpoint of the segment interval, comprising:
A formula for calculating a second endpoint of the segment interval, comprising:
In the above formulas, q is the first disturbance probability, p is the second disturbance probability, u is the first segmentation parameter, C is the interval endpoint, v is the second segmentation parameter, l(x) is the first endpoint of the segment interval for input x, and r(x) is the second endpoint of the segment interval for input x.
Preferably, the calculation formula of the restored disturbance data includes:
In the above formula, i ∈ [1, n], where n is the total number of users; t ∈ [T−w+1, T], where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; [a, b] is the interval of the original power consumption data of each user at each moment; the formula maps the disturbed power consumption data of the i-th user at the t-th moment to the recovered disturbance data of the i-th user at the t-th moment.
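The formatting and recovery formulas themselves are not reproduced in the source. A minimal sketch of one plausible pair is given below, under the assumption that formatting maps the original interval [a, b] linearly onto [-1, 1] and recovery applies the inverse map; both the target interval and the function names are assumptions for illustration.

```python
def normalise(value: float, a: float, b: float) -> float:
    """Map a raw reading from [a, b] linearly onto [-1, 1] (assumed target interval)."""
    return 2.0 * (value - a) / (b - a) - 1.0


def recover(value: float, a: float, b: float) -> float:
    """Inverse map; a disturbed value outside [-1, 1] lands outside [a, b]."""
    return (value + 1.0) / 2.0 * (b - a) + a


formatted = normalise(3.0, a=0.0, b=10.0)
restored = recover(formatted, a=0.0, b=10.0)
```

Because both maps are affine, an unbiased disturbance of the formatted value remains unbiased for the original reading after recovery.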
Preferably, the calculation formula of the total power consumption of all users at each moment in the preset w moments includes:
In the above formula, the left-hand side is the total power consumption of all users at the t-th of the preset w moments; i ∈ [1, n], where n is the total number of users; t ∈ [T−w+1, T], where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; the summand is the final disturbance data of the i-th user at the t-th moment.
According to a second aspect of an embodiment of the present application, there is provided a power data aggregation method based on w event level local differential privacy, including:
the user side disturbs the original power consumption data of each user at each moment, using the privacy budget and relaxation factor required for power consumption data disturbance at each of the preset w moments, to obtain final disturbance data, and sends the final disturbance data to the data aggregation end;
The data aggregation end determines total power consumption of all users at each moment in preset w moments based on the final disturbance data, and fits the total power consumption of all users at each moment in preset w moments to obtain a user power consumption total fitting curve;
the privacy budget and the relaxation factor required by the power consumption data disturbance at each moment in the preset w moments are determined by using the preset overall privacy budget and the preset overall relaxation factor.
According to a third aspect of an embodiment of the present application, there is provided a computer apparatus comprising: one or more processors;
a memory for storing one or more programs;
the power data aggregation method based on w event level local differential privacy is implemented when the one or more programs are executed by the one or more processors.
According to a fourth aspect of embodiments of the present application, there is provided a computer readable storage medium having stored thereon a computer program which, when executed, implements the w-event-level local differential privacy-based power data aggregation method.
The technical scheme provided by the invention has the following beneficial effects:
According to the power data aggregation system and method based on w event-level local differential privacy provided by the invention, the user side determines the privacy budget and the relaxation factor required for power consumption data disturbance at each of the preset w moments from the preset overall privacy budget and the preset overall relaxation factor. Dividing the privacy budget in this way guarantees that the data at every moment receives the same privacy protection strength, and disturbing the power consumption data of each moment with an evenly divided privacy budget ensures the accuracy of the power data aggregation result. The user side disturbs the original power consumption data of each user at each moment using these per-moment parameters to obtain final disturbance data, and sends the final disturbance data to the data aggregation end; because the original data is disturbed by a differential privacy method, background knowledge attacks are resisted while the calculation cost is reduced. The data aggregation end determines the total power consumption of all users at each of the preset w moments based on the final disturbance data, and fits these totals to obtain a fitting curve of the users' total power consumption. Combined with the privacy parameter segmentation mechanism, this realizes power data aggregation based on differential privacy, effectively fits the total power consumption of the users over a period of time without revealing user information, and ensures the security of the real data.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block diagram of a power data aggregation system based on w event level local differential privacy provided by an embodiment of the present invention;
FIG. 2 is a workflow diagram of a power data aggregation system based on w event level local differential privacy provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of a specific flow of a user side in a power data aggregation system based on differential privacy according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a specific flow of a data aggregation end in a power data aggregation system based on differential privacy according to an embodiment of the present invention;
Fig. 5 is a flowchart of a power data aggregation method based on w event level local differential privacy according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the following embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
The invention provides a power data aggregation system based on w event-level local differential privacy, as shown in fig. 1, comprising: a user side and a data aggregation end;
The user side is used for disturbing the original power consumption data of each user at each moment, using the pre-calculated privacy budget and relaxation factor required for power consumption data disturbance at each of the preset w moments, to obtain final disturbance data, and for sending the final disturbance data to the data aggregation end;
The data aggregation end is used for determining total power consumption of all users at each moment in preset w moments based on final disturbance data, and fitting the total power consumption of all users at each moment in preset w moments to obtain a user power consumption total fitting curve;
the privacy budget and the relaxation factor required by the power consumption data disturbance at each moment in the preset w moments are determined by using the preset overall privacy budget and the preset overall relaxation factor.
Further, the user side further includes:
The computing unit is used for computing, according to zero-concentrated differential privacy, the privacy budget and the relaxation factor required for power consumption data disturbance at each of the preset w moments from the preset overall privacy budget and the preset overall relaxation factor.
Further, the computing unit is specifically configured to:
Let the preset overall privacy budget be ε and the preset overall relaxation factor be δ, and let each of the preset w moments have its own privacy budget and relaxation factor required for power consumption data disturbance. Let A_0 denote the power consumption data aggregation mechanism that disturbs data over the preset w moments using w-event-level (ε, δ)-local differential privacy, and let {A_1, A_2, …, A_t, …, A_w} denote the power consumption data mechanisms that disturb data at each of the w moments using local differential privacy with the per-moment privacy budget and relaxation factor;
When mechanism A_0 satisfies zero-concentrated differential privacy, its parameter is determined by ε and δ; when mechanisms {A_1, A_2, …, A_t, …, A_w} all satisfy zero-concentrated differential privacy, the parameter of A_t is determined by the per-moment privacy budget and relaxation factor;
According to the composition theorem of zero-concentrated differential privacy, since mechanism A_0 is the sequential composition of mechanisms {A_1, A_2, …, A_t, …, A_w}, the parameter of A_0 is the sum of the parameters of A_1 to A_w. Equating the two expressions for the parameter of A_0 yields the per-moment privacy budget and the per-moment relaxation factor.
For example, step a: when the number of aggregation moments is w, let A_0 be the power data aggregation mechanism that disturbs data using w-event-level (ε, δ)-local differential privacy, with the required overall privacy parameters set as follows: the overall privacy budget is ε and the overall relaxation factor is δ. Suppose the mechanisms that disturb the data at each moment using local differential privacy are {A_1, A_2, …, A_t, …, A_w}, each using the privacy budget and relaxation factor required for data disturbance at that moment. Given the (ε, δ)-local differential privacy mechanism A_0, the relationship between local differential privacy and zero-concentrated differential privacy gives A_0 a zero-concentrated differential privacy parameter determined by (ε, δ); likewise, the disturbance mechanism A_t at each moment has a zero-concentrated differential privacy parameter determined by the per-moment privacy budget and relaxation factor;
Step b: according to the composition theorem of zero-concentrated differential privacy, if the disturbance mechanism A_t at each moment satisfies zero-concentrated differential privacy with its own parameter and mechanism A_0 is the sequential composition of the mechanisms {A_1, A_2, …, A_t, …, A_w}, then mechanism A_0 satisfies zero-concentrated differential privacy with a parameter equal to the sum of the w per-moment parameters;
Step c: steps a and b express the zero-concentrated differential privacy parameter of mechanism A_0 in two ways; since the two parameter values are identical, equating them gives the calculation formulas for the privacy budget and the relaxation factor of the disturbance at each moment.
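Steps a to c can be sketched in code. The conversion formulas themselves are not reproduced in this text, so the sketch below is only an illustration under stated assumptions: it uses the standard bound that ρ-zero-concentrated differential privacy implies (ρ + 2·sqrt(ρ·ln(1/δ)), δ)-differential privacy, inverts that bound to calibrate the overall ρ from (ε, δ), splits ρ equally over the w moments, and assumes that each moment reuses the overall δ. The function names and the choice of the per-moment δ are hypothetical.

```python
import math

def rho_from_eps_delta(eps, delta):
    """Calibrate rho so that rho-zCDP yields (eps, delta)-DP via the
    standard bound eps = rho + 2*sqrt(rho*ln(1/delta)) (assumed here)."""
    log_term = math.log(1.0 / delta)
    return (math.sqrt(eps + log_term) - math.sqrt(log_term)) ** 2

def eps_from_rho_delta(rho, delta):
    """Privacy budget implied by rho-zCDP at relaxation factor delta."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))

def split_budget(eps, delta, w, delta_t=None):
    """Equal split of the overall zCDP parameter over w moments.
    delta_t defaults to the overall delta (an assumption of this sketch)."""
    if delta_t is None:
        delta_t = delta
    rho_total = rho_from_eps_delta(eps, delta)   # step a
    rho_t = rho_total / w                        # steps b and c: w * rho_t = rho_total
    eps_t = eps_from_rho_delta(rho_t, delta_t)   # back to a per-moment budget
    return eps_t, delta_t, rho_t, rho_total

eps_t, delta_t, rho_t, rho_total = split_budget(eps=1.0, delta=1e-6, w=24)
```

Working in the zero-concentrated parameter makes the split additive, which is the property step b relies on.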
It should be noted that the present invention does not limit the values of the preset overall privacy budget ε, the preset overall relaxation factor δ, or the preset w moments; those skilled in the art may set them according to engineering needs or experimental data.
Further, the client includes:
The determining unit is used for determining the original power consumption data to be formatted by utilizing the original power consumption data of each user at each moment;
The formatting unit is used for formatting the original power consumption data to be formatted to obtain formatted power consumption data;
The disturbance unit is used for disturbing the formatted power consumption data by utilizing privacy budget and relaxation factors required by disturbance of the power consumption data at each moment in preset w moments to obtain disturbed power consumption data;
and the recovery unit is used for recovering the disturbed power consumption data to the original interval to obtain recovered disturbance data which is final disturbance data.
Specifically, the calculation formula of the formatted electricity consumption data includes:
In the above formula, i ∈ [1, n], where n is the total number of users; t ∈ [T-w+1, T], where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; [a, b] is the interval of the original power consumption data of each user at each moment; the two data symbols in the formula denote, respectively, the original power consumption data of the ith user at moment t and the formatted power consumption data of the ith user at moment t.
Further, the determining unit is specifically configured to:
Select, from the original power consumption data of each user at each moment, the power consumption data of the w moments up to the preset deadline of power consumption data aggregation as the power consumption data to be formatted.
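The selection, formatting, and later recovery steps can be illustrated as follows. The formatting formula is not reproduced in this text, so the sketch assumes the increasing affine map that sends [a, b] onto [-1, 1] and its inverse; the function names and the 1-indexed moment convention are hypothetical.

```python
def select_window(series, T, w):
    """Return the readings at moments T-w+1 .. T, where series[t-1]
    holds the reading at moment t (1-indexed moments assumed)."""
    if not 1 <= w <= T <= len(series):
        raise ValueError("need 1 <= w <= T <= len(series)")
    return series[T - w:T]

def format_value(x, a, b):
    """Map x in [a, b] to [-1, 1] with the increasing affine map (assumed)."""
    return (2.0 * x - a - b) / (b - a)

def restore_value(y, a, b):
    """Inverse of format_value: map a value on the [-1, 1] scale back
    to the scale of the original interval [a, b]."""
    return ((b - a) * y + a + b) / 2.0

readings = [0.2, 0.5, 1.0, 1.5, 2.0]
window = select_window(readings, T=5, w=3)
formatted = [format_value(x, 0.0, 2.0) for x in window]
```

Because the disturbance step may output values outside [-1, 1], restored values can fall slightly outside [a, b]; the affine inverse handles that without clipping.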
Further, the perturbation unit includes:
the first calculation module is used for calculating a first disturbance probability and a second disturbance probability by utilizing privacy budget and relaxation factors required by disturbance of power consumption data at each moment in preset w moments;
Specifically, the first disturbance probability calculation formula includes:
A second disturbance probability calculation formula comprising:
In the above formulas, the two parameters are the privacy budget and the relaxation factor required for disturbing the power consumption data, q is the first disturbance probability, and p is the second disturbance probability;
The second calculation module is used for calculating to obtain a first segmentation parameter and a second segmentation parameter by using the first disturbance probability and the second disturbance probability;
The third calculation module is used for calculating to obtain a first end point of the segmentation section and a second end point of the segmentation section by using the first segmentation parameter and the second segmentation parameter;
A first acquisition module, used for uniformly and randomly disturbing the formatted power consumption data to a point in one interval with the second disturbance probability, and to a point in another interval with the first disturbance probability, to obtain the disturbed power consumption data;
Wherein the two intervals are delimited by the first end point and the second end point of the segmentation interval, both evaluated at the formatted power consumption data of the ith user at moment t, and by the interval endpoint C.
Further, the second computing module is specifically configured to:
calculating to obtain a first segmentation parameter by using the first disturbance probability and the second disturbance probability;
Calculating to obtain an interval endpoint by using the first segmentation parameter, the first disturbance probability and the second disturbance probability;
And calculating to obtain a second segmentation parameter by using the interval end point and the first segmentation parameter.
Specifically, the calculation formula of the first segmentation parameter includes:
A formula for calculating an interval endpoint, comprising:
A calculation formula for the second segmentation parameter, comprising: v = u - C
A calculation formula for the first end point of the segmentation interval, comprising: l(x) = ux + v
A calculation formula for the second end point of the segmentation interval, comprising: r(x) = ux - v
In the above formulas, q is the first disturbance probability, p is the second disturbance probability, u is the first segmentation parameter, C is the interval endpoint, v is the second segmentation parameter, l(x) is the first end point of the segmentation interval for input x, and r(x) is the second end point of the segmentation interval for input x.
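For orientation, the segmentation structure above can be compared with the standard Piecewise Mechanism for pure ε-local differential privacy, which is swapped in here because the formulas for p, q, u and C under (ε, δ) are not reproduced in this text. The sketch is therefore not the mechanism of the invention: it assumes C = (e^(ε/2) + 1)/(e^(ε/2) - 1) and u = (C + 1)/2, under which v = u - C, l(x) = ux + v and r(x) = ux - v coincide with the standard end points, and it assumes the input stays in [l(x), r(x)] with the higher probability. All function names are hypothetical.

```python
import math
import random

def pm_params(eps):
    """Parameters of the standard pure-eps Piecewise Mechanism (assumed)."""
    s = math.exp(eps / 2.0)
    C = (s + 1.0) / (s - 1.0)   # output range is [-C, C]
    u = (C + 1.0) / 2.0         # first segmentation parameter
    v = u - C                   # second segmentation parameter
    return C, u, v

def pm_perturb(x, eps, rng):
    """Perturb x in [-1, 1]; the output is an unbiased estimate of x."""
    C, u, v = pm_params(eps)
    l = u * x + v               # first end point of the segmentation interval
    r = u * x - v               # second end point of the segmentation interval
    s = math.exp(eps / 2.0)
    if rng.random() < s / (s + 1.0):
        return rng.uniform(l, r)            # central piece [l, r]
    # Otherwise sample uniformly from [-C, l) united with (r, C].
    left_len = l + C
    right_len = C - r
    z = rng.uniform(0.0, left_len + right_len)
    return -C + z if z < left_len else r + (z - left_len)

rng = random.Random(0)
samples = [pm_perturb(0.3, 1.0, rng) for _ in range(100000)]
mean_estimate = sum(samples) / len(samples)
```

Averaging many perturbed reports recovers the input in expectation, which is the property the aggregation end relies on when it sums the users' reports.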
Further, the data aggregation end includes:
the second acquisition module is used for adding final disturbance data of the previous w times of each user within the preset cut-off time of the power consumption data aggregation to obtain the total power consumption of all the users at each of the preset w times;
The fitting module is used for fitting the total power consumption of all users at each moment in the preset w moments by utilizing a least square method based on a preset fitting initial function to obtain a fitting curve of the total power consumption of the users.
Specifically, the calculation formula of the total power consumption of all users at each moment in the preset w moments includes:
In the above formula, the left-hand side is the total power consumption of all users at the t-th of the preset w moments; i ∈ [1, n], where n is the total number of users; t ∈ [T-w+1, T], where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; the summed term is the final disturbance data of the ith user at moment t.
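The aggregation and fitting steps can be sketched as below. The preset initial fitting function is not reproduced in this text, so the polynomial degree is a hypothetical parameter, and the least-squares fit is done with plain normal equations, which is adequate only for low degrees. All names are illustrative.

```python
def aggregate(restored):
    """restored[i][k] is the final disturbance data of user i at the k-th
    selected moment; returns the per-moment totals over all users."""
    return [sum(column) for column in zip(*restored)]

def polyfit_ls(ts, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ sum(c[k] * t**k)."""
    m = degree + 1
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    A = [[sum(t ** (i + j) for t in ts) for j in range(m)] for i in range(m)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = A[row][col] / A[col][col]
            for k in range(col, m):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs

def evaluate(coeffs, t):
    """Value of the fitted curve at moment t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))

totals = aggregate([[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]])
coeffs = polyfit_ls([0, 1, 2, 3, 4], [1, 6, 17, 34, 57], degree=2)
```

In the usage lines, `totals` sums two hypothetical users over three moments, and `coeffs` fits points generated from 1 + 2t + 3t^2.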
The invention solves the problem of high computational cost in existing privacy-preserving power data aggregation schemes. In addition, it resists background-knowledge attacks and prevents an attacker from obtaining users' original data from a trusted third-party aggregation end, so that users avoid the risk of privacy disclosure: for example, low power consumption may suggest that the user is away from home, and high power consumption may suggest that the user is running a high-power appliance that the user does not want others to know about. The invention applies a privacy budget division method to time-stream data under w-event-level (ε, δ)-local differential privacy, calculates the privacy budget at each moment using the properties of zero-concentrated differential privacy, disturbs the data accordingly, and finally sends only the disturbance data to the data aggregation end, thereby protecting the real data. Even with sufficient background knowledge, what an attacker can infer about any individual user's power consumption record is bounded by the (ε, δ) guarantee.
To further illustrate the above-mentioned power data aggregation system based on w-event-level local differential privacy, the present invention provides a specific example, as shown in fig. 2, and the workflow of the power data aggregation system based on w-event-level local differential privacy includes the following steps:
S1, the user side sets the overall privacy budget ε and the overall relaxation factor δ and calculates the privacy budget and the relaxation factor required for data disturbance at each of the preset w moments; at the same time, the overall privacy budget ε, the overall relaxation factor δ, the preset w moments, and the data aggregation deadline T are published to the users;
S2, each user locally preprocesses its original time-stream data vector. To facilitate disturbance, the data at the w moments up to time T are formatted: each item of the original vector lies within the interval [a, b], and each selected item is mapped into the interval [-1, 1] to give the formatted data;
S3, the user side disturbs the formatted data, restores the disturbed data to the original data interval [a, b], and sends the restored data to the data aggregation end;
S4, the data aggregation end performs statistical analysis on the disturbance data sent by the users, estimates the total power consumption of all users at each of the original w moments, and fits these totals using the least squares method.
Further, step S1 includes:
S11, when the number of aggregation moments is w, let A_0 be the power data aggregation mechanism that disturbs data using w-event-level (ε, δ)-local differential privacy, with the required overall privacy parameters set as follows: the overall privacy budget is ε and the overall relaxation factor is δ. Suppose the mechanisms that disturb the data at each moment using local differential privacy are {A_1, A_2, …, A_t, …, A_w}, each using the privacy budget and relaxation factor required for data disturbance at that moment. Given the (ε, δ)-local differential privacy mechanism A_0, the relationship between local differential privacy and zero-concentrated differential privacy gives A_0 a zero-concentrated differential privacy parameter determined by (ε, δ); likewise, the disturbance mechanism A_t at each moment has a zero-concentrated differential privacy parameter determined by the per-moment privacy budget and relaxation factor;
S12, according to the composition theorem of zero-concentrated differential privacy, if the disturbance mechanism A_t at each moment satisfies zero-concentrated differential privacy with its own parameter and mechanism A_0 is the sequential composition of the mechanisms {A_1, A_2, …, A_t, …, A_w}, then mechanism A_0 satisfies zero-concentrated differential privacy with a parameter equal to the sum of the w per-moment parameters;
S13, steps S11 and S12 express the zero-concentrated differential privacy parameter of mechanism A_0 in two ways; since the two parameter values are identical, equating them gives the calculation formulas for the privacy budget and the relaxation factor of the disturbance at each moment.
Further, step S2 includes:
S21, from the original data, first select the data of the w moments up to time T as the data to be processed;
S22, format the selected data: the original data in the interval [a, b] are formatted so that the formatted data lie within the interval [-1, 1].
Further, step S3 includes:
S31, disturb the formatted data with a segmentation mechanism that uses the privacy budget and the relaxation factor required at that moment; set the first disturbance probability q, the second disturbance probability p, the first segmentation parameter u, the interval endpoint C, the second segmentation parameter v = u - C, the first end point of the segmentation interval for input data x, l(x) = ux + v, and the second end point of the segmentation interval for input data x, r(x) = ux - v; the formatted data are uniformly and randomly disturbed to a point in one interval with probability p and to a point in another interval with probability q, and the result is recorded as the disturbed data;
S32, restore the disturbed data to the original interval: from the disturbed data, calculate the restored disturbance data.
Further, step S4 includes:
S41, aggregate the data at the w moments up to time T: with n users participating in power aggregation, calculate the sum of the disturbance data at each of the w moments;
S42, fit the total power consumption at the w moments: because the total power consumption varies nonlinearly with time, a polynomial function is selected for fitting; based on experimental simulation, and in order to avoid over-fitting, an initial fitting function is selected in which t is the moment and y is the total power consumption corresponding to moment t, and the totals are fitted using the least squares method.
In order to further illustrate the above-mentioned power data aggregation method based on w event level local differential privacy, the present invention further provides a specific example, as shown in fig. 3, in the power data aggregation method based on differential privacy, the specific flow steps of the user side are as follows:
Step one: format the time-stream data of the w moments up to time T. Given the original time-stream data, with x_t denoting the original data at moment t, set t = 0.
Step two: judge whether the original data at moment t are selected data, that is, whether t ∈ [T-w+1, T]; if so, execute step three; if not, execute step four.
Step three: when the original data x_t are selected data, format them to obtain the formatted data, and execute step six.
Step four: when the original data x_t are not selected data, judge whether t < T-w+1 holds; if so, execute step five; if not, execute step nine.
Step five: set t = t+1, and execute step two.
Step six: set the data disturbance parameters as follows: the first disturbance probability q, the second disturbance probability p, the first segmentation parameter u, the interval endpoint C, the second segmentation parameter v = u - C, the first end point of the segmentation interval for input data x, l(x) = ux + v, and the second end point of the segmentation interval for input data x, r(x) = ux - v.
Step seven: uniformly and randomly disturb the formatted data to a point in one interval with probability p and to a point in another interval with probability q, and record the result as the disturbed data.
Step eight: restore the disturbed data to the original interval by calculating the restored disturbance data from the disturbed data, and execute step five.
Step nine: the user sends the restored disturbance data to the data aggregation end.
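The control flow of steps one to nine can be condensed into a single loop. This is a minimal sketch under assumptions: moments are 1-indexed, formatting and recovery use the affine map between [a, b] and [-1, 1] (the formulas are not reproduced in this text), and the disturbance mechanism is passed in as a hypothetical callable `perturb` rather than implemented here.

```python
def client_report(series, T, w, a, b, perturb):
    """series[t-1] is the original reading at moment t. Returns the restored
    disturbance data for the moments T-w+1 .. T, in order."""
    report = []
    for t in range(1, T + 1):
        if t < T - w + 1:
            continue                              # steps four and five: skip, advance t
        x = series[t - 1]
        y = (2.0 * x - a - b) / (b - a)           # step three: format into [-1, 1]
        y_star = perturb(y)                       # steps six and seven: disturb
        restored = ((b - a) * y_star + a + b) / 2.0   # step eight: restore
        report.append(restored)
    return report                                 # step nine: send to the aggregator

# Identity "perturbation" used only to show the data path end to end.
report = client_report([0.2, 0.5, 1.0, 1.5, 2.0], T=5, w=3, a=0.0, b=2.0,
                       perturb=lambda y: y)
```

Passing the mechanism as a parameter keeps the selection, formatting and recovery logic independent of whichever disturbance formulas are used.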
As shown in fig. 4, the specific flow steps of the data aggregation end in the power data aggregation method based on differential privacy are as follows:
Step one: obtain the disturbance data of all n users from the user side.
Step two: calculate the total power consumption at each of the w moments up to the aggregation time T.
Step three: fit the totals at the w moments using the least squares method to obtain the fitting curve P(t) of the users' total power consumption with respect to the moment.
Step four: the data aggregation end obtains the fitting result of the users' aggregated power data for the w moments up to time T.
The derivation above yields the equal privacy budget division of the power data aggregation method based on differential privacy, shows that the disturbed data are an unbiased estimate of the original data, and provides the interval division used for random disturbance under w-event-level (ε, δ)-local differential privacy. The utility of the mechanism is evaluated by the sum of squared errors (SSE), which measures how reliable the mechanism is in use.
The following are experimental results of the present invention. The dataset used in the experiment is CER Electricity, Irish smart-meter electricity usage monitoring data for residents of a certain area, released by The Research Perspective Ltd on 3/12 of 2012. The raw data comprise 6 compressed files, each recording the power consumed every half hour by 1000 or more users, and 24 days of power consumption (days 195 to 218) are counted for each user. Some users have fewer than 24 days of records; these incomplete records are removed, and only each user's total power consumption at each moment over the 24 days is kept, so the processed data have dimension 48 and the number of valid users is 6397. In the experiment, eight values of the privacy budget ε are used (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0 and 5.0), eight values of the relaxation factor δ (10^-9, 10^-8, 10^-7, 10^-6, 10^-5, 10^-4, 10^-3 and 10^-2), and six values of the number of moments w to be aggregated (6, 8, 12, 16, 24 and 48). Table 1 uses δ = 10^-6 and w = 24; Table 2 uses ε = 1.0 and w = 24; Table 3 uses ε = 1.0 and δ = 10^-6. In the experiment, each user locally disturbs its own data and sends the result to a server; the server estimates the sum of the users' power consumption and fits it. The SSE of the fit on real data and the SSE of the fit produced by the proposed method are calculated separately; the closer the two SSE values, the higher the estimation accuracy.
Table 1. Influence of the parameter ε on the SSE
Table 2. Influence of the parameter δ on the SSE
Table 3. Influence of the parameter w on the SSE
As can be seen from Table 1, the accuracy of the proposed method increases with the privacy budget ε and is already very close to the true fitting result at ε = 5.0. In Table 2, δ is set to values greater than 0 and close to 0; the estimation accuracy of the proposed method improves as the relaxation factor δ increases, but less than it does with ε: even at δ = 10^-2, the SSE of the proposed method is still much greater than that of the true fit, indicating that the privacy budget ε has the greater influence on the performance of the overall scheme. Table 3 shows that the SSE of both the true fit and the proposed method grows as w increases, indicating that the more moments are aggregated, the harder it is for the fit to approach the true values, i.e. the larger the error; moreover, the larger w is, the larger the gap between the SSE of the proposed method and that of the true fit, so the proposed method is more accurate at smaller w.
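The SSE comparison described above can be expressed in a few lines. This is a sketch under the assumption that each SSE is taken between the true per-moment totals and the values of a fitted curve at the same moments; the numbers are made up for illustration and are not experimental results.

```python
def sse(observed, fitted):
    """Sum of squared errors between observed totals and fitted values."""
    if len(observed) != len(fitted):
        raise ValueError("sequences must have the same length")
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))

# Illustrative values only: true totals, a curve fitted on the true totals,
# and a curve fitted on the disturbed totals.
true_totals = [10.0, 12.0, 15.0, 13.0]
fit_on_true = [10.5, 12.0, 14.0, 13.5]
fit_on_disturbed = [11.0, 11.0, 16.0, 12.0]

sse_true = sse(true_totals, fit_on_true)
sse_method = sse(true_totals, fit_on_disturbed)
gap = sse_method - sse_true   # a smaller gap means a more accurate estimate
```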
The power data aggregation method based on w-event-level local differential privacy provided by the invention defines a w-event-level (ε, δ)-local differential privacy model for streaming data and divides the privacy parameters used to disturb the several moments preceding a given moment, so that all data receive the same privacy protection strength. The setting comprises one data aggregation end and a plurality of users: each user holds time-stream data covering several moments, disturbs the data at each moment using the segmentation mechanism with the divided privacy parameters, and sends the disturbed data to the data aggregation end. The data aggregation end collects the users' disturbed data, estimates the total power consumption of all users at each moment, and then obtains, from the estimated sums, a fitted function of the users' total power consumption with respect to the moment. Because data protection is carried out at the user end, background-knowledge attacks by an attacker can be resisted while only low computation and communication overhead is required. The invention protects the privacy of stream data, extends the differential privacy mechanism, improves the accuracy of the estimation result while guaranteeing the privacy protection strength of the data, and recovers the trend of the total power consumption over time.
Example two
A method for aggregating power data based on w event level local differential privacy, as shown in fig. 5, includes:
Step 101: the user side utilizes privacy budget and relaxation factors required by disturbance of the power consumption data at each moment in the preset w moments obtained by the pre-calculation to carry out disturbance on the original power consumption data of each user at each moment to obtain final disturbance data, and the final disturbance data is sent to the data aggregation side;
step 102: the data aggregation end determines total power consumption of all users at each moment in preset w moments based on final disturbance data, and fits the total power consumption of all users at each moment in preset w moments to obtain a fitting curve of the total power consumption of the users;
the privacy budget and the relaxation factor required by the power consumption data disturbance at each moment in the preset w moments are determined by using the preset overall privacy budget and the preset overall relaxation factor.
Further, step 101 includes:
Step 1011: determining the original power consumption data to be formatted by utilizing the original power consumption data of each user at each moment;
Step 1012: formatting the original power consumption data to be formatted to obtain formatted power consumption data;
Step 1013: disturbing the formatted power consumption data by using privacy budget and relaxation factors required by power consumption data disturbance at each moment in preset w moments to obtain disturbed power consumption data;
step 1014: and recovering the disturbed electricity consumption data to the original interval to obtain recovered disturbance data, wherein the recovered disturbance data is final disturbance data.
Further, step 1011 includes:
Selecting, from the original power consumption data of each user at each moment, the power consumption data of the w moments up to the preset deadline of power consumption data aggregation as the power consumption data to be formatted.
Further, step 1013 includes:
step 1013a: calculating to obtain a first disturbance probability and a second disturbance probability by using privacy budget and relaxation factors required by disturbance of power consumption data at each moment in preset w moments;
step 1013b: calculating to obtain a first segmentation parameter and a second segmentation parameter by using the first disturbance probability and the second disturbance probability;
Step 1013c: calculating to obtain a first end point of the segmented section and a second end point of the segmented section by using the first segmentation parameter and the second segmentation parameter;
Step 1013d: uniformly and randomly disturbing the formatted power consumption data to a point in one interval with the second disturbance probability, and to a point in another interval with the first disturbance probability, to obtain the disturbed power consumption data;
Wherein the two intervals are delimited by the first end point and the second end point of the segmentation interval, both evaluated at the formatted power consumption data of the ith user at moment t, and by the interval endpoint C.
Further, step 1013b comprises:
calculating to obtain a first segmentation parameter by using the first disturbance probability and the second disturbance probability;
Calculating to obtain an interval endpoint by using the first segmentation parameter, the first disturbance probability and the second disturbance probability;
And calculating to obtain a second segmentation parameter by using the interval end point and the first segmentation parameter.
Further, step 102 includes:
Step 1021: adding final disturbance data of the previous w times of each user within the preset cut-off time of the power consumption data aggregation to obtain total power consumption of all users at each of the preset w times;
Step 1022: and fitting the total power consumption of all users at each moment in the preset w moments by using a least square method based on a preset fitting initial function to obtain a fitting curve of the total power consumption of the users.
Further, the method further comprises:
Step 100: the user side calculates, according to zero-concentrated differential privacy, the privacy budget and the relaxation factor required for disturbing the power consumption data at each of the preset w moments from the preset overall privacy budget and the preset overall relaxation factor.
Further, step 100 includes:
Step 1001: let the preset overall privacy budget be ε and the preset overall relaxation factor be δ; the privacy budget and the relaxation factor required for disturbing the power consumption data at each of the preset w moments are the per-moment parameters to be determined. Let A_0 denote the power consumption data aggregation mechanism that disturbs the data over the preset w moments using w-event-level (ε, δ)-local differential privacy, and let {A_1, A_2, …, A_t, …, A_w} denote the mechanisms that disturb the data at each of the w moments using local differential privacy with the per-moment parameters;
Step 1002: when mechanism A_0 satisfies zero-concentrated differential privacy, its zero-concentrated differential privacy parameter is determined by (ε, δ); when the mechanisms {A_1, A_2, …, A_t, …, A_w} all satisfy zero-concentrated differential privacy, the parameter of each A_t is determined by the per-moment privacy budget and relaxation factor;
Step 1003: according to the composition theorem of zero-concentrated differential privacy, since mechanism A_0 is the sequential composition of the mechanisms {A_1, A_2, …, A_t, …, A_w}, the parameter of A_0 is the sum of the parameters of A_1 through A_w; equating this sum with the parameter obtained from (ε, δ) yields the per-moment privacy budget and the per-moment relaxation factor.
Further, the calculation formula of the formatted electricity consumption data includes:
In the above formula, i ∈ [1, n], where n is the total number of users; t ∈ [T-w+1, T], where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; [a, b] is the interval of the original power consumption data of each user at each moment; the two data symbols in the formula denote, respectively, the original power consumption data of the ith user at moment t and the formatted power consumption data of the ith user at moment t.
Further, the first disturbance probability calculation formula includes:
A second disturbance probability calculation formula comprising:
In the above formulas, the two parameters are the privacy budget and the relaxation factor required for disturbing the power consumption data, q is the first disturbance probability, and p is the second disturbance probability.
Further, the calculation formula of the first segmentation parameter includes:
A formula for calculating an interval endpoint, comprising:
A calculation formula for the second segmentation parameter, comprising: v = u - C
A calculation formula for the first end point of the segmentation interval, comprising: l(x) = ux + v
A calculation formula for the second end point of the segmentation interval, comprising: r(x) = ux - v
In the above formulas, q is the first disturbance probability, p is the second disturbance probability, u is the first segmentation parameter, C is the interval endpoint, v is the second segmentation parameter, l(x) is the first end point of the segmentation interval for input x, and r(x) is the second end point of the segmentation interval for input x.
Further, the calculation formula of the restored disturbance data comprises:
In the above formula, i ∈ [1, n], where n is the total number of users; t ∈ [T-w+1, T], where T is the deadline of power consumption data aggregation and w is the preset number of moments; [a, b] is the interval of the original power consumption data of each user at each moment; the two data symbols in the formula denote, respectively, the restored disturbance data of the ith user at moment t and the disturbed power consumption data of the ith user at moment t.
Further, the calculation formula of the total power consumption of all users at each moment in the preset w moments includes:
In the above formula, the left-hand side is the total power consumption of all users at the t-th of the preset w moments; i ∈ [1, n], where n is the total number of users; t ∈ [T-w+1, T], where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; the summed term is the final disturbance data of the ith user at moment t.
It can be understood that the above-provided method embodiments correspond to the above-described system embodiments, and corresponding specific details may be referred to each other and will not be described herein.
It is to be understood that the same or similar parts in the above embodiments may be referred to each other, and that in some embodiments, the same or similar parts in other embodiments may be referred to.
Example III
Based on the same inventive concept, the invention also provides a computer device comprising a processor and a memory, the memory being used for storing a computer program comprising program instructions, and the processor being used for executing the program instructions stored in the computer storage medium. The processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The processor is the computational core and control core of the terminal; it is adapted to implement one or more instructions, and in particular to load and execute one or more instructions in a computer storage medium so as to implement the corresponding method flow or function, namely the steps of the power data aggregation method based on w-event-level local differential privacy in the above embodiments.
Example IV
Based on the same inventive concept, the present invention also provides a storage medium, in particular, a computer readable storage medium (Memory), which is a Memory device in a computer device, for storing programs and data. It is understood that the computer readable storage medium herein may include both built-in storage media in a computer device and extended storage media supported by the computer device. The computer-readable storage medium provides a storage space storing an operating system of the terminal. Also stored in the memory space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor. The computer readable storage medium herein may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one magnetic disk memory. One or more instructions stored in a computer-readable storage medium may be loaded and executed by a processor to implement the steps of a method of power data aggregation based on w-event level local differential privacy in the above embodiments.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention, which is intended to be covered by the claims.
Claims (16)
1. A power data aggregation system based on w-event-level local differential privacy, comprising: a user side and a data aggregation side;
The user side is used for disturbing the original power consumption data of each user at each moment, using the pre-calculated privacy budget and relaxation factor required for disturbing the power consumption data at each of the preset w moments, to obtain final disturbance data, and for sending the final disturbance data to the data aggregation side;
The data aggregation side is used for determining the total power consumption of all users at each of the preset w moments based on the final disturbance data, and for fitting the total power consumption of all users at each of the preset w moments to obtain a fitted curve of the users' total power consumption;
The privacy budget and the relaxation factor required for disturbing the power consumption data at each of the preset w moments are determined from the preset overall privacy budget and the preset overall relaxation factor.
2. The system of claim 1, wherein the user side comprises:
The determining unit is used for determining the original power consumption data to be formatted by utilizing the original power consumption data of each user at each moment;
The formatting unit is used for formatting the original power consumption data to be formatted to obtain formatted power consumption data;
The disturbance unit is used for disturbing the formatted power consumption data by utilizing privacy budget and relaxation factors required by the disturbance of the power consumption data at each moment in the preset w moments to obtain disturbed power consumption data;
and the recovery unit is used for recovering the disturbed electricity consumption data to an original interval to obtain recovered disturbance data, wherein the recovered disturbance data is the final disturbance data.
3. The system according to claim 2, characterized in that the determining unit is specifically configured to:
Select, from the original power consumption data of each user at each moment, the power consumption data of the w moments up to the preset deadline of power consumption data aggregation as the original power consumption data to be formatted.
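The determining, formatting, and recovery units of claims 2 and 3 can be sketched as follows. This is a minimal illustration, not the patent's own formulas, which are not reproduced in this text: it assumes that "formatting" is a min-max map from the original interval [a, b] onto [-1, 1] and that "recovery" is its inverse. The function names `select_window`, `format_reading`, and `recover_reading` are hypothetical.

```python
from typing import List, Sequence


def select_window(readings: Sequence[float], deadline: int, w: int) -> List[float]:
    """Return the readings for moments t in [T-w+1, T], with T = deadline (1-indexed)."""
    if w <= 0 or deadline < w or deadline > len(readings):
        raise ValueError("need 1 <= w <= deadline <= len(readings)")
    return list(readings[deadline - w:deadline])


def format_reading(x: float, a: float, b: float) -> float:
    """ASSUMPTION: min-max map from the original interval [a, b] onto [-1, 1]."""
    if b <= a:
        raise ValueError("need a < b")
    return (2.0 * x - a - b) / (b - a)


def recover_reading(y: float, a: float, b: float) -> float:
    """Inverse of format_reading: map a value back to the scale of [a, b]."""
    if b <= a:
        raise ValueError("need a < b")
    return (y * (b - a) + a + b) / 2.0


if __name__ == "__main__":
    window = select_window([1.2, 0.8, 2.5, 3.1, 1.9], deadline=5, w=3)
    formatted = [format_reading(x, a=0.0, b=10.0) for x in window]
    print(window, formatted)
```

Because a disturbed value may fall outside [-1, 1], the recovered value may fall outside [a, b]; under this assumption the inverse map is applied without clipping so that sums over users stay comparable to the original scale.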
4. The system of claim 2, wherein the perturbation unit comprises:
The first calculation module is used for calculating a first disturbance probability and a second disturbance probability by utilizing privacy budget and relaxation factors required by disturbance of the power consumption data at each moment in the preset w moments;
the second calculation module is used for calculating a first segmentation parameter and a second segmentation parameter by using the first disturbance probability and the second disturbance probability;
The third calculation module is used for calculating a first end point of the segmentation interval and a second end point of the segmentation interval by using the first segmentation parameter and the second segmentation parameter;
A first obtaining module, configured to uniformly and randomly disturb the formatted power consumption data, with the second disturbance probability, to a point in the interval $[l(x), r(x)]$, and, with the first disturbance probability, to a point in the part of the output range, bounded by the interval endpoint $C$, that lies outside $[l(x), r(x)]$, thereby obtaining the disturbed power consumption data;
Wherein $l(x)$ is the first endpoint of the segmentation interval for input $x$, $r(x)$ is the second endpoint of the segmentation interval for input $x$, $x$ is the formatted power consumption data of the i-th user at moment t, and $C$ is the interval endpoint.
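To make the segment-based disturbance concrete, here is a minimal sketch that substitutes the standard piecewise mechanism for pure ε-local differential privacy. It is not the patent's mechanism: the relaxation factor is not modeled, and the patent's formulas for p, q, u, v, and C are not reproduced in this text. The closed forms below for `c`, `l`, and `r`, and all function names, belong to that substituted mechanism.

```python
import math
import random
from typing import Tuple


def pm_params(eps: float) -> Tuple[float, float]:
    """Return (c, p_center): output bound and mass of the central segment."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    s = math.exp(eps / 2.0)
    return (s + 1.0) / (s - 1.0), s / (s + 1.0)


def segment(x: float, c: float) -> Tuple[float, float]:
    """Endpoints l(x), r(x) of the central segment for an input x in [-1, 1]."""
    left = (c + 1.0) / 2.0 * x - (c - 1.0) / 2.0
    return left, left + c - 1.0


def perturb(x: float, eps: float, rng: random.Random) -> float:
    """Disturb x: uniform on [l, r] with mass p_center, else uniform on the tails."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must be formatted into [-1, 1]")
    c, p_center = pm_params(eps)
    left, right = segment(x, c)
    if rng.random() < p_center:
        return rng.uniform(left, right)
    # The two tails [-c, l) and (r, c] have combined length c + 1.
    left_len = left + c
    right_len = c - right
    u = rng.uniform(0.0, left_len + right_len)
    return -c + u if u < left_len else right + (u - left_len)


if __name__ == "__main__":
    rng = random.Random(0)
    print([round(perturb(0.3, 2.0, rng), 3) for _ in range(5)])
```

In this substituted mechanism the output is an unbiased estimate of the input, which is why simply summing the recovered values across users gives a usable estimate of the total.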
5. The system according to claim 4, wherein the second computing module is specifically configured to:
Calculating to obtain a first segmentation parameter by using the first disturbance probability and the second disturbance probability;
Calculating to obtain an interval endpoint by using the first segmentation parameter, the first disturbance probability and the second disturbance probability;
and calculating to obtain a second segmentation parameter by using the interval end point and the first segmentation parameter.
6. The system of claim 1, wherein the data aggregation side comprises:
The second acquisition module is used for summing, over all users, the final disturbance data of each user at each of the w moments up to the preset deadline of power consumption data aggregation, to obtain the total power consumption of all users at each of the preset w moments;
and the fitting module is used for fitting the total power consumption of all the users at each moment in the preset w moments by utilizing a least square method based on a preset fitting initial function to obtain a fitting curve of the total power consumption of the users.
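The aggregation and fitting steps of claim 6 can be sketched as below. The claim does not state the form of the "preset fitting initial function" in this text, so the sketch assumes a straight line fitted by ordinary least squares; `total_per_moment` and `fit_line` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def total_per_moment(reports: Sequence[Sequence[float]]) -> List[float]:
    """reports[i][k] is the final disturbance data of user i at the k-th moment."""
    if not reports:
        raise ValueError("no user reports")
    w = len(reports[0])
    if any(len(user) != w for user in reports):
        raise ValueError("every user must report exactly w moments")
    return [sum(user[k] for user in reports) for k in range(w)]


def fit_line(ts: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * t + intercept (ASSUMED fitting function)."""
    n = len(ts)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching points")
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    if sxx == 0:
        raise ValueError("all moments are identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


if __name__ == "__main__":
    totals = total_per_moment([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    print(totals, fit_line([1, 2, 3], totals))
```

A higher-order polynomial or a periodic basis would follow the same pattern, with the normal equations solved for more coefficients.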
7. The system of claim 1, wherein the user side further comprises:
The computing unit is used for computing, according to zero-concentrated differential privacy and using the preset overall privacy budget and the preset overall relaxation factor, the privacy budget and the relaxation factor required for disturbing the power consumption data at each of the preset w moments.
8. The system according to claim 7, characterized in that the computing unit is specifically configured to:
Let the preset overall privacy budget be $\epsilon$ and the preset overall relaxation factor be $\delta$, and consider the privacy budget and the relaxation factor required for disturbing the power consumption data at each of the preset w moments; let the power consumption data aggregation mechanism that disturbs data using w-event-level $(\epsilon, \delta)$-local differential privacy over the preset w moments be $A_0$, and let the power consumption data mechanisms that disturb data using local differential privacy with the per-moment privacy budget and relaxation factor at each of the w moments be $\{A_1, A_2, \ldots, A_t, \ldots, A_w\}$;
When mechanism $A_0$ satisfies zero-concentrated differential privacy, $A_0$ has a corresponding zero-concentrated differential privacy parameter; when the mechanisms $\{A_1, A_2, \ldots, A_t, \ldots, A_w\}$ all satisfy zero-concentrated differential privacy, each $A_t$ likewise has a corresponding parameter;
According to the composition theorem of zero-concentrated differential privacy, mechanism $A_0$ is the sequential composition of the mechanisms $\{A_1, A_2, \ldots, A_t, \ldots, A_w\}$, so the parameter of $A_0$ is the sum of the parameters of $A_1$ through $A_w$, from which the privacy budget and the relaxation factor required for disturbing the power consumption data at each moment are obtained.
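One way to read this allocation is sketched below. It is an illustration under stated assumptions, not the patent's formulas, which are not reproduced in this text. It uses the standard bound that a ρ-zCDP mechanism satisfies (ρ + 2·sqrt(ρ·ln(1/δ)), δ)-differential privacy, splits ρ evenly over the w moments (ρ adds under composition), and assumes each moment keeps the same δ. All function names are hypothetical.

```python
import math
from typing import Tuple


def rho_from_eps_delta(eps: float, delta: float) -> float:
    """Solve rho + 2*sqrt(rho*ln(1/delta)) = eps for rho (standard zCDP bound)."""
    if eps <= 0 or not 0.0 < delta < 1.0:
        raise ValueError("need eps > 0 and 0 < delta < 1")
    log_term = math.log(1.0 / delta)
    root = math.sqrt(log_term + eps) - math.sqrt(log_term)
    return root * root


def eps_from_rho(rho: float, delta: float) -> float:
    """Privacy budget implied by a rho-zCDP guarantee at relaxation factor delta."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))


def split_budget(eps: float, delta: float, w: int) -> Tuple[float, float]:
    """Per-moment (eps_t, delta_t) when rho is divided evenly over w moments."""
    if w <= 0:
        raise ValueError("w must be positive")
    rho_t = rho_from_eps_delta(eps, delta) / w
    # ASSUMPTION: every moment reuses the overall delta.
    return eps_from_rho(rho_t, delta), delta


if __name__ == "__main__":
    print(split_budget(eps=1.0, delta=1e-5, w=10))
```

Under these assumptions each moment receives more than ε/w, which is the usual motivation for accounting in zCDP rather than dividing ε directly.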
9. The system of claim 2, wherein the calculation formula for the formatted power consumption data comprises:
In the above formula, $i \in [1, n]$, where n is the total number of users; $t \in [T-w+1, T]$, where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; $[a, b]$ is the interval of the original power consumption data of each user at each moment; the input of the formula is the original power consumption data of the i-th user at moment t, and its result is the formatted power consumption data of the i-th user at moment t.
10. The system of claim 4, wherein the first disturbance probability calculation comprises:
the second disturbance probability calculation formula comprises:
In the above formulas, the two parameters are the privacy budget and the relaxation factor required for disturbing the power consumption data, q is the first disturbance probability, and p is the second disturbance probability.
11. The system of claim 5, wherein the first segmentation parameter calculation comprises:
the formula for calculating the interval endpoint comprises:
the calculation formula of the second segmentation parameter comprises:
a formula for calculating a first endpoint of the segment interval, comprising:
A formula for calculating a second endpoint of the segment interval, comprising:
In the above formulas, q is the first disturbance probability, p is the second disturbance probability, u is the first segmentation parameter, C is the interval endpoint, v is the second segmentation parameter, $l(x)$ is the first endpoint of the segmentation interval for input x, and $r(x)$ is the second endpoint of the segmentation interval for input x.
12. The system of claim 2, wherein the calculated equation for the recovered disturbance data comprises:
In the above formula, $i \in [1, n]$, where n is the total number of users; $t \in [T-w+1, T]$, where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; $[a, b]$ is the interval of the original power consumption data of each user at each moment; the result of the formula is the recovered disturbance data of the i-th user at moment t, and its input is the disturbed power consumption data of the i-th user at moment t.
13. The system of claim 6, wherein the calculation formula of the total power consumption of all users at each of the preset w times includes:
In the above formula, the result is the total power consumption of all users at the t-th moment of the preset w moments; $i \in [1, n]$, where n is the total number of users; $t \in [T-w+1, T]$, where T is the preset deadline of power consumption data aggregation and w is the preset number of moments; the summed quantity is the final disturbance data of the i-th user at moment t.
14. A method of power data aggregation based on w event level local differential privacy, comprising:
The user side uses the privacy budget and relaxation factor required for disturbing the power consumption data at each of the preset w moments to disturb the original power consumption data of each user at each moment, obtains final disturbance data, and sends the final disturbance data to the data aggregation side;
The data aggregation side determines the total power consumption of all users at each of the preset w moments based on the final disturbance data, and fits the total power consumption of all users at each of the preset w moments to obtain a fitted curve of the users' total power consumption;
the privacy budget and the relaxation factor required by the power consumption data disturbance at each moment in the preset w moments are determined by using the preset overall privacy budget and the preset overall relaxation factor.
15. A computer device, comprising: at least one processor and memory;
the memory is used for storing one or more programs;
The power data aggregation method based on w event level local differential privacy of claim 14 is implemented when the one or more programs are executed by the at least one processor.
16. A computer readable storage medium, having stored thereon a computer program which, when executed, implements a w event level local differential privacy based power data aggregation method as claimed in claim 14.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410519315.0A CN118096441B (en) | 2024-04-28 | 2024-04-28 | Power data aggregation system and method based on w event-level local differential privacy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118096441A (en) | 2024-05-28 |
CN118096441B (en) | 2024-09-06 |
Family
ID=91164069
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410519315.0A Active CN118096441B (en) | 2024-04-28 | 2024-04-28 | Power data aggregation system and method based on w event-level local differential privacy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118096441B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109902506A (en) * | 2019-01-08 | 2019-06-18 | 中国科学院软件研究所 | A kind of local difference private data sharing method and system of more privacy budgets |
CN115168423A (en) * | 2022-07-15 | 2022-10-11 | 东南大学 | Smart power grid data aggregation method based on local differential privacy |
CN116471040A (en) * | 2023-03-08 | 2023-07-21 | 南京航空航天大学 | Smart grid data aggregation method based on w event-level local differential privacy |
CN117371019A (en) * | 2022-06-29 | 2024-01-09 | 国网上海能源互联网研究院有限公司 | Privacy protection method and system for intelligent power distribution terminal data |
Also Published As
Publication number | Publication date |
---|---|
CN118096441B (en) | 2024-09-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||