CN117892357B - Energy big data sharing and distribution risk control method based on differential privacy protection


Info

Publication number
CN117892357B
Authority
CN
China
Prior art keywords
power data
data sequence
newly
added power
peak
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410295368.9A
Other languages
Chinese (zh)
Other versions
CN117892357A (en)
Inventor
王圆圆
王世谦
贾一博
李秋燕
田春筝
华远鹏
闫利
宋大为
韩丁
狄立
卜飞飞
王涵
姬哲
牛斌斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Jiuyu Tenglong Information Engineering Co ltd
Economic and Technological Research Institute of State Grid Henan Electric Power Co Ltd
Original Assignee
Henan Jiuyu Tenglong Information Engineering Co ltd
Economic and Technological Research Institute of State Grid Henan Electric Power Co Ltd
Application filed by Henan Jiuyu Tenglong Information Engineering Co ltd and Economic and Technological Research Institute of State Grid Henan Electric Power Co Ltd
Priority to CN202410295368.9A
Publication of CN117892357A
Application granted
Publication of CN117892357B


Abstract

The invention relates to the technical field of data security, and in particular to an energy big data sharing and distribution risk control method based on differential privacy protection. The method comprises the following steps: collecting the various existing power data in an energy data management platform, sorting the power data by time to form the stored power data sequences, and obtaining each newly added power data sequence; setting the category addition weight of each newly added power data sequence; constructing the peak compactness and the data category consistency index of each newly added power data sequence, and calculating the power consumption macroscopic impact index of each newly added power data sequence based on the peak compactness and the data category consistency index; constructing the power consumption microscopic change index of each newly added power data sequence, and from it calculating the subdivision degree index; and constructing a privacy budget correction factor for each newly added power data sequence to obtain an adaptive privacy budget, and processing each newly added power data sequence with a differential privacy query optimization algorithm. The invention can reduce the risk in the process of sharing energy big data and improve the security of data sharing.

Description

Energy big data sharing and distribution risk control method based on differential privacy protection
Technical Field
The application relates to the technical field of data security, in particular to an energy big data sharing and distributing risk control method based on differential privacy protection.
Background
With the rapid development of technologies such as the Internet of Things, big data and artificial intelligence, the importance and value of energy data are increasingly apparent. However, advances in information technology also bring many risks and challenges to data security. In recent years, the digital transformation of energy enterprises has been regarded as a key means of reducing cost and improving efficiency and as an important way of expanding the range of services; application requirements for cross-service and cross-system energy big data have grown, and the need for data sharing and distribution is urgent. In the process of sharing and distributing energy data, the transmitted data can be attacked and stolen, which increases the risk of energy data leakage. Therefore, to ensure data security, the transmitted energy data need to be encrypted.
In the energy big data management platform of each region, the transmitted data may be attacked or stolen when energy data are uploaded, or when the energy big data center shares and distributes energy data, causing the energy data to leak; for this reason, some encryption processing is generally applied to the transmitted energy data. Within energy big data, power data are among the most important data and are the focus of data encryption.
Differential privacy technology is commonly used to protect data, so as to improve the security of data sharing and distribution and reduce the risk of big data sharing and distribution. The privacy budget in a differential privacy algorithm controls the degree of privacy protection the algorithm provides. When a traditional differential privacy algorithm performs big data query protection, the privacy budget cannot be adjusted adaptively to the data leakage risk, so both the protection and the usability of the data are reduced.
Disclosure of Invention
In order to solve the technical problems, the invention provides an energy big data sharing and distribution risk control method based on differential privacy protection so as to solve the existing problems.
The energy big data sharing and distribution risk control method based on differential privacy protection adopts the following technical scheme:
the embodiment of the invention provides an energy big data sharing and distribution risk control method based on differential privacy protection, which comprises the following steps:
acquiring the various existing power data in an energy data management platform, and sorting each kind of existing power data by time to form the stored power data sequence of each kind of power data; for each newly added power data, acquiring a newly added power data sequence in the same way as the stored power data sequences;
setting the category addition weight of each newly added power data sequence according to the change in the data categories of the energy data management platform before and after the newly added power data sequence is added; performing polynomial fitting on each newly added power data sequence to obtain its polynomial curve, and obtaining the peak compactness of each newly added power data sequence according to the peak values of its polynomial curve and the time corresponding to each peak; obtaining the data category consistency index of each newly added power data sequence according to the category addition weight, the peak compactness, and the cosine similarity between the newly added power data sequence and the stored power data sequences; obtaining the power consumption macroscopic impact index of each newly added power data sequence according to the data category consistency index and the differences between the polynomial curve peak values of the newly added power data sequences; constructing the power consumption microscopic change index of each newly added power data sequence according to the variation of the polynomial curve and its peaks and troughs; obtaining the subdivision degree index of each newly added power data sequence according to the power consumption macroscopic impact index and the power consumption microscopic change index;
Constructing privacy budget correction factors of each newly-added power data sequence according to the data category consistency index and the subdivision degree index, and obtaining self-adaptive privacy budgets of each newly-added power data sequence according to preset initial privacy budgets and the privacy budget correction factors; and processing each newly added power data sequence by adopting a differential privacy query optimization algorithm in combination with the self-adaptive privacy budget to finish the safe storage of the power data.
Further, setting the category addition weight of each newly added power data sequence includes:
for each newly added power data, counting the total numbers of data categories in the energy data management platform before and after the newly added power data is added, recorded as $N^{\mathrm{pre}}$ and $N^{\mathrm{post}}$ respectively;
when $N^{\mathrm{pre}} = N^{\mathrm{post}}$, setting the category addition weight of the newly added power data sequence corresponding to the newly added power data to 1;
when $N^{\mathrm{pre}} \neq N^{\mathrm{post}}$, setting the category addition weight of the newly added power data sequence corresponding to the newly added power data to 2.
Further, the peak compactness of each newly added power data sequence includes:
for each newly added power data sequence, calculating the maximum value and the minimum value of the peaks in the polynomial curve of the newly added power data sequence, calculating the difference between the maximum value and the minimum value, calculating the product of the total number of peak points of the newly added power data sequence and the difference, and obtaining the reciprocal of the product;
counting the time corresponding to each peak in the polynomial curve of the newly added power data sequence, calculating the absolute value of the difference between the times of adjacent peaks, and taking the negative of the sum of these absolute values over all adjacent peaks in the polynomial curve as the exponent of an exponential function with the natural constant as base;
and taking the product of the reciprocal and the result of the exponential function as the peak compactness of the newly added power data sequence.
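The peak compactness described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the polynomial degree, the assumption of equally spaced samples, the detection of peaks on the sampled curve, and the small `guard` constant (added so that a single peak or equal peaks do not divide by zero) are all assumptions that do not appear in the text.

```python
import numpy as np

def fit_polynomial(values, degree=5):
    """Least-squares polynomial fit, sampled back on the original time grid.

    The degree is an assumption; the text only says 'polynomial fitting'.
    """
    values = np.asarray(values, dtype=float)
    times = np.arange(len(values), dtype=float)  # assumed: equally spaced samples
    coeffs = np.polyfit(times, values, degree)
    return times, np.polyval(coeffs, times)

def local_maxima(curve):
    """Indices of interior samples strictly higher than both neighbours."""
    c = np.asarray(curve, dtype=float)
    return np.where((c[1:-1] > c[:-2]) & (c[1:-1] > c[2:]))[0] + 1

def peak_compactness(times, curve, guard=1e-9):
    """Reciprocal of (peak count x peak range) times exp(-sum of peak time gaps)."""
    times = np.asarray(times, dtype=float)
    curve = np.asarray(curve, dtype=float)
    peaks = local_maxima(curve)
    h = len(peaks)
    if h == 0:
        return 0.0  # assumed convention when the curve has no peak
    spread = curve[peaks].max() - curve[peaks].min()
    gaps = np.abs(np.diff(times[peaks])).sum()
    # guard is a hypothetical safeguard, not part of the described formula
    return float(np.exp(-gaps) / (h * spread + guard))
```

Because the exponential decays quickly, the time values presumably need to be on a small scale (for example normalized to the unit interval) for the result to stay numerically meaningful; the text does not specify the time unit.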
Further, the data category consistency index of each newly added power data sequence includes:
for each stored power data sequence, obtaining its peak compactness with the same calculation method as for the newly added power data sequences;
for each newly added power data sequence, taking the negative of the category addition weight of the newly added power data sequence as the exponent of an exponential function with the natural constant as base;
calculating the absolute value of the difference between the peak compactness of the newly added power data sequence and that of each stored power data sequence, and adding a preset value to this absolute value to obtain a sum; calculating the cosine similarity between the newly added power data sequence and each stored power data sequence, and the ratio of the cosine similarity to the sum; adding up the ratios obtained for the newly added power data sequence over all stored power data sequences, and recording the result as the first sum value;
and taking the product of the result of the exponential function and the first sum value as the data category consistency index of the newly added power data sequence.
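A minimal Python sketch of the data category consistency index follows, under stated assumptions: the peak compactness values are passed in precomputed, the sequences compared by cosine similarity are assumed to have equal length, and the preset value `mu` defaults to 0.01 purely for illustration, since its value is not given in the text. Function and parameter names are hypothetical.

```python
import numpy as np

def category_weight(n_before, n_after):
    """Weight 1 if the category count is unchanged, 2 if a category was added."""
    return 1 if n_before == n_after else 2

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def consistency_index(new_seq, new_compactness, weight,
                      stored_seqs, stored_compactness, mu=0.01):
    """exp(-weight) times the sum of similarity / (compactness gap + mu)."""
    first_sum = sum(
        cosine_similarity(new_seq, seq) / (abs(new_compactness - f) + mu)
        for seq, f in zip(stored_seqs, stored_compactness)
    )
    return float(np.exp(-weight) * first_sum)
```

For a fixed set of stored sequences, a newly added sequence that introduces a new category (weight 2) receives a consistency index smaller by a factor of e than the same sequence without a category change (weight 1).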
Further, the power consumption macroscopic impact index of each newly added power data sequence includes:
calculating the absolute value of the difference between the peak mean value of the polynomial curve of each newly added power data sequence and the peak mean value of the polynomial curve of every other newly added power data sequence, and obtaining the sum of these absolute values over all the other newly added power data sequences;
and taking the product of the data category consistency index of the newly added power data sequence and the sum as the power consumption macroscopic impact index of the newly added power data sequence.
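The macroscopic impact index can be sketched as below. The helper that averages the peaks of a sampled curve and the convention of returning 0.0 for a curve without peaks are assumptions made for illustration; the names are hypothetical.

```python
def peak_mean(curve):
    """Mean of the strict local maxima of a sampled curve (0.0 if none)."""
    peaks = [curve[i] for i in range(1, len(curve) - 1)
             if curve[i - 1] < curve[i] > curve[i + 1]]
    return sum(peaks) / len(peaks) if peaks else 0.0

def macro_impact_index(consistency, peak_means, g):
    """Consistency index of sequence g times the summed peak-mean gaps to all others."""
    gap_sum = sum(abs(peak_means[g] - m)
                  for k, m in enumerate(peak_means) if k != g)
    return consistency * gap_sum
```

With peak means 3.0, 1.0 and 6.0 and a consistency index of 0.5 for the first sequence, the summed gap is 2 + 3 = 5, so the index for that sequence is 2.5.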
Further, the constructing the microscopic change index of the power consumption of each newly added power data sequence according to the change condition of the polynomial curve and each wave crest and each wave trough comprises the following steps:
for each newly added power data sequence, constructing a peak-valley pair of a polynomial curve of each newly added power data sequence, extracting each round trip point of the polynomial curve, and counting the number of the round trip points;
For each peak-valley pair, marking the difference value between the peak value and the trough value of the peak-valley pair as a first difference value, marking the difference value between the moment of the peak point and the moment of the trough point of each peak-valley pair as a second difference value, and obtaining the ratio absolute value of the first difference value and the second difference value; calculating the sum of absolute values of the ratios of all peak-valley pairs;
and taking the product of the number and the sum value as an electric consumption microscopic change index of the newly added electric power data sequence.
Further, the constructing the peak-valley pair of the polynomial curve of each newly added power data sequence, extracting each round trip point of the polynomial curve, includes:
for the polynomial curve of each newly added power data sequence, counting all peak points and trough points on the curve and, starting from the first trough point, combining each trough point with the peak point that immediately follows it to form a peak-valley pair;
and acquiring an ordinate corresponding to one half of the data value of the minimum peak point of the polynomial curve, taking a horizontal line passing through the ordinate and parallel to the abscissa axis as a round-trip judgment line, and taking each point on the round-trip judgment line as each round-trip point of the polynomial curve.
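The peak-valley pairing, the round-trip points and the resulting microscopic change index can be sketched as follows. This is an illustration under assumptions: extrema are detected on the sampled curve, and round-trip points are counted as sign changes of the curve relative to the judgment line at half the smallest peak value, which approximates "points on the line" for sampled data. All function names are hypothetical.

```python
import numpy as np

def local_extrema(curve):
    """Indices of strict interior peaks and troughs of a sampled curve."""
    c = np.asarray(curve, dtype=float)
    inner = np.arange(1, len(c) - 1)
    peaks = inner[(c[1:-1] > c[:-2]) & (c[1:-1] > c[2:])]
    troughs = inner[(c[1:-1] < c[:-2]) & (c[1:-1] < c[2:])]
    return peaks, troughs

def peak_valley_pairs(peaks, troughs):
    """Pair each trough with the first peak that follows it, as (peak, trough)."""
    pairs = []
    for v in troughs:
        later = peaks[peaks > v]
        if later.size:
            pairs.append((int(later[0]), int(v)))
    return pairs

def count_round_trip_points(curve, peaks):
    """Count crossings of the horizontal line at half the smallest peak value."""
    c = np.asarray(curve, dtype=float)
    level = 0.5 * c[peaks].min()
    side = np.sign(c - level)
    side = side[side != 0]  # samples exactly on the line do not flip the side
    return int(np.count_nonzero(side[1:] != side[:-1]))

def micro_change_index(times, curve):
    """Round-trip count times the summed |value gap / time gap| of all pairs."""
    t = np.asarray(times, dtype=float)
    c = np.asarray(curve, dtype=float)
    peaks, troughs = local_extrema(c)
    if peaks.size == 0:
        return 0.0  # assumed convention when the curve has no peak
    slopes = sum(abs((c[p] - c[v]) / (t[p] - t[v]))
                 for p, v in peak_valley_pairs(peaks, troughs))
    return count_round_trip_points(c, peaks) * float(slopes)
```

For the sampled curve 0, 4, 1, 6, 0 at times 0 to 4, the only trough (value 1) pairs with the following peak (value 6), giving a slope magnitude of 5; the judgment line sits at 2 and is crossed four times, so the index is 20.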
Further, the obtaining the subdivision degree index of each newly added power data sequence according to the power consumption macroscopic influence index and the power consumption microscopic change index includes:
obtaining the sum of a macroscopic impact index of the power consumption of the newly-added power data sequence and a preset value, wherein the preset value is larger than zero;
and taking the ratio of the microscopic change index of the power consumption of the newly-added power data sequence to the sum value as the subdivision degree index of the newly-added power data sequence.
Further, the constructing the privacy budget correction factor of each newly added power data sequence according to the data category consistency index and the subdivision degree index includes:
and acquiring the sum value of the subdivision degree index and the preset value of the newly-added power data sequence, and taking the ratio of the data category consistency index of the newly-added power data sequence to the sum value as the privacy budget correction factor of the newly-added power data sequence, wherein the preset value is larger than zero.
Further, the adaptive privacy budget is a product of a preset initial privacy budget and a privacy budget correction factor.
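The last three quantities are simple ratios and a product, sketched here in Python. The preset values default to 0.01 only for illustration, since the text requires them to be greater than zero without giving a value in this passage; the function names are hypothetical.

```python
def subdivision_index(micro, macro, mu=0.01):
    """Microscopic change index divided by (macroscopic impact index + mu)."""
    if mu <= 0:
        raise ValueError("the preset value must be greater than zero")
    return micro / (macro + mu)

def correction_factor(consistency, subdivision, mu=0.01):
    """Data category consistency index divided by (subdivision index + mu)."""
    if mu <= 0:
        raise ValueError("the preset value must be greater than zero")
    return consistency / (subdivision + mu)

def adaptive_budget(initial_budget, factor):
    """Preset initial privacy budget scaled by the correction factor."""
    return initial_budget * factor
```

A sequence with low category consistency and a high subdivision degree receives a small correction factor and therefore a small adaptive budget, which corresponds to stronger protection.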
The invention has at least the following beneficial effects:
The invention performs sharing risk control on the power data within energy big data. By analyzing the association characteristics between newly added power data and existing power data, and the macroscopic and microscopic influence of newly added power data on the remaining data, it constructs a data category consistency index and a subdivision degree index, calculates a privacy budget correction factor, and uses it to improve the privacy budget value in the differential privacy query optimization algorithm. The privacy budget can thus be adjusted adaptively when the algorithm faces data with different privacy leakage risks, which solves the problem that, when the differential privacy query optimization algorithm performs power data query protection, the privacy budget cannot be adjusted to the data leakage risk and both the protection and the usability of the data are reduced.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a step flow chart of an energy big data sharing distribution risk control method based on differential privacy protection provided by the invention;
Fig. 2 is a schematic diagram of a polynomial curve of a new power data sequence.
Detailed Description
In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following detailed description is given to the energy big data sharing and distributing risk control method based on differential privacy protection according to the invention by combining the accompanying drawings and the preferred embodiment. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The specific scheme of the energy big data sharing and distribution risk control method based on differential privacy protection provided by the invention is specifically described below with reference to the accompanying drawings.
The embodiment of the invention provides an energy big data sharing and distribution risk control method based on differential privacy protection. Referring to Fig. 1, the method comprises the following steps:
Step S001, collecting existing power data and newly added power data in the energy data management platform.
In an energy data management platform, the data values of the various existing power data at different times are collected, where the power data include but are not limited to: electricity usage, power generation, maximum load, maximum voltage, etc. The data values of each kind of power data are sorted by time to construct a stored power data sequence, recorded as $C_d$, where $C_d$ denotes the d-th stored power data sequence collected and each kind of power data corresponds to one stored power data sequence. Further, the newly added shared and distributed power data received by the energy data management platform are collected, and the data values of each kind of newly added power data are sorted by time to construct a newly added power data sequence, recorded as $X_g$, where $X_g$ denotes the g-th newly added power data sequence collected and each kind of newly added power data corresponds to one newly added power data sequence.
Thus, each stored power data sequence and each newly added power data sequence can be acquired.
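Step S001 amounts to grouping readings by kind of power data and sorting each group by time. A minimal sketch follows, assuming each reading arrives as a hypothetical `(category, timestamp, value)` tuple; the record layout and names are illustrative and not taken from the text.

```python
from collections import defaultdict

def build_sequences(records):
    """Group (category, timestamp, value) readings and sort each group by time.

    Returns a dict mapping each category to its time-ordered list of values,
    i.e. one power data sequence per kind of power data.
    """
    grouped = defaultdict(list)
    for category, timestamp, value in records:
        grouped[category].append((timestamp, value))
    return {
        category: [value for _, value in sorted(rows, key=lambda row: row[0])]
        for category, rows in grouped.items()
    }

# Hypothetical readings, deliberately out of time order
readings = [
    ("maximum_load", 2, 30.0),
    ("electricity_usage", 1, 5.0),
    ("maximum_load", 1, 10.0),
    ("electricity_usage", 2, 7.0),
]
stored_sequences = build_sequences(readings)
```

The same function can be applied separately to the existing readings and to the newly received readings to obtain the stored and the newly added power data sequences.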
Step S002, constructing a data category consistency index according to the association influence characteristics of the newly added power data and the existing power data; constructing subdivision degree indexes of the newly-added power data according to macroscopic and microscopic influences of the newly-added power data on the other data; and calculating a privacy budget correction factor to obtain the self-adaptive privacy budget.
Because part of the information in an energy big data center is public, both the public and other energy big data management platforms can query the relevant data to obtain the content they want to analyze; this, however, also gives lawbreakers an opening. In addition to attacking and stealing the transmitted encrypted data, lawbreakers may obtain indirect information about the stored energy data and from it infer non-public information.
For example, suppose that in an energy big data management platform the industrial production electricity consumption project of an enterprise discloses 10 data categories, 5 of which belong to the ore equipment electricity consumption project; lawbreakers can obtain these figures through the query function of the platform. When the energy big data center shares and distributes data and the platform receives a newly added ore equipment electricity consumption item of the enterprise, the disclosed data categories of the industrial production electricity consumption project are updated to 11, of which 6 belong to the ore equipment electricity consumption project. By querying again, lawbreakers can learn that the new addition is ore equipment, and infer that the development direction of the enterprise may be related to ore equipment. This is private information that can be obtained through a differential attack.
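The differential attack in this example can be illustrated with a few lines of Python. The category names and the prefix-based query function are hypothetical; the point is only that comparing two exact query answers reveals what was added.

```python
def query_category_count(disclosed, prefix):
    """Public query: number of disclosed data categories under a project prefix."""
    return sum(1 for name in disclosed if name.startswith(prefix))

def differencing_attack(count_before, count_after):
    """An attacker compares two exact answers; a nonzero gap reveals an addition."""
    return count_after - count_before

# Hypothetical catalogue: 10 industrial categories, 5 of them ore equipment
before = [f"industrial/ore_equipment/{i}" for i in range(5)] \
       + [f"industrial/other/{i}" for i in range(5)]
after = before + ["industrial/ore_equipment/5"]  # one ore equipment item is added

total_gap = differencing_attack(
    query_category_count(before, "industrial/"),
    query_category_count(after, "industrial/"),
)
ore_gap = differencing_attack(
    query_category_count(before, "industrial/ore_equipment/"),
    query_category_count(after, "industrial/ore_equipment/"),
)
```

Both gaps equal 1, so the attacker learns that exactly one category was added and that it belongs to ore equipment, without ever seeing the underlying data.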
In general, to prevent differential attacks, many data platforms use a differential privacy protection method such as a differential privacy query optimization algorithm, which adds a certain amount of noise to query results so that the difference between query results is reduced. In the above example, the query results before and after the new data is added may both be 6 or both be 7, so that an attacker cannot distinguish which data set a given sample is in. A differential privacy protection method generally presets a privacy budget $\varepsilon$. The smaller the privacy budget, the more noise is added to the result and the better the differential privacy protection, but the added noise reduces the usability of the data; conversely, the larger the privacy budget, the less noise is added, the weaker the differential privacy protection, and the higher the usability of the data. Therefore, choosing a proper privacy budget value strengthens the protection of differential privacy while the usability of the data is ensured. However, since the privacy budget is usually preset, its suitability is difficult to guarantee, which increases the resources consumed in protecting privacy.
For a differential attack, the more the newly added data changes the original database, the easier it is to obtain information through the attack. For example, if a query originally returns only 5 categories and returns 6 categories after new data is added, the newly added data must belong to a new category. Therefore, the size of the privacy budget needs to be adjusted adaptively according to the degree of association between the newly added data and the rest of the data in the database and the degree of change it causes in the database.
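The trade-off between privacy budget and noise can be made concrete with the standard Laplace mechanism, in which the noise scale equals the query sensitivity divided by the privacy budget, so a smaller budget gives more noise. The text does not spell out its differential privacy query optimization algorithm; the sketch below swaps in the Laplace mechanism on a counting query (sensitivity 1) purely as an illustration, and the function names are hypothetical.

```python
import numpy as np

def laplace_scale(sensitivity, epsilon):
    """Noise scale of the Laplace mechanism: sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("the privacy budget must be positive")
    return sensitivity / epsilon

def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Answer a counting query with Laplace noise calibrated to epsilon."""
    return true_count + rng.laplace(loc=0.0, scale=laplace_scale(sensitivity, epsilon))

def mean_abs_error(true_count, epsilon, rng, trials=5000):
    """Average distance between noisy and exact answers over repeated queries."""
    errors = [abs(noisy_count(true_count, epsilon, rng) - true_count)
              for _ in range(trials)]
    return float(np.mean(errors))

rng = np.random.default_rng(0)
for eps in (0.1, 1.0, 5.0):
    print(f"epsilon={eps}: mean absolute error {mean_abs_error(6, eps, rng):.2f}")
```

Running the loop shows the error shrinking as the budget grows, which is why a sequence that is easier to single out should be given a smaller budget.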
Specifically, each newly added power data sequence is used as the input of a polynomial fitting algorithm, and the output is the polynomial curve of that newly added power data sequence, as shown in Fig. 2; all peak points in the polynomial curve are obtained. The total numbers of data categories in the energy data management platform before and after each newly added power data sequence is added are counted and recorded as $N_g^{\mathrm{pre}}$ and $N_g^{\mathrm{post}}$ respectively. The polynomial fitting algorithm is a well-known technique and is not described in detail in this embodiment. The data category consistency index ($I$) of each newly added power data sequence is constructed as follows:
$$w_g = \begin{cases} 1, & N_g^{\mathrm{pre}} = N_g^{\mathrm{post}} \\ 2, & N_g^{\mathrm{pre}} \neq N_g^{\mathrm{post}} \end{cases}$$
$$F_g = \frac{1}{h \times \left(P_g^{\max} - P_g^{\min}\right)} \times \exp\left(-\sum_{b=1}^{h-1} \left|t_{g,b+1} - t_{g,b}\right|\right)$$
$$I_g = \exp(-w_g) \times \sum_{d=1}^{J} \frac{\cos(X_g, C_d)}{\left|F_g - F'_d\right| + \mu}$$
In the formulas, $w_g$ is the category addition weight of the g-th newly added power data sequence, and $N_g^{\mathrm{pre}}$ and $N_g^{\mathrm{post}}$ are the total numbers of data categories of the energy data management platform before and after the g-th newly added power data sequence is added, respectively.
$F_g$ is the peak compactness of the g-th newly added power data sequence, $P_g^{\max}$ and $P_g^{\min}$ are the maximum and minimum peak values in the polynomial curve of the g-th newly added power data sequence, respectively, $t_{g,b+1}$ and $t_{g,b}$ are the time values of the (b+1)-th peak and the b-th peak in the polynomial curve of the g-th newly added power data sequence, respectively, and h is the total number of peak points of the g-th newly added power data sequence.
$I_g$ is the data category consistency index of the g-th newly added power data sequence, J is the total number of stored power data sequences, $F'_d$ is the peak compactness of the d-th stored power data sequence, $C_d$ is the d-th stored power data sequence, $X_g$ is the g-th newly added power data sequence, $\cos(\cdot)$ is the cosine similarity function, $\cos(X_g, C_d)$ is the cosine similarity of the g-th newly added power data sequence and the d-th stored power data sequence, and $\mu$ is a parameter that prevents the denominator from being zero. The cosine similarity algorithm is a known technique and is not described in detail in this embodiment. The sum $\sum_{d=1}^{J} \frac{\cos(X_g, C_d)}{|F_g - F'_d| + \mu}$ is recorded as the first sum value.
When adding the g-th newly added power data sequence to the energy data management platform causes a new category of energy data to appear, $w_g = 2$; when no new category is added, $w_g = 1$. This reflects that differential information is obtained more easily when the newly added power data changes the number of categories. When the peak range of the g-th newly added power data sequence is smaller, the value of $\frac{1}{h \times (P_g^{\max} - P_g^{\min})}$ is larger; when the intervals between the peaks are shorter, the value of $\exp\left(-\sum_{b=1}^{h-1} |t_{g,b+1} - t_{g,b}|\right)$ is larger; the value of $F_g$ is then larger, indicating that the peak distribution of the g-th newly added power data sequence is tighter and its characteristics are more prominent. When the peak compactness of the g-th newly added power data sequence is closer to that of the stored power data sequences and the cosine similarity between the sequences is larger, the first sum value is larger; when, in addition, the category addition weight of the g-th newly added power data sequence is smaller, $I_g$ is larger, indicating that the difference between the g-th newly added power data sequence and the existing data in the energy data management platform is smaller, the data categories are more consistent, and the newly added data is less vulnerable to a differential privacy attack.
So far, the data category consistency index of each newly added power data sequence is obtained.
In power data, some data are macroscopic, such as the overall power consumption of a factory of an enterprise, which is made up of many subdivided power consumptions. The overall consumption of the factory is affected by the consumption of each device in it: when the consumption of a device rises at some moment, the overall consumption of the factory also rises at that moment, to a degree that depends on the consumption of the device, so the two power data are strongly associated. However, the higher the subdivision degree of power data, the more specific the electric equipment it identifies and the faster the data change; such data influence the overall database more easily and add new extra content, so they are more easily obtained by a differential attack and their privacy protection needs to be strengthened further.
Specifically, each newly added power data sequence is used as the input of a polynomial curve fitting algorithm, and the output is the polynomial curve of that newly added power data sequence. All peak points and trough points in the polynomial curve are obtained and, starting from the first trough point, each trough point is combined with the peak point that immediately follows it into a peak-valley pair, recorded as $(p_i, v_i)$, where $(p_i, v_i)$ denotes the i-th peak-valley pair, $p_i$ is the peak value of the i-th peak-valley pair, and $v_i$ is the trough value of the i-th peak-valley pair. The minimum peak point of the polynomial curve of the newly added power data sequence is obtained together with its ordinate, a horizontal line passing through the ordinate and parallel to the abscissa axis is taken as the round-trip judgment line, and each point of the curve on the round-trip judgment line is taken as a round-trip point. The number of times the polynomial curve of each newly added power data sequence passes through the round-trip judgment line, namely the number of round-trip points, is counted; as shown in Fig. 2, the round-trip judgment line is recorded as u.
The subdivision degree index ($S$) of each newly added power data sequence is constructed as follows:
$$M_g = I_g \times \sum_{k=1,\,k \neq g}^{G} \left|\bar{P}_g - \bar{P}_k\right|$$
$$V_g = m_g \times \sum_{i=1}^{n_g} \left|\frac{p_{g,i} - v_{g,i}}{t^{p}_{g,i} - t^{v}_{g,i}}\right|$$
$$S_g = \frac{V_g}{M_g + \mu}$$
In the formulas, $M_g$ is the power consumption macroscopic impact index of the g-th newly added power data sequence, $I_g$ is the data category consistency index of the g-th newly added power data sequence, $\bar{P}_g$ and $\bar{P}_k$ are the peak mean values of the polynomial curves of the g-th and k-th newly added power data sequences, respectively, and G is the total number of newly added power data sequences.
$V_g$ is the power consumption microscopic change index of the g-th newly added power data sequence, $m_g$ is the number of round-trip points of the polynomial curve of the g-th newly added power data sequence, $n_g$ is the total number of peak-valley pairs of the g-th newly added power data sequence, $p_{g,i}$ is the peak value of the i-th peak-valley pair in the g-th newly added power data sequence, $v_{g,i}$ is the trough value of the i-th peak-valley pair in the g-th newly added power data sequence, and $t^{p}_{g,i}$ and $t^{v}_{g,i}$ are the time values of the peak point and the trough point of the i-th peak-valley pair in the g-th newly added power data sequence, respectively.
$S_g$ is the subdivision degree index of the g-th newly added power data sequence, and $\mu$ is a preset value greater than zero that prevents the denominator from being zero.
When the data category consistency index of the g-th newly added power data sequence is larger and the difference between the peak mean value of its polynomial curve and those of the other sequences is larger, the value of $M_g$ is larger, indicating that the newly added power data is more strongly associated with the other newly added power data and its macroscopic influence is high. When the curve of the g-th newly added power data sequence passes back and forth through the round-trip judgment line more times and each change in the sequence is more rapid, the value of $V_g$ is larger, indicating that the sequence changes more frequently and more readily. When the power consumption macroscopic impact index of the g-th newly added power data sequence is smaller and its power consumption microscopic change index is larger, $S_g$ is larger, indicating that the subdivision degree of the g-th newly added power data sequence is higher and that, after being added to the energy data management platform, it is more likely to be obtained in a differential attack.
At this point, the subdivision degree index of each newly added power data sequence has been obtained. Combining it with the data category consistency index $I$ of each newly added power data sequence, the privacy budget correction factor of the differential privacy query optimization algorithm is constructed:
$$\gamma_g = \frac{I_g}{X_g + \mu}$$
In the formula, $\gamma_g$ is the privacy budget correction factor of the g-th newly added power data sequence, $X_g$ is the subdivision degree index of the g-th newly added power data sequence, $I_g$ is the data category consistency index of the g-th newly added power data sequence, and $\mu$ is a preset value greater than zero that prevents the denominator from being zero (the value used in this embodiment is not legible in this text).
When the subdivision degree of the g-th newly added power data sequence is high and its data category is less consistent with the data categories of the other power data sequences, the data of the g-th newly added power data sequence is rarer. Adding it is more likely to cause a large change in the energy data management platform, and the data is more likely to be obtained under a differential attack, so the privacy budget is reduced further to strengthen protection.
The adaptive privacy budget of the g-th newly added power data sequence is calculated as:
$$\varepsilon_g = \varepsilon_0 \cdot \gamma_g$$
In the formula, $\varepsilon_g$ is the adaptive privacy budget of the g-th newly added power data sequence, $\varepsilon_0$ is the preset initial privacy budget, whose value in this embodiment is 5, and $\gamma_g$ is the privacy budget correction factor of the g-th newly added power data sequence.
The smaller the privacy budget correction factor of the g-th newly added power data sequence, the higher the degree of protection the sequence requires; the privacy budget is reduced, so the adaptive privacy budget is smaller. The larger the privacy budget correction factor, the lower the degree of protection the sequence requires; the privacy budget is increased to improve the availability of the data, so the adaptive privacy budget is larger.
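As a worked illustration of the two steps above, the following Python sketch computes the correction factor as the consistency index divided by the sum of the subdivision index and a preset value, then scales the initial privacy budget of 5 by that factor. The function names, the sample index values, and the value of `mu` are assumptions made for this example only.

```python
def privacy_budget_correction(consistency: float, subdivision: float, mu: float = 1e-3) -> float:
    """Consistency index divided by (subdivision index + mu); mu > 0 avoids division by zero."""
    return consistency / (subdivision + mu)

def adaptive_privacy_budget(correction: float, initial_budget: float = 5.0) -> float:
    """Initial privacy budget (5 in the embodiment) multiplied by the correction factor."""
    return initial_budget * correction

# Hypothetical index values: a rare, finely subdivided sequence versus a common one.
rare = privacy_budget_correction(consistency=0.5, subdivision=4.0, mu=1.0)
common = privacy_budget_correction(consistency=0.9, subdivision=0.5, mu=1.0)
print(adaptive_privacy_budget(rare), adaptive_privacy_budget(common))
```

The rare sequence receives the smaller budget, which corresponds to stronger protection, while the common sequence keeps a larger budget and therefore more data availability.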
At this point, the privacy budget required by each newly added power data sequence has been calculated adaptively.
Each time the energy data management platform receives newly added power data, it performs query optimization on each newly added power data sequence using the differential privacy query optimization algorithm, with the privacy budget adjusted adaptively according to the characteristics of the newly added power data. This strengthens data protection while preserving data availability, and improves the data quality and security of the energy data management platform. The differential privacy query optimization algorithm is a known technique and is not described in detail in this embodiment.
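The text does not specify the differential privacy query optimization algorithm, so the following Python sketch substitutes the standard Laplace mechanism purely to show how a per-sequence budget changes the noise added to a query answer. The function names, the sensitivity of 1.0, the query value, and the two budgets are assumptions for illustration and do not describe the patented algorithm.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample: difference of two exponential draws with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_query(true_answer: float, budget: float, sensitivity: float,
                rng: random.Random) -> float:
    """Laplace mechanism: noise scale is sensitivity / budget."""
    if budget <= 0:
        raise ValueError("privacy budget must be positive")
    return true_answer + laplace_noise(sensitivity / budget, rng)

rng = random.Random(0)  # seeded only to make the example repeatable
# Hypothetical adaptive budgets for a rare sequence (0.5) and a common one (3.0).
protected = noisy_query(1200.0, budget=0.5, sensitivity=1.0, rng=rng)
relaxed = noisy_query(1200.0, budget=3.0, sensitivity=1.0, rng=rng)
print(protected, relaxed)
```

Under this mechanism a smaller budget gives a larger noise scale, so answers about the more exposed sequence are perturbed more heavily.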
Step S003: the power data is processed using the differential privacy query optimization algorithm in combination with the adaptive privacy budget, reducing the risk of sharing and distributing energy big data and improving data sharing security.
When the energy big data center needs to share and distribute power data, it encrypts the power data to be shared and distributed with the AES encryption algorithm and sends the encrypted power data to each energy data management platform. On receiving the ciphertext of the shared and distributed power data, each energy data management platform decrypts it to obtain the plaintext and stores the plaintext in its power database. The privacy budget of the differential privacy query optimization algorithm is adjusted adaptively according to the characteristics of the newly added data, which improves the security and quality of storing the newly added power data, reduces the possibility of data theft, and thereby reduces the risk to the data during sharing and distribution.
Following the process of this embodiment, risk control for sharing and distributing energy big data based on differential privacy protection can be achieved.
It should be noted that the order of the embodiments of the present invention is for description only and does not indicate their relative merits. The foregoing describes specific embodiments of this specification. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, the embodiments are described in a progressive manner; the same or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments.
The above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Any modification of the technical solutions described in the foregoing embodiments, or equivalent replacement of some of their technical features, that does not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application shall be included within the protection scope of the present application.

Claims (9)

1. The energy big data sharing and distribution risk control method based on differential privacy protection is characterized by comprising the following steps of:
acquiring the various types of existing power data in an energy data management platform, and sorting each type of existing power data by time to form a stored power data sequence for each type of power data; for each newly added power data, obtaining a newly added power data sequence by the same method used to obtain the stored power data sequences;
setting a category newly-added weight for each newly added power data sequence according to the change in the data categories of the energy data management platform before and after the newly added power data sequence is added; performing polynomial fitting on each newly added power data sequence to obtain its polynomial curve, and obtaining the peak compactness of each newly added power data sequence according to the peaks of its polynomial curve and the time corresponding to each peak; obtaining a data category consistency index of each newly added power data sequence according to the category newly-added weight, the peak compactness, and the cosine similarity between the newly added power data sequence and the stored power data sequences; obtaining a power consumption macroscopic influence index of each newly added power data sequence according to the data category consistency index and the differences between the peak mean values of the polynomial curves of the newly added power data sequences; constructing a power consumption microscopic change index of each newly added power data sequence according to the variation of the polynomial curve and its peaks and troughs; obtaining a subdivision degree index of each newly added power data sequence according to the power consumption macroscopic influence index and the power consumption microscopic change index;
Constructing privacy budget correction factors of each newly-added power data sequence according to the data category consistency index and the subdivision degree index, and obtaining self-adaptive privacy budgets of each newly-added power data sequence according to preset initial privacy budgets and the privacy budget correction factors; processing each newly added power data sequence by adopting a differential privacy query optimization algorithm in combination with the self-adaptive privacy budget to finish the safe storage of the power data;
the peak compactness of each newly added power data sequence comprises:
for each newly added power data sequence, calculating the maximum value and the minimum value of the peaks in the polynomial curve of the newly added power data sequence, calculating the difference between the maximum value and the minimum value, calculating the product of the total number of peak points of the newly added power data sequence and the difference, and obtaining the reciprocal of the product;
counting the time corresponding to each peak in the polynomial curve of the newly added power data sequence, calculating the absolute value of the difference between the times of adjacent peaks, and taking the negative of the sum of these absolute values over all adjacent peaks in the polynomial curve as the exponent of an exponential function with the natural constant as its base;
and taking the product of the reciprocal and the result of the exponential function as the peak compactness of the newly added power data sequence.
2. The method for controlling risk of sharing and distributing big energy data based on differential privacy protection as set forth in claim 1, wherein setting the category newly-added weight of each newly added power data sequence comprises:
for each newly added power data, counting the total number of data categories of the energy data management platform before and after the newly added power data is added, and recording the two totals as [symbols missing in this text];
when [first condition on the two totals; formula missing in this text], setting the category newly-added weight of the newly added power data sequence corresponding to the newly added power data to 1;
when [second condition on the two totals; formula missing in this text], setting the category newly-added weight of the newly added power data sequence corresponding to the newly added power data to 2.
3. The method for controlling risk of sharing and distributing big energy data based on differential privacy protection as set forth in claim 1, wherein the data category consistency index of each newly added power data sequence comprises:
For the stored power data sequence, obtaining the peak compactness of the stored power data sequence by adopting a calculation method of the peak compactness of the newly added power data sequence;
for each newly added power data sequence, taking the negative of the category newly-added weight of the newly added power data sequence as the exponent of an exponential function with the natural constant as its base;
for each stored power data sequence, calculating the absolute value of the difference between the peak compactness of the newly added power data sequence and that of the stored power data sequence, obtaining the sum of this absolute value and a preset value, calculating the cosine similarity between the newly added power data sequence and the stored power data sequence, and calculating the ratio of the cosine similarity to said sum; summing the ratios obtained for the newly added power data sequence over all stored power data sequences, and recording the result as a first sum value;
and taking the product of the result of the exponential function and the first sum value as the data category consistency index of the newly added power data sequence.
4. The method for controlling risk of sharing and distributing energy big data based on differential privacy protection as set forth in claim 1, wherein the macroscopic impact index of electricity consumption of each newly added power data sequence comprises:
calculating the absolute value of the difference between the peak mean value of the polynomial curve of each newly added power data sequence and that of each other newly added power data sequence, and obtaining the sum of these absolute values over all the other newly added power data sequences;
and taking the product of the data category consistency index of the newly added power data sequence and the sum value as the power consumption macroscopic influence index of the newly added power data sequence.
5. The method for controlling risk of sharing and distributing energy big data based on differential privacy protection as set forth in claim 1, wherein the constructing the microscopic change index of the power consumption of each newly added power data sequence according to the change condition of the polynomial curve and each peak and trough comprises:
for each newly added power data sequence, constructing a peak-valley pair of a polynomial curve of each newly added power data sequence, extracting each round trip point of the polynomial curve, and counting the number of the round trip points;
for each peak-valley pair, recording the difference between the peak value and the trough value of the pair as a first difference, recording the difference between the time of the peak point and the time of the trough point of the pair as a second difference, and obtaining the absolute value of the ratio of the first difference to the second difference; calculating the sum of these absolute values over all peak-valley pairs;
and taking the product of the number of round-trip points and the sum as the power consumption microscopic change index of the newly added power data sequence.
6. The method for controlling risk of sharing and distributing big energy data based on differential privacy protection as set forth in claim 5, wherein the constructing the peak-valley pairs of the polynomial curves of each newly added power data sequence, extracting each round trip point of the polynomial curves, comprises:
for the polynomial curve of each newly added power data sequence, counting all peak points and trough points on the polynomial curve and, starting from the first trough point, combining each trough point with the next peak point after that trough point to form a peak-valley pair;
and acquiring the ordinate equal to one half of the data value of the minimum peak point of the polynomial curve, taking the horizontal line that passes through this ordinate and is parallel to the abscissa axis as a round-trip judgment line, and taking each point of the polynomial curve that lies on the round-trip judgment line as a round-trip point of the polynomial curve.
7. The method for controlling risk of sharing and distributing big energy data based on differential privacy protection according to claim 1, wherein the step of obtaining the subdivision index of each newly added power data sequence according to the macroscopic influence index of power consumption and the microscopic change index of power consumption comprises the following steps:
obtaining the sum of a macroscopic impact index of the power consumption of the newly-added power data sequence and a preset value, wherein the preset value is larger than zero;
and taking the ratio of the microscopic change index of the power consumption of the newly-added power data sequence to the sum value as the subdivision degree index of the newly-added power data sequence.
8. The method for controlling risk of sharing and distributing big energy data based on differential privacy protection as set forth in claim 1, wherein the constructing the privacy budget correction factor of each newly added power data sequence according to the data category consistency index and the subdivision degree index comprises:
and acquiring the sum value of the subdivision degree index and the preset value of the newly-added power data sequence, and taking the ratio of the data category consistency index of the newly-added power data sequence to the sum value as the privacy budget correction factor of the newly-added power data sequence, wherein the preset value is larger than zero.
9. The differential privacy protection-based energy big data sharing distribution risk control method according to claim 8, wherein the adaptive privacy budget is a product of a preset initial privacy budget and a privacy budget correction factor.
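The index definitions recited in claims 1, 3 and 4 can be sketched in Python as follows. This is a reading of the claim wording under stated assumptions, not an authoritative implementation: the function names, the `(time, value)` peak representation, equal-length sequences for the cosine similarity, the interpretation of the peak mean value as a single number per curve, and the value of `mu` are all introduced here for illustration.

```python
import math
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (time, value) of one peak on the fitted polynomial curve

def peak_compactness(peaks: Sequence[Peak]) -> float:
    """exp(-sum of adjacent peak time gaps) / (peak count * (max peak - min peak))."""
    values = [v for _, v in peaks]
    times = sorted(t for t, _ in peaks)
    spread = max(values) - min(values) if values else 0.0
    if len(peaks) < 2 or spread == 0:
        raise ValueError("needs at least two peaks with distinct values")
    gap_sum = sum(abs(b - a) for a, b in zip(times, times[1:]))
    return math.exp(-gap_sum) / (len(peaks) * spread)

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def consistency_index(weight: float, new_seq: Sequence[float], new_compactness: float,
                      stored: List[Tuple[Sequence[float], float]], mu: float = 1e-3) -> float:
    """exp(-weight) times the summed cosine / (|compactness gap| + mu) over stored sequences."""
    first_sum = sum(cosine_similarity(new_seq, seq) / (abs(new_compactness - c) + mu)
                    for seq, c in stored)
    return math.exp(-weight) * first_sum

def macro_impact_index(consistency: float, own_peak_mean: float,
                       other_peak_means: Sequence[float]) -> float:
    """Consistency index times the summed |peak mean gap| to the other new sequences."""
    return consistency * sum(abs(own_peak_mean - m) for m in other_peak_means)

# Hypothetical inputs.
compactness = peak_compactness([(0.0, 3.0), (1.0, 5.0)])
index = consistency_index(1.0, [1.0, 0.0], compactness,
                          [([1.0, 0.0], compactness), ([0.0, 1.0], 0.5)], mu=1.0)
print(compactness, index, macro_impact_index(index, 5.0, [3.0, 8.0]))
```

In this reading, a stored sequence that is orthogonal to the new one contributes nothing to the consistency index, and a category weight of 2 lowers the index relative to a weight of 1.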
CN202410295368.9A 2024-03-15 Energy big data sharing and distribution risk control method based on differential privacy protection Active CN117892357B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410295368.9A CN117892357B (en) 2024-03-15 Energy big data sharing and distribution risk control method based on differential privacy protection

Publications (2)

Publication Number Publication Date
CN117892357A CN117892357A (en) 2024-04-16
CN117892357B true CN117892357B (en) 2024-05-31

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112001415A (en) * 2020-07-15 2020-11-27 西安电子科技大学 Location difference privacy protection method based on countermeasure network
WO2021248937A1 (en) * 2020-06-09 2021-12-16 深圳大学 Geographically distributed graph computing method and system based on differential privacy
CN115378707A (en) * 2022-08-23 2022-11-22 西安电子科技大学 Adaptive sampling federal learning privacy protection method based on threshold homomorphism
CN116595553A (en) * 2023-05-18 2023-08-15 马贤 Encryption method of intelligent electric meter of Internet of things with differential privacy protection
CN117118736A (en) * 2023-09-20 2023-11-24 辽宁大学 Privacy protection method of NDN file sharing system facing privacy protection
CN117235770A (en) * 2023-10-27 2023-12-15 国网浙江省电力有限公司信息通信分公司 Power data sharing analysis system and method based on differential privacy
CN117454408A (en) * 2023-12-01 2024-01-26 上海零数众合信息科技有限公司 Data sharing security verification method and system based on differential privacy
CN117521117A (en) * 2024-01-05 2024-02-06 深圳万海思数字医疗有限公司 Medical data application security and privacy protection method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240508

Address after: 1-10 / F, C building, No.87 courtyard, Songshan South Road, Erqi District, Zhengzhou City, Henan Province

Applicant after: ECONOMIC TECHNOLOGY RESEARCH INSTITUTE OF STATE GRID HENAN ELECTRIC POWER Co.

Country or region after: China

Applicant after: Henan Jiuyu Tenglong Information Engineering Co.,Ltd.

Address before: F1-6, No.16 Baoling Street, Dalian Economic and Technological Development Zone, Liaoning Province, 116600

Applicant before: Dalian Youguan Network Technology Co.,Ltd.

Country or region before: China

Applicant before: ECONOMIC TECHNOLOGY RESEARCH INSTITUTE OF STATE GRID HENAN ELECTRIC POWER Co.

Applicant before: Henan Jiuyu Tenglong Information Engineering Co.,Ltd.

GR01 Patent grant