CN117763620A - Electric power big data dynamic desensitization method based on isomorphic encryption algorithm - Google Patents

Publication number
CN117763620A
Authority
CN
China
Prior art keywords
disturbance
data
power
management
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410194568.5A
Other languages
Chinese (zh)
Inventor
王世谦
郭军利
邵志鹏
石磊
张小建
高宇飞
黄勇
卜飞飞
李秋燕
王圆圆
韩丁
华远鹏
宋大为
贾一博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Smart Grid Research Institute Co ltd
Zhengzhou University
Economic and Technological Research Institute of State Grid Henan Electric Power Co Ltd
Original Assignee
State Grid Smart Grid Research Institute Co ltd
Shenzhen Fushan Automation Technology Co ltd
Zhengzhou University
Economic and Technological Research Institute of State Grid Henan Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Smart Grid Research Institute Co ltd, Shenzhen Fushan Automation Technology Co ltd, Zhengzhou University, Economic and Technological Research Institute of State Grid Henan Electric Power Co Ltd
Priority to CN202410194568.5A
Publication of CN117763620A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to the technical field of data processing and provides a dynamic desensitization method for electric power big data based on a fully homomorphic encryption algorithm. The method comprises: acquiring a power transaction big data set; obtaining a to-be-concealed data set and a privacy protection data set for each power data sample in the power transaction big data set; obtaining concealment protection enhancement coefficients from the to-be-concealed data sets and privacy protection data sets of different power data samples; constructing a disturbance neighbor matrix from the concealment protection enhancement coefficients; calculating data disturbance sensitivity coefficients from the disturbance neighbor matrices; constructing a disturbance sensitive management quadtree from the data disturbance sensitivity coefficients; calculating noise sensitivity management coefficients based on the disturbance sensitive management quadtree; obtaining the power data noise sensitivity from the noise sensitivity management coefficients; and obtaining a fully homomorphic encryption result of the power data based on the power data noise sensitivity. By performing fully homomorphic encryption of the power data according to the power data noise sensitivity, the application improves the desensitization quality of the power data.

Description

Electric power big data dynamic desensitization method based on fully homomorphic encryption algorithm
Technical Field
The application relates to the technical field of data processing, and in particular to a dynamic desensitization method for electric power big data based on a fully homomorphic encryption algorithm.
Background
Currently, the demand for data mining is increasing and the collection of all kinds of information and data is becoming more frequent. Algorithms and big data technologies make data processing fast and accurate, but they also create a need to protect data privacy. With the development of smart grids, mining and analyzing the state of the power industry using published electric power big data has become a trend in the age of power informatization. Data sharing brings convenience but also the risk of privacy disclosure, so privacy protection of power data is receiving a great deal of attention.
There are many approaches to privacy protection. Common methods include k-anonymity, l-diversity and t-closeness, and data can also be desensitized by fully homomorphic encryption. When a traditional fully homomorphic encryption algorithm desensitizes electric power big data, the data is converted into ciphertext that retains the homomorphic properties of addition and multiplication. At present, LWE-based fully homomorphic encryption schemes add a key switching (key exchange) technique so that they better support homomorphic multiplication and addition. However, after each ciphertext multiplication the product ciphertext must be multiplied by a key switching matrix to bring it back to a ciphertext of normal dimension, which greatly affects computational efficiency. In addition, noise management in fully homomorphic encryption is directly related to the security and precision of the encryption: if the noise is too large, decryption accuracy is reduced; if the noise in the ciphertext is too small, the security of the encryption is reduced. These problems reduce the efficiency and accuracy of desensitizing electric power big data.
Disclosure of Invention
The application provides a dynamic desensitization method for electric power big data based on a fully homomorphic encryption algorithm, which aims to solve the problem of poor desensitization quality of power data, and adopts the following technical scheme:
One embodiment of the application provides a dynamic desensitization method for electric power big data based on a fully homomorphic encryption algorithm, which comprises the following steps:
acquiring a power transaction big data set;
dividing the data of each power data sample in the power transaction big data set into a to-be-concealed data set and a privacy protection data set; calculating a concealment protection enhancement coefficient according to the differences between the to-be-concealed data sets and the privacy protection data sets of different power data samples in the power transaction big data set;
constructing a disturbance neighbor matrix of each power data sample according to the concealment protection enhancement coefficients among different power data samples in the power transaction big data set; calculating a data disturbance sensitivity coefficient according to the disturbance neighbor matrix of each power data sample in the power transaction big data set;
constructing a disturbance sensitive management quadtree according to the data disturbance sensitivity coefficients among different power data samples in the power transaction big data set, and calculating a noise sensitivity management coefficient according to the disturbance sensitive management quadtree;
and obtaining the power data noise sensitivity according to the noise sensitivity management coefficient, and completing the desensitization processing of the power data based on the power data noise sensitivity.
Preferably, the method for dividing the data of each power data sample in the power transaction big data set into a to-be-concealed data set and a privacy protection data set comprises:
taking the name, contact telephone and electricity meter user number of each power data sample in the power transaction big data set as to-be-concealed data information, and taking the set formed by all to-be-concealed data information of each power data sample as the to-be-concealed data set of that sample; taking the age, electricity consumption and account balance of each power data sample as to-be-privacy-protected data information, and taking the set formed by all to-be-privacy-protected data information of each power data sample as the privacy protection data set of that sample.
Preferably, the method for calculating the concealment protection enhancement coefficient according to the differences between the to-be-concealed data sets and the privacy protection data sets of different power data samples in the power transaction big data set comprises:
for any two power data samples in the power transaction big data set, taking the Jaccard coefficient of their to-be-concealed data sets as the concealment enhancement coefficient of the two power data samples, and taking the Jaccard coefficient of their privacy protection data sets as the privacy protection coefficient of the two power data samples;
taking the concealment enhancement coefficient of the two power data samples as the numerator and the sum of their privacy protection coefficient and 0.01 as the denominator, and taking the ratio of the numerator to the denominator as the concealment protection enhancement coefficient of the two power data samples.
Preferably, the method for constructing the disturbance neighbor matrix of each power data sample according to the concealment protection enhancement coefficients among different power data samples in the power transaction big data set comprises:
taking the power transaction big data set as input, obtaining a preset number of neighbor samples of each power data sample with a nearest neighbor algorithm, and taking the set formed by each power data sample and its preset number of neighbor samples as the neighbor sample set of that sample; obtaining disturbance neighbor sequences from the neighbor sample set of each power data sample, and obtaining the disturbance neighbor matrix of each power data sample based on the disturbance neighbor sequences.
Preferably, the method for obtaining the disturbance neighbor sequences from the neighbor sample set of each power data sample in the power transaction big data set and obtaining the disturbance neighbor matrix of each power data sample based on the disturbance neighbor sequences comprises:
for the neighbor sample set of each power data sample in the power transaction big data set, arranging the concealment protection enhancement coefficients between each element and the other elements of the neighbor sample set in ascending order to form the disturbance neighbor sequence of that element; taking each disturbance neighbor sequence as one row of a matrix, and taking the matrix formed by all disturbance neighbor sequences of the neighbor sample set as the disturbance neighbor matrix of the power data sample.
Preferably, the method for calculating the data disturbance sensitivity coefficient according to the disturbance neighbor matrix of each power data sample in the power transaction big data set comprises the following steps:
In the formula (the equation and its symbols did not survive extraction; the index letters i, j and k below are placeholders): the result is the data disturbance sensitivity coefficient between the i-th and j-th power data samples in the power transaction big data set; the two vector terms are the vectors formed by the k-th row elements of the disturbance neighbor matrices of the i-th and j-th power data samples, respectively; the similarity term is the cosine similarity between these two row vectors; the two rank terms are the ranks of the disturbance neighbor matrices of the i-th and j-th power data samples, respectively; exp(·) is the exponential function with the natural constant as its base; and the row-count term is the number of rows of the disturbance neighbor matrices of the i-th and j-th power data samples.
Preferably, the method for constructing the disturbance sensitive management quadtree according to the data disturbance sensitivity coefficients among different power data samples in the power transaction big data set and calculating the noise sensitivity management coefficient according to the disturbance sensitive management quadtree comprises:
taking each power data sample in the power transaction big data set as a node, and taking the data disturbance sensitivity coefficient between different power data samples as the weight of the edge between the corresponding nodes; taking the weighted undirected graph formed by the nodes of all power data samples and the edge weights as the disturbance sensitive weighted undirected graph of the power transaction big data set; taking the disturbance sensitive weighted undirected graph as input, obtaining its clustering result with a clustering algorithm, and constructing a disturbance sensitive management quadtree according to the clustering result;
for the disturbance sensitive management quadtree of each disturbance sensitive management sample set, taking the sequence formed by the statistical quantities of all root nodes in the quadtree as the disturbance sensitive management basic quantity sequence, and taking the sequence formed by the quantities of all leaf nodes corresponding to each root node as the disturbance sensitive management quantity sequence; and determining the noise sensitivity management coefficient of each disturbance sensitive management sample set according to an analysis of how the disturbance sensitive management basic quantity sequence and the disturbance sensitive management quantity sequence influence the noise sensitivity of that sample set.
Preferably, the method for constructing the disturbance sensitive management quadtree according to the clustering result of the disturbance sensitive weighted undirected graph of the power transaction big data set comprises:
for each cluster in the clustering result of the disturbance sensitive weighted undirected graph of the power transaction big data set, taking the set formed by the power data samples of all nodes in the cluster as a disturbance sensitive management sample set; for any two power data samples in each disturbance sensitive management sample set, performing an exclusive-or operation on the elements at the same positions of the disturbance neighbor matrices of the two power data samples, and taking the resulting matrix as the disturbance neighbor judgment matrix of the two power data samples in that sample set;
for the disturbance neighbor judgment matrix of any two power data samples in each disturbance sensitive management sample set, dividing each column of elements of the matrix evenly into groups, converting each group into a decimal number, and mapping the decimal conversion results of each column into a quadtree; taking the mapping results of all columns of the disturbance neighbor judgment matrix in the quadtree as one group of quadtree mapping results; and taking the mapping results, on the quadtree, of all disturbance neighbor judgment matrices of each disturbance sensitive management sample set as the disturbance sensitive management quadtree of that sample set.
Preferably, the method for analyzing the degree of noise sensitivity influence on each disturbance sensitive management sample set according to the correspondence between the disturbance sensitive management basic quantity sequence and the disturbance sensitive management quantity sequence, and calculating the noise sensitivity management coefficient according to the analysis result, comprises:
In the formulas (the equations and their symbols did not survive extraction, so the quantities are described in words): the first result is the noise sensitivity influence coefficient of an element of the disturbance sensitive management basic quantity sequence; its formula involves the disturbance sensitive management quantity sequence corresponding to that element, the maximum value of that quantity sequence, the value of each element of that quantity sequence, and the number of data in that quantity sequence;
the second result is the noise sensitivity management coefficient of a disturbance sensitive management sample set of the disturbance sensitive weighted undirected graph of the power transaction big data set; its formula involves the disturbance sensitive management basic quantity sequence of that sample set, the maximum value of that basic quantity sequence, and the number of elements in that basic quantity sequence.
Preferably, the method for obtaining the power data noise sensitivity according to the noise sensitivity management coefficient and completing the desensitization processing of the power data based on the power data noise sensitivity comprises:
taking all power data samples of each disturbance sensitive management sample set of the disturbance sensitive weighted undirected graph of the power transaction big data set as one group of dynamic encryption data; taking the set formed by the noise sensitivity management coefficients of all disturbance sensitive management sample sets of the power transaction big data set as the noise sensitivity management quantization set; obtaining the normalization result of the noise sensitivity management quantization set with a normalization algorithm; and taking the normalized value of the noise sensitivity management coefficient of each disturbance sensitive management sample set as the power data noise sensitivity of that sample set;
taking all groups of dynamic encryption data of the power transaction big data set and the corresponding power data noise sensitivities as inputs of a fully homomorphic encryption algorithm to obtain the fully homomorphic encryption result of the power transaction big data set.
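The normalization step above can be sketched as follows. The text only names "a normalization algorithm", so min-max scaling is an assumption made for illustration, and the function name and the sample coefficients are hypothetical.

```python
def min_max_normalize(coefficients):
    """Scale noise sensitivity management coefficients into [0, 1].

    Min-max scaling is an assumed choice; the source does not name the
    normalization algorithm it uses.
    """
    lo, hi = min(coefficients), max(coefficients)
    if hi == lo:
        # All sample sets have the same coefficient: no spread to scale.
        return [0.0 for _ in coefficients]
    return [(c - lo) / (hi - lo) for c in coefficients]


# One hypothetical coefficient per disturbance sensitive management sample set.
noise_sensitivities = min_max_normalize([2.0, 4.0, 6.0])
```

Each normalized value would then accompany its group of dynamic encryption data as the power data noise sensitivity passed to the encryption step.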
The beneficial effects of the application are as follows: by considering the disturbance sensitivity of different users' data in the power transaction big data, power data with the same degree of data disturbance sensitivity is grouped together for fully homomorphic encryption protection; a noise sensitivity management coefficient is constructed by analyzing the distribution of data disturbance sensitivity within each group of power data; and the degree of noise addition in the fully homomorphic encryption process is quantified based on the noise sensitivity management coefficient. This avoids the loss of encryption accuracy caused by overly large noise parameters in fully homomorphic encryption and improves the accuracy of dynamic desensitization of power transaction data.
Drawings
In order to illustrate the embodiments of the application or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the application; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a dynamic desensitization method for electric power big data based on a fully homomorphic encryption algorithm according to an embodiment of the application;
FIG. 2 is a schematic diagram of a quad-tree construction process according to one embodiment of the present application.
Detailed Description
The embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.
Referring to fig. 1, which shows a flowchart of a dynamic desensitization method for electric power big data based on a fully homomorphic encryption algorithm according to an embodiment of the application, the method comprises the following steps:
Step S001, acquiring a power transaction big data set.
The information of each user in the electric power big database is collected and organized, including each user's name, age, contact telephone, electricity consumption and account balance. The implementer can adjust which data is collected according to the content to be protected. A user power data table is compiled from the information of each user. In this table, the name, age, contact telephone and electricity consumption of each user are sensitive information, and the electricity consumption and account balance of each user are data to be published. Sensitive data must be encrypted so that it is concealed, and data to be published must be protected by differential privacy. The specific calculation process of differential privacy protection is a known technique and is not described here.
Further, each piece of data in the user power data table corresponds to the name, age, contact telephone, electricity meter user number, electricity consumption and account balance information of a user, the name, contact telephone and electricity meter user number information in each piece of data are used as data information to be concealed, and the age, electricity consumption and account balance information in each piece of data are used as data information to be privacy protected. And taking each piece of data in the user power data table as a power data sample, and taking a set formed by all power data samples corresponding to the user power data table as a power transaction big data set.
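The split of one record into the two data sets can be sketched as below. The field names, the sample record and the choice to store (field, value) pairs as set elements are assumptions for illustration; the text only lists which kinds of information go into each set.

```python
# Hypothetical field names; the source only names the kinds of information.
CONCEAL_FIELDS = ("name", "phone", "meter_user_no")
PRIVACY_FIELDS = ("age", "consumption_kwh", "balance")


def split_sample(record):
    """Return (to-be-concealed data set, privacy protection data set).

    Elements are stored as (field, value) pairs so that two samples only
    share an element when the same field holds the same value.
    """
    to_conceal = {(f, record[f]) for f in CONCEAL_FIELDS}
    to_protect = {(f, record[f]) for f in PRIVACY_FIELDS}
    return to_conceal, to_protect


sample = {
    "name": "Zhang San", "phone": "13800000000", "meter_user_no": "M-001",
    "age": 35, "consumption_kwh": 210.5, "balance": 88.0,
}
conceal_set, privacy_set = split_sample(sample)
```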
Thus, the power transaction big data set is obtained.
Step S002, obtaining a concealment enhancement coefficient and a privacy protection coefficient according to the data characteristics of different power data samples in the power transaction big data set, constructing a disturbance neighbor matrix according to the concealment enhancement coefficient and the privacy protection coefficient, and calculating a data disturbance sensitivity coefficient based on the disturbance neighbor matrix.
When electric power big data is shared, sensitive information in it must be encrypted and data containing private information must be protected by differential privacy. For example, the owner of the electric power big data may need to send the data to a third-party platform for sharing or analysis, but the original data cannot be sent to the third-party platform directly. The data can instead be encrypted with a fully homomorphic encryption algorithm and the ciphertext sent to the third-party platform, which can still perform homomorphic operations on the ciphertext to obtain analysis results for the electric power big data. When the ciphertext is transmitted to the third-party platform for analysis, the higher the sensitivity of the plaintext formed by the electric power big data, the more likely it is that the accuracy of the data decrypted after fully homomorphic encryption is reduced.
Furthermore, ciphertext produced by fully homomorphic encryption supports homomorphic addition and multiplication, but each homomorphic multiplication increases the noise. Therefore, the noise parameters of the fully homomorphic encryption algorithm are adjusted according to an analysis of the characteristics of the electric power big data, to avoid the loss of encryption accuracy caused by overly large noise parameters. Specifically, the set of to-be-concealed data information of each power data sample in the power transaction big data set is taken as the to-be-concealed data set of that sample, and the set of to-be-privacy-protected data of each power data sample is taken as its privacy protection data set.
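The role of the noise parameter can be illustrated with a toy LWE-style bit encryption. This is only a sketch under stated assumptions: it is not the scheme of the application, it is not fully homomorphic (only addition of bits modulo 2 is shown), and its parameters are far too small to be secure. All names (`Q`, `N`, `keygen`, `encrypt`, `decrypt`, `add`) are hypothetical.

```python
import random

Q = 2 ** 15   # toy ciphertext modulus (assumed)
N = 16        # toy secret dimension (assumed)


def keygen(rng):
    return [rng.randrange(Q) for _ in range(N)]


def encrypt(bit, secret, noise, rng):
    """Encrypt one bit; `noise` is the explicit error term added to the ciphertext."""
    a = [rng.randrange(Q) for _ in range(N)]
    b = (sum(x * s for x, s in zip(a, secret)) + noise + bit * (Q // 2)) % Q
    return a, b


def decrypt(ciphertext, secret):
    a, b = ciphertext
    d = (b - sum(x * s for x, s in zip(a, secret))) % Q
    # d equals noise + bit * Q/2 (mod Q); decoding is correct while |noise| < Q/4.
    return 1 if Q // 4 <= d < 3 * Q // 4 else 0


def add(ct1, ct2):
    """Homomorphic addition: messages add modulo 2 and the noise terms add up."""
    a = [(x + y) % Q for x, y in zip(ct1[0], ct2[0])]
    return a, (ct1[1] + ct2[1]) % Q


rng = random.Random(0)
secret = keygen(rng)
small = decrypt(add(encrypt(1, secret, 3, rng), encrypt(1, secret, -2, rng)), secret)
too_noisy = decrypt(encrypt(0, secret, Q // 3, rng), secret)  # noise exceeds Q/4
```

With small noise the sum of two encrypted 1-bits decrypts to 0, as expected for addition modulo 2, while the encryption of 0 with noise above Q/4 decrypts to 1. With zero noise every ciphertext would be an exact linear equation in the secret, which is why the noise cannot simply be removed.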
Further, for any two power data samples in the power transaction big data set, the Jaccard coefficient of their to-be-concealed data sets is taken as the concealment enhancement coefficient of the two samples. The larger the concealment enhancement coefficient, the closer the to-be-concealed data information of the two samples, and the more the degree of concealment processing needs to be enhanced. The Jaccard coefficient of their privacy protection data sets is taken as the privacy protection coefficient of the two samples. Because the privacy protection data consists of the user's age, electricity consumption and account balance, the closer these data are, the lower the degree of distinction between them; that is, the larger the privacy protection coefficient, the lower the sensitivity of the privacy protection data. The concealment enhancement coefficient of the two samples is taken as the numerator, the sum of their privacy protection coefficient and 0.01 is taken as the denominator, and the ratio of the numerator to the denominator is taken as the concealment protection enhancement coefficient of the two samples. The larger the concealment protection enhancement coefficient, the closer the to-be-concealed information of the two samples and the larger the difference in their to-be-privacy-protected information. The specific calculation process of the Jaccard coefficient is a known technique and is not described here.
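A minimal sketch of this calculation follows. The ratio and the constant 0.01 come from the text; the function names, the (field, value) set representation and the sample values are assumptions.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def concealment_protection_enhancement(conceal_a, conceal_b, privacy_a, privacy_b):
    concealment_enhancement = jaccard(conceal_a, conceal_b)
    privacy_protection = jaccard(privacy_a, privacy_b)
    # The 0.01 in the denominator keeps the ratio finite when the privacy
    # protection coefficient is zero.
    return concealment_enhancement / (privacy_protection + 0.01)


# Two hypothetical samples: same name, different phone and meter number;
# same age and consumption, different balance.
conceal_a = {("name", "A"), ("phone", "1"), ("meter", "M1")}
conceal_b = {("name", "A"), ("phone", "2"), ("meter", "M2")}
privacy_a = {("age", 30), ("kwh", 100), ("balance", 50)}
privacy_b = {("age", 30), ("kwh", 100), ("balance", 70)}

coefficient = concealment_protection_enhancement(conceal_a, conceal_b, privacy_a, privacy_b)
```

Here the concealment enhancement coefficient is 1/5 = 0.2 and the privacy protection coefficient is 2/4 = 0.5, so the result is 0.2 / 0.51.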
Further, taking the power transaction big data set as input, the K nearest neighbor algorithm is used to obtain K neighbor samples (K takes an empirical value of 12) of each power data sample in the power transaction big data set, and the set formed by each power data sample and its K neighbor samples is taken as the neighbor sample set of that sample. A disturbance neighbor matrix is then built from the neighbor sample set of each power data sample. Specifically, for the neighbor sample set of each power data sample, the concealment protection enhancement coefficients between each element and the other elements of the neighbor sample set are arranged in ascending order to form the disturbance neighbor sequence of that element; each disturbance neighbor sequence is taken as one row of a matrix, and the matrix formed by all disturbance neighbor sequences of the neighbor sample set is taken as the disturbance neighbor matrix of the power data sample. The specific calculation process of the K nearest neighbor algorithm is a known technique and is not described here.
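The construction above can be sketched with NumPy. The text does not say which features or distance the K nearest neighbor search uses, so Euclidean distance over a hypothetical numeric feature matrix is assumed, and the pairwise coefficient matrix is filled with random placeholder values.

```python
import numpy as np


def neighbor_sets(features, k):
    """For each sample, return its index followed by its k nearest neighbors.

    Euclidean distance over `features` is an assumption; the source only
    names the K nearest neighbor algorithm.
    """
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)          # a sample is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    return [np.concatenate(([i], nearest[i])) for i in range(len(features))]


def disturbance_neighbor_matrix(members, coef):
    """One row per member: its coefficients to the other members, ascending."""
    rows = []
    for p in members:
        others = [q for q in members if q != p]
        rows.append(np.sort(coef[p, others]))
    return np.vstack(rows)


rng = np.random.default_rng(0)
features = rng.random((20, 3))              # hypothetical numeric features
coef = rng.random((20, 20))
coef = (coef + coef.T) / 2                  # placeholder symmetric coefficients
sets = neighbor_sets(features, k=12)
matrix = disturbance_neighbor_matrix(sets[0], coef)
```

With K = 12 each neighbor sample set has 13 members, so each disturbance neighbor matrix in this sketch has 13 rows and 12 columns.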
Further, the disturbance neighbor matrix of each power data sample in the power transaction big data set represents the sensitivity change degree of each power data sample between the data needing to be hidden and the data needing to be privacy protected, so that the data disturbance sensitivity coefficient between different power data samples is calculated according to the disturbance neighbor matrix of each power data sample in the power transaction big data set, and a specific calculation formula is as follows:
In the formula (the equation and its symbols did not survive extraction; the index letters i, j and k below are placeholders): the result is the data disturbance sensitivity coefficient between the i-th and j-th power data samples in the power transaction big data set; the two vector terms are the vectors formed by the k-th row elements of the disturbance neighbor matrices of the i-th and j-th power data samples, respectively; the similarity term is the cosine similarity between these two row vectors; the two rank terms are the ranks of the disturbance neighbor matrices of the i-th and j-th power data samples, respectively; exp(·) is the exponential function with the natural constant as its base; and the row-count term is the number of rows of the disturbance neighbor matrices of the i-th and j-th power data samples.
If the row elements of the disturbance neighbor matrices of the i-th and j-th power data samples in the power transaction big data set differ more, the calculated cosine similarity term is smaller. At the same time, the larger the difference in the distribution of elements between the disturbance neighbor matrices of the i-th and j-th power data samples, the larger the calculated value of the corresponding term. A larger data disturbance sensitivity coefficient between the i-th and j-th power data samples indicates that, judged by comparing their disturbance neighbor matrices, the two samples are close in terms of power data disturbance sensitivity.
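Because the formula itself is not recoverable from this text, the sketch below computes only the ingredients the definitions name: the row-wise cosine similarity of two disturbance neighbor matrices and the difference of their ranks. How the patent combines them (including the exponential) is deliberately left open, and the function names are hypothetical.

```python
import numpy as np


def row_cosine_similarities(a, b):
    """Cosine similarity between the k-th rows of two equally shaped matrices."""
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    # Rows with zero norm get similarity 0 instead of a division error.
    return np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)


def rank_difference(a, b):
    """Absolute difference between the ranks of two matrices."""
    return abs(int(np.linalg.matrix_rank(a)) - int(np.linalg.matrix_rank(b)))


A = np.array([[1.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [1.0, 1.0]])
C = np.array([[1.0, 0.0], [2.0, 0.0]])

similarities = row_cosine_similarities(A, B)
```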
So far, the data disturbance sensitivity coefficient is obtained.
and step S003, constructing a disturbance sensitive management quadtree according to the data disturbance sensitive coefficient, and calculating a noise sensitive management coefficient based on the disturbance sensitive management quadtree.
Each power data sample in the power transaction big data set is taken as a node, and the data disturbance sensitivity coefficient between two different power data samples is taken as the weight of the edge connecting the corresponding nodes. The weighted undirected graph formed by the nodes of all power data samples and the weighted edges is the disturbance sensitive weighted undirected graph of the power transaction big data set. With this graph as input, a Markov clustering algorithm is used to obtain its clustering result, in which each cluster contains nodes with a similar degree of data disturbance sensitivity. The Markov clustering algorithm is a known technique and is not described in detail here.
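As a rough illustration of the clustering step, the following is a minimal pure-Python sketch of Markov clustering (alternating expansion and inflation on a column-stochastic matrix). The function name `markov_cluster`, the added self-loops, the inflation value of 2, and the cluster read-out threshold are assumptions chosen for illustration; the source does not specify these parameters.

```python
def _normalize_columns(m):
    """Scale every column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)] for i in range(n)]


def _matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def markov_cluster(weights, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster the nodes of a weighted undirected graph given as a symmetric matrix."""
    n = len(weights)
    if n == 0:
        return []
    m = [[float(weights[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        m[i][i] = max(m[i][i], 1.0)  # self-loops, a common MCL convention (assumption)
    m = _normalize_columns(m)
    for _ in range(max_iter):
        expanded = _matmul(m, m)  # expansion: random walks of length two
        inflated = _normalize_columns([[v ** inflation for v in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Each non-zero row belongs to an attractor; its non-zero columns form one cluster.
    clusters = []
    for row in m:
        members = sorted(j for j, v in enumerate(row) if v > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return sorted(clusters)
```

In this sketch the edge weights would be the data disturbance sensitivity coefficients, and each returned list of node indices would become one disturbance sensitive management sample set.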
Further, for each cluster in the clustering result of the disturbance sensitive weighted undirected graph of the power transaction big data set, the set formed by the power data samples corresponding to all nodes in the cluster is taken as a disturbance sensitive management sample set, and a disturbance sensitive management quadtree is constructed from the disturbance sensitive management sample set. Specifically, for any two power data samples in the disturbance sensitive management sample set, an exclusive-or operation is performed on the elements at the same position in the disturbance neighbor matrices of the two power data samples, the resulting matrix is taken as the disturbance neighbor judgment matrix of the two power data samples, and a quadtree is constructed from the disturbance neighbor judgment matrix.
Specifically, for example, the first column of the disturbance neighbor judgment matrix of two power data samples is divided into groups of two elements, and each group is converted into a decimal number. According to the value of each element in the conversion result, nodes with the same value are searched for in the quadtree in sequence, where the fork nodes of the quadtree carry the possible decimal values. The mapping of the conversion result in the quadtree starts at the root node whose value equals the first element of the conversion result, grows downward in sequence, and ends at the leaf node whose value equals the last element. The process of building the quadtree from the first column is shown in fig. 2. All columns of the disturbance neighbor judgment matrix are mapped into the quadtree in this way, and the start point and end point of each column's mapping are counted. For example, if two columns of the disturbance neighbor judgment matrix have the same conversion result, they share the same start point and end point, and the statistical quantities at the corresponding root node and at the corresponding leaf node increase by one for each mapping. The mapping results in the quadtree of all columns of the disturbance neighbor judgment matrices of a disturbance sensitive management sample set are taken as the disturbance sensitive management quadtree of that sample set, and the quadtree reflects the distribution of the degree of data disturbance sensitivity within the sample set.
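The mapping described above can be sketched roughly as follows. This is an illustrative reading, not the definitive construction: it assumes the disturbance neighbor matrices have already been binarized so that the exclusive-or is defined, that each column has an even number of elements, and that the "number of leaf nodes corresponding to each root node" means the end-point counts found under that root. The names `QuadNode`, `xor_matrices`, `column_to_digits`, `insert_column`, `build_quadtree`, and `count_sequences` are hypothetical.

```python
class QuadNode:
    """One node of the quadtree; value is a decimal digit obtained from two binary elements."""

    def __init__(self, value):
        self.value = value
        self.children = {}    # digit -> QuadNode
        self.start_count = 0  # how many column mappings start here (root nodes)
        self.end_count = 0    # how many column mappings end here (leaf nodes)


def xor_matrices(a, b):
    """Element-wise exclusive-or of two equally shaped 0/1 matrices."""
    return [[x ^ y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def column_to_digits(bits):
    """Split a binary column into pairs and convert each pair to a decimal digit 0..3."""
    if not bits or len(bits) % 2:
        raise ValueError("column must be non-empty with an even number of elements")
    return [2 * bits[i] + bits[i + 1] for i in range(0, len(bits), 2)]


def insert_column(roots, bits):
    """Map one column into the quadtree and update the start and end counts."""
    digits = column_to_digits(bits)
    node = roots.setdefault(digits[0], QuadNode(digits[0]))
    node.start_count += 1
    for digit in digits[1:]:
        node = node.children.setdefault(digit, QuadNode(digit))
    node.end_count += 1
    return node


def build_quadtree(judgment_matrices):
    """Map every column of every judgment matrix into one shared quadtree."""
    roots = {}
    for matrix in judgment_matrices:
        for column in zip(*matrix):
            insert_column(roots, list(column))
    return roots


def count_sequences(roots):
    """Return the root start counts and, per root, the end counts found beneath it."""
    def end_counts(node):
        found = [node.end_count] if node.end_count else []
        for digit in sorted(node.children):
            found.extend(end_counts(node.children[digit]))
        return found

    ordered = sorted(roots)
    return [roots[d].start_count for d in ordered], [end_counts(roots[d]) for d in ordered]
```

Under these assumptions, the first list returned by `count_sequences` plays the role of the root-node statistics and the second holds one sequence of leaf statistics per root.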
Further, for the disturbance sensitive management quadtree of each disturbance sensitive management sample set corresponding to the disturbance sensitive weighted undirected graph of the power transaction big data set, the sequence formed by the statistical quantities of all root nodes in the quadtree is taken as the disturbance sensitive management basic quantity sequence. The larger the distribution difference of this sequence, the larger the disturbance produced in the data of the sample set. Further, the sequence formed by the numbers of all leaf nodes corresponding to each root node is taken as a disturbance sensitive management quantity sequence. The more elements this sequence has, the larger the sensitivity distribution difference of the disturbance sensitive management sample set. Based on the above analysis, each element of the disturbance sensitive management basic quantity sequence corresponds to one disturbance sensitive management quantity sequence.
Further, according to the correspondence between the disturbance sensitive management basic quantity sequence and the disturbance sensitive management quantity sequences, the degree of noise sensitivity influence of each disturbance sensitive management sample set is analyzed, and a noise sensitivity management coefficient is calculated from the analysis result. The noise sensitivity management coefficient is calculated as follows:
In the formula, the noise sensitivity influence coefficient is defined for each element of the disturbance sensitive management basic quantity sequence. The formula uses: the disturbance sensitive management quantity sequence corresponding to that element and the maximum value of that quantity sequence; the value of that element in the disturbance sensitive management basic quantity sequence; and the number of data in the sequence;
The noise sensitivity management coefficient is defined for each disturbance sensitive management sample set corresponding to the disturbance sensitive weighted undirected graph of the power transaction big data set. The formula uses: the disturbance sensitive management basic quantity sequence of that sample set; the maximum value of that sequence; and the number of elements in that sequence.
If the disturbance sensitive management quadtree of a disturbance sensitive management sample set corresponding to the disturbance sensitive weighted undirected graph of the power transaction big data set is locally divergent in distribution, the corresponding computed terms are larger, that is, the noise sensitivity influence coefficient of the corresponding element of the disturbance sensitive management basic quantity sequence is larger. At the same time, if the disturbance sensitive management quadtree of the sample set is concentrated in its overall distribution, the corresponding computed term is larger, that is, the computed noise sensitivity influence coefficient is larger. This indicates that the data disturbance sensitivities within the sample set are close to one another while the overall sensitivity distribution differs widely, so the degree of added noise needs to be increased appropriately.
The benefit is that, by considering the degree of disturbance sensitivity of different data in the power transaction big data set, the clustering result of the disturbance sensitive weighted undirected graph accurately reflects how sensitive the data are to added noise.
Thus, the noise sensitivity management coefficient is obtained.
Step S004: obtain the power data noise sensitivity according to the noise sensitivity management coefficient, and obtain the fully homomorphic encryption result of the power data based on the power data noise sensitivity.
All power data samples of each disturbance sensitive management sample set corresponding to the disturbance sensitive weighted undirected graph of the power transaction big data set are taken as one group of dynamic encryption data. The set formed by the noise sensitivity management coefficients of all disturbance sensitive management sample sets of the power transaction big data set is taken as the noise sensitivity management quantization set. With this set as input, a max-min normalization algorithm is applied, and the normalized value of the noise sensitivity management coefficient of each disturbance sensitive management sample set is taken as the power data noise sensitivity of that sample set. Each group of dynamic encryption data of the power transaction big data set therefore has a corresponding power data noise sensitivity. The max-min normalization algorithm is a known technique and is not described in detail here.
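The normalization step can be illustrated with a short sketch. The function name `min_max_normalize` and the choice to return zeros when all coefficients are equal are assumptions; the source only states that max-min normalization is applied.

```python
def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # All coefficients are equal; returning zeros is an arbitrary convention (assumption).
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical noise sensitivity management coefficients of three sample sets.
coefficients = [2.0, 4.0, 6.0]
print(min_max_normalize(coefficients))  # [0.0, 0.5, 1.0]
```

Each normalized value would then serve as the power data noise sensitivity of the corresponding group of dynamic encryption data.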
Further, noise is added to the data in the power transaction big data set according to the power data noise sensitivity of each group of dynamic encryption data. Specifically, all groups of dynamic encryption data of the power transaction big data set and their corresponding power data noise sensitivities are taken as inputs of a fully homomorphic encryption algorithm to obtain the fully homomorphic encryption result of the power transaction big data set. The encryption result is transmitted to a third-party platform, which avoids leaking the sensitive information of users in the power transaction big data set. The fully homomorphic encryption algorithm is a known technique and is not described in detail here.
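To give a feel for what computing on encrypted data means, the sketch below uses a toy Paillier cryptosystem. This is a swapped-in technique for illustration only: Paillier is additively homomorphic rather than fully homomorphic, the source does not name a concrete scheme, the tiny primes are insecure, and the way the power data noise sensitivity enters the encryption is not specified in the source and is not modeled here. All function names are hypothetical.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy key generation from two small primes (insecure, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to n + 1
    return n, lam, mu


def encrypt(n, message, rng):
    """Encrypt an integer 0 <= message < n under the public key n."""
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, lam, mu, ciphertext):
    """Recover the plaintext with the private values lam and mu."""
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


rng = random.Random(0)
n, lam, mu = keygen()
# A third party can sum two encrypted meter readings without seeing them.
total = add_encrypted(n, encrypt(n, 20, rng), encrypt(n, 22, rng))
print(decrypt(n, lam, mu, total))  # 42
```

The point of the example is only that the party holding the ciphertexts can aggregate them without access to the plaintext values; a production system would use a vetted library and realistic key sizes.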
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. The above description is only of the preferred embodiments of the present application and is not intended to limit the application, but any modifications, equivalent substitutions, improvements, etc. within the principles of the present application should be included in the scope of the present application.

Claims (10)

1. The dynamic desensitization method for the electric power big data based on the homomorphic encryption algorithm is characterized by comprising the following steps:
acquiring a large data set of power transaction;
Dividing the data of each power data sample in the power transaction big data set into a data set to be concealed and a privacy protection data set; calculating a hiding protection enhancement coefficient according to the difference between a to-be-hidden data set and a privacy protection data set among different power data samples in the power transaction big data set;
Constructing a disturbance neighbor matrix of each power data sample according to the hidden protection enhancement coefficients among different power data samples in the power transaction big data set; calculating a data disturbance sensitivity coefficient according to a disturbance neighbor matrix of each power data sample in the power transaction big data set;
Constructing a disturbance sensitive management quadtree according to data disturbance sensitivity coefficients among different power data samples in the power transaction big data set, and calculating a noise sensitive management coefficient according to the disturbance sensitive management quadtree;
Acquiring the noise sensitivity of the power data according to the noise sensitivity management coefficient, and completing the desensitization processing of the power data based on the noise sensitivity of the power data.
2. The method for dynamically desensitizing electric power big data based on the homomorphic encryption algorithm according to claim 1, wherein the method for dividing the data of each power data sample in the power transaction big data set into a to-be-concealed data set and a privacy protection data set is as follows:
The name, contact telephone and ammeter user number information of each power data sample in the power transaction big data set are used as to-be-concealed processing data information, and a set formed by all to-be-concealed processing data information corresponding to each power data sample is used as to-be-concealed data set of each power data sample; the method comprises the steps of taking age, electricity consumption and account balance information of each power data sample in a power transaction big data set as data information to be privacy protected, and taking a set formed by all data information to be privacy protected corresponding to each power data sample as a privacy protection data set of each power data sample.
3. The method for dynamically desensitizing electric power big data based on the homomorphic encryption algorithm according to claim 1, wherein the method for calculating the concealment protection enhancement coefficient according to the difference between the to-be-concealed data set and the privacy protection data set among different power data samples in the power transaction big data set is as follows:
For any two power data samples in the power transaction big data set, taking the Jaccard coefficient of the to-be-concealed data sets of the two power data samples as the concealment enhancement coefficient of the two power data samples; taking the Jaccard coefficient of the privacy protection data sets of the two power data samples as the privacy protection coefficient between the two power data samples;
The concealment enhancement coefficient between the two power data samples is used as the numerator, the sum of the privacy protection coefficient between the two power data samples and 0.01 is used as the denominator, and the ratio of the numerator to the denominator is used as the concealment protection enhancement coefficient of the two power data samples.
4. The method for dynamically desensitizing electric power big data based on the homomorphic encryption algorithm according to claim 1, wherein the method for constructing the disturbance neighbor matrix of each power data sample according to the concealment protection enhancement coefficients among different power data samples in the power transaction big data set is as follows:
Taking a power transaction big data set as input, acquiring a preset number of neighbor samples of each power data sample in the power transaction big data set by adopting a neighbor algorithm, and taking a set formed by each power data sample and a corresponding preset number of neighbor samples as a neighbor sample set of each power data sample; and acquiring a disturbance neighbor sequence according to a neighbor sample set of each power data sample in the power transaction big data set, and acquiring a disturbance neighbor matrix of each power data sample based on the disturbance neighbor sequence.
5. The method for dynamically desensitizing electric power big data based on the homomorphic encryption algorithm according to claim 4, wherein the method for obtaining a disturbance neighbor sequence according to the neighbor sample set of each power data sample in the power transaction big data set and obtaining the disturbance neighbor matrix of each power data sample based on the disturbance neighbor sequence is as follows:
For the neighbor sample set corresponding to each power data sample in the power transaction big data set, taking the sequence formed by the concealment protection enhancement coefficients between each element and the other elements of the neighbor sample set, sorted in ascending order, as the disturbance neighbor sequence of that element, taking each disturbance neighbor sequence as one row of a matrix, and taking the matrix formed by all disturbance neighbor sequences of the neighbor sample set as the disturbance neighbor matrix of the power data sample.
6. The method for dynamically desensitizing large electric power data based on the homomorphic encryption algorithm according to claim 1, wherein the method for calculating the data disturbance sensitivity coefficient according to the disturbance neighbor matrix of each electric power data sample in the large electric power transaction data set is as follows:
In the formula, the data disturbance sensitivity coefficient is defined between two power data samples in the power transaction big data set. The formula uses: the vectors formed by the elements of the same row in the disturbance neighbor matrices of the two power data samples; the cosine similarity between these two row vectors; the ranks of the disturbance neighbor matrices of the two power data samples; an exponential function with the natural constant as its base; and the number of rows of the disturbance neighbor matrix of each power data sample.
7. The dynamic desensitization method for electric power big data based on the homomorphic encryption algorithm according to claim 1, wherein the method for constructing the disturbance sensitive management quadtree according to the data disturbance sensitivity coefficients among different power data samples in the power transaction big data set and calculating the noise sensitivity management coefficient according to the disturbance sensitive management quadtree is as follows:
Taking each power data sample in the power transaction big data set as a node, taking the data disturbance sensitivity coefficient between different power data samples in the power transaction big data set as the weight of the edge between the corresponding nodes, taking the weighted undirected graph formed by the nodes corresponding to all power data samples in the power transaction big data set and the weighted edges as the disturbance sensitive weighted undirected graph of the power transaction big data set, taking the disturbance sensitive weighted undirected graph as input, acquiring a clustering result of the disturbance sensitive weighted undirected graph by adopting a clustering algorithm, and constructing a disturbance sensitive management quadtree according to the clustering result of the disturbance sensitive weighted undirected graph of the power transaction big data set;
For the disturbance sensitive management quadtree of each disturbance sensitive management sample set, taking the sequence formed by the statistical quantities corresponding to all root nodes in the disturbance sensitive management quadtree as the disturbance sensitive management basic quantity sequence, and taking the sequence formed by the numbers of all leaf nodes corresponding to each root node as a disturbance sensitive management quantity sequence; and determining the noise sensitivity management coefficient of each disturbance sensitive management sample set according to the result of analyzing the degree of noise sensitivity influence of the disturbance sensitive management basic quantity sequence and the disturbance sensitive management quantity sequence on each disturbance sensitive management sample set.
8. The dynamic desensitization method for electric power big data based on the homomorphic encryption algorithm according to claim 7, wherein the method for constructing the disturbance sensitive management quadtree according to the clustering result of the disturbance sensitive weighted undirected graph of the power transaction big data set is as follows:
For each cluster in the clustering result of the disturbance sensitive weighted undirected graph of the power transaction big data set, taking the set formed by the power data samples corresponding to all nodes in the cluster as a disturbance sensitive management sample set; for any two power data samples in each disturbance sensitive management sample set, performing an exclusive-or operation on the elements at the same position in the disturbance neighbor matrices corresponding to the two power data samples, and taking the resulting matrix of the exclusive-or operation as the disturbance neighbor judgment matrix of the two power data samples in each disturbance sensitive management sample set;
For the disturbance neighbor judgment matrix of any two power data samples in each disturbance sensitive management sample set, uniformly dividing each column of elements in the disturbance neighbor judgment matrix, converting each group of uniformly divided values into decimal numbers, mapping decimal conversion results of each column of elements in the disturbance neighbor judgment matrix into a quadtree, mapping results of all columns of elements in the disturbance neighbor judgment matrix into the quadtree as a group of quadtree mapping results, and mapping results of all disturbance neighbor judgment matrices corresponding to each disturbance sensitive management sample set on the quadtree as disturbance sensitive management quadtree of each disturbance sensitive management sample set.
9. The method for dynamically desensitizing electric power big data based on the homomorphic encryption algorithm according to claim 7, wherein the method for determining the noise sensitivity management coefficient of each disturbance sensitive management sample set according to the result of analyzing the degree of noise sensitivity influence of the disturbance sensitive management basic quantity sequence and the disturbance sensitive management quantity sequence on each disturbance sensitive management sample set is as follows:
In the formula, the noise sensitivity influence coefficient is defined for each element of the disturbance sensitive management basic quantity sequence. The formula uses: the disturbance sensitive management quantity sequence corresponding to that element and the maximum value of that quantity sequence; the value of that element in the disturbance sensitive management basic quantity sequence; and the number of data in the sequence;
The noise sensitivity management coefficient is defined for each disturbance sensitive management sample set corresponding to the disturbance sensitive weighted undirected graph of the power transaction big data set. The formula uses: the disturbance sensitive management basic quantity sequence of that sample set; the maximum value of that sequence; and the number of elements in the disturbance sensitive management basic quantity sequence.
10. The method for dynamically desensitizing big power data based on the homomorphic encryption algorithm according to claim 1, wherein the method for acquiring the noise sensitivity of the power data according to the noise sensitivity management coefficient and completing the desensitizing processing of the power data based on the noise sensitivity of the power data is as follows:
Taking all power data samples of each disturbance sensitive management sample set corresponding to a disturbance sensitive weighted undirected graph of the power transaction big data set as a group of dynamic encryption data, taking a set formed by noise sensitive management coefficients of all disturbance sensitive management sample sets corresponding to the power transaction big data set as a noise sensitive management quantization set, acquiring a normalization result of the noise sensitive management quantization set by adopting a normalization algorithm, and taking a value corresponding to a normalization result of the noise sensitive management coefficient of each disturbance sensitive management sample set in the noise sensitive management quantization set as the power data noise sensitivity of each disturbance sensitive management sample set;
And taking all groups of dynamic encryption data of the power transaction big data set and the corresponding power data noise sensitivities as inputs of a homomorphic encryption algorithm to obtain the homomorphic encryption result of the power transaction big data set.
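Claims 3 and 5 state their computations explicitly, so they can be sketched directly. The following is a minimal illustration, assuming each power data sample is represented as a dictionary with hypothetical keys `conceal` and `privacy` holding its two data sets, and assuming the neighbor sample set is already available as a list of sample indices (the neighbor search of claim 4 is not modeled). The function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def enhancement_coefficient(sample_a, sample_b):
    """Concealment protection enhancement coefficient as worded in claim 3."""
    conceal = jaccard(sample_a["conceal"], sample_b["conceal"])
    privacy = jaccard(sample_a["privacy"], sample_b["privacy"])
    return conceal / (privacy + 0.01)


def disturbance_neighbor_matrix(samples, neighbor_set):
    """One ascending row of coefficients per element of the neighbor sample set (claim 5)."""
    matrix = []
    for i in neighbor_set:
        row = sorted(
            enhancement_coefficient(samples[i], samples[j])
            for j in neighbor_set
            if j != i
        )
        matrix.append(row)
    return matrix


# Hypothetical records: fields to be concealed and fields to be privacy protected.
samples = [
    {"conceal": {"name:A", "phone:1"}, "privacy": {"age:30", "kwh:120"}},
    {"conceal": {"name:B", "phone:1"}, "privacy": {"age:30", "kwh:120"}},
    {"conceal": {"name:C", "phone:2"}, "privacy": {"age:41", "kwh:95"}},
]
matrix = disturbance_neighbor_matrix(samples, [0, 1, 2])
```

With three elements in the neighbor sample set, the resulting matrix has three rows of two coefficients each; the constant 0.01 in the denominator keeps the ratio defined when the privacy protection coefficient is zero.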
CN202410194568.5A 2024-02-22 2024-02-22 Electric power big data dynamic desensitization method based on isomorphic encryption algorithm Pending CN117763620A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410194568.5A CN117763620A (en) 2024-02-22 2024-02-22 Electric power big data dynamic desensitization method based on isomorphic encryption algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410194568.5A CN117763620A (en) 2024-02-22 2024-02-22 Electric power big data dynamic desensitization method based on isomorphic encryption algorithm

Publications (1)

Publication Number Publication Date
CN117763620A true CN117763620A (en) 2024-03-26

Family

ID=90310754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410194568.5A Pending CN117763620A (en) 2024-02-22 2024-02-22 Electric power big data dynamic desensitization method based on isomorphic encryption algorithm

Country Status (1)

Country Link
CN (1) CN117763620A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110087237A (en) * 2019-04-30 2019-08-02 苏州大学 Method for secret protection, device and associated component based on disturbance of data
CN113254988A (en) * 2021-04-25 2021-08-13 西安电子科技大学 High-dimensional sensitive data privacy classified protection publishing method, system, medium and equipment
CN114547670A (en) * 2022-01-14 2022-05-27 北京理工大学 Sensitive text desensitization method using differential privacy word embedding disturbance
CN116861697A (en) * 2023-07-28 2023-10-10 国网江苏省电力有限公司扬州供电分公司 Big data-based power data processing system and processing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
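王林信;杨鹏;江元;侯应龙;廖晓群: "Research and Implementation of Big Data Privacy Protection Technology for Smart Grid", Electric Power Information and Communication Technology, no. 12, 25 December 2019 (2019-12-25), pages 29 - 30 *
邹云峰;徐超;倪巍伟;唐刘远;沈涛: "Research on Privacy Protection Mechanisms for Power Customer Data", Power Demand Side Management, no. 02, 31 March 2020 (2020-03-31), pages 86 - 88 *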

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240514

Address after: 1-10 / F, C building, No.87 courtyard, Songshan South Road, Erqi District, Zhengzhou City, Henan Province

Applicant after: ECONOMIC TECHNOLOGY RESEARCH INSTITUTE OF STATE GRID HENAN ELECTRIC POWER Co.

Country or region after: China

Applicant after: Zhengzhou University

Applicant after: State Grid Smart Grid Research Institute Co.,Ltd.

Address before: 518101, Building 2, 302, Songbai Industrial Park, No. 4 Yangyong Industrial Road, Tangxiachong Community, Yanluo Street, Bao'an District, Shenzhen City, Guangdong Province

Applicant before: Shenzhen Fushan Automation Technology Co.,Ltd.

Country or region before: China

Applicant before: ECONOMIC TECHNOLOGY RESEARCH INSTITUTE OF STATE GRID HENAN ELECTRIC POWER Co.

Applicant before: Zhengzhou University

Applicant before: State Grid Smart Grid Research Institute Co.,Ltd.