CN112995076A - Discrete data frequency estimation method, user side, data center and system - Google Patents
Discrete data frequency estimation method, user side, data center and system
- Publication number
- CN112995076A (application CN201911298496.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L25/00—Baseband systems
- H04L25/02—Details ; arrangements for supplying electrical power along data transmission lines
- H04L25/08—Modifications for reducing interference; Modifications for reducing effects due to line faults ; Receiver end arrangements for detecting or overcoming line faults
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Power Engineering (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to a discrete data frequency estimation method, a user side, a data center and a system. The method comprises the following steps: the user side generates a discrete data code according to the type of the discrete data it sends to the data center; the user side obtains the scrambling code corresponding to the discrete data code and sends it to the data center; the data center receives the scrambling codes corresponding to the discrete data codes of the user sides; and the data center determines the occurrence frequency of each type of discrete data according to those scrambling codes. Under this scheme, the user side follows the definition of loose local differential privacy to inject less noise into the original data, keeps data distortion as low as possible while still satisfying local differential privacy, improves the usability of the perturbed data, and thereby improves the accuracy of the statistical result.
Description
Technical Field
The invention relates to the field of power grid information control, in particular to a discrete data frequency estimation method, a user side, a data center and a system.
Background
In the field of production control, including but not limited to power grid information control, it is often necessary to collect service data from different areas and different departments at a data center, to obtain the occurrence frequency of a given service event through joint analysis, and to perform service analysis on that basis. This involves a separation of data ownership from the right to use the data: the data is owned by the individual areas and departments, while the analysis results can be shared. The joint data analysis therefore has to be carried out while keeping each party's data confidential.
At present, business data of different departments in the same region is collected directly at the data center, which creates a risk of leaking sensitive data. As the key node for the joint work of all parties, the data center bears a heavy responsibility for data security protection. In addition, in order to keep their data secure and avoid liability for data security, the parties become far less willing to share data, which hinders the development of data services. A technique is therefore urgently needed in which each independent party applies local differential privacy processing to its own data, so that joint analysis can be performed while each party's data privacy is protected.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to reduce the injection of noise on the original data by the user terminal according to the definition of loose local differential privacy, reduce the distortion degree of the data as much as possible on the basis of meeting the local differential privacy, improve the usability of the disturbed data and further improve the accuracy of the statistical result.
The purpose of the invention is realized by adopting the following technical scheme:
the invention provides a discrete data frequency estimation method, which is applied to a user terminal, and the improvement is that the method comprises the following steps:
generating discrete data codes according to the types of the discrete data sent to the data center;
and acquiring a scrambling code corresponding to the discrete data code, and sending the scrambling code corresponding to the discrete data code to a data center.
Preferably, the length of the discrete data codes is equal to the total number of discrete data types.
Further, the discrete data code is (v_1, ..., v_i, ..., v_n), where n is the total number of discrete data types and v_i is the coded value corresponding to the i-th type of discrete data: if the type of the discrete data sent to the data center by the user side is the i-th type, then v_i = 1; otherwise, v_i = 0.
Preferably, the obtaining of the scrambling code corresponding to the discrete data code includes:
acquiring the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes;
and determining the scrambling codes corresponding to the discrete data codes based on the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes.
Further, the obtaining of the conversion probability of the code value corresponding to each type of discrete data in the discrete data coding includes:
determining the probability of converting the coded value corresponding to the ith type of discrete data in the discrete data coding into 0 according to the following formula:
determining the probability of converting the coded value corresponding to the ith type of discrete data into 1 in the discrete data coding according to the following formula:
In the above formulas, ε is the privacy protection budget and δ is the parameter of loose local differential privacy, with a value between 0 and 1; the remaining symbols denote, in turn, the scrambling code value corresponding to the i-th type of discrete data in the scrambling code corresponding to the discrete data code, the probability that the coded value corresponding to the i-th type of discrete data in the discrete data code is converted to 0, and the probability that it is converted to 1.
Further, the determining, based on the transition probabilities of the code values corresponding to various types of discrete data in the discrete data codes, a scrambling code corresponding to the discrete data codes includes:
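drawing a value from the set {0,1}, where 0 is drawn with the probability of conversion to 0 and 1 is drawn with the probability of conversion to 1; if 0 is drawn, the scrambling code value corresponding to the i-th type of discrete data is 0, and if 1 is drawn, it is 1.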
The invention provides a user terminal applied to discrete data frequency estimation, and the improvement is that the user terminal comprises:
the generating module is used for generating discrete data codes according to the types of the discrete data sent to the data center;
the acquisition module is used for acquiring a scrambling code corresponding to the discrete data code;
and the sending module is used for sending the scrambling codes corresponding to the discrete data codes to the data center.
Preferably, the length of the discrete data codes is equal to the total number of discrete data types.
Further, the discrete data code is (v_1, ..., v_i, ..., v_n), where n is the total number of discrete data types and v_i is the coded value corresponding to the i-th type of discrete data: if the type of the discrete data sent to the data center by the user side is the i-th type, then v_i = 1; otherwise, v_i = 0.
Preferably, the obtaining module includes:
the acquisition unit is used for acquiring the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes;
and the determining unit is used for determining the scrambling codes corresponding to the discrete data codes based on the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes.
Further, the obtaining unit is specifically configured to:
determining the probability of converting the coded value corresponding to the ith type of discrete data in the discrete data coding into 0 according to the following formula:
determining the probability of converting the coded value corresponding to the ith type of discrete data into 1 in the discrete data coding according to the following formula:
In the above formulas, ε is the privacy protection budget and δ is the parameter of loose local differential privacy, with a value between 0 and 1; the remaining symbols denote, in turn, the scrambling code value corresponding to the i-th type of discrete data in the scrambling code corresponding to the discrete data code, the probability that the coded value corresponding to the i-th type of discrete data in the discrete data code is converted to 0, and the probability that it is converted to 1.
Further, the determining unit is specifically configured to:
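drawing a value from the set {0,1}, where 0 is drawn with the probability of conversion to 0 and 1 is drawn with the probability of conversion to 1; if 0 is drawn, the scrambling code value corresponding to the i-th type of discrete data is 0, and if 1 is drawn, it is 1.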
The invention provides a discrete data frequency estimation method, which is applied to a data center, and the improvement is that the method comprises the following steps:
receiving a scrambling code corresponding to the discrete data code of each user side;
and determining the occurrence frequency of various discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
Preferably, the determining the occurrence frequency of each type of discrete data according to the scrambling code corresponding to the discrete data code of each user side includes:
counting, among the scrambling codes corresponding to the discrete data codes of the user sides, the frequency with which the scrambling code value corresponding to the i-th type of discrete data is 0 and the frequency with which it is 1;
establishing the occurrence frequency equation set of the i-th type of discrete data based on these two frequencies;
and solving the occurrence frequency equation set of the i-th type of discrete data to obtain the occurrence frequency of the i-th type of discrete data.
Further, the occurrence frequency equation set of the i-th type of discrete data is as follows:
In the above formula, f_0(i) is the non-occurrence frequency of the i-th type of discrete data, f_1(i) is the occurrence frequency of the i-th type of discrete data, ε is the privacy protection budget, and δ is the parameter of loose local differential privacy, with a value between 0 and 1.
The present invention provides a data center for use in discrete data frequency estimation, the improvement wherein the data center comprises:
the receiving module is used for receiving the scrambling codes corresponding to the discrete data codes of the user sides;
and the determining module is used for determining the occurrence frequency of various discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
Preferably, the determining module includes:
a statistic unit for counting, among the scrambling codes corresponding to the discrete data codes of the user sides, the frequency with which the scrambling code value corresponding to the i-th type of discrete data is 0 and the frequency with which it is 1;
a building unit for establishing the occurrence frequency equation set of the i-th type of discrete data based on these two frequencies;
and a solving unit for solving the occurrence frequency equation set of the i-th type of discrete data to obtain the occurrence frequency of the i-th type of discrete data.
Further, the occurrence frequency equation set of the i-th type of discrete data is as follows:
In the above formula, f_0(i) is the non-occurrence frequency of the i-th type of discrete data, f_1(i) is the occurrence frequency of the i-th type of discrete data, ε is the privacy protection budget, and δ is the parameter of loose local differential privacy, with a value between 0 and 1.
The invention provides a method for estimating discrete data frequency, the improvement is that the method comprises the following steps:
the user side generates discrete data codes according to the types of the discrete data sent to the data center;
the user side obtains the scrambling code corresponding to the discrete data code and sends it to the data center;
the data center receives the scrambling codes corresponding to the discrete data codes of the user sides;
and the data center determines the occurrence frequency of each type of discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
The present invention provides a discrete data frequency estimation system, the improvement wherein said system comprises: the user side and the data center.
Compared with the closest prior art, the invention has the following beneficial effects:
In the technical scheme provided by the invention, the user side generates a discrete data code according to the type of the discrete data it sends to the data center, randomly scrambles the coded values corresponding to the various types of discrete data in the discrete data code, and sends the resulting scrambling code to the data center. Data processed in this way meets the privacy requirement, and the risk of privacy disclosure is avoided.
After the data center receives the scrambling codes corresponding to the discrete data codes of the user sides, it determines the occurrence frequency of each type of discrete data according to those scrambling codes.
Drawings
FIG. 1 is a flow chart of the discrete data frequency estimation method provided by the present invention;
FIG. 2 is a schematic structural diagram of the user side applied to discrete data frequency estimation provided by the present invention;
FIG. 3 is a schematic structural diagram of the data center applied to discrete data frequency estimation provided by the present invention;
FIG. 4 is a schematic structural diagram of the discrete data frequency estimation system provided by the present invention.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To carry out joint data analysis while keeping each party's data confidential, the discrete data frequency estimation method provided by the invention introduces the definition of loose local differential privacy into the existing scheme and provides a discrete data frequency estimation scheme that satisfies loose local differential privacy. The main idea is that the user side, following the definition of loose local differential privacy, injects less noise into the original data, keeps data distortion as low as possible while still satisfying local differential privacy, improves the usability of the perturbed data, and thereby improves the accuracy of the statistical result. As shown in FIG. 1, the method includes:
Step 101: the user side generates a discrete data code according to the type of the discrete data it sends to the data center;
Step 102: the user side obtains the scrambling code corresponding to the discrete data code and sends it to the data center;
Step 103: the data center receives the scrambling codes corresponding to the discrete data codes of the user sides;
Step 104: the data center determines the occurrence frequency of each type of discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
Wherein the length of the discrete data code is equal to the total number of the discrete data types.
The discrete data code is (v_1, ..., v_i, ..., v_n), where n is the total number of discrete data types and v_i is the coded value corresponding to the i-th type of discrete data: if the type of the discrete data sent to the data center by the user side is the i-th type, then v_i = 1; otherwise, v_i = 0.
For example, each user side holds one item of discrete data from the discrete data set S. Each user side first applies one-hot encoding to its own data d_i, obtaining a unit vector v_i of length m in which only the position corresponding to d_i is 1 and all other positions are 0. Specifically, if d_i is the j-th item (j ≤ m) in the discrete data set, the j-th bit of the unit vector v_i is 1 and the remaining bits are 0.
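The one-hot encoding step can be sketched as below. This is a minimal illustration only: the function name `one_hot_encode` and the example event types are hypothetical and do not come from the patent, and the category list is assumed to contain no duplicates.

```python
from typing import List, Sequence


def one_hot_encode(value: str, categories: Sequence[str]) -> List[int]:
    """Return a 0/1 vector of length len(categories) containing a single 1.

    `categories` plays the role of the discrete data set S; the position of
    `value` in it corresponds to the index j in the text.
    """
    if value not in categories:
        raise ValueError(f"{value!r} is not one of the known discrete data types")
    return [1 if category == value else 0 for category in categories]


# Hypothetical event types, used only for illustration.
EVENT_TYPES = ["outage", "overload", "maintenance", "normal"]
print(one_hot_encode("overload", EVENT_TYPES))  # [0, 1, 0, 0]
```

The length of the returned vector equals the number of discrete data types, matching the statement that the code length equals the total number of types.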
Specifically, in the embodiment provided by the present invention, step 101 and step 102 may be applied to the user side, where in step 102, acquiring the scrambling code corresponding to the discrete data code includes:
acquiring the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes;
and determining the scrambling codes corresponding to the discrete data codes based on the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes.
Further, the obtaining of the conversion probability of the code value corresponding to each type of discrete data in the discrete data coding includes:
determining the probability of converting the coded value corresponding to the ith type of discrete data in the discrete data coding into 0 according to the following formula:
determining the probability of converting the coded value corresponding to the ith type of discrete data into 1 in the discrete data coding according to the following formula:
In the above formulas, ε is the privacy protection budget and δ is the parameter of loose local differential privacy, with a value between 0 and 1; the remaining symbols denote, in turn, the scrambling code value corresponding to the i-th type of discrete data in the scrambling code corresponding to the discrete data code, the probability that the coded value corresponding to the i-th type of discrete data in the discrete data code is converted to 0, and the probability that it is converted to 1.
Wherein, δ is generally a value greater than 0 and much smaller than 1, and when δ is 0, the privacy protection mechanism satisfies the local differential privacy under strict definition. In this application, a discrete data frequency estimation method satisfying loose local differential privacy is mainly discussed.
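For reference, and assuming that "loose local differential privacy" here refers to the standard (ε, δ) relaxation of local differential privacy (the text does not state the definition explicitly), a randomized mechanism M satisfies it if, for any two inputs v and v' and any set S of possible outputs:

```latex
\Pr[M(v) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(v') \in S] + \delta
```

Setting δ = 0 in this inequality gives the strict ε-local differential privacy condition, which is consistent with the remark about δ = 0 above.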
Further, the determining, based on the transition probabilities of the code values corresponding to various types of discrete data in the discrete data codes, a scrambling code corresponding to the discrete data codes includes:
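drawing a value from the set {0,1}, where 0 is drawn with the probability of conversion to 0 and 1 is drawn with the probability of conversion to 1; if 0 is drawn, the scrambling code value corresponding to the i-th type of discrete data is 0, and if 1 is drawn, it is 1.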
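The sampling step can be sketched as follows. The conversion probabilities are passed in as plain parameters (`p_one_stays_one` and `p_zero_becomes_one`, both hypothetical names) because the formulas that derive them from ε and δ are not available in this text. The sketch also assumes that each coded value is perturbed independently of the others.

```python
import random
from typing import List, Optional


def perturb_code(
    code: List[int],
    p_one_stays_one: float,
    p_zero_becomes_one: float,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Perturb each coded value of a discrete data code independently.

    p_one_stays_one:    probability that a coded value of 1 is reported as 1.
    p_zero_becomes_one: probability that a coded value of 0 is reported as 1.
    Both are stand-ins (assumptions) for the conversion probabilities that
    the text derives from epsilon and delta.
    """
    rng = rng or random.Random()
    scrambled = []
    for bit in code:
        p_one = p_one_stays_one if bit == 1 else p_zero_becomes_one
        # Draw from {0, 1}: 1 with probability p_one, 0 with probability 1 - p_one.
        scrambled.append(1 if rng.random() < p_one else 0)
    return scrambled


# Example call with arbitrary illustrative probabilities.
print(perturb_code([0, 1, 0, 0], 0.75, 0.25, random.Random(0)))
```

Passing a seeded `random.Random` makes a run reproducible for testing; a real deployment would presumably use an unseeded or cryptographically stronger source.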
Based on the technical solutions of step 101 and step 102, the present invention provides a user side for discrete data frequency estimation. As shown in FIG. 2, the user side includes:
the generating module is used for generating discrete data codes according to the types of the discrete data sent to the data center;
the acquisition module is used for acquiring a scrambling code corresponding to the discrete data code;
and the sending module is used for sending the scrambling codes corresponding to the discrete data codes to the data center.
Preferably, the length of the discrete data codes is equal to the total number of discrete data types.
Further, the discrete data code is (v_1, ..., v_i, ..., v_n), where n is the total number of discrete data types and v_i is the coded value corresponding to the i-th type of discrete data: if the type of the discrete data sent to the data center by the user side is the i-th type, then v_i = 1; otherwise, v_i = 0.
Preferably, the obtaining module includes:
the acquisition unit is used for acquiring the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes;
and the determining unit is used for determining the scrambling codes corresponding to the discrete data codes based on the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes.
Further, the obtaining unit is specifically configured to:
determining the probability of converting the coded value corresponding to the ith type of discrete data in the discrete data coding into 0 according to the following formula:
determining the probability of converting the coded value corresponding to the ith type of discrete data into 1 in the discrete data coding according to the following formula:
In the above formulas, ε is the privacy protection budget and δ is the parameter of loose local differential privacy, with a value between 0 and 1; the remaining symbols denote, in turn, the scrambling code value corresponding to the i-th type of discrete data in the scrambling code corresponding to the discrete data code, the probability that the coded value corresponding to the i-th type of discrete data in the discrete data code is converted to 0, and the probability that it is converted to 1.
Further, the determining unit is specifically configured to:
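drawing a value from the set {0,1}, where 0 is drawn with the probability of conversion to 0 and 1 is drawn with the probability of conversion to 1; if 0 is drawn, the scrambling code value corresponding to the i-th type of discrete data is 0, and if 1 is drawn, it is 1.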
In the embodiment provided by the present invention, step 103 and step 104 may be applied to a data center, where step 104 includes:
counting, among the scrambling codes corresponding to the discrete data codes of the user sides, the frequency with which the scrambling code value corresponding to the i-th type of discrete data is 0 and the frequency with which it is 1;
establishing the occurrence frequency equation set of the i-th type of discrete data based on these two frequencies;
and solving the occurrence frequency equation set of the i-th type of discrete data to obtain the occurrence frequency of the i-th type of discrete data.
Further, the occurrence frequency equation set of the i-th type of discrete data is as follows:
In the above formula, f_0(i) is the non-occurrence frequency of the i-th type of discrete data, f_1(i) is the occurrence frequency of the i-th type of discrete data, ε is the privacy protection budget, and δ is the parameter of loose local differential privacy, with a value between 0 and 1.
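The equation set itself is not available in this text. As a hedged illustration of the kind of system being solved, the sketch below uses a generic inversion for per-bit randomized response, which may differ from the patent's exact equations. It assumes that a coded 1 is reported as 1 with probability p and a coded 0 is reported as 1 with probability q, so that the observed number of 1s for type i equals f_1(i)·p + f_0(i)·q in expectation, with f_0(i) + f_1(i) equal to the number of reports. The names `estimate_counts`, `p_one_stays_one` and `p_zero_becomes_one` are hypothetical.

```python
from typing import List, Sequence


def estimate_counts(
    reports: Sequence[Sequence[int]],
    p_one_stays_one: float,
    p_zero_becomes_one: float,
) -> List[float]:
    """Estimate how many user sides hold each type of discrete data.

    For each type i, solve the assumed two-equation system
        ones  = f1 * p + f0 * q
        total = f0 + f1
    for f1, where `ones` is the observed count of scrambling code value 1.
    """
    if not reports:
        raise ValueError("at least one report is required")
    if p_one_stays_one == p_zero_becomes_one:
        raise ValueError("the two probabilities must differ; the system is singular")
    total = len(reports)
    n_types = len(reports[0])
    estimates = []
    for i in range(n_types):
        ones = sum(report[i] for report in reports)
        f1 = (ones - total * p_zero_becomes_one) / (p_one_stays_one - p_zero_becomes_one)
        estimates.append(f1)
    return estimates


# With no perturbation (p = 1, q = 0) the estimate equals the true counts.
print(estimate_counts([[1, 0], [1, 0], [0, 1]], 1.0, 0.0))  # [2.0, 1.0]
```

With real perturbation the estimates are noisy and can fall slightly below 0 or above the number of reports; whether and how to clip them is a design choice the text does not address.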
Based on the technical solutions of step 103 and step 104, the present invention provides a data center for discrete data frequency estimation. As shown in FIG. 3, the data center includes:
the receiving module is used for receiving the scrambling codes corresponding to the discrete data codes of the user sides;
and the determining module is used for determining the occurrence frequency of various discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
Preferably, the determining module includes:
a statistic unit for counting, among the scrambling codes corresponding to the discrete data codes of the user sides, the frequency with which the scrambling code value corresponding to the i-th type of discrete data is 0 and the frequency with which it is 1;
a building unit for establishing the occurrence frequency equation set of the i-th type of discrete data based on these two frequencies;
and a solving unit for solving the occurrence frequency equation set of the i-th type of discrete data to obtain the occurrence frequency of the i-th type of discrete data.
Further, the occurrence frequency equation set of the i-th type of discrete data is as follows:
In the above formula, f_0(i) is the non-occurrence frequency of the i-th type of discrete data, f_1(i) is the occurrence frequency of the i-th type of discrete data, ε is the privacy protection budget, and δ is the parameter of loose local differential privacy, with a value between 0 and 1.
Meanwhile, the present invention also provides a discrete data frequency estimation system, as shown in fig. 4, the system includes: the user side and the data center.
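To show how the user side and the data center might fit together, the following sketch wires steps 101 to 104 into two small classes. It is an assumption-laden illustration, not the patented implementation: the class and method names are hypothetical, the probabilities `p` (a coded 1 reported as 1) and `q` (a coded 0 reported as 1) are supplied directly rather than derived from ε and δ, and the estimator is a generic randomized-response inversion.

```python
import random
from typing import Dict, List, Sequence


class UserSide:
    """Generating, acquisition and sending steps of a user side (steps 101-102)."""

    def __init__(self, categories: Sequence[str], p: float, q: float, rng: random.Random):
        self.categories = list(categories)
        self.p = p  # assumed probability that a coded 1 is reported as 1
        self.q = q  # assumed probability that a coded 0 is reported as 1
        self.rng = rng

    def report(self, value: str) -> List[int]:
        code = [1 if c == value else 0 for c in self.categories]  # step 101
        return [  # step 102: perturb each coded value independently
            1 if self.rng.random() < (self.p if bit else self.q) else 0
            for bit in code
        ]


class DataCenter:
    """Receiving and determining steps of the data center (steps 103-104)."""

    def __init__(self, categories: Sequence[str], p: float, q: float):
        if p == q:
            raise ValueError("p and q must differ")
        self.categories = list(categories)
        self.p, self.q = p, q
        self.ones = [0] * len(self.categories)
        self.total = 0

    def receive(self, scrambled: Sequence[int]) -> None:  # step 103
        for i, bit in enumerate(scrambled):
            self.ones[i] += bit
        self.total += 1

    def estimate(self) -> Dict[str, float]:  # step 104
        return {
            name: (count - self.total * self.q) / (self.p - self.q)
            for name, count in zip(self.categories, self.ones)
        }


def simulate(values: Sequence[str], categories: Sequence[str],
             p: float, q: float, seed: int = 0) -> Dict[str, float]:
    """Run all user sides through one shared UserSide object for brevity."""
    user = UserSide(categories, p, q, random.Random(seed))
    center = DataCenter(categories, p, q)
    for value in values:
        center.receive(user.report(value))
    return center.estimate()


if __name__ == "__main__":
    kinds = ["A", "B", "C"]
    data = ["A"] * 600 + ["B"] * 300 + ["C"] * 100
    print(simulate(data, kinds, p=0.75, q=0.25))
```

The data center only ever sees scrambled vectors, never the original one-hot codes, which is the separation between user side and data center that the system describes.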
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.
Claims (20)
1. A method for estimating discrete data frequency, the method being applied to a user side, the method comprising:
generating discrete data codes according to the types of the discrete data sent to the data center;
and acquiring a scrambling code corresponding to the discrete data code, and sending the scrambling code corresponding to the discrete data code to a data center.
2. The method of claim 1, wherein the length of the discrete data encoding is equal to a total number of discrete data types.
3. The method of claim 2, wherein the discrete data code is (v_1, ..., v_i, ..., v_n), where n is the total number of discrete data types and v_i is the coded value corresponding to the i-th type of discrete data: if the type of the discrete data sent to the data center by the user side is the i-th type, then v_i = 1; otherwise, v_i = 0.
4. The method of claim 1, wherein obtaining the scrambling code corresponding to the discrete data code comprises:
acquiring the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes;
and determining the scrambling codes corresponding to the discrete data codes based on the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes.
5. The method of claim 4, wherein obtaining transition probabilities for code values corresponding to various types of discrete data in the encoding of discrete data comprises:
determining the probability of converting the coded value corresponding to the ith type of discrete data in the discrete data coding into 0 according to the following formula:
determining the probability of converting the coded value corresponding to the ith type of discrete data into 1 in the discrete data coding according to the following formula:
in the above formulas, ε is the privacy protection budget and δ is the parameter of loose local differential privacy, with a value between 0 and 1; the remaining symbols denote, in turn, the scrambling code value corresponding to the i-th type of discrete data in the scrambling code corresponding to the discrete data code, the probability that the coded value corresponding to the i-th type of discrete data in the discrete data code is converted to 0, and the probability that it is converted to 1.
6. The method of claim 5, wherein determining the scrambling code corresponding to the discrete data encoding based on transition probabilities of encoding values corresponding to various types of discrete data in the discrete data encoding comprises:
7. A user terminal for discrete data frequency estimation, the user terminal comprising:
the generating module is used for generating discrete data codes according to the types of the discrete data sent to the data center;
the acquisition module is used for acquiring a scrambling code corresponding to the discrete data code;
and the sending module is used for sending the scrambling codes corresponding to the discrete data codes to the data center.
8. The user terminal of claim 7, wherein the length of the discrete data codes is equal to the total number of discrete data types.
9. The user terminal of claim 8, wherein the discrete data code is $(v_1, \ldots, v_i, \ldots, v_n)$, where $n$ is the total number of discrete data types and $v_i$ is the code value corresponding to the i-th type of discrete data; if the discrete data sent to the data center by the user terminal is of the i-th type, $v_i = 1$; otherwise, $v_i = 0$.
10. The user terminal of claim 7, wherein the acquisition module comprises:
the acquisition unit is used for acquiring the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes;
and the determining unit is used for determining the scrambling codes corresponding to the discrete data codes based on the conversion probability of the code values corresponding to various types of discrete data in the discrete data codes.
11. The user terminal of claim 10, wherein the acquisition unit is specifically configured to:
determining the probability of converting the coded value corresponding to the ith type of discrete data in the discrete data coding into 0 according to the following formula:
determining the probability of converting the coded value corresponding to the ith type of discrete data into 1 in the discrete data coding according to the following formula:
in the above formulas, ε is the privacy protection budget; δ is a parameter under loose local differential privacy, with a value between 0 and 1; and the remaining quantities are the scrambled code value corresponding to the i-th type of discrete data in the scrambling code corresponding to the discrete data code, the probability of converting the code value corresponding to the i-th type of discrete data in the discrete data code into 0, and the probability of converting that code value into 1.
13. A discrete data frequency estimation method applied to a data center is characterized by comprising the following steps:
receiving a scrambling code corresponding to the discrete data code of each user side;
and determining the occurrence frequency of various discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
14. The method of claim 13, wherein the determining the occurrence frequency of each type of discrete data according to the scrambling code corresponding to the discrete data code of each user terminal comprises:
counting, across the scrambling codes corresponding to the discrete data codes of the user terminals, the frequency with which the scrambled code value corresponding to the i-th type of discrete data is 0 and the frequency with which it is 1;
and solving the occurrence frequency equation system of the i-th type of discrete data to obtain the occurrence frequency of the i-th type of discrete data.
15. The method of claim 14, wherein the occurrence frequency equation system of the i-th type of discrete data is:
in the above formula, $f_0(i)$ is the frequency with which the i-th type of discrete data does not occur, $f_1(i)$ is the frequency with which the i-th type of discrete data occurs, ε is the privacy protection budget, and δ is a parameter under loose local differential privacy, with a value between 0 and 1.
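The counting and solving steps of claims 14 and 15 can be sketched in Python as follows. The patent's equation system is not reproduced in this text, so the sketch assumes the common two-equation randomized-response form: ones = p·f1 + q·f0 and zeros = (1 − p)·f1 + (1 − q)·f0. The function `estimate_frequency` and the parameters `p` (probability that a 1 stays 1) and `q` (probability that a 0 becomes 1) are hypothetical.

```python
from typing import List, Tuple


def estimate_frequency(scrambled_codes: List[List[int]],
                       i: int, p: float, q: float) -> Tuple[float, float]:
    """Estimate (f0, f1) for the i-th type (1-based) from scrambled codes.

    Assumed model (not taken from the patent text):
        ones  = p * f1 + q * f0
        zeros = (1 - p) * f1 + (1 - q) * f0
    """
    if p == q:
        raise ValueError("p and q must differ, or the system has no unique solution")
    total = len(scrambled_codes)
    # Counting step: how often position i holds 1 across all received codes.
    ones = sum(code[i - 1] for code in scrambled_codes)
    # Solving step: adding the two equations gives f0 + f1 = total, hence:
    f1 = (ones - total * q) / (p - q)
    f0 = total - f1
    return f0, f1


if __name__ == "__main__":
    codes = [[1, 0], [0, 1], [1, 0], [0, 0]]
    print(estimate_frequency(codes, 1, 0.75, 0.25))
```

Because the estimate is unbiased rather than exact, it can fall outside the range 0 to the number of users for small samples; whether and how the patent handles that case is not stated in this section.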
16. A data center for use in discrete data frequency estimation, the data center comprising:
the receiving module is used for receiving the scrambling codes corresponding to the discrete data codes of the user sides;
and the determining module is used for determining the occurrence frequency of various discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
17. The data center of claim 16, wherein the determination module comprises:
a statistics unit for counting, across the scrambling codes corresponding to the discrete data codes of the user terminals, the frequency with which the scrambled code value corresponding to the i-th type of discrete data is 0 and the frequency with which it is 1;
a building unit for establishing the occurrence frequency equation system of the i-th type of discrete data based on the two counted frequencies;
and a solving unit for solving the occurrence frequency equation system of the i-th type of discrete data to obtain the occurrence frequency of the i-th type of discrete data.
18. The data center of claim 17, wherein the occurrence frequency equation system of the i-th type of discrete data is:
in the above formula, $f_0(i)$ is the frequency with which the i-th type of discrete data does not occur, $f_1(i)$ is the frequency with which the i-th type of discrete data occurs, ε is the privacy protection budget, and δ is a parameter under loose local differential privacy, with a value between 0 and 1.
19. A method of discrete data frequency estimation, the method comprising:
the user side generates discrete data codes according to the types of the discrete data sent to the data center;
the user side acquires a scrambling code corresponding to the discrete data code and sends the scrambling code corresponding to the discrete data code to the data center;
the data center receives the scrambling codes corresponding to the discrete data codes of the user sides;
and the data center determines the occurrence frequency of each type of discrete data according to the scrambling codes corresponding to the discrete data codes of the user sides.
20. A discrete data frequency estimation system, the system comprising: the user terminal according to any of claims 7-12 and the data center according to any of claims 16-18.
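The overall flow of claim 19 (encode, scramble, send, estimate) can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not the patented implementation: the conversion probabilities `p` and `q` are supplied by the caller because the patent's formulas are not reproduced in this text, the estimator assumes the standard randomized-response inversion, and the name `simulate` is hypothetical.

```python
import random
from typing import List


def simulate(true_types: List[int], n_types: int,
             p: float, q: float, seed: int = 0) -> List[float]:
    """Return the estimated occurrence count of each type.

    true_types : 1-based type reported by each simulated user.
    p, q       : assumed probabilities that a 1 stays 1 and that a 0 becomes 1.
    """
    if p == q:
        raise ValueError("p and q must differ")
    rng = random.Random(seed)
    ones = [0] * n_types  # data-center counters of scrambled 1s per position
    for t in true_types:
        # User side: one-hot code, then independent perturbation of each value.
        code = [1 if i == t else 0 for i in range(1, n_types + 1)]
        for idx, v in enumerate(code):
            if rng.random() < (p if v == 1 else q):
                ones[idx] += 1  # the data center only ever sees the scrambled value
    total = len(true_types)
    # Data center: invert the assumed perturbation model for each type.
    return [(c - total * q) / (p - q) for c in ones]


if __name__ == "__main__":
    users = [1] * 500 + [2] * 300 + [3] * 200
    print(simulate(users, 3, 0.75, 0.25))
```

With `p = 1.0` and `q = 0.0` no perturbation takes place and the estimates equal the true counts; values of `p` and `q` closer to each other give stronger masking of individual reports at the cost of noisier estimates.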
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911298496.4A CN112995076B (en) | 2019-12-17 | 2019-12-17 | Discrete data frequency estimation method, user side, data center and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911298496.4A CN112995076B (en) | 2019-12-17 | 2019-12-17 | Discrete data frequency estimation method, user side, data center and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112995076A (en) | 2021-06-18 |
CN112995076B (en) | 2022-09-27 |
Family
ID=76341887
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911298496.4A Active CN112995076B (en) | 2019-12-17 | 2019-12-17 | Discrete data frequency estimation method, user side, data center and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112995076B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107302521A (en) * | 2017-05-23 | 2017-10-27 | 全球能源互联网研究院 | The sending method and method of reseptance of a kind of privacy of user data |
CN108509627A (en) * | 2018-04-08 | 2018-09-07 | 腾讯科技(深圳)有限公司 | data discretization model training method and device, data discrete method |
CN109299436A (en) * | 2018-09-17 | 2019-02-01 | 北京邮电大学 | A kind of ordering of optimization preference method of data capture meeting local difference privacy |
CN110022531A (en) * | 2019-03-01 | 2019-07-16 | 华南理工大学 | A kind of localization difference privacy municipal refuse data report and privacy calculation method |
WO2019172837A1 (en) * | 2018-03-05 | 2019-09-12 | Agency For Science, Technology And Research | Method and system for deriving statistical information from encrypted data |
CN110569286A (en) * | 2019-09-11 | 2019-12-13 | 哈尔滨工业大学(威海) | activity time sequence track mining method based on local differential privacy |
Also Published As
Publication number | Publication date |
---|---|
CN112995076B (en) | 2022-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9407435B2 (en) | Cryptographic key generation based on multiple biometrics | |
CN112395643B (en) | Data privacy protection method and system for neural network | |
CN111222158B (en) | Block chain-based two-party security and privacy comparison method | |
Merhav et al. | Optimal watermark embedding and detection strategies under limited detection resources | |
CN108648761B (en) | Method for embedding blockchain account book in audio digital watermark | |
CN110852374A (en) | Data detection method and device, electronic equipment and storage medium | |
CN114491610B (en) | Intelligent shared financial platform and system based on Hash encryption algorithm and quantum key | |
CN111026359B (en) | Method and device for judging numerical range of private data in multi-party combination manner | |
CN112437060B (en) | Data transmission method and device, computer equipment and storage medium | |
CN110598464B (en) | Data and model safety protection method of face recognition system | |
CN115296862A (en) | Network data secure transmission method based on data coding | |
CN113472537B (en) | Data encryption method, system and computer readable storage medium | |
CN117240604B (en) | Cloud computing-based data safe storage and energy saving optimization method | |
CN112995076B (en) | Discrete data frequency estimation method, user side, data center and system | |
CN117195274A (en) | Format file anti-fake method and system | |
CN115292739B (en) | Data management method of metal mold design system | |
CN113537516B (en) | Training method, device, equipment and medium for distributed machine learning model | |
CN112288757B (en) | Encryption domain image segmentation optimization method based on data packing technology | |
CN115292726A (en) | Semantic communication method and device, electronic equipment and storage medium | |
CN114003939A (en) | Multiple collinearity analysis method for longitudinal federal scene | |
CN113766273A (en) | Method and device for processing video data | |
Tverdokhlib et al. | Method of Selective Steganographic Data Hiding Based on Graphic Containers | |
Chandramouli | Watermarking capacity in the presence of multiple watermarks and a partially known channel | |
Moulin | Information-hiding games | |
CN117938355B (en) | Block chain-based joint prediction method, medium and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||