CN114238509A - Data generation and decentralized encryption federation framework based on GAN and block chain - Google Patents

Info

Publication number
CN114238509A
Authority
CN
China
Prior art keywords
gan
data
block chain
decentralized
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111552560.4A
Other languages
Chinese (zh)
Inventor
王玉乾
张卫山
陈雷鸣
董次浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China
Priority to CN202111552560.4A
Publication of CN114238509A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/64Protecting data integrity, e.g. using checksums, certificates or signatures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Storage Device Security (AREA)

Abstract

The invention provides GAN-DGEFL (Generative Adversarial Networks, Decentralized and Gradient Encryption Federated Learning), a data generation and decentralized encryption federated framework based on GAN and blockchain. It addresses both the class imbalance of data in real industrial production and the shortcomings of the traditional federated learning framework. Because equipment failures are rare in real industrial production, little abnormal data is produced, and training directly on the original data can lead to model overfitting and poor generalization. Abnormal data is therefore generated with a generative adversarial network to correct the data imbalance. In addition, the traditional federated learning framework has only a single fusion node, and gradients are vulnerable to attack in transit. A decentralized federated framework with gradient encryption is therefore proposed, combining ideas from blockchain and information encryption to overcome these shortcomings.

Description

Data generation and decentralized encryption federation framework based on GAN and block chain
Technical Field
The invention relates to the fields of industrial internet, data generation, federated learning, blockchain and information encryption, and in particular to a data generation and decentralized encryption federated framework based on GAN and blockchain.
Background
GAN-DGEFL is mainly based on the ideas of generative adversarial networks, federated learning, blockchain and information encryption. The generative adversarial approach is used to generate enough abnormal (fault) data to correct data imbalance and improve model generalization. Meanwhile, drawing on the blockchain's smart contracts, tamper resistance and decentralized self-organization, blockchain is combined with federated learning, and an information encryption algorithm is integrated into gradient parameter transmission. This overcomes two defects of the traditional federated learning framework: system breakdown when the central node goes down, and privacy leakage when transmitted gradient parameters are maliciously attacked.
Disclosure of Invention
To overcome the defects and shortcomings of the prior art, the invention provides a data generation and decentralized encryption federated framework based on GAN and blockchain. The technical scheme of the invention is as follows:
A GAN and blockchain based data generation and decentralized encryption federated framework, comprising: a generative adversarial network, gradient parameter protection, and a decentralized federated learning framework.
(1) The generative adversarial network refers to the dynamic contest between a generator G (a multilayer perceptron) and a discriminator D (a multilayer perceptron). The training process is divided into 2 steps:
Step 1: the generator G generates, from random noise z, data G(z) that is as similar as possible to the real abnormal data.
Step 2: the ability of the discriminator D to distinguish real samples from fake samples is gradually improved.
During model training, G and D are trained alternately; they constrain each other and are continuously optimized until they stabilize. That is, G can eventually generate "fake" samples similar to real samples, and D cannot tell whether its input is a real sample or a generated one. The optimization objective function L of the discriminator and the generator is as follows:
$$L = \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
where x ~ p_data(x) denotes x sampled from the real sample distribution, z ~ p_z(z) denotes the noise z that is fed to the generator to produce G(z) and that follows a standard normal distribution, and E[·] is the expected value.
(2) A smart contract mechanism in the blockchain automatically distributes the updated gradient parameters once all fusion nodes have completed fusion. Each task node receives its parameters from the fusion node closest to it, which reduces the time cost of long-distance transmission. Meanwhile, RSA and AES encryption are integrated into information transmission (both the upload of gradient parameters from task nodes to fusion nodes and the distribution of updated gradient parameters from fusion nodes to task nodes), which guards against malicious attacks on the transmitted gradient parameters.
(3) Blockchain is combined with federated learning: the tamper resistance and decentralized self-organization of the blockchain are applied to the federated learning framework to give a decentralized federated learning framework. The gradient parameter information updated in each round of federated learning is stored on multiple trusted blockchain nodes, which overcomes the defects of the traditional federated learning framework, improves its robustness, and allows the framework to be applied more stably in real production.
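The objective function L in item (1) can be made concrete with a small numerical sketch. The original gives the formula only as an image, so the code below assumes the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]; the function name gan_value and the sample probabilities are hypothetical and chosen purely for illustration.

```python
import math


def gan_value(d_real, d_fake):
    """Estimate V(D, G) from discriminator outputs (assumed standard GAN form).

    d_real: probabilities D(x) assigned to real samples, each in (0, 1).
    d_fake: probabilities D(G(z)) assigned to generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that separates real from generated data well gives a high V.
confident = gan_value([0.9, 0.95], [0.05, 0.1])

# When D can no longer tell the two apart it outputs 0.5 everywhere,
# and V falls to -log(4), the value at which training has stabilized.
fooled = gan_value([0.5, 0.5], [0.5, 0.5])
```

The discriminator is trained to push this value up and the generator to push it down, which is the alternating training described in item (1).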
The beneficial effects of the invention are:
(1) a generative adversarial network is applied to the industrial internet to address the model overfitting and poor generalization caused by the imbalance between normal and abnormal data samples;
(2) the blockchain's smart contract mechanism and an information encryption algorithm are applied to gradient parameter transmission between fusion nodes and task nodes, reducing time loss in transmission and preventing malicious attacks on the transmitted gradient parameters;
(3) the tamper resistance and decentralized self-organization of the blockchain are combined with federated learning to provide a more robust decentralized federated learning framework.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is an architecture diagram of the generative adversarial network portion of the present invention;
FIG. 2 is an overall architecture diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from them without creative effort fall within the protection scope of the present invention.
In the data generation and decentralized encryption federated framework based on GAN and blockchain, the abnormal (fault) data that is scarce in real industrial production is generated with a generative adversarial approach. Meanwhile, the tamper resistance and decentralized self-organization of the blockchain are applied to the federated learning framework and combined with an encryption algorithm, which solves the problems caused by a central-node failure in the traditional federated learning framework and protects parameter information in transit. The framework comprises the following six modules: a data generation module, a data discrimination module, a generative adversarial module, an encrypted transmission module, a blockchain module and a decentralized federated learning module.
The specific procedure of GAN-DGEFL is detailed below:
Step (1): random noise, or a variable z following a given distribution, is fed to the generator G, which produces a fake sample G(z);
Step (2): the discriminator D receives the real data x and the generated data G(z) and estimates the probability that the generated data is real;
Step (3): the discriminator D and the generator G are trained against each other iteratively. D provides feedback from its discrimination results to guide the training of G, so that the generation ability of G and the discrimination ability of D both improve continuously, until G has learned a probability distribution similar to that of the real samples and D can no longer correctly judge whether its input comes from a real sample or from a fake sample produced by G;
Step (4): assume the trained discriminator D and generator G have produced sufficient generated data X_g. The decentralized federated learning framework contains n task nodes {K_1, K_2, K_3, ..., K_n} and m fusion nodes {K_1, K_2, K_3, ..., K_m} (the m fusion nodes are assumed to be trusted). Each task node trains the model on the raw data X and the generated data X_g. After each training round, before a task node transmits the gradient parameters of its trained model to the m fusion nodes, it first checks whether it holds the AES keys of the m fusion nodes:
Case 1: if the task node holds the AES keys of the m fusion nodes, it encrypts its parameter information with the AES key of each fusion node and transmits the result directly to that fusion node.
Case 2: if the task node holds none, or only some, of the AES keys of the m fusion nodes, each fusion node whose key is missing encrypts its AES key with the RSA public key of the task node and transmits the encrypted key to that task node. The task node decrypts the received key with its own RSA private key and then proceeds as in Case 1.
Step (5): when the m fusion nodes receive the encrypted parameter information {P_1, P_2, P_3, ..., P_n} sent by the n task nodes, each decrypts it with its own AES key and fuses the information of the n task nodes with a fusion algorithm such as FedAvg or FedSGD. Let {P'_1, P'_2, P'_3, ..., P'_n} denote the updated gradient parameter information for the n task nodes after fusion. The fusion node that finishes first broadcasts to the other fusion nodes that this round's fusion task is complete; on receiving the message, the other fusion nodes stop their current fusion process and wait for the next round. The fusion node that finished first also obtains the right to record this round's gradient parameter block.
Step (6): the fusion node that obtained the recording right uses the RSA public key of each of the n task nodes (let {sig_1, sig_2, sig_3, ..., sig_n} denote the RSA public keys of the n task nodes) as the signature of the corresponding node, forming a unique key-value pair with that task node's updated parameter information. The signed parameter information is represented as follows:
{<sig_1, P'_1>, <sig_2, P'_2>, <sig_3, P'_3>, ..., <sig_n, P'_n>}
The fusion node constructs a new data block, stores the signed parameter information in the body of the new block, and notifies the other fusion nodes to synchronize their local blockchain state.
Step (7): after all fusion nodes have synchronized their local blockchain state, each fusion node, through the blockchain's smart contract mechanism, encrypts the updated parameter information with the RSA public key of each task node and transmits it to the task nodes closest to that fusion node, which reduces both the time cost of long-distance transmission and the exposure of parameter information to attack in transit. Each task node then decrypts the received gradient parameters with its own RSA private key and updates its local model with them.
Step (8): repeat steps (4) to (7) until the specified accuracy is reached or the specified number of iteration rounds is completed.
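The key check and key exchange of step (4) can be sketched as follows. This is a toy illustration only: it uses textbook RSA with tiny primes and, because the Python standard library has no AES, a SHA-256 keystream XOR as a stand-in for AES. The class names TaskNode and FusionNode, their methods and the sample payload are hypothetical and do not come from the patent; a real deployment would use a vetted cryptographic library with padded RSA and an authenticated AES mode.

```python
import hashlib
import secrets


def toy_rsa_keypair(p=61, q=53, e=17):
    """Textbook RSA with tiny primes. Illustration only, not secure."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse (Python 3.8+)
    return (n, e), (n, d)


def rsa_wrap(key_bytes, public_key):
    """Encrypt a symmetric key byte by byte with the RSA public key (toy)."""
    n, e = public_key
    return [pow(b, e, n) for b in key_bytes]


def rsa_unwrap(wrapped, private_key):
    """Recover the symmetric key with the RSA private key (toy)."""
    n, d = private_key
    return bytes(pow(c, d, n) for c in wrapped)


def stream_xor(key, data):
    """Stand-in for AES (NOT AES): XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[i:i + 32], pad))
    return bytes(out)


class FusionNode:
    def __init__(self, name):
        self.name = name
        self.sym_key = secrets.token_bytes(16)  # plays the role of the AES key

    def send_wrapped_key(self, task_public_key):
        # Case 2: wrap this node's symmetric key under the task node's RSA key.
        return rsa_wrap(self.sym_key, task_public_key)

    def receive(self, ciphertext):
        return stream_xor(self.sym_key, ciphertext)


class TaskNode:
    def __init__(self, name):
        self.name = name
        self.public_key, self._private_key = toy_rsa_keypair()
        self.known_keys = {}  # fusion node name -> symmetric key

    def upload(self, fusion, payload):
        if fusion.name not in self.known_keys:
            # Case 2: key missing, so fetch it wrapped and unwrap it locally.
            wrapped = fusion.send_wrapped_key(self.public_key)
            self.known_keys[fusion.name] = rsa_unwrap(wrapped, self._private_key)
        # Case 1: key available, so encrypt the gradient payload with it.
        return stream_xor(self.known_keys[fusion.name], payload)


task = TaskNode("K1")
fusion = FusionNode("F1")
gradients = b"0.12,-0.40,0.07"
ciphertext = task.upload(fusion, gradients)
recovered = fusion.receive(ciphertext)
```

The RSA operation is needed only the first time a task node talks to a given fusion node; later rounds reuse the cached symmetric key, which is the usual reason for pairing a slow asymmetric cipher with a fast symmetric one.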
The data generation and decentralized encryption federated framework based on GAN and blockchain addresses the model overfitting and poor generalization caused by data imbalance, malicious attacks on parameters in transit, and the system breakdown caused by a central-node failure in the traditional federated learning framework. It is more robust than the traditional federated learning framework and can be applied in today's industrial internet.
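The fusion of step (5) and the block recording of step (6) can be sketched as follows. The sketch assumes a FedAvg-style weighted average and a simple SHA-256 hash chain; the function names, the block fields and the placeholder keys "sig_1" and "sig_2" (standing in for the task nodes' RSA public keys) are hypothetical, and consensus, broadcasting and encryption are left out.

```python
import hashlib
import json


def fed_avg(updates, sample_counts):
    """FedAvg-style fusion: average parameter vectors weighted by sample count."""
    total = sum(sample_counts)
    dim = len(updates[0])
    return [
        sum(u[i] * c for u, c in zip(updates, sample_counts)) / total
        for i in range(dim)
    ]


def block_digest(prev_hash, round_no, params):
    """Hash the block contents so that any later change is detectable."""
    body = {"prev_hash": prev_hash, "round": round_no, "params": params}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def make_block(prev_hash, round_no, signed_params):
    """Build a block whose body holds the <sig_i, P'_i> key-value pairs."""
    return {
        "prev_hash": prev_hash,
        "round": round_no,
        "params": signed_params,
        "hash": block_digest(prev_hash, round_no, signed_params),
    }


def chain_is_valid(chain):
    """Check every block's own hash and its link to the previous block."""
    for i, block in enumerate(chain):
        expected = block_digest(block["prev_hash"], block["round"], block["params"])
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


# Two task nodes report parameters; the second trained on three times the data.
updates = [[1.0, 2.0], [3.0, 4.0]]
fused = fed_avg(updates, sample_counts=[1, 3])

# Key each node's updated parameters by its (placeholder) public key.
signed = {"sig_1": fused, "sig_2": fused}

genesis = make_block("0" * 64, 0, {})
block1 = make_block(genesis["hash"], 1, signed)
chain = [genesis, block1]
```

Because each block stores the hash of its predecessor, altering recorded parameters on one fusion node changes that block's digest and is caught by chain_is_valid, which is the tamper-resistance property the framework relies on.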
The above describes only preferred embodiments of the present invention and is not to be construed as limiting it; any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

Claims (7)

1. A data generation and decentralized encryption federated framework GAN-DGEFL based on GAN and blockchain, characterized in that a generative adversarial network is used to process imbalanced data so that the positive and negative samples of the input data tend toward balance, avoiding overfitting and poor generalization in the trained model; and in that, to address the halting of the whole system's training when the central node goes down and the privacy leakage when transmitted gradients are attacked in the traditional federated learning framework, a decentralized federated framework with gradient encryption is provided based on the decentralization and tamper resistance of the blockchain and an information encryption algorithm. The framework comprises the following six modules: a data generation module, a data discrimination module, a generative adversarial module, an encrypted transmission module, a blockchain module and a decentralized federated learning module.
2. The GAN-DGEFL of claim 1, wherein the data generation module uses a multilayer perceptron, the generator G, to generate fake samples of the original input data such that the fake samples follow the distribution of the real data as closely as possible.
3. The GAN-DGEFL of claim 1, wherein the data discrimination module uses a multilayer perceptron, the discriminator D, to determine whether the input data is a real sample.
4. The GAN-DGEFL of claim 1, wherein the generative adversarial module trains the discriminator D and the generator G against each other iteratively, the discriminator D providing feedback from its discrimination results to guide the training of the generator G, so that the generation ability of the generator and the discrimination ability of the discriminator both improve continuously.
5. The GAN-DGEFL of claim 1, wherein the encrypted transmission module encrypts gradient parameter transmission between task nodes and fusion nodes, using the public and private keys of an asymmetric encryption algorithm (RSA) and the secret key of a symmetric encryption algorithm (AES) in the corresponding steps, to ensure the security of the gradient parameters in transit.
6. The GAN-DGEFL of claim 1, wherein the blockchain module stores, on trusted fusion nodes, the blockchain formed from the gradient parameters of all task nodes in each iteration of model training, relying on the decentralized self-organization of the blockchain; protects the gradient parameters through the tamper resistance of the blockchain; and automatically distributes the updated gradient parameters through a smart contract.
7. The GAN-DGEFL of claim 1, wherein the decentralized federated learning module differs from the traditional federated learning framework, in which local gradient parameters can only be updated with parameters obtained from a specific central node: in decentralized federated learning, the required parameters can be obtained from the blockchain of any one of the m trusted fusion nodes, which prevents whole-system breakdown caused by conditions such as a central-node failure or a malicious attack.
CN202111552560.4A 2021-12-17 2021-12-17 Data generation and decentralized encryption federation framework based on GAN and block chain Pending CN114238509A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111552560.4A CN114238509A (en) 2021-12-17 2021-12-17 Data generation and decentralized encryption federation framework based on GAN and block chain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111552560.4A CN114238509A (en) 2021-12-17 2021-12-17 Data generation and decentralized encryption federation framework based on GAN and block chain

Publications (1)

Publication Number Publication Date
CN114238509A (en) 2022-03-25

Family

ID=80758000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111552560.4A Pending CN114238509A (en) 2021-12-17 2021-12-17 Data generation and decentralized encryption federation framework based on GAN and block chain

Country Status (1)

Country Link
CN (1) CN114238509A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117034328A (en) * 2023-10-09 2023-11-10 国网信息通信产业集团有限公司 Improved abnormal electricity utilization detection system and method based on federal learning
CN117034328B (en) * 2023-10-09 2024-03-19 国网信息通信产业集团有限公司 Improved abnormal electricity utilization detection system and method based on federal learning

Similar Documents

Publication Publication Date Title
Li et al. DeepFed: Federated deep learning for intrusion detection in industrial cyber–physical systems
Da Xu et al. Embedding blockchain technology into IoT for security: A survey
CN106233661B (en) Method for generating secret or key in a network
Zhang et al. Defending against sybil attacks in sensor networks
CN110377002A (en) A kind of adaptive interior CAN bus method of controlling security and system
Mantravadi et al. Securing IT/OT links for low power IIoT devices: design considerations for industry 4.0
CN113259135B (en) Lightweight blockchain communication authentication device and method for detecting data tamper
Lv et al. Digital twins based on quantum networking
Yavuz et al. Distributed cyber-infrastructures and artificial intelligence in hybrid post-quantum era
CN114238509A (en) Data generation and decentralized encryption federation framework based on GAN and block chain
Abdulaal et al. Privacy-preserving detection of power theft in smart grid change and transmit (cat) advanced metering infrastructure
Chen et al. Convoy_DTN: A security interaction engine design for Digital Twin Network
Zhao et al. Privacy-preserving electricity theft detection based on blockchain
CN115865426B (en) Privacy intersection method and device
CN101552778A (en) Construction method of attacker model in automatic detection of safety protocol
CN116094719A (en) Lightweight industrial sensor data stream integrity verification method based on physical unclonable function
Téglásy et al. A Location-Based Global Authorization Method for Underwater Security
AU2022314600A1 (en) System and method for quantum-secure microgrids
Chauhan et al. Improving IoT security using elliptic curve integrated encryption scheme with primary structure-based block chain technology
Huang et al. Covert communication scheme based on Bitcoin transaction mechanism
Yadav et al. Smart communication and security by key distribution in multicast environment
Danilczyk et al. Blockchain checksum for establishing secure communications for digital twin technology
Riyadi et al. Real-time testing on improved data transmission security in the industrial control system
Ren et al. QFDSA: A Quantum-Secured Federated Learning System for Smart Grid Dynamic Security Assessment
Lomte et al. Review of a new distinguishing attack using block cipher with a neural network

Legal Events

Date Code Title Description
PB01 Publication