CN115238288A - Safety processing method for industrial internet data - Google Patents
- Publication number
- CN115238288A (application number CN202210880056.5A)
- Authority
- CN
- China
- Prior art keywords
- model parameters
- factory
- model
- ipfs
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Storage Device Security (AREA)
Abstract
The invention relates to a security processing method for industrial internet data and belongs to the field of industrial internet data processing. The method comprises the following: the factory and the collaborator each generate a key pair using the ElGamal encryption algorithm; the collaborator initializes the model parameters and uploads them, together with its public key, to the blockchain; the factory downloads the initial model and the collaborator's public key from the blockchain, trains the model, and extracts the model parameters after applying a differential privacy algorithm; the factory encrypts the model parameters and stores them in IPFS to obtain a hash value; the factory encrypts the hash value and adds it, together with the factory's public key, to the blockchain; the collaborator uses the hash value to retrieve the model parameters from IPFS and trains the global model, encrypts the resulting model parameters with a symmetric key SK and stores them in IPFS, encrypts the IPFS hash value, encrypts SK with the factory's public key, and adds the results to the blockchain; the factory receives the current global model parameters and updates its model. The invention addresses the privacy and trust problems of machine learning in industrial internet systems.
Description
Technical field
The invention belongs to the technical field of secure processing of industrial internet data and relates to a security processing method for industrial internet data.
Background
In recent years, the large-scale deployment of intelligent sensing devices has generated massive amounts of data. These data carry information from every area of production and daily life and have become an important factor of production. The industrial internet, a product of the deep integration of new-generation information technology with manufacturing, interconnects people, machines, and things to build a new industrial production, manufacturing, and service system that links all factors, the entire industrial chain, and the entire value chain. It is a path to digital transformation and a key force in the shift from old to new growth drivers. The industrial internet is entering a new stage of explosive data growth, sharply increasing computing power, and steadily improving algorithm performance; intelligent technologies such as deep learning, reinforcement learning, and federated learning have become important supports for improving its overall service performance.
Industrial production data is highly confidential. The centralized model training methods in common use today are prone to data privacy leakage and struggle to meet the dual requirements of data sharing and privacy protection. What is needed is not only a new privacy-preserving computing paradigm for multi-party joint modeling but also flexible, intelligent adaptation methods that keep the new paradigm running efficiently. Security underpins the industrial internet. In the future, construction of this emerging infrastructure will continue to advance in scope, depth, and level. Faced with new technologies and a new security situation, the industrial internet security ecosystem will keep meeting new challenges. A security processing method for industrial internet data is therefore urgently needed to improve the ability to respond to security risks and to promote the prosperity and development of the industrial internet.
Summary of the invention
In view of this, the purpose of the invention is to provide a security processing method for industrial internet data. To address the difficulty of meeting the dual requirements of industrial data sharing and privacy protection, it combines federated learning, differential privacy, ElGamal encryption, and related techniques to build a secure industrial internet, better empower it, and promote industrial upgrading.
To achieve the above purpose, the invention provides the following technical solution:
A security processing method for industrial internet data, comprising the following steps:
S1: The factory and the collaborator each generate a key pair; the collaborator then initializes the model parameters $w_0$ and uploads $w_0$ together with its public key to the blockchain;
S2: Each factory downloads the initial model parameters $w_0$ and the collaborator's public key from the blockchain and trains a deep neural network model on a specified number of local data samples; after training, the model is processed with a differential privacy algorithm to produce a locally differentially private machine learning model, and the factory then extracts the model parameters;
S3: The factory encrypts the model parameters with the collaborator's public key, stores the encrypted parameters in IPFS, and obtains a unique IPFS hash value; it then encrypts the IPFS hash value, also with the collaborator's public key, and uses a smart contract to package it with the factory's public key and add it to the blockchain; the factory notifies the collaborator that the current round of local model training is complete;
The encrypted model parameters are stored in IPFS, a peer-to-peer network protocol for file storage that uses content-based rather than location-based addressing. Each file in the IPFS network is assigned a hash value, similar to a fingerprint, that is computed from the file's content.
S4: The collaborator uses its private key to decrypt the IPFS hash value that was encrypted with its public key, uses the IPFS hash value to retrieve the corresponding encrypted model parameters from IPFS, decrypts them, trains the global model, and extracts the model parameters;
S5: The collaborator encrypts the model parameters with SK (a symmetric key generated by the collaborator) and stores them in IPFS, which assigns an IPFS hash value; it encrypts the IPFS hash value with SK, encrypts SK separately with each factory's public key, adds the encrypted results to the blockchain (this multi-layer encryption ensures the security of the data), and notifies the factories that the current round of aggregation is complete;
S6: After the collaborator notifies it that the current federation round has ended, the factory retrieves the encrypted global model parameters: it decrypts with its private key to obtain SK, uses SK to decrypt the encrypted IPFS hash value, and uses the IPFS hash value to retrieve the model parameters; the factory then performs local training with the updated model parameters, starting the next round of joint training, and all steps are repeated for a predefined number of joint rounds.
Further, in step S1, the factory and the collaborator each generate a key pair with the ElGamal encryption algorithm. ElGamal is an asymmetric encryption algorithm based on Diffie-Hellman key exchange. It rests on the principle that computing discrete logarithms is hard, whereas the inverse operation, exponentiation, can be computed efficiently by the square-and-multiply method. In the corresponding group G, exponentiation is a one-way function.
Further, step S2 specifically comprises: the factory trains a deep neural network on a specified number of local data samples; it obtains the latest model parameters from the server and, for each batch index b from 1 to the number of batches, computes the batch gradient $g_k^{(b)}$ and updates the model parameters locally, where $\eta$ denotes the learning rate, $w_t$ the current model parameters, $D_k$ the data set owned by the k-th participant (factory), and $M$ the mini-batch size used for the client update;
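The formulas for the batch count and the local update are not reproduced in this text. A reconstruction consistent with the symbols defined in step S2, assuming the standard federated averaging (FedAvg) client update rather than anything the source confirms, would read as follows; the symbol $B$ for the number of batches and the superscript notation $w_{t+1}^{(k)}$ for the local copy held by participant k are assumptions introduced here.

```latex
B = \left\lceil \frac{|D_k|}{M} \right\rceil, \qquad
w_{t+1}^{(k)} \leftarrow w_{t+1}^{(k)} - \eta \, g_k^{(b)}
\quad \text{for } b = 1, \dots, B,
\quad \text{with } w_{t+1}^{(k)} \text{ initialized to } w_t
```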
During training, differentially private neural network training is implemented within the SGD computation, and the model parameters are trained by minimizing the empirical loss function. At each SGD step, the gradients are computed for a sampled subset of examples, the $\ell_2$ norm of each gradient is clipped, the clipped gradients are averaged, noise is added to protect privacy, and a step is taken in the direction opposite to this noisy average gradient. Gradient descent with backpropagation continues in this way until training completes and the model is output.
Further, step S4 specifically comprises: suppose a federated learning system has K participants (that is, factories); $D_k$ denotes the data set owned by the k-th participant and $P_k$ the index set of the data points located at participant k; let $n_k$ denote the cardinality of $P_k$, so that the k-th participant holds $n_k$ data points. With K participants in total, the collaborator aggregates the received model parameters by computing their weighted average and then updates the model parameters. Here the per-sample loss is the result of predicting on the sample $(x_i, y_i)$ with the given model parameters $w$, and $x_i$ and $y_i$ denote the i-th training data point and its associated label, respectively; $\eta$ denotes the learning rate and $n$ the number of training data points. The collaborator checks whether the loss function has converged or the maximum number of training rounds has been reached; if so, it signals all participants to stop model training.
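The aggregation and loss formulas for step S4 are not reproduced in this text. Under the assumption that the method follows standard federated averaging, a reconstruction consistent with the definitions of $K$, $P_k$, $n_k$, $n$, $\eta$, and $(x_i, y_i)$ would be the following; the symbols $\ell$, $f_i$, $F_k$, $f$, and $\bar{w}_{t+1}$ are notation assumed here, not taken from the source.

```latex
n = \sum_{k=1}^{K} n_k, \qquad
\bar{w}_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} \, w_{t+1}^{(k)}

f_i(w) = \ell(x_i, y_i; w), \qquad
F_k(w) = \frac{1}{n_k} \sum_{i \in P_k} f_i(w), \qquad
f(w) = \sum_{k=1}^{K} \frac{n_k}{n} \, F_k(w)
```

If each participant takes a single full-batch gradient step $w_{t+1}^{(k)} = w_t - \eta \nabla F_k(w_t)$, the weighted average above is equivalent to the gradient form $w_{t+1} = w_t - \eta \sum_{k=1}^{K} \frac{n_k}{n} \nabla F_k(w_t)$, which may be what the learning rate $\eta$ in the surrounding text refers to.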
Further, in step S6, the factory encrypts, stores, and uploads its updated model parameters.
The beneficial effect of the invention is that it strengthens the privacy and trustworthiness of industrial internet data by combining differential privacy, federated learning, blockchain, and smart contracts. This improves the ability to respond to security risks and promotes the prosperity and development of the industrial internet.
Other advantages, objects, and features of the invention will be set forth in part in the description that follows, and in part will become apparent to those skilled in the art upon examination of what follows or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and attained through the following description.
Brief description of the drawings
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in detail below with reference to the accompanying drawings, in which:
Fig. 1 is an architecture diagram of the industrial internet data security processing method of the invention;
Fig. 2 is a flow chart of the factory-side data processing in the method;
Fig. 3 is a flow chart of the collaborator-side data processing in the method.
Detailed description
The embodiments of the invention are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the invention from this specification. The invention can also be implemented or applied through other specific embodiments, and the details in this specification can be modified or changed from different viewpoints and for different applications without departing from the spirit of the invention. The drawings provided with the following embodiments illustrate the basic concept of the invention only schematically, and the following embodiments and their features can be combined with one another where they do not conflict.
Refer to Figs. 1 to 3. Fig. 1 shows the architecture of the industrial internet data security processing method, Fig. 2 shows the factory-side data processing, and Fig. 3 shows the collaborator-side data processing.
The method addresses the privacy and trust problems of machine learning in industrial internet systems. It involves the factories, the collaborator, IPFS, the blockchain, and the data transfers among these entities. It comprises the following steps:
Step 1: The factory and the collaborator first each generate a key pair with the ElGamal encryption algorithm. ElGamal is an asymmetric encryption algorithm based on Diffie-Hellman key exchange. It rests on the principle that computing discrete logarithms is hard, whereas the inverse operation, exponentiation, can be computed efficiently by the square-and-multiply method. In the corresponding group G, exponentiation is a one-way function.
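The key generation in step 1 and the square-and-multiply method it mentions can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the tiny group parameters `P`, `Q`, `G` and the function names are hypothetical, and the source does not state which group or key size the method uses. A group this small offers no security.

```python
import secrets

# Toy group, for illustration only (an assumption, not from the source):
# P is a safe prime (P = 2*Q + 1) and G generates the subgroup of order Q.
P = 467
Q = 233
G = 4


def square_and_multiply(base, exp, mod):
    """Compute base**exp % mod by scanning the bits of exp."""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                      # current bit set: multiply it in
            result = (result * base) % mod
        base = (base * base) % mod       # square for the next bit
        exp >>= 1
    return result


def keygen():
    """Return (private_key, public_key) with public_key = G^private_key mod P."""
    x = secrets.randbelow(Q - 1) + 1     # private key in [1, Q-1]
    return x, square_and_multiply(G, x, P)


def encrypt(public_key, m):
    """Encrypt an integer m in [1, P-1] under public_key."""
    y = secrets.randbelow(Q - 1) + 1     # fresh ephemeral secret per message
    c1 = square_and_multiply(G, y, P)
    c2 = (m * square_and_multiply(public_key, y, P)) % P
    return c1, c2


def decrypt(private_key, c1, c2):
    """Recover m from (c1, c2) using the private key."""
    shared = square_and_multiply(c1, private_key, P)
    shared_inv = square_and_multiply(shared, P - 2, P)   # inverse via Fermat
    return (c2 * shared_inv) % P
```

Recovering the private key from the public key would require solving a discrete logarithm in the group, which is the hard direction the text refers to; a practical deployment would use a vetted cryptographic library and a much larger group.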
Step 2: Each factory downloads the initial model $w_0$ and the collaborator's public key from the blockchain and trains a deep neural network on a specified number of local data samples. After training, the model is processed with a differential privacy algorithm to produce a locally differentially private machine learning model.
The factory trains a deep neural network on a specified number of local data samples. It obtains the latest model parameters from the server and, for each batch index b from 1 to the number of batches, computes the batch gradient $g_k^{(b)}$ and updates the model parameters locally with learning rate $\eta$.
During training, differentially private neural network training is implemented within the SGD computation, and the model parameters are trained by minimizing the empirical loss function. At each SGD step, the gradients are computed for a sampled subset of examples, the $\ell_2$ norm of each gradient is clipped, the clipped gradients are averaged, noise is added to protect privacy, and a step is taken in the direction opposite to this noisy average gradient. Gradient descent with backpropagation continues in this way until training completes and the model is output.
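The clip, average, add-noise, and descend sequence described above can be sketched in plain Python. This is a minimal sketch under stated assumptions: the source does not name the noise distribution or its scale, so Gaussian noise with standard deviation `noise_multiplier * clip_norm` added to the summed gradients (as in the commonly used DP-SGD formulation) is assumed, and all function and parameter names are hypothetical.

```python
import math
import random


def clip_l2(grad, clip_norm):
    """Scale grad down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One noisy SGD step over a sampled batch of per-example gradients."""
    batch = len(per_example_grads)
    clipped = [clip_l2(g, clip_norm) for g in per_example_grads]
    sigma = noise_multiplier * clip_norm          # assumed Gaussian noise scale
    noisy_avg = [
        (sum(g[j] for g in clipped) + rng.gauss(0.0, sigma)) / batch
        for j in range(len(weights))
    ]
    # Step against the noisy average gradient.
    return [w - lr * g for w, g in zip(weights, noisy_avg)]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]              # two per-example gradients
    print(dp_sgd_step([0.0, 0.0], grads, lr=0.1, clip_norm=1.0,
                      noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single example can move the model, which is what lets the added noise translate into a quantifiable privacy guarantee; choosing the noise scale and tracking the privacy budget are outside this sketch.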
Federated learning can train a deep learning model with the help of a central server while the training data stays distributed across the clients. During training, a client only needs to submit its local gradients to the cloud server, which reduces the risk of leaking user data to some extent. Differential privacy is an efficient privacy protection technique that quantifies the degree of privacy protection; with a suitable privacy budget it can strike a good balance between data utility and privacy. It is commonly used to protect the training data of artificial intelligence models and can provide privacy guarantees for federated learning.
With user data never leaving the local site, a shared model is built through parameter exchange and optimization under an encryption or perturbation mechanism. The performance of this shared model approaches that of a model trained on all parties' data pooled together. This joint modeling scheme does not disclose user privacy and conforms to the principles of data security protection.
Step 3: The factory encrypts the model parameters with the collaborator's public key, stores the encrypted parameters in IPFS, and obtains a unique IPFS hash value. It then encrypts the IPFS hash value, also with the collaborator's public key, and uses a smart contract to package it with the factory's public key and add it to the blockchain. The factory notifies the collaborator that the current round of local model training is complete.
The encrypted model parameters are stored in IPFS, a peer-to-peer network protocol for file storage that uses content-based rather than location-based addressing. Each file in the IPFS network is assigned a hash value, similar to a fingerprint, that is computed from the file's content.
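Content-based addressing can be illustrated with a small in-memory store in which the address of a blob is a hash of its bytes. This is a simplified stand-in, not an IPFS client: real IPFS content identifiers are encoded differently and large files are split into linked blocks, so the plain SHA-256 hex digest and the `ContentStore` class below are assumptions made for illustration.

```python
import hashlib


class ContentStore:
    """In-memory stand-in for a content-addressed store (not a real IPFS client)."""

    def __init__(self):
        self._blobs = {}

    def add(self, data: bytes) -> str:
        """Store data and return its address, derived only from the content."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        """Fetch data by address and verify that it still matches the address."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


if __name__ == "__main__":
    store = ContentStore()
    address = store.add(b"encrypted model parameters")
    print(address)
    print(store.get(address))
```

Because the address is computed from the content, identical bytes always map to the same address and any modification yields a different one, which is the fingerprint property the text describes.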
Step 4: The collaborator uses its private key to decrypt the IPFS hash value that was encrypted with its public key, uses the IPFS hash value to retrieve the corresponding encrypted model parameters from IPFS, decrypts them, trains the global model, and extracts the model parameters.
The collaborator first decrypts to obtain the IPFS hash value, uses it to retrieve the corresponding encrypted model parameters, and then decrypts those with its private key to obtain the parameters of the local model update.
Suppose a federated learning system has K participants; $D_k$ denotes the data set owned by the k-th participant and $P_k$ the index set of the data points located at client k. Let $n_k$ denote the cardinality of $P_k$, so that the k-th participant holds $n_k$ data points, with K participants in total.
The collaborator aggregates the received model parameters by computing their weighted average and updates the global model parameters with learning rate $\eta$. It then checks whether the loss function has converged or the maximum number of training rounds has been reached. If so, the collaborator signals all participants to stop model training.
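The weighted average and the stopping check in step 4 can be sketched as follows. The sketch assumes that each participant's weight is its share of the data, $n_k / n$, as in standard federated averaging; the convergence test on consecutive loss values and its tolerance are likewise assumptions, since the source does not define the criterion. Function names are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Average parameter vectors, weighting client k by n_k / n."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    n = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n_k / n * w_k[j] for w_k, n_k in zip(client_params, client_sizes))
        for j in range(dim)
    ]


def should_stop(losses, max_rounds, tol=1e-4):
    """Stop when the round limit is reached or the loss has stopped changing."""
    if len(losses) >= max_rounds:
        return True
    return len(losses) >= 2 and abs(losses[-1] - losses[-2]) < tol


if __name__ == "__main__":
    # Two factories holding 1 and 3 data points respectively.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
    print(should_stop([0.90, 0.50, 0.49995], max_rounds=10))
```

Weighting by data share means a factory with more samples has proportionally more influence on the global model.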
Step 5: The collaborator encrypts the model parameters with SK (a symmetric key generated by the collaborator) and stores them in IPFS, which assigns an IPFS hash value. It encrypts the IPFS hash value with SK, encrypts SK with the factory's public key, adds the encrypted results to the blockchain, and notifies the factory that the current round of aggregation is complete.
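The layered encryption in step 5, together with the matching retrieval a factory performs afterwards, can be sketched end to end. This is a toy under loudly stated assumptions: the source does not name the symmetric cipher, so a SHA-256 keystream XOR stands in for it; the ElGamal group is far too small to be secure; a plain dictionary stands in for IPFS; and the returned record stands in for what would be written to the blockchain. All names are hypothetical.

```python
import hashlib
import secrets

P, Q, G = 467, 233, 4   # toy ElGamal group (assumption), insecure by design


def elgamal_keygen():
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def elgamal_encrypt(public_key, m):
    y = secrets.randbelow(Q - 1) + 1
    return pow(G, y, P), (m * pow(public_key, y, P)) % P


def elgamal_decrypt(private_key, ciphertext):
    c1, c2 = ciphertext
    return (c2 * pow(pow(c1, private_key, P), P - 2, P)) % P


def keystream_xor(sk, label, data):
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(f"{sk}:{label}:{i // 32}".encode()).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


def publish_global_model(params, factory_public_keys, store):
    """Collaborator side: encrypt params with SK, wrap SK for each factory."""
    sk = secrets.randbelow(P - 1) + 1                    # symmetric key SK
    enc_params = keystream_xor(sk, "params", params)
    address = hashlib.sha256(enc_params).hexdigest()     # content address
    store[address] = enc_params
    return {
        "enc_address": keystream_xor(sk, "address", address.encode()),
        "wrapped_sk": {name: elgamal_encrypt(pk, sk)
                       for name, pk in factory_public_keys.items()},
    }


def fetch_global_model(record, name, private_key, store):
    """Factory side: unwrap SK, decrypt the address, fetch and decrypt params."""
    sk = elgamal_decrypt(private_key, record["wrapped_sk"][name])
    address = keystream_xor(sk, "address", record["enc_address"]).decode()
    return keystream_xor(sk, "params", store[address])
```

Encrypting the bulk data once with SK and wrapping only the short key for each recipient keeps the stored parameters a single shared object while still restricting access to holders of a factory private key.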
Step 6: The factory performs local training with the updated model parameters. It first retrieves the encrypted global model parameters: it decrypts with its private key to obtain SK, uses SK to decrypt the encrypted IPFS hash value, and uses the IPFS hash value to retrieve the model parameters. After obtaining the model parameters, it trains the model in the same way as in step 2. It then encrypts, stores, and uploads the resulting model parameters.
Finally, the above embodiments are intended only to illustrate the technical solution of the invention and not to limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution may be modified or replaced with equivalents without departing from its spirit and scope, and all such modifications and replacements fall within the scope of the claims of the invention.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210880056.5A CN115238288A (en) | 2022-07-25 | 2022-07-25 | Safety processing method for industrial internet data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210880056.5A CN115238288A (en) | 2022-07-25 | 2022-07-25 | Safety processing method for industrial internet data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115238288A true CN115238288A (en) | 2022-10-25 |
Family
ID=83674636
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210880056.5A Pending CN115238288A (en) | 2022-07-25 | 2022-07-25 | Safety processing method for industrial internet data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115238288A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115619947A (en) * | 2022-12-19 | 2023-01-17 | 江西农业大学 | A blockchain-based 3D modeling collaboration method and system |
CN115865487A (en) * | 2022-11-30 | 2023-03-28 | 四川启睿克科技有限公司 | Abnormal behavior analysis method and device with privacy protection function |
-
2022
- 2022-07-25 CN CN202210880056.5A patent/CN115238288A/en active Pending
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115865487A (en) * | 2022-11-30 | 2023-03-28 | 四川启睿克科技有限公司 | Abnormal behavior analysis method and device with privacy protection function |
CN115865487B (en) * | 2022-11-30 | 2024-06-04 | 四川启睿克科技有限公司 | Abnormal behavior analysis method and device with privacy protection function |
CN115619947A (en) * | 2022-12-19 | 2023-01-17 | 江西农业大学 | A blockchain-based 3D modeling collaboration method and system |
CN115619947B (en) * | 2022-12-19 | 2023-12-26 | 江西农业大学 | Three-dimensional modeling cooperation method and system based on blockchain |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yang et al. | A quasi-newton method based vertical federated learning framework for logistic regression | |
Aujla et al. | SecSVA: secure storage, verification, and auditing of big data in the cloud environment | |
Xu et al. | Fedv: Privacy-preserving federated learning over vertically partitioned data | |
CN113505882B (en) | Data processing method based on federal neural network model, related equipment and medium | |
Liu et al. | Secure multi-label data classification in cloud by additionally homomorphic encryption | |
CN107196926A (en) | A kind of cloud outsourcing privacy set comparative approach and device | |
CN111291411B (en) | Security video anomaly detection system and method based on convolutional neural network | |
CN111222645A (en) | Management system and method based on Internet of things block chain quantum algorithm artificial intelligence | |
CN112347500A (en) | Machine learning method, device, system, equipment and storage medium of distributed system | |
CN115238288A (en) | Safety processing method for industrial internet data | |
CN116957064A (en) | Federated learning privacy protection model training method and system based on knowledge distillation | |
Baryalai et al. | Towards privacy-preserving classification in neural networks | |
WO2023020216A1 (en) | Extremum determination method and apparatus based on secure multi-party computation, device, and storage medium | |
Kurupathi et al. | Survey on federated learning towards privacy preserving AI | |
Das et al. | A secure softwarized blockchain-based federated health alliance for next generation IoT networks | |
CN115062323A (en) | Multi-center federal learning method for enhancing privacy protection and computer equipment | |
Miao et al. | Robust asynchronous federated learning with time-weighted and stale model aggregation | |
Zhang et al. | A Blockchain‐Based Microgrid Data Disaster Backup Scheme in Edge Computing | |
CN115277175B (en) | A method of industrial Internet data privacy protection | |
Guo et al. | Privacy-preserving multi-label propagation based on federated learning | |
CN117349685A (en) | Clustering method, system, terminal and medium for communication data | |
Bose et al. | A fully decentralized homomorphic federated learning framework | |
CN101141248A (en) | A Lightweight Key Agreement Method Based on Neural Network Weight Synchronization | |
Zhao et al. | ePMLF: Efficient and Privacy‐Preserving Machine Learning Framework Based on Fog Computing | |
Zhou et al. | Toward privacy-aware efficient federated graph attention network in smart cloud |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||