CN115238288A

CN115238288A - Safety processing method for industrial internet data

Info

Publication number: CN115238288A
Application number: CN202210880056.5A
Authority: CN
Inventors: 王汝言; 景忠源; 吴大鹏; 杨志刚; 张普宁
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Chongqing University of Post and Telecommunications
Priority date: 2022-07-25
Filing date: 2022-07-25
Publication date: 2022-10-25

Abstract

The invention relates to a security processing method for industrial Internet data, belonging to the field of industrial Internet data processing. The method includes: the factory and the collaborating party each generate a key pair using the ElGamal encryption algorithm; the collaborating party initializes the model parameters and uploads them, together with its public key, to the blockchain; the factory downloads the initial model and the collaborating party's public key from the blockchain, trains the model, applies a differential privacy algorithm, and extracts the model parameters; the factory encrypts the model parameters and stores them in IPFS to obtain a hash value, encrypts the hash value, and adds it and the factory public key to the blockchain; the collaborating party retrieves the model parameters from IPFS to train the global model, encrypts the resulting model parameters with a symmetric key SK and stores them in IPFS, encrypts the IPFS hash value, encrypts SK with the factory public key, and adds the result to the blockchain; the factory receives the current global model parameters and updates its model. The invention addresses the privacy and trust problems of machine learning in industrial Internet systems.

Description

A security processing method for industrial Internet data

Technical field

The invention belongs to the technical field of secure processing of industrial Internet data, and relates to a security processing method for industrial Internet data.

Background

In recent years, the large-scale deployment of intelligent sensing devices has generated massive amounts of data. These data cover every field of production and daily life and have become an important factor of production. The industrial Internet is the product of the deep integration of new-generation information technology with manufacturing. By comprehensively interconnecting people, machines, and things, it builds a new industrial manufacturing and service system that connects all elements, the entire industrial chain, and the entire value chain. It is a path to digital transformation and a key force in the shift from old to new growth drivers. The industrial Internet is entering a new stage of exploding data volume, sharply increasing computing power, and steadily improving algorithm performance. Intelligent technologies such as deep learning, reinforcement learning, and federated learning have become important supports for improving its overall service performance.

Industrial production data are highly confidential. The centralized model training methods in common use today risk leaking private data and struggle to satisfy the dual requirements of data sharing and privacy protection. What is needed is not only a new privacy-preserving computing paradigm for multi-party joint modeling, but also flexible and intelligent adaptation methods that keep such a paradigm running efficiently. Security is the foundation of the industrial Internet. In the future, the construction of this emerging infrastructure will continue to expand in scope, depth, and level. Faced with new technologies and a new security landscape, the industrial Internet security ecosystem will keep encountering new challenges. A security processing method for industrial Internet data is therefore urgently needed to improve the ability to handle security risks and to promote the prosperity and development of the industrial Internet.

Summary of the invention

In view of this, the purpose of the present invention is to provide a security processing method for industrial Internet data. To address the difficulty of satisfying both industrial data sharing and privacy protection, the method combines federated learning, differential privacy, ElGamal encryption, and related techniques to build a secure industrial Internet, better empower it, and promote industrial upgrading.

To achieve the above purpose, the present invention provides the following technical solution:

A security processing method for industrial Internet data, comprising the following steps:

S1: The factory and the collaborating party each generate a key pair; the collaborating party then initializes the model parameters w₀ and uploads w₀ together with its public key to the blockchain;

S2: Each factory downloads the initial model parameters w₀ and the collaborating party's public key from the blockchain and trains a deep neural network model on a specific number of local data samples; after training, the model is processed with a differential privacy algorithm to produce a locally differentially private machine learning model, from which the factory extracts the model parameters;

S3: The factory encrypts the model parameters with the collaborating party's public key, stores the encrypted model parameters in IPFS, and obtains a unique IPFS hash value; it then encrypts the IPFS hash value with the same public key and, by means of a smart contract, packages it with the factory public key and adds it to the blockchain; the factory notifies the collaborating party that the current round of local model training is complete;

The encrypted model parameters are stored in IPFS, a peer-to-peer network protocol for file storage that uses content-based rather than location-based addressing. Each file in the IPFS network is assigned a hash value computed from the file's content, which acts like a "fingerprint".

S4: The collaborating party uses its private key to decrypt the IPFS hash value that was encrypted with its public key, uses the IPFS hash value to retrieve the corresponding encrypted model parameters from IPFS, decrypts them, trains the global model, and extracts the model parameters;

S5: The collaborating party encrypts the model parameters with SK (a symmetric key generated by the collaborating party) and stores them in IPFS, which assigns an IPFS hash value; it encrypts the IPFS hash value with SK, encrypts SK separately with each factory's public key, adds the encrypted results to the blockchain (this multi-layer encryption protects the data), and notifies the factories that this round of aggregation is complete;

S6: After the collaborating party notifies it that the current federation round has ended, the factory retrieves the encrypted global model parameters: it decrypts SK with its private key, decrypts the encrypted IPFS hash value with SK, and retrieves the model parameters using the IPFS hash value; the factory then trains locally with the updated model parameters, begins the next round of federated training, and repeats all steps for a predefined number of federated rounds.
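The content-based addressing that the steps above rely on for IPFS can be illustrated with a minimal sketch. It is not the IPFS API: `ContentStore` is a hypothetical in-memory stand-in, and a plain SHA-256 hex digest is assumed as the address, whereas real IPFS identifiers are CIDs built on multihash.

```python
import hashlib


class ContentStore:
    """Hypothetical in-memory stand-in for a content-addressed store."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        # The address is derived from the content itself, not from a location.
        digest = hashlib.sha256(data).hexdigest()
        self._blocks[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blocks[digest]
        # A reader can verify that the content matches the address it asked for.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
addr = store.add(b"encrypted model parameters")
same = store.add(b"encrypted model parameters")      # identical content, same address
other = store.add(b"encrypted model parameters v2")  # different content, new address
```

Because the address changes whenever the content changes, recording only the (encrypted) hash on the blockchain is enough to pin down exactly which parameter file a round refers to.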

Further, in step S1, the factory and the collaborating party generate their key pairs with the ElGamal encryption algorithm. ElGamal is an asymmetric encryption algorithm based on Diffie-Hellman key exchange. Its security rests on the fact that computing discrete logarithms is hard, whereas the inverse operation, exponentiation, can be computed efficiently by the square-and-multiply method. In the corresponding group G, exponentiation is a one-way function.
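A toy sketch of ElGamal key generation, encryption, and decryption may make this concrete. The modulus `P` (the Mersenne prime 2^127 - 1) and the base `G = 3` are assumptions chosen only for illustration; they are far too small and unvetted for real use, and the patent does not specify group parameters. Python's three-argument `pow` performs the fast modular exponentiation that the text refers to.

```python
import secrets

# Toy parameters (assumptions for illustration, not from the patent).
P = 2**127 - 1  # a Mersenne prime used as the modulus
G = 3           # arbitrary base; not checked to generate a large subgroup


def keygen():
    """Return (private_key, public_key)."""
    x = secrets.randbelow(P - 2) + 1  # private exponent in [1, P-2]
    return x, pow(G, x, P)            # public key h = G^x mod P


def encrypt(pub, m):
    """Encrypt an integer message 0 < m < P under the public key."""
    if not 0 < m < P:
        raise ValueError("message must be an integer in (0, P)")
    y = secrets.randbelow(P - 2) + 1  # fresh ephemeral exponent
    c1 = pow(G, y, P)
    c2 = (m * pow(pub, y, P)) % P     # mask m with the shared secret h^y
    return c1, c2


def decrypt(priv, cipher):
    c1, c2 = cipher
    shared = pow(c1, priv, P)         # equals h^y, recomputed from c1
    inverse = pow(shared, P - 2, P)   # modular inverse via Fermat's little theorem
    return (c2 * inverse) % P


priv, pub = keygen()
message = 123456789
cipher = encrypt(pub, message)
recovered = decrypt(priv, cipher)
```

Recovering `priv` from `pub` would require solving a discrete logarithm, while every operation the legitimate parties perform is a modular exponentiation.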

Further, step S2 specifically includes: the factory trains a deep neural network on a specific number of local data samples. It obtains the latest model parameters from the server and, for each batch index b from 1 to the number of batches

$$B = \frac{|D_k|}{M},$$

computes the batch gradient $g_k^{(b)}$ and updates the model parameters locally:

$$w_t \leftarrow w_t - \eta\, g_k^{(b)},$$

where η denotes the learning rate, $w_t$ the current model parameters, $D_k$ the data set owned by the k-th participant (factory), and M the mini-batch size used for client updates;
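The local mini-batch update can be sketched as follows. The sketch assumes a one-parameter linear model with squared loss purely to keep it short, whereas the method itself trains a deep neural network; `batch_gradient` and `local_update` are hypothetical helper names.

```python
def batch_gradient(w, batch):
    """Mean gradient of 0.5 * (w*x - y)^2 with respect to w over one mini-batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def local_update(w, data, batch_size, lr):
    """One pass over the local data: one gradient step per mini-batch."""
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        g = batch_gradient(w, batch)  # batch gradient g_k^(b)
        w = w - lr * g                # w <- w - eta * g_k^(b)
    return w


# Local data generated by y = 2x, so the optimum is w = 2.
data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
w = 0.0
for _ in range(200):
    w = local_update(w, data, batch_size=2, lr=0.05)
```

With four samples and `batch_size=2`, each pass performs two updates, matching the batch count B = |D_k| / M.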

During training, differentially private neural network training is implemented within the SGD computation. The model parameters are trained by minimizing the empirical loss function

$$L(w) = \frac{1}{n}\sum_{i=1}^{n} \ell(x_i, y_i; w),$$

where $\ell(x_i, y_i; w)$ is the loss on training sample $(x_i, y_i)$ and n is the number of training samples. At each SGD step, the gradient is computed for a sampled subset of examples, the ℓ₂ norm of each gradient is clipped, the clipped gradients are averaged, noise is added to protect privacy, and a step is taken in the direction opposite to this averaged noisy gradient. Gradient descent with backpropagation continues in this way until training completes, and the model is finally output.
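One such noisy step can be sketched as below. The source does not name the noise distribution, so Gaussian noise with standard deviation `noise_multiplier * clip_norm` is an assumption here; the parameter names and the helper functions are hypothetical as well.

```python
import math
import random


def clip_l2(grad, clip_norm):
    """Scale a gradient down so its l2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(w, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add noise, average, then descend."""
    clipped = [clip_l2(g, clip_norm) for g in per_example_grads]
    n = len(clipped)
    summed = [sum(column) for column in zip(*clipped)]
    sigma = noise_multiplier * clip_norm  # assumed Gaussian noise scale
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / n for s in summed]
    # Step in the direction opposite to the averaged noisy gradient.
    return [wi - lr * gi for wi, gi in zip(w, noisy_mean)]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-1.0, 0.0]]
w_next = dp_sgd_step([0.0, 0.0], grads, lr=0.1, clip_norm=1.0,
                     noise_multiplier=1.0, rng=rng)
```

Clipping bounds how much any single example can move the parameters, which is what lets a fixed noise scale translate into a privacy guarantee.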

Further, step S4 specifically includes: suppose there are K participants (that is, factories) in a federated learning system, where $D_k$ denotes the data set owned by the k-th participant and $P_k$ the index set of the data points located at participant k. Let $n_k = |P_k|$ denote the cardinality of $P_k$, so that the k-th participant holds $n_k$ data points. With K participants in total, the collaborating party aggregates the received model parameters by taking their weighted average:

$$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k},$$

where $w_{t+1}^{k}$ denotes the model parameters received from participant k.

Update model parameters:

$$w_{t+1} \leftarrow w_t - \eta \sum_{k=1}^{K} \frac{n_k}{n}\, \nabla F_k(w_t), \qquad F_k(w) = \frac{1}{n_k}\sum_{i \in P_k} f_i(w),$$

where $f_i(w) = \ell(x_i, y_i; w)$ denotes the loss obtained by predicting on sample $(x_i, y_i)$ with the given model parameters w, and $x_i$ and $y_i$ denote the i-th training data point and its associated label, respectively. η denotes the learning rate and n the number of training data points. The collaborating party checks whether the loss function has converged or the maximum number of training rounds has been reached; if so, it signals all participants to stop model training.
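The weighted average can be sketched in a few lines. This is a minimal illustration with model parameters represented as flat lists of floats; `federated_average` is a hypothetical helper name.

```python
def federated_average(local_params, sample_counts):
    """Average parameter vectors, weighting participant k by n_k / n."""
    n = sum(sample_counts)
    dim = len(local_params[0])
    return [
        sum(n_k / n * w_k[j] for w_k, n_k in zip(local_params, sample_counts))
        for j in range(dim)
    ]


# Two factories: the second holds three times as much data as the first.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by `n_k / n` means a factory with more data points pulls the global parameters further toward its local result.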

Further, in step S6, the factory encrypts, stores, and uploads its updated model parameters.

The beneficial effect of the present invention is that it enhances the privacy and trustworthiness of industrial Internet data by combining differential privacy, federated learning, blockchain, and smart contracts. This improves the ability to handle security risks and promotes the prosperity and development of the industrial Internet.

Other advantages, objectives, and features of the present invention will be set forth in part in the description that follows, will in part become apparent to those skilled in the art upon examination of what follows, or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by means of the following description.

Description of drawings

To make the purposes, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings, in which:

Fig. 1 is an architecture diagram of the industrial Internet data security processing method of the present invention;

Fig. 2 is a flowchart of the factory-side data processing in the industrial Internet data security processing method of the present invention;

Fig. 3 is a flowchart of the collaborating-party data processing in the industrial Internet data security processing method of the present invention.

Detailed description

The embodiments of the present invention are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other specific embodiments, and the details in this specification can be modified or changed from different viewpoints and for different applications without departing from the spirit of the invention. It should be noted that the drawings provided in the following embodiments illustrate the basic concept of the invention only schematically, and that the following embodiments and their features can be combined with one another where they do not conflict.

Referring to Figs. 1 to 3, Fig. 1 shows the architecture of the industrial Internet data security processing method, Fig. 2 shows the factory-side data processing, and Fig. 3 shows the collaborating-party data processing.

The method addresses the privacy and trust problems of machine learning in industrial Internet systems. It involves entities such as factories, the collaborating party, IPFS, and the blockchain, as well as the data transfers among them. It comprises the following steps:

Step 1: The factory and the collaborating party first generate their key pairs with the ElGamal encryption algorithm. ElGamal is an asymmetric encryption algorithm based on Diffie-Hellman key exchange. Its security rests on the fact that computing discrete logarithms is hard, whereas the inverse operation, exponentiation, can be computed efficiently by the square-and-multiply method. In the corresponding group G, exponentiation is a one-way function.

Step 2: Each factory downloads the initial model w₀ and the collaborating party's public key from the blockchain and trains a deep neural network on a specific number of local data samples. After training, the model is processed with a differential privacy algorithm to produce a locally differentially private machine learning model.

The factory trains a deep neural network on a specific number of local data samples. It obtains the latest model parameters from the server and, for each batch index b from 1 to the number of batches

$$B = \frac{|D_k|}{M},$$

computes the batch gradient $g_k^{(b)}$ and updates the model parameters locally:

$$w_t \leftarrow w_t - \eta\, g_k^{(b)},$$

where η denotes the learning rate, $w_t$ the current model parameters, $D_k$ the factory's local data set, and M the mini-batch size.

During training, differentially private neural network training is implemented within the SGD computation. The model parameters are trained by minimizing the empirical loss function

$$L(w) = \frac{1}{n}\sum_{i=1}^{n} \ell(x_i, y_i; w),$$

where $\ell(x_i, y_i; w)$ is the loss on training sample $(x_i, y_i)$ and n is the number of training samples. At each SGD step, the gradient is computed for a sampled subset of examples, the ℓ₂ norm of each gradient is clipped, the clipped gradients are averaged, noise is added to protect privacy, and a step is taken in the direction opposite to this averaged noisy gradient. Gradient descent with backpropagation continues in this way until training completes, and the model is finally output.

Federated learning can train a deep learning model with the help of a central server while the training data stay distributed across the clients. A client only needs to submit its local gradients to the cloud server during training, which reduces the risk of leaking private user data. Differential privacy is an efficient privacy-protection technique that quantifies the degree of privacy protection; with an appropriately chosen privacy budget, it strikes a good balance between data utility and privacy protection. Differential privacy is often used to protect the training data of artificial intelligence models and can provide privacy guarantees for federated learning.
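For reference, the privacy budget mentioned above is usually formalized as follows. This is the standard textbook definition of (ε, δ)-differential privacy, not a formula taken from the patent; the symbols $\mathcal{M}$, $D$, $D'$, $S$, and δ are introduced here only for illustration. A randomized mechanism $\mathcal{M}$ is (ε, δ)-differentially private if

$$\Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta$$

for every pair of data sets $D$ and $D'$ that differ in a single record and every set of outputs $S$. A smaller ε (a tighter privacy budget) gives stronger privacy but generally requires more noise, which is the utility trade-off the paragraph above describes.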

On the premise that user data never leave the local site, a shared model is built through parameter exchange and optimization under an encryption or perturbation mechanism. The performance of this shared model approaches that of a model trained on all parties' data pooled together. This joint modeling scheme does not disclose user privacy and complies with the principles of data security protection.

Step 3: The factory encrypts the model parameters with the collaborating party's public key, stores the encrypted model parameters in IPFS, and obtains a unique IPFS hash value. It then encrypts the IPFS hash value with the same public key and, by means of a smart contract, packages it with the factory public key and adds it to the blockchain. The factory notifies the collaborating party that the current round of local model training is complete.

Step 4: The collaborating party uses its private key to decrypt the IPFS hash value that was encrypted with its public key, uses the IPFS hash value to retrieve the corresponding encrypted model parameters from IPFS, decrypts them, trains the global model, and extracts the model parameters.

The collaborating party first decrypts to obtain the IPFS hash value, uses it to retrieve the corresponding encrypted model parameters, and then decrypts them with its private key to obtain the locally updated model parameters.

Suppose there are K participants in a federated learning system, where $D_k$ denotes the data set owned by the k-th participant and $P_k$ the index set of the data points located at participant k. Let $n_k = |P_k|$ denote the cardinality of $P_k$, so that the k-th participant holds $n_k$ data points, with K participants in total.

The collaborating party aggregates the received model parameters by taking their weighted average:

$$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k},$$

where $w_{t+1}^{k}$ denotes the model parameters that participant k obtained by local training with learning rate η, and $n = \sum_{k=1}^{K} n_k$. The collaborating party checks whether the loss function has converged or the maximum number of training rounds has been reached. If so, it signals all participants to stop model training.

Step 5: The collaborating party encrypts the model parameters with SK (a symmetric key generated by the collaborating party) and stores them in IPFS, which assigns an IPFS hash value. It encrypts the IPFS hash value with SK, encrypts SK with the factory's public key, adds the encrypted results to the blockchain, and notifies the factory that this round of aggregation is complete.

Step 6: The factory trains locally with the updated model parameters. It first retrieves the encrypted global model parameters: it decrypts SK with its private key, decrypts the encrypted IPFS hash value with SK, and retrieves the model parameters using the IPFS hash value. After obtaining the model parameters, it trains the model in the same way as in step 2, then encrypts, stores, and uploads the resulting model parameters.
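The key-wrapping flow of steps 5 and 6 can be sketched end to end. Everything here is a toy stand-in chosen so the example runs with the standard library alone: the dictionary `ipfs` and the list `ledger` replace IPFS and the blockchain, `xor_stream` is a hypothetical SHA-256 keystream cipher used in place of a real symmetric cipher, and the ElGamal parameters `P` and `G` are illustrative assumptions that are not secure.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus (assumption), far too small for real use
G = 3           # arbitrary base (assumption)


def keygen():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)


def eg_encrypt(pub, m):
    y = secrets.randbelow(P - 2) + 1
    return pow(G, y, P), (m * pow(pub, y, P)) % P


def eg_decrypt(priv, cipher):
    c1, c2 = cipher
    shared = pow(c1, priv, P)
    return (c2 * pow(shared, P - 2, P)) % P


def xor_stream(sk, label, data):
    """Toy symmetric cipher: XOR with a SHA-256 keystream (stand-in only)."""
    key = sk.to_bytes(16, "big")
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + label + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


ipfs = {}    # stand-in for IPFS: content hash -> stored bytes
ledger = []  # stand-in for the blockchain
factories = {name: keygen() for name in ("factory_a", "factory_b")}

# Step 5 (collaborating party): encrypt parameters with SK, wrap SK per factory.
global_params = b"0.12,-0.07,1.5"
sk = secrets.randbits(120) + 1  # symmetric key, small enough to fit below P
blob = xor_stream(sk, b"params", global_params)
cid = hashlib.sha256(blob).hexdigest()
ipfs[cid] = blob
ledger.append({
    "enc_cid": xor_stream(sk, b"cid", cid.encode()),
    "wrapped_sk": {name: eg_encrypt(pub, sk)
                   for name, (_, pub) in factories.items()},
})


# Step 6 (each factory): unwrap SK, recover the hash, fetch and decrypt.
def factory_fetch(name):
    priv, _ = factories[name]
    record = ledger[-1]
    sk_f = eg_decrypt(priv, record["wrapped_sk"][name])
    cid_f = xor_stream(sk_f, b"cid", record["enc_cid"]).decode()
    return xor_stream(sk_f, b"params", ipfs[cid_f])


recovered = {name: factory_fetch(name) for name in factories}
```

The design point the sketch highlights is that the large parameter file is encrypted once under SK, and only the short key SK is encrypted separately for each factory.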

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the invention may be modified or replaced by equivalents without departing from their spirit and scope, and all such modifications and replacements fall within the scope of the claims of the present invention.

Claims

1. A security processing method for industrial Internet data, characterized in that the method comprises the following steps:

S1: The factory and the collaborating party each generate a key pair; the collaborating party then initializes the model parameters w₀ and uploads w₀ together with its public key to the blockchain;

S2: Each factory downloads the initial model parameters w₀ and the collaborating party's public key from the blockchain and trains a deep neural network model on a specific number of local data samples; after training, the model is processed with a differential privacy algorithm to produce a locally differentially private machine learning model, from which the factory extracts the model parameters;

S3: The factory encrypts the model parameters with the collaborating party's public key, stores the encrypted model parameters in IPFS, and obtains a unique IPFS hash value; it then encrypts the IPFS hash value with the same public key and, by means of a smart contract, packages it with the factory public key and adds it to the blockchain; the factory notifies the collaborating party that the current round of local model training is complete;

S4: The collaborating party uses its private key to decrypt the IPFS hash value that was encrypted with its public key, uses the IPFS hash value to retrieve the corresponding encrypted model parameters from IPFS, decrypts them, trains the global model, and extracts the model parameters;

S5: The collaborating party encrypts the model parameters with the symmetric key SK that it generated and stores them in IPFS, which assigns an IPFS hash value; it encrypts the IPFS hash value with SK, encrypts SK with each factory's public key, adds the encrypted results to the blockchain, and notifies the factories that this round of aggregation is complete;

S6: After the collaborating party notifies it that the current federation round has ended, the factory retrieves the encrypted global model parameters: it decrypts SK with its private key, decrypts the encrypted IPFS hash value with SK, and retrieves the model parameters using the IPFS hash value; the factory then trains locally with the updated model parameters, begins the next round of federated training, and repeats all steps for a predefined number of federated rounds.

2. The security processing method for industrial Internet data according to claim 1, characterized in that, in step S1, the factory and the collaborating party generate their key pairs with the ElGamal encryption algorithm; the ElGamal encryption algorithm is an asymmetric encryption algorithm based on Diffie-Hellman key exchange.

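3. The security processing method for industrial Internet data according to claim 1, characterized in that step S2 specifically comprises: the factory trains a deep neural network on a specific number of local data samples; it obtains the latest model parameters from the server and, for each batch index b from 1 to the number of batches

$$B = \frac{|D_k|}{M},$$

computes the batch gradient $g_k^{(b)}$ and updates the model parameters locally:

$$w_t \leftarrow w_t - \eta\, g_k^{(b)},$$

where η denotes the learning rate, $w_t$ the current model parameters, $D_k$ the data set owned by the k-th participant (factory), and M the mini-batch size used for client updates;

during training, differentially private neural network training is implemented within the SGD computation by minimizing the empirical loss function

$$L(w) = \frac{1}{n}\sum_{i=1}^{n} \ell(x_i, y_i; w)$$

to train the model parameters, where $\ell(x_i, y_i; w)$ is the loss on training sample $(x_i, y_i)$ and n is the number of training samples; at each SGD step, the gradient is computed for a sampled subset of examples, the ℓ₂ norm of each gradient is clipped, the clipped gradients are averaged, noise is added to protect privacy, and a step is taken in the direction opposite to this averaged noisy gradient; gradient descent with backpropagation completes the training, and the model is finally output.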

4. The security processing method for industrial Internet data according to claim 3, characterized in that step S4 specifically comprises: suppose there are K participants in a federated learning system, where $D_k$ denotes the data set owned by the k-th participant and $P_k$ the index set of the data points located at participant k; let $n_k = |P_k|$ denote the cardinality of $P_k$, so that the k-th participant holds $n_k$ data points; with K participants in total, the collaborating party aggregates the received model parameters by taking their weighted average:

$$w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k},$$

where $w_{t+1}^{k}$ denotes the model parameters received from participant k;

update model parameters:

$$w_{t+1} \leftarrow w_t - \eta \sum_{k=1}^{K} \frac{n_k}{n}\, \nabla F_k(w_t), \qquad F_k(w) = \frac{1}{n_k}\sum_{i \in P_k} f_i(w),$$

where $f_i(w) = \ell(x_i, y_i; w)$ denotes the loss obtained by predicting on sample $(x_i, y_i)$ with the given model parameters w, and $x_i$ and $y_i$ denote the i-th training data point and its associated label, respectively; η denotes the learning rate and n the number of training data points; the collaborating party checks whether the loss function has converged or the maximum number of training rounds has been reached; if so, it signals all participants to stop model training.

5. The security processing method for industrial Internet data according to claim 1, characterized in that, in step S6, the factory encrypts, stores, and uploads its updated model parameters.