CN116488906A

CN116488906A - Safe and efficient model co-building method

Info

Publication number: CN116488906A
Application number: CN202310457821.7A
Authority: CN
Inventors: 王卓彤; 杨志刚; 吴大鹏; 张鸿; 王汝言; 吕翊
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Chongqing University of Post and Telecommunications
Priority date: 2023-04-25
Filing date: 2023-04-25
Publication date: 2023-07-25

Abstract

The invention relates to a safe and efficient model co-building method, and belongs to the field of artificial intelligence. The method comprises the following steps: dividing the area: each edge server divides the area responsible for management according to the coverage capacity of the range; initializing a system: initializing a global model and generating a secret key; local model training: updating the calculation model according to the local data of the equipment, and carrying out disturbance feedback on gradient information; edge security robust aggregation: designing a communication efficient safety enhancement aggregation protocol to support the realization of an asynchronous grouping robust aggregation algorithm based on disturbance gradients; the cloud global model aggregation comprises the following steps: and receiving the local model aggregation results returned by the edge servers, and executing a federal average algorithm to perform global model aggregation. The method and the system can effectively improve the robustness and the safety of the global model under the condition that the equipment isomerism and the resource limitation exist in the client.

Description

Safe and efficient model co-building method

Technical Field

The invention belongs to the field of artificial intelligence, and relates to a safe and efficient model co-building method.

Background

In the Internet of things era of Internet of things, mass equipment access networks generate large-scale interactive data, huge data processing and analysis demands promote the deeper integration of the Internet of things and artificial intelligence, and the analysis, decision making and collaborative processing capabilities of the Internet of things are greatly improved. But this data is unique to the holder and prevents artificial intelligence from assisting in the progress of the internet of things application service. The federal learning is used as a distributed machine learning method for model co-construction, the distributed training characteristic of the federal learning is very compatible with the Internet of things, the update iteration of model parameters is carried out by aggregating multiparty parameters, and the data sharing and the joint modeling are realized on the premise of meeting the privacy protection requirement.

Federal learning is used as an emerging privacy protection machine learning technology, and the internet of things is helped to realize everything interconnection and everything intelligence by utilizing strong data processing and knowledge mining capabilities. However, open model training environments, frequent model parameter sharing, and fragile model aggregation algorithms all expose federal learning to new risks. Firstly, because the internet of things equipment generally has insufficient security measures, the internet of things equipment is often easy to be maliciously controlled and attacked, and the risk of model training failure is increased. A typical attack is a byesting attack, where the byesting attacker can send arbitrary model updates to the server, thereby compromising overall learning performance. Second, federal-learned privacy protection mechanisms are based on a very strong security assumption that intermediate parameters exposed per iteration do not reveal sensitive information. However, many works prove that this security assumption is not true, and that partial leakage of a particular neural network layer or complete leakage of gradients can be achieved by an intermediate parameter-initiated inference attack.

Currently, there are some research efforts on defending against the bayer attack and the privacy reasoning attack. Xu C, jia Y, zhu L distributes weights according to the contribution of clients in TDFL Truth Discovery Based Byzantine Robust Federated Learning [ in IEEE Transactions on Parallel and Distributed Systems, vol.33, no.12, pp.4835-4848,2022 ], designs a robust truth-finding aggregation method to remove malicious updates, and filters the malicious updates by using a maximum-clique algorithm in graph theory to ensure the quality of a global model for most settings of Bayesian attackers. Wu Z, ling Q, chen T in "Federated Variance-reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks" [ in IEEE Transactions on Signal Processing, vol.68, pp.4583-4596,2020 ], combines a geometric median algorithm with a variance reduction algorithm to reduce the effects of random gradient noise and demonstrate that the algorithm can linearly converge to a neighborhood of optimal solutions in the presence of a small number of bayer attacks. Lu Y, huang X, dai Y in "Differentially Private Asynchronous Federated Learning for Mobile Edge Computing in Urban Informatics" [ in IEEE Transactions on Industrial Informatics, vol.16, no.3, pp.2134-2143,2019 ] utilizes a gaussian mechanism of differential privacy, comprehensively considering the iterative optimization effect and privacy preserving costs, adaptively allocating privacy budgets in each iteration. Fu A, zhang X, xionN in "VFL: A Verifiable Federated Learning with Privacy-preserving for Big Data in Industrial IoT" [ in IEEE Transactions on Industrial Informatics, vol.18, no.5, pp.3316-3326,2020 ] realizes secret sharing through two sets of Lagrange interpolation points, ensures the correctness of the aggregation result and performs effective restoration.

The existing research work for defending the Bayesian attack and the privacy reasoning attack still has a plurality of challenges in the scene of the Internet of things although the research work presents excellent effects. First, there is inevitably a difference in resources between the internet of things devices, resulting in that the internet of things devices participating in the federal learning task may send model updates to the server with different response delays, and the asynchronous optimization algorithm allows a single model update to update the global model immediately after it arrives without waiting for model updates of the remaining devices. However, the asynchronous optimization algorithm brings a plurality of challenges to the design of the Bayesian robust optimization algorithm, on one hand, a single gradient lacks a comparison object, and the gradient information is difficult to effectively evaluate. On the other hand, the random gradient is given additional noise, which makes the design of the robust algorithm more difficult. Second, unlike centralized training patterns, federal learning requires constant interaction between the device and the server, which can even reach millions of model parameters for complex tasks, which can be problematic in both ways. On the one hand, encryption for high-dimensional model updating can increase calculation cost and communication cost greatly. On the other hand, for a security aggregation protocol with a specific computing task, information interaction of the security aggregation protocol needs to be completed through a large number of communications. Finally, the existing implementation paths for resisting the two attacks show contradiction, so that a safe and efficient model co-construction method is urgently needed to be researched.

Disclosure of Invention

In view of the above, the invention aims to provide a safe and efficient model co-building method, which effectively improves the robustness and safety of a global model under the conditions that equipment isomerism exists at a client and resources are limited.

In order to achieve the above purpose, the present invention provides the following technical solutions:

a safe and efficient model co-building method firstly provides an asynchronous grouping robust aggregation algorithm, designs cosine anomaly filtering and eliminating anomaly gradients according to delay perceived grouping, synthesizes intra-group true value estimation and inter-group delay aggregation, and effectively relieves the influence brought by a dewing person and a Bayesian attacker. Then, a high-efficiency safety enhancement aggregation protocol for communication is designed, and under the premise that the safety of a communication link is not needed, the homomorphic property and homomorphic multiplication property of Paillier homomorphic encryption are utilized, so that the realization of an asynchronous packet robust aggregation algorithm can be supported while the data privacy is protected, and the communication cost among equipment, a server and a server is greatly reduced. The method can provide bidirectional defense for the semi-honest server and the Bayesian attacker under the heterogeneous environment of the equipment, and realizes good balance among model performance, training efficiency and communication overhead. The method specifically comprises the following steps:

s1: dividing the area: each edge server divides the area responsible for management according to the coverage capacity of the range, and manages users and physical entities in the area;

s2: initializing a system: the method comprises two parts of model initialization and key generation;

the model initialization includes: cloud server initializing global model parameters w ^t And sending the initial global model to an edge server, and then sending the initial global model to the Internet of things equipment in the jurisdiction area by the edge server.

The key generation includes: edge server S _e Auxiliary server S _a Generates own public key and private key respectively, and uses (pk _e ,sk _e )、(pk _a ,sk _a ) To represent S _e And S is _a Is a key pair of the key pair.

S3: local model training: after the Internet of things equipment receives the global model, the local data computing model is utilized to updateAnd generates a random number locally +.>Perturbation of model update to +.>The disturbance data is then encrypted to +.>Returns to the edge server, and then +.>And->Returning to the auxiliary server; wherein->Edge server S for representation _e Public key pk of (a) _e Encryption is performed, N represents the dimension of the gradient, and i represents the index of the device.

S4: edge security robust aggregation: the auxiliary server and the edge server receive the disturbance gradient of M devices returned firstAnd encryption disturbance data->And then, cooperatively executing a communication efficient safety enhancement aggregation protocol to realize an asynchronous grouping robust aggregation algorithm based on disturbance gradients.

S5: cloud global model aggregation: after receiving the local model aggregation results returned by each edge server, the cloud server adopts a classical federal average algorithm to realize the rapid aggregation of the global model; and if the global model index reaches the training task stopping standard, stopping the local model training, otherwise, repeating the steps S3-S5.

Further, the step S4 specifically includes the following steps:

s41: edge server performs delay-aware grouping on all received gradient information, assuming thatFor the s-th round of selection of a set of devices involved in training and returning a gradient at the t-th round +.>Denoted as the set of devices selected in round s, here +.>Means that the device selected in s round but did not successfully return the result to the server, and that the satisfaction between groups is achieved

S42: auxiliary server S _a Gradient for all received disturbancesEncryption is carried out to obtainAnd will->To the edge server S _e The method comprises the steps of carrying out a first treatment on the surface of the Wherein->Auxiliary server S for representation _a Public key pk of (a) _a Encrypting;

s43: edge server S _e Received byAfterwards, by calculating->Obtain->In order to avoid the auxiliary server from directly seeing the gradient information of the equipment, selecting a random number q to blindly process the gradient information to obtain +.>Finally will->To the auxiliary server S _a 。

S44: auxiliary server S _a Upon receipt ofAfter that, the auxiliary server decrypts to get +.>And calculating cosine distances between each gradient and the true values of the gradients in the group, and eliminating abnormal cosine distances through a threshold mechanism to avoid the continuous influence of the Bayesian attacker on the global model. For->Is composed of +.>Intra-group gradient truth value representing current iteration, for +.>The difference between two consecutive model updates is as small as possible given that model training is usually a steady convergence, given that intra-group gradient truth values cannot be predicted in the current iteration. Thus, using global updates in the previous iteration to estimate the intra-set gradient truth value in the current iteration, device i is a cosine distance cs from the intra-set gradient truth value _i Expressed as:

wherein,,representing a set of devices selected to participate in training in round s and returning gradients in round t.

Here, in the case of most honest nodes, the cosine distances are ordered with the median value as the threshold;

s45: selecting a random number r to perform secondary disturbance on the screened gradient according to the cosine distance of the threshold filtering abnormality, and calculating to obtainAnd will->Returning to the edge server; edge server by joiningObtain->And comparing each dimension to obtain a median +.>And will->And returning to the auxiliary server.

S46: the auxiliary server removes the secondary disturbance value r to obtainAs an initialization gradient truth value; within a group of devices of the same latency, the distance between each dimension and the true value of the gradient within the group can be considered to adjust the weight of each device uploading the gradient and return the distance information to the edge server S _e 。

Calculating distance-related information, expressed as:

wherein T is _total Information about the sum of distances representing all gradients and true values of gradients in a group, T _single Distance-related information representing individual gradients within a group from the true value of the gradient.

S47: edge server S _e Received T _total And T _single After that, the private key is used for decryption to obtainAnd->By adding->Andthe sum of the distances between the gradients of all devices and the true value of the gradients can be obtained>And distance of gradient to gradient truth value for each device ∈>The weight update value for each device may then be calculatedExpressed as:

where N represents the dimension of the gradient, M is expressed asThe number of devices involved, i.e. +.>dist (. Cndot.) is a measure->And->The distance between them is calculated as +.>

S48: the edge server calculates according to the weight update valueAnd using the public key pk of the auxiliary server _a Encryption is performed and then the sum of all device weighted gradients is calculated +.>And will->Sum of weight update values with all devices +.>And returning to the auxiliary server.

Computing the sum of all device weighted gradientsExpressed as:

s49: auxiliary server S _a ObtainingAnd->Thereafter, public key pk is used _a Decryption to obtainThen calculate the gradient truth value for each group +.>And sends it to the edge server S _e 。

Computing gradient truth values for each groupExpressed as: />

S410: the edge server performs gradient truth value on groups according to different stalenessDelay polymerization was performed, expressed as:

wherein w is ^t+1 Representing global model parameters of the t+1th round, η representing learning rate, Λ (τ) being an arbitrary decay function, here set as Λ (τ) =1/1+τ, τ=t-s, s representing the round in which the device is selected to participate in training, t representing the round in which the device returns the gradient; finally, the edge server will w ^t+1 And sending the global model aggregation to a cloud server.

Further, the step S5 specifically includes the following steps:

s51: the cloud server receives w returned by all edge servers ^t+1 The aggregation of global models is performed according to the federal averaging algorithm, expressed as:

wherein,,is the global model parameter of the global at the t+1st round, n _e Is the total amount of data contained in the edge area, n is the total amount of data contained in all edge servers +.>E represents the number of edge servers;

s52: then, the cloud server willAnd distributing the training data to all edge servers, and then distributing the training data to all the Internet of things devices in the jurisdiction by the edge servers to perform new training until the predefined precision or model convergence is achieved.

The invention has the beneficial effects that: the invention provides a safe and efficient model co-building method, which firstly provides an asynchronous packet robust aggregation method for the Internet of things, and relieves the influence of a dewing person and a Bayesian attacker by delay perception packet, cosine anomaly filtering, intra-group true value estimation and inter-group delay aggregation. The high-efficiency safety enhancement aggregation protocol is further constructed, and model robustness, training efficiency and communication overhead are considered while data privacy is guaranteed.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.

Drawings

For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in the following preferred detail with reference to the accompanying drawings, in which:

FIG. 1 is a system block diagram of a safe and efficient model co-building method of the present invention;

fig. 2 is a schematic diagram of an asynchronous packet robust aggregation method in the present invention.

Fig. 3 is a flow chart of a security enhanced aggregation protocol with efficient communication according to an embodiment of the present invention.

Detailed Description

Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present invention with reference to specific examples. The invention may be practiced or carried out in other embodiments that depart from the specific details, and the details of the present description may be modified or varied from the spirit and scope of the present invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present invention by way of illustration, and the following embodiments and features in the embodiments may be combined with each other without conflict.

Referring to fig. 1 to 3, an embodiment of the present invention provides a communication system composition module:

the device comprises: the sensing layer positioned on the Internet of things consists of a physical entity and attached sensing equipment. Each device has a fixed geographic location and is assigned to a unique edge server for management. The device is responsible for data acquisition, model training and gradient perturbation.

Edge server: the method is positioned at a base station or an access point and has stronger communication, storage and calculation capabilities. And the method is responsible for the model update uploaded by the receiving equipment and the safe enhancement aggregation protocol realization with high communication efficiency, and supports the asynchronous grouping robust aggregation algorithm for executing the local model in the ciphertext environment.

Cloud server: the cloud server is central equipment with very strong communication, storage and calculation capabilities and is responsible for receiving the local models returned by all edge servers, and then global model aggregation is carried out according to the data volume.

Aiming at the communication system, the invention provides a safe and efficient model co-construction method, which specifically comprises the following steps:

step 1: dividing the area: each edge server divides the area responsible for management according to the coverage capacity of the range, and manages users and physical entities in the area.

Step 2: initializing a system: the method comprises two parts of model initialization and key generation. Cloud server initializing global model parameters w ^t And sending the initial global model to an edge server, and then sending the initial global model to the Internet of things equipment in the jurisdiction area by the edge server. Edge server S _e Auxiliary server S _a Generates own public key and private key respectively, and uses (pk _e ,sk _e )、(pk _a ,sk _a ) To represent S _e And S is _a Is a key pair of the key pair.

Step 3: local model training: after the Internet of things equipment receives the global model, the local data computing model is utilized to updateAnd generates a random number locally +.>Perturbation of model update to +.>The disturbance data is then encrypted to +.>Returns to the edge server, wherein->Edge server S for representation _e Public key pk of (a) _e Encryption is carried out, and ∈ ->And->And returning the data to the auxiliary server.

Step 4: edge security robust aggregation: the auxiliary server and the edge server receive the disturbance gradient of M devices returned firstAnd encryption disturbance data->After that, cooperatively executing the efficient security enhanced aggregation protocol for communication to realize the baseAsynchronous packet robust aggregation algorithm at perturbed gradients.

The step 4 specifically comprises the following steps:

step 4.1: edge server performs delay-aware grouping on all received gradient information, assuming thatFor the s-th round of selection of a set of devices involved in training and returning a gradient at the t-th round +.>Denoted as the set of devices selected in round s, here +.>Means that in round s the device selected but failed to return the result to the server and that between the groups +_ is satisfied>

Step 4.2: auxiliary server S _a Gradient for all received disturbancesEncryption is carried out to obtain Auxiliary server S for representation _a Public key pk of (a) _a Encryption is performed. And will->To the edge server S _e 。

Step 4.3: edge server S _e Received byThereafter, by calculationObtain->In order to avoid the auxiliary server from directly seeing the gradient information of the equipment, selecting a random number q to blindly process the gradient information to obtain +.>Finally will->To the auxiliary server S _a 。

Step 4.4: auxiliary server S _a Upon receipt ofAfter that, the auxiliary server decrypts to obtainAnd calculating cosine distances between each gradient and the true values of the gradients in the group, and eliminating abnormal cosine distances through a threshold mechanism to avoid the continuous influence of the Bayesian attacker on the global model. For->Is composed of +.>Intra-group gradient truth value representing current iteration, for +.>The difference between two consecutive model updates is as small as possible given that model training is usually a steady convergence, given that intra-group gradient truth values cannot be predicted in the current iteration. Thus, the global update in the previous iteration is used to estimate the intra-set gradient truth value in the current iteration, expressed as:

here, in the case of most honest nodes, the computed cosine distances are sorted, with the median value as the threshold value being reliable.

Step 4.5: selecting a random number r to perform secondary disturbance on the screened gradient according to the cosine distance of the threshold filtering abnormality, and calculating to obtainAnd will->And returning to the edge server. Edge server by joiningObtain->And comparing each dimension to obtain a median +.>And will->And returning to the auxiliary server.

Step 4.6: the auxiliary server removes the secondary disturbance value r to obtainAs an initialization gradient truth. Within a group of devices of the same latency, the weight of each device uploading the gradient can be adjusted by taking into account the distance between each dimension and the true value of the gradient within the group, thus calculating distance-related information, expressed as:

the distance information T is processed _total And T _single Returned to the edge server S _e 。

Step 4.7: edge server S _e Received T _total And T _single After that, the private key is used for decryption to obtainAnd->By adding->Andthe sum of the distances between the gradients of all devices and the true value of the gradients can be obtained>And distance of gradient to gradient truth value for each device ∈>The weight update value for each device may then be calculatedExpressed as:

Step 4.8: the edge server calculates according to the weight update valueAnd using the public key pk of the auxiliary server _a Encryption is performed and then the sum of all device weighted gradients is calculated +.>Expressed as:

the weighted gradient sum of all the devicesSum of weight update values of all devicesAnd returning to the auxiliary server.

Step 4.9: auxiliary server S _a ObtainingAnd->Thereafter, public key pk is used _a Decryption to obtainThen calculate the gradient truth value for each group +.>Expressed as:

and sends it to the edge server S _e 。

Step 4.10: the edge server performs gradient truth value on groups according to different stalenessDelay polymerization was performed, expressed as:

where Λ (τ) is an arbitrary decay function, here set as Λ (τ) =1/1+τ, τ=t-s. Finally, the edge server will w ^t+1 And sending the global model aggregation to a cloud server.

Step 5: cloud global model aggregation: and after receiving the local model aggregation results returned by each edge server, the cloud server adopts a classical federal average algorithm to realize the rapid aggregation of the global model. And if the global model index reaches the training task stopping standard, stopping the local model training, otherwise, repeating the steps 3-5.

The step 5 specifically comprises the following steps:

step 5.1: the cloud server receives w returned by all edge servers ^t+1 The aggregation of global models is performed according to the federal averaging algorithm, expressed as:

wherein,,is the global model parameter of the global at the t+1st round, n _e Is the total amount of data contained in the edge area, n is the total amount of data contained in all edge servers +.>

Step 5.2: then, the cloud server willAnd distributing the training data to all edge servers, and then distributing the training data to all the Internet of things devices in the jurisdiction by the edge servers to perform new training until the predefined precision or model convergence is achieved.

Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims

1. A safe and efficient model co-building method is characterized by comprising the following steps:

the model initialization includes: cloud server initializing global model parameters w ^t The initial global model is sent to the edge server, and then the edge server sends the initial global model to the jurisdiction areaThe Internet of things equipment; the key generation includes: edge server S _e Auxiliary server S _a Generates own public key and private key respectively, and uses (pk _e ,sk _e )、(pk _a ,sk _a ) To represent S _e And S is _a Key pairs of (a);

s3: local model training: after the Internet of things equipment receives the global model, the local data computing model is utilized to updateAnd generates a random number locally +.>Perturbation of model update to +.>The disturbance data is then encrypted to +.>Returns to the edge server, and then +.>And->Returning to the auxiliary server; wherein->Edge server S for representation _e Public key pk of (a) _e Encrypting, wherein N represents the dimension of the gradient, and i represents the index of the device;

s4: edge security robust aggregation: the auxiliary server and the edge server receive the disturbance gradient of M devices returned firstAnd encrypting the disturbance data/>Then, cooperatively executing a communication efficient safety enhancement aggregation protocol to realize an asynchronous grouping robust aggregation algorithm based on disturbance gradients;

2. The model co-construction method according to claim 1, wherein the step S4 specifically comprises the steps of:

s41: the edge server carries out delay perception grouping on all the received gradient information;

s42: auxiliary server S _a Gradient for all received disturbancesEncryption is carried out to obtain->And will->To the edge server S _e The method comprises the steps of carrying out a first treatment on the surface of the Wherein->Auxiliary server S for representation _a Public key pk of (a) _a Encrypting;

s43: edge server S _e Received byAfterwards, by calculating->ObtainingIn order to avoid the auxiliary server from directly seeing the gradient information of the equipment, selecting a random number q to blindly process the gradient information to obtain +.>Finally will->To the auxiliary server S _a ；

S44: auxiliary server S _a Upon receipt ofAfter that, the auxiliary server decrypts to get +.>The cosine distance between each gradient and the true value of the gradient in the group is calculated, and the continuous influence of the Bayesian attacker on the global model is avoided by eliminating abnormal cosine distance through a threshold mechanism; in the case of most honest nodes, sorting cosine distances, and taking a median value as a threshold value;

s45: selecting a random number r to perform secondary disturbance on the screened gradient according to the cosine distance of the threshold filtering abnormality, and calculating to obtainAnd will->Returning to the edge server; edge server by joining->Obtaining/>And comparing each dimension to obtain a median +.>And will->Returning to the auxiliary server;

s46: the auxiliary server removes the secondary disturbance value r to obtainAs an initialization gradient truth value; in a group of devices with the same delay, the distance between each dimension and the true value of the gradient in the group is considered to be measured to adjust the weight of the gradient uploaded by each device, and the distance information is returned to the edge server S _e ；

S47: edge server S _e Received T _total And T _single After that, the private key is used for decryption to obtainAnd->By adding->Andthe sum of the distances between the gradients of all devices and the true value of the gradients can be obtained>And distance of gradient to gradient truth value for each device ∈>Then calculate the weight update value of each device

S48: the edge server calculates according to the weight update valueAnd using the public key pk of the auxiliary server _a Encryption is performed and then the sum of all device weighted gradients is calculated +.>And sum it with the weight update value of all devicesReturning to the auxiliary server;

s49: auxiliary server S _a ObtainingAnd->Thereafter, public key pk is used _a Decryption to obtainThen calculate the gradient truth value for each group +.>And sends it to the edge server S _e ；

S410: the edge server performs gradient truth value on groups according to different stalenessPerforming delayed aggregationThe sum is expressed as:

3. The model co-construction method according to claim 2, wherein step S41 specifically comprises: suppose G _t s is the group of equipment selected by the s-th round to participate in training and return gradient at the t-th round, thenDenoted as the set of devices selected in round s, < >>Means that the device selected in s round but did not successfully return the result to the server, and that the satisfaction between groups is achieved

4. A method according to claim 3, characterized in that in step S44, the global update in the previous iteration is used to estimate the intra-group gradient truth value in the current iteration, the cosine distance cs of the intra-group gradient truth value from device i _i Expressed as:

5. The model building method according to claim 4, wherein in step S46, distance-related information is calculated as:

6. The model building method according to claim 5, wherein in step S47, a weight update value of each device is calculatedExpressed as:

where N represents the dimension of the gradient, M is expressed asThe number of devices involved, i.e. +.>dist (. Cndot.) is a measurementAnd->The distance between them is calculated as +.>

7. The model building method according to claim 6, wherein in step S48, the sum of all device weighted gradients is calculatedExpressed as:

8. the method according to claim 7, wherein in step S49, a true value of the gradient of each group is calculatedExpressed as: />

9. The model co-construction method according to claim 8, wherein the step S5 specifically comprises the steps of: