CN116485430A - Federated learning forgetting mechanism and method for data circulation - Google Patents

Federated learning forgetting mechanism and method for data circulation

Info

Publication number
CN116485430A
CN116485430A
Authority
CN
China
Prior art keywords
data
model
market
federal
participants
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310414143.6A
Other languages
Chinese (zh)
Inventor
黄建国
钱敬
唐浩竣
魏宗正
王鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology filed Critical Dalian University of Technology
Priority to CN202310414143.6A priority Critical patent/CN116485430A/en
Publication of CN116485430A publication Critical patent/CN116485430A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0283Price estimation or determination
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a federated learning forgetting mechanism and method for data circulation, belonging to the technical field of deep learning. The method uses federated learning to realize the implicit circulation of data, allocates rewards and penalties to data contributors in the market based on the Shapley value, and uses a federated unlearning algorithm to realize revocation during data transactions. The invention addresses the difficulties of data ownership confirmation, transaction pricing, privacy protection and circulation in the data trading process.

Description

Federated learning forgetting mechanism and method for data circulation
Technical Field
The invention belongs to the technical field of deep learning. It relates to building a data circulation trading market based on federated learning, provides a data trading mechanism based on the quality of the generated model, sets reward and penalty mechanisms in the data trading process to standardize the operation of the market, and uses a federated unlearning algorithm to realize data revocation in the market.
Background
In the era of big data, massive, diverse and complex data are continuously generated, which has become a new trend in social and economic development. However, the vitality of data has not yet been truly released, and data is bound to be the next technical frontier now that deep learning is developing rapidly. The existence of data islands prevents data from being shared among enterprises, so the value of data remains untapped. To solve this problem, the invention constructs a data circulation trading market, starting from data transactions, to realize data circulation and sharing. At the same time, the operating mechanism of the market is standardized, so that it features a sound data trading mode, reasonable allocation and reward/penalty mechanisms, and replaceable and revocable data use.
At present, the circulation and trading of data are not standardized, and data transactions face many problems: data piracy, how to realize an optimal trading strategy, how to protect user privacy, and how to regulate the market so that it can develop and operate sustainably. Meanwhile, in building a safe and stable data trading market that treats data as a commodity, one major difficulty must be solved: unlike ordinary physical commodities, data cannot be easily returned and replaced, yet returns and replacement are an essential part of any market, and without them the standardization of the market is almost impossible.
In fact, some existing studies can solve parts of this problem, for example: federated-learning-based solutions to some data privacy issues; data integration and sharing platforms; various data security technologies such as encryption algorithms, access control, identity verification and auditing to ensure the security of data circulation; technical means such as data cleaning, data matching and data restoration to keep data invisible during circulation; and open data interfaces that provide data services to third-party applications. However, it is almost impossible to construct a data circulation trading market on the basis of this research alone. Existing federated learning only solves the implicit circulation of data, i.e., it guarantees that user privacy is not leaked while data circulates. The prior art does not consider how data is sold as a commodity, nor how to regulate the transaction process, such as allocating rewards to data contributors. Moreover, the existing technology cannot guarantee that data can be withdrawn during circulation.
Consider the following case: a target client wants to exit the federation after the federated learning process ends, and therefore wants its contribution deleted from the global model. For example, suppose a system is jointly operated by multiple e-commerce platforms that co-train a model to provide personalized product recommendations to users. During training, each platform feeds its users' purchase histories and browsing behavior into the model so that the model can learn user interests and preferences. After training is completed, one of the platforms decides to exit the federation and no longer wishes to share its users' data. That platform's data must now be removed from the trained model to preserve its users' privacy. As in centralized machine learning, the most straightforward way to delete a client's contribution is to retrain the federated model from scratch. However, the cost of retraining from scratch is usually prohibitive, and retraining for every single data removal is not a sustainable long-term practice. The present invention therefore designs an unlearning model to solve this problem.
Disclosure of Invention
To solve the problems of the existing data trading mode, such as a crude transaction mechanism, incomplete reward/penalty and allocation mechanisms, and irrevocable data, the invention proposes a Quality-oriented Task Allocation algorithm (QTA), uses federated learning to realize the implicit circulation of data, allocates rewards and penalties to data contributors in the market based on the Shapley value, and uses a federated unlearning algorithm to realize revocation in the data transaction process.
The technical scheme provided by the invention aiming at the technical problems is as follows:
A federated learning forgetting mechanism and method for data circulation comprises the following steps:
Step one: train data through a federated learning framework to generate local models, and let the local models replace the direct circulation of data in the market, realizing the implicit flow of data; the models are then aggregated in the server to generate a global model.
Step two: establish a buyer's market based on the Quality-oriented Task Allocation algorithm (QTA): data sellers send their expected prices to the server, which recalculates the sales prices and returns them to the buyer for selection; once payment is made, the transaction is recorded and the seller is considered to have joined the federated collaboration.
Step three: calculate each participant's contribution to the global model through a Shapley-value-based reward mechanism, and from it the participant's reward allocation.
Step four: when a user requests to exit the market and withdraw its data, the server calculates the impact on the model via a penalty mechanism to determine the amount of compensation.
Step five: for target users who want to exit the market, perform data revocation with a forgetting mechanism that reverses federated learning.
In step three, the specific steps of the reward mechanism are as follows:
3.1: Calculate the contribution of participant combinations. From the perspective of the federated collective, the data value of an individual participant cannot substitute for its contribution within the collaboration. The reward mechanism must evaluate the marginal value gain that introducing a given participant brings to the federation, based on the value of participant combinations, so that different rewards are given for different contributions.
3.2: Use the Shapley value to calculate the marginal value gain of each participant.
3.3: Calculate the total reward from the total contribution. The total reward is the joint property of all participants.
3.4: Calculate the marginal value gain of a new participant after it joins the market, and determine its reward from its share of the total contribution.
In step four, the penalty mechanism is constructed as follows:
4.1: Compute a feature vector for each user. Features are the attributes of the user's data, such as timestamps, geographic location, device information and behavioral characteristics; each user's data can thus be regarded as a feature vector.
4.2: Calculate the effect of each feature on the model's predictions. The contribution of each feature is computed with the Shapley value, yielding an importance ranking of the features.
4.3: Calculate the compensation for each user's data. The compensation is computed from the contribution of the user's data and is proportional to it.
4.4: The withdrawing user pays the compensation, and the total reward pool and contributions are updated.
The invention has the following beneficial effects: it realizes a safe, stable and fully functional data circulation trading system. The system uses a distributed artificial-intelligence architecture (training data to generate models and using the models as the element of circulation), a blockchain (transaction traceability) and a federated unlearning algorithm (data withdrawal). A multidimensional data trading market is constructed, which technically accelerates data trading, stimulates market vitality, and makes data transactions 'usable but invisible', 'traceable', 'meterable' and 'exchangeable and revocable'. The invention addresses the difficulties of data ownership confirmation, transaction pricing, privacy protection and circulation in the data trading process.
Drawings
FIG. 1 is the basic market architecture built on federated learning.
Fig. 2 is the detailed flow of the data transaction process of the present invention.
FIG. 3 is a detailed explanation of the reward and penalty mechanisms of the present invention.
FIG. 4 is a flowchart of the federated unlearning algorithm of the present invention.
Detailed Description
The invention provides a data transaction and circulation mode based on federated learning, which realizes data transactions based on the quality of the generated model, sets reward and penalty mechanisms for the transaction process, and uses a federated unlearning algorithm to realize data revocation in the market. The specific steps are as follows:
Step one: train data through a federated learning framework to generate local models, let the local models replace the direct circulation of data in the market to realize implicit data flow, and then aggregate the models in the server to generate a global model.
Step two: based on the Quality-oriented Task Allocation algorithm (QTA), concretize the data transaction process so that the requester obtains the largest amount of data within a given budget and gets the most accurate model.
Step three: calculate each participant's contribution to the global model through the Shapley-value-based reward mechanism, and from it the participant's reward allocation.
Step four: when a user asks to exit the market and withdraw its data, the server calculates the impact on the model through the penalty mechanism to determine the amount of compensation.
Step five: carry out data revocation with the federated unlearning algorithm.
Fig. 1 shows the basic market architecture built on federated learning; the specific implementation steps are as follows:
1.1 The requester sends a transaction request to the data trading market, which accepts the requester's purchase demand.
1.2 Data holders in the trading market train local models in a distributed manner on their different devices.
1.3 The generated local models are aggregated into a global model by the server of the data trading market.
1.4 The global model is sold to the requester.
The following is a brief introduction to federated learning:
Federated learning (FL) is a distributed machine learning framework in which the data never moves: a machine learning model is completed by multiple participants over multiple rounds of training. A federated system consists of a federated server and several participants, each holding its own local data. During training, each participant's local data never leaves its device; instead, local models are generated locally, and training terminates once the federated model has gone through enough rounds to meet the requirements.
The federal learning implementation process:
the training process of federal learning is divided into two phases:
and (5) local updating: each participant updates the local model based on the respective local data and sends parameter gradient updates to the federal server.
Model aggregation: the federal server aggregates all participants 'updates, e.g., federal average (FedAvg) simply weights aggregate the gradient values uploaded by each participant based on each participant's data volume and gives the aggregate gradient to update the global model.
The federal server sends the updated global model parameters to each participant, and each participant updates the local model and prepares for the next round of training.
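The aggregation phase above can be sketched as a data-volume-weighted average of the client parameters. This is an illustrative FedAvg-style sketch, not the patent's implementation; the function and variable names are hypothetical:

```python
# Illustrative FedAvg-style aggregation: each client i uploads parameters
# w_i trained on n_i local samples; the server returns the average of the
# parameter vectors weighted by each client's data volume.

def fedavg(client_params, client_sizes):
    """Weighted average of client parameter vectors by local data volume."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    global_params = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        weight = n / total
        for k in range(dim):
            global_params[k] += weight * params[k]
    return global_params

# Two clients; the second holds three times as much data as the first.
w = fedavg([[1.0, 0.0], [5.0, 4.0]], [25, 75])  # -> [4.0, 3.0]
```

The aggregated model is then redistributed to the participants for the next round, as described above.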
In practice, federated learning also has a decentralized peer-to-peer architecture, and the definition of federated learning generally includes the following assumptions: 1. multiple parties participate; 2. the data does not move; 3. transmission is trusted. According to the data dimensions held by the participants, federated learning is divided into horizontal federated learning, vertical federated learning and federated transfer learning; according to the type of participant, it is divided into cross-device and cross-organization federated learning.
Fig. 2 shows the specific implementation of a data transaction. The present invention proposes a Quality-oriented Task Allocation algorithm (QTA) that applies a greedy strategy to allocate federated learning tasks, issued by requesters in the market, to devices holding suitable data. On this algorithm the invention builds a purchasing market for data, so that a buyer can obtain the maximum amount of data available under a fixed budget.
The specific steps of the QTA algorithm are as follows:
2.1 The requester has a budget for its task; each device with suitable data generates an expected sales price upon receiving the request, and the prices generated by all devices form a set of candidate sales prices.
2.2 Sales price update: each device generates a sales price for its data based on the requester's budget and sends the price and its data volume to the server, which recalculates the prices to update them.
The sales price is updated because a device's data price is the sum of its data cost and computation cost, whereas the price sold by the server must also include the cost of the data transmission process.
2.3 The server quotes to the buyer: the server updates the sales prices, obtains a sales price vector, and quotes it to the buyer for selection; the update adds the transmission cost of the data to the final model cost. The server's profit is the payment per unit of data minus the device's price per unit of data minus the transmission cost, multiplied by the total amount of transacted data.
2.4 Greedy selection for the requester: the server uses a greedy strategy to select devices for the requester according to the sales price vector until the requester's budget is exhausted. The greedy strategy is as follows: after the sales price vector is generated, under the requester's given budget the server always picks the cheapest remaining data model, removes its price from the vector, updates the requester's budget, and repeats until the budget is exhausted.
QTA achieves a globally optimal solution when requesters submit their federated learning tasks in a particular order. In the trading market, all requesters issue federated learning tasks independently, and any requester can join the data trading market at any time. The greedy strategy proposed here ensures that each requester makes maximal use of the training data of the Internet-of-Things devices currently on the market, i.e., excluding devices already selected by earlier requesters. Therefore, the greedy-selection-based QTA proposed by the invention is optimal in an asynchronous market.
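The greedy selection of step 2.4 can be sketched as follows; this is a minimal illustration with hypothetical names, treating each device's price as a scalar:

```python
# Greedy device selection: given the sales price vector, repeatedly pick
# the cheapest remaining data model until the requester's budget runs out.

def greedy_select(prices, budget):
    """Return (indices of selected devices, leftover budget)."""
    order = sorted(range(len(prices)), key=lambda i: prices[i])
    selected, remaining = [], budget
    for i in order:
        if prices[i] > remaining:
            break  # the next-cheapest model no longer fits the budget
        selected.append(i)
        remaining -= prices[i]
    return selected, remaining

chosen, left = greedy_select([30, 10, 20, 50], budget=65)
# -> devices 1, 2 and 0 are bought (prices 10 + 20 + 30), leaving 5
```

Sorting once and consuming cheapest-first is equivalent to repeatedly deleting the minimum from the price vector, as the text describes.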
Fig. 3 is a detailed explanation of the reward and penalty mechanisms of the present invention.
Reward mechanism:
The contributions of the multiple parties in federated learning are evaluated with the Shapley value, a classical cooperative-game solution concept, and the rewards are allocated according to each party's contribution, making the reward mechanism reasonable, fair, efficient and realistic. The formula for each party's Shapley value is:
φ_i = (1/|Π|) Σ_{π∈Π} [ v(S_π(i) ∪ {i}) − v(S_π(i)) ]
where π ∈ Π ranges over all orderings of the participants, S_π(i) is the combination of participants placed before i in the ordering π, and v is a value function such as test accuracy. The Shapley value of party i can be understood as the expectation of i's marginal contribution over all orders of joining the federation: enumerating, over all possible joining orders, the value gain that party i brings to the federation gives i's contribution. A reward mechanism can be built on this value. If the total contribution of all parties is Φ, the contribution of the i-th party is φ_i, and the total transaction amount of the data trading market is O, then the reward of the i-th user is
R_i = (φ_i / Φ) · O
where φ_i is the party's contribution score and satisfies Σ_i φ_i = Φ.
The process for a participant to join the federated collaboration is: the participant submits an application; the market calculates the total contribution of all existing participants; the Shapley value is used to compute the new participant's individual contribution after it joins the market; and the participant's share of the total contribution is multiplied by the total reward, giving the reward the participant obtains after joining the market.
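The Shapley computation above can be sketched by enumerating join orders exactly as the formula prescribes. This brute-force version (hypothetical names; a toy additive value function) is exponential in the number of participants, so a practical market would approximate it, e.g. by sampling permutations:

```python
from itertools import permutations

def shapley(players, v):
    """Average marginal value gain of each player over all join orders."""
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: phi[p] / len(perms) for p in players}

# Toy value function v: coalition -> accuracy gain. Gains are additive
# here, so each party's Shapley value equals its standalone gain.
gains = {"A": 0.6, "B": 0.3}
phi = shapley(["A", "B"], lambda s: sum(gains[p] for p in s))
total = sum(phi.values())
reward_A = phi["A"] / total * 1000.0  # hypothetical market amount O = 1000
```

The last line mirrors the reward rule R_i = (φ_i / Φ) · O.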
Penalty mechanism:
In the data trading market, each participant may opt into federated learning and has the right to opt out of the platform. Since a user exiting the federated collaboration withdraws the contribution it brought to the total reward, the corresponding reward is deducted from that user according to a penalty mechanism. Thus, when a user asks to exit the market and withdraw its data, the server calculates the impact on the model to determine the amount of compensation. When a participant submits a revocation application, the model uses Shapley values over the feature vectors (the attributes of the user's data, such as timestamps, geographic location, device information and behavioral characteristics) to compute the influence of the data on the model's predictions and produces an importance ranking. The compensation for each user's data is computed from its contribution to the model's predictions and is proportional to that contribution: the contribution of each user's data is multiplied by the change in model accuracy to obtain the influence of that data on model accuracy, and the compensation, which may be a fixed amount or a proportion, is then computed from this influence. The calculation is
W_i = C · ΔP_i / Σ_{j=1}^{N} ΔP_j
where W_i is the influence-based payment of party i; N is the number of participants; C is the total compensation amount corresponding to the influences; P_i is the accuracy of the model without the user's data; and ΔP_i is the influence of a user's data, i.e., the change in model accuracy after that data is used. The denominator sums the influences of all participants' data and normalizes them, ensuring that the total compensation of all participants does not exceed the sustainable compensation amount.
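A minimal sketch of this normalized compensation rule, with symbols following the formula above (the function name and example figures are hypothetical):

```python
# Each withdrawing participant i pays W_i = C * dP_i / sum_j dP_j, where
# dP_i is the change in model accuracy attributable to i's data. The
# normalization guarantees the payouts sum to exactly C.

def compensation(influences, C):
    """Map each participant's accuracy influence dP_i to its payment."""
    total = sum(influences.values())
    return {i: C * dp / total for i, dp in influences.items()}

pay = compensation({"u1": 0.02, "u2": 0.06}, C=100.0)
# u1 pays a quarter of C, u2 three quarters
```

Because the shares are proportions of the same total, adding a participant rescales everyone's payment without exceeding C.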
Forgetting mechanism:
Based on the distribution of all models generated by the federated learning algorithm, the quality of forgetting can be measured by the distance between the retrained model and the unlearned model, both in weight space and in output space.
After t rounds of federated training, the goal of each client is a local model that minimizes its (local) empirical risk. This can be written as
w* = argmin_w F_i(w), with F_i(w) = (1/|D_i|) Σ_{(x_j, y_j)∈D_i} L(w; (x_j, y_j))
where L(w; (x_j, y_j)) is the prediction loss on a sample; each client runs several mini-batch stochastic-gradient-descent steps locally to find a model with small empirical loss; and F_i(w) is the average sample loss of client i, the client to be forgotten.
One major way to implement the forgetting mechanism is to reverse the learning process: during forgetting, the client does not learn model parameters that minimize the empirical loss, but instead tries to learn parameters that maximize it. To find a model with large empirical loss, the client can simply run several mini-batch stochastic gradient ascent steps. However, maximizing the loss by plain gradient ascent is problematic: the loss is unbounded, so each gradient step moves toward a model with higher loss, and after a few steps the result is likely to be an arbitrary model similar to a random one.
To address this problem, the model to be forgotten must be kept sufficiently close to a reference model that has effectively learned the data distribution of the other clients: the range of the loss is bounded, and the procedure stops automatically once the loss exceeds a given threshold. Specifically, it is suggested to use the average of the other clients' models as the reference model; the target client i can compute it locally as
w_ref = (1/(N−1)) Σ_{j≠i} w_j.
For the client i to be forgotten, the variation of F_i(w) is restricted to an ℓ2-norm ball of radius δ around w_ref:
Ω = { w : ‖w − w_ref‖_2 ≤ δ }.
One way to solve this constrained problem is projected gradient ascent. Let P_Ω denote the projection operator onto the ℓ2-norm ball Ω. Then, for a given step size η, client i iteratively updates
w^(k+1) = P_Ω( w^(k) + η ∇F_i(w^(k)) ).
The client's model is updated iteratively in this way until a good model is finally obtained. If the validation accuracy falls below a predetermined threshold τ, early stopping is performed; this prevents excessive gradient-ascent training from producing an arbitrary model. Through gradient-ascent training, the resulting model counteracts the historical effect of the low-quality data, and by applying the parameters obtained through gradient ascent to the client's local model, the influence of the low-quality data retained in the other clients' local models can also be eliminated during subsequent aggregation.
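The projected-gradient-ascent loop can be sketched in one dimension, where the projection onto the norm ball reduces to clamping onto an interval. This is purely illustrative: the toy loss, step size and radius are invented for the example:

```python
# Unlearning by projected gradient ascent: the client ASCENDS its local
# loss F_i, but each step is projected back into a ball of radius delta
# around the reference model w_ref (the average of the other clients'
# models), so the result cannot drift into an arbitrary random model.

def project(w, w_ref, delta):
    """1-D projection onto the interval [w_ref - delta, w_ref + delta]."""
    return max(w_ref - delta, min(w_ref + delta, w))

def unlearn(w, w_ref, grad_Fi, eta, delta, steps):
    """Iteratively maximize F_i under the norm-ball constraint."""
    for _ in range(steps):
        w = project(w + eta * grad_Fi(w), w_ref, delta)
    return w

# Toy local loss F_i(w) = (w - 2)^2, gradient 2*(w - 2): ascent drives w
# away from the forgotten client's optimum at w = 2, while the projection
# keeps it within delta of the reference model.
w_final = unlearn(w=1.9, w_ref=0.0, grad_Fi=lambda w: 2.0 * (w - 2.0),
                  eta=0.1, delta=1.0, steps=50)
# w_final stays within [w_ref - delta, w_ref + delta] = [-1.0, 1.0]
```

In the full method the early-stopping test on validation accuracy (threshold τ) replaces the fixed step count used here.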
This is the new federated revocation approach proposed by the present invention: it successfully revokes the contribution of any client by reversing the learning process, relies only on the client that wants to exit the federation, and does not require the server to track the clients' parameter-update history.

Claims (7)

1. A federated learning forgetting mechanism and method for data circulation, characterized in that the method comprises the following steps:
step one: training data through a federated learning framework to generate local models, and letting the local models replace the direct circulation of the data in the market, realizing the implicit flow of the data; then performing model aggregation in a server to generate a global model;
step two: establishing a buyer's market based on the Quality-oriented Task Allocation algorithm QTA: data sellers send their expected prices to the server, which calculates the sales prices and returns them to the buyer for selection; once payment is made, the transaction is recorded and the seller is considered to have joined the federated collaboration;
step three: calculating each participant's contribution to the global model through a Shapley-value-based reward mechanism, and from it the participant's reward allocation;
step four: when a user requests to exit the market and withdraw data, the server calculates the impact on the model through a penalty mechanism to determine the amount of compensation;
step five: for target users who are to exit the market, performing data revocation with a forgetting mechanism that reverses federated learning.
2. The federated learning forgetting mechanism and method for data circulation of claim 1, wherein step one comprises the following specific steps:
1.1 a requester sends a transaction request to the data trading market, and the trading market accepts the requester's purchase demand;
1.2 data holders in the trading market train local models in a distributed manner on different devices;
1.3 the generated local models are aggregated into a global model by the server of the data trading market;
1.4 the global model is sold to the requester.
3. The federated learning forgetting mechanism and method for data circulation according to claim 1 or 2, wherein step two proceeds as follows:
2.1 the requester has a budget for its task; upon receiving the request, each device holding suitable data generates an expected sales price, and the prices generated by all devices form a set of candidate sales prices;
2.2 sales price update: each device generates a sales price for its data according to the requester's budget and transmits its data price and data quantity to the server, which recalculates and thereby updates the sales prices;
2.3 the server quotes to the buyer: the server updates the sales prices to obtain a sales price vector and issues a quotation for the buyer to choose from; the updated sales price is computed by adding the cost of data transmission to the final model cost; the server's profit is the payment per unit of data, minus the device's price per unit of data and the transmission cost, multiplied by the total amount of transacted data;
2.4 greedy selection for the requester: the server uses a greedy strategy to select devices for the requester according to the sales price vector until the requester's budget is exhausted;
the greedy strategy is as follows: after the sales price vector is generated, under the requester's given budget the server each time selects the lowest-priced data model for the requester, deletes that price from the sales price vector, updates the requester's budget, and then again selects the lowest-priced data model, repeating this cycle until the requester's budget is exhausted.
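The greedy strategy of step 2.4 can be sketched as follows (a minimal sketch; the device identifiers and price values are illustrative, not from the claim):

```python
def greedy_select(prices, budget):
    """Greedy QTA-style selection: repeatedly buy the cheapest remaining
    data model until the requester's budget is exhausted.
    `prices` maps device id -> quoted sales price (illustrative names)."""
    remaining = dict(prices)
    chosen, spent = [], 0.0
    while remaining:
        dev = min(remaining, key=remaining.get)   # lowest-priced data model
        price = remaining.pop(dev)                # delete it from the price vector
        if spent + price > budget:                # budget would be exceeded: stop
            break
        chosen.append(dev)
        spent += price                            # update the requester's budget
    return chosen, spent

picked, cost = greedy_select({"d1": 5.0, "d2": 2.0, "d3": 4.0, "d4": 9.0},
                             budget=10.0)
# d2 (2.0) then d3 (4.0) are bought; d1 (5.0) would exceed the budget, so stop.
```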
4. The federated learning forgetting mechanism and method for data circulation according to claim 1 or 2, wherein in step three the reward mechanism is as follows:
the contribution of each of the multiple users in federated learning is evaluated via the Shapley value, and rewards are allocated in proportion to each party's contribution; the Shapley value of each participant is computed as

$\phi_i(v) = \frac{1}{|\Pi|} \sum_{\pi \in \Pi} \left[ v(S_{\pi,i} \cup \{i\}) - v(S_{\pi,i}) \right]$

wherein $\pi \in \Pi$ is a permutation of all participants, $S_{\pi,i}$ is the coalition of participants that precede $i$ in the permutation $\pi$, and $v$ is the cost function; the Shapley value of participant $i$ is understood as $i$'s expected marginal contribution over all orders of joining the federation, i.e. enumerating, over all possible joining orders, the expected value gain that participant $i$ brings to the federation, taken as $i$'s contribution to the federation, by which a reward mechanism can be built; if the total contribution of all participants is $\Phi$, the contribution of the $i$-th participant is $\phi_i$, and the total transaction amount of the data circulation market is $O$, then the reward of the $i$-th user is

$R_i = \frac{\phi_i}{\Phi} \cdot O$

wherein $\phi_i$ is the participant's contribution score and satisfies $\sum_i \phi_i = \Phi$;
the process by which a participant joins the federated collaboration is: the participant submits an application; the market computes the total contribution value of each existing participant; the Shapley value is used to compute the individual contribution of the participant after it joins the market; and the participant's share is multiplied by the total reward in proportion, yielding the reward obtained after joining the market.
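The permutation-form Shapley value and the proportional reward split can be sketched as follows (a toy example; the additive per-participant "data quality" cost function and the market amount are illustrative assumptions):

```python
from itertools import permutations

def shapley(players, value):
    """Exact permutation-form Shapley value: average marginal contribution
    of each player over all joining orders. `value` maps a frozenset of
    players to the coalition's worth (the cost function v)."""
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        before = frozenset()
        for p in order:
            phi[p] += value(before | {p}) - value(before)
            before = before | {p}
    return {p: s / len(perms) for p, s in phi.items()}

# Toy cost function: coalition worth = 10 per unit of each member's data quality.
quality = {"A": 3.0, "B": 1.0, "C": 2.0}
v = lambda coalition: 10.0 * sum(quality[p] for p in coalition)

phi = shapley(list(quality), v)
total = sum(phi.values())                              # total contribution Phi
rewards = {p: phi[p] / total * 600.0 for p in phi}     # O = 600 market amount
```

For an additive cost function like this one, each participant's Shapley value equals its standalone worth, so A, B, C receive 300, 100, 200 respectively.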
5. The federated learning forgetting mechanism and method for data circulation according to claim 3, wherein in step three the reward mechanism is as follows:
the contribution of each of the multiple users in federated learning is evaluated via the Shapley value, and rewards are allocated in proportion to each party's contribution; the Shapley value of each participant is computed as

$\phi_i(v) = \frac{1}{|\Pi|} \sum_{\pi \in \Pi} \left[ v(S_{\pi,i} \cup \{i\}) - v(S_{\pi,i}) \right]$

wherein $\pi \in \Pi$ is a permutation of all participants, $S_{\pi,i}$ is the coalition of participants that precede $i$ in the permutation $\pi$, and $v$ is the cost function; the Shapley value of participant $i$ is understood as $i$'s expected marginal contribution over all orders of joining the federation, i.e. enumerating, over all possible joining orders, the expected value gain that participant $i$ brings to the federation, taken as $i$'s contribution to the federation, by which a reward mechanism can be built; if the total contribution of all participants is $\Phi$, the contribution of the $i$-th participant is $\phi_i$, and the total transaction amount of the data circulation market is $O$, then the reward of the $i$-th user is

$R_i = \frac{\phi_i}{\Phi} \cdot O$

wherein $\phi_i$ is the participant's contribution score and satisfies $\sum_i \phi_i = \Phi$;
the process by which a participant joins the federated collaboration is: the participant submits an application; the market computes the total contribution value of each existing participant; the Shapley value is used to compute the individual contribution of the participant after it joins the market; and the participant's share is multiplied by the total reward in proportion, yielding the reward obtained after joining the market.
6. The federated learning forgetting mechanism and method for data circulation according to claim 1, 2 or 5, wherein in step four the penalty mechanism is as follows:
after a participant submits a revocation application, the feature vector of each user is computed; the influence of each feature on the model's prediction result is computed, giving an importance ranking of the features; the compensation for each user's data is computed according to that data's contribution to the model's prediction result, the compensation being proportional to the contribution; the contribution of each user's data to the prediction result is multiplied by the change in model accuracy, yielding the influence of each user's data on model accuracy; corresponding compensation is then computed according to the influence, the compensation being a fixed amount or a proportion; the calculation formula is

$W_i = \frac{\Delta P_i}{\sum_{j=1}^{N} \Delta P_j}$

wherein $W_i$ represents the influence factor of the participant; $N$ represents the number of all participants; $C$ represents the amount of compensation corresponding to each unit of influence; $P_i$ represents the accuracy of the model when the user's data is not used; $\Delta P_i$ represents the influence of the given user's data; and $\Delta P_j$ represents the change in model accuracy after the user's data is used; the compensation paid to user $i$ is $C \cdot W_i$.
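The influence-normalisation step of the penalty mechanism can be sketched as follows (a minimal sketch under the assumption that the compensation is the user's normalised accuracy-change share of a base amount C; user ids and values are illustrative):

```python
def compensation(delta_p, base_amount):
    """Penalty-mechanism sketch: each withdrawing user's influence factor W
    is its accuracy change normalised over all users' accuracy changes,
    and the compensation is that share of a base amount C."""
    total = sum(delta_p.values())
    w = {u: d / total for u, d in delta_p.items()}   # influence factors W_i
    return {u: base_amount * w[u] for u in delta_p}, w

# Three users whose data changed model accuracy by 2%, 6%, and 2%.
pay, w = compensation({"u1": 0.02, "u2": 0.06, "u3": 0.02}, base_amount=100.0)
# u2 contributed the most accuracy and so receives the largest compensation.
```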
7. The federated learning forgetting mechanism and method for data circulation according to claim 1, 2 or 5, wherein step five proceeds as follows:
after t rounds of federated training, each client's goal is to learn a local model that minimizes the empirical risk, which is formulated as:

$\min_{w} F_i(w), \quad F_i(w) = \frac{1}{n_i} \sum_{j=1}^{n_i} L(w; (x_j, y_j))$

wherein $L(w; (x_j, y_j))$ is the prediction loss on sample $(x_j, y_j)$; each client locally performs several mini-batch stochastic gradient descent steps to search for a model with small empirical loss, and $F_i(w)$ is the average sample prediction loss of the client to be forgotten;
one major way to implement the forgetting mechanism is to reverse the learning process; that is, during forgetting the client does not learn model parameters that minimize the empirical loss, but instead strives to learn model parameters that maximize it;
a loss range is defined, and the process stops automatically once the loss exceeds a given threshold; the average of the other clients' models is used as a reference model, which the target client $i$ computes locally as

$w_{ref} = \frac{1}{N-1} \sum_{j \neq i} w_j$;

for the client $i$ to be forgotten, the variation range of $F_i(w)$ is limited to the $\ell_2$-norm ball of radius $\delta$ around $w_{ref}$, giving the constrained problem

$\max_{w} F_i(w) \quad \text{s.t.} \quad \|w - w_{ref}\|_2 \le \delta$;

projected gradient ascent is used, with $P_\Omega$ denoting the projection operator onto the ball $\Omega = \{w : \|w - w_{ref}\|_2 \le \delta\}$; then, for a given step size $\eta_\mu$, client $i$ iteratively updates

$w^{t+1} = P_\Omega\left(w^{t} + \eta_\mu \nabla F_i(w^{t})\right)$;

the client's data is continuously updated and iterated to obtain the optimal model; when the verification accuracy falls below a predetermined threshold $\tau$, early stopping is performed.
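The constrained gradient-ascent forgetting step can be sketched as follows (a toy quadratic loss stands in for the client's loss $F_i$; the threshold-based early stopping here uses a loss cap, a simplifying assumption for the claim's accuracy-threshold test):

```python
import numpy as np

def project(w, w_ref, delta):
    """Project w onto the l2 ball of radius delta centred at the reference model."""
    diff = w - w_ref
    norm = np.linalg.norm(diff)
    return w if norm <= delta else w_ref + diff * (delta / norm)

def unlearn(w_ref, grad_fn, loss_fn, loss_cap, delta=1.0, eta=0.1, steps=50):
    """Forget client i by gradient ASCENT on its loss F_i, constrained to the
    delta-ball around the other clients' average model; stop early once the
    loss exceeds the given cap."""
    w = w_ref.copy()
    for _ in range(steps):
        w = project(w + eta * grad_fn(w), w_ref, delta)  # ascend, then project
        if loss_fn(w) > loss_cap:
            break
    return w

# Toy loss F_i(w) = ||w - c||^2: ascent drives w away from the client's optimum c.
c = np.array([2.0, 0.0])
grad = lambda w: 2.0 * (w - c)
loss = lambda w: float(np.sum((w - c) ** 2))
w_ref = np.zeros(2)                      # average of the other clients' models
w_out = unlearn(w_ref, grad, loss, loss_cap=8.0, delta=1.0, eta=0.1)
```

The projection keeps the forgotten model within distance $\delta$ of the reference model, so the ascent raises the target client's loss without drifting arbitrarily far from what the remaining clients agree on.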
CN202310414143.6A 2023-04-18 2023-04-18 Federal learning forgetting mechanism and method for data circulation Pending CN116485430A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310414143.6A CN116485430A (en) 2023-04-18 2023-04-18 Federal learning forgetting mechanism and method for data circulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310414143.6A CN116485430A (en) 2023-04-18 2023-04-18 Federal learning forgetting mechanism and method for data circulation

Publications (1)

Publication Number Publication Date
CN116485430A true CN116485430A (en) 2023-07-25

Family

ID=87224504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310414143.6A Pending CN116485430A (en) 2023-04-18 2023-04-18 Federal learning forgetting mechanism and method for data circulation

Country Status (1)

Country Link
CN (1) CN116485430A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117172338A (en) * 2023-11-02 2023-12-05 数据空间研究院 Contribution evaluation method in longitudinal federal learning scene
CN117172338B (en) * 2023-11-02 2024-02-02 数据空间研究院 Contribution evaluation method in longitudinal federal learning scene
CN117892843A (en) * 2024-03-18 2024-04-16 中国海洋大学 Machine learning data forgetting method based on game theory and cryptography

Similar Documents

Publication Publication Date Title
US10970733B2 (en) Systems and methods for coupon issuing
CN116485430A (en) Federal learning forgetting mechanism and method for data circulation
Satzger et al. Auction-based crowdsourcing supporting skill management
JP6967116B1 (en) Electronic ticket management system and program
Yassine et al. Double auction mechanisms for dynamic autonomous electric vehicles energy trading
CN107103408A (en) Complex task distribution method under a kind of mass-rent environment
Cheng Reverse auction with buyer–supplier negotiation using bi-level distributed programming
Sueyoshi An agent-based approach equipped with game theory: strategic collaboration among learning agents during a dynamic market change in the California electricity crisis
JP2022535636A (en) Financial product recommendation method, device, electronic device and program
CN107146158A (en) A kind of electronic data processing method and device
Wang et al. Integration of simulation‐based cost model and multi‐criteria evaluation model for bid price decisions
CN112182399A (en) Multi-party security calculation method and device for federated learning
Guo et al. Dissolving the segmentation of a shared mobility market: A framework and four market structure designs
KR102176108B1 (en) Differential fee payment system through professional experts
Chen et al. Stability and convergence in matching processes for shared mobility systems
CN110852771A (en) Multi-level architecture sale system and method for establishing multi-level architecture sale system
Huang et al. An Online Inference-Aided Incentive Framework for Information Elicitation Without Verification
Lalith et al. Distributed memory parallel implementation of agent-based economic models
CN113761070A (en) Block chain intelligence data sharing excitation method, system, equipment and medium
Tang et al. Intelligent Agents for Auction-based Federated Learning: A Survey
KR102302785B1 (en) Service method for point market platform
CN117391359B (en) Method, device, electronic equipment and storage medium for resource scheduling
Brânzei et al. Online Learning in Multi-unit Auctions
Xu et al. Multi-Agent Deep Reinforcement Learning for Decentralized Proactive Transshipment
US20200184502A1 (en) Multilevel structure sales system and method for establishing the sales system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination