CN114330587A - Federated learning incentive method under a specific index
- Publication number
- CN114330587A CN114330587A CN202210001509.2A CN202210001509A CN114330587A CN 114330587 A CN114330587 A CN 114330587A CN 202210001509 A CN202210001509 A CN 202210001509A CN 114330587 A CN114330587 A CN 114330587A
- Authority
- CN
- China
- Prior art keywords
- data
- platform server
- platform
- model
- island
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a two-stage federated learning incentive method under a specific index, comprising the following steps: receiving a platform-model accuracy-improvement task index issued by a platform server; formulating a learning strategy according to the accuracy-improvement target issued by the platform server; training based on the learning strategy and obtaining the total reward amount of the platform server; and obtaining from the platform server a reward amount distributed in proportion to each data island's contribution to the improvement of the platform model's accuracy. The two-stage federated learning incentive mechanism under a specific model-accuracy index can be matched to the actual situation, reducing unnecessary cost waste; an incentive mechanism designed from the perspective of both data quality and data quantity is more comprehensive and scientific, and systematically improves the training efficiency of federated learning.
Description
Technical Field
The invention relates to a federated learning incentive method under a specific index and belongs to the field of distributed machine learning.
Background
With the continuous development of machine-learning technology, data security has become an unavoidable problem. Federated learning, as a new distributed machine-learning paradigm, can solve the data-privacy problem well. The basic federated learning model addresses data privacy, but, as in crowd sensing, another problem remains: collaboration between the data islands and the platform server becomes inefficient. It is therefore common practice to design appropriate incentive schemes to maximize the benefit of each participant and of society.
The main research directions for federated learning incentive mechanisms are the Stackelberg game, auctions, contract theory, the Shapley value, reinforcement learning, blockchains, and the like. The Stackelberg game can model the relationships between the participants in federated learning well: the relationship between the platform server and the data islands is described as that between the leader and the followers. However, current research focuses mainly on complex incentive mechanisms under theoretically constructed, unconstrained accuracy targets, whereas in practice the accuracy of the trained model may only need to meet a specific index. Pursuing only the theoretical optimum without considering the actual situation ignores the problem of model-accuracy redundancy in actual operation and may increase cost; moreover, data quality and data quantity are not used effectively as the basis of the incentive scheme.
Disclosure of Invention
In view of the above problems, the present invention provides a federated learning incentive method under a specific index, suitable for collaboration between a platform server and a plurality of data islands, comprising the following steps:
S1: receiving a platform-model accuracy-improvement task index issued by the platform server;
S2: formulating a learning strategy according to the accuracy-improvement target issued by the platform server;
S3: training based on the learning strategy and obtaining the total reward amount of the platform server;
S4: obtaining from the platform server a reward amount distributed in proportion to the contribution to the improvement of the platform model's accuracy.
Further, in step S2 the data island formulates a learning strategy based on maximization of its own utility, in the following specific steps:
1) establish a utility model for the data island:

U_i = R_i - C_i, i ∈ {1, ..., N},   (1)

where U_i is the utility of data island i, R_i is the reward obtained by data island i, C_i is the training cost of data island i, Δθ_i is the improvement contributed by data island i to the model's training accuracy, a_i is the data quantity, q_i is the data quality, v_i is the combined computation-and-storage cost parameter of data island i, μ_i is the data-processing cost parameter of data island i, κ > 1 is a training parameter, and σ is an accuracy parameter;
2) based on utility maximization of the data island, establish the objective function for the utility model:

where the decision variables of data island i are the number a_i of data sets participating in training and the data quality q_i, i.e. its own utility-maximization strategy; the second stage is a Nash-equilibrium game among the data islands; solving the second-stage game:
first derivative with respect to q_i:
first derivative with respect to a_i:
compute the Hessian matrix:
solve the system of equations:
obtain the decision variables for training as follows:
further, the platform server maximizes the total reward amount based on the effect thereof, and the specific steps are as follows:
1) establishing a platform server total reward information calculation model:
U=V-R, (3)
u is the utility obtained by the platform server, V represents the total valuation increment of the model and is set as a constant, R represents the total incentive cost paid by the platform server, gamma is the average reward amount of the platform decision, and N is the number of data islands;
2) based on the game of the platform server and the data island in the first stage, the utility of the platform server is maximized, and the objective function is established as follows:
wherein, the decision variable of the platform server is the average reward amount gamma provided by the platform;
First derivative of γ:
let the first derivative be zero:
solving can obtain:
the optimal policy value on the platform server side is gamma*I.e. the actual total prize amount.
Further, using the data island's decision variables, the data-set quantity a_i and the data quality q_i, the accuracy improvement contributed by a specific island to platform-model training and its share of the total improvement are calculated through Δθ_i = σ log_κ(q_i a_i); the platform server distributes the rewards in this proportion:

R_i = R · Δθ_i / Σ_{j=1}^{N} Δθ_j
the two-stage federal learning incentive mechanism under the specific model precision index can be combined with the actual situation, unnecessary cost waste is reduced, the incentive mechanism designed from the angle of data quality and data quantity is more comprehensive and scientific, and the training efficiency of federal learning is systematically improved.
Drawings
FIG. 1 is a schematic diagram of the overall flow of the present invention;
FIG. 2 is a schematic diagram of one training round of the federated learning model under a specific accuracy index.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to FIG. 1, the present invention provides a federated learning incentive method under a specific index, suitable for collaboration between a platform server and a plurality of data islands, in which each data island performs the following steps:
S1: receiving a platform-model accuracy-improvement task index issued by the platform server;
S2: formulating a learning strategy according to the accuracy-improvement target issued by the platform server;
S3: training based on the learning strategy and obtaining a reward amount from the platform server;
S4: obtaining the reward amount distributed by the platform server based on the contribution to the improvement of the platform model's accuracy.
The study makes two main assumptions: the training cost of a data island is related to the quality and quantity of its data; and the accuracy improvement of the model is likewise related to the quality and quantity of the data. The Stackelberg game is used for the analysis: the first stage of the two-stage game is a leader-follower game between the server and the data islands; the second stage is a Nash-equilibrium game among the data islands, whose significance is that, for any data island i, the final strategy is the one with maximum utility, i.e. no other strategy yields greater utility than the final strategy. When this holds for all data islands, a Nash-equilibrium state is reached among them.
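As a concrete illustration, the two-stage game described above can be sketched in Python. The accuracy-gain form Δθ_i = σ·log_κ(q_i·a_i) is taken from the description; the island cost form v·a + μ·q, the finite candidate set, and the grid search over γ are illustrative assumptions standing in for the patent's closed-form solutions:

```python
import math

def island_best_response(gamma, v, mu, candidates, sigma=1.0, kappa=2.0):
    """Stage 2 (followers): a data island picks (a_i, q_i) maximizing an assumed
    utility gamma * gain - (v*a + mu*q) over a finite candidate set; this stands
    in for the patent's closed-form Nash-equilibrium solution, with the gain
    Delta-theta_i = sigma * log_kappa(q_i * a_i) taken from the description."""
    def utility(a, q):
        gain = sigma * math.log(q * a, kappa)
        return gamma * gain - (v * a + mu * q)
    return max(candidates, key=lambda aq: utility(*aq))

def platform_choice(target_gain, gamma_grid, v, mu, candidates, sigma=1.0, kappa=2.0):
    """Stage 1 (leader): the platform announces the smallest average reward gamma
    whose induced island response still reaches the accuracy target, so its
    payout is minimal for the fixed target."""
    for gamma in sorted(gamma_grid):
        a, q = island_best_response(gamma, v, mu, candidates, sigma, kappa)
        if sigma * math.log(q * a, kappa) >= target_gain:
            return gamma, (a, q)
    return None  # target unreachable on this grid
```

A higher γ induces the island to bring more (or better) data; the platform stops at the first γ that meets the target, mirroring the leader-follower structure of the game.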
A specific embodiment is as follows:
S1: receiving a platform-model accuracy-improvement task index issued by the platform server;
S2: formulating a learning strategy according to the accuracy-improvement target issued by the platform server;
the data island is used for making a learning strategy based on self utility maximization, and the specific steps are as follows,
1) establishing a utility model of a data island:
Ui=Ri-Ci,i∈(1,...,N), (1)
Wherein, UiFor the utility of data islands i, RiRepresenting the reward obtained by the data island i; ciRepresenting the training cost of the data island i; delta thetaiRepresenting the lifting value of the data island i to the training precision of the model, aiAs the number of data, qiFor data quality, viCalculating and storing cost comprehensive parameters for data of the data island i, wherein the cost comprehensive parameters are known fixed parameters; mu.siThe data processing cost parameter of the data island i is a known fixed parameter; kappa > 1 is a training parameter, sigma is a precision parameter, and all parameters are known fixed parameters. Data quantity aiThe higher the data calculation and storage cost is; data quality qiThe higher the data processing cost. The higher the data quality and the data quantity are, the higher the accuracy of the model parameters is, the more the angle isiThe easier, but certain data quality and improvement of data quality to model parameter accuracyThe promotion is in a marginal decreasing rule.
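Equation (1) can be illustrated with a minimal Python sketch. The gain Δθ_i = σ·log_κ(q_i·a_i) is given in the description; the concrete cost form C_i = v·a + μ·q is only an assumption, consistent with the statements that computation-and-storage cost grows with a_i and processing cost grows with q_i:

```python
import math

def accuracy_gain(a, q, sigma=1.0, kappa=2.0):
    """Delta-theta_i = sigma * log_kappa(q_i * a_i), the accuracy improvement
    contributed by island i (from the description of equation (1))."""
    return sigma * math.log(q * a, kappa)

def island_utility(a, q, reward, v=0.01, mu=0.5):
    """U_i = R_i - C_i (equation (1)).  The cost form C_i = v*a + mu*q is an
    assumption: the patent only states that computation/storage cost grows with
    the data quantity a_i and processing cost with the data quality q_i."""
    return reward - (v * a + mu * q)
```

For instance, an island contributing 8 samples of quality 0.5 with κ = 2 yields a gain of σ·log₂(4); its utility is whatever reward it receives minus the assumed linear costs.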
2) Based on utility maximization of the data island, establish the objective function for the utility model:

where the decision variables of data island i are the number a_i of data sets participating in training and the data quality q_i, and the optimal strategy under this condition is its own utility-maximization strategy. The second stage is a Nash-equilibrium game among the data islands: for any data island i, the final strategy is the one with maximum utility, i.e. no other strategy yields greater utility than the final strategy. When this holds for all data islands, a Nash-equilibrium state is reached among them.
Solve the second-stage game to determine, under each data island's own utility maximization, its optimal data quantity and data quality for the local accuracy target:
first derivative with respect to q_i:
first derivative with respect to a_i:
compute the Hessian matrix:
solve the system of equations:
obtain the optimal strategy for the data participating in learning:
S3: training based on the learning strategy and obtaining the total reward amount of the platform server;
1) establish the platform server's total-reward calculation model:

U = V - R,   (3)

where U is the utility obtained by the platform server; V represents the total valuation increment of the model, i.e. the increment in the model's estimated value, which is determined by the platform or a third party from the actual circumstances of the concrete model and can therefore reasonably be assumed to be a known constant; R represents the total incentive cost paid by the platform server; γ is the average reward amount decided by the platform, i.e. the platform adjusts the strength of the incentive by deciding the average reward amount, thereby regulating the whole incentive mechanism, with the average reward amount of maximum utility to the platform finally obtained from the data islands' training conditions; and N is the number of data islands;
2) because the accuracy of the training-model parameters is a value given by the server platform, and the utility function of the server platform is the model's valuation increment, based on the data quality, minus the total incentive R paid out, the smaller the total incentive R paid by the server platform, the greater the server's utility. The objective function of the server platform is therefore:

where the decision variable of the platform server is the average reward amount γ provided by the platform;
first derivative with respect to γ:
set the first derivative to zero:
solving yields:
The optimal strategy value on the platform-server side is γ*. Its significance is that, once the accuracy requirement of the specific model is determined, the platform server need only set the corresponding reward amount, and its utility is maximized on the premise that the specified model accuracy is obtained.
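The platform-side reasoning (with V fixed, maximizing U = V - R amounts to paying the smallest average reward γ that still meets the accuracy target) can be sketched numerically; the island-response function and the search grid below are illustrative assumptions, since the patent derives γ* in closed form:

```python
def smallest_reward(target_gain, island_response, gamma_grid):
    """The platform's utility is U = V - R with V a constant, so maximizing U
    means minimizing the total payout subject to the accuracy target being met.
    island_response(gamma) maps an average reward to the total accuracy gain the
    data islands deliver at that reward level; its concrete form is an
    assumption standing in for the patent's closed-form result."""
    for gamma in sorted(gamma_grid):  # scan candidate average rewards upward
        if island_response(gamma) >= target_gain:
            return gamma  # smallest reward that still reaches the target
    return None  # target unreachable on this grid
```

Because the response is increasing in γ, the first grid point that reaches the target is the cheapest admissible reward, playing the role of γ*.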
S4: distribute the total reward amount according to the ratio of the accuracy improvement contributed by a single data island to the total improvement of the platform model's accuracy.
Using the above data-set quantity a_i and data quality q_i, the accuracy improvement contributed by a specific island to platform-model training is calculated through Δθ_i = σ log_κ(q_i a_i); the platform server distributes the rewards in this proportion:

R_i = R · Δθ_i / Σ_{j=1}^{N} Δθ_j
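The proportional split can be sketched directly, using only quantities defined in the description (each island's share of the total reward is its share of the total accuracy gain Δθ_i = σ·log_κ(q_i·a_i)):

```python
import math

def distribute_rewards(total_reward, islands, sigma=1.0, kappa=2.0):
    """Split the platform's total reward in proportion to each island's accuracy
    contribution Delta-theta_i = sigma * log_kappa(q_i * a_i).  `islands` is a
    list of (a_i, q_i) pairs; the returned shares preserve their order."""
    gains = [sigma * math.log(q * a, kappa) for a, q in islands]
    total_gain = sum(gains)
    return [total_reward * g / total_gain for g in gains]
```

An island that contributed twice the accuracy gain of another receives twice the reward, and the shares always sum to the total payout.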
in the scheme, the data quality evaluation generally comprises consistency, integrity and timeliness, the platform issues a thirteen-dimensional table comprising three indexes of data type, data integrity and data timeliness, namely the data type, the data integrity and the data timeliness respectively correspond to three dimensions, and the three dimensions are represented by values in an interval of 0-1. The standards of the three index platforms are all 1, then the data islands are compared after data of the data islands are input into a table, namely the data type dimension is the number of data types owned by the data islands in the type types provided by the platforms; the data integrity is the integrity degree of data owned by the data island in the platform; the data timeliness is how much the data timeliness owned by the data island accounts for the timeliness standard provided by the platform. And finally, realizing the quantification of the data quality of each data island by the weighted average of the three indexes.
Referring to FIG. 2, the federated learning incentive model proposed in the present invention mainly addresses training under the requirement of a specific model-accuracy index. In practice, meeting the requirement of a specific accuracy index makes the training mechanism more efficient and avoids wasting training cost on excessive accuracy. In addition, although model training may run for multiple rounds, the mechanism can be analyzed for a single round: the end of one training round can be regarded as the beginning of the next, so the whole process is just this single round repeated; the mechanism of the invention therefore simplifies the training process to one round.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (4)
1. A federated learning incentive method under a specific index, suitable for collaboration between a platform server and a plurality of data islands, characterized by comprising the following steps:
S1: receiving a platform-model accuracy-improvement task index issued by the platform server;
S2: formulating a learning strategy according to the accuracy-improvement target issued by the platform server;
S3: training based on the learning strategy and obtaining the total reward amount of the platform server;
S4: obtaining from the platform server a reward amount distributed in proportion to the contribution to the improvement of the platform model's accuracy.
2. The federated learning incentive method under a specific index as claimed in claim 1, characterized in that in step S2 the data island formulates a learning strategy based on maximization of its own utility, in the following specific steps:
1) establish a utility model for the data island:

U_i = R_i - C_i, i ∈ {1, ..., N},   (1)

where U_i is the utility of data island i, R_i is the reward obtained by data island i, C_i is the training cost of data island i, Δθ_i is the improvement contributed by data island i to the model's training accuracy, a_i is the data quantity, q_i is the data quality, v_i is the combined computation-and-storage cost parameter of data island i, μ_i is the data-processing cost parameter of data island i, κ > 1 is a training parameter, and σ is an accuracy parameter;
2) based on utility maximization of the data island, establish the objective function for the utility model:

where the decision variables of data island i are the number a_i of data sets participating in training and the data quality q_i, i.e. its own utility-maximization strategy; the second stage is a Nash-equilibrium game among the data islands; solving the second-stage game:
first derivative with respect to q_i:
first derivative with respect to a_i:
compute the Hessian matrix:
solve the system of equations:
obtain the decision variables for training as follows:
3. The federated learning incentive method under a specific index as claimed in claim 1, characterized in that the platform server sets the corresponding total reward amount based on maximization of its own utility, in the following specific steps:
1) establish the platform server's total-reward calculation model:

U = V - R,   (3)

where U is the utility obtained by the platform server, V represents the total valuation increment of the model and is set as a constant, R represents the total incentive cost paid by the platform server, γ is the average reward amount decided by the platform, and N is the number of data islands;
2) based on the first-stage game between the platform server and the data islands, maximize the utility of the platform server and establish the objective function:

where the decision variable of the platform server is the average reward amount γ provided by the platform;
first derivative with respect to γ:
set the first derivative to zero:
solving yields:
the optimal strategy value on the platform-server side is γ*, which determines the actual total reward amount.
4. The two-stage federated learning incentive method under a specific index as claimed in claim 1, characterized in that:
using the data island's decision variables, the data-set quantity a_i and the data quality q_i, the accuracy improvement contributed by a specific island to platform-model training and its share of the total improvement are calculated through Δθ_i = σ log_κ(q_i a_i); the platform server distributes the rewards in this proportion:

R_i = R · Δθ_i / Σ_{j=1}^{N} Δθ_j
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210001509.2A CN114330587A (en) | 2022-01-04 | 2022-01-04 | Federal learning incentive method under specific index |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114330587A true CN114330587A (en) | 2022-04-12 |
Family
ID=81022869
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114819197A (en) * | 2022-06-27 | 2022-07-29 | 杭州同花顺数据开发有限公司 | Block chain alliance-based federal learning method, system, device and storage medium |
CN114819197B (en) * | 2022-06-27 | 2023-07-04 | 杭州同花顺数据开发有限公司 | Federal learning method, system, device and storage medium based on blockchain alliance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||