CN113157434B - Method and system for incentivizing user nodes of a horizontal federated learning system - Google Patents

Method and system for incentivizing user nodes of a horizontal federated learning system

Info

Publication number: CN113157434B (application CN202110218159.0A)
Authority: CN (China)
Legal status: Active (granted)
Original language: Chinese (zh)
Other versions: CN113157434A
Inventors: 郭晶晶, 熊良成, 马建峰, 李兴华, 刘玖樽, 李海洋, 田思怡, 高华敏
Assignee (original and current): Xidian University
Application filed by Xidian University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5072 Grid computing
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning


Abstract

An incentive mechanism based on the compound-interest principle is constructed: the federated learning task publisher calculates a reward base and a rate of return for each user from the total budget and the user's information, and then computes each user's final reward. Different rates of return are set for honest users, malicious users, and unreliable users, so that the three types of users obtain different final rewards, incentivizing honest users and suppressing the others. The invention also provides a system, a terminal, and a computer-readable storage medium implementing the incentive method. The invention suppresses participation by malicious users and encourages honest users to participate in federated learning over the long term, ultimately improving the security and reliability of the federated learning system.

Description

Method and system for incentivizing user nodes of a horizontal federated learning system
Technical Field
The invention belongs to the field of electronic information technology, and in particular relates to a method and system for incentivizing user nodes of a horizontal federated learning system.
Background
Horizontal federated learning is a machine learning paradigm proposed in recent years, characterized by multiple users cooperatively training a model on their own data with the coordination of a server. First, each user trains a model on its local data; the trained local model is then uploaded to the server; the server aggregates the received local models using some aggregation rule to obtain a global model shared by all users. This paradigm keeps each user's training data from being shared with other users or the server, thereby protecting user data privacy. Beyond data privacy, providing appropriate incentives for participants is essential to the commercial deployment of federated learning, and incentive mechanisms have therefore received great attention as an important research direction in the federated learning field.
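The training loop described above (local training, upload, server-side aggregation) can be sketched as follows. This is an illustrative FedAvg-style weighted average, not the patent's specific aggregation rule; the linear model, user count, and noise level are assumptions for the example.

```python
import numpy as np

def local_train(global_model, local_data, lr=0.1, epochs=5):
    """One user's local training: a few gradient steps on a linear model."""
    w = global_model.copy()
    X, y = local_data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        w -= lr * grad
    return w

def aggregate(local_models, sample_sizes):
    """Server-side aggregation: sample-size-weighted average of local models."""
    weights = np.array(sample_sizes, dtype=float)
    weights /= weights.sum()
    return sum(wgt * m for wgt, m in zip(weights, local_models))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
users = []
for _ in range(3):  # three users, each holding private local data
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.01 * rng.normal(size=50)
    users.append((X, y))

global_model = np.zeros(2)
for _ in range(20):  # global iterations
    local_models = [local_train(global_model, d) for d in users]
    global_model = aggregate(local_models, [len(d[1]) for d in users])

print(np.round(global_model, 2))
```

Note that the raw data never leaves a user; only model parameters are uploaded, which is the privacy property the paragraph above describes.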
Most existing federated learning incentive mechanisms assume that all nodes participating in federated learning are honest and reliable, but this assumption does not necessarily hold in practical application scenarios. For example, researchers have proposed a federated learning incentive mechanism based on contract theory: the federated learning task publisher publishes a series of different contracts to all users and designs a corresponding optimal payment scheme for each contract item; each user node selects the contract that maximizes its benefit and participates in federated learning according to the contract's requirements, effectively motivating users to contribute data and resources to the federation. The premise of this mechanism is that all users in the system are honest and trusted and will honestly execute the contracts they select. In practice, however, malicious and unreliable users may exist in the system. A malicious user may submit arbitrarily generated parameters to the server as its local model, either to save its own resources or to attack the system; an unreliable user may time out when uploading its local model parameters. Both kinds of users disrupt the whole federated learning process, so that the system ultimately obtains a global model of reduced accuracy. It is therefore necessary to account for the malicious and unreliable nodes that may exist in the system.
Disclosure of Invention
To address the problem that prior-art federated learning incentive mechanisms fail to achieve the expected effect because they do not consider malicious and unreliable users, the invention provides an incentive method and system for user nodes of a horizontal federated learning system, which resist participation of malicious nodes in federated learning and improve the robustness of the federated learning system and the reliability of its learning results.
To this end, the invention adopts the following technical scheme:
An incentive method for user nodes of a horizontal federated learning system constructs an incentive mechanism based on the compound-interest principle, whose mathematical expression is R_u = p_u(1 + r_u)^{k_u}, where R_u is the final reward of user u, p_u is the reward base of user u, r_u is the rate of return of user u, and k_u is the number of effective rounds in which user u participates in federated learning.
The federated learning task publisher calculates each user's reward base and rate of return from the total budget and the user information, and then computes each user's final reward. Different rates of return are set for honest users, malicious users, and unreliable users, so that the three types of users obtain different final rewards, incentivizing honest users and suppressing the others.
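A minimal sketch of the compound-interest reward R_u = p_u(1 + r_u)^{k_u} follows. The patent's formulas for the reward base p_u and rate of return r_u appear only as images in the original publication, so the numeric values below are illustrative, not the patent's actual expressions; the point is only that rewards grow exponentially in the number of effective rounds k_u.

```python
def final_reward(p_u, r_u, k_u):
    """Compound-interest reward: base p_u grows at rate r_u over k_u effective rounds."""
    return p_u * (1 + r_u) ** k_u

# Illustrative values only: in the patent, p_u and r_u are computed from the
# budget and user parameters via formulas not reproduced in this text.
honest     = final_reward(p_u=10.0, r_u=0.05, k_u=20)  # all 20 rounds effective
unreliable = final_reward(p_u=10.0, r_u=0.05, k_u=12)  # several rounds timed out
malicious  = final_reward(p_u=10.0, r_u=0.05, k_u=5)   # few effective rounds

print(round(honest, 2), round(unreliable, 2), round(malicious, 2))
```

Because k_u appears in the exponent, long-term honest participation compounds, which is precisely the positive-incentive property claimed above.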
Preferably, the rate of return is calculated by a formula (rendered as an image in the original), where a denotes the task publisher's tolerance for malicious behavior, b the task publisher's total budget, E_u^{sum,T} the sum of user u's total energy consumption and data cost assuming u participates in all T global iterations of the system, T the number of system iteration rounds, ε_u the local model accuracy of user u, k_u the number of effective rounds in which user u participates in federated learning, and n the number of users participating in each iteration round.
Preferably, the reward base is calculated by a formula (rendered as an image in the original), where a denotes the task publisher's tolerance for malicious behavior, x_u the number of iteration rounds in which user u participates in federated learning, and C_u the total cost of user u's participation in federated learning, equal to the sum of its energy consumption and data cost.
Preferably, each user decides whether to participate in the federated learning task and, if so, uploads its relevant parameters, including computing resources, local data volume, local model accuracy, and unit data cost. The server selects the users that meet the federated learning requirements and rejects those that do not; it then estimates a rate-of-return interval for each participating user and publishes it openly. Finally, each user decides whether to participate according to the rate-of-return interval, and exits voluntarily if it declines.
Preferably, the upper bound of the rate of return for participating in federated learning is given by a formula (image in the original) in which E^{sum,T} denotes the sum of the total energy consumption and data cost, over all T global iterations, of the optimal n users among those qualified to participate, and n denotes the number of users participating in each iteration round.
The lower bound of the rate of return for participating in federated learning is given by a second formula (image in the original), where T is the maximum number of global iterations and ε* is the local accuracy threshold. The rate-of-return interval is the range between the lower and upper bounds.
Preferably, the users participating in federated learning cooperate with the server to complete it, while the aggregation server performs the iterative updates of the global model, detects user behavior, and records the related information. The aggregation server sets a pair of parameters (x_u, k_u) for each user, recording the number of iteration rounds and the number of effective rounds in which user u participates in federated learning; both are set to 0 before federated learning starts. Each time the global model is updated, the aggregation server inspects the behavior of each user and updates (x_u, k_u) according to the detection result.
Preferably, the aggregation server updates the parameters (x_u, k_u) according to each user's detected behavior as follows:
Case 1: user u effectively participates in a round of federated learning; the values of x_u and k_u are each increased by 1. Effective participation means that the user uploads, within the maximum transmission time specified by the system, local model parameters whose model accuracy meets the federated learning requirement.
Case 2: user u times out when uploading its local model; the values of x_u and k_u are unchanged. An upload timeout means that the aggregation server does not receive the model parameters uploaded by user u within the maximum transmission time specified by the system; such a user is regarded as unreliable.
Case 3: the aggregation server detects malicious behavior by user u; the value of k_u is unchanged and the value of x_u is increased by 1. The server then checks whether x_u - k_u is greater than a and, if so, removes the node from the system. Malicious behavior here means that user u uploads its local model parameters within the maximum transmission time, but the aggregation server finds on inspection that the parameters violate the federated learning rules (abnormal data, or local model accuracy below the required threshold).
The invention also provides a horizontal federated learning system comprising an aggregation server, a task publisher, and a number of users. The task publisher is responsible for publishing federated learning tasks, formulating the incentive mechanism, and paying rewards to the users participating in federated learning. The aggregation server is responsible for receiving the local models uploaded by the users and aggregating them under a given aggregation rule into a global model for the users; during this process it records the identity information of the participating node users, calculates each user's energy consumption, checks local model accuracy, and detects malicious user behavior. Each user trains a model on its own local data to obtain a local model; when joining federated learning it submits the agreed information to the aggregation server, and it notifies the aggregation server when leaving the federation.
The invention also provides a terminal device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when executing the computer program, the processor implements the steps of the above method for incentivizing user nodes of a horizontal federated learning system.
The invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for incentivizing user nodes of a horizontal federated learning system.
Compared with the prior art, the invention has the following beneficial effects. An incentive mechanism based on the compound-interest principle is constructed: under this mechanism, honest users who participate in the federated learning system over the long term are positively incentivized, while malicious and unreliable nodes are negatively incentivized, motivating participating nodes to behave honestly long-term, resisting participation by malicious nodes, and improving the robustness of the federated learning system and the reliability of its learning results. The invention sets different rates of return for honest, malicious, and unreliable users, so that the three types of users obtain different final rewards. An honest user obtains a positive payoff each time it participates in federated learning, and the payoff grows exponentially with the number of iteration rounds in which it participates, stimulating honest and reliable users to remain in the federation long-term and ultimately improving the security and reliability of the federated learning system. The invention thus suppresses participation by malicious users while encouraging honest users to participate in federated learning over the long term.
Drawings
FIG. 1 is a block diagram of the horizontal federated learning system of the present invention;
FIG. 2 is a flow chart of the method of the present invention for incentivizing user nodes of the horizontal federated learning system.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Referring to FIG. 1, the horizontal federated learning system of the present invention consists of an aggregation server (AS), a task publisher (TP), and a number of users. The tasks of the respective entities are as follows:
Task publisher (TP): mainly responsible for publishing federated learning tasks, formulating the incentive mechanism, and paying rewards to the users participating in federated learning.
Aggregation server (AS): mainly responsible for receiving the local models uploaded by the users and aggregating them under a given aggregation rule into a global model for the users; additionally responsible for recording the identity information of the participating node users, calculating each user's energy consumption, checking local model accuracy, and detecting malicious user behavior.
Users: mainly responsible for training models on their own local data to obtain local models; to protect privacy, no participant shares its data. Malicious users may generate random, fake local models, and unreliable users may time out when uploading theirs. When a user joins the federated learning system, it submits the agreed information, including parameters such as its computing resources, local data size, local model accuracy, and unit data cost, to the aggregation server; when it leaves the federation, it notifies the aggregation server.
Referring to FIG. 2, the method of the present invention for incentivizing user nodes of the horizontal federated learning system consists of four phases:
Phase 1, system initialization: the task publisher and the server publish the federated learning task, the incentive scheme, and the related system parameters. Phase 2, user selection: each user decides, from the published task, incentive scheme, and parameters, whether to participate in the federated learning task; if so, it uploads its relevant parameters (computing resources, local data volume, local model accuracy, unit data cost, and so on), and users meeting the system requirements continue with the following steps. The server estimates each user's rate-of-return interval from the uploaded parameters and publishes it to the user; the user then makes a final decision on participation and exits the system automatically if it declines. Phase 3, federated learning: the users cooperate with the server to complete federated learning, during which the server updates the global model, detects user behavior, and records the related information. Phase 4, reward calculation and payment: when the federated learning task completes or a user exits, the task publisher calculates the relevant parameters of the incentive mechanism designed by the invention, including the rate of return and the reward base, from the user's parameters and the total cost budget, and then computes and pays the corresponding reward to the user.
Existing federated learning incentive mechanisms generally assume that all user nodes in the federated learning system are honest and reliable; in practice, however, malicious and unreliable users may exist in the system, so these mechanisms cannot achieve the expected effect. To address this problem, the invention provides an incentive mechanism for federated learning systems containing malicious and unreliable users. Under this mechanism, honest users who participate in the federated learning system over the long term are positively incentivized, while malicious and unreliable nodes are negatively incentivized, motivating participating nodes to behave honestly long-term, resisting participation by malicious nodes, and improving the robustness of the federated learning system and the reliability of its learning results.
The energy-consumption and total-cost model for model training used by the invention is as follows.
Computation model of local model training:
The user set is denoted U = {u_1, u_2, ...}. For each user u in U, the data sample size used to train the local model is s_u, and the computing resource user u devotes to local model training, i.e. its CPU cycle frequency, is denoted f_u. The number of CPU cycles user u needs to process one data sample in local model training is denoted c_u. Thus, the computation time of one local iteration of user u is c_u s_u / f_u, and the CPU energy consumption of one local model training iteration is expressed as E_u^cmp = ζ c_u s_u f_u^2, where ζ is the effective capacitance parameter of user u's computing chipset.
Communication model of federated learning:
During the federated learning iterations, after each round of local model training every user uploads its local model update to the aggregation server via wireless communication. The process in which the server updates the global model is called a global iteration, and the process in which each user trains its local model is called a local iteration. Suppose the local model accuracy of user u is ε_u; clearly, a higher local model accuracy leads to fewer local and global iterations, and when the global model accuracy is fixed, the number of local iterations scales with log(1/ε_u). The time spent on one global iteration consists of the computation time of the local iterations and the uplink communication time of the local model update; the computation time of one local iteration of user u is denoted t_u^cmp. Assume the user's position is fixed while uploading the local model parameters. The transmission rate of user u is r_u^tr = B log2(1 + ρ_u h_u / N_0), where B is the transmission bandwidth, ρ_u is the transmission power of user u, h_u is the channel gain of the point-to-point link between user u and the task publisher, and N_0 is the background noise. Assuming the data size of a local model update is a constant σ, the upload time of a local model of size σ is σ / r_u^tr. The total time of one global iteration of user u is therefore the sum of its local computation time and its upload time.
The energy consumption of user u for transmitting the local model update in a global iteration is the product of its transmission power and its upload time, ρ_u σ / r_u^tr.
Thus, for one global iteration, the total energy consumption of user u is the sum of its computation energy over all local iterations and its communication energy.
Assuming that the unit data cost of user u is l and its number of effective rounds of participation in federated learning is k_u, the sum of user u's total energy consumption and data cost in federated learning is expressed by a formula (image in the original).
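The cost model above can be sketched numerically as follows. The closed-form equations appear only as images in the original, so this sketch uses the standard formulas the prose implies (computation time c_u·s_u/f_u, CPU energy ζ·c_u·s_u·f_u², Shannon-style uplink rate); the log(1/ε_u) local-iteration count and every numeric parameter are assumptions for illustration.

```python
import math

def per_round_cost(s_u, f_u, c_u, zeta, eps_u, B, rho_u, h_u, N0, sigma, l):
    """Energy plus data cost of one global iteration for user u (illustrative)."""
    n_local = math.ceil(math.log(1 / eps_u))        # assumed local-iteration count
    e_cmp = n_local * zeta * c_u * s_u * f_u ** 2   # CPU energy over local iterations
    rate = B * math.log2(1 + rho_u * h_u / N0)      # Shannon-style uplink rate (bit/s)
    e_com = rho_u * sigma / rate                    # energy = power * upload time
    data_cost = l * s_u                             # unit data cost * sample count
    return e_cmp + e_com + data_cost

# Hypothetical parameter values for a single user.
cost = per_round_cost(s_u=500, f_u=1e9, c_u=20, zeta=1e-28,
                      eps_u=0.1, B=1e6, rho_u=0.5, h_u=1e-7,
                      N0=1e-10, sigma=1e5, l=0.001)
print(round(cost, 4))
```

Summing this per-round cost over the k_u effective rounds gives the total cost C_u that the incentive formulas take as input.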
the invention has the following assumptions for a lateral federal learning system:
(1) The server is honest and reliable, and each user node is rational and greedy.
(2) There are three types of user nodes in the federal learning system:
1) The reliable node performs local model training by using the real data sample and uploads the obtained real local model to the server in time; 2) Unreliable nodes perform local model training by using real data samples, but communication is unstable, and overtime phenomenon can occur when federal learning is performed; 3) Malicious nodes randomly generate false local models and upload the false local models to a server.
(3) The total paid budget of the task publisher for each user is greater than the sum of the costs of each node.
(4) The server may detect malicious behavior of a malicious user.
(5) The local data of each user satisfies the characteristic of independent and same distribution.
(6) Each user node can freely join and exit the federal learning system.
The method for incentivizing a user node of a horizontal federated learning system comprises the following steps.
Step 1, initialization. The task publisher and the server publish the federated learning task, the incentive scheme, and the related system parameters, which include the time range of federated learning, the maximum number of global iterations T, the local data type, the local model accuracy threshold ε*, the malicious-behavior tolerance a, and the federated learning rules. The malicious-behavior tolerance a is the number of malicious behaviors the task publisher and server will tolerate, taken as a non-negative integer.
The incentive scheme is calculated as follows. The relevant parameters of the incentive mechanism, including the rate of return and the reward base, are computed from each user's information and the total cost budget. The final reward R_u of user u is expressed by equation (7):
R_u = p_u (1 + r_u)^{k_u},   (7)
where p_u is the reward base of user u, r_u is the rate of return of user u, and k_u is the number of effective rounds in which user u participates in federated learning. When the value of x_u - k_u for user u exceeds the tolerance a, the aggregation server removes the node from the federation.
1) Calculation of the rate of return: the rate of return r_u of user u is computed by equation (8) (image in the original), where a denotes the task publisher's malicious-behavior tolerance, b the task publisher's total payment budget, E_u^{sum,T} the sum of total energy consumption and data cost assuming user u participates in all T global iterations of the system, T the number of system iteration rounds, ε_u the local model accuracy of user u, and k_u the number of effective rounds in which user u participates in federated learning.
2) Calculation of the reward base: the reward base p_u of user u is obtained from equation (9) (image in the original), where a is the task publisher's malicious-behavior tolerance, x_u the number of iteration rounds in which user u participates in federated learning, k_u the number of effective rounds, and C_u the total cost of user u's participation in federated learning (the sum of energy consumption and data cost).
Step 2, user selection. The specific steps are as follows:
1) First, each user decides whether to participate in the federated learning task according to the content published in Step 1 and, if so, uploads its relevant parameters (computing resources, local data volume, local model accuracy, unit data cost, and so on).
2) The server selects the users that meet the federated learning requirements and rejects those that do not; users meeting the system requirements continue with the subsequent steps. The server then estimates, from the parameters uploaded by each user, the upper and lower bounds of the rate of return for participating in federated learning according to equations (10) and (11) respectively (images in the original), obtains the rate-of-return interval, and publishes it. In equation (10), E^{sum,T} denotes the sum of the total energy consumption and data cost, over all T global iterations, of the optimal n users among those qualified to participate, and n denotes the number of users participating in each iteration round. In equation (11), T is the maximum number of global iterations and ε* is the local accuracy threshold; the rate-of-return interval is the range between the two bounds.
3) Each user makes its final decision on participation according to the rate-of-return interval and exits the system automatically if it declines.
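The filtering and decision flow of Step 2 can be sketched as follows. Equations (10) and (11) for the rate-of-return bounds are not reproduced in this text, so the server below takes the published interval as given; the acceptance thresholds, user parameters, and the convention that a larger accuracy value is better are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    compute_hz: float   # CPU cycle frequency
    data_volume: int    # local sample count
    accuracy: float     # self-reported local model accuracy
    unit_cost: float    # cost per data unit
    min_rate: float     # lowest rate of return the user will accept

def server_select(candidates, min_data=100, min_accuracy=0.8):
    """Server keeps only users meeting the federated learning requirements."""
    return [c for c in candidates
            if c.data_volume >= min_data and c.accuracy >= min_accuracy]

def user_decides(candidate, rate_interval):
    """User joins only if the published interval can cover its required return."""
    lower, upper = rate_interval
    return candidate.min_rate <= upper

cands = [Candidate("u1", 2e9, 500, 0.90, 0.001, 0.03),
         Candidate("u2", 1e9,  50, 0.90, 0.001, 0.03),   # too little data
         Candidate("u3", 2e9, 800, 0.60, 0.002, 0.03)]   # accuracy too low

qualified = server_select(cands)
rate_interval = (0.02, 0.08)  # illustrative bounds; the real formulas are images in the source
joined = [c.name for c in qualified if user_decides(c, rate_interval)]
print(joined)  # users that finally participate
```

This mirrors the two-sided nature of the phase: the server filters on requirements, then each surviving user opts in or out based on the published return interval.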
Step 3, the participating users cooperate with the server to complete federated learning. During this process the aggregation server performs the iterative updates of the global model, detects user behavior, and records the related information. The aggregation server sets a pair of parameters (x_u, k_u) for each user, recording the number of iteration rounds and the number of effective rounds in which user u participates in federated learning; both are set to 0 before federated learning starts. Each time the global model is updated, the aggregation server performs behavior detection on each user; the result falls into one of three cases, for which the aggregation server updates (x_u, k_u) as follows:
Case 1: user u effectively participates in the round of federated learning; the values of x_u and k_u are each increased by 1. Effective participation means that the user uploads, within the maximum transmission time specified by the system, local model parameters whose model accuracy meets the federated learning requirement.
Case 2: user u times out when uploading its local model; the values of x_u and k_u are unchanged. An upload timeout means that the aggregation server does not receive the model parameters uploaded by user u within the maximum transmission time specified by the system; such a user is regarded as unreliable.
Case 3: the aggregation server detects malicious behavior by user u; the value of k_u is unchanged and the value of x_u is increased by 1. The server then checks whether x_u - k_u is greater than a and, if so, removes the node from the system according to the federated learning rules. Malicious behavior here means that user u uploads its local model parameters within the maximum transmission time, but the aggregation server finds on inspection that the parameters violate the federated learning rules (abnormal data, or local model accuracy below the required threshold).
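The counter updates of Step 3 can be sketched directly from the three cases above. The detection itself (how a round is labeled "effective", "timeout", or "malicious") is assumed to come from the server-side checks described in the text.

```python
def update_counters(x_u, k_u, behavior, a):
    """Apply the three detection cases to user u's round counters.

    behavior is "effective", "timeout", or "malicious";
    a is the malicious-behavior tolerance (a non-negative integer).
    Returns the updated (x_u, k_u) and whether the node is removed.
    """
    removed = False
    if behavior == "effective":     # Case 1: valid model within the deadline
        x_u += 1
        k_u += 1
    elif behavior == "timeout":     # Case 2: unreliable user, counters unchanged
        pass
    elif behavior == "malicious":   # Case 3: round counted, but not as effective
        x_u += 1
        removed = (x_u - k_u) > a   # eject once the tolerance is exceeded
    else:
        raise ValueError(f"unknown behavior: {behavior}")
    return x_u, k_u, removed

# One user across five global iterations with tolerance a = 2.
x, k, a = 0, 0, 2
for b in ["effective", "malicious", "timeout", "malicious", "malicious"]:
    x, k, removed = update_counters(x, k, b, a)
print(x, k, removed)  # the third malicious round pushes x - k past a
```

Because x_u - k_u counts exactly the detected malicious rounds, the removal check implements the tolerance a without any extra bookkeeping.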
Step 4, calculation and payment of user rewards.
For any user u, when the federated learning task completes or user u exits the system, the task publisher calculates and pays the user's reward. The final reward R_u of user u is obtained from R_u = p_u (1 + r_u)^{k_u}.
The incentive mechanism provided by the invention has the following properties: 1) budget balance; 2) encouragement of honest, reliable behavior and suppression of malicious behavior; 3) fairness. Subject to the budget and system requirements of the task publisher and server, the invention achieves its incentive goal of keeping high-quality participants continuously in the system and driving malicious participants out of it.
1. The incentive mechanism proposed by the invention satisfies the budget-balance property: the task publisher's total expenditure on all users must be less than or equal to the expenditure budget, as in inequality (12) (image in the original).
Assume all users participating in federated learning are honest and reliable and take part in every round of global iteration; the total reward paid by the task publisher is then given by equation (13.1). Substituting equation (9) gives equation (13.2). Since all users are honest and reliable, the incentive mechanism gives x_u = k_u = T for every honest user u, so equation (13.2) simplifies to equation (13.3). Because the local model accuracy ε_u is less than 1, inequality (13.4) follows, and its right-hand side simplifies to equation (13.5), i.e. the task publisher's total payment budget. Thus, in the most favorable case, the task publisher's total expenditure on the participating users is smaller than its total payment budget; in practice the total expenditure is smaller still than in the ideal case, so property 1) holds in practice as well.
2. Any user who participates in federated learning training and is not removed by the system has a non-negative net benefit.
The net benefit of any user can be expressed as equation (14.1). Substituting equation (9) into (14.1) yields equation (14.2). From the proposed incentive mechanism, both unreliable and reliable users satisfy k_u = x_u, so equation (14.3) holds for both. Since r_u > 0 follows from assumption (3) and equation (8), the value of equation (14.3) is greater than 0, i.e. inequality (14.4). Equation (14.2) also holds for malicious users, and from the proposed incentive mechanism the aggregation server removes a user once x_u − k_u > a. Hence the inequality x_u − k_u ≤ a holds for malicious nodes that have not been removed from the system, so these users, too, have a net benefit greater than 0.
3. Under individual rationality, the benefit-maximizing strategy of honest and unreliable users is to increase their effective participation rounds, while that of malicious users is to reduce the number of malicious behaviors. The net benefit of any user u can be expressed as equation (14.1); substituting equation (9) into (14.1) yields equation (14.2). The proposed scheme imposes no penalty on unreliable users, so k_u = x_u holds for both unreliable and reliable users, and therefore equation (14.3) holds for them. With the values of C_u and a fixed, the strategy that maximizes the value of equation (14.3) is to increase the value of r_u; hence the benefit-maximizing strategy of these users is to increase their effective participation rounds.
The behavior of malicious users and the rewards they obtain are analyzed as follows. For a user whose malicious behavior has been detected a′ times, the proposed incentive mechanism gives k_u = x_u − a′, and the user's net benefit is shown in equation (15).
Assuming all other conditions unchanged, equation (15) shows that the larger a′ is, the smaller the malicious user's net benefit; when the detected count a′ equals the tolerance count a, the net benefit is 0 regardless of how many global iterations the user participated in.
When the detected count a′ exceeds the tolerance count a, the user is removed from the federation as specified by the incentive mechanism. The optimal strategy for malicious users is therefore to reduce the number of malicious behaviors.
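The deterrent effect can be sketched as follows. The cost term and the concrete numbers are assumptions for illustration only; in the patent, the parameters are additionally chosen so that the net benefit is exactly 0 when a′ = a.

```python
def malicious_net_benefit(p_u, r_u, x_u, a_detected, total_cost):
    """Net benefit of a user caught misbehaving a_detected times:
    effective rounds shrink to k_u = x_u - a_detected, so the
    compound reward shrinks with every detection (cf. equation (15))."""
    k_u = x_u - a_detected
    return p_u * (1.0 + r_u) ** k_u - total_cost

# with everything else fixed, each detection strictly lowers the payoff
payoffs = [malicious_net_benefit(10.0, 0.05, 20, a, 15.0) for a in range(4)]
```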
4. Under otherwise identical conditions, the yield of reliable users is greater than that of unreliable users, and the yield of unreliable users is greater than that of malicious users; users who provide more local data and higher local model accuracy during federated learning obtain higher yields; and the more effective rounds a user participates in, the higher its yield.
The user's yield can be expressed as equation (16.1); substituting equations (8) and (9) into (16.1) gives equation (16.2). Since k_u = x_u holds for both unreliable and reliable users, equation (16.2) reduces to equation (16.3). According to equation (8), the more effective rounds of federated learning a user completes, the larger its return rate r_u, so the yield of reliable users is greater than that of unreliable users. Malicious nodes in the federated learning system have x_u > k_u, so the yield of a malicious node is smaller than that of an unreliable node participating in the same number of effective rounds.
From equations (8) and (16), a user with more effective participation rounds, or one providing more local training data and higher local model accuracy, obtains a larger return rate, and therefore, under otherwise identical conditions, a larger yield; this satisfies the incentive requirement for high-quality users and long-term participants.
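The ordering in property 4 can be illustrated with a toy yield computation. All concrete values are assumptions: in the patent, r_u itself grows with the effective round count via equation (8), whereas here the three user types differ only through their effective rounds k_u and total cost C_u.

```python
def yield_rate(p_u, r_u, k_u, c_u):
    """Toy yield (R_u - C_u) / C_u with R_u = P_u * (1 + r_u) ** k_u."""
    return (p_u * (1.0 + r_u) ** k_u - c_u) / c_u

T = 20
reliable   = yield_rate(10.0, 0.05, T,     15.0)  # effective in every round
unreliable = yield_rate(10.0, 0.05, T - 3, 15.0)  # 3 rounds timed out
malicious  = yield_rate(10.0, 0.05, T - 3, 18.0)  # same k_u, but pays the
                                                  # cost of x_u > k_u rounds
```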
The parameters used in the above description of the present invention are shown in table 1.
TABLE 1
The invention also provides a terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the steps of the method for incentivizing user nodes of the horizontal federated learning system.
The invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for incentivizing user nodes of the horizontal federated learning system.
The computer program may be partitioned into one or more modules/units, which are stored in the memory and executed by the processor to perform the method for incentivizing user nodes of the horizontal federated learning system of the present invention.
The terminal can be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server, or simply a processor and a memory. The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The memory may be used to store computer programs and/or modules; by running or executing the computer programs and/or modules stored in the memory and invoking data stored in the memory, the processor implements the various functions of the horizontal federated learning system.
The foregoing description of preferred embodiments is not intended to limit the technical solution of the present invention in any way. It should be understood that the technical solution can be modified and substituted in several ways without departing from the spirit and principle of the present invention, and such modifications and substitutions also fall within the protection scope of the claims.

Claims (4)

1. A method for incentivizing user nodes of a horizontal federated learning system, characterized by comprising the following steps: constructing an incentive mechanism based on the principle of compound interest, the mathematical expression of which is R_u = P_u(1 + r_u)^k_u, where R_u is the final reward of user u, P_u is the reward base of user u, r_u is the return rate of user u, and k_u is the number of effective rounds in which user u participates in federated learning;
the federated learning task publisher calculates the reward base and return rate of each user according to the total budget and the user information, and calculates the final reward of each user; different return rates are set for honest users, malicious users, and unreliable users, so that the three types of users obtain different final rewards, thereby incentivizing honest users and suppressing the other users;
the return rate is calculated by the following expression:
where a represents the task publisher's tolerance of malicious behavior, b represents the task publisher's total budget, the summation term represents the assumed sum of the total energy consumption and data cost over all T rounds of global iteration of the system, T is the number of system iteration rounds, ε_u represents the local model accuracy of user u, k_u is the number of effective rounds in which user u participates in federated learning, and n represents the number of users participating in each iteration round;
the reward base is calculated by the following expression:
where a represents the task publisher's tolerance of malicious behavior, x_u represents the number of iteration rounds in which user u participates in federated learning, and C_u is the total cost of user u's participation in federated learning, equal to the sum of energy consumption and data cost;
each user decides whether to participate in the federated learning task and, if so, uploads the relevant parameters, which comprise computing resources, local data quantity, local model accuracy, and unit data cost; the server selects users meeting the federated learning requirements and refuses participation to users not meeting them; the return rate interval of each user participating in federated learning is estimated and published publicly; each user finally decides whether to participate according to the return rate interval, and exits voluntarily if it chooses not to participate;
the upper limit of the return rate for participating in federated learning is calculated by the following expression:
where the summation term represents the sum of the total energy consumption and data cost over all T rounds of global iteration of the optimal n users among those acquiring participation qualification, and n represents the number of users participating in each iteration round;
the lower limit of the return rate for participating in federated learning is calculated by the following expression:
where T is the maximum number of global iterations and ε is the local accuracy threshold; the return rate interval is bounded by these two values;
the users participating in federated learning cooperate with the server to complete federated learning, and the aggregation server completes the iterative updating of the global model, detects user behavior, and records related information; the aggregation server sets parameters x_u and k_u for each user to record the number of iteration rounds and the number of effective rounds in which user u participates in federated learning, both set to 0 before federated learning starts; on each global model update, the aggregation server detects the behavior of each user and operates on x_u and k_u according to the detection result, where x_u represents the number of iteration rounds and k_u the number of effective rounds in which user u participates in federated learning;
the aggregation server operates on the parameters x_u and k_u according to the users' behavior detection results as follows:
Case 1: user u effectively participates in one round of federated learning; the values of x_u and k_u are each increased by 1; effective participation means that the user uploads, within the maximum transmission time specified by the system, local model parameters whose model accuracy meets the federated learning requirements to the aggregation server;
Case 2: user u times out when uploading the local model; the values of x_u and k_u are unchanged; a local model upload timeout means the aggregation server does not receive the model parameters uploaded by user u within the maximum transmission time specified by the system, and the user is regarded as an unreliable user;
Case 3: the aggregation server detects malicious behavior by user u; the value of k_u is unchanged and the value of x_u is increased by 1; the server then checks whether x_u − k_u is greater than a and, if so, removes the node from the system; here, malicious behavior means that user u uploads local model parameters to the aggregation server within the maximum transmission time, but the aggregation server finds on inspection that the parameters do not meet the federated learning rules, including abnormal data or local model accuracy below the required threshold.
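Cases 1–3 above amount to simple per-user bookkeeping, sketched below. Class and method names are illustrative, and how the server actually detects timeouts and malicious updates is outside this sketch.

```python
from dataclasses import dataclass

@dataclass
class UserRecord:
    x_u: int = 0  # iteration rounds user u participated in
    k_u: int = 0  # effective rounds

class AggregationServer:
    def __init__(self, tolerance_a):
        self.a = tolerance_a   # tolerated count of detected malicious behavior
        self.users = {}        # uid -> UserRecord
        self.removed = set()   # uids evicted from the federation

    def record_round(self, uid, result):
        rec = self.users.setdefault(uid, UserRecord())
        if result == "effective":    # case 1: valid model, on time
            rec.x_u += 1
            rec.k_u += 1
        elif result == "timeout":    # case 2: unreliable; counters unchanged
            pass
        elif result == "malicious":  # case 3: counted but not credited
            rec.x_u += 1
            if rec.x_u - rec.k_u > self.a:
                self.removed.add(uid)
        else:
            raise ValueError(f"unknown detection result: {result}")
```

Because timeouts leave both counters untouched, the gap x_u − k_u grows only through detected malicious rounds, so the eviction check in case 3 counts exactly those.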
2. A horizontal federated learning system for implementing the method for incentivizing user nodes of the horizontal federated learning system of claim 1, characterized in that: the system comprises an aggregation server, a task publisher, and a plurality of users; the task publisher is responsible for publishing federated learning tasks, formulating the incentive mechanism, and paying rewards to users participating in federated learning; the aggregation server is responsible for receiving the local models uploaded by all users and aggregating them by a given aggregation rule to obtain a global model for the users, recording the identity information of the participating user nodes in the process, calculating the energy consumption of each user, detecting local model accuracy, and detecting malicious user behavior; each user performs model training with its own local data to obtain a local model; when a user joins federated learning, it submits the relevant information to the aggregation server as agreed, and notifies the aggregation server when exiting the federation.
3. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that: when executing the computer program, the processor performs the steps of the method for incentivizing user nodes of the horizontal federated learning system according to claim 1.
4. A computer-readable storage medium storing a computer program, characterized in that: when executed by a processor, the computer program performs the steps of the method for incentivizing user nodes of the horizontal federated learning system according to claim 1.
CN202110218159.0A 2021-02-26 2021-02-26 Method and system for exciting user nodes of transverse federal learning system Active CN113157434B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110218159.0A CN113157434B (en) 2021-02-26 2021-02-26 Method and system for exciting user nodes of transverse federal learning system


Publications (2)

Publication Number Publication Date
CN113157434A CN113157434A (en) 2021-07-23
CN113157434B true CN113157434B (en) 2024-05-07

Family

ID=76883533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110218159.0A Active CN113157434B (en) 2021-02-26 2021-02-26 Method and system for exciting user nodes of transverse federal learning system

Country Status (1)

Country Link
CN (1) CN113157434B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113554182B (en) * 2021-07-27 2023-09-19 西安电子科技大学 Detection method and system for Bayesian court node in transverse federal learning system
CN114363176B (en) * 2021-12-20 2023-08-08 中山大学 Network identification method, device, terminal and medium based on federal learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111754000A (en) * 2020-06-24 2020-10-09 清华大学 Quality-aware edge intelligent federal learning method and system
WO2021008017A1 (en) * 2019-07-17 2021-01-21 深圳前海微众银行股份有限公司 Federation learning method, system, terminal device and storage medium
CN112329947A (en) * 2020-10-28 2021-02-05 广州中国科学院软件应用技术研究所 Federal learning incentive method and system based on differential evolution

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008060320A2 (en) * 2006-03-30 2008-05-22 Major Gadget Software, Inc. Method and system for enterprise network access control and management for government and corporate entities


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wang Yashen. A survey of federated learning technology for data sharing and exchange. Unmanned Systems Technology. 2019, (06). *
Jia Yanyan; Zhang Zhao; Feng Jian; Wang Chunkai. Application of federated learning models in classified data processing. Journal of the China Academy of Electronics and Information Technology. 2020, (01). *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant