CN113159190B

CN113159190B - Federal incentive distribution method, apparatus, computer device, and storage medium

Info

Publication number: CN113159190B
Application number: CN202110449555.4A
Authority: CN
Inventors: 李泽远; 王健宗
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2021-04-25
Filing date: 2021-04-25
Publication date: 2024-02-02
Anticipated expiration: 2041-04-25
Also published as: CN113159190A

Abstract

The invention discloses a federal incentive distribution method, a federal incentive distribution device, computer equipment and a storage medium. An initial training data set sent by each participant joining a federal system is received, and an effective training data set corresponding to each participant and its training quality vector are determined from the initial training data set; the federal incentive depth of the federal system is determined according to the training quality vectors and the total effective data amounts corresponding to all the participants; a contribution value of each participant is determined by a marginal utility measurement method, and a preset incentive distribution value corresponding to each participant is determined according to the federal incentive depth and the contribution value corresponding to each participant; actual incentive distribution values corresponding to all the participants are determined according to the preset incentive distribution values and a preset incentive determination strategy; and the federal incentive distribution task is executed according to the actual incentive distribution value corresponding to each participant. The invention improves the comprehensive benefit of the federal system.

Description

Federal incentive distribution method, apparatus, computer device, and storage medium

Technical Field

The present invention relates to the field of artificial intelligence technologies, and in particular, to a federal incentive distribution method, apparatus, computer device, and storage medium.

Background

Federal learning (also known as federated learning) combines the advantages of distributed machine learning and privacy protection technology: a model can be trained jointly by multiple parties on the premise of ensuring data security and privacy, which improves model performance and practical benefit. Federal learning is currently applied in scenarios such as intelligent security and asset risk detection.

In the prior art, federal learning applications presuppose that a plurality of participants actively join and train local models with high-quality data. However, since the quality and quantity of the data input to federal learning are decided by the participants, the allocation of federal learning incentives in the federal system may fail to match the needs of each participant, resulting in a lower comprehensive benefit of the federal system.

Disclosure of Invention

The embodiment of the invention provides a federal incentive distribution method, a federal incentive distribution device, computer equipment and a storage medium, which are used for solving the problem of lower comprehensive benefit of a federal system.

A federal incentive distribution method, comprising:

receiving an initial training data set sent by each participant joining the federal system, and determining, from the initial training data set, an effective training data set corresponding to each participant and its training quality vector; the effective training data set of each participant is associated with a total effective data amount;

determining the federal incentive depth of the federal system according to the training quality vector and the total effective data amount corresponding to each participant;

determining a contribution value of each participant by adopting a marginal utility measurement method, and determining a preset incentive distribution value corresponding to each participant according to the federal incentive depth and the contribution value corresponding to each participant;

determining actual incentive distribution values corresponding to all the participants according to the preset incentive distribution values and a preset incentive determination strategy;

and executing the federal incentive distribution task according to the actual incentive distribution value corresponding to each participant.

A federal incentive distribution device, comprising:

the data processing module is used for receiving an initial training data set sent by each participant joining the federal system, and determining, from the initial training data set, an effective training data set corresponding to each participant and its training quality vector; the effective training data set of each participant is associated with a total effective data amount;

the federal incentive depth determining module is used for determining the federal incentive depth of the federal system according to the training quality vector and the total effective data amount corresponding to each participant;

the preset incentive distribution value determining module is used for determining the contribution value of each participant by adopting a marginal utility measurement method, and determining the preset incentive distribution value corresponding to each participant according to the federal incentive depth and the contribution value corresponding to each participant;

the actual incentive distribution value determining module is used for determining the actual incentive distribution value corresponding to each participant according to the preset incentive distribution value and a preset incentive determination strategy;

and the incentive allocation task execution module is used for executing the federal incentive allocation task according to the actual incentive allocation value corresponding to each participant.

A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the federal incentive allocation method described above when executing the computer program.

A computer readable storage medium storing a computer program which when executed by a processor implements the federal incentive allocation method described above.

In the federal incentive distribution method, device, computer equipment and storage medium described above, an initial training data set sent by each participant joining the federal system is received, and an effective training data set corresponding to each participant and its training quality vector are determined from the initial training data set, the effective training data set of each participant being associated with a total effective data amount; the federal incentive depth of the federal system is determined according to the training quality vector and the total effective data amount corresponding to each participant; a contribution value of each participant is determined by a marginal utility measurement method, and a preset incentive distribution value corresponding to each participant is determined according to the federal incentive depth and the contribution value corresponding to each participant; actual incentive distribution values corresponding to all the participants are determined according to the preset incentive distribution values and a preset incentive determination strategy; and the federal incentive distribution task is executed according to the actual incentive distribution value corresponding to each participant.

According to the invention, the contribution of each participant to the federal training of the federal system is evaluated through the total effective data amount of the effective training data set sent by that participant and the training quality vector corresponding to that data set, so that an incentive matching the contribution value of each participant is determined. A preset incentive determination strategy is also introduced so that every participant receives a positive incentive, which can attract more participants to provide more training data of better quality and thereby improves the comprehensive benefit of the federal system.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic illustration of an application environment for a federal incentive distribution method in accordance with an embodiment of the present invention;

FIG. 2 is a flow chart of a federal incentive distribution method in accordance with an embodiment of the present invention;

FIG. 3 is a flow chart of step S10 of the federal incentive assigning method according to an embodiment of the present invention;

FIG. 4 is a flow chart of step S20 of the federal incentive assigning method according to an embodiment of the present invention;

FIG. 5 is a flow chart of step S40 of the federal incentive assigning method according to an embodiment of the present invention;

FIG. 6 is a schematic block diagram of a federal incentive distribution device according to an embodiment of the present invention;

FIG. 7 is a schematic block diagram of a data processing module in a federal incentive distribution device according to an embodiment of the present invention;

FIG. 8 is a schematic block diagram of a federal incentive depth determination module in a federal incentive distribution device according to an embodiment of the present invention;

FIG. 9 is a schematic block diagram of an actual incentive distribution value determination module in a federal incentive distribution device according to an embodiment of the present invention;

FIG. 10 is a schematic diagram of a computer device in accordance with an embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The federal incentive distribution method provided by the embodiment of the invention can be applied to the application environment shown in fig. 1. Specifically, the method is applied to a federal incentive distribution system that comprises a client and a server as shown in fig. 1; the client and the server communicate through a network so as to solve the problem of low comprehensive benefit of the federal system. The client, also called the user side, refers to a program that corresponds to the server and provides local services for the user. The client may be installed on, but is not limited to, various personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.

In one embodiment, as shown in fig. 2, a federal incentive distribution method is provided, which is illustrated by way of example as the method is applied to the server in fig. 1, and includes the following steps:

S10: receiving an initial training data set sent by each participant joining the federal system, and determining, from the initial training data set, an effective training data set corresponding to each participant and its training quality vector; the effective training data set of each participant is associated with a total effective data amount.

As will be appreciated, a participant refers to a user or terminal that has decided to take part in training the federal system. The federal system refers to a system that is based on federal learning techniques and is awaiting federal training. The initial training data set refers to the local data of each participant, i.e., the data that each participant decides to input for training the federal system. The effective training data set refers to the set of training data that remains after the training data that does not meet the training requirements has been removed from the initial training data set; all training data in the effective training data set meet the training requirements of the federal system. The training quality vector is used for representing the quality of the training data in the effective training data set, and is obtained through performance evaluation along multiple dimensions such as saturation and similarity. The total effective data amount refers to the total number of pieces of training data in each effective training data set.

In one embodiment, as shown in fig. 3, the initial training data set includes at least one piece of initial training data, and step S10 includes:

S101: receiving a data cleaning instruction containing training requirements, and performing data cleaning processing on the initial training data set of each participant so as to remove the initial training data that does not meet the training requirements.

S102: recording the initial training data set from which the initial training data not meeting the training requirements has been removed as the effective training data set.

As will be appreciated, training requirements refer to the requirements for federal training of the federal system, which may include requirements on training data, requirements on model parameters, and the like. The data cleaning instruction may be sent by a user (e.g., a trainer of the federal system) or may be generated automatically after the training requirements are entered.

Specifically, after a data cleaning instruction containing the training requirements is received, data cleaning processing is performed on the initial training data in the initial training data set of each participant to remove the initial training data that does not meet the training requirements, and the initial training data set after this removal is recorded as the effective training data set. Further, the total effective data amount associated with the effective training data set is the difference between the total amount of initial training data in the initial training data set and the total amount of initial training data that does not meet the training requirements.
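Steps S101 and S102 can be illustrated with a minimal sketch. The record layout, the function name `clean_training_set` and the example requirement (a non-empty label) are hypothetical and are not taken from the embodiment; the only property carried over from the text is that the total effective data amount equals the initial total minus the number of removed records.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def clean_training_set(
    initial: Sequence[T], meets_requirement: Callable[[T], bool]
) -> Tuple[List[T], int]:
    """Return (effective training data set, total effective data amount)."""
    # Keep only the records that satisfy the training requirement.
    effective = [record for record in initial if meets_requirement(record)]
    removed = len(initial) - len(effective)
    # Total effective data amount = initial total - removed total.
    effective_total = len(initial) - removed
    return effective, effective_total


# Hypothetical requirement: every record must carry a non-empty label.
records = [
    {"x": 1.0, "label": "a"},
    {"x": 2.0, "label": ""},
    {"x": 3.0, "label": "b"},
]
effective_set, effective_total = clean_training_set(
    records, lambda r: bool(r["label"])
)
```

In practice the predicate would be derived from the training requirements carried by the data cleaning instruction.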

S103: inputting the effective training data sets into a federal feature engineering module, performing feature evaluation processing on the effective training data sets through the federal feature engineering module, and determining training quality vectors corresponding to the effective training data sets.

It will be appreciated that the federal feature engineering module refers to a module in the federal system that measures the quality of the data in an effective training data set along a number of different dimensions.

Specifically, after the initial training set after the initial training data which does not meet the training requirement is removed is recorded as the effective training data set, the effective training data set is input into a federal feature engineering module in a federal system, and feature evaluation processing is performed on training data in the effective training data set from various dimensions such as saturation, sparsity, similarity and data distribution through the federal feature engineering module, so that training quality vectors corresponding to the effective training data sets are determined.

S20: determining the federal incentive depth of the federal system according to the training quality vector and the total effective data amount corresponding to each participant.

It will be appreciated that after each participant provides data for federal training of the federal system, the participant receives incentive feedback, and the federal incentive depth is the parameter that affects this feedback: the greater the federal incentive depth, the more incentive feedback is given to each participant; conversely, the smaller the federal incentive depth, the less incentive feedback is given to each participant.

In one embodiment, as shown in fig. 4, step S20 includes:

S201: determining a matching quality vector from the training quality vectors corresponding to the participants; the matching quality vector is the training quality vector with the highest degree of matching with the training requirements.

It can be understood that after feature evaluation is performed on the effective training data sets by the federal feature engineering module to determine the training quality vector corresponding to each effective training data set, the training quality vector which is most matched with the training requirement, that is, the training quality vector with the highest quality can be determined according to the training quality vector corresponding to the effective training data set corresponding to each participant, and then the training quality vector with the highest quality is recorded as the matching quality vector.

S202: and determining an average quality vector by adopting a mathematical expectation algorithm according to the training quality vector corresponding to each participant.

Specifically, after feature evaluation processing is performed on the effective training data sets through the federal feature engineering module, training quality vectors corresponding to the effective training data sets are determined, and then average quality vectors are determined through a mathematical expectation algorithm according to the training quality vectors corresponding to the participants. Illustratively, the average quality vector may be determined by the following mathematical expectation algorithm:

$$\bar{q} = \frac{1}{m}\sum_{i=1}^{m} E(q_i)$$

wherein $\bar{q}$ refers to the average quality vector corresponding to the m participants; $E(q_i)$ refers to the training quality vector corresponding to the i-th participant; m is the total number of participants.

S203: and determining the average effective data amount by adopting a mathematical expectation algorithm according to the effective data amount corresponding to each participant.

Specifically, after the initial training set after the initial training data which does not meet the training requirement is removed is recorded as the effective training data set, determining the quantity of training data in each effective training data set, namely determining the total quantity of effective data in the effective training data set, and further determining the average effective data quantity by adopting a mathematical expectation algorithm according to the total quantity of effective data associated with the effective training data set corresponding to each participant. Illustratively, the average effective data amount may be determined by the following mathematical expectation algorithm:

$$\bar{Q} = \frac{1}{m}\sum_{i=1}^{m} Q(i)$$

wherein $\bar{Q}$ refers to the average effective data amount corresponding to the m participants; $Q(i)$ refers to the total effective data amount corresponding to the i-th participant.
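Steps S201 to S203 can be sketched as follows. The averages are plain arithmetic means over the m participants, as the text describes. How the "matching degree" with the training requirements is measured is not specified in the text; the sketch assumes cosine similarity against a hypothetical requirement vector, and all function names and sample values are illustrative only.

```python
import math
from statistics import fmean


def average_quality_vector(quality_vectors):
    """Component-wise mean of the participants' training quality vectors."""
    return [fmean(component) for component in zip(*quality_vectors)]


def average_effective_amount(effective_totals):
    """Mean of the participants' total effective data amounts."""
    return fmean(effective_totals)


def matching_quality_vector(quality_vectors, requirement):
    """Pick the quality vector closest to the requirement vector.

    Assumption: the matching degree is cosine similarity; the source
    does not define the measure.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))

    return max(quality_vectors, key=lambda q: cosine(q, requirement))


# Hypothetical data for three participants, two quality dimensions each.
quality = [[0.9, 0.2], [0.6, 0.6], [0.3, 0.9]]
totals = [100, 250, 400]

avg_q = average_quality_vector(quality)
avg_n = average_effective_amount(totals)
best = matching_quality_vector(quality, requirement=[1.0, 1.0])
```

With a requirement vector that weights both dimensions equally, the balanced vector of the second participant is selected as the matching quality vector.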

S204: obtaining the maximum data bearing capacity of the federal system, and determining the federal incentive depth according to the matching quality vector, the average effective data amount and the maximum data bearing capacity.

It will be appreciated that the maximum data bearing capacity refers to the maximum amount of training data that the federal system can carry. After the maximum data bearing capacity of the federal system is obtained, the federal incentive depth is determined according to the matching quality vector, the average effective data amount and the maximum data bearing capacity.

In one embodiment, step S204 includes:

and receiving system service parameters sent by each participant, and determining the service total parameters of the federal system according to the system service parameters of each participant.

As will be appreciated, the system service parameters refer to the system service costs that each participant needs to submit after determining to join the federal system for federal training (the system service costs submitted by each participant may be set to the same costs). The total service parameter is the sum of the system service fees submitted by all the participants.

Specifically, after receiving the system service parameters transmitted from each of the participants, the sum of the system service parameters of each of the participants is recorded as the service total parameter of the federal system.

Acquiring a first preset number decision parameter, a second preset number decision parameter, a first preset depth decision parameter and a second preset depth decision parameter of the federal system; the second preset number decision parameter is greater than the first preset number decision parameter.

It can be understood that the first preset number decision parameter, the second preset number decision parameter, the first preset depth decision parameter and the second preset depth decision parameter are all decision parameters of the federal system, and can be determined by factors such as the application environment and the operating condition of the federal system. The first preset number decision parameter and the second preset number decision parameter are used for gauging the size of the average effective data amount. The first preset depth decision parameter and the second preset depth decision parameter are used for determining the size of the federal incentive depth.

When the average effective data amount is smaller than the first preset number decision parameter, determining the federal incentive depth according to the service total parameter, the matching quality vector, the average effective data amount and the maximum data bearing capacity.

Specifically, after a first preset number decision parameter, a second preset number decision parameter, a first preset depth decision parameter and a second preset depth decision parameter of the federal system are acquired, comparing the average effective data volume with the first preset number decision parameter and the second preset number decision parameter, and determining federal excitation depth according to the service total parameter, the matching quality vector, the average effective data and the maximum data carrying capacity when the average effective data volume is smaller than the first preset number decision parameter. For example, the corresponding federal excitation depth when the average effective data volume is less than the first preset number decision parameter may be determined according to the following expression:

Wherein T1 is the federal incentive depth when the average effective data amount is less than the first preset number decision parameter; C is the system service parameter corresponding to each participant (the system service parameters of all participants are here set to C; if the system service parameters of the participants differ, it can be replaced by the sum of the system service parameters of the participants); the remaining symbols of the expression denote, in order, the average quality vector corresponding to the m participants, the average effective data amount corresponding to the m participants, the matching quality vector, and the maximum data bearing capacity; x1 is the first preset number decision parameter.

When the average effective data amount is greater than or equal to the first preset number decision parameter and smaller than the second preset number decision parameter, determining the federal incentive depth according to the first preset depth decision parameter, the service total parameter, the matching quality vector, the average effective data amount and the maximum data bearing capacity.

Specifically, after a first preset number decision parameter, a second preset number decision parameter, a first preset depth decision parameter and a second preset depth decision parameter of the federal system are acquired, comparing the average effective data volume with the first preset number decision parameter and the second preset number decision parameter, and determining the federal excitation depth according to the first preset depth decision parameter, the service total parameter, the matching quality vector, the average effective data and the maximum data bearing capacity when the average effective data volume is greater than or equal to the first preset number decision parameter and less than the second preset number decision parameter. For example, the corresponding federal excitation depth when the average effective data volume is greater than or equal to the first preset number decision parameter and less than the second preset number decision parameter may be determined according to the following expression:

Wherein T2 is the federal incentive depth when the average effective data amount is greater than or equal to the first preset number decision parameter and less than the second preset number decision parameter; x2 is the second preset number decision parameter; t1 is the first preset depth decision parameter.

When the average effective data amount is greater than or equal to the second preset number decision parameter, determining the federal incentive depth according to the second preset depth decision parameter, the service total parameter, the matching quality vector, the average quality vector, the average effective data amount and the maximum data bearing capacity.

Specifically, after a first preset number decision parameter, a second preset number decision parameter, a first preset depth decision parameter and a second preset depth decision parameter of the federal system are obtained, comparing the average effective data volume with the first preset number decision parameter and the second preset number decision parameter, and determining the federal excitation depth according to the second preset depth decision parameter, the service total parameter, the matching quality vector, the average effective data and the maximum data bearing capacity when the average effective data volume is greater than or equal to the second preset number decision parameter. For example, the corresponding federal excitation depth when the average effective data volume is greater than or equal to the second preset number decision parameter may be determined according to the following expression:

Wherein T3 is the federal incentive depth when the average effective data amount is greater than or equal to the second preset number decision parameter; t2 is the second preset depth decision parameter.
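The three-way case distinction used to choose among the expressions for T1, T2 and T3 can be sketched as below. The expressions themselves are not reproduced in the text above, so the sketch only shows the branch selection; the callables passed in are placeholders that return arbitrary constants, and the names `select_depth_branch` and `federal_incentive_depth` are hypothetical.

```python
from typing import Callable, Mapping


def select_depth_branch(avg_amount: float, x1: float, x2: float) -> int:
    """Return 1, 2 or 3 for the expression (T1, T2 or T3) that applies."""
    if not x2 > x1:
        raise ValueError("x2 must be greater than x1")
    if avg_amount < x1:
        return 1  # average effective data amount below x1
    if avg_amount < x2:
        return 2  # x1 <= average effective data amount < x2
    return 3      # average effective data amount >= x2


def federal_incentive_depth(
    avg_amount: float,
    x1: float,
    x2: float,
    expressions: Mapping[int, Callable[[], float]],
) -> float:
    """Evaluate whichever depth expression matches the current branch."""
    return expressions[select_depth_branch(avg_amount, x1, x2)]()


# Placeholder callables only: they stand in for the T1/T2/T3 expressions
# and do NOT reproduce the formulas of the embodiment.
placeholder_expressions = {1: lambda: 0.1, 2: lambda: 0.2, 3: lambda: 0.3}
depth = federal_incentive_depth(
    150.0, x1=100.0, x2=200.0, expressions=placeholder_expressions
)
```

Passing the expressions as callables keeps the branch logic independent of the formulas, so each expression can be filled in with the service total parameter, matching quality vector and other inputs it requires.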

S30: and determining the contribution value of each participant by adopting a marginal utility measurement method, and determining a preset incentive distribution value corresponding to each participant according to the federal incentive depth and the contribution value corresponding to each participant.

It will be appreciated that the marginal utility measurement method serves to measure the contribution of each participant to the training of the federal system. The preset incentive distribution value refers to a value by which an incentive is provisionally allocated to each participant according to that participant's contribution to the training of the federal system.

In one embodiment, step S30 includes:

Determining the marginal utility of each participant to the federal system by adopting a Shapley algorithm according to the effective training data set corresponding to each participant.

And determining a contribution value corresponding to each participant according to the marginal utility corresponding to each participant.

The Shapley algorithm is used for measuring the contribution of each participant to the training of the federal system. Specifically, after the federal incentive depth of the federal system is determined according to the training quality vector and the total effective data amount corresponding to each participant, the Shapley algorithm is used to determine the marginal utility that the effective training data set of each participant brings to the training of the federal system, and the contribution value corresponding to each participant is then determined according to the marginal utility corresponding to that participant. Further, the sum of the contribution values corresponding to the participants is 1.

Further, the contribution value corresponding to each of the participants may be determined by the following expression:

δi＝v(S∪{i})-v(S)

wherein δi is the marginal utility brought by the i-th participant after joining the federal system; v(S∪{i}) is the utility brought to the federal system by the participants in S together with the i-th participant; v(S) is the utility brought to the federal system by the participants in S alone, i.e., without the i-th participant (S is a set that does not contain the i-th participant); the contribution value corresponding to each participant is determined from these marginal utilities; M is the set of all participants; m is the number of participants.

Illustratively, assume there are 2 participants X and Y. If only participant X is present, the utility of the federal system is v(X); if both participant X and participant Y are present and the utility of the federal system is then v(X+Y), the marginal utility of participant Y is v(X+Y)-v(X). Now assume there are 3 participants X, Y and Z and the marginal utility of participant X is to be calculated. The subsets of participants are enumerated as {X}, {Y}, {Z}, {X, Y}, {X, Z}, {Y, Z}; the subsets that exclude participant X are {Y}, {Z} and {Y, Z}, and each of these subsets can be represented by S.
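The three-participant example can be worked through in code. The sketch below uses the standard Shapley value, which averages the marginal utility v(S∪{i})-v(S) over all subsets S that exclude participant i (including the empty set) with the usual weights |S|!(m-|S|-1)!/m!. Two points are assumptions rather than statements of the embodiment: that the contribution value is the Shapley value divided by the sum of all Shapley values (one simple way to make the contribution values sum to 1), and the utility table itself, which is invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Standard Shapley value of each player for the utility function v."""
    m = len(players)
    values = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(m):
            # Weight of every coalition S with |S| = size that excludes i.
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (v(s | {i}) - v(s))  # marginal utility
        values[i] = total
    return values


def contribution_values(players, v):
    """Normalise Shapley values so they sum to 1 (assumed normalisation)."""
    phi = shapley_values(players, v)
    total = sum(phi.values())
    return {p: phi[p] / total for p in players}


# Hypothetical utility of each coalition of participants X, Y and Z.
utility = {
    frozenset(): 0.0,
    frozenset("X"): 0.4, frozenset("Y"): 0.3, frozenset("Z"): 0.2,
    frozenset("XY"): 0.7, frozenset("XZ"): 0.6, frozenset("YZ"): 0.5,
    frozenset("XYZ"): 0.9,
}
contrib = contribution_values(["X", "Y", "Z"], utility.__getitem__)
```

Because the invented utility table is additive, each participant's Shapley value equals its stand-alone utility, so the contribution values come out as 4/9, 3/9 and 2/9. Exact enumeration grows exponentially with the number of participants, so larger federations would need sampling-based approximations.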

S40: and determining the actual incentive distribution value corresponding to each participant according to the preset incentive distribution value and a preset incentive determination strategy.

It will be appreciated that the preset incentive determination strategy is used to determine the actual incentive distribution value corresponding to each participant. Participants incur data calculation and communication losses during federal training, so when the incentive allocated to a participant is small, the incentive actually obtained may fail to cover the total consumption caused by those losses. Therefore, when the preset incentive distribution value of a participant does not cover the total consumption caused by the data calculation and communication losses, the preset incentive distribution value is supplemented, through an incentive pool in the federal system, up to an incentive distribution value that matches the total consumption.
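The supplementing rule described above can be sketched as follows: a participant whose preset incentive distribution value is below its total consumption is raised to that consumption, with the difference drawn from the incentive pool. The function name, the dictionary layout and the behaviour when the pool runs out (raising an error) are assumptions; the text does not say what happens in that case.

```python
def actual_incentives(preset, base_loss, pool):
    """Top up preset incentive values from the pool where they fall short.

    preset    -- participant -> preset incentive distribution value
    base_loss -- participant -> total consumption (calculation + communication)
    pool      -- amount available in the incentive pool
    Returns (participant -> actual incentive distribution value, remaining pool).
    """
    actual = {}
    for participant, value in preset.items():
        shortfall = max(0.0, base_loss[participant] - value)
        if shortfall > pool:
            # Assumed behaviour: the source does not cover an exhausted pool.
            raise ValueError("incentive pool exhausted")
        pool -= shortfall
        actual[participant] = value + shortfall
    return actual, pool


# Hypothetical values: B's preset incentive does not cover its losses.
actual, remaining = actual_incentives(
    preset={"A": 5.0, "B": 1.0},
    base_loss={"A": 2.0, "B": 3.0},
    pool=10.0,
)
```

Participant A keeps its preset value, while participant B is raised from 1.0 to 3.0 and the pool shrinks by the 2.0 that was added.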

In one embodiment, as shown in fig. 5, step S40 includes:

S401: obtaining a basic loss value corresponding to each participant, and comparing the preset incentive distribution value corresponding to a participant with the basic loss value of the same participant.

As will be appreciated, the base loss value refers to the sum of the data calculations and communication losses that the participants have generated during the federal training process.

In one embodiment, before step S401, the method includes:

Determining, through a calculation loss function, the calculation loss value corresponding to each participant according to the hardware device parameters and the total amount of effective data corresponding to that participant.

It will be appreciated that the hardware device parameters may include the capacitance coefficient of the participant's terminal, the number of processing cycles of the terminal's CPU (Central Processing Unit), the processing cycle frequency of that CPU, and so on.

Specifically, the calculated loss value corresponding to each of the participants may be determined by the following expression:

where ei is the calculation loss value corresponding to the ith participant; ci is the total amount of effective data corresponding to the ith participant; di is the number of processing cycles of the CPU of the participant's terminal; fi is the processing cycle frequency of that CPU; the expression also uses the capacitance coefficient of the participant's terminal.

Determining, through a communication loss function, the communication loss value corresponding to each participant according to the communication transmission parameters corresponding to that participant.

It will be appreciated that the communication transmission parameters may include data transmission duration, transmission power, total amount of transmitted data, network bandwidth, etc.

Specifically, the communication loss value corresponding to each of the participants may be determined by the following expression:

F_i = τ_i p_i(s_i / τ_i)

where F_i is the communication loss value corresponding to the ith participant; τ_i is the transmission duration over which the ith participant transmits the initial training data set; p_i is the transmission power with which the ith participant transmits the initial training data set; s_i is the total amount of initial training data in the initial training data set transmitted by the ith participant; N_0 is the transmission background noise; h_i is the terminal channel gain of the ith participant; B is the network bandwidth.
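A minimal sketch of the communication loss follows. It reads p_i(·) as a power function of the transmission rate s_i / τ_i. The closed form used for that power function (the Shannon capacity formula, rate = B ln(1 + p h / N0), inverted for p) is an assumption made here to connect the listed symbols N0, h_i, and B; it is not stated in this text. Function and parameter names are likewise hypothetical.

```python
import math


def transmit_power(rate, noise, channel_gain, bandwidth):
    # ASSUMPTION: Shannon model, rate = B * ln(1 + p * h / N0), solved for p.
    return (noise / channel_gain) * (math.exp(rate / bandwidth) - 1.0)


def communication_loss(tau, data_size, noise, channel_gain, bandwidth):
    """F = tau * p(s / tau): transmission duration times the power needed for that rate."""
    rate = data_size / tau
    return tau * transmit_power(rate, noise, channel_gain, bandwidth)


# Hypothetical parameters chosen only for illustration.
NOISE, GAIN, BANDWIDTH = 0.5, 0.25, 10.0
size = 2.0 * BANDWIDTH * math.log(2.0)
loss_fast = communication_loss(2.0, size, NOISE, GAIN, BANDWIDTH)
loss_slow = communication_loss(4.0, size, NOISE, GAIN, BANDWIDTH)
```

Under this assumed model, sending the same amount of data over a longer duration lowers the communication loss, because the required power falls faster than the duration grows.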

Determining, through a product logarithm function, the loss cost corresponding to each participant according to the hardware device parameters and the communication transmission parameters corresponding to that participant.

The product logarithm function, also known as the Lambert W function, is used to determine the loss cost corresponding to each participant. It can be appreciated that, because the terminals of the participants are heterogeneous in hardware, the loss incurred by each participant differs; introducing a per-participant loss cost therefore improves the accuracy of the base loss value determined for each participant.

Specifically, the loss cost corresponding to each of the participants may be determined by the following expression:

where gi is the loss cost corresponding to the ith participant; W() is the product logarithm function; k is a loss parameter that may vary with the state of the CPU of each participant's terminal.
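The product logarithm W() has no elementary closed form, so it is usually evaluated numerically. The sketch below shows only how W itself can be computed on its principal branch for non-negative arguments using Newton's method; it does not attempt to reproduce the loss cost expression, and the function name `lambert_w` is a hypothetical helper.

```python
import math


def lambert_w(x, tol=1e-12, max_iter=100):
    """Principal branch W0(x) for x >= 0: the w >= 0 that solves w * exp(w) = x."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    # log(1 + x) lies at or above the root, so Newton's method descends to it.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


w_five = lambert_w(5.0)
```

Where SciPy is available, `scipy.special.lambertw` provides the same function, including the other branches.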

Determining the base loss value corresponding to each participant according to the calculation loss value, the communication loss value, and the loss cost corresponding to that participant.

Specifically, after the calculation loss value, the communication loss value, and the loss cost corresponding to each participant are determined, the base loss value corresponding to each participant is determined from them. Further, the base loss value corresponding to each participant may be determined according to the following expression:

where Ri is the base loss value corresponding to the ith participant; the expression also uses the best service capability value corresponding to the ith participant, which may be determined from the loss parameter k described above. It may be understood that, for different terminals, the best service capability value corresponding to each loss parameter k is preset and the mapping is stored in a preset storage table, so that once the loss parameter k is determined, the best service capability value can be obtained by querying that table.

S402: When the preset incentive distribution value is greater than or equal to the base loss value, record the preset incentive distribution value as the actual incentive distribution value of the corresponding participant.

Specifically, after the base loss value corresponding to each participant is obtained, the preset incentive distribution value is compared with the base loss value of the same participant. When the preset incentive distribution value is greater than or equal to the base loss value, this indicates that the participant transmitted a large total amount of effective training data of relatively high quality, so the preset incentive distribution value is recorded as the actual incentive distribution value corresponding to the participant.

S403: When the preset incentive distribution value is smaller than the base loss value, record the base loss value as the actual incentive distribution value of the corresponding participant according to the preset incentive determination strategy.

Specifically, after the base loss value corresponding to each participant is obtained, the preset incentive distribution value is compared with the base loss value of the same participant. When the preset incentive distribution value is smaller than the base loss value, this indicates that the participant transmitted a small total amount of effective training data of relatively low quality. In that case, under the preset incentive determination strategy, the preset incentive distribution value is topped up from the incentive pool of the federal system until it equals the base loss value, and the base loss value is recorded as the actual incentive distribution value corresponding to the participant.
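Steps S401 to S403 amount to taking, for each participant, the larger of the preset incentive distribution value and the base loss value, with any shortfall drawn from the incentive pool. A minimal sketch follows; the function name, the dictionary layout, and the sample numbers are hypothetical.

```python
def settle_incentives(preset, base_loss):
    """Return (actual incentive per participant, amount drawn from the pool per participant)."""
    actual = {}
    pool_top_up = {}
    for participant, preset_value in preset.items():
        loss = base_loss[participant]
        if preset_value >= loss:
            # S402: the preset value already covers the base loss.
            actual[participant] = preset_value
            pool_top_up[participant] = 0.0
        else:
            # S403: top up from the incentive pool to the base loss value.
            actual[participant] = loss
            pool_top_up[participant] = loss - preset_value
    return actual, pool_top_up


# Hypothetical values for illustration only.
actual, top_up = settle_incentives(
    preset={"X": 12.0, "Y": 3.0},
    base_loss={"X": 5.0, "Y": 4.5},
)
```

Returning the top-up amounts separately makes it easy to check that the incentive pool holds enough to cover all shortfalls before the distribution task in S50 is executed.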

S50: Execute the federal incentive distribution task according to the actual incentive distribution value corresponding to each participant.

Specifically, after the actual incentive distribution values corresponding to the respective participants are determined according to the preset incentive distribution values and the preset incentive determination policy, a federal incentive distribution task is performed according to the actual incentive distribution values corresponding to the respective participants to distribute the actual incentive distribution values corresponding to the participants to the respective participants.

In this embodiment, the contribution of each participant to the federal training of the federal system is evaluated from the total amount of effective data in the effective training data set transmitted by that participant and from the training quality vector corresponding to that data set, so that the incentive matches the contribution value of each participant. In addition, a preset incentive determination strategy is introduced so that participants who contribute less to the federal training can still obtain an incentive that matches their base loss value. All participants therefore receive a positive incentive, which attracts more participants to provide more and better training data to the federal system and improves the comprehensive benefit of the federal system.

In another specific embodiment, to ensure the privacy and security of the initial training data set in the above embodiments, the initial training data set may be stored in a blockchain. A blockchain is an encrypted, chained transaction storage structure formed of blocks.

For example, the header of each block may include both the hash value of all transactions in that block and the hash value of all transactions in the previous block, so that transactions in the block are protected against tampering and forgery on the basis of the hash values. Newly generated transactions are filled into a block and, after consensus is reached among the nodes of the blockchain network, the block is appended to the tail of the blockchain, so that the chain grows.
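The header structure described above can be sketched as a simple hash chain. This is an illustration only: the field names `tx_hash` and `prev_tx_hash`, the use of SHA-256 over a JSON encoding, and the all-zero value for the first block are assumptions, and consensus among nodes is not modeled.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the block before the first one


def transactions_hash(transactions):
    """Hash all transactions of a block (SHA-256 over a canonical JSON encoding)."""
    encoded = json.dumps(transactions, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain, transactions):
    """Append a block whose header stores its own and the previous block's transaction hash."""
    prev = chain[-1]["header"]["tx_hash"] if chain else GENESIS
    header = {"tx_hash": transactions_hash(transactions), "prev_tx_hash": prev}
    chain.append({"header": header, "transactions": list(transactions)})


def verify(chain):
    """Return True if every header matches its transactions and links to its predecessor."""
    prev = GENESIS
    for block in chain:
        header = block["header"]
        if header["tx_hash"] != transactions_hash(block["transactions"]):
            return False
        if header["prev_tx_hash"] != prev:
            return False
        prev = header["tx_hash"]
    return True


chain = []
append_block(chain, [{"participant": "X", "dataset": "batch-1"}])
append_block(chain, [{"participant": "Y", "dataset": "batch-2"}])
intact = verify(chain)
```

Altering a stored transaction changes its hash, so the header of the altered block no longer matches and verification fails.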

It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present invention in any way.

In one embodiment, a federal excitation distribution device is provided that corresponds one-to-one to the federal excitation distribution method of the above embodiments. As shown in fig. 6, the federal excitation allocation apparatus includes a data processing module 10, a federal excitation depth determination module 20, a preset excitation allocation value determination module 30, an actual excitation allocation value determination module 40, and an excitation allocation task performing module 50. The functional modules are described in detail as follows:

a data processing module 10, configured to receive an initial training data set sent by each participant joining the federal system, and determine, from the initial training data sets, the effective training data set corresponding to each participant and its training quality vector; the effective training data set of each participant is associated with a total amount of effective data;

a federal excitation depth determination module 20, configured to determine a federal excitation depth of the federal system according to training quality vectors and total amount of valid data corresponding to each of the participants;

a preset incentive distribution value determining module 30, configured to determine a contribution value of each of the participants by using a marginal utility measurement method, and determine a preset incentive distribution value corresponding to each of the participants according to the federal incentive depth and the contribution value corresponding to each of the participants;

an actual incentive distribution value determining module 40, configured to determine an actual incentive distribution value corresponding to each of the participants according to the preset incentive distribution value and a preset incentive determination policy;

an incentive distribution task execution module 50 for executing federal incentive distribution tasks based on the actual incentive distribution values corresponding to each of the participants.

Preferably, as shown in fig. 7, the initial training data set includes at least one initial training data; the data processing module 10 includes:

a data cleansing unit 101, configured to receive a data cleansing instruction including a training requirement, so as to perform data cleansing processing on the initial training set of each of the participants, so as to reject initial training data in the initial training set that does not meet the training requirement;

an effective data determining unit 102, configured to record, as the effective training data set, an initial training set after eliminating initial training data that does not meet the training requirement;

the feature evaluation unit 103 is configured to input the valid training data sets into a federal feature engineering module, perform feature evaluation processing on the valid training data sets through the federal feature engineering module, and determine training quality vectors corresponding to the valid training data sets.

Preferably, as shown in FIG. 8, the federal excitation depth determination module 20 includes:

a matching quality vector determining unit 201, configured to determine a matching quality vector from the training quality vectors corresponding to the participants; the matching quality vector is the training quality vector with the highest degree of matching with the training requirement;

an average quality vector determining unit 202, configured to determine an average quality vector by using a mathematical expectation algorithm according to training quality vectors corresponding to the participants;

an average effective data amount determining unit 203, configured to determine an average effective data amount by using a mathematical expectation algorithm according to the total effective data amount corresponding to each of the participants;

and the federal excitation depth determining unit 204 is configured to obtain a maximum data carrying capacity of the federal system, and determine the federal excitation depth according to the matching quality vector, the average effective data, and the maximum data carrying capacity.

Preferably, the federal excitation depth determination unit 204 includes:

a service total parameter determining subunit, configured to receive system service parameters sent by each of the participants, and determine a service total parameter of the federal system according to the system service parameters of each of the participants;

a parameter acquisition subunit, configured to acquire a first preset quantity decision parameter, a second preset quantity decision parameter, a first preset depth decision parameter, and a second preset depth decision parameter of the federal system; the second preset quantity decision parameter is greater than the first preset quantity decision parameter;

a first federal excitation depth determination subunit, configured to determine the federal excitation depth according to the service total parameter, the matching quality vector, the average effective data amount, and the maximum data carrying capacity when the average effective data amount is smaller than the first preset quantity decision parameter;

a second federal excitation depth determination subunit, configured to determine the federal excitation depth according to the first preset depth decision parameter, the service total parameter, the matching quality vector, the average quality vector, the average effective data amount, and the maximum data carrying capacity when the average effective data amount is greater than or equal to the first preset quantity decision parameter and smaller than the second preset quantity decision parameter;

a third federal excitation depth determination subunit, configured to determine the federal excitation depth according to the second preset depth decision parameter, the service total parameter, the matching quality vector, the average effective data amount, and the maximum data carrying capacity when the average effective data amount is greater than or equal to the second preset quantity decision parameter.
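The three subunits partition the range of the average effective data amount into three regimes separated by the two quantity decision parameters. The sketch below shows only that dispatch; the depth expressions themselves are not reproduced in this section, so the function returns the regime rather than a depth. All names are hypothetical.

```python
def select_depth_regime(average_amount, first_quantity, second_quantity):
    """Return which depth determination subunit applies: 'first', 'second', or 'third'."""
    if not first_quantity < second_quantity:
        raise ValueError("the second quantity decision parameter must exceed the first")
    if average_amount < first_quantity:
        return "first"   # no depth decision parameter is used
    if average_amount < second_quantity:
        return "second"  # uses the first preset depth decision parameter
    return "third"       # uses the second preset depth decision parameter


# Hypothetical thresholds for illustration only.
regimes = [select_depth_regime(a, 100.0, 500.0) for a in (50.0, 100.0, 499.0, 500.0)]
```

The boundaries are half-open, so an average amount exactly equal to a quantity decision parameter falls into the higher regime.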

Preferably, the preset stimulus distribution value determination module 30 includes:

a marginal utility determining unit, configured to determine the marginal utility of each participant to the federal system by using a Shapley algorithm according to the effective training data set corresponding to each participant;

and the contribution value determining unit is used for determining the contribution value corresponding to each participant according to the marginal utility corresponding to each participant.

Preferably, as shown in fig. 9, the actual incentive assigning value determination module 40 includes:

a basic loss value obtaining unit 401, configured to obtain basic loss values corresponding to the participants, and compare the preset excitation distribution value corresponding to the same participant with the basic loss values;

a first actual excitation distribution value determining unit 402, configured to record, as the preset excitation distribution value, an actual excitation distribution value of the participant corresponding to the preset excitation distribution value when the preset excitation distribution value is greater than or equal to the base loss value;

a second actual incentive distribution value determining unit 403, configured to record, when the preset incentive distribution value is smaller than the basic loss value, the actual incentive distribution value of the participant corresponding to the basic loss value as the basic loss value according to the preset incentive determination policy.

Preferably, the federal excitation distribution arrangement further comprises:

the calculation loss value determining module is used for determining calculation loss values corresponding to all the participants according to hardware equipment parameters corresponding to all the participants and the total effective data through calculation loss functions;

the communication loss value determining module is used for determining the communication loss value corresponding to each participant according to the communication transmission parameters corresponding to each participant through a communication loss function;

the loss cost determining module is used for determining loss cost corresponding to each participant according to the hardware equipment parameters corresponding to each participant and the communication transmission parameters through a product logarithmic function;

and the basic loss value determining module is used for determining the basic loss value corresponding to each participant according to the calculated loss value, the communication loss value and the loss cost corresponding to each participant.

For specific limitations of the federal incentive distribution device, reference may be made to the limitations of the federal incentive distribution method above, which are not repeated here. The modules in the federal incentive distribution device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded, in hardware form, in a processor of the computer device or be independent of that processor, or may be stored, in software form, in a memory of the computer device, so that the processor can call them and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used to store data used by the federal incentive allocation method in the above embodiments. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a federal incentive assigning method.

In one embodiment, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the federal incentive allocation method of the above embodiments when the computer program is executed by the processor.

In one embodiment, a computer readable storage medium is provided having a computer program stored thereon that when executed by a processor implements the federal incentive assigning method of the above embodiments.

Those skilled in the art will appreciate that all or part of the processes of the methods described above may be implemented by a computer program stored on a non-transitory computer readable storage medium; when executed, the program may carry out the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims

1. A federal incentive distribution method, comprising:

performing federal incentive assigning tasks based on the actual incentive assigning values corresponding to each of the participants;

the step of determining the federal excitation depth of the federal system according to the training quality vector and the total effective data amount corresponding to each participant comprises the following steps:

determining a matching quality vector from training quality vectors corresponding to the participants; the matching quality vector is the training quality vector with the highest degree of matching with the training requirement;

determining an average quality vector by adopting a mathematical expectation algorithm according to the training quality vector corresponding to each participant;

determining average effective data quantity by adopting a mathematical expectation algorithm according to the effective data quantity corresponding to each participant;

obtaining the maximum data bearing capacity of the federal system, and determining the federal excitation depth according to the matching quality vector, the average effective data quantity and the maximum data bearing capacity;

the determining the federal excitation depth according to the matching quality vector, the average effective data, and the maximum data bearing capacity includes:

receiving system service parameters sent by each participant, and determining service overall parameters of the federal system according to the system service parameters of each participant;

acquiring a first preset quantity decision parameter, a second preset quantity decision parameter, a first preset depth decision parameter and a second preset depth decision parameter of the federal system; the second preset quantity decision parameter is greater than the first preset quantity decision parameter;

when the average effective data volume is smaller than the first preset quantity decision parameter, determining the federal excitation depth according to the service total parameter, the matching quality vector, the average effective data and the maximum data bearing capacity;

when the average effective data volume is larger than or equal to the first preset quantity decision parameter and smaller than the second preset quantity decision parameter, determining the federal excitation depth according to the first preset depth decision parameter, a service total parameter, a matching quality vector, an average quality vector, average effective data and a maximum data bearing capacity;

2. The federal excitation allocation method according to claim 1, wherein the initial training data set includes at least one initial training data; the determining, from the initial training data set, an effective training data set corresponding to each of the participants and a training quality vector corresponding to each of the effective training data sets, comprising:

receiving a data cleaning instruction containing a training requirement, so as to perform data cleaning processing on the initial training data set of each participant, and eliminating initial training data which does not meet the training requirement in the initial training data set;

recording an initial training data set after eliminating initial training data which does not meet the training requirement as the effective training data set;

inputting the effective training data sets into a federal feature engineering module, performing feature evaluation processing on the effective training data sets through the federal feature engineering module, and determining training quality vectors corresponding to the effective training data sets.

3. The federal incentive distribution method according to claim 1, wherein said determining a contribution value for each of the parties using a marginal utility metric method comprises:

determining the marginal utility of each participant to the federal system by using a Shapley algorithm according to the effective training data set corresponding to each participant;

4. The federal incentive assigning method of claim 1, wherein said determining actual incentive assigning values corresponding to each of said parties based on said preset incentive assigning values and a preset incentive determination strategy comprises:

obtaining basic loss values corresponding to all the participants, and comparing the preset excitation distribution values corresponding to the same participant with the basic loss values;

recording an actual excitation allocation value of the participant corresponding to the preset excitation allocation value as the preset excitation allocation value when the preset excitation allocation value is greater than or equal to the base loss value;

and when the preset excitation distribution value is smaller than the basic loss value, recording the actual excitation distribution value of the participant corresponding to the basic loss value as the basic loss value according to the preset excitation determination strategy.

5. The federal incentive distribution method according to claim 4, wherein, before obtaining the base loss values corresponding to each of the participants, the method further comprises:

determining a calculation loss value corresponding to each participant according to the hardware equipment parameter corresponding to each participant and the total effective data through calculation loss function;

determining a communication loss value corresponding to each participant according to the communication transmission parameters corresponding to each participant through a communication loss function;

determining loss cost corresponding to each participant according to hardware equipment parameters corresponding to each participant and the communication transmission parameters through a product logarithmic function;

6. A federal excitation distribution device for performing the federal excitation distribution method according to any one of claims 1 to 5, the federal excitation distribution device comprising:

7. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the federal incentive allocation method of any one of claims 1 to 5 when the computer program is executed.

8. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the federal incentive allocation method of any of claims 1 to 5.