WO2023134181A1

WO2023134181A1 - Resource allocation method, apparatus and system based on federated learning

Info

Publication number: WO2023134181A1
Application number: PCT/CN2022/117168
Authority: WO
Inventors: 刘嘉; 李增祥
Original assignee: 新智我来网络科技有限公司
Priority date: 2022-01-17
Filing date: 2022-09-06
Publication date: 2023-07-20

Abstract

The present disclosure provides a resource allocation method, apparatus and system based on federated learning. The method comprises: reading preset resource allocation configuration information; obtaining model demand information provided by a plurality of model demanders, the resource allocation configuration information comprising attribute configuration information, contribution degree configuration information and monitoring configuration information, and determining a target demander according to the attribute configuration information and the model demand information; determining a plurality of target resource contributors matched with the model demand information, and obtaining model resources of each target resource contributor; and determining an allocation value corresponding to each target resource contributor according to the attribute configuration information, the contribution degree configuration information, the monitoring configuration information and the model resources. According to the present disclosure, each target resource contributor can obtain the allocation value matched with the model resources provided by the target resource contributor, the enthusiasm of the resource contributor to provide model resources is stimulated, and long-term good sustainable development of federated learning is facilitated.

Description

A resource allocation method, device and system based on joint learning

Technical field

The present disclosure relates to the technical field of machine learning, and in particular to a resource allocation method, device and system based on joint learning.

Background

Against the background of growing attention to privacy and data protection, federated learning has become a very popular research direction in the field of artificial intelligence. Federated learning refers to the comprehensive use of multiple AI (artificial intelligence) technologies, on the premise of ensuring data security and user privacy, so that multiple parties cooperate to jointly mine the value of data and to create new intelligent business forms and models based on joint modeling.

For federated learning, the continued involvement of participants in the learning process (for example, by sharing encrypted model parameters) is the key to its long-term success. However, in the prior art, most evaluation and reward mechanisms for joint learning distribute rewards equally, which inevitably leads to unreasonable and unfair distribution. As a result, participants holding sufficient valid data may be lost, which is undoubtedly detrimental to the long-term sustainable development of joint learning.

Summary of the invention

In view of this, the embodiments of the present disclosure provide a resource allocation method, device, and system based on joint learning, to solve the following problem in the prior art: most evaluation and reward mechanisms for joint learning distribute rewards equally, which inevitably leads to unreasonable and unfair allocation, easily causes the loss of participants who hold sufficient valid data because of unfair incentive distribution and similar reasons, and is not conducive to the long-term sustainable development of joint learning.

The first aspect of the embodiments of the present disclosure provides a resource allocation method based on joint learning, including:

Read preset resource allocation configuration information, where the resource allocation configuration information includes attribute configuration information, contribution configuration information and monitoring configuration information;

Obtain model demand information provided by multiple model demand parties, and determine the target demand party according to attribute configuration information and model demand information;

Determine multiple target resource contributors that match the model requirement information, and obtain the model resources of each target resource contributor. The model resources include model parameters and effective training data volume;

According to attribute configuration information, contribution degree configuration information, monitoring configuration information and model resources, determine the distribution value corresponding to each target resource contributor, and feed back the distribution value to each target resource contributor.

The second aspect of the embodiments of the present disclosure provides a resource allocation device based on joint learning, including:

The reading module is configured to read preset resource allocation configuration information, and the resource allocation configuration information includes attribute configuration information, contribution degree configuration information and monitoring configuration information;

The demander determination module is configured to obtain model demand information provided by multiple model demand parties, and determine the target demand party according to attribute configuration information and model demand information;

The resource acquisition module is configured to determine multiple target resource contributors that match the model requirement information, and acquire model resources of each target resource contributor, where the model resources include model parameters and effective training data volume;

The allocation module is configured to determine the allocation value corresponding to each target resource contributor according to attribute configuration information, contribution configuration information, monitoring configuration information and model resources, and feed back the allocation value to each target resource contributor.

The third aspect of the embodiments of the present disclosure provides a resource allocation system based on joint learning, including a coordination center, and a contributor transmission module and a demand-side transmission module that are each communicatively connected to the coordination center;

The demand-side transmission module is configured to send model demand information to the coordination center according to a preset time step, where the model demand information includes a demand model;

The contributor transmission module is configured to send an invitation application to the coordination center when the model training information is received and it is determined to participate in the model training;

The coordination center is configured to generate model training information according to the model requirement information, and broadcast the model training information. The model training information includes the preset basic model, training sample type, sample size required for each round and participation strategy;

The coordination center is further configured to lock the training resources of at least two target contributors according to the invitation applications and start the preset training program, so that each target contributor uses its training resources to train the basic model until the preset end condition is met and the global model is obtained; and to calculate the contribution allocation resources that each target contributor deserves and feed back the contribution allocation resources to the corresponding target contributors.

According to the fourth aspect of the embodiments of the present disclosure, a resource allocation method of a joint learning-based resource allocation system is provided, including:

Receive model demand information sent by the demander, where the demander is one of multiple participants;

Generate model training information according to the model requirement information, and broadcast the model training information. The model training information includes the preset basic model, training sample type, sample size required for each round and participation strategy;

Identify contributors among the multiple participants;

Respond to the message that a contributor sends, based on the model training information, to confirm its participation in model training;

According to the message, lock the training resources of at least two target contributors and start the preset training program, so that each target contributor uses its training resources to train the basic model until the preset end condition is met; aggregate the model parameters provided by each target contributor to obtain the global model; calculate the contribution allocation resources that each target contributor should get; and feed back the contribution allocation resources to the corresponding target contributors.

A fifth aspect of the embodiments of the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor. The processor implements the steps of the above method when executing the computer program.

A sixth aspect of the embodiments of the present disclosure provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the foregoing method are implemented.

Compared with the prior art, the beneficial effects of the embodiments of the present disclosure at least include the following. The embodiments read preset resource allocation configuration information, which includes attribute configuration information, contribution degree configuration information, and monitoring configuration information; obtain model demand information provided by multiple model demanders and determine the target demander according to the attribute configuration information and the model demand information; determine multiple target resource contributors that match the model demand information and obtain the model resources of each target resource contributor, the model resources including model parameters and effective training data volume; and, according to the attribute configuration information, contribution degree configuration information, monitoring configuration information, and model resources, determine the allocation value corresponding to each target resource contributor and feed back the allocation value to each target resource contributor. This fully considers factors such as the contribution of each target resource contributor to the utility of the model required by the model demander, so that each target resource contributor obtains an allocation value that matches the model resources it provides. This not only stimulates the enthusiasm of resource contributors to provide model resources, but also constrains them to provide real and effective model parameters, which facilitates the long-term sustainable development of joint learning.

Brief description of the drawings

In order to illustrate the technical solutions in the embodiments of the present disclosure more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of a joint learning architecture according to an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of a resource allocation method based on joint learning provided by an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of a resource allocation device based on joint learning provided by an embodiment of the present disclosure;

FIG. 4 is a schematic structural diagram of a resource allocation system based on joint learning according to an embodiment of the present disclosure;

FIG. 5 is a sequence diagram of another resource allocation method based on joint learning provided by an embodiment of the present disclosure;

FIG. 6 is a schematic flowchart of another resource allocation method based on joint learning provided by an embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.

Detailed description

In the following description, for the purpose of illustration rather than limitation, specific details such as specific system structures and techniques are presented for a thorough understanding of the embodiments of the present disclosure. It will be apparent, however, to one skilled in the art that the present disclosure may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.

Federated learning refers to the comprehensive use of multiple AI (artificial intelligence) technologies, on the premise of ensuring data security and user privacy, so that multiple parties cooperate to jointly mine the value of data and to create new intelligent business forms and models based on joint modeling. Federated learning has at least the following characteristics:

(1) A weakly centralized joint training mode in which participating nodes control their own data, ensuring data privacy and security in the process of co-creating intelligence.

(2) In different application scenarios, AI algorithms and privacy-preserving computation are screened and/or combined to establish multiple model aggregation optimization strategies, so as to obtain high-level, high-quality models.

(3) On the premise of ensuring data security and user privacy, methods for improving the performance of the joint learning engine are obtained based on the multiple model aggregation optimization strategies. Such methods may improve the overall performance of the joint learning engine by addressing information interaction, intelligent perception, exception handling mechanisms, and the like under parallel computing architectures and large-scale cross-domain networks.

(4) Obtain the needs of multi-party users in each scenario, determine and reasonably evaluate the true contribution of each joint participant through the mutual trust mechanism, and distribute incentives.

Based on the above methods, it is possible to establish an AI technology ecology based on joint learning, give full play to the value of industry data, and promote the implementation of scenarios in vertical fields.

In the embodiment of the present disclosure, the architecture of the federated learning may include a server, multiple contributors, and multiple demanders, and its specific structure and functions may be adjusted according to specific needs. The resource allocation method and device for joint learning according to the embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

FIG. 1 is a schematic structural diagram of a federated learning architecture according to an embodiment of the disclosure. As shown in FIG. 1, the architecture of joint learning can include a server 101, multiple contributors 102, and multiple demanders 103, where the server 101 is the resource allocation center, each contributor 102 is a target resource contributor, and each demander 103 is a model demander.

In the joint learning process, the resource allocation configuration information that the resource allocation center 101 may use in the subsequent simulated auction scenario can be manually configured in advance. The resource allocation configuration information includes attribute configuration information, contribution configuration information, and monitoring configuration information. When the simulated auction program starts, the resource allocation center 101 reads the preset resource allocation configuration information, obtains the model demand information provided by multiple model demanders 103, and determines the target demander according to the attribute configuration information and the model demand information. It then determines a plurality of target resource contributors 102 that match the model demand information and obtains the model resources of each target resource contributor, the model resources including model parameters and effective training data volume. Finally, according to the attribute configuration information, contribution configuration information, monitoring configuration information, and model resources, it determines the allocation value corresponding to each target resource contributor and feeds back the allocation value to each target resource contributor, thus completing the entire model resource auction process.

The embodiments of the present disclosure fully consider factors such as the contribution of each target resource contributor to the utility of the model required by the model demander, so that each target resource contributor can obtain an allocation value that matches the model resources it provides. This not only stimulates the enthusiasm of resource contributors to provide model resources, but also constrains resource contributors to provide real and effective model parameters, which facilitates the long-term sustainable development of joint learning.

It should be noted that the number and types of target resource contributors and model demanders can be set according to actual conditions and are not specifically limited in this disclosure.

FIG. 2 is a schematic flowchart of a resource allocation method based on joint learning provided by an embodiment of the present disclosure. The resource allocation method based on joint learning in FIG. 2 may be executed by the resource allocation center 101 in FIG. 1. As shown in FIG. 2, the resource allocation method based on joint learning includes:

Step S201, read preset resource allocation configuration information, the resource allocation configuration information includes attribute configuration information, contribution degree configuration information and monitoring configuration information.

As an example, the resource allocation configuration information may be obtained by manually preconfiguring specific parameter values, or may be randomly generated within a manually preconfigured parameter range.

Specifically, the attribute configuration information may include the auction method, including but not limited to a first-price auction, a VCG (Vickrey-Clarke-Groves) auction, and the like. In a first-price auction, the highest bidder wins. In a VCG auction, the loss that the winning bidder's acquisition of the auction item causes to the overall bidding revenue is calculated; in theory, this loss is the fee that the winning bidder should pay.

The contribution configuration information includes but is not limited to the contribution measurement method, such as distribution by marginal contribution (the benefit of each node is the utility generated when it joins the team) or distribution based on the Shapley value (designed to exclude the effect of the order in which nodes join the aggregate, thereby estimating each node's contribution to the aggregate more fairly).

The monitoring configuration information includes but is not limited to training data deviation (the degree of deviation between the training data provided by the resource contributor and the model samples required by the model demander), resource deviation (generally, the deviation between the computing resources owned by the resource contributor and the computing resources required for training the model), online stability deviation (generally, the deviation between the probability that the resource contributor goes offline during joint training and the preset online rate), and so on.

As an example, a data structure for the configuration object of the resource allocation center 101 can be generated in the platform memory by running a preset computer program. The data structure can be a one-dimensional array of the form "[attribute configuration information, contribution configuration information, monitoring configuration information]".
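As a non-limiting illustration, such a configuration object might be sketched in Python as follows. The class name `ResourceAllocationConfig`, its field names, and the string values are hypothetical and are not taken from the disclosure; only the three-part layout follows the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceAllocationConfig:
    """Hypothetical container for the three kinds of configuration information."""
    attribute: str      # auction method, e.g. "first_price" or "vcg" (assumed labels)
    contribution: str   # contribution measurement, e.g. "marginal" or "shapley"
    monitoring: tuple   # names of the monitored deviation items


config = ResourceAllocationConfig(
    attribute="first_price",
    contribution="marginal",
    monitoring=(
        "training_data_deviation",
        "resource_deviation",
        "online_stability_deviation",
    ),
)

# One-dimensional array form: [attribute, contribution, monitoring]
config_array = [config.attribute, config.contribution, config.monitoring]
print(config_array)
```

The frozen dataclass is only one possible choice; it keeps the configuration read-only once the simulated auction program has loaded it.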

Step S202, acquiring model demand information provided by multiple model demand parties, and determining a target demand party according to attribute configuration information and model demand information.

As an example, communication channels between the resource allocation center 101 and multiple target resource contributors 102 and multiple model demanders 103 may be constructed in advance. The resource allocation center 101 can monitor the communication channel after starting the simulation program, and realize communication and information exchange with multiple target resource contributors 102 and multiple model demanders 103.

In some embodiments, the above model demand information includes a demand model. Determining the target demander according to the attribute configuration information and the model demand information specifically includes: classifying the multiple model demanders according to their demand models to obtain the model demander set corresponding to each demand model; and selecting the target demander from the model demander set according to the attribute configuration information.

Specifically, when running the preset model auction simulation program, the resource allocation center 101 can start monitoring the communication channel and receive, through the communication channel, the model demand information provided by multiple model demanders 103. The model demand information includes the demand model of the model demander. The demand model here can be determined according to the actual business needs of the model demander. For example, if the business need of the model demander is to improve the face recognition accuracy of an attendance system, then the demand model can be a face recognition model.

Then, according to the obtained demand model of each model demander, the model demanders are classified to obtain a model demander set corresponding to each demand model, where each model demander set includes at least one model demander.

For example, assume that there are currently five model demanders, namely model demanders A, B, C, D, and E. The demand model of model demanders A and B is a face recognition model, and the demand model of model demanders C, D, and E is a load forecasting model. Then, according to the demand model, the five model demanders can be divided into two categories: model demander set 01 corresponding to the face recognition model (including model demanders A and B), and model demander set 02 corresponding to the load forecasting model (including model demanders C, D, and E).
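The classification in this example can be sketched as a simple grouping step. The following Python fragment is an illustrative sketch only; the dictionary layout and the function name `group_by_demand_model` are assumptions, not part of the disclosure.

```python
from collections import defaultdict

# Hypothetical demand information: model demander -> demand model
model_demands = {
    "A": "face_recognition",
    "B": "face_recognition",
    "C": "load_forecasting",
    "D": "load_forecasting",
    "E": "load_forecasting",
}


def group_by_demand_model(demands):
    """Return a mapping from each demand model to its set of model demanders."""
    groups = defaultdict(list)
    for demander, demand_model in demands.items():
        groups[demand_model].append(demander)
    return dict(groups)


demander_sets = group_by_demand_model(model_demands)
print(demander_sets)
```

Here the entry for `face_recognition` plays the role of model demander set 01 and the entry for `load_forecasting` plays the role of model demander set 02.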

In some embodiments, screening out the target demander from the model demander set according to the attribute configuration information specifically includes: obtaining the budget resources of each model demander in the model demander set; and determining the model demander with the most budget resources as the target demander.

Assume that the given attribute configuration information is a first-price auction, that is, the one with the highest price wins. Then, the budget resources of each model demander can be further obtained, where the budget resources can refer to budget expenses, that is, the budget for purchasing the required models. Then, compare the size of budget resources between any pair of model demanders in each set of model demanders, and determine the model demander with the most budget resources (that is, the highest bid) as the target demander.

Exemplarily, if the budget resource of model demander A in model demander set 01 is X yuan, the budget resource of model demander B is Y yuan, and X>Y, then model demander A is determined as the target demander.
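A minimal sketch of this first-price selection, assuming the budgets are available as a mapping from demander to amount, could look as follows. The function name `select_target_demander` and the numeric budgets standing in for X and Y are hypothetical.

```python
def select_target_demander(budgets):
    """Return the model demander with the largest budget resource (first-price rule)."""
    if not budgets:
        raise ValueError("the model demander set is empty")
    # max() keeps the first demander encountered when budgets are tied
    return max(budgets, key=budgets.get)


# Hypothetical budgets for model demander set 01: X = 1200 yuan, Y = 900 yuan
set_01_budgets = {"A": 1200.0, "B": 900.0}
target_demander = select_target_demander(set_01_budgets)
print(target_demander)
```

How ties are broken is not addressed in the text; the sketch simply keeps the first demander with the maximum budget.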

Step S203, determine multiple target resource contributors matching the model requirement information, and obtain the model resources of each target resource contributor, where the model resources include model parameters and effective training data volume.

In some embodiments, the resource allocation center 101 can broadcast model bidding information through a preset communication channel so that each resource contributor can receive it. The model bidding information includes the demand model, the required training samples, the required number of samples, and the incentive coefficient. The resource allocation center 101 then receives the model information of the models to be auctioned that multiple resource contributors feed back based on the model bidding information.

Wherein, the required training sample type can be determined according to the requirement model, assuming that the requirement model is a face recognition model, then the required training samples can be pictures/images/videos containing faces. The required number of samples usually refers to the sample size that meets at least one round of joint training requirements. For example, if a round of training requires 100 samples, then the number of samples required is 100.

The incentive coefficient is related to the degree of correlation between the training sample and the demand model. Usually, the higher the degree of correlation, the greater the corresponding incentive coefficient. As an example, assuming that the required training samples are pictures/images/videos containing faces, training samples that contain faces can be regarded as samples with high correlation, and the corresponding incentive coefficient can be set to 0.8; training samples that do not contain faces can be regarded as samples with low correlation (or no correlation), and the corresponding incentive coefficient can be set to 0.2.

As an example, suppose a resource contributor (a participant with training data) receives, through the above communication channel, the model bidding information broadcast by the resource allocation center 101: the required model is a face recognition model, the required training samples are pictures/images/videos containing faces, the required number of samples is 100 per round, and the incentive coefficient is 0.8 for samples containing faces and 0.2 for samples without faces. The resource contributor can then decide whether to participate in this bidding activity according to its own training data and the above model bidding information.
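For illustration only, the model bidding information and a contributor's decision might be sketched as below. The text does not specify the decision rule, so the rule used here (participate when the contributor holds at least one round's worth of matching samples) is an assumption, as are the names `ModelBiddingInfo`, `incentive_coefficient`, and `should_participate`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelBiddingInfo:
    """Hypothetical layout of the broadcast model bidding information."""
    demand_model: str
    required_sample_type: str
    samples_per_round: int
    incentive_high: float   # coefficient for samples of the required type
    incentive_low: float    # coefficient for all other samples


def incentive_coefficient(info, sample_type):
    """Look up the incentive coefficient for one sample type."""
    if sample_type == info.required_sample_type:
        return info.incentive_high
    return info.incentive_low


def should_participate(info, local_sample_types):
    """Assumed rule: join only if enough matching samples exist for one round."""
    matching = sum(1 for t in local_sample_types if t == info.required_sample_type)
    return matching >= info.samples_per_round


bidding_info = ModelBiddingInfo("face_recognition", "face", 100, 0.8, 0.2)
local_data = ["face"] * 120 + ["landscape"] * 30
print(should_participate(bidding_info, local_data))
```

A real contributor could weigh other factors as well, such as its computing resources or the expected reward implied by the incentive coefficients.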

In some embodiments, the model information of the models to be auctioned reported by multiple resource contributors can be obtained first, where the model information includes the model type of the model to be auctioned; then the similarity between the model type of each model to be auctioned and the model type of the demand model can be calculated; finally, multiple target resource contributors matching the model type of the demand model are determined according to the similarity.

As an example, a resource contributor that decides to participate in this bidding activity, based on the above model bidding information and its own training data, can use the training data it holds to train the basic model issued by the resource allocation center 101 to obtain a trained model, and feed back the model information of the trained model (that is, the model information of the model to be auctioned) to the resource allocation center 101.

As an example, after obtaining the model information (including the model type of the model to be auctioned) reported by each resource contributor, the resource allocation center 101 can calculate the similarity (for example, cosine similarity) between the model type of the model to be auctioned reported by each resource contributor and the model type of the demand model, and then sort the resource contributors in descending order of similarity to obtain a sorted list. Finally, according to actual needs, at least two resource contributors are selected as target resource contributors in the order of the sorted list.
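The ranking step can be sketched as follows. The text does not say how a model type is represented numerically, so this sketch assumes each model type is encoded as a feature vector; the vectors, contributor names, and the function `select_target_contributors` are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_target_contributors(demand_vector, reported, k=2):
    """Sort contributors by similarity (largest first) and keep the first k."""
    ranked = sorted(
        reported,
        key=lambda name: cosine_similarity(demand_vector, reported[name]),
        reverse=True,
    )
    return ranked[:k]


# Assumed encodings of the demand model type and the reported model types
demand_vector = [1.0, 0.0, 0.0]
reported_models = {
    "P1": [0.9, 0.1, 0.0],
    "P2": [0.0, 1.0, 0.0],
    "P3": [0.7, 0.0, 0.7],
}
targets = select_target_contributors(demand_vector, reported_models, k=2)
print(targets)
```

The value `k=2` reflects the requirement that at least two resource contributors be selected; a larger value can be used according to actual needs.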

Step S204, according to attribute configuration information, contribution degree configuration information, monitoring configuration information and model resources, determine the distribution value corresponding to each target resource contributor, and feed back the distribution value to each target resource contributor.

In some embodiments, the auction value of each target resource contributor is calculated according to the attribute configuration information and the preset first weight;

Calculate the contribution value of each target resource contributor according to the contribution configuration information, model parameters, effective training data volume and the preset second weight;

Calculate the penalty value of each target resource contributor according to the monitoring configuration information and the preset third weight;

According to the auction value, contribution value and penalty value, determine the allocation value corresponding to each target resource contributor.

Wherein, the first weight, the second weight, and the third weight can be flexibly set according to actual conditions, and there is no specific limitation here. For example, the first weight, the second weight, and the third weight may be set to 0.5, 0.3, and 0.2, respectively.

As an example, the auction value of each target resource contributor is calculated according to the attribute configuration information and the preset first weight. Specifically, the auction value may be calculated according to the pricing corresponding to the auction method and the corresponding preset first weight. For example, if the auction method is a first-price auction, the pricing of the first-price auction may be the part that remains after the service cost of the resource allocation center is deducted from the budget resources provided by the target demander.

The effective training data volume refers to the amount of training data whose degree of correlation with the model required by the target demander reaches the preset correlation threshold. For example, if the model required by the target demander is a face recognition model, then the training data associated with the face recognition model are images/pictures/videos containing faces. The preset correlation threshold can be set according to actual conditions; for example, it can be set to 50%, 100%, and so on. Exemplarily, assume that training data consisting of images/pictures/videos containing faces has a correlation of 100% with the face recognition model, and training data consisting of images/pictures/videos not containing faces has a correlation of 0% with the face recognition model; the preset correlation threshold can then be set to 100%. That is to say, the effective training data volume in this example is the number of pieces of training data that are 100% related to the face recognition model required by the target demander (that is, the training data consisting of images/pictures/videos containing faces).
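As a small illustration, counting the effective training data volume against a correlation threshold might look like this in Python. The per-sample correlation scores are assumed to be already available, and the function name is hypothetical.

```python
def effective_training_data_volume(correlations, threshold=1.0):
    """Count the samples whose correlation with the demand model reaches the threshold."""
    return sum(1 for c in correlations if c >= threshold)


# Hypothetical contributor data: 80 face samples (correlation 100%)
# and 20 samples without faces (correlation 0%)
sample_correlations = [1.0] * 80 + [0.0] * 20
volume = effective_training_data_volume(sample_correlations, threshold=1.0)
print(volume)
```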

Generally, when the training model is a deep neural network model, its model parameters generally include weights and biases.

As an example, the contribution value of each target resource contributor is calculated according to the contribution degree configuration information, the model parameters, the effective training data amount and the preset second weight. Specifically, the model parameters provided by the resource contributor can be updated to the iterative model of the model required by the target demand party stored in the resource distribution center, and then the updated iterative model is deduced and tested with the pre-stored test data to determine the The degree to which the model parameters provided by the resource contributor improve the performance of the iterative model. Usually, the higher the degree of improvement, the higher the contribution of the resource contributor and the higher the contribution value. Similarly, when the amount of effective training data provided by resource contributors is greater, the contribution to the improvement of model performance will be greater, and the contribution value will be higher.

In practical applications, the contribution value of each resource contributor can be calculated according to the pre-configured contribution measurement method (such as distribution by marginal contribution), together with the model parameters provided by each resource contributor, the effective training data amount, and the preset second weight.

In some embodiments, the penalty value of each target resource contributor is calculated according to the monitoring configuration information and the preset third weight, specifically including:

Determine the first penalty item, the second penalty item, and the third penalty item for each target resource contributor;

According to the first penalty item, the second penalty item and the third penalty item, the penalty value of each target resource contributor is calculated.

The penalty value is mainly set to punish resource contributors for providing false or invalid model information.

Wherein, the first penalty item may be training data deviation, the second penalty item may be resource deviation, and the third penalty item may be online stability deviation.

For each of the above deviation items, the corresponding penalty weight can be determined according to the degree to which the deviation influences model performance. For example, if the degree of influence on model performance is first penalty item > second penalty item > third penalty item, then the penalty weights of the first, second and third penalty items can be set to 0.7, 0.2 and 0.1 respectively. It should be noted that the penalty weights can be flexibly set according to the actual situation, and no specific limitation is set here.

As an example, the penalty value of each resource contributor can be calculated according to the formula: penalty value = first penalty item * first penalty weight + second penalty item * second penalty weight + third penalty item * third penalty weight.

After that, according to the formula: allocation value = auction value + contribution value - penalty value, the allocation value corresponding to each target resource contributor is calculated.
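The two formulas above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, the penalty items are assumed to be already expressed as numeric deviation scores, and the weights 0.7, 0.2 and 0.1 are simply the example values given above.

```python
def penalty_value(penalty_items, penalty_weights=(0.7, 0.2, 0.1)):
    """penalty value = sum of (penalty item * penalty weight) over the three items."""
    if len(penalty_items) != len(penalty_weights):
        raise ValueError("each penalty item needs exactly one penalty weight")
    return sum(item * weight for item, weight in zip(penalty_items, penalty_weights))


def allocation_value(auction_value, contribution_value, penalty):
    """allocation value = auction value + contribution value - penalty value."""
    return auction_value + contribution_value - penalty


# Hypothetical deviation scores: training data, resource, online stability.
example_penalty = penalty_value((10.0, 5.0, 0.0))
example_allocation = allocation_value(100.0, 50.0, example_penalty)
```

In this example the training data deviation dominates the penalty because it carries the largest weight.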

In the technical solution provided by the embodiments of the present disclosure, preset resource allocation configuration information is read, the resource allocation configuration information including attribute configuration information, contribution degree configuration information and monitoring configuration information; model demand information provided by multiple model demanders is obtained, and the target demander is determined according to the attribute configuration information and the model demand information; multiple target resource contributors that match the model demand information are determined, and the model resources of each target resource contributor are obtained, the model resources including model parameters and an effective training data amount; and the allocation value corresponding to each target resource contributor is determined according to the attribute configuration information, the contribution degree configuration information, the monitoring configuration information and the model resources, and is fed back to each target resource contributor. Because factors such as each target resource contributor's degree of contribution to the utility of the model required by the model demander are fully considered, each target resource contributor obtains an allocation value that matches the model resources it provides. This not only stimulates the enthusiasm of resource contributors to provide model resources, but also further constrains them to provide real and effective model parameters, which facilitates the long-term sustainable development of joint learning.

All the above optional technical solutions may be combined in any way to form optional embodiments of the present application, which will not be repeated here.

The following are device embodiments of the present disclosure, which can be used to implement the method embodiments of the present disclosure. For details not disclosed in the disclosed device embodiments, please refer to the disclosed method embodiments.

Based on the same inventive concept as the resource allocation method based on joint learning shown in FIG. 2 , an embodiment of the present disclosure further provides a resource allocation device based on joint learning. As shown in Figure 3, the resource allocation device based on joint learning includes:

The reading module 301 is configured to read preset resource allocation configuration information, where the resource allocation configuration information includes attribute configuration information, contribution degree configuration information, and monitoring configuration information;

The demander determination module 302 is configured to obtain the model demand information provided by multiple model demand parties, and determine the target demand party according to the attribute configuration information and the model demand information;

The resource acquisition module 303 is configured to determine multiple target resource contributors that match the model requirement information, and acquire the model resources of each target resource contributor, where the model resources include model parameters and effective training data volume;

The allocation module 304 is configured to determine the allocation value corresponding to each target resource contributor according to attribute configuration information, contribution configuration information, monitoring configuration information, and model resources, and feed back the allocation value to each target resource contributor.

In the technical solution provided by the embodiments of the present disclosure, the reading module 301 reads the preset resource allocation configuration information, which includes attribute configuration information, contribution degree configuration information and monitoring configuration information; the demander determination module 302 obtains the model demand information provided by multiple model demanders and determines the target demander according to the attribute configuration information and the model demand information; the resource acquisition module 303 determines multiple target resource contributors that match the model demand information and obtains the model resources of each target resource contributor, the model resources including model parameters and an effective training data amount; and the allocation module 304 determines the allocation value corresponding to each target resource contributor according to the attribute configuration information, the contribution degree configuration information, the monitoring configuration information and the model resources, and feeds back the allocation value to each target resource contributor. Because each target resource contributor's contribution to the utility of the model required by the model demander is fully considered, each target resource contributor obtains an allocation value that matches the model resources it provides. This not only stimulates the enthusiasm of resource contributors to provide model resources, but also further constrains them to provide real and effective model parameters, which facilitates the long-term sustainable development of joint learning.

In some embodiments, the above-mentioned model demand information includes a demand model. The above step of determining the target demander according to the attribute configuration information and the model demand information includes:

Classify multiple model demanders according to the demand model, and obtain a set of model demanders corresponding to each demand model;

According to the attribute configuration information, the target demander is filtered out from the set of model demanders.

In some embodiments, the above step, according to the attribute configuration information, screens out the target demander from the set of model demanders, including:

Obtain the budget resources of each model demander in the model demander collection;

Determine the model demander with the most budget resources as the target demander.
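The classification and screening steps above can be sketched as follows. The dictionary keys `name`, `demand_model` and `budget` are hypothetical field names chosen for illustration; the sketch assumes that "budget resources" can be compared as plain numbers.

```python
from collections import defaultdict


def select_target_demanders(demanders):
    """Group demanders by demand model, then pick the one with the largest budget per group."""
    groups = defaultdict(list)
    for demander in demanders:
        groups[demander["demand_model"]].append(demander)
    return {
        model: max(group, key=lambda d: d["budget"])["name"]
        for model, group in groups.items()
    }


demanders = [
    {"name": "D1", "demand_model": "face recognition", "budget": 800},
    {"name": "D2", "demand_model": "face recognition", "budget": 1200},
    {"name": "D3", "demand_model": "gas load forecasting", "budget": 500},
]
targets = select_target_demanders(demanders)
```

Here D2 is selected for the face recognition model and D3 for the gas load forecasting model.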

In some embodiments, the above step of determining multiple target resource contributors that match the model requirement information includes:

Obtain the model information of the model to be auctioned reported by multiple resource contributors, the model information of the model to be auctioned includes the model type of the model to be auctioned;

Calculate the similarity between the model type of the model to be auctioned and the model type of the demand model, and determine multiple target resource contributors matching the model type of the demand model according to the similarity.
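The disclosure does not specify how the similarity between model types is computed. As a stand-in for illustration only, the sketch below compares the model type names as strings with `difflib.SequenceMatcher` from the Python standard library; the threshold of 0.8 and the function names are assumptions, and a real system might instead compare structured type identifiers.

```python
from difflib import SequenceMatcher


def type_similarity(type_a, type_b):
    """String similarity in [0, 1] between two model type names (assumed metric)."""
    return SequenceMatcher(None, type_a.lower(), type_b.lower()).ratio()


def match_contributors(reported_types, demand_type, threshold=0.8):
    """Return the contributors whose reported model type is similar enough to the demand model type."""
    return [
        name
        for name, model_type in reported_types.items()
        if type_similarity(model_type, demand_type) >= threshold
    ]


# Hypothetical model types reported by three resource contributors.
reported_types = {
    "A": "Face Recognition",
    "B": "gas load forecasting",
    "C": "face recognition model",
}
matched = match_contributors(reported_types, "face recognition")
```

Contributors A and C are treated as matching the demand model, while B is filtered out.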

In some embodiments, the above steps, before obtaining the model information of the models to be auctioned reported by multiple resource contributors, further include:

The model bidding information is broadcast through the preset communication channel so that each resource contributor can receive the model bidding information. The model bidding information includes the demand model, the required training samples, the required number of samples and the incentive coefficient;

Receive the model information of the model to be auctioned based on the model bidding information fed back by multiple resource contributors.

In some embodiments, the above step, according to attribute configuration information, contribution degree configuration information, monitoring configuration information and model resources, determines the allocation value corresponding to each target resource contributor, including:

Calculate the auction value of each target resource contributor according to the attribute configuration information and the preset first weight;

In some embodiments, the above step, according to the monitoring configuration information and the preset third weight, calculates the penalty value of each target resource contributor, including:

Fig. 4 is a schematic structural diagram of a resource allocation system based on joint learning according to an embodiment of the present disclosure. As shown in FIG. 4 , the server in this system is the coordination center 401 , and the contributor and the demander communicate with the coordination center 401 respectively. Wherein, the contributor includes the contributor's transmission module 402, and the demander includes the demander's transmission module 403.

Wherein, the coordination center 401, the contributor and the demander can be generated by a computer program preset by the system, specifically, one coordination center 401, multiple contributors and at least one demander can be generated. The coordination center 401 can respectively communicate with multiple contributors and at least one demander through a preset communication channel, and realize information exchange.

The contributors and demanders mentioned above belong to the participants of joint learning. When a participant wants to obtain the application model he needs through federated learning based on its business needs and other reasons, the computer program preset by the system can configure the participant as the demander, while other participants can be configured as contributor. It can be understood that when the demander participates in other joint learning tasks, the computer program preset by the system can reconfigure the demander as a contributor. In other words, the roles of demanders and contributors here are not fixed, but can be flexibly configured according to the actual situation.

In practical applications, multiple contributors and at least one demander can be randomly generated through preset agent generation rules. The preset agent generation rule may be to generate, according to the quotation range, a plurality of contributors with higher quotations, a plurality of contributors with lower quotations, and at least one demander. Contributors are agents that have training resources (such as training data, computing resources, etc.). The demander can be an agent that has training resources (which are, for example, insufficient or inconsistent with its required model), or an agent without any training resources. In some embodiments, the demander may be any one or more agents specified from among the multiple contributors.

In addition, the preset agent generation rule may also be to generate multiple contributors and at least one demander according to the quantity range or quality level of the training data.

The demander transmission module 403 is configured to send model demand information to the coordination center according to a preset time step, where the model demand information includes a demand model.

Wherein, the preset time step refers to the starting interval time of the simulation training. The interval time may be an equal duration for each interval, or may be a random duration for each interval. For example, the model requirement information is automatically sent to the coordination center 401 at intervals of 1 minute; or, the model requirement information is automatically sent to the coordination center 401 at intervals of 1 minute, 5 minutes, 7 minutes.... The preset time step can be flexibly set according to actual needs, and there is no limit here.

The demand model can be determined according to the specific business needs of the demand side. For example, if the business demand of the demand side is to improve the face recognition accuracy of the attendance system, then the demand model can be a face recognition model.

The coordination center 401 is configured to generate model training information according to the model requirement information, and broadcast the model training information. The model training information includes the preset basic model, training sample type, sample size required for each round, and participation strategy.

In some embodiments, model training information can be generated according to the model requirement information and preset configuration information, and the model training information can then be sent to idle contributors (those in an idle state) and/or resource surplus contributors.

Wherein, the preset configuration information may include a preset basic model, training sample type, sample size required for each round, and participation strategy.

The preset basic model can be a randomly initialized model, or a model provided by the demand side.

The type of training sample can be specifically determined according to the type of the required model. For example, the required model is a face recognition model, and the type of training sample can be a picture/photo/video containing a human face.

The sample size required for each round refers to the number of samples required to participate in a round of joint training, also known as batch samples. The sample size can be determined according to actual needs.

The participation strategy for the contributor can be that the contributor must be a registered member of the simulation system, that it possesses XX training resources (usually the training data associated with the model required by the demander), that it possesses XX computing resources, or that it possesses XX communication resources, and so on. It can be flexibly set according to actual conditions and is not limited here.

The participation strategy for the demander can be an auction method, for example a first-price auction (the highest bidder wins) or a VCG price auction (the principle is to calculate the loss of revenue that a bidder's winning the auction item (such as a model) causes to the overall bidding revenue; in theory, this loss is the fee that the winning bidder should pay), etc.

As an example, assume that the demand model in the model demand information is a gas load forecasting model, and the preset configuration information is: the basic model M1, gas load measurement data, 100 samples required for each round, and registered members of the system. Then, according to the above configuration information and model demand information, the following model training information can be generated: "Recruiting contributors to train the gas load forecasting model. The invitation conditions include: having at least 100 pieces of gas load measurement data, and being a registered member of the system; the attachment is the basic model M1".

In some embodiments, sending model training information to resource surplus contributors specifically includes:

Collect the resource status information of all contributors, the resource status information including each contributor's computing resource information and communication resource information; judge, according to the resource status information, whether a contributor is a resource surplus contributor; if so, send the model training information to that resource surplus contributor.

As an example, the coordination center 401 can pre-collect the resource status information (including computing resource information and communication resource information) of all contributors in the system. The computing resource information includes information related to CPU resources, memory resources, hard disk resources, and network resources. The communication resource information usually includes information related to data transmission delay, transmission bandwidth, etc.

As an example, a resource surplus contributor usually refers to a contributor that currently has redundant computing resources and/or communication resources and can stably support at least one round of joint training.

The technical solution provided by the embodiments of the present disclosure first collects the resource status information of each contributor, then selects the resource surplus contributors from the multiple contributors according to the resource status information, and sends the model training information to these resource surplus contributors. This is beneficial to improving the quality of the jointly trained model and the convergence efficiency of the model.
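A minimal sketch of the resource surplus check might look as follows. All field names and threshold values are hypothetical; in particular, this sketch requires both spare computing resources and spare communication resources, which is only one possible reading of "computing resources and/or communication resources".

```python
from dataclasses import dataclass


@dataclass
class ResourceStatus:
    name: str
    cpu_free_ratio: float    # fraction of CPU currently unused, 0.0 to 1.0
    memory_free_gb: float    # free memory in GB
    bandwidth_mbps: float    # available transmission bandwidth
    delay_ms: float          # data transmission delay


def is_resource_surplus(status, min_cpu_free=0.5, min_memory_gb=4.0,
                        min_bandwidth_mbps=10.0, max_delay_ms=100.0):
    """Judge whether a contributor has enough spare resources for one round of joint training."""
    has_compute = (status.cpu_free_ratio >= min_cpu_free
                   and status.memory_free_gb >= min_memory_gb)
    has_communication = (status.bandwidth_mbps >= min_bandwidth_mbps
                         and status.delay_ms <= max_delay_ms)
    return has_compute and has_communication


statuses = [
    ResourceStatus("A", 0.8, 16.0, 50.0, 20.0),
    ResourceStatus("B", 0.1, 2.0, 50.0, 20.0),    # busy CPU, little memory
    ResourceStatus("C", 0.9, 32.0, 1.0, 400.0),   # poor network
]
surplus = [s.name for s in statuses if is_resource_surplus(s)]
```

Only contributor A would receive the model training information in this example.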

In some other embodiments, sending the model training information to the idle contributor in the idle state specifically includes:

Obtain the training task execution status information of all contributors; according to the training task execution status information, determine the idle contributors, namely those that are currently idle and those whose currently executing training task is about to be completed so that they can participate in the next training task at the preset time node.

The training task execution status information includes whether the contributor currently has a joint learning task, the execution progress of each joint learning task it is currently participating in (for example, training has not started, training is in progress, or training is over), and other related information.

As an example, assume that the acquired training task execution status information of contributor A shows that it is not currently participating in any joint learning task; that of contributor B shows that it currently has two joint learning tasks, one of which has finished training and the other of which is still in training; and that of contributor C shows that it currently has one joint learning task, which can finish training before the XX time node. Then, according to the training task execution status information obtained above, it can be determined that contributor A is currently idle, and that contributor C is about to complete its currently executing training task and can participate in the next training task at the preset time node. The model training information can then be sent to contributors A and C.
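The example with contributors A, B and C can be sketched as follows. The `Task` and `Contributor` classes, the status strings and the use of "minutes from now" as the time unit are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    status: str                 # "not_started", "in_progress" or "finished"
    expected_end: float = 0.0   # assumed: minutes from now until training ends


@dataclass
class Contributor:
    name: str
    tasks: list = field(default_factory=list)


def is_available(contributor, next_start):
    """True if the contributor has no unfinished task that runs past the preset time node."""
    unfinished = [t for t in contributor.tasks if t.status != "finished"]
    return all(t.expected_end <= next_start for t in unfinished)


contributors = [
    Contributor("A"),                                                   # no task at all
    Contributor("B", [Task("finished"), Task("in_progress", 120.0)]),   # still training
    Contributor("C", [Task("in_progress", 30.0)]),                      # ends before the node
]
recipients = [c.name for c in contributors if is_available(c, next_start=60.0)]
```

With the next training task starting in 60 minutes, the model training information is sent to A and C.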

In still other embodiments, the broadcast model training information specifically includes:

Determine the degree of association between the training data owned by each contributor and the demand model; determine each contributor whose degree of association meets the preset association threshold as a contributor that has training data matching the demand model; and send the model training information to the contributors that have matching training data.

As an example, an instruction to report training data related information may first be sent to all contributors in the system. After receiving the instruction, all contributors report information about their training data to the coordination center 401. In this way, the coordination center 401 can collect the relevant information of the training data owned by all contributors (for example, XX training data).

As another example, when a contributor registers with the system, it reports what type of training data it owns, and the system can associate and store the contributor and the training data it owns, for example by establishing a correspondence table between contributors and training data. During the simulation training process, the coordination center 401 can retrieve this correspondence table from the system, so as to obtain the training data owned by all contributors.

In one embodiment, it is determined how relevant the training data held by each contributor is to the demand model. Specifically, according to the similarity between the training data owned by each contributor and the data required for training the demand model, it can be determined whether the two are related and to what extent.

Exemplarily, assume that the demand model is a face recognition model (the required training data is pictures/photos/videos containing faces). If the training data owned by a contributor A is gas load prediction data, which is completely irrelevant to the training data required by the face recognition model, it can be determined that the correlation between the training data of contributor A and the demand model is 0. If the training data owned by a contributor B is pictures/photos/videos containing faces, which is completely consistent with the training data required by the face recognition model, it can be determined that the correlation between the training data of contributor B and the demand model is 100%.

The technical solution provided by the embodiments of the present disclosure first determines whether the training data owned by the contributors in the system is related to the demand model, then selects the contributors that have training data matching the demand model, and sends the model training information to these contributors. This can improve the response rate of the contributors and helps improve the model effect and efficiency of subsequent joint training.

The contributor transmission module 402 is configured to send an invitation application to the coordination center when the model training information is received and the contributor determines to participate in the model training.

In some embodiments, when the model training information is received, the contributor queries its preset resource library to determine whether it holds training samples of the same type as the required training samples in a quantity that satisfies the sample size required for each round; if so, it determines whether participating in the training according to the participation strategy can achieve the preset expected return resources; and if the preset expected return resources can be achieved, it sends an invitation application to the coordination center.

As an example, assume that the model training information is "Recruiting contributors to train the gas load forecasting model. The invitation conditions include: having at least 100 pieces of gas load measurement data, and being a registered member of the system; the training return is X yuan; the attachment is the basic model M1". When a contributor receives the above model training information broadcast by the coordination center, it can first query its own resource library (such as a database) to judge whether it stores training samples of the same type as the required training samples (gas load measurement data) in a quantity that satisfies the sample size required for each round (at least 100). If it finds that 200 pieces of gas load measurement data are stored in its resource library, it further judges whether it is a registered member of the system; if so, it further judges whether participating in this joint training can achieve its preset expected return resources (such as expected income).

Assume that contributor A's preset expected return resource (such as expected income) is not less than Y yuan, and X>Y. Then, by participating in the above joint training, A can obtain a training return of X yuan, which meets its preset expected return resource. At this time, contributor A can send an invitation application to the coordination center 401.
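The contributor-side decision described above can be sketched as a sequence of three checks. The dictionary keys and the concrete numbers standing in for X and Y are hypothetical; the resource library is modelled simply as a mapping from sample type to sample count.

```python
def should_send_invitation_application(resource_library, training_info,
                                       is_registered, expected_return):
    """Decide whether a contributor applies to take part in the joint training."""
    # 1. Enough samples of the required type for one round?
    owned = resource_library.get(training_info["sample_type"], 0)
    if owned < training_info["samples_per_round"]:
        return False
    # 2. Participation strategy: here, being a registered member of the system.
    if training_info["requires_registration"] and not is_registered:
        return False
    # 3. Does the offered training return reach the preset expected return?
    return training_info["training_return"] >= expected_return


training_info = {
    "sample_type": "gas load measurement data",
    "samples_per_round": 100,
    "requires_registration": True,
    "training_return": 500,   # stands in for "X yuan"
}
library_a = {"gas load measurement data": 200}
decision_a = should_send_invitation_application(
    library_a, training_info, is_registered=True, expected_return=300)  # 300 stands in for Y
```

Contributor A holds 200 matching samples, is registered, and is offered more than it expects, so it would send an invitation application.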

The coordination center 401 is also configured to lock the training resources of at least two target contributors according to the invitation applications, and to start a preset training program, so that each target contributor uses its training resources to train the basic model until a preset end condition is reached and the global model is obtained; to calculate the contribution allocation resources that each target contributor should receive; and to feed back the contribution allocation resources to the corresponding target contributors.

In some embodiments, the coordination center 401 can receive the invitation application sent by multiple contributors, the invitation application includes training resources and expected return resources; then determine at least two target contributors according to the training resources and expected return resources, and send to The target contributor issues a bid winning notice to lock the training resources of the target contributor.

Among them, training resources include training data (including data type and data quantity), computing resources, communication resources, and the like.

The preset training program usually refers to the application program file designed by the programmer in advance according to the training process of federated learning.

As an example, at least two target contributors are determined according to the training resources and the expected reward resources. Specifically, at least two target contributors can be screened out by judging whether the contributor has training data related to the model required by the demander, and its expected return resources are within a preset training return range.

Assume that the model required by the demander is the gas load forecasting model, and the corresponding required training data is gas load measurement data. The sample size required for each round is 100, and the preset training return range is X to Y yuan. If a contributor A has 200 pieces of gas load measurement data and its expected return resource is X yuan, then contributor A can be determined as one of the target contributors. Similarly, according to the above screening method, at least two target contributors can be selected from the multiple contributors, bid winning notifications can be issued to these target contributors (for example, text/voice notification information such as "Congratulations to XX for winning the bid"), and the training resources of the selected target contributors can be locked.
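The screening of target contributors by the coordination center can be sketched as follows. The application fields and the numeric return range are hypothetical; the sketch returns an empty list when fewer than two applicants qualify, since joint training needs at least two target contributors.

```python
def select_target_contributors(applications, required_type, samples_per_round, return_range):
    """Pick applicants with enough related data whose expected return lies in the preset range."""
    low, high = return_range
    targets = [
        app["name"]
        for app in applications
        if app["data_type"] == required_type
        and app["data_amount"] >= samples_per_round
        and low <= app["expected_return"] <= high
    ]
    return targets if len(targets) >= 2 else []


applications = [
    {"name": "A", "data_type": "gas load measurement data", "data_amount": 200, "expected_return": 300},
    {"name": "B", "data_type": "gas load measurement data", "data_amount": 150, "expected_return": 450},
    {"name": "C", "data_type": "face images", "data_amount": 900, "expected_return": 300},
    {"name": "D", "data_type": "gas load measurement data", "data_amount": 120, "expected_return": 900},
]
winners = select_target_contributors(
    applications, "gas load measurement data", samples_per_round=100, return_range=(300, 500))
```

Contributors A and B win the bid; C has unrelated data and D expects more than the preset range allows.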

Then, the preset training program is started, and each target contributor in the system is simulated to use its training data "locally" to conduct the first round of joint training on the basic model M1. The first-round model parameters uploaded by each target contributor are obtained and aggregated to obtain the first aggregation parameters, and the basic model M1 is updated according to the first aggregation parameters to obtain the updated model M2. The updated model M2 is then sent to each target contributor, so that each target contributor uses its training data to conduct a second round of training on the updated model M2 and obtain the second-round model parameters; these are aggregated to obtain the second aggregation parameters, and the updated model M2 is updated according to the second aggregation parameters to obtain the updated model M3. The updated model M3 is then sent to each target contributor for a third round of training, and so on: the above iterative training process is repeated until the preset end condition is reached (such as a preset number of simulation training rounds or a preset model accuracy), and the global model is obtained.
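The round-by-round process above can be illustrated with a deliberately small simulation. The sketch assumes a one-parameter model y = w * x, one gradient-descent step per contributor per round, and plain averaging as the aggregation rule; none of these choices is specified by the disclosure, and the end condition used here (parameter change below a tolerance, or a maximum number of rounds) is likewise only an example.

```python
def local_train(w, data, lr=0.01):
    """One gradient-descent step on a contributor's own data for the model y = w * x."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def aggregate(uploaded_params):
    """Average the uploaded parameters (an assumed aggregation rule)."""
    return sum(uploaded_params) / len(uploaded_params)


def joint_training(base_w, contributor_data, max_rounds=100, tol=1e-6):
    """Repeat local training and aggregation until the preset end condition is reached."""
    w = base_w
    rounds_run = 0
    for _ in range(max_rounds):
        rounds_run += 1
        uploaded = [local_train(w, data) for data in contributor_data]
        new_w = aggregate(uploaded)        # aggregation parameters for this round
        converged = abs(new_w - w) < tol
        w = new_w                          # updated model sent back to contributors
        if converged:
            break
    return w, rounds_run


# Two target contributors whose local data both follow y = 2 * x.
contributor_data = [
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    [(4.0, 8.0), (5.0, 10.0)],
]
global_w, rounds_run = joint_training(base_w=0.0, contributor_data=contributor_data)
```

Only parameters are exchanged between the contributors and the coordinating loop; the raw training data never leaves each contributor's list.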

In some embodiments, after the global model is obtained, the contribution allocation resources that each target contributor should receive are calculated. Specifically, they can be calculated according to the preset contribution measurement strategy, auction strategy, and punishment strategy.

Among them, the preset contribution measurement strategy includes allocating resources by equal contribution (that is, average distribution), allocating resources by marginal contribution (according to the utility generated when each node (contributor) joins the team (the joint training alliance)), and allocating resources by Shapley value (which removes the effect of the order in which nodes join the alliance, so as to estimate each node's contribution to the alliance more fairly).
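For a small number of contributors, Shapley value allocation can be computed exactly by averaging each contributor's marginal contribution over every possible joining order. The sketch below does this; the utility table is a hypothetical example (for instance, model accuracy gains achieved by each coalition), not data from the disclosure.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Average each player's marginal contribution over all joining orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += utility(with_p) - utility(coalition)
            coalition = with_p
    return {p: total / len(orders) for p, total in totals.items()}


# Hypothetical utility of each coalition of contributors.
utility_table = {
    frozenset(): 0.0,
    frozenset({"A"}): 10.0,
    frozenset({"B"}): 20.0,
    frozenset({"A", "B"}): 40.0,
}
values = shapley_values(["A", "B"], lambda s: utility_table[frozenset(s)])
```

The values always sum to the utility of the full alliance, which is what makes them usable as shares of a fixed reward. The number of orders grows factorially, so this exact form only suits small alliances.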

Auction strategies usually refer to auction methods, including first-price auctions, VCG price auctions, etc. Among them, the principle of first-price auction is that the highest bidder wins. The principle of VCG price auction is to calculate the profit loss brought to the entire bidding revenue after the bidder wins the auction item. In theory, this loss is the fee that the bid winner should pay.
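The difference between the two auction methods can be illustrated for the simplest case of a single auction item, where the VCG payment reduces to the highest losing bid (the second-price rule). The function names and bid amounts below are hypothetical, and the single-item setting is an assumption of this sketch.

```python
def first_price_auction(bids):
    """Highest bidder wins and pays its own bid."""
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


def vcg_single_item_auction(bids):
    """Highest bidder wins and pays the loss it imposes on the others: the best losing bid."""
    winner = max(bids, key=bids.get)
    losing_bids = [bid for name, bid in bids.items() if name != winner]
    payment = max(losing_bids) if losing_bids else 0.0
    return winner, payment


# Hypothetical budgets offered by three demanders for the same model.
bids = {"D1": 100.0, "D2": 80.0, "D3": 60.0}
first_price_result = first_price_auction(bids)
vcg_result = vcg_single_item_auction(bids)
```

Both methods select D1, but under the VCG rule D1 pays 80 rather than 100, because 80 is what the other bidders would have obtained had D1 not taken part.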

The punishment strategy mainly refers to the punishment measures set for certain malicious behaviors of individual members. Exemplarily, the malicious behaviors (penalty items) include: a large deviation between the reported training data and the training data actually used in training; a large deviation between the reported computing resources and the computing resources actually used in training; an online rate during training that is below a preset threshold; etc. The punishment measures can be corresponding deduction values or deduction coefficients formulated for the different malicious behaviors.

As an example, the auction price may be calculated according to the auction strategy and the budget provided by the demander, the auction price being the budget with the service budget of the coordination center deducted. For example, if the auction strategy is a first-price auction, the budget provided by the winning demander is K yuan, and the service budget of the coordination center is S yuan, then the auction price is (K-S) yuan. Next, the penalty value of each target contributor is calculated according to the deduction corresponding to each penalty item in the punishment strategy. For example, the penalty items include training data deviation and computing resource deviation. The deduction base for training data deviation is X, with deviation weights of 100% for severe deviation, 50% for moderate deviation, and 10% for mild deviation; Y is deducted for any deviation in computing resources. Assuming that contributor A's training data deviation is moderate and that it also has a computing resource deviation, contributor A's penalty value is X*50%+Y. Then, according to the contribution measurement strategy and the preset contribution base value, the contribution value of each target contributor is determined. Finally, the auction value (positive), the penalty value (negative) and the contribution value (positive) are added up to obtain the contribution allocation resources (such as income) that each target contributor should receive.
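The worked example above can be sketched numerically as follows. The concrete figures standing in for K, S, X and Y, the "none" severity level, and the way the auction value attributed to one contributor is supplied as an input are all assumptions of this sketch; the disclosure does not state how the auction price is split among contributors.

```python
# Deviation weights from the example; the "none" level is an added assumption.
DEVIATION_WEIGHT = {"none": 0.0, "mild": 0.10, "moderate": 0.50, "severe": 1.00}


def auction_price(budget_k, service_budget_s):
    """Auction price = demander's budget minus the coordination center's service budget."""
    return budget_k - service_budget_s


def contributor_penalty(data_deviation, has_compute_deviation, base_x, deduction_y):
    """Penalty = X * deviation weight, plus Y if any computing resource deviation exists."""
    penalty = base_x * DEVIATION_WEIGHT[data_deviation]
    if has_compute_deviation:
        penalty += deduction_y
    return penalty


def contribution_allocation(auction_value, contribution_value, penalty):
    """Add the positive auction and contribution values and subtract the penalty."""
    return auction_value + contribution_value - penalty


price = auction_price(budget_k=1000.0, service_budget_s=100.0)
penalty_a = contributor_penalty("moderate", True, base_x=40.0, deduction_y=10.0)
income_a = contribution_allocation(auction_value=300.0, contribution_value=120.0, penalty=penalty_a)
```

For contributor A the penalty is X*50%+Y, which with X = 40 and Y = 10 comes to 30.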

In some embodiments, after the contribution allocation resources due to each target contributor are calculated, the method further includes:

Using preset test data to perform inference prediction on the global model to obtain an inference prediction result;

Determining the model performance of the global model based on a comparison between the inference prediction result and the label values of the test data.

Wherein, the preset test data may be data with label values.

As an example, assuming that the obtained global model is a face recognition model, the test data may be labeled pictures (pictures of faces or other pictures, such as pictures of animals). These test data are input into the face recognition model, which outputs a face recognition result (for example, a binary classification result of face or non-face). The inference prediction result is compared with the true label values of the test data, and the model performance of the global model (such as accuracy and recall) is then determined from the comparison result.
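A minimal Python sketch of this comparison step is given below, assuming a binary face / non-face task encoded as 1 / 0. The function name and the encoding are assumptions made for the illustration; the disclosure does not prescribe a particular implementation.

```python
def evaluate_binary(predictions, labels, positive=1):
    """Compare predictions with true labels and return (accuracy, recall)."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and of equal length")
    pairs = list(zip(predictions, labels))
    correct = sum(1 for p, y in pairs if p == y)
    true_pos = sum(1 for p, y in pairs if p == positive and y == positive)
    false_neg = sum(1 for p, y in pairs if p != positive and y == positive)
    accuracy = correct / len(labels)
    # Recall is undefined without positive samples; 0.0 is returned by convention here.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return accuracy, recall


# Hypothetical test set: 1 = face, 0 = non-face.
labels = [1, 1, 1, 0, 0]
predictions = [1, 0, 1, 0, 1]
accuracy, recall = evaluate_binary(predictions, labels)
print(accuracy, recall)
```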

In the embodiment of the present disclosure, after each iteration of training is completed, the contribution allocation resources of each target contributor are calculated, and the target contributors can be sorted from high to low according to their contribution allocation resources. At the same time, the global model obtained by training is tested for model utility, so that the correspondence between the contributors' participation in training and the utility of the global model is obtained. Based on this correspondence, the optimal resource allocation scheme under the same simulation training mechanism can be further analyzed.

In addition, different simulation training mechanisms can also be set. Following the simulation training process described above, the resource allocation values of the contributors under each mechanism can be ranked and counted, and the model utility can be tested, so as to further analyze which of the different simulation training mechanisms makes the behavior of the contributors more consistent with the incentive goal, and which mechanism enables the demander to obtain a model with better utility.

The system provided by the embodiments of the present disclosure can set various simulation training mechanisms, and can simulate joint training in various scenarios through these mechanisms, so as to obtain the optimal resource allocation scheme, which helps motivate contributors to actively participate in joint learning, build a robust joint model, and reduce the overall cost of joint training.

It should be understood that the order in which the modules are arranged in the above embodiments does not mean the order in which the steps are executed, and the execution order of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.

Fig. 5 is a sequence diagram of another resource allocation method based on joint learning provided by an embodiment of the present disclosure. As shown in Figure 5, the method includes the following steps:

The demander transmission module sends model demand information to the coordination center according to a preset time step, the model demand information including a demand model;

The contributor transmission module, upon receiving model training information and determining to participate in model training, sends an invitation application to the coordination center;

The coordination center generates the model training information according to the model demand information and broadcasts it, the model training information including a preset basic model, a training sample type, the sample size required for each round, and a participation strategy;

The coordination center further locks the training resources of at least two target contributors according to the invitation applications and starts a preset training program, so that each target contributor uses its training resources to train the basic model until a preset end condition is met; it then aggregates the model parameters provided by each target contributor to obtain a global model, calculates the contribution allocation resources that each target contributor should receive, and feeds back the contribution allocation resources to the corresponding target contributors.

In the technical solution provided by the embodiments of the present disclosure, the demander sends model demand information to the coordination center according to a preset time step, the model demand information including a demand model. The coordination center generates model training information according to the model demand information and broadcasts it, the model training information including a preset basic model, a training sample type, the sample size required for each round, and a participation strategy. When a contributor receives the model training information and determines to participate in the model training, it sends an invitation application to the coordination center. According to the invitation applications, the coordination center locks the training resources of at least two target contributors and starts a preset training program, so that each target contributor uses its training resources to train the basic model until a preset end condition is met. The coordination center then aggregates the model parameters provided by each target contributor to obtain a global model, calculates the contribution allocation resources that each target contributor should receive, and feeds back the contribution allocation resources to the corresponding target contributors. In this way, the results of joint learning can be allocated relatively fairly and reasonably, and the parties are better incentivized to participate in training and share their models.

FIG. 6 is a schematic flowchart of another resource allocation method based on joint learning provided by an embodiment of the present disclosure, and the method may be executed by the coordination center 401 in FIG. 4 . As shown in Figure 6, the method includes the following steps:

Step S601, receiving model demand information sent by a demander; wherein, the demander is one of multiple participants;

Step S602, generating model training information according to the model requirement information, and broadcasting the model training information, the model training information including the preset basic model, training sample type, sample size required for each round and participation strategy;

Step S603, determining a contributing party among the multiple participating parties;

Step S604, responding to a message, sent by the contributor based on the model training information, confirming participation in model training;

Step S605, locking the training resources of at least two target contributors according to the message, and starting a preset training program, so that each of the target contributors uses its training resources to train the basic model until a preset end condition is met; aggregating the model parameters provided by each of the target contributors to obtain a global model; calculating the contribution allocation resources that each of the target contributors should receive; and feeding back the contribution allocation resources to the corresponding target contributors.

Wherein, the above-mentioned message, sent by the contributor based on the model training information to confirm participation in model training, is the above-mentioned invitation application.
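The aggregation in step S605 is not detailed in this passage. One common choice, shown here purely as an assumption, is a weighted average of the contributors' parameters in which each contributor is weighted by its amount of training data (federated averaging). The function name and the flat-list parameter representation are hypothetical.

```python
def aggregate(updates):
    """Weighted average of parameter vectors.

    updates: list of (params, num_samples), where params is a list of floats
    and num_samples is the amount of training data behind those params.
    """
    if not updates:
        raise ValueError("at least one update is required")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    global_params = [0.0] * dim
    for params, n in updates:
        if len(params) != dim:
            raise ValueError("all parameter vectors must have the same length")
        for i, value in enumerate(params):
            # Each contributor's parameters count in proportion to its data volume.
            global_params[i] += value * n / total
    return global_params


# Hypothetical updates from two target contributors.
print(aggregate([([1.0, 2.0], 100), ([3.0, 4.0], 300)]))
```

Other aggregation rules (for example, an unweighted mean) would fit the same step; the weighting by sample count is a design choice of this sketch only.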

As an example, the contributor among the multiple participants may be determined as follows: first, based on the model training information generated from the model requirement information sent by the demander, the participants that meet the relevant requirements in the model training information are identified; then, a computer program built into the system configures these participants as contributors.

As another example, at least one demander and multiple contributors may be initialized and generated through a computer program preset by the system according to preset configuration information.

As yet another example, participants that participate in the same joint learning task in the same joint learning community and have not submitted model requirement information to the coordination center may be determined as contributors, and participants that have submitted model requirement information to the coordination center may be determined as demanders.

In some embodiments, the above step of generating model training information according to the model requirement information and broadcasting the model training information includes:

Generate model training information according to model requirement information and preset configuration information;

Send the model training information to idle contributors in an idle state and/or to resource surplus contributors.

In some embodiments, the above steps of sending model training information to resource surplus contributors include:

Collect the resource status information of all contributors, the resource status information including the computing resource information and communication resource information of each contributor;

Judge whether a contributor is a resource surplus contributor according to the resource status information;

If yes, send the model training information to the resource surplus contributor.
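The three steps above can be sketched as a simple filter. The disclosure does not define what counts as a resource surplus, so the fields (free CPU cores, free bandwidth) and the thresholds below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class ResourceStatus:
    contributor_id: str
    free_cpu_cores: int          # computing resource information (assumed metric)
    free_bandwidth_mbps: float   # communication resource information (assumed metric)


def is_resource_surplus(status, min_cpu_cores=4, min_bandwidth_mbps=50.0):
    """A contributor is treated as surplus when both assumed thresholds are met."""
    return (status.free_cpu_cores >= min_cpu_cores
            and status.free_bandwidth_mbps >= min_bandwidth_mbps)


def surplus_recipients(statuses):
    """IDs of the contributors that should receive the model training information."""
    return [s.contributor_id for s in statuses if is_resource_surplus(s)]


statuses = [
    ResourceStatus("A", free_cpu_cores=8, free_bandwidth_mbps=100.0),
    ResourceStatus("B", free_cpu_cores=2, free_bandwidth_mbps=200.0),
    ResourceStatus("C", free_cpu_cores=16, free_bandwidth_mbps=10.0),
]
print(surplus_recipients(statuses))
```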

In some embodiments, the above steps of sending model training information to idle contributors in an idle state include:

Get the training task execution status information of all contributors;

According to the training task execution status information, determine the idle contributors that are currently idle, or the idle contributors whose currently executing training task is about to be completed and that can participate in the next training task at a preset time node.
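As a rough sketch of this determination, assume each contributor reports either no running task or an expected finish time for its current task, and that the preset time node is the start time of the next training task. These field names and the use of numeric timestamps are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskStatus:
    contributor_id: str
    # None means no training task is running, i.e. the contributor is idle now.
    expected_finish: Optional[float]


def idle_contributors(statuses, next_task_start: float):
    """Contributors that are idle now or will be free by the preset time node."""
    result = []
    for status in statuses:
        if status.expected_finish is None:
            result.append(status.contributor_id)       # currently idle
        elif status.expected_finish <= next_task_start:
            result.append(status.contributor_id)       # finishes before the next task starts
    return result


tasks = [
    TaskStatus("A", None),
    TaskStatus("B", expected_finish=90.0),
    TaskStatus("C", expected_finish=500.0),
]
print(idle_contributors(tasks, next_task_start=120.0))
```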

In some embodiments, the above step of broadcasting model training information includes:

Determine the degree of association between the training data owned by each contributor and the demand model;

Determine each contributor whose degree of association meets the preset association threshold as a contributor that has training data matching the demand model;

Send the model training information to the contributors that have training data matching the demand model.
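The disclosure does not state how the degree of association is computed. The sketch below assumes, for illustration only, that each contributor's data and the demand model are described by sets of tags and that the degree of association is their Jaccard overlap; the function names and tags are hypothetical.

```python
def association_degree(data_tags: set, model_tags: set) -> float:
    """Jaccard overlap between the tags of the data and of the demand model (assumed metric)."""
    union = data_tags | model_tags
    if not union:
        return 0.0
    return len(data_tags & model_tags) / len(union)


def matching_contributors(contributor_tags: dict, model_tags: set, threshold: float):
    """IDs of contributors whose degree of association meets the preset threshold."""
    return [cid for cid, tags in contributor_tags.items()
            if association_degree(tags, model_tags) >= threshold]


contributor_tags = {
    "A": {"face", "image"},
    "B": {"image", "animal"},
    "C": {"audio"},
}
print(matching_contributors(contributor_tags, {"face", "image"}, threshold=0.5))
```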

In some embodiments, when the contributor receives the model training information, it queries its preset resource library and judges whether the library stores training samples that are of the same type as the training sample type and whose number meets the sample size required for each round; if so, it judges whether participating in the training according to the participation strategy can achieve its preset expected return resources; if the preset expected return resources can be achieved, it sends a message to the coordination center to confirm participation in model training.
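The contributor-side decision can be sketched as two checks in sequence. How the return under a participation strategy is estimated is not specified in this passage, so it is passed in as a plain number; the function name and the dictionary-based resource library are hypothetical.

```python
def should_apply(resource_library: dict, sample_type: str, samples_per_round: int,
                 estimated_return: float, expected_return: float) -> bool:
    """Decide whether to confirm participation in model training.

    resource_library maps a sample type to the number of stored samples.
    estimated_return is the return the contributor expects under the
    participation strategy (its computation is outside this sketch).
    """
    available = resource_library.get(sample_type, 0)
    if available < samples_per_round:
        return False  # not enough samples of the required type
    return estimated_return >= expected_return


library = {"face_image": 5000, "voice": 200}
print(should_apply(library, "face_image", 1000, estimated_return=80.0, expected_return=50.0))
```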

In some embodiments, the above step of locking the training resources of at least two target contributors according to the message includes:

Receive invitation applications sent by multiple contributors, each invitation application including training resources and expected return resources;

Determine at least two target contributors based on the training resources and expected return resources, and issue a bid-winning notification to the target contributors to lock their training resources.
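The selection rule is left open by the text. One possible rule, used here only as an assumption, ranks applications by training samples offered per unit of expected return and accepts them greedily while a budget allows, requiring at least two winners. The class, field names and budget parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Application:
    contributor_id: str
    training_samples: int    # training resources offered (assumed measure)
    expected_return: float   # expected return resources, must be positive


def select_targets(applications, budget: float, min_targets: int = 2):
    """Return the IDs of the winning contributors, or [] if fewer than min_targets fit."""
    if any(app.expected_return <= 0 for app in applications):
        raise ValueError("expected_return must be positive")
    ranked = sorted(applications,
                    key=lambda app: app.training_samples / app.expected_return,
                    reverse=True)
    chosen, spent = [], 0.0
    for app in ranked:
        if spent + app.expected_return <= budget:
            chosen.append(app.contributor_id)
            spent += app.expected_return
    return chosen if len(chosen) >= min_targets else []


applications = [
    Application("A", training_samples=1000, expected_return=100.0),
    Application("B", training_samples=500, expected_return=25.0),
    Application("C", training_samples=300, expected_return=60.0),
]
print(select_targets(applications, budget=130.0))
```

Returning an empty list when fewer than two contributors qualify reflects the requirement that at least two target contributors be locked before training starts.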

In some embodiments, the above steps of calculating the contribution allocation resources that each target contributor should receive include:

According to the preset contribution measurement strategy, auction strategy and penalty strategy, calculate the contribution allocation resources that each target contributor should receive.

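In some embodiments, after the above step of calculating the contribution allocation resources that each target contributor should receive, the method further includes:

Using preset test data to perform inference prediction on the global model to obtain an inference prediction result;

Determining the model performance of the global model based on a comparison between the inference prediction result and the label values of the test data.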

It should be understood that the sequence numbers of the steps in the above embodiments do not mean the order of execution, and the execution order of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.

FIG. 7 is a schematic diagram of an electronic device 700 provided by an embodiment of the present disclosure. As shown in FIG. 7 , an electronic device 700 in this embodiment includes: a processor 701 , a memory 702 , and a computer program 703 stored in the memory 702 and operable on the processor 701 . When the processor 701 executes the computer program 703, the steps in the foregoing method embodiments are implemented. Alternatively, when the processor 701 executes the computer program 703, the functions of the modules/units in the foregoing device embodiments are realized.

Exemplarily, the computer program 703 can be divided into one or more modules/units, and one or more modules/units are stored in the memory 702 and executed by the processor 701 to complete the present disclosure. One or more modules/units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program 703 in the electronic device 700 .

The electronic device 700 may be an electronic device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The electronic device 700 may include, but is not limited to, a processor 701 and a memory 702. Those skilled in the art can understand that FIG. 7 is only an example of the electronic device 700 and does not constitute a limitation on the electronic device 700, which may include more or fewer components than shown in the figure, combine certain components, or use different components; for example, the electronic device may also include an input and output device, a network access device, a bus, and the like.

The processor 701 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, and the like.

The memory 702 may be an internal storage unit of the electronic device 700, for example, a hard disk or a memory of the electronic device 700. The memory 702 may also be an external storage device of the electronic device 700, for example, a plug-in hard disk equipped on the electronic device 700, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc. Further, the memory 702 may include both an internal storage unit of the electronic device 700 and an external storage device. The memory 702 is used to store the computer program and other programs and data required by the electronic device. The memory 702 may also be used to temporarily store data that has been output or will be output.

Those skilled in the art can clearly understand that, for convenience and brevity of description, only the above division of functional units and modules is used for illustration. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other, and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the aforementioned method embodiments, and details will not be repeated here.

In the above-mentioned embodiments, the descriptions of each embodiment have their own emphases, and for parts that are not detailed or recorded in a certain embodiment, refer to the relevant descriptions of other embodiments.

Those skilled in the art can appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementation should not be considered beyond the scope of the present disclosure.

In the embodiments provided in the present disclosure, it should be understood that the disclosed device/electronic equipment and method may be implemented in other ways. For example, the device/electronic device embodiments described above are only illustrative. For example, the division of modules or units is only a logical function division, and there may be other division methods in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be in electrical, mechanical or other forms.

A unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.

If an integrated module/unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the present disclosure may implement all or part of the processes in the methods of the above embodiments by instructing related hardware through a computer program. The computer program can be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps in the above method embodiments can be realized. A computer program may include computer program code, and the computer program code may be in the form of source code, object code, an executable file, or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, etc. It should be noted that the content contained in computer-readable media may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction. For example, in some jurisdictions, computer-readable media may not include electrical carrier signals and telecommunication signals.

The above embodiments are only used to illustrate the technical solutions of the present disclosure, rather than to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and should be included within the protection scope of the present disclosure.

Claims

A resource allocation method based on joint learning, characterized in that it includes:

Reading preset resource allocation configuration information, the resource allocation configuration information includes attribute configuration information, contribution degree configuration information and monitoring configuration information;

Obtaining model demand information provided by multiple model demand parties, and determining a target demand party according to the attribute configuration information and the model demand information;

Determining multiple target resource contributors that match the model requirement information, and acquiring model resources of each target resource contributor, where the model resources include model parameters and effective training data volumes;

According to the attribute configuration information, contribution degree configuration information, monitoring configuration information and the model resources, determine the allocation value corresponding to each of the target resource contributors, and feed back the allocation value to each of the target resource contributors.
The method according to claim 1, wherein the model requirement information includes a requirement model;

The determining the target demander according to the attribute configuration information and the model requirement information includes:

classify the plurality of model demanders according to the demand model, and obtain a set of model demanders corresponding to each of the demand models;

According to the attribute configuration information, a target demander is selected from the set of model demanders.
The method according to claim 2, wherein selecting the target demander from the set of model demanders according to the attribute configuration information comprises: acquiring the budget resources of each model demander in the set of model demanders; and determining the model demander with the most budget resources as the target demander;

The determining multiple target resource contributors that match the model demand information includes: obtaining model information of models to be auctioned reported by multiple resource contributors, the model information including the model types of the models to be auctioned; calculating the similarity between the model type of each model to be auctioned and the model type of the demand model; and determining, according to the similarity, a plurality of target resource contributors matching the model type of the demand model.
The method according to claim 3, wherein before acquiring the model information of the models to be auctioned reported by the multiple resource contributors, further comprising:

Broadcast model bidding information through a preset communication channel, so that each resource contributor receives the model bidding information, and the model bidding information includes a demand model, required training samples, required number of samples, and incentive coefficients;

Receiving the model information of the models to be auctioned that is fed back by the multiple resource contributors based on the model bidding information.
The method according to claim 1, characterized in that determining, according to the attribute configuration information, contribution degree configuration information, monitoring configuration information and the model resources, the allocation value corresponding to each of the target resource contributors includes:

calculating the auction value of each of the target resource contributors according to the attribute configuration information and the preset first weight;

calculating the contribution value of each target resource contributor according to the contribution configuration information, the model parameters, the amount of effective training data, and the preset second weight;

calculating a penalty value for each target resource contributor according to the monitoring configuration information and a preset third weight;

According to the auction value, the contribution value and the penalty value, an allocation value corresponding to each of the target resource contributors is determined.
The method according to claim 5, wherein the calculation of the penalty value of each of the target resource contributors according to the monitoring configuration information and the preset third weight includes:

determining a first penalty term, a second penalty term, and a third penalty term for each of said target resource contributors;

Calculate a penalty value for each target resource contributor according to the first penalty item, the second penalty item, and the third penalty item.
A resource allocation device based on joint learning, characterized in that it includes:

The reading module is configured to read preset resource allocation configuration information, and the resource allocation configuration information includes attribute configuration information, contribution degree configuration information and monitoring configuration information;

The demander determination module is configured to obtain model demand information provided by multiple model demand parties, and determine a target demand party according to the attribute configuration information and the model demand information;

The resource acquisition module is configured to determine multiple target resource contributors that match the model requirement information, and acquire model resources of each target resource contributor, where the model resources include model parameters and effective training data volumes;

The allocation module is configured to determine the allocation value corresponding to each of the target resource contributors according to the attribute configuration information, contribution configuration information, monitoring configuration information, and the model resources, and feed back the allocation value to each The target resource contributor.
A resource allocation system based on joint learning, characterized in that it includes a coordination center, and a contributor transmission module and a demander transmission module each communicatively connected to the coordination center;

The demander transmission module is configured to send model demand information to the coordination center according to a preset time step, and the model demand information includes a demand model;

The contributor transmission module is configured to send an invitation application to the coordination center when receiving model training information and determining to participate in model training;

The coordination center is configured to generate model training information according to the model requirement information, and broadcast the model training information, the model training information including a preset basic model, a training sample type, the sample size required for each round, and a participation strategy;

and to lock the training resources of at least two target contributors according to the invitation application, and start a preset training program, so that each of the target contributors uses its training resources to train the basic model until the preset end condition is met, obtain the global model, calculate the contribution allocation resources that each of the target contributors should receive, and feed back the contribution allocation resources to the corresponding target contributors.
The system according to claim 8, wherein said generating model training information according to said model requirement information, and broadcasting said model training information comprises:

generating model training information according to the model requirement information and preset configuration information;

Sending the model training information to idle contributors in an idle state and/or to resource surplus contributors.
The system according to claim 9, wherein the sending the model training information to resource surplus contributors comprises: collecting resource status information of all contributors, the resource status information including computing resource information and communication resource information of the contributors; judging whether a contributor is a resource surplus contributor according to the resource status information; if so, sending the model training information to the resource surplus contributor;

Alternatively, the sending the model training information to the idle contributors in the idle state includes: obtaining the training task execution status information of all contributors; and determining, according to the training task execution status information, the idle contributors currently in the idle state, or the idle contributors whose currently executing training task is about to be completed and that can participate in the next training task at a preset time node.
The system according to claim 8, wherein the broadcasting of the model training information includes: determining the degree of association between the training data owned by each of the contributors and the demand model; determining the contributors whose degree of association meets a preset association threshold as contributors that have training data matching the demand model;

Sending the model training information to contributors who have training data matching the demand model;

Alternatively, when receiving the model training information and determining to participate in the model training, sending an invitation application to the coordination center includes: when receiving the model training information, querying a preset resource library and judging whether it stores training samples that are of the same type as the training sample type and whose number meets the sample size required for each round; if so, judging whether participating in training according to the participation strategy can achieve its preset expected return resources; if the preset expected return resources can be achieved, sending an invitation application to the coordination center;

Alternatively, the locking the training resources of at least two target contributors according to the invitation application includes: receiving invitation applications sent by multiple contributors, each invitation application including training resources and expected return resources; determining at least two target contributors according to the training resources and the expected return resources; and issuing a bid-winning notification to the target contributors, so as to lock the training resources of the target contributors.
The system according to claim 8, wherein the calculation of the contribution allocation resources that each of the target contributors should receive includes: calculating each of the target contributors according to the preset contribution measurement strategy, auction strategy and penalty strategy. Contribution allocation resources due to target contributors;

Alternatively, after the calculation of the contribution allocation resources due to each of the target contributors, further including: using preset test data to perform deduction prediction on the global model to obtain a deduction prediction result; and determining the model performance of the global model by comparing the deduction prediction result with the label values of the test data.
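The comparison of prediction results with label values can be sketched as follows. This is a minimal illustration that assumes a classification task and uses accuracy as the performance measure; the claim does not fix a particular metric, and the name `model_performance` and the `predict` callable are hypothetical.

```python
def model_performance(predict, test_features, test_labels):
    """Return the fraction of test samples whose prediction equals the label.

    predict stands in for the global model's prediction function (assumed).
    """
    if not test_labels or len(test_features) != len(test_labels):
        raise ValueError("test data and labels must be non-empty and equal in length")
    correct = sum(
        1 for x, y in zip(test_features, test_labels) if predict(x) == y
    )
    return correct / len(test_labels)
```

For a regression task, the same structure would apply with an error measure in place of the equality check.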
A resource allocation method based on a joint learning resource allocation system, characterized in that it includes:

receiving model demand information sent by a demander, wherein the demander is one of multiple participants;

generating model training information according to the model demand information, and broadcasting the model training information, wherein the model training information includes a preset basic model, a training sample type, a number of samples required for each round, and a participation strategy;

determining a contributor among the multiple participants;

responding to a message, sent by the contributor based on the model training information, that confirms participation in model training;

locking the training resources of at least two target contributors according to the message, starting a preset training program so that each of the target contributors uses its training resources to train the basic model until a preset end condition is met, aggregating the model parameters provided by each of the target contributors to obtain a global model, calculating the contribution allocation resources due to each of the target contributors, and feeding the contribution allocation resources back to the corresponding target contributors.
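The aggregation and allocation steps of this method can be illustrated by the minimal Python sketch below. It rests on two assumptions that the claim does not state: the aggregation is a sample-count-weighted average of the parameter vectors (in the style of federated averaging), and the allocation splits a fixed budget in proportion to a contribution score, less any penalty. The names `aggregate` and `allocate` and all parameters are hypothetical.

```python
def aggregate(updates):
    """Weighted average of parameter vectors.

    updates is a list of (parameter_vector, sample_count) pairs, one per
    target contributor (assumed layout).
    """
    total = sum(count for _, count in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    return [
        sum(params[i] * count for params, count in updates) / total
        for i in range(dim)
    ]


def allocate(budget, contributions, penalties=None):
    """Split budget in proportion to contribution scores, minus penalties.

    The proportional rule and the penalty subtraction are illustrative
    assumptions, not the claimed strategies.
    """
    penalties = penalties or {}
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total contribution must be positive")
    return {
        name: max(0.0, budget * score / total - penalties.get(name, 0.0))
        for name, score in contributions.items()
    }
```

The resulting mapping from contributor to allocation value is what would be fed back to the corresponding target contributors in the final step.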
An electronic device, comprising a memory, a processor, and a computer program stored in the memory and operable on the processor, characterized in that, when the processor executes the computer program, the steps of the method according to claim 1 are implemented.
A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the method according to claim 1 when the computer program is executed by a processor.