WO2023024378A1 - Multi-agent model training method, apparatus, electronic device, storage medium and program product

Info

Publication number: WO2023024378A1
Application number: PCT/CN2021/142157
Authority: WO
Inventors: 何元钦; 康焱; 刘洋; 陈天健
Original assignee: 深圳前海微众银行股份有限公司
Priority date: 2021-08-25
Filing date: 2021-12-28
Publication date: 2023-03-02
Also published as: CN113658689A

Abstract

Provided are a multi-agent model training method, an apparatus, an electronic device, a storage medium, and a program product, comprising: a participant device inputting a training parameter value of a predictable parameter into a local multi-agent model, and when the training parameter value is fixed, inputting each of a plurality of parameter value groups into the multi-agent model for prediction, so as to obtain a plurality of prediction results; comparing each of the prediction results against corresponding actual results, so as to determine an influence factor for each parameter value group; then, aggregating the parameter values of unpredictable parameters to obtain an intermediate parameter value corresponding to each unpredictable parameter, and sending the intermediate parameter value to a collaborator device, wherein an intermediate parameter value is used to trigger the collaborator device to aggregate received intermediate parameter values, so as to obtain a target parameter value corresponding to each unpredictable parameter; receiving a target parameter value corresponding to each unpredictable parameter and returned by the collaborator device, and updating the multi-agent model on the basis of the target parameter values.

Description

Multi-agent model training method, device, electronic equipment, storage medium and program product

Cross References to Related Applications

This application is based on a Chinese patent application with application number 202110981895.1 and a filing date of August 25, 2021, and claims the priority of this Chinese patent application. The entire content of this Chinese patent application is hereby incorporated by reference into this application.

Technical field

The present application relates to the technical field of artificial intelligence, and in particular to a multi-agent model training method, device, electronic equipment, computer readable storage medium and computer program product.

Background

Artificial Intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the nature of intelligence and produce a new kind of intelligent machine that can respond in a similar way to human intelligence. Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.

In the related art, horizontal federated learning usually has several participants train a machine learning model together with a collaborator. Its goal is to use the limited data of each participant to jointly train a global model while ensuring data security. Because the global model is trained on the data of every participant, its performance can approach that of a model trained on the pooled data of all participants, and is significantly better than that of a model each participant obtains from its own data alone. However, multi-agent models are used very differently from traditional machine learning models, so the traditional federated training procedure for machine learning models cannot be applied directly to the verification of a multi-agent model.

Summary of the invention

Embodiments of the present application provide a multi-agent model training method, device, electronic device, computer-readable storage medium, and computer program product, which can improve model prediction accuracy while ensuring local data security.

An embodiment of the present application provides a multi-agent model training method based on a federated learning system. The system includes a collaborator device and at least two participant devices, and the method is executed by a participant device and includes:

the participant device inputting training parameter values of predictable parameters into a local multi-agent model and, with the training parameter values held fixed, inputting each of multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results;

wherein each parameter value group includes at least one parameter value of an unpredictable parameter;

determining an impact factor for each of the parameter value groups based on the multiple prediction results and the actual result corresponding to each of the prediction results;

aggregating, based on each of the parameter value groups and the corresponding impact factors, the parameter values of each of the unpredictable parameters to obtain an intermediate parameter value corresponding to each of the unpredictable parameters;

sending the obtained intermediate parameter values to the collaborator device, where the intermediate parameter values are used to trigger the collaborator device to aggregate the intermediate parameter values sent by multiple participant devices to obtain a target parameter value corresponding to each of the unpredictable parameters;

receiving the target parameter values corresponding to the unpredictable parameters returned by the collaborator device, and updating the multi-agent model based on the target parameter values.

The embodiment of the present application also provides a multi-agent model training device, the device comprising:

An acquisition module, configured to have the participant device input training parameter values of predictable parameters into a local multi-agent model and, with the training parameter values held fixed, input each of multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results; wherein each parameter value group includes at least one parameter value of an unpredictable parameter;

A comparison module, configured to determine an impact factor for each of the parameter value groups based on the multiple prediction results and the actual result corresponding to each of the prediction results;

An aggregation module, configured to aggregate the parameter values of each of the unpredictable parameters based on each of the parameter value groups and the corresponding impact factors, to obtain an intermediate parameter value corresponding to each of the unpredictable parameters;

A sending module, configured to send the obtained intermediate parameter values to the collaborator device, where the intermediate parameter values are used to trigger the collaborator device to aggregate the intermediate parameter values sent by multiple participant devices to obtain a target parameter value corresponding to each of the unpredictable parameters;

An update module, configured to receive the target parameter values corresponding to the unpredictable parameters returned by the collaborator device, and to update the multi-agent model based on the target parameter values.

An embodiment of the present application provides an electronic device, including:

a memory, configured to store executable instructions;

a processor, configured to implement the multi-agent model training method provided in the embodiment of the present application when executing the executable instructions stored in the memory.

An embodiment of the present application provides a computer-readable storage medium that stores executable instructions which, when executed by a processor, implement the multi-agent model training method provided in the embodiment of the present application.

An embodiment of the present application provides a computer program product, including a computer program, and when the computer program is executed by a processor, the multi-agent model training method provided in the embodiment of the present application is implemented.

The embodiment of the present application has the following beneficial effects:

Compared with the related art, in which a multi-agent model can only be trained independently by each data owner, the multi-agent model training method, device, electronic equipment, computer-readable storage medium and computer program product provided by the embodiments of the present application are based on a horizontal federated learning architecture: each participant locally aggregates the unpredictable parameters to obtain intermediate parameter values and sends them to the collaborator, and the multi-agent model is updated based on the target parameter values that the collaborator obtains by a second aggregation of the received intermediate parameter values. In this way, when multiple participants train multi-agent models for the same purpose, they jointly optimize the values of the unpredictable parameters and obtain a multi-agent model that better fits the real data. This ensures the security of local data, solves the problem of data islands in the field of multi-agent models, and realizes joint modeling among multiple parties, thereby improving the accuracy of model prediction.

Brief description of the drawings

Fig. 1 is a schematic diagram of the implementation scene of the training method of the multi-agent model provided by the embodiment of the present application;

Fig. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;

Fig. 3 is a comparison diagram of the verification process of the multi-agent model provided by the embodiment of the present application and the training process of the machine learning model;

Fig. 4 is a schematic flow chart of the training method of the multi-agent model provided by the embodiment of the present application;

Fig. 5 is an optional flowchart of the training method of the multi-agent model provided by the embodiment of the present application;

Fig. 6A is an optional schematic diagram of aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application;

Fig. 6B is an optional schematic diagram of aggregation of unpredictable parameters of a multi-agent model provided by the embodiment of the present application;

Fig. 7A is an optional flowchart of the multi-agent model training method provided by the embodiment of the present application;

FIG. 7B is an optional flowchart of the multi-agent model training method provided by the embodiment of the present application;

FIG. 8 is a schematic flowchart of a prediction method for a multi-agent model provided in an embodiment of the present application;

FIG. 9 is a schematic flowchart of a training method for a multi-agent model provided in an embodiment of the present application;

Fig. 10 is a schematic diagram of a horizontal federated learning method for a multi-agent model provided by the embodiment of the present application;

Fig. 11 is an optional schematic diagram of aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application;

Fig. 12 is a schematic structural diagram of a training device for a multi-agent model provided in an embodiment of the present application;

FIG. 13 is a schematic structural diagram of a prediction device for a multi-agent model provided by an embodiment of the present application.

Detailed description

In order to make the purpose, technical solutions and advantages of the application clearer, the application is described in further detail below with reference to the accompanying drawings. All other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of this application.

In the following description, references to "some embodiments" describe a subset of all possible embodiments. It is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where no conflict arises.

In the following description, the terms "first\second\third" are only used to distinguish similar objects and do not represent a specific order of the objects. It is understood that, where permitted, the specific order or sequence of "first\second\third" may be interchanged, so that the embodiments of the application described herein can be practiced in sequences other than those illustrated or described herein.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of the present application, and are not intended to limit the present application.

Before the embodiments of the present application are described in further detail, the terms involved in the embodiments are explained. The following explanations apply to these terms.

1) Federated learning is a machine learning approach in which different participants (also known as parties, data owners, or clients) learn jointly. In federated learning, participants do not need to expose their own data to other participants or to the coordinator (also known as the parameter server or aggregation server), so federated learning can protect user privacy and ensure data security well.

Horizontal federated learning applies when the data features of the participants overlap heavily but their users overlap little: the portions of data that share the same features but belong to different users are used for joint machine learning. For example, two banks in different regions draw their users from their respective regions, so the intersection of their user groups is very small. Their businesses are very similar, however, and most of the recorded user data features are the same. Horizontal federated learning can help the two banks build a joint model to predict customer behavior.

2) Multi-agent model simulation (agent-based simulation or agent-based modeling, ABS or ABM) uses a computational model to simulate the actions and interactions of agents (independent individuals or collective groups, such as organizations and teams). The multi-agent model is a microscopic model that reproduces and predicts complex phenomena by simulating the simultaneous actions and interactions of multiple agents; this process is an emergence from a low (micro) level to a high (macro) level. ABS can simulate, for example, urban traffic conditions and disease transmission. For instance, ABS can be used to simulate the spread of the novel coronavirus, to help predict the development of the epidemic and to analyze how effectively different interventions suppress it. This scenario usually involves three parts: 1) a population model that is close to the real distribution; 2) a social network model between members of the population; 3) a disease transmission model. Based on these three models and the corresponding parameters, the development trend of the epidemic can be simulated for a given initial number of infected people. In addition to the parameters obtained from data and the empirical parameters in the model (called predictable parameters), there are parameters whose values cannot be determined in advance (called unpredictable parameters); their values must be obtained by validation on real data. Here, the validation step on real data is similar to the training step in machine learning: it optimizes the values of the unpredictable parameters so that the model simulation results are as close as possible to the real data. These parameters are commonly determined with optimization methods, such as the Nelder-Mead method.
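The split between predictable and unpredictable parameters can be illustrated with a deliberately small simulation. The following is a minimal sketch, not the model of the application: the function `simulate_epidemic`, the random-contact rule, and all numeric values are assumptions made for illustration. The population size, initial infections and contacts per day play the role of predictable parameters, while `transmission_prob` plays the role of an unpredictable parameter that would have to be validated against real data.

```python
import random


def simulate_epidemic(population, initial_infected, contacts_per_day,
                      transmission_prob, days, seed=0):
    """Toy agent-based epidemic model (hypothetical).

    Returns the cumulative number of infected agents after each day.
    """
    rng = random.Random(seed)
    infected = [i < initial_infected for i in range(population)]
    totals = []
    for _ in range(days):
        newly_infected = set()
        for agent in range(population):
            if not infected[agent]:
                continue
            # Each infected agent meets a few random agents per day.
            for _ in range(contacts_per_day):
                other = rng.randrange(population)
                if not infected[other] and rng.random() < transmission_prob:
                    newly_infected.add(other)
        for agent in newly_infected:
            infected[agent] = True
        totals.append(sum(infected))
    return totals


# Predictable parameters: fixed from local data (values are made up).
predictable = {"population": 500, "initial_infected": 5, "contacts_per_day": 3}

# Unpredictable parameter: two candidate values give two simulated curves.
low = simulate_epidemic(**predictable, transmission_prob=0.02, days=10)
high = simulate_epidemic(**predictable, transmission_prob=0.20, days=10)
```

Comparing curves such as `low` and `high` against observed case counts is what the validation step does when it searches for a suitable value of the unpredictable parameter.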

3) Homomorphic encryption (HE) is a class of encryption algorithms. Its purpose is to allow addition and multiplication operations to be performed on ciphertext, such that the result of performing a certain operation on the ciphertext is exactly equal to the ciphertext obtained by performing the expected operation on the plaintext before encryption and then encrypting it. Homomorphic encryption ensures that a data processor can process the ciphertext of the data directly but cannot learn the plaintext of the data it processes. This property allows the security of users' data and privacy to be guaranteed, so homomorphic encryption is applied in many real-world scenarios to ensure data security.

If an encryption function satisfies additive homomorphism and multiplicative homomorphism at the same time, it is called fully homomorphic encryption. Various operations on encrypted data (addition, subtraction, multiplication, division, polynomial evaluation, exponent, logarithm, trigonometric function, etc.) can be completed by using such an encryption function.
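The additive half of this property can be shown with a toy example. The sketch below uses the Paillier scheme, a public-key scheme that is additively (not fully) homomorphic; the application does not name a particular scheme, so this choice is an assumption, and the tiny primes and the function names `keygen`, `encrypt`, `decrypt` and `add_encrypted` are illustrative only and offer no security.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key generation; the small primes are for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(public_key, m, rng=random):
    """Encrypt an integer 0 <= m < n as g^m * r^n mod n^2."""
    n, g = public_key
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(public_key, private_key, c):
    """Recover m from c using L(x) = (x - 1) // n."""
    n, _ = public_key
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(public_key, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = public_key
    return (c1 * c2) % (n * n)


public_key, private_key = keygen()
c_sum = add_encrypted(public_key, encrypt(public_key, 12), encrypt(public_key, 30))
total = decrypt(public_key, private_key, c_sum)  # 12 + 30, computed on ciphertexts
```

The party that calls `add_encrypted` never sees 12 or 30, which is the property that makes such schemes attractive when an aggregator should not learn individual contributions.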

The applicant found that a well-built multi-agent (ABS) model can be applied to different regions: one only needs to adjust its predictable parameters (such as the age distribution and sex ratio of the population) according to the situation in the target region and then determine the values of the unpredictable parameters through verification, after which the model can be used to predict and analyze the subsequent development of the epidemic in the target region. Generally, the larger the area covered by the simulation and the more agents used to build the model, the better the model performs and the more accurately it reflects the real situation of the system. However, data on population distribution, population activity, and the epidemic situation in each region may involve privacy or security issues and is relatively sensitive, so it can usually be viewed only by authorized, trusted local institutions and cannot be pooled in one place for training/validation. Each institution can therefore only run verification simulations on its own limited data, the values obtained for the unpredictable parameters are often not optimal, and the performance of the model suffers, which may lead to deviations in prediction.

Based on this, the embodiments of the present application provide a multi-agent model training method, device, electronic equipment, computer-readable storage medium, and computer program product that enable multiple participant devices to jointly train a multi-agent model under the coordination of the collaborator device while ensuring the security of local data, thereby solving the problem of data islands in the field of multi-agent models.

Based on the above explanation of the terms involved in the embodiments of the present application, the implementation scenario of the multi-agent model training method provided by the embodiments is described below. Referring to FIG. 1, FIG. 1 is a schematic diagram of the implementation scenario of the multi-agent model training method provided by an embodiment of the present application. To support an exemplary application, participant devices 200-1, 200-2, ..., 200-n are connected to the collaborator device 400 through the network 300. The participant devices 200-1, 200-2, ..., 200-n may be devices of institutions that store predictable parameters, unpredictable parameters, and real values of the prediction target, such as hospitals, and the collaborator device 400 may be a device of a trusted institution. The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 assist each other in federated learning so that the participant devices 200-1, 200-2, ..., 200-n can obtain a multi-agent model. The network 300 may be a wide area network, a local area network, or a combination of the two, and uses wireless or wired links for data transmission.

The participant devices (including participant devices 200-1, 200-2, ..., 200-n) are configured to: input the training parameter values of predictable parameters into the local multi-agent model and, with the training parameter values held fixed, input each of multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results, where each parameter value group includes at least one parameter value of an unpredictable parameter; determine the influence factor of each parameter value group based on the multiple prediction results and the actual result corresponding to each prediction result; aggregate, based on each parameter value group and the corresponding influence factor, the parameter values of each unpredictable parameter to obtain the intermediate parameter value corresponding to each unpredictable parameter; and send the obtained intermediate parameter values to the collaborator device.
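The participant-side steps of scoring each parameter value group and aggregating locally can be sketched as follows. This passage does not give the formulas, so the sketch makes two loudly stated assumptions: the influence factor of a group is taken to be inversely proportional to its absolute prediction error and normalized to sum to 1, and local aggregation is taken to be an influence-weighted average. The names `influence_factors` and `aggregate_locally` are hypothetical.

```python
def influence_factors(predictions, actuals, eps=1e-9):
    """Assumed rule: smaller prediction error gives a larger, normalized factor."""
    errors = [abs(p - a) for p, a in zip(predictions, actuals)]
    raw = [1.0 / (e + eps) for e in errors]  # eps avoids division by zero
    total = sum(raw)
    return [r / total for r in raw]


def aggregate_locally(parameter_value_groups, factors):
    """Assumed rule: influence-weighted average, one value per unpredictable parameter."""
    n_params = len(parameter_value_groups[0])
    return [
        sum(f * group[j] for f, group in zip(factors, parameter_value_groups))
        for j in range(n_params)
    ]


# Two parameter value groups, each holding one unpredictable parameter value.
groups = [[0.10], [0.30]]
predictions = [110.0, 150.0]   # model output for each group (made-up numbers)
actuals = [120.0, 120.0]       # actual result corresponding to each prediction

factors = influence_factors(predictions, actuals)
intermediate = aggregate_locally(groups, factors)
```

With errors of 10 and 30, the first group receives roughly three times the weight of the second, so the intermediate value lies closer to 0.10 than to 0.30. Only `intermediate` would be sent to the collaborator device; the predictions and actual results stay local.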

The collaborator device (including the collaborator device 400) is configured to aggregate the intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each unpredictable parameter, and to send the target parameter values to the participant devices.
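The collaborator-side aggregation can be sketched in a few lines. The aggregation rule is not specified in this passage, so the sketch assumes a plain (optionally weighted) average per unpredictable parameter; the function name `aggregate_intermediate_values`, the participant labels and the weights are all hypothetical.

```python
def aggregate_intermediate_values(intermediate_by_participant, weights=None):
    """Assumed rule: weighted mean of the intermediate values, per parameter."""
    participants = sorted(intermediate_by_participant)
    if weights is None:
        # Default assumption: every participant counts equally.
        weights = {name: 1.0 for name in participants}
    total_weight = sum(weights[name] for name in participants)
    n_params = len(intermediate_by_participant[participants[0]])
    return [
        sum(weights[name] * intermediate_by_participant[name][j]
            for name in participants) / total_weight
        for j in range(n_params)
    ]


# Intermediate values for two unpredictable parameters from two participants.
received = {
    "participant_1": [0.2, 3.0],
    "participant_2": [0.4, 5.0],
}
target = aggregate_intermediate_values(received)
weighted_target = aggregate_intermediate_values(
    received, weights={"participant_1": 3.0, "participant_2": 1.0}
)
```

The collaborator would then return `target` to every participant device, which uses it to update its local multi-agent model.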

The participant devices (including the participant devices 200-1, 200-2, ..., 200-n) are further configured to receive the target parameter values corresponding to the unpredictable parameters returned by the collaborator device, and to update the multi-agent model based on the target parameter values.

In practical applications, the trained multi-agent model can be applied to modeling the novel coronavirus epidemic that has recently spread around the world, realizing joint modeling among multiple cities, regions, and countries, improving the prediction accuracy of the model, and providing more accurate data for the public and policymakers.

In practical applications, the participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 may be independent physical servers, server clusters or distributed systems composed of multiple physical servers, or cloud servers that provide basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and big data and artificial intelligence platforms. The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 may also be smart phones, tablet computers, notebook computers, desktop computers, smart speakers, smart watches, etc., but are not limited thereto. The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 may be connected directly or indirectly through wired or wireless communication, which is not limited in this application.

The hardware structure of the electronic device that implements the multi-agent model training method provided by the embodiments of the present application is described in detail below. The electronic device includes but is not limited to a server or a terminal. Referring to FIG. 2, FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device 200 shown in FIG. 2 includes at least one processor 210, a memory 250, at least one network interface 220 and a user interface 230. The components in the electronic device 200 are coupled together through the bus system 240. It can be understood that the bus system 240 is used to realize connection and communication between these components. In addition to a data bus, the bus system 240 also includes a power bus, a control bus and a status signal bus. However, for clarity of illustration, the various buses are all labeled as bus system 240 in FIG. 2.

Processor 210 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.

User interface 230 includes one or more output devices 231 that enable presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 230 also includes one or more input devices 232, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.

Memory 250 may be removable, non-removable or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 250 optionally includes one or more storage devices located physically remote from processor 210 .

Memory 250 includes volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The non-volatile memory can be read-only memory (Read Only Memory, ROM), and the volatile memory can be random access memory (Random Access Memory, RAM). The memory 250 described in the embodiment of the present application is intended to include any suitable type of memory.

In some embodiments, memory 250 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.

Operating system 251, including system programs for processing various basic system services and performing hardware-related tasks, such as framework layer, core library layer, driver layer, etc., for implementing various basic services and processing hardware-based tasks;

Network communication module 252, for reaching other computing devices via one or more (wired or wireless) network interfaces 220; exemplary network interfaces 220 include Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB), etc.;

The input processing module 253 is configured to detect one or more user inputs or interactions from one or more of the input devices 232 and translate the detected inputs or interactions.

In some embodiments, the multi-agent model training device provided by the embodiments of the present application may be implemented in software. FIG. 2 shows a multi-agent model training device 254 stored in the memory 250, which may be software in the form of a program, a plug-in, or the like, and includes the following software modules: acquisition module 2541, comparison module 2542, aggregation module 2543, sending module 2544, and update module 2545. These modules are logical, so they can be arbitrarily combined or further split according to the functions realized. The functions of each module are explained below.

In some other embodiments, the multi-agent model training device provided by the embodiments of the present application may be implemented by a combination of software and hardware. As an example, it may be a processor in the form of a hardware decoding processor that is programmed to execute the multi-agent model training method provided by the embodiments of the present application. For example, the processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), DSPs, programmable logic devices (Programmable Logic Device, PLD), complex programmable logic devices (Complex Programmable Logic Device, CPLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA) or other electronic components.

Based on the above description of the implementation scenario and the electronic device, the multi-agent model training method provided by the embodiments of the present application is described below. It should be noted that the training process of the multi-agent model in the embodiments differs significantly from the training process of a traditional machine learning model. Referring to FIG. 3, FIG. 3 is a comparison diagram of the verification process of the multi-agent model provided by an embodiment of the present application and the training process of a machine learning model. As shown in FIG. 3, obtaining an updated multi-agent model includes building an initial multi-agent model (model construction), verifying the multi-agent model (verification process), and testing the multi-agent model (testing process). Building the initial multi-agent model refers to initializing the model parameters, the preset loss function (used for updating the multi-agent model), and so on; the verification process determines the values of the unpredictable parameters; and the testing process refers to testing the correctness of the multi-agent model by modifying the output results of the model. Obtaining a converged machine learning model, by contrast, includes building an initial machine learning model, training the machine learning model through iterative updates, and testing the machine learning model. It should be noted that the verification process of the multi-agent model on real data is similar to the training process in machine learning: it optimizes the values of the unpredictable parameters so that the results predicted by the model are as close as possible to the real data.

Referring to Fig. 4, Fig. 4 is a schematic flow chart of the training method of the multi-agent model provided by the embodiment of the present application, the training method of the multi-agent model provided by the embodiment of the present application includes:

Step 101, the participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values held fixed, inputs each of multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results; wherein each parameter value group includes at least one parameter value of an unpredictable parameter.

In actual implementation, the values of the predictable parameters are determined according to the local conditions of each participant. For example, they may be the age, occupation, gender, and daily travel trajectory of local residents, or the gender, age, occupation, and number of people infected with the target disease and their movement trajectories. The training parameter values of the predictable parameters depend on the training purpose of the local multi-agent model, and different purposes lead to different predictable parameters; during one process of training and optimizing a multi-agent model, the values of the predictable parameters are fixed. As an example, if the multi-agent model is used to predict the number of local deaths from a disease, the total number of local residents and their gender, age, and so on are fixed predictable parameters during training and optimization. Correspondingly, to change the use of the multi-agent model, only the predictable parameters need to be adjusted. For example, when the model is used to predict the number of deaths in another region, the predictable parameters are adjusted to the total number of residents in that region and their gender, age, and so on. Alternatively, when the multi-agent model is used to predict the transmission probability of the disease, the fixed predictable parameter can be the number of contacts between healthy users and sick users; correspondingly, a new disease transmission probability can be determined by changing this predictable parameter.

In the embodiment of the present application, each parameter value group includes at least one parameter value of an unpredictable parameter. The value of an unpredictable parameter cannot be deduced from existing data or experience; it has to be obtained by substituting candidate values into the model and comparing the resulting predictions with the corresponding real values. That is, the value of the unpredictable parameter is adjusted until the model result is consistent with the actual prediction target, the optimal value is determined, and the accuracy of the simulation result is verified on test data. In other words, an appropriate value of the unpredictable parameter is selected so that the simulation results of the model conform to the real data (distribution) as closely as possible.

In some embodiments, for the process of inputting multiple parameter value groups into the multi-agent model for prediction and obtaining multiple prediction results, refer to FIG. 5. FIG. 5 is an optional schematic flowchart of the training method for the multi-agent model provided by the embodiment of the present application. Based on FIG. 4, step 101 can also be implemented in the following manner:

Step 1011, acquire the number of unpredictable parameters, and determine the number of parameter value groups based on the number of unpredictable parameters.

In actual implementation, the number of unpredictable parameters that need to be optimized is determined, so that the number of parameter value groups is determined based on the number of unpredictable parameters. As an example, when the number of unpredictable parameters to be optimized is n, the number of parameter value groups may be n+1.

Step 1012, based on the number of parameter value groups, determine the parameter values of the unpredictable parameters in each parameter value group.

In actual implementation, after the number of parameter value groups is determined, a matching number of parameter values is selected for the unpredictable parameters. Following the above example, when the number of parameter value groups is n+1, n+1 parameter values are selected for each unpredictable parameter, one per parameter value group. For example, when n is 3, there are four parameter value groups A, B, C and D, whose unpredictable parameter values are A(a₁, b₁, c₁, d₁), B(a₂, b₂, c₂, d₂), C(a₃, b₃, c₃, d₃) and D(a₄, b₄, c₄, d₄).

It should be noted that selecting the parameter values of the unpredictable parameters includes obtaining the parameter type of each unpredictable parameter in the parameter value group, determining the corresponding parameter value range according to that parameter type, and then determining the parameter value of each unpredictable parameter within its value range. Here, an unpredictable parameter can be the transmission coefficient of the disease, or the influence of weather, age, gender, etc. on the transmission of the disease. For example, when one of the unpredictable parameters to be optimized is the transmission coefficient of the disease, the value range of that parameter is determined to be 0 to K, and its parameter value is then randomly selected from the range 0 to K. Following the above example, if a is an unpredictable parameter to be optimized with a value range of 0 to K, then a₁, a₂, a₃ and a₄ are all parameter values in the range (0, K).
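Steps 1011 and 1012 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name `sample_parameter_groups`, the parameter names, and the concrete ranges (including K = 2.0) are hypothetical, and uniform random sampling is assumed because the text only says the values are randomly selected from the range.

```python
import random


def sample_parameter_groups(ranges, seed=None):
    """Draw n+1 parameter value groups for n unpredictable parameters.

    ranges: dict mapping a (hypothetical) parameter name to its (low, high) range.
    """
    rng = random.Random(seed)
    n = len(ranges)        # number of unpredictable parameters to optimize
    num_groups = n + 1     # step 1011: group count derived from n
    groups = []
    for _ in range(num_groups):
        # step 1012: one random value per parameter, inside that parameter's range
        groups.append({name: rng.uniform(low, high)
                       for name, (low, high) in ranges.items()})
    return groups


# Hypothetical ranges; 2.0 stands in for the upper bound K of the transmission coefficient.
ranges = {
    "transmission_coeff": (0.0, 2.0),
    "weather_effect": (0.0, 1.0),
    "age_effect": (0.0, 1.0),
}
groups = sample_parameter_groups(ranges, seed=0)
```

With three unpredictable parameters the sketch returns four groups, each holding one value per parameter.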

Step 1013, respectively input the parameter values of the unpredictable parameters in each parameter value group to the multi-agent model for prediction, and obtain multiple prediction results corresponding to multiple parameter value groups.

Following the above example, A(a₁, b₁, c₁, d₁), B(a₂, b₂, c₂, d₂), C(a₃, b₃, c₃, d₃) and D(a₄, b₄, c₄, d₄) are respectively input to the multi-agent model for prediction, and the prediction results corresponding to groups A, B, C and D are obtained.
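To make step 1013 concrete, the sketch below runs a toy agent-based infection simulation once per parameter value group while the predictable parameters stay fixed. The model itself is an assumption made purely for illustration (the application does not specify the internals of the multi-agent model), and `simulate_infections`, `fixed`, and the parameter names are hypothetical.

```python
import random


def simulate_infections(population, contacts_per_day, transmission_coeff, days, seed=0):
    """Toy agent-based simulation; returns the cumulative number of infected agents."""
    rng = random.Random(seed)
    infected = [False] * population
    infected[0] = True  # a single initial infected agent
    for _day in range(days):
        newly_infected = set()
        for agent in range(population):
            if not infected[agent]:
                continue
            for _contact in range(contacts_per_day):
                other = rng.randrange(population)
                # A healthy contact is infected with probability transmission_coeff.
                if not infected[other] and rng.random() < transmission_coeff:
                    newly_infected.add(other)
        for agent in newly_infected:
            infected[agent] = True
    return sum(infected)


# Predictable parameters: fixed for the whole optimization.
fixed = {"population": 200, "contacts_per_day": 3, "days": 10}
# Parameter value groups: candidate values of the unpredictable parameter.
groups = [{"transmission_coeff": 0.0},
          {"transmission_coeff": 0.1},
          {"transmission_coeff": 1.0}]
predictions = [simulate_infections(**fixed, **group) for group in groups]
```

Each entry of `predictions` is the prediction result for one parameter value group and would next be compared with the actual result in step 102.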

Step 102, based on multiple prediction results and actual results corresponding to each prediction result, determine the impact factor of each parameter value group.

Here, the impact factor can be used to characterize the degree of influence of unpredictable parameters in each parameter value group, that is, to characterize the degree of influence of each parameter value.

In some embodiments, determining the impact factor of each parameter value group based on the multiple prediction results and the actual result corresponding to each prediction result includes: determining the prediction accuracy corresponding to each parameter value group by comparing each prediction result with its actual result, and using the prediction accuracy corresponding to each parameter value group as the corresponding impact factor. Here, the prediction accuracy may serve as the weight of each parameter value group.

In some other embodiments, determining the impact factor of each parameter value group based on the multiple prediction results and the actual result corresponding to each prediction result includes: determining the loss value corresponding to each parameter value group, and determining the impact factor of each parameter value group based on its loss value. In actual implementation, the reciprocal of the loss value can be used as the impact factor of the corresponding parameter value group, so that the larger the loss value, the smaller the impact factor. Alternatively, the loss value itself can be used as the impact factor, so that the larger the loss value, the larger the impact factor. The embodiment of the present application does not limit how the impact factor of a parameter value group is determined from the loss value.
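The reciprocal-of-loss variant can be sketched as below. The small constant `eps` that guards against division by zero and the optional normalization into weights that sum to 1 are assumptions added for the illustration; neither is stated in the application.

```python
def impact_factors_from_losses(losses, eps=1e-12):
    """Reciprocal-of-loss impact factors: a larger loss gives a smaller factor."""
    return [1.0 / (loss + eps) for loss in losses]


def normalize(factors):
    """Optionally rescale the factors so they can be used as weights summing to 1."""
    total = sum(factors)
    return [factor / total for factor in factors]


losses = [1.0, 2.0, 4.0]  # hypothetical loss values of three parameter value groups
factors = impact_factors_from_losses(losses)
weights = normalize(factors)
```

The group with the smallest loss receives the largest factor and therefore the largest weight.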

Step 103, based on each parameter value group and the corresponding impact factor, the parameter values of each unpredictable parameter are aggregated to obtain the intermediate parameter value corresponding to each unpredictable parameter.

In some embodiments, when the impact factors of the parameter value groups are weights, the weight of each parameter value group is multiplied by the parameter values of the unpredictable parameters in that group to obtain the product result for each parameter value group; the product results of all parameter value groups are then accumulated, and the accumulated result is used as the intermediate parameter value of the unpredictable parameters. Following the above example, the parameter value groups are A(a₁, b₁, c₁, d₁), B(a₂, b₂, c₂, d₂), C(a₃, b₃, c₃, d₃) and D(a₄, b₄, c₄, d₄), and the corresponding weights are x, y, z and k; the intermediate parameter value P of the unpredictable parameters is then (a₁*x+a₂*y+a₃*z+a₄*k, b₁*x+b₂*y+b₃*z+b₄*k, c₁*x+c₂*y+c₃*z+c₄*k, d₁*x+d₂*y+d₃*z+d₄*k).
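The weighted accumulation for P can be written as a short function. The numeric values of the groups and of the weights x, y, z and k below are made up for the illustration.

```python
def weighted_aggregate(groups, weights):
    """Return P with P[j] = sum over groups i of weights[i] * groups[i][j]."""
    if len(groups) != len(weights):
        raise ValueError("one weight is required per parameter value group")
    dim = len(groups[0])
    return [sum(w * g[j] for g, w in zip(groups, weights)) for j in range(dim)]


# Hypothetical parameter value groups A, B, C, D and weights x, y, z, k.
A = [1.0, 2.0, 3.0, 4.0]
B = [2.0, 3.0, 4.0, 5.0]
C = [3.0, 4.0, 5.0, 6.0]
D = [4.0, 5.0, 6.0, 7.0]
x, y, z, k = 0.4, 0.3, 0.2, 0.1
P = weighted_aggregate([A, B, C, D], [x, y, z, k])
```

The first component of `P` equals a₁*x+a₂*y+a₃*z+a₄*k, matching the expression in the text.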

In some embodiments, when the impact factor of each parameter value group is related to the loss value corresponding to that group, the multiple parameter value groups are sorted based on their impact factors to obtain a sorting result; based on the sorting result, a target number of parameter value groups is selected from the multiple parameter value groups, wherein the target number is less than the number of parameter value groups; the average of the parameter values of each unpredictable parameter in the selected parameter value groups is obtained; and the average is used as the intermediate parameter value of the unpredictable parameter.

In actual implementation, when the impact factor is the reciprocal of the loss value, the multiple parameter value groups are sorted in descending or ascending order of loss value, and the target number of parameter value groups is then selected from the sorted parameter value groups, where the target number is less than the number of parameter value groups.

Following the above example, the parameter value groups are A(a₁, b₁, c₁, d₁), B(a₂, b₂, c₂, d₂), C(a₃, b₃, c₃, d₃) and D(a₄, b₄, c₄, d₄). Based on the loss values, the optimal model parameter value group A, the worst model parameter value group D, and the other model parameter value groups B and C are determined. The parameter values of the unpredictable parameters in the selected target number of parameter value groups are then aggregated parameter by parameter, that is, a₁, a₂, a₃ and a₄ are aggregated, b₁, b₂, b₃ and b₄ are aggregated, c₁, c₂, c₃ and c₄ are aggregated, and d₁, d₂, d₃ and d₄ are aggregated.

Here, aggregating the parameter values of the unpredictable parameters in the selected target number of parameter value groups includes obtaining the average of the parameter values of each unpredictable parameter in the selected groups and using that average as the intermediate parameter value of the unpredictable parameter. As an example, when n unpredictable parameters are optimized, n parameter value groups are selected from the n+1 parameter value groups, and the parameter values of each unpredictable parameter in the n selected groups are averaged to serve as the intermediate parameter value of that unpredictable parameter.
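The selection-then-average aggregation can be sketched as follows, assuming that a lower loss value marks a better parameter value group. The function name and the sample values are hypothetical.

```python
def average_best_groups(groups, losses, target_count):
    """Keep the target_count groups with the lowest loss and average them per parameter."""
    if not 0 < target_count < len(groups):
        raise ValueError("target_count must be positive and less than the number of groups")
    # Sort group indices by ascending loss (best group first).
    order = sorted(range(len(groups)), key=lambda i: losses[i])
    selected = [groups[i] for i in order[:target_count]]
    dim = len(selected[0])
    return [sum(g[j] for g in selected) / target_count for j in range(dim)]


# n = 2 unpredictable parameters, so n + 1 = 3 groups; the worst group is dropped.
groups = [[1.0, 1.0], [3.0, 3.0], [10.0, 10.0]]
losses = [0.1, 0.2, 5.0]
intermediate = average_best_groups(groups, losses, target_count=2)
```

Here the third group has the largest loss, so only the first two groups contribute to the intermediate parameter value.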

It should be noted that after the average of the parameter values of the unpredictable parameters in the target number of parameter value groups is obtained, the average can also be used to update the multi-agent model. The average and the previously selected target number of parameter value groups are then aggregated again, that is, a target number of parameter value groups is selected again and the parameter values of the unpredictable parameters in the newly selected groups are averaged. The process of updating the multi-agent model and aggregating again is iterated, and the average obtained by the last aggregation is used as the intermediate parameter value of the unpredictable parameter. In this way, each participant iteratively optimizes its unpredictable parameters locally for a preset number of rounds and obtains its final average, which is the intermediate parameter value.

Following the above example, the parameter value groups are A(a₁, b₁, c₁, d₁), B(a₂, b₂, c₂, d₂), C(a₃, b₃, c₃, d₃) and D(a₄, b₄, c₄, d₄). Based on the loss values, the optimal model parameter value group A, the worst model parameter value group D, and the other model parameter value groups B and C are determined, and the geometric center point of the optimal model parameter value group and the other model parameter value groups is then obtained. Referring to FIG. 6A, FIG. 6A is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by the embodiment of the present application, in which the geometric center point P of the three parameter value groups A, B and C is obtained, where P = [(a₁+a₂+a₃)/3, (b₁+b₂+b₃)/3, (c₁+c₂+c₃)/3, (d₁+d₂+d₃)/3]. After the geometric center point P is obtained, the model parameters are updated based on the model parameter value group corresponding to P. A, B, C and P are then substituted into the updated model for simulation, and the prediction results corresponding to these four model parameter value groups are obtained. Referring to FIG. 6B, FIG. 6B is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by the embodiment of the present application. According to the loss values, the optimal model parameter value group, the worst model parameter value group, and the other model parameter value groups are again determined among the four model parameter value groups A, B, C and P, and the geometric center point of the optimal model parameter value group and the other model parameter value groups is calculated. The above process continues, so that each participant locally and iteratively optimizes its own unpredictable parameters for a preset number of rounds and obtains its final geometric center point, that is, the intermediate parameter value.
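The local iteration described above (drop the worst group, take the geometric center of the rest, add that center as a new candidate, repeat) can be sketched as below. The callable `loss_fn` is a hypothetical stand-in for "simulate with these parameter values and compare with the actual result", and the toy loss used in the example is an assumption.

```python
def centroid(points):
    """Component-wise arithmetic mean of a list of equal-length points."""
    dim = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dim)]


def local_optimize(groups, loss_fn, rounds):
    """Iterate: rank by loss, drop the worst group, add the center of the rest."""
    points = [list(g) for g in groups]
    center = None
    for _ in range(rounds):
        ranked = sorted(points, key=loss_fn)  # best (lowest loss) first
        kept = ranked[:-1]                    # discard the worst group
        center = centroid(kept)               # geometric center point
        points = kept + [center]              # the center joins the candidates
    return center                             # intermediate parameter value


# Toy loss: squared distance to a hidden optimum (1.0, 2.0), for illustration only.
def toy_loss(point):
    return (point[0] - 1.0) ** 2 + (point[1] - 2.0) ** 2


start = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [5.0, 5.0]]
intermediate = local_optimize(start, toy_loss, rounds=1)
```

Only the newly added center needs a fresh simulation in each round, because the kept groups were already evaluated.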

In this way, aggregating the parameter values of the unpredictable parameters of the parameter value groups as described above does not generate additional simulations, that is, no new global unpredictable parameter values are produced, so the participants do not need to simulate new values. Compared with unilateral local optimization, the optimal unpredictable parameter values can be found faster and more stably, reducing the number of simulations and model calculations.

In some embodiments, when the impact factors of the parameter value groups are weights, the multiple parameter value groups can also be sorted based on the weight of each group, and a target number of parameter value groups can be selected from them based on the sorting result, wherein the target number is less than the number of parameter value groups. The weight of each selected parameter value group is then multiplied by the parameter values of the unpredictable parameters in that group to obtain the product result for each selected group, the product results are accumulated, and the accumulated result is used as the intermediate parameter value of the unpredictable parameters.

It should be noted that, when aggregating the parameter values of the unpredictable parameters based on each parameter value group and the corresponding impact factor, the multiple parameter value groups can also be sorted based on the loss values, and a target number of parameter value groups can be selected from them based on the sorting result, wherein the target number is less than the number of parameter value groups. The weight of each selected parameter value group is then multiplied by the parameter values of the unpredictable parameters in that group, the product results are accumulated, and the accumulated result is used as the intermediate parameter value of the unpredictable parameters. The embodiment of the present application does not limit the way in which the parameter values of the unpredictable parameters are aggregated based on each parameter value group and the corresponding impact factor.

Step 104, sending the obtained intermediate parameter values to the collaborator device, wherein the intermediate parameter values are used to trigger the collaborator device to aggregate the intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each unpredictable parameter.

In actual implementation, after the intermediate parameter values are obtained, privacy protection is applied to the intermediate parameter value of each unpredictable parameter to obtain privacy-protected intermediate parameter values. Here, the privacy protection can be obfuscation of the intermediate parameter values, for example adding noise or applying differential privacy processing. What the collaborator device obtains is the parameter values produced by at least two participant devices after privacy processing of their intermediate parameter values. When these parameter values are aggregated, the noise in them cancels out and does not affect the aggregation result of the intermediate parameter values. In addition, the privacy protection can also be homomorphic encryption of the intermediate parameter values.
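One way in which added noise can cancel out during aggregation is pairwise masking, sketched below. This particular scheme is an assumption chosen for illustration, since the application does not specify how the noise is constructed: every pair of participants shares a random mask that one adds and the other subtracts, so the masks sum to zero across participants. The function name and sample values are hypothetical.

```python
import random


def zero_sum_masks(num_parties, dim, seed=None):
    """Build one noise vector per party such that the vectors sum to (almost) zero."""
    rng = random.Random(seed)
    masks = [[0.0] * dim for _ in range(num_parties)]
    for i in range(num_parties):
        for j in range(i + 1, num_parties):
            for d in range(dim):
                r = rng.uniform(-1.0, 1.0)  # mask shared by parties i and j
                masks[i][d] += r            # party i adds it
                masks[j][d] -= r            # party j subtracts it
    return masks


# Hypothetical intermediate parameter values of three participants.
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
masks = zero_sum_masks(num_parties=3, dim=2, seed=1)
masked = [[v + m for v, m in zip(value, mask)] for value, mask in zip(values, masks)]

# The collaborator only sees `masked`, yet the average is preserved up to rounding.
avg_masked = [sum(column) / len(masked) for column in zip(*masked)]
```

Each masked vector differs from the true intermediate value, while `avg_masked` matches the average of `values` up to floating-point rounding.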

In actual implementation, there are many ways for the collaborator to aggregate the intermediate parameter values sent by multiple participant devices. For example, the center points uploaded by the participants can be averaged. Alternatively, in addition to uploading the geometric center point, each participant also uploads the loss value of the optimal model parameter value group or of the worst model parameter value group, or the average loss value of all model parameter value groups other than the worst one; the participants are then sorted according to the loss value, and several of the better center points are selected and averaged to obtain a new center point. The embodiment of the present application does not limit the parameter aggregation operation performed by the collaborator.

Step 105, receiving target parameter values corresponding to unpredictable parameters returned by the coordinating device, and updating the multi-agent model based on the target parameter values.

It should be noted that there are two ways for the participants to update the multi-agent model based on the target parameter values.

In some embodiments, refer to FIG. 7A, which is an optional flowchart of the multi-agent model training method provided by the embodiment of the present application. Here, the entire model training process is divided into two stages. The first stage is local multi-agent model training: each participant trains until its model reaches the convergence condition and then uploads the intermediate parameter values at convergence to the collaborator device (the parameter aggregation device), where the intermediate parameter values trigger the collaborator device to perform the second-stage parameter aggregation operation. To suit preliminary or rapid modeling scenarios, the second-stage parameter aggregation can be performed only once, after which the entire model is considered to have converged.

In some other embodiments, refer to FIG. 7B, which is an optional flowchart of the multi-agent model training method provided by the embodiment of the present application. Here, the participants can also alternate local multi-agent model training with parameter aggregation. That is, each participant uploads its intermediate parameter values to the collaborator device, where the intermediate parameter values trigger the collaborator device to perform the second-stage parameter aggregation operation only once; the aggregated target parameter values are then returned to each participant device so that each participant device updates its local model. Each participant then continues to simulate the local multi-agent model based on the updated model and again uploads intermediate parameter values to the collaborator device, and the above process continues until the local multi-agent model converges.

It should be noted that in the second update method above, after each participant device obtains the target parameter values, it updates the local multi-agent model based on the target parameter values. It then inputs the target parameter values, together with the target number of parameter value groups selected before the model update, into the updated local multi-agent model, and aggregates the target parameter values with those previously selected groups. That is, a target number of parameter value groups is selected again, the average of the parameter values of the unpredictable parameters in the newly selected groups is calculated and sent to the collaborator device as the intermediate parameter value, and the above process continues.
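The alternating scheme of FIG. 7B can be sketched as a loop in which every round consists of one local aggregation per participant followed by one aggregation at the collaborator. Several simplifications are assumed here: a fixed number of rounds replaces the convergence check, plain averaging is used on both sides, and each `loss_fn` is a hypothetical stand-in for the local simulation and comparison with actual results.

```python
def centroid(points):
    """Component-wise arithmetic mean of a list of equal-length points."""
    dim = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dim)]


def local_round(points, loss_fn):
    """Participant side: drop the worst group and average the remaining ones."""
    ranked = sorted(points, key=loss_fn)
    kept = ranked[:-1]
    return centroid(kept), kept


def federated_training(party_points, party_losses, rounds):
    """Alternate local aggregation and collaborator aggregation for `rounds` rounds."""
    target = None
    for _ in range(rounds):
        centers, kept_sets = [], []
        for points, loss_fn in zip(party_points, party_losses):
            center, kept = local_round(points, loss_fn)
            centers.append(center)      # intermediate parameter value uploaded
            kept_sets.append(kept)
        target = centroid(centers)      # collaborator aggregates once per round
        # Each participant evaluates the returned target next to its kept groups.
        party_points = [kept + [target] for kept in kept_sets]
    return target


# Two participants, one unpredictable parameter each, toy losses for illustration.
party_points = [[[0.0], [2.0], [10.0]], [[1.0], [3.0], [-8.0]]]
party_losses = [lambda p: (p[0] - 1.0) ** 2, lambda p: (p[0] - 2.0) ** 2]
target = federated_training(party_points, party_losses, rounds=1)
```

After one round the first participant uploads the center 1.0, the second uploads 2.0, and the collaborator returns their average as the target parameter value.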

In some embodiments, after the training of the multi-agent model is completed, the multi-agent model can be put to other uses by changing the actual parameter values of the predictable parameters, where the actual parameter values are different from the training parameter values of the predictable parameters. As an example, the predictable parameters include the sex, age, occupation, and number of persons infected with the target disease, and the actual parameter values may be the sex, age, occupation, and number of persons infected with the target disease in the target area. The actual parameter values are then input into the updated multi-agent model for prediction, so that the number of deaths caused by the target disease in the target area can be obtained.

In this way, the multi-agent model is used to predict disease-related data, which improves the accuracy of the model prediction and makes it possible to keep the disease situation under timely control, so that medical resources can be dispatched quickly and disease prevention and control can be carried out in time.

Applying the above-mentioned embodiments of the present application, in contrast to the related art, in which a multi-agent model can only be trained by the data owner alone, the intermediate parameter values obtained by each participant's local aggregation of the unpredictable parameters are sent to the collaborator, and the multi-agent model is updated based on the target parameter values returned by the collaborator after its secondary aggregation of the received intermediate parameter values. In this way, when multiple participants train multi-agent models with the same purpose, they jointly optimize the values of the unpredictable parameters and thereby obtain a multi-agent model whose simulation results conform better to the real data, while the security of local data is ensured. This solves the problem of data islands in the field of multi-agent models, realizes joint modeling among multiple participants, and improves the prediction accuracy of the model.

After describing the training method of the multi-agent model provided by the embodiment of the present application, the application of the trained multi-agent model is described next by introducing the prediction method based on the multi-agent model. Referring to FIG. 8, FIG. 8 is a schematic flowchart of the prediction method of the multi-agent model provided by the embodiment of the present application. The prediction method based on the multi-agent model provided by the embodiment of the present application includes:

In step 201, the participant device acquires an actual parameter value of a predictable parameter, wherein the actual parameter value is different from a training parameter value of the predictable parameter.

In actual implementation, obtaining the actual parameter values of the predictable parameters includes obtaining the total number of residents in the target area, the sex, age, and occupation of the residents, and the sex, age, occupation of the target disease infected person, and the activity track of the infected person. Here, the target area can be a certain city or a certain country, the target disease can be a new type of disease with strong transmission, and the target disease infected person can be at least one foreign disease infected person who flows into the target area from an area outside the target area , or it could be a free-moving local spreader not subject to disease control in the target area.

Step 202, input actual parameter values into the updated multi-agent model for prediction, and obtain corresponding prediction results.

In actual implementation, the acquired total number of residents in the target area, the sex, age, and occupation of the residents, and the sex, age, occupation, and activity trajectory of the persons infected with the target disease are input into the updated multi-agent model. The model can then predict the impact of the infected persons on the residents in the target area, that is, the number of new infections in the target area caused by the persons infected with the target disease can be obtained.

In this way, once specific predictable parameter values are obtained, the updated multi-agent model can predict the impact of persons infected with the target disease on the target area, that is, the number of infections, more accurately than the multi-agent model before the update. This makes it possible to fully prepare medical resources, provide timely treatment for infected persons, and avoid a rise in mortality due to insufficient medical resources.

In some embodiments, the updated multi-agent model can also be used to predict urban traffic conditions, that is, to predict the number of congested vehicles on a target road segment in the target area within a target time period in the future. This specifically includes obtaining the actual parameter values of the predictable parameters of the target area, such as the population travel trajectories, the distribution of office areas, and holiday times; here, the target areas can be different central areas of a city. In actual implementation, the acquired population travel trajectories, office area distribution, holiday times, etc. are input into the updated multi-agent model, which can then predict the number of congested vehicles on the target road segment in the target area within the target time period. In this way, once specific predictable parameter values are obtained, the updated multi-agent model can predict the congestion of the target road segment in the target area within the target time period more accurately than the multi-agent model before the update, so that timely traffic control can be carried out.

Next, taking the application scenario of horizontal federated learning as an example, the training of the multi-agent model provided by the embodiment of the present application will be described. In the scenario of horizontal federated learning, there is usually one collaborator and at least two participants, that is, the training of the model is jointly implemented by one collaborator device and at least two participant devices. Both the participant device and the coordinating device can be servers or terminals. Referring to FIG. 9, FIG. 9 is a schematic flowchart of a training method for a multi-agent model provided in an embodiment of the present application, including:

Step 301, each participant device initializes a local multi-agent model.

Here, in the application scenario of horizontal federated learning, each participant is a data holder. The data sets owned by the participants have relatively little user overlap but relatively large user feature overlap, and each participant has the labels of its own users. For example, the participants can be hospitals in different regions: the users they reach are residents of different regions (that is, different samples), but the business is the same (that is, the features are the same). Correspondingly, the collaborator device can belong to a trusted institution.

Referring to FIG. 10, FIG. 10 is a schematic diagram of horizontal federated learning of a multi-agent model provided by the embodiment of the present application. Here, one collaborator device and n participant devices are shown, and the structure and working method of each participant are the same. In this embodiment, each participant device has the same multi-agent model, with its own private predictable parameters X_{1,E}, ..., X_{N,E}, its own unpredictable parameters X_{1,V}, ..., X_{N,V}, and the target variables Y_{1,gt}, ..., Y_{N,gt} of each party's local multi-agent model simulation. In specific implementation, the local multi-agent model is initialized by determining the value of the predictable parameter X_E, the structure of the multi-agent model, and the prediction target Y_gt, and by selecting the unpredictable parameter X_V.

Step 302, input the parameter values of the predictable parameters into the local multi-agent model.

Continuing to refer to FIG. 10, the private predictable parameters X_{1,E}, ..., X_{N,E} are input to the local ABS model.

Step 303, in the case of fixing the parameter value of the predictable parameter, input multiple parameter value groups into the multi-agent model for prediction respectively, and obtain multiple prediction results.

As an example, consider optimizing two parameters (a, b). Continuing to refer to FIG. 10, the respective unpredictable parameters X_{1,V}, ..., X_{N,V} are input to the local ABS model; in terms of the above example, X_{1,V} corresponds to the parameter a and X_{2,V} corresponds to the parameter b. Each participant initializes three sets of values, each of which contains one value for each of the two parameters and can be regarded as a point, namely [a₁, b₁], [a₂, b₂] and [a₃, b₃]. These three sets of parameters are each substituted into the model for simulation, and the model prediction results corresponding to the three sets of parameters are obtained.

Step 304, respectively comparing multiple predicted results with corresponding actual results.

Continuing with the above example, if the purpose of the multi-agent model is to predict the number of local deaths, then the actual number of local deaths within a certain period of time is the actual result, and comparing the multiple prediction results with the corresponding actual results means comparing the death tolls predicted for [a₁, b₁], [a₂, b₂] and [a₃, b₃] with the actual local death toll.

Step 305, based on the comparison result, determine the loss value corresponding to each parameter value group.

In actual implementation, the mean square error (MSE) is usually used as the loss function to calculate the loss value corresponding to each parameter value group.
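A minimal sketch of the MSE loss for one parameter value group is given below; the daily death counts are made-up numbers used only to exercise the function.

```python
def mean_squared_error(predicted, actual):
    """Mean of the squared differences between predicted and actual values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("predicted and actual must be non-empty and equally long")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical daily death counts: model prediction for one group vs. local records.
predicted = [10.0, 12.0, 9.0]
actual = [11.0, 12.0, 7.0]
loss = mean_squared_error(predicted, actual)  # (1 + 0 + 4) / 3
```

Computing this loss once per parameter value group yields the values that are sorted in step 306.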

Step 306, sort the multiple loss values to obtain the optimal model parameter value group, the worst model parameter value group and other model parameter value groups.

Following the above example, the loss values of the prediction results corresponding to [a₁, b₁], [a₂, b₂] and [a₃, b₃] are determined, and the three loss values are sorted to obtain the optimal model parameter value group [a₁, b₁], the worst model parameter value group [a₂, b₂], and the other model parameter value group [a₃, b₃].

Step 307, aggregate parameter values of unpredictable parameters of all model parameter value groups except the worst model parameter value group to obtain intermediate parameter values corresponding to each unpredictable parameter.

As an example, aggregating the parameter values of the unpredictable parameters here can mean obtaining the geometric center point of the optimal model parameter value group and the other model parameter value groups. Referring to FIG. 11, FIG. 11 is an optional schematic diagram of the aggregation of unpredictable parameters of the multi-agent model. Following the above example, the geometric center point C of the optimal model parameter value group [a₁, b₁] and the other model parameter value group [a₃, b₃] is found, where C = [(a₁+a₃)/2, (b₁+b₃)/2].

It should be noted that after the geometric center point C is obtained, the model parameters are updated based on the model parameter value group [(a₁ + a₃)/2, (b₁ + b₃)/2] corresponding to C. The groups [a₁, b₁], [a₃, b₃] and [(a₁ + a₃)/2, (b₁ + b₃)/2] are then brought into the updated model for simulation to obtain the prediction results corresponding to the three model parameter value groups, and the process of step 304 to step 307 is repeated. In this way, each participant i iteratively optimizes its own unpredictable parameters for N_L rounds locally and obtains its final geometric center point C_{i,V}^{t+1}, that is, the intermediate parameter value.
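The local iteration over N_L rounds can be sketched as below. This is an illustrative reading of steps 304 to 307, not the applicant's implementation: `simulate` stands in for the local multi-agent model with the predictable parameters held fixed, and the linear toy model, the starting groups and the function names are all assumptions.

```python
from typing import Callable, List, Sequence


def mse(pred: Sequence[float], actual: Sequence[float]) -> float:
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def centroid(points: Sequence[Sequence[float]]) -> List[float]:
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]


def local_optimize(
    simulate: Callable[[Sequence[float]], List[float]],
    groups: Sequence[Sequence[float]],
    actual: Sequence[float],
    rounds: int,
) -> List[float]:
    """Run `rounds` (N_L >= 1) local iterations and return the final center point."""
    if rounds < 1 or len(groups) < 2:
        raise ValueError("need at least one round and two parameter value groups")
    current = [list(g) for g in groups]
    for _ in range(rounds):
        losses = [mse(simulate(g), actual) for g in current]      # steps 304-305
        worst = max(range(len(current)), key=lambda i: losses[i])  # step 306
        kept = [g for i, g in enumerate(current) if i != worst]
        center = centroid(kept)                                    # step 307
        current = kept + [center]  # retained groups plus C go into the next round
    return center


# Toy stand-in for the multi-agent model: deaths(t) = a * t + b (an assumption).
def toy_model(group: Sequence[float]) -> List[float]:
    return [group[0] * t + group[1] for t in range(4)]


observed = toy_model([2.0, 1.0])
start = [[1.0, 0.0], [5.0, 5.0], [3.0, 2.0]]
center_after_one_round = local_optimize(toy_model, start, observed, rounds=1)
center_after_two_rounds = local_optimize(toy_model, start, observed, rounds=2)
```

Because each round replaces only the worst group with the center of the remaining groups, the center point does not necessarily move closer to the optimum in every round; the number of local rounds N_L is therefore a tuning choice.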

Step 308, send the intermediate parameter value to the coordinating device.

Continuing to refer to FIG. 10, the n participant devices send their respective final geometric center points C_{i,V}^{t+1} to the coordinating device.

In step 309, the coordinating device aggregates the received intermediate parameter values to obtain target parameter values corresponding to each unpredictable parameter.

As an example, three specific aggregation methods are listed to describe in detail how the coordinating device aggregates the received intermediate parameter values:

a) A typical aggregation method is to take the geometric center (centroid) of all uploaded center points, that is, C_{Server,V}^{t+1} = centroid(C_{1,V}^{t+1}, ..., C_{N,V}^{t+1}).

b) The center points uploaded by some of the participants are randomly selected for averaging, for example by randomly selecting K parties, K < N: C_{Server,V}^{t+1} = centroid(C_{1,V}^{t+1}, ..., C_{K,V}^{t+1}), where the indices 1, ..., K denote the selected participants.

c) In addition to the geometric center point, each participant also uploads the loss value of its best point or of its worst point, or the average loss value of all points except the worst point. The participants are sorted according to the loss value, and the best K center points, K < N, are averaged to obtain a new center point: C_{Server,V}^{t+1} = centroid(C_{1,V}^{t+1}, ..., C_{K,V}^{t+1}), where the indices 1, ..., K denote the K best participants.

Exemplarily, the coordinating device aggregates the received geometric center points, that is, calculates the coordinate-wise average of C₁, ..., C_n. If C₁ = [x₁, y₁], ..., C_n = [x_n, y_n], then C_{Server,V}^{t+1} = [(x₁ + ... + x_n)/n, (y₁ + ... + y_n)/n].
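The three aggregation methods a), b) and c) described for step 309 can be sketched on the coordinating device side as follows. The function names, the use of Python's `random.Random` for the random selection and the example values are assumptions made for illustration only.

```python
import random
from typing import List, Optional, Sequence


def centroid(points: Sequence[Sequence[float]]) -> List[float]:
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]


def aggregate_all(centers: Sequence[Sequence[float]]) -> List[float]:
    """Method a): centroid of the center points uploaded by all N participants."""
    return centroid(centers)


def aggregate_random_k(
    centers: Sequence[Sequence[float]], k: int, rng: Optional[random.Random] = None
) -> List[float]:
    """Method b): centroid of K randomly selected center points, K < N."""
    if not 0 < k < len(centers):
        raise ValueError("k must satisfy 0 < k < N")
    rng = rng or random.Random()
    return centroid(rng.sample(list(centers), k))


def aggregate_best_k(
    centers: Sequence[Sequence[float]], losses: Sequence[float], k: int
) -> List[float]:
    """Method c): centroid of the K center points with the lowest uploaded loss."""
    if not 0 < k < len(centers):
        raise ValueError("k must satisfy 0 < k < N")
    order = sorted(range(len(centers)), key=lambda i: losses[i])
    return centroid([centers[i] for i in order[:k]])


# Hypothetical center points uploaded by three participants, with their loss values.
uploaded = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
uploaded_losses = [0.5, 0.1, 2.0]

target_a = aggregate_all(uploaded)
target_b = aggregate_random_k(uploaded, k=2, rng=random.Random(0))
target_c = aggregate_best_k(uploaded, uploaded_losses, k=2)
```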

Step 310, sending the target parameter value to each participant device.

Continuing to refer to FIG. 10, the coordinating device sends the target parameter value C_{Server,V}^{t+1} corresponding to each unpredictable parameter obtained through aggregation to the n participant devices.

Step 311, update the multi-agent model based on the target parameter value.

In actual implementation, after obtaining the target parameter value, that is, the optimized unpredictable parameter, the participant device optimizes the local multi-agent model according to the unpredictable parameter.

The following continues to describe the multi-agent model training device 254 provided by the embodiment of the present application. Referring to FIG. 12, FIG. 12 is a schematic structural diagram of the multi-agent model training device 254 provided by the embodiment of the present application. The multi-agent model training device 254 includes:

The obtaining module 2541 is configured such that the participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, respectively inputs multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results; wherein the parameter value group includes at least one parameter value of an unpredictable parameter;

The comparison module 2542 is configured to determine the impact factor of each of the parameter value groups based on the plurality of prediction results and the actual results corresponding to each of the prediction results;

The aggregation module 2543 is configured to aggregate the parameter values of each of the unpredictable parameters based on each of the parameter value groups and the corresponding impact factors, to obtain intermediate parameter values corresponding to each of the unpredictable parameters;

The sending module 2544 is configured to send the obtained intermediate parameter value to the coordinating device, where the intermediate parameter value is used to trigger the coordinating device to aggregate the intermediate parameter values sent by multiple participant devices to obtain target parameter values corresponding to each of the unpredictable parameters;

The updating module 2545 is configured to receive the target parameter values corresponding to the unpredictable parameters returned by the coordinating device, and update the multi-agent model based on the target parameter values.

In some embodiments, the obtaining module 2541 is further configured to obtain the number of unpredictable parameters and determine the number of parameter value groups based on the number of unpredictable parameters; determine, based on the number of parameter value groups, the parameter values of the unpredictable parameters in each parameter value group; and respectively input the parameter values of the unpredictable parameters in each parameter value group into the multi-agent model for prediction to obtain multiple prediction results corresponding to the multiple parameter value groups.

In some embodiments, the obtaining module 2541 is further configured to obtain the parameter type of each unpredictable parameter in the parameter value group; determine the corresponding parameter value range according to the parameter type of each unpredictable parameter; and determine the parameter value of each unpredictable parameter according to its parameter value range.
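One possible way to generate the initial parameter value groups is sketched below. Several points are assumptions rather than statements of the application: the number of groups is taken as the number of unpredictable parameters plus one (which matches the two-parameter, three-group example above but is not stated as a rule), values are drawn uniformly from each range, and the parameter types and ranges in `TYPE_RANGES` are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical parameter types and value ranges (not taken from the application).
TYPE_RANGES: Dict[str, Tuple[float, float]] = {
    "probability": (0.0, 1.0),
    "duration_days": (1.0, 30.0),
}


def init_parameter_groups(param_types: List[str], rng: random.Random) -> List[List[float]]:
    """Draw one value per unpredictable parameter for each parameter value group."""
    n_groups = len(param_types) + 1  # assumption: one more group than parameters
    return [
        [rng.uniform(*TYPE_RANGES[ptype]) for ptype in param_types]
        for _ in range(n_groups)
    ]


# Two unpredictable parameters, for example an infection probability and an
# incubation period; both names are illustrative.
param_types = ["probability", "duration_days"]
groups = init_parameter_groups(param_types, random.Random(0))
```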

In some embodiments, the comparison module 2542 is further configured to determine the prediction accuracy corresponding to each parameter value group based on the prediction result corresponding to each parameter value group and the corresponding actual result; and use the prediction accuracy corresponding to each of the parameter value groups as the corresponding impact factor.

In some embodiments, the aggregation module 2543 is further configured to multiply the prediction accuracy corresponding to each of the parameter value groups by the parameter value of the unpredictable parameter to obtain a product result corresponding to each of the parameter value groups; accumulate the product results corresponding to each of the parameter value groups to obtain an accumulation result; and use the accumulation result as the intermediate parameter value of the unpredictable parameter.
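The accuracy-weighted accumulation described above can be sketched as follows. The text only specifies multiplying and accumulating; the optional `normalize` flag, which rescales the accuracies so that they sum to 1, is an added assumption and is off by default. The function name and values are hypothetical.

```python
from typing import List, Sequence


def weighted_accumulate(
    groups: Sequence[Sequence[float]],
    accuracies: Sequence[float],
    normalize: bool = False,
) -> List[float]:
    """Accumulate accuracy * parameter value over all groups, per parameter."""
    if len(groups) != len(accuracies) or not groups:
        raise ValueError("one accuracy per parameter value group is required")
    weights = list(accuracies)
    if normalize:  # assumption: not part of the described method
        total = sum(weights)
        weights = [w / total for w in weights]
    dim = len(groups[0])
    return [
        sum(w * group[d] for w, group in zip(weights, groups))
        for d in range(dim)
    ]


# Hypothetical groups and prediction accuracies used as impact factors.
value_groups = [[1.0, 4.0], [3.0, 2.0]]
accuracy = [0.25, 0.75]
intermediate = weighted_accumulate(value_groups, accuracy)
```

When the accuracies do not sum to 1, the accumulated value is scaled accordingly, which is why a normalization step may be useful in practice.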

In some embodiments, the comparison module 2542 is further configured to determine the loss value corresponding to each parameter value group based on the prediction result corresponding to each parameter value group and the corresponding actual result; and determine the impact factor of the corresponding parameter value group based on the loss value corresponding to each parameter value group.

In some embodiments, the aggregation module 2543 is further configured to sort the plurality of parameter value groups based on the impact factor of each parameter value group to obtain a sorting result; select, based on the sorting result, a target number of parameter value groups from the plurality of parameter value groups, wherein the target number is smaller than the number of the plurality of parameter value groups; and aggregate, based on the selected target number of parameter value groups, the parameter values of each of the unpredictable parameters to obtain intermediate parameter values corresponding to each of the unpredictable parameters.

In some embodiments, the aggregation module 2543 is further configured to obtain the average value of the parameter values of the unpredictable parameter in the target number of parameter value groups; and use the average value as the intermediate parameter value of the unpredictable parameter.
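Selecting a target number of groups by impact factor and averaging them can be sketched as follows. The sketch assumes that a larger impact factor is better (for example a prediction accuracy); when the impact factor is derived from a loss value, the sort order would be reversed. The function name and values are hypothetical.

```python
from typing import List, Sequence


def aggregate_top_groups(
    groups: Sequence[Sequence[float]],
    impact_factors: Sequence[float],
    target_number: int,
) -> List[float]:
    """Average each unpredictable parameter over the best `target_number` groups."""
    if len(groups) != len(impact_factors):
        raise ValueError("one impact factor per parameter value group is required")
    if not 0 < target_number < len(groups):
        raise ValueError("target_number must be smaller than the number of groups")
    # Assumption: a larger impact factor means a better parameter value group.
    order = sorted(range(len(groups)), key=lambda i: impact_factors[i], reverse=True)
    selected = [groups[i] for i in order[:target_number]]
    dim = len(selected[0])
    return [sum(g[d] for g in selected) / target_number for d in range(dim)]


candidate_groups = [[1.0, 4.0], [9.0, 9.0], [3.0, 2.0]]
factors = [0.9, 0.1, 0.7]
intermediate_values = aggregate_top_groups(candidate_groups, factors, target_number=2)
```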

In some embodiments, the sending module 2544 is further configured to perform privacy protection on the intermediate parameter value of each of the unpredictable parameters to obtain privacy-protected intermediate parameter values; and send the privacy-protected intermediate parameter values to the coordinating device, wherein the privacy-protected intermediate parameter values are used to trigger the coordinating device to aggregate the privacy-protected intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each of the unpredictable parameters.
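The application does not specify the privacy protection technique at this point. Purely as an illustration, the sketch below uses pairwise additive masking, a swapped-in technique chosen for this example: each pair of participants shares a random mask that one adds and the other subtracts, so individual uploads are hidden while the masks cancel in the average over all participants. The single seeded generator stands in for pairwise shared secrets, and all names and values are hypothetical.

```python
import random
from typing import List, Sequence


def pairwise_masks(n_parties: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Build one mask per participant such that all masks sum to zero."""
    masks = [[0.0] * dim for _ in range(n_parties)]
    rng = random.Random(seed)  # stand-in for secrets shared between each pair
    for i in range(n_parties):
        for j in range(i + 1, n_parties):
            shared = [rng.uniform(-1000.0, 1000.0) for _ in range(dim)]
            for d in range(dim):
                masks[i][d] += shared[d]  # participant i adds the shared mask
                masks[j][d] -= shared[d]  # participant j subtracts it
    return masks


def mean(points: Sequence[Sequence[float]]) -> List[float]:
    return [sum(p[d] for p in points) / len(points) for d in range(len(points[0]))]


# Hypothetical intermediate parameter values of three participants.
true_centers = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
masks = pairwise_masks(n_parties=3, dim=2, seed=42)
masked_uploads = [
    [value + mask for value, mask in zip(center, m)]
    for center, m in zip(true_centers, masks)
]

# The coordinating device only sees masked uploads; their average matches the
# average of the true values up to floating-point rounding.
aggregated = mean(masked_uploads)
```

Masks of this kind cancel only when the uploads of all participants are averaged together, so this sketch fits aggregation over all participants and not a selection of K of them.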

In some embodiments, the device further includes a second acquisition module 1210 and a prediction module 1220. The second acquisition module 1210 is configured to acquire an actual parameter value of the predictable parameter, the actual parameter value being different from the training parameter value of the predictable parameter; the prediction module 1220 is configured to input the actual parameter value into the updated multi-agent model for prediction and obtain a corresponding prediction result.

In some embodiments, the predictable parameters include the sex, age and occupation of the persons infected with the target disease and the number of infected persons; the second acquisition module 1210 is further configured to obtain the sex, age and occupation of the persons infected with the target disease in the target area and the number of infected persons; the prediction module 1220 is further configured to input the sex, age and occupation of the persons infected with the target disease in the target area and the number of infected persons into the updated multi-agent model, to predict the number of deaths caused by the target disease in the target area.

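The following describes the prediction device 1200 based on the multi-agent model provided by the embodiment of the present application. Referring to FIG. 13, FIG. 13 is a schematic structural diagram of the prediction device 1200 based on the multi-agent model provided by the embodiment of the present application. The prediction device 1200 based on the multi-agent model includes: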

The second acquiring module 1210 is configured to acquire an actual parameter value of the predictable parameter, where the actual parameter value is different from the training parameter value of the predictable parameter;

The prediction module 1220 is configured to input the actual parameter values into the updated multi-agent model for prediction, and obtain corresponding prediction results.

The embodiment of the present application also provides an electronic device, and the electronic device includes:

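a memory for storing executable instructions; and

a processor configured to implement the multi-agent model training method provided by the embodiment of the present application when executing the executable instructions stored in the memory.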

The embodiment of the present application also provides a computer program product, including a computer program, and when the computer program is executed by a processor, the multi-agent model training method provided in the embodiment of the present application is implemented.

The embodiment of the present application also provides a computer-readable storage medium storing executable instructions. When the executable instructions are executed by a processor, the processor is caused to execute the multi-agent model training method provided by the embodiment of the present application.

In some embodiments, the computer-readable storage medium can be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; it can also be various devices including one of or any combination of the above memories.

In some embodiments, executable instructions may take the form of programs, software, software modules, scripts, or code written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and they can be deployed in any form, including as a stand-alone program or as a module, component, subroutine or other unit suitable for use in a computing environment.

As an example, executable instructions may, but do not necessarily, correspond to files in a file system. They may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (for example, files that store one or more modules, subroutines, or sections of code).

As an example, executable instructions may be deployed to be executed on one computing device, or on multiple computing devices located at one site, or alternatively, on multiple computing devices distributed across multiple sites and interconnected by a communication network.

To sum up, through the embodiments of this application, when multiple participants train multi-agent models with the same purpose, the values of the unpredictable parameters are jointly optimized, so as to obtain a multi-agent model whose simulation results better match the real data. This ensures the security of local data, solves the problem of data islands in the field of multi-agent models, and realizes joint modeling among multiple participants, thereby improving the accuracy of model prediction.

The above descriptions are merely examples of the present application, and are not intended to limit the protection scope of the present application. Any modifications, equivalent replacements and improvements made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims

A training method for a multi-agent model, based on a federated learning system, wherein the system includes a coordinating party device and at least two participant devices, the method is executed by a participant device, and the method includes:

The participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, respectively inputs multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results;

wherein the parameter value group includes at least one parameter value of an unpredictable parameter;

determining an impact factor for each of the parameter value groups based on the plurality of prediction results and actual results corresponding to each of the prediction results;

Based on each of the parameter value groups and corresponding impact factors, the parameter values of each of the unpredictable parameters are aggregated to obtain intermediate parameter values corresponding to each of the unpredictable parameters;

sending the obtained intermediate parameter value to the coordinating party device, where the intermediate parameter value is used to trigger the coordinating party device to aggregate the intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each of the unpredictable parameters;

receiving target parameter values corresponding to the unpredictable parameters returned by the coordinating device, and updating the multi-agent model based on the target parameter values.
The method according to claim 1, wherein said respectively inputting multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results comprises:

obtaining the number of unpredictable parameters, and determining the number of parameter value groups based on the number of unpredictable parameters;

determining parameter values for unpredictable parameters in each parameter value group based on the number of parameter value groups;

Inputting parameter values of unpredictable parameters in each parameter value group into the multi-agent model for prediction, and obtaining multiple prediction results corresponding to the multiple parameter value groups.
The method according to claim 2, wherein said determining parameter values of unpredictable parameters in each parameter value group comprises:

Obtain the parameter type of each unpredictable parameter in the parameter value group;

According to the parameter type corresponding to each unpredictable parameter, determine the corresponding parameter value range;

The parameter value of each unpredictable parameter is determined according to the parameter value range of each unpredictable parameter.
The method according to claim 1, wherein said determining the influence factor of each said parameter value group based on said plurality of predicted results and actual results corresponding to said predicted results comprises:

Based on the prediction results corresponding to each of the parameter value groups and the corresponding actual results, determine the prediction accuracy corresponding to each of the parameter value groups;

The prediction accuracy corresponding to each parameter value group is used as the corresponding impact factor.
The method according to claim 4, wherein said aggregating the parameter values of each of the unpredictable parameters based on each of the parameter value groups and corresponding impact factors to obtain intermediate parameter values corresponding to each of the unpredictable parameters comprises:

Perform the following operations for any of the unpredictable parameters in the parameter value group:

Multiply the prediction accuracy corresponding to each of the parameter value groups by the parameter value of the unpredictable parameter to obtain the product result corresponding to each of the parameter value groups;

Accumulating the product results corresponding to each of the parameter value groups to obtain an accumulation result;

The accumulation result is used as an intermediate parameter value of the unpredictable parameter.
The method according to claim 1, wherein said determining the influence factor of each said parameter value group based on said plurality of predicted results and actual results corresponding to said predicted results comprises:

Determining a loss value corresponding to each parameter value group based on the prediction result corresponding to each parameter value group and the corresponding actual result;

Based on the loss value corresponding to each parameter value group, the impact factor of the corresponding parameter value group is determined.
The method according to claim 1, wherein said aggregating the parameter values of each of the unpredictable parameters based on each of the parameter value groups and corresponding impact factors to obtain intermediate parameter values corresponding to each of the unpredictable parameters comprises:

sorting the plurality of parameter value groups based on the impact factors of each of the parameter value groups to obtain a sorting result;

Selecting a target number of parameter value groups from the plurality of parameter value groups based on the sorting result; wherein the target number is smaller than the number of the plurality of parameter value groups;

Based on the selected target number of parameter value groups, the parameter values of each of the unpredictable parameters are aggregated to obtain intermediate parameter values corresponding to each of the unpredictable parameters.
The method according to claim 7, wherein said aggregating the parameter values of each of the unpredictable parameters based on the selected target number of parameter value groups to obtain intermediate parameter values corresponding to each of the unpredictable parameters comprises:

Perform the following operations for any of the unpredictable parameters in the parameter value group:

Obtaining an average value of parameter values of said unpredictable parameter in said target number of parameter value groups;

The mean value is used as the intermediate parameter value of the unpredictable parameter.
The method according to claim 1, wherein the sending the obtained intermediate parameter value to the coordinating party device includes:

Performing privacy protection on the intermediate parameter values of each of the unpredictable parameters respectively, to obtain the privacy-protected intermediate parameter values;

sending the privacy-protected intermediate parameter value to the coordinating party device, where the privacy-protected intermediate parameter value is used to trigger the coordinating party device to aggregate the privacy-protected intermediate parameter values sent by multiple participant devices to obtain a target parameter value corresponding to each of the unpredictable parameters.
The method according to claim 1, wherein the method further comprises:

obtaining an actual parameter value of the predictable parameter, the actual parameter value being different from the training parameter value of the predictable parameter;

The actual parameter values are input into the updated multi-agent model for prediction, and corresponding prediction results are obtained.
The method according to claim 10, wherein said predictable parameters include the sex, age and occupation of the persons infected with the target disease and the number of infected persons;

said acquiring the actual parameter value of the predictable parameter comprises:

obtaining the sex, age and occupation of the persons infected with the target disease in the target area and the number of infected persons;

said inputting the actual parameter value into the updated multi-agent model for prediction and obtaining corresponding prediction results comprises:

inputting the sex, age and occupation of the persons infected with the target disease in the target area and the number of infected persons into the updated multi-agent model, and predicting the number of deaths caused by the target disease in the target area.
A training device for a multi-agent model, said device comprising:

An acquisition module configured such that the participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, respectively inputs multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results; wherein the parameter value group includes at least one parameter value of an unpredictable parameter;

A comparison module configured to determine an impact factor for each of the parameter value groups based on the plurality of prediction results and actual results corresponding to each of the prediction results;

An aggregation module configured to aggregate the parameter values of each of the unpredictable parameters based on each of the parameter value groups and the corresponding impact factors, to obtain intermediate parameter values corresponding to each of the unpredictable parameters;

A sending module configured to send the obtained intermediate parameter value to a coordinating party device, where the intermediate parameter value is used to trigger the coordinating party device to aggregate the intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each of the unpredictable parameters;

An update module configured to receive the target parameter values corresponding to each of the unpredictable parameters returned by the coordinating party device, and update the multi-agent model based on the target parameter values.
An electronic device comprising:

memory for storing executable instructions;

a processor configured to implement the method according to any one of claims 1 to 11 when executing the executable instructions stored in the memory.
A computer-readable storage medium storing executable instructions for implementing the method according to any one of claims 1 to 11 when executed by a processor.
A computer program product, comprising a computer program, wherein the computer program implements the method according to any one of claims 1 to 11 when executed by a processor.