WO2023024378A1 - Multi-agent model training method, apparatus, electronic device, storage medium and program product - Google Patents

Publication number
WO2023024378A1
WO2023024378A1 PCT/CN2021/142157 CN2021142157W WO2023024378A1 WO 2023024378 A1 WO2023024378 A1 WO 2023024378A1 CN 2021142157 W CN2021142157 W CN 2021142157W WO 2023024378 A1 WO2023024378 A1 WO 2023024378A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter value
parameter
unpredictable
parameters
agent model
Prior art date
Application number
PCT/CN2021/142157
Other languages
French (fr)
Chinese (zh)
Inventor
何元钦
康焱
刘洋
陈天健
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2023024378A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Definitions

  • The present application relates to the technical field of artificial intelligence, and in particular to a multi-agent model training method, apparatus, electronic device, computer-readable storage medium and computer program product.
  • Artificial intelligence is a theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.
  • In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the nature of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • Horizontal federated learning in related technologies usually trains machine learning models by different parties and a collaborative party. Its goal is to use the limited data of all parties to jointly train a global model under the premise of ensuring data security. Because the global model uses the data of each participant for training, the effect of the model can approach the situation where the data of each participant is trained together, which is significantly better than the effect of the model obtained by each participant only based on its own data.
  • However, multi-agent models are used very differently from traditional machine learning models, so the traditional federated training procedure for machine learning models cannot be applied directly to the verification of a multi-agent model.
  • Embodiments of the present application provide a multi-agent model training method, device, electronic device, computer-readable storage medium, and computer program product, which can improve model prediction accuracy while ensuring local data security.
  • An embodiment of the present application provides a multi-agent model training method, based on a federated learning system, the system includes a collaborator device and at least two participant devices, and the method is executed by the participant device, including:
  • the participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, inputs multiple parameter value groups into the multi-agent model separately for prediction, to obtain multiple prediction results;
  • wherein each parameter value group includes at least one parameter value of an unpredictable parameter;
  • the parameter values of each of the unpredictable parameters are aggregated to obtain intermediate parameter values corresponding to each of the unpredictable parameters;
  • the embodiment of the present application also provides a multi-agent model training device, the device comprising:
  • an acquisition module configured to input, by the participant device, the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, input multiple parameter value groups into the multi-agent model separately for prediction, to obtain multiple prediction results; wherein the parameter value group includes at least one parameter value of an unpredictable parameter;
  • a comparison module configured to determine an impact factor for each of the parameter value groups based on the plurality of prediction results and actual results corresponding to each of the prediction results;
  • An aggregation module configured to aggregate the parameter values of each of the unpredictable parameters based on each of the parameter value groups and the corresponding impact factors, to obtain intermediate parameter values corresponding to each of the unpredictable parameters;
  • a sending module configured to send the obtained intermediate parameter values to the collaborator device, where the intermediate parameter values are used to trigger the collaborator device to aggregate the intermediate parameter values sent by multiple participant devices, to obtain the target parameter value corresponding to each of the unpredictable parameters;
  • an update module configured to receive the target parameter values corresponding to each of the unpredictable parameters returned by the collaborator device, and update the multi-agent model based on the target parameter values.
  • An embodiment of the present application provides an electronic device, including:
  • the processor is configured to implement the multi-agent model training method provided in the embodiment of the present application when executing the executable instructions stored in the memory.
  • An embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the multi-agent model training method provided in the embodiment of the present application.
  • An embodiment of the present application provides a computer program product, including a computer program, and when the computer program is executed by a processor, the multi-agent model training method provided in the embodiment of the present application is implemented.
  • With the multi-agent model training method, apparatus, electronic device, computer-readable storage medium and computer program product based on the horizontal federated learning architecture, each participant obtains intermediate parameter values by locally aggregating the unpredictable parameters and sends them to the collaborator; the multi-agent model is then updated based on the target parameter values that the collaborator obtains by a second aggregation of the received intermediate parameter values.
  • Training the multi-agent model in this way ensures the security of local data, solves the problem of data islands in the field of multi-agent models, and realizes joint modeling among multiple parties, thereby improving the accuracy of model prediction.
  • Fig. 1 is a schematic diagram of the implementation scene of the training method of the multi-agent model provided by the embodiment of the present application;
  • Fig. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Fig. 3 is a comparison diagram of the verification process of the multi-agent model provided by the embodiment of the present application and the training process of the machine learning model;
  • Fig. 4 is a schematic flow chart of the training method of the multi-agent model provided by the embodiment of the present application.
  • Fig. 5 is an optional flowchart of the training method of the multi-agent model provided by the embodiment of the present application.
  • Fig. 6A is an optional schematic diagram of aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application
  • Fig. 6B is an optional schematic diagram of aggregation of unpredictable parameters of a multi-agent model provided by the embodiment of the present application;
  • Fig. 7A is an optional flowchart of the multi-agent model training method provided by the embodiment of the present application.
  • FIG. 7B is an optional flowchart of the multi-agent model training method provided by the embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a prediction method for a multi-agent model provided in an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a training method for a multi-agent model provided in an embodiment of the present application.
  • Fig. 10 is a horizontal federated learning method of a multi-agent model provided by the embodiment of the present application.
  • Fig. 11 is an optional schematic diagram of aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application.
  • Fig. 12 is a schematic structural diagram of a training device for a multi-agent model provided in an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a prediction device for a multi-agent model provided by an embodiment of the present application.
  • The terms "first/second/third" are only used to distinguish similar objects and do not represent a specific order of the objects. Understandably, where permitted, the specific order or sequence of "first/second/third" may be interchanged, so that the embodiments of the application described herein can be practiced in sequences other than those illustrated or described herein.
  • Federated learning refers to a method of machine learning that unites different participants (also known as parties, data owners, or clients). In federated learning, participants do not need to expose their own data to other participants or to the coordinator (also known as a parameter server or aggregation server), so federated learning can protect user privacy and ensure data security well.
  • Horizontal federated learning applies when the data features of the participants overlap heavily but their users overlap little: the portion of the data in which the participants share the same features but have different users is taken out for joint machine learning. For example, two banks in different regions draw their users from their respective regions, so the intersection of their user groups is very small. Their businesses, however, are very similar, and most of the recorded user data features are the same. Horizontal federated learning can help the two banks build a joint model to predict their customers' behavior.
  • The simulation method of the multi-agent model (agent-based simulation, ABS) is a computational model used to simulate the actions and interactions of agents (independent individuals or collective groups, such as organizations and teams).
  • The multi-agent model is a microscopic model that reproduces and predicts complex phenomena by simulating the simultaneous actions and interactions of multiple agents. This process is an emergence from a low (micro) level to a high (macro) level.
  • With ABS, urban traffic conditions and disease transmission can be simulated.
  • For example, ABS can be used to simulate the spread of the novel coronavirus, to help predict the development of the epidemic and to analyze the suppression effect of different intervention methods on it.
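  • As a minimal sketch of what such an agent-based simulation might look like, the toy model below lets every infected agent meet a few random agents per day and infect them with a fixed probability. The function name `simulate_outbreak`, its arguments and the random-contact rule are illustrative assumptions, not the model described in this application; `population`, `initial_infected` and `contacts_per_day` play the role of predictable parameters, while `transmission_prob` plays the role of an unpredictable parameter.

```python
import random


def simulate_outbreak(population, initial_infected, contacts_per_day,
                      transmission_prob, days, seed=0):
    """Toy agent-based epidemic; returns the cumulative number of infected agents.

    Hypothetical illustration only: agents never recover and contacts are
    drawn uniformly at random from the whole population.
    """
    rng = random.Random(seed)
    infected = [i < initial_infected for i in range(population)]
    for _ in range(days):
        newly_infected = set()
        for agent in range(population):
            if not infected[agent]:
                continue
            # Each infected agent meets a few random agents per day.
            for _ in range(contacts_per_day):
                other = rng.randrange(population)
                if not infected[other] and rng.random() < transmission_prob:
                    newly_infected.add(other)
        # Infections take effect at the end of the day.
        for agent in newly_infected:
            infected[agent] = True
    return sum(infected)


# Same predictable parameters, two candidate values of the unpredictable one.
low = simulate_outbreak(200, 5, 3, transmission_prob=0.01, days=14)
high = simulate_outbreak(200, 5, 3, transmission_prob=0.20, days=14)
```

  • The macro-level outcome (the number of infected agents) emerges from the micro-level contact rule, which is why the transmission probability has to be calibrated against observed data rather than read off directly.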
  • Homomorphic encryption (HE) is a form of encryption that allows computation to be performed directly on ciphertexts.
  • The purpose of homomorphic encryption is to find an encryption algorithm that supports addition and multiplication on ciphertexts, such that decrypting the result of an operation on the ciphertexts yields the same value as performing the corresponding operation on the plaintexts.
  • Homomorphic encryption ensures that the data processor can process the ciphertext of the data directly without learning the plaintext it processes. This property protects users' data and privacy, so homomorphic encryption is applied in many real-world scenarios to ensure data security.
  • If an encryption function satisfies additive homomorphism and multiplicative homomorphism at the same time, it is called fully homomorphic encryption.
  • Various operations on encrypted data (addition, subtraction, multiplication, division, polynomial evaluation, exponentiation, logarithms, trigonometric functions, etc.) can be completed by using such an encryption function.
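  • To make the additive property concrete, the sketch below uses a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme. Paillier is chosen here only for illustration; this application does not name a specific scheme. The primes are tiny and the helper names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`) are hypothetical, so the code is insecure and meant purely as a demonstration.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy Paillier key generation with tiny primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:  # r must be invertible modulo n
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (modulo n)."""
    return (c1 * c2) % (n * n)


n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 17), encrypt(n, 25))
print(decrypt(n, private_key, c_sum))  # 42
```

  • The party that calls `add_encrypted` only ever sees ciphertexts, which is the property that lets an aggregator combine values without learning them.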
  • A well-built ABS model of a multi-agent model can be applied to different regions: one only needs to adjust its predictable parameters (such as the age distribution of the population, sex ratio, etc.) according to the situation in the target region and then verify the values of the unpredictable parameters, after which the model can be used to predict and analyze the subsequent development of the outbreak in the target region.
  • The larger the area involved in the simulation and the more agents used to build the model, the better the effect of the model and the more accurately it can reflect the real situation of the system.
  • The embodiment of the present application provides a multi-agent model training method, apparatus, electronic device, computer-readable storage medium and computer program product, so that multiple participant devices can jointly train a multi-agent model under the coordination of the collaborator device while ensuring the security of local data, solving the problem of data islands in the field of multi-agent models.
  • The implementation scenario of the multi-agent model training method provided by the embodiment of the present application is described below. See Fig. 1, which is a schematic diagram of the implementation scenario of the multi-agent model training method provided by the embodiment of the present application.
  • The participant devices 200-1, 200-2, ..., 200-n are connected to the collaborator device 400 through the network 300. The participant devices 200-1, 200-2, ..., 200-n may be devices of institutions that store predictable parameters, unpredictable parameters, and real values of the prediction targets, such as hospitals, and the collaborator device 400 may be a device of a trusted institution.
  • The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 assist each other in federated learning so that the participant devices 200-1, 200-2, ..., 200-n can obtain a multi-agent model.
  • the network 300 may be a wide area network or a local area network, or a combination of the two, using wireless or wired links for data transmission.
  • The participant devices are configured to: input the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, input multiple parameter value groups into the multi-agent model separately for prediction, to obtain multiple prediction results, where each parameter value group includes at least one parameter value of an unpredictable parameter; determine the impact factor of each parameter value group based on the multiple prediction results and the actual results corresponding to the prediction results; aggregate the parameter values of each unpredictable parameter based on each parameter value group and the corresponding impact factor, to obtain the intermediate parameter value corresponding to each unpredictable parameter; and send the obtained intermediate parameter values to the collaborator device.
  • The collaborator device (including the collaborator device 400) is configured to aggregate the intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each unpredictable parameter, and to send the target parameter values to the participant devices.
  • The participant devices are further configured to receive the target parameter values corresponding to the unpredictable parameters returned by the collaborator device, and to update the multi-agent model based on the target parameter values.
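  • As a rough illustration of the collaborator-side step, the sketch below combines the intermediate parameter values reported by several participants into one target value per unpredictable parameter. The unweighted mean and the function name `aggregate_at_collaborator` are assumptions made for illustration; the passage above only states that the collaborator aggregates the values, not how.

```python
def aggregate_at_collaborator(intermediate_values):
    """Average the intermediate values element-wise across participants.

    intermediate_values: one tuple per participant, each holding one value
    per unpredictable parameter (same order for every participant).
    Assumption: a plain unweighted mean is used as the aggregation rule.
    """
    num_participants = len(intermediate_values)
    return tuple(sum(column) / num_participants
                 for column in zip(*intermediate_values))


# Two hypothetical participants, each reporting two unpredictable parameters.
reported = [(1.0, 2.0), (3.0, 6.0)]
target_values = aggregate_at_collaborator(reported)
print(target_values)  # (2.0, 4.0)
```

  • Each participant would then overwrite its local unpredictable parameters with `target_values` before the next round.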
  • The trained multi-agent model can be applied to modeling the COVID-19 epidemic that has spread around the world, realizing joint modeling among multiple cities, regions and countries, improving the prediction accuracy of the model, and providing the public and policymakers with more accurate data.
  • The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 may be independent physical servers, or server clusters or distributed systems composed of multiple physical servers. They may also be cloud servers that provide basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery network (CDN), and big data and artificial intelligence platforms.
  • the participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 can also be smart phones, tablet computers, notebook computers, desktop computers, smart speakers, smart watches, etc., but are not limited thereto .
  • the participant devices 200-1, 200-2, . . . , 200-n and the cooperating device 400 may be connected directly or indirectly through wired or wireless communication, which is not limited in this application.
  • FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 200 shown in FIG. 2 includes: at least one processor 210 , a memory 250 , at least one network interface 220 and a user interface 230 .
  • Various components in the electronic device 200 are coupled together through the bus system 240 .
  • the bus system 240 is used to realize connection and communication between these components.
  • the bus system 240 also includes a power bus, a control bus and a status signal bus.
  • the various buses are labeled as bus system 240 in FIG.
  • The processor 210 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.
  • User interface 230 includes one or more output devices 231 that enable presentation of media content, including one or more speakers and/or one or more visual displays.
  • the user interface 230 also includes one or more input devices 232, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
  • Memory 250 may be removable, non-removable or a combination thereof.
  • Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like.
  • Memory 250 optionally includes one or more storage devices located physically remote from processor 210 .
  • Memory 250 includes volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory.
  • the non-volatile memory can be read-only memory (Read Only Memory, ROM), and the volatile memory can be random access memory (Random Access Memory, RAM).
  • memory 250 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
  • Operating system 251, including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer and a driver layer;
  • Network communication module 252, for reaching other computing devices via one or more (wired or wireless) network interfaces 220; exemplary network interfaces 220 include Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB);
  • the input processing module 253 is configured to detect one or more user inputs or interactions from one or more of the input devices 232 and translate the detected inputs or interactions.
  • The training device of the multi-agent model provided by the embodiment of the present application can be realized in software. Fig. 2 shows a training device 254 of the multi-agent model stored in the memory 250, which can be software in the form of a program, a plug-in, etc., and includes the following software modules: acquisition module 2541, comparison module 2542, aggregation module 2543, sending module 2544 and update module 2545. These modules are logical, so they can be combined arbitrarily or further split according to the functions realized. The functions of each module are explained below.
  • the multi-agent model training device provided by the embodiment of the present application can be realized by combining software and hardware.
  • The multi-agent model training device provided by the embodiment of the present application can also be implemented in hardware, for example as a processor in the form of a hardware decoding processor that is programmed to execute the multi-agent model training method provided by the embodiment of the present application.
  • A processor in the form of a hardware decoding processor can use one or more application-specific integrated circuits (ASIC), DSPs, programmable logic devices (PLD), complex programmable logic devices (CPLD), field-programmable gate arrays (FPGA) or other electronic components.
  • The process of obtaining an updated multi-agent model includes building an initial multi-agent model (construction process), verifying the multi-agent model (verification process) and testing the multi-agent model (test process).
  • Building the initial multi-agent model refers to initializing the model parameters, presetting the loss function (used for updating the multi-agent model), etc.; the verification process refers to optimizing the values of the unpredictable parameters; the test process refers to testing the correctness of the multi-agent model based on the output results of the model.
  • By comparison, the process of obtaining a converged machine learning model includes building an initial machine learning model, training the machine learning model through iterative updates, and testing the machine learning model.
  • the verification process of the multi-agent model on real data is similar to the training process in machine learning, that is, to optimize the values of unpredictable parameters so that the results predicted by the model are as close as possible to the real data.
  • Fig. 4 is a schematic flow chart of the training method of the multi-agent model provided by the embodiment of the present application
  • the training method of the multi-agent model provided by the embodiment of the present application includes:
  • Step 101 the participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, inputs multiple parameter value groups into the multi-agent model separately for prediction, to obtain multiple prediction results; wherein the parameter value group includes at least one parameter value of an unpredictable parameter.
  • The values of the predictable parameters are determined according to the local conditions of each participant. For example, they can be the age, occupation, gender and daily travel trajectories of local residents, or the gender, age, occupation, number and movement trajectories of people infected with the target disease. The training parameter values of the predictable parameters depend on the training purpose of the local multi-agent model, and different purposes lead to different predictable parameters; during the training and optimization of a multi-agent model, the values of the predictable parameters are fixed. As an example, if the multi-agent model is used to predict the number of local disease deaths, the predictable parameters can include the total number of local residents, the residents' gender, age, etc.
  • The fixed predictable parameter at this time can be the number of contacts between healthy users and sick users; correspondingly, a new disease transmission probability can be determined by changing the predictable parameter, that is, the number of contacts between healthy users and sick users.
  • the parameter value group includes at least one parameter value of an unpredictable parameter.
  • The value of an unpredictable parameter cannot be deduced from existing data or experience. It has to be obtained by substituting the unpredictable parameter into the model and comparing the resulting predicted value with the corresponding real value: the value of the unpredictable parameter is adjusted until the model result is consistent with the actual prediction target, the optimal value is determined, and the accuracy of the simulation result is then verified on test data. In other words, appropriate values of the unpredictable parameters are selected so that the simulation results of the model conform to the real data (distribution) as closely as possible.
  • Referring to Fig. 5, which is an optional flowchart of the multi-agent model training method provided by the embodiment of the present application: based on Fig. 4, step 101 can also be implemented in the following manner:
  • Step 1011 acquire the number of unpredictable parameters, and determine the number of parameter value groups based on the number of unpredictable parameters.
  • the number of unpredictable parameters that need to be optimized is determined, so that the number of parameter value groups is determined based on the number of unpredictable parameters.
  • the number of parameter value groups may be n+1.
  • Step 1012 based on the number of parameter value groups, determine the parameter values of the unpredictable parameters in each parameter value group.
  • parameter values of unpredictable parameters corresponding to the number of parameter value groups are selected.
  • If the number of parameter value groups is n+1, n+1 parameter values are selected for the unpredictable parameters, one for each parameter value group.
  • For example, when there are four parameter value groups A, B, C and D, the parameter values of the unpredictable parameters are $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$ and $D(a_4, b_4, c_4, d_4)$.
  • selecting the parameter value of the unpredictable parameter includes obtaining the parameter type of each unpredictable parameter in the parameter value group, and then determining the corresponding parameter value range according to the parameter type corresponding to each unpredictable parameter, and then according to each unpredictable parameter
  • the parameter value range of the parameter determines the parameter value of each unpredictable parameter.
  • the unpredictable parameter can be the transmission coefficient of the disease, or it can be the influence of weather, age, gender, etc. on the transmission of the disease.
  • For example, the value range of the unpredictable parameter is determined to be 0 to K, and the parameter value of the unpredictable parameter is then randomly selected from that range.
  • If a is an unpredictable parameter to be optimized with a value range of 0 to K, then $a_1$, $a_2$, $a_3$ and $a_4$ are all parameter values in the interval (0, K).
  • Step 1013 respectively input the parameter values of the unpredictable parameters in each parameter value group to the multi-agent model for prediction, and obtain multiple prediction results corresponding to multiple parameter value groups.
  • For example, $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$ and $D(a_4, b_4, c_4, d_4)$ are input to the multi-agent model separately for prediction, giving the prediction results corresponding to groups A, B, C and D, respectively.
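  • Steps 1011 to 1013 can be sketched as follows, assuming each unpredictable parameter is drawn uniformly from its value range. The helper names `sample_parameter_groups` and `predict_for_groups` and the stand-in `toy_model` are hypothetical; a real participant would call its local multi-agent model instead.

```python
import random


def sample_parameter_groups(value_ranges, num_groups, seed=0):
    """Draw num_groups groups; each holds one value per unpredictable parameter."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(low, high) for low, high in value_ranges)
            for _ in range(num_groups)]


def predict_for_groups(model, training_values, groups):
    """Run the model once per group while the predictable values stay fixed."""
    return [model(training_values, group) for group in groups]


def toy_model(training_values, group):
    """Hypothetical stand-in for a local multi-agent model."""
    (population,) = training_values
    a, b, c, d = group
    return population * (a + b + c + d) / 4


value_ranges = [(0.0, 1.0)] * 4                              # parameters a, b, c, d
groups = sample_parameter_groups(value_ranges, num_groups=4)  # groups A, B, C, D
predictions = predict_for_groups(toy_model, (1000,), groups)
```

  • Fixing the seed keeps the sampled groups reproducible, which makes it easier to compare runs across rounds.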
  • Step 102 based on multiple prediction results and actual results corresponding to each prediction result, determine the impact factor of each parameter value group.
  • the impact factor can be used to characterize the degree of influence of unpredictable parameters in each parameter value group, that is, to characterize the degree of influence of each parameter value.
  • Determining the impact factor of each parameter value group based on multiple prediction results and the actual results corresponding to each prediction result includes: determining the prediction accuracy corresponding to each parameter value group; and using the prediction accuracy corresponding to each parameter value group as the corresponding impact factor.
  • the prediction accuracy may be the weight corresponding to each parameter value group.
  • determining the impact factor of each parameter value group based on multiple prediction results and the actual results corresponding to each prediction result includes: determining the loss value corresponding to each parameter value group based on the prediction result corresponding to that group and the corresponding actual result; and determining the impact factor of the corresponding parameter value group based on the loss value corresponding to each parameter value group.
  • the reciprocal of the loss value can be used as the impact factor of the corresponding parameter value group, in which case the larger the loss value, the smaller its reciprocal and the smaller the impact factor; alternatively, the loss value itself can be used as the impact factor of the corresponding parameter value group, in which case the larger the loss value, the larger the impact factor.
  • the embodiment of the present application does not limit the method of determining the impact factor of the corresponding parameter value group through the loss value.
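As one possible reading of the reciprocal option, the sketch below computes a mean squared error per parameter value group and turns the reciprocals into weights. Normalizing the reciprocals so they sum to 1 and the small `eps` guard against division by zero are assumptions added for the example; the source leaves the exact mapping from loss value to impact factor open.

```python
def mse(predicted, actual):
    """Mean squared error between a predicted series and the actual series."""
    if len(predicted) != len(actual):
        raise ValueError("series must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

def reciprocal_impact_factors(losses, eps=1e-12):
    """Impact factor = 1 / loss, normalized to sum to 1 (normalization is an assumption)."""
    raw = [1.0 / (loss + eps) for loss in losses]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical loss values for three parameter value groups.
losses = [1.0, 2.0, 4.0]
factors = reciprocal_impact_factors(losses)
```

With this choice the group with the smallest loss receives the largest impact factor, which matches the "larger loss, smaller impact factor" behaviour described for the reciprocal.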
  • Step 103 based on each parameter value group and the corresponding impact factor, the parameter values of each unpredictable parameter are aggregated to obtain the intermediate parameter value corresponding to each unpredictable parameter.
  • the weight corresponding to each parameter value group is multiplied by the parameter values of the unpredictable parameters in that group to obtain the product result corresponding to each parameter value group; the product results corresponding to all parameter value groups are then accumulated to obtain the accumulation result; finally, the accumulation result is used as the intermediate parameter value of the unpredictable parameter.
  • if the parameter groups are $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$ and $D(a_4, b_4, c_4, d_4)$, with corresponding weights $x$, $y$, $z$ and $k$, then the intermediate parameter value $P$ of the unpredictable parameters is $P = (a_1 x + a_2 y + a_3 z + a_4 k,\ b_1 x + b_2 y + b_3 z + b_4 k,\ c_1 x + c_2 y + c_3 z + c_4 k,\ d_1 x + d_2 y + d_3 z + d_4 k)$.
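The weighted accumulation over groups A to D can be sketched as below. The numeric values of the groups and weights are made up purely for illustration, and `weighted_aggregate` is a hypothetical helper name.

```python
def weighted_aggregate(groups, weights):
    """Return, per unpredictable parameter, the sum over groups of weight * value."""
    if len(groups) != len(weights):
        raise ValueError("one weight is required per parameter value group")
    dims = len(groups[0])
    return tuple(
        sum(w * g[i] for g, w in zip(groups, weights)) for i in range(dims)
    )

# Hypothetical groups A, B, C, D, each holding values for (a, b, c, d).
groups = [(1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 6), (4, 5, 6, 7)]
weights = [0.4, 0.3, 0.2, 0.1]  # x, y, z, k

P = weighted_aggregate(groups, weights)
```

The first component of `P` is `1*0.4 + 2*0.3 + 3*0.2 + 4*0.1`, which corresponds to the first component of P in the text, the weighted sum of $a_1$ to $a_4$.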
  • multiple parameter value groups are sorted based on the impact factor of each parameter value group to obtain a sorting result; based on the sorting result, a target number of parameter value groups is selected from the multiple parameter value groups, where the target number is less than the number of parameter value groups; the average value of the parameter values of the unpredictable parameter in the target number of parameter value groups is obtained; and the average value serves as the intermediate parameter value of the unpredictable parameter.
  • when the impact factor is the reciprocal of the loss value, the multiple parameter value groups are sorted by loss value in descending or ascending order, and the target number of parameter value groups is then selected from the sorted groups, where the target number is less than the number of parameter value groups.
  • if the parameter groups are $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$ and $D(a_4, b_4, c_4, d_4)$, then, based on the size of the loss value, the optimal model parameter value group A, the worst model parameter value group D and the other model parameter value groups B and C are determined.
  • the process of aggregating the parameter values of the unpredictable parameters in the selected target number of parameter value groups includes obtaining the average value of the parameter values of the unpredictable parameters in the target number of parameter value groups, and then using the average value as the intermediate parameter value of the unpredictable parameter.
  • as an example of using the average value as the intermediate parameter value: to optimize n unpredictable parameters, n parameter groups are selected from n+1 parameter groups, and the parameter values of each unpredictable parameter in the n selected groups are averaged to give the intermediate parameter value of that unpredictable parameter.
  • the average value can also be used to update the multi-agent model, after which the average value and the selected target number of parameter value groups are aggregated again; that is, the target number of parameter value groups is selected again and the parameter values of the unpredictable parameters in the newly selected groups are averaged.
  • the above process of updating the multi-agent model and aggregating again is iterated, and the average value obtained by the last aggregation is used as the intermediate parameter value of the unpredictable parameter. In this way, each participant iteratively optimizes its unpredictable parameters locally for a preset number of rounds and obtains its final average value, which is the intermediate parameter value.
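A single selection-and-averaging pass, as described above, might look like the following sketch. Ranking by ascending loss (equivalently, descending reciprocal impact factor) is assumed, and `select_and_average` is a hypothetical name.

```python
def select_and_average(groups, losses, target_count):
    """Keep the target_count groups with the smallest loss and average them per parameter."""
    if not 0 < target_count < len(groups):
        raise ValueError("target_count must be positive and less than the number of groups")
    ranked = sorted(zip(losses, groups), key=lambda pair: pair[0])
    kept = [group for _, group in ranked[:target_count]]
    dims = len(kept[0])
    return tuple(sum(g[i] for g in kept) / target_count for i in range(dims))

# Hypothetical case: n = 2 unpredictable parameters, n + 1 = 3 groups.
groups = [(1.0, 1.0), (3.0, 3.0), (10.0, 10.0)]
losses = [0.1, 0.2, 5.0]

intermediate = select_and_average(groups, losses, target_count=2)
```

Here the group with loss 5.0 is dropped and the remaining two groups are averaged, so `intermediate` is the per-parameter mean of the two better groups.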
  • FIG. 6B is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by the embodiment of the present application.
  • the multiple parameter value groups can also be sorted based on the weight of each parameter value group, and the target number of parameter value groups can be selected from the multiple parameter value groups based on the sorting result.
  • the multiple parameter value groups can also be sorted based on the loss value, and, based on the sorting result, the target number of parameter value groups is selected from the multiple parameter value groups, where the target number is less than the number of parameter value groups; the weights corresponding to the selected parameter value groups are then multiplied by the parameter values of the unpredictable parameters to obtain the product result corresponding to each parameter value group, the product results are accumulated to obtain the accumulation result, and the accumulation result is used as the intermediate parameter value of the unpredictable parameter. The embodiment of the present application does not limit the way in which the parameter values of the unpredictable parameters are aggregated based on each parameter value group and the corresponding impact factor.
  • Step 104: send the obtained intermediate parameter value to the cooperating device, wherein the intermediate parameter value is used to trigger the cooperating device to aggregate the intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each unpredictable parameter.
  • after the intermediate parameter values are obtained, privacy protection is performed on the intermediate parameter value of each unpredictable parameter to obtain the privacy-protected intermediate parameter values.
  • the privacy protection method can be fuzzy processing of the intermediate parameter values, for example adding noise or applying differential privacy. What the coordinating device obtains is the parameter values produced by at least two participant devices after performing privacy processing on their intermediate parameter values; when these parameter values are aggregated, the noise in them cancels out, so the aggregation result of the intermediate parameter values is not affected.
  • the processing method of privacy protection can also be to perform homomorphic encryption on intermediate parameter values.
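One way noise can cancel during aggregation is for the participants to use masks that sum to zero across all participants. The sketch below illustrates that idea only; it is an assumption about how the cancellation could be arranged, not the scheme of the application. In particular, a single local random generator stands in for the pairwise shared secrets a real deployment would need, and `zero_sum_masks` is a hypothetical helper.

```python
import random

def zero_sum_masks(num_participants, dims, scale=1.0, seed=None):
    """Build one mask per participant such that the masks sum to zero in every dimension."""
    rng = random.Random(seed)
    masks = [[0.0] * dims for _ in range(num_participants)]
    for i in range(num_participants):
        for j in range(i + 1, num_participants):
            for d in range(dims):
                r = rng.uniform(-scale, scale)
                masks[i][d] += r  # participant i adds the pairwise noise
                masks[j][d] -= r  # participant j subtracts the same noise
    return masks

def mean_vector(vectors):
    """Per-dimension mean of a list of equally sized vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

# Hypothetical intermediate parameter values of three participants.
intermediate = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
masks = zero_sum_masks(len(intermediate), dims=2, seed=42)
masked = [[v + m for v, m in zip(vec, mask)] for vec, mask in zip(intermediate, masks)]

plain_mean = mean_vector(intermediate)
masked_mean = mean_vector(masked)  # matches plain_mean up to floating-point rounding
```

The coordinating device only sees `masked`, yet its average agrees with the average of the unmasked values because the masks cancel.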
  • there are many ways for the coordinating party to aggregate the intermediate parameter values sent by multiple participant devices.
  • for example, the center points uploaded by the participants are averaged; or, in addition to the geometric center point, each participant also uploads the loss value of its optimal model parameter value group, of its worst model parameter value group, or of the groups other than the worst model parameter value group, and the participants are sorted according to the loss value so that multiple better center points are selected and averaged to obtain a new center point.
  • the embodiment of the present application does not limit the process of the parameter aggregation operation performed by the coordinating party.
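Two of the coordinator-side options mentioned above, plain averaging of the uploaded center points and averaging only the better ones ranked by an uploaded loss value, can be sketched together. The function name `coordinator_aggregate` and the `keep` parameter are hypothetical.

```python
def coordinator_aggregate(center_points, losses=None, keep=None):
    """Average uploaded center points; optionally keep only the `keep` lowest-loss ones first."""
    points = list(center_points)
    if losses is not None and keep is not None:
        ranked = sorted(zip(losses, points), key=lambda pair: pair[0])
        points = [point for _, point in ranked[:keep]]
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

# Hypothetical center points uploaded by three participants, with their loss values.
uploaded = [(0.0, 0.0), (2.0, 2.0), (10.0, 10.0)]
uploaded_losses = [0.1, 0.3, 9.0]

target_all = coordinator_aggregate(uploaded)
target_best = coordinator_aggregate(uploaded, uploaded_losses, keep=2)
```

`target_all` uses every participant, while `target_best` ignores the participant whose uploaded loss is largest.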
  • Step 105 receiving target parameter values corresponding to unpredictable parameters returned by the coordinating device, and updating the multi-agent model based on the target parameter values.
  • FIG. 7A is an optional flow chart of the multi-agent model training method provided by the embodiment of the present application.
  • the entire model training process is divided into two stages. The first stage is local multi-agent model training: once the model reaches the convergence condition, the intermediate parameter values at the time of convergence are uploaded to the collaborating party device (parameter aggregation device), where the intermediate parameter values are used to trigger the collaborating party device to perform the second-stage parameter aggregation operation. To suit preliminary modeling or rapid modeling scenarios, the second-stage parameter aggregation can be performed only once, after which the entire model is treated as converged.
  • FIG. 7B is an optional flow chart of the multi-agent model training method provided by the embodiment of the present application.
  • the participants can also alternate local multi-agent model training with parameter aggregation: each participant uploads its intermediate parameter values to the partner device, where the intermediate parameter values are used to trigger the partner device to perform the second-stage parameter aggregation operation once; the aggregated target parameter values are then returned to each participant device so that it can update its local model; each participant device then continues to simulate with the updated local multi-agent model, uploads new intermediate parameter values to the partner device, and repeats the above process until the local multi-agent model converges.
  • the participant device updates the local multi-agent model based on the target parameter value, and then inputs the target parameter value and the target number of parameter value groups selected before the model update into the updated local multi-agent model. The target parameter value and those parameter value groups are aggregated; that is, the target number of parameter value groups is selected again, the average value of the parameter values of the unpredictable parameters in the newly selected groups is calculated and sent to the coordinating device as the intermediate parameter value, and the above process then continues.
  • after the training of the multi-agent model is completed, other uses of the multi-agent model can be realized by changing the actual parameter values of the predictable parameters, where the actual parameter values are different from the training parameter values of the predictable parameters; as an example, the predictable parameters include the sex, age, occupation, and number of infected persons of the target disease, and the actual parameter values may be the sex, age, occupation, and number of infected persons of the target disease in the target area. The actual parameter values are then input into the updated multi-agent model for prediction, so that the number of deaths caused by the target disease in the target area can be obtained.
  • the multi-agent model is used to predict disease-related data, which improves the accuracy of the model prediction and allows the disease situation to be tracked in time, so that medical resources can be dispatched quickly and disease prevention and control carried out promptly.
  • the multi-agent model is updated.
  • the values of the unpredictable parameters are jointly optimized, so as to obtain a multi-agent model whose simulation results conform better to real data while ensuring the security of local data; this solves the problem of data islands in the field of multi-agent models and realizes joint modeling among multiple participants, which improves the prediction accuracy of the model.
  • Fig. 8 is a schematic flowchart of the prediction method based on the multi-agent model provided by the embodiment of the present application. The prediction method based on the multi-agent model provided by the embodiment of the present application includes:
  • Step 201: the participant device acquires an actual parameter value of a predictable parameter, wherein the actual parameter value is different from a training parameter value of the predictable parameter.
  • obtaining the actual parameter values of the predictable parameters includes obtaining the total number of residents in the target area, the sex, age, and occupation of the residents, and the sex, age, occupation of the target disease infected person, and the activity track of the infected person.
  • the target area can be a certain city or a certain country; the target disease can be a new type of disease with strong transmission; the target disease infected person can be at least one infected person who flows into the target area from outside the target area, or a free-moving local spreader in the target area who is not subject to disease control.
  • Step 202 input actual parameter values into the updated multi-agent model for prediction, and obtain corresponding prediction results.
  • the acquired total number of residents in the target area, the sex, age, and occupation of the residents, and the sex, age, occupation, and activity trajectory of the target disease infected person are input into the updated multi-agent model, which can predict the impact of the target disease infected person on the residents in the target area; that is, the number of new infections in the target area caused by the target disease infected person can be obtained.
  • the updated multi-agent model can accurately predict the impact of the target disease infected person on the target area, that is, the number of infections, so that medical resources can be fully prepared, disease-infected persons can be treated in time, and a rise in mortality due to insufficient medical resources can be avoided.
  • the updated multi-agent model can also be used to predict urban traffic conditions, that is, to predict the number of congested vehicles on the target road segment in the target area within a target time period in the future. Specifically, the actual parameter values of the predictable parameters are obtained, namely the population travel trajectories, office area distribution, holiday times, etc. of the target area; here, the target area can be different central areas of the city.
  • the acquired population travel trajectories, office area distribution, holiday times, etc. are input into the updated multi-agent model, which can predict the number of congested vehicles on the target road segment in the target area within the target time period.
  • the updated multi-agent model can accurately predict the congestion of the target road segment in the target area within the target time period, so that traffic control can be carried out in time.
  • FIG. 9 is a schematic flowchart of a training method for a multi-agent model provided in an embodiment of the present application, including:
  • Step 301 each participant device initializes a local multi-agent model.
  • each participant as the data holder, has relatively little user overlap and relatively large user feature overlap in the data set owned by each participant, and each participant has the label of the corresponding user; for example, each participant It can be hospitals in different regions, and the users they reach are residents in different regions (that is, different samples), but the business is the same (that is, the characteristics are the same); correspondingly, the collaborating party device can be a credible institution.
  • Fig. 10 is a horizontal federated learning method of a multi-agent model provided by the embodiment of the present application.
  • each participant device has the same multi-agent model, with its own private predictable parameters $X_{1,E}, \dots, X_{N,E}$, its own unpredictable parameters $X_{1,V}, \dots, X_{N,V}$, and the target variables $Y_{1,gt}, \dots, Y_{N,gt}$ simulated by each party's local multi-agent model.
  • the local multi-agent model is initialized by determining the value of the predictable parameter $X_E$, the structure of the multi-agent model, and the prediction target $Y_{gt}$, and by selecting the unpredictable parameter $X_V$.
  • Step 302 input the parameter values of the predictable parameters into the local multi-agent model.
  • the private predictable parameters $X_{1,E}, \dots, X_{N,E}$ are input to the local ABS model.
  • Step 303 in the case of fixing the parameter value of the predictable parameter, input multiple parameter value groups into the multi-agent model for prediction respectively, and obtain multiple prediction results.
  • each participant initializes three sets of values (each set can be regarded as a point), and each set includes one value for each of the two parameters. The three sets of parameters are input into the model for simulation, and the model prediction results corresponding to the three sets are obtained.
  • Step 304 respectively comparing multiple predicted results with corresponding actual results.
  • if the purpose of the multi-agent model is to predict the number of local deaths, then the actual number of local deaths within a certain period of time is the actual result, and comparing the multiple predicted results with the corresponding actual results means comparing the death tolls predicted for $[a_1, b_1]$, $[a_2, b_2]$ and $[a_3, b_3]$ with the actual local death toll.
  • Step 305 based on the comparison result, determine the loss value corresponding to each parameter value group.
  • the mean square error (MSE) is usually used as the loss function to calculate the loss value corresponding to each parameter value group.
  • Step 306 sort the multiple loss values to obtain the optimal model parameter value group, the worst model parameter value group and other model parameter value groups.
  • Step 307 aggregate parameter values of unpredictable parameters of all model parameter value groups except the worst model parameter value group to obtain intermediate parameter values corresponding to each unpredictable parameter.
  • the aggregation of parameter values of unpredictable parameters can be to obtain the geometric center point of the optimal model parameter value group and other model parameter value groups.
  • the model parameters are updated based on the model parameter value group $[(a_1+a_3)/2, (b_1+b_3)/2]$ corresponding to the geometric center point C; then $[a_1, b_1]$, $[a_3, b_3]$ and $[(a_1+a_3)/2, (b_1+b_3)/2]$ are input into the updated model for simulation to obtain the prediction results corresponding to these three model parameter value groups, and steps 304 to 307 are repeated. In this way, each participant iteratively optimizes its own unpredictable parameters locally for $N_L$ rounds and obtains its final geometric center point $C_{i,V}^{t+1}$, which is the intermediate parameter value.
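The local loop of steps 303 to 307 (simulate each group, score it, drop the worst group, take the geometric center of the rest, and repeat) can be sketched as follows. The toy `simulate` function, the squared-error loss on a single number, and all numeric values are assumptions standing in for a real multi-agent simulation.

```python
def local_optimize(simulate, actual, groups, rounds):
    """Run `rounds` local iterations and return the final geometric center point."""
    groups = [tuple(g) for g in groups]
    center = None
    for _ in range(rounds):
        # Steps 303-305: predict with each group and compute its loss.
        losses = [(simulate(g) - actual) ** 2 for g in groups]
        # Step 306: identify the worst model parameter value group.
        worst = max(range(len(groups)), key=losses.__getitem__)
        kept = [g for i, g in enumerate(groups) if i != worst]
        # Step 307: geometric center of all groups except the worst one.
        dims = len(kept[0])
        center = tuple(sum(g[d] for g in kept) / len(kept) for d in range(dims))
        # Continue with the kept groups plus the new center point.
        groups = kept + [center]
    return center

def toy_simulate(group):
    """Hypothetical stand-in for the multi-agent simulation: a linear response."""
    a, b = group
    return 10 * a + 5 * b

initial_groups = [(1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
center_after_1 = local_optimize(toy_simulate, actual=25.0, groups=initial_groups, rounds=1)
center_after_2 = local_optimize(toy_simulate, actual=25.0, groups=initial_groups, rounds=2)
```

In the first round the group (5.0, 5.0) has the largest loss and is dropped, so the center is the midpoint of the other two groups; the second round repeats the procedure on the two kept groups plus that midpoint.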
  • Step 308 sending the intermediate parameter value to the partner device.
  • the n participant devices send their respective final geometric center points $C_{i,V}^{t+1}$ to the coordinating device.
  • Step 309: the coordinating device aggregates the received intermediate parameter values to obtain the target parameter value corresponding to each unpredictable parameter.
  • Step 310 sending the target parameter value to each participant device.
  • the coordinating party device sends the target parameter value $C_{Server,V}^{t+1}$ corresponding to each unpredictable parameter, obtained through aggregation, to the n participant devices.
  • Step 311 update the multi-agent model based on the target parameter value.
  • after obtaining the target parameter value, that is, the optimized unpredictable parameter, the participant device optimizes the local multi-agent model according to that unpredictable parameter.
  • FIG. 12 is a schematic structural diagram of the multi-agent model training device 254 provided by the embodiment of the present application.
  • the training device 254 of the multi-agent model comprises:
  • the acquisition module 2541 is configured such that the participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values fixed, respectively inputs multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results; wherein each parameter value group includes at least one parameter value of an unpredictable parameter;
  • the comparison module 2542 is configured to determine the impact factor of each of the parameter value groups based on the plurality of prediction results and the actual results corresponding to each of the prediction results;
  • the aggregation module 2543 is configured to aggregate the parameter values of each of the unpredictable parameters based on each of the parameter value groups and the corresponding impact factors, to obtain intermediate parameter values corresponding to each of the unpredictable parameters;
  • the sending module 2544 is configured to send the obtained intermediate parameter value to the cooperating device, where the intermediate parameter value is used to trigger the cooperating device to aggregate the intermediate parameter values sent by multiple participant devices to obtain target parameter values corresponding to each of the unpredictable parameters;
  • the updating module 2545 is configured to receive target parameter values corresponding to the unpredictable parameters returned by the cooperating device, and update the multi-agent model based on the target parameter values.
  • the acquisition module 2541 is further configured to acquire the number of unpredictable parameters and determine the number of parameter value groups based on the number of unpredictable parameters; determine, based on the number of parameter value groups, the parameter values of the unpredictable parameters in each parameter value group; and respectively input the parameter values of the unpredictable parameters in the parameter value groups into the multi-agent model for prediction to obtain multiple prediction results corresponding to the multiple parameter value groups.
  • the acquisition module 2541 is further configured to obtain the parameter type of each unpredictable parameter in the parameter value group; determine the corresponding parameter value range according to the parameter type corresponding to each unpredictable parameter; and determine the parameter value of each unpredictable parameter according to its parameter value range.
  • the comparison module 2542 is further configured to determine the prediction accuracy corresponding to each parameter value group based on the prediction result corresponding to each parameter value group and the corresponding actual result; and use the prediction accuracy corresponding to each of the parameter value groups as the corresponding impact factor.
  • the aggregation module 2543 is further configured to multiply the prediction accuracy corresponding to each of the parameter value groups by the parameter value of the unpredictable parameter to obtain a product result corresponding to each of the parameter value groups; accumulate the product results corresponding to each of the parameter value groups to obtain an accumulation result; and use the accumulation result as the intermediate parameter value of the unpredictable parameter.
  • the comparison module 2542 is further configured to determine the loss value corresponding to each parameter value group based on the prediction result corresponding to each parameter value group and the corresponding actual result; and determine the impact factor of the corresponding parameter value group based on the loss value corresponding to each parameter value group.
  • the aggregation module 2543 is further configured to sort the plurality of parameter value groups based on the impact factor of each parameter value group to obtain a sorting result; select, based on the sorting result, a target number of parameter value groups from the plurality of parameter value groups, wherein the target number is smaller than the number of the plurality of parameter value groups; and aggregate, based on the selected target number of parameter value groups, the parameter values of each of the unpredictable parameters to obtain intermediate parameter values corresponding to each of the unpredictable parameters.
  • the aggregation module 2543 is further configured to obtain the average value of the parameter values of the unpredictable parameters in the target number of parameter value groups; and use the average value as the intermediate parameter value of the unpredictable parameters.
  • the sending module 2544 is further configured to perform privacy protection on the intermediate parameter values of the unpredictable parameters respectively to obtain the privacy-protected intermediate parameter values; and send the privacy-protected intermediate parameter values to the collaborative party device, wherein the intermediate parameter values are used to trigger the collaborative party device to aggregate the privacy-protected intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each of the unpredictable parameters.
  • the device further includes a second acquisition module 1210 and a prediction module 1220; the second acquisition module 1210 is configured to acquire an actual parameter value of the predictable parameter, where the actual parameter value is different from the training parameter value of the predictable parameter; the prediction module 1220 is configured to input the actual parameter value into the updated multi-agent model for prediction and obtain the corresponding prediction result.
  • as shown in FIG. 13, the prediction device 1200 based on the multi-agent model includes:
  • the second acquiring module 1210 is configured to acquire an actual parameter value of the predictable parameter, where the actual parameter value is different from the training parameter value of the predictable parameter;
  • the prediction module 1220 is configured to input the actual parameter values into the updated multi-agent model for prediction, and obtain corresponding prediction results.
  • the embodiment of the present application also provides an electronic device, and the electronic device includes:
  • the processor is configured to implement the multi-agent model training method provided in the embodiment of the present application when executing the executable instructions stored in the memory.
  • the embodiment of the present application also provides a computer program product, including a computer program, and when the computer program is executed by a processor, the multi-agent model training method provided in the embodiment of the present application is implemented.
  • the embodiment of the present application also provides a computer-readable storage medium storing executable instructions; when the executable instructions are executed by a processor, the processor is caused to execute the multi-agent model training method provided by the embodiment of the present application.
  • the computer-readable storage medium can be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; it can also be various devices that include one or any combination of the above memories.
  • executable instructions may take the form of programs, software, software modules, scripts, or code written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and they can be deployed in any form, including as a stand-alone program or as a module, component, subroutine or other unit suitable for use in a computing environment.
  • executable instructions may, but do not necessarily, correspond to files in a file system; they may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (for example, files that store one or more modules, subroutines, or sections of code).
  • executable instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
  • when multiple participants train multi-agent models with the same purpose, the values of the unpredictable parameters are jointly optimized, so as to obtain a multi-agent model whose simulation results conform better to real data while ensuring the security of local data; this solves the problem of data islands in the field of multi-agent models and realizes joint modeling among multiple participants, thereby improving the accuracy of model prediction.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Provided are a multi-agent model training method, an apparatus, an electronic device, a storage medium, and a program product, comprising: a participant device inputting a training parameter value of a predictable parameter into a local multi-agent model, and when the training parameter value is fixed, inputting each of a plurality of parameter value groups into the multi-agent model for prediction, so as to obtain a plurality of prediction results; comparing each of the prediction results against corresponding actual results, so as to determine an influence factor for each parameter value group; then, aggregating the parameter values of unpredictable parameters to obtain an intermediate parameter value corresponding to each unpredictable parameter, and sending the intermediate parameter value to a collaborator device, wherein an intermediate parameter value is used to trigger the collaborator device to aggregate received intermediate parameter values, so as to obtain a target parameter value corresponding to each unpredictable parameter; receiving a target parameter value corresponding to each unpredictable parameter and returned by the collaborator device, and updating the multi-agent model on the basis of the target parameter values.

Description

Multi-agent model training method, apparatus, electronic device, storage medium, and program product
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202110981895.1, filed on August 25, 2021, the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the technical field of artificial intelligence, and in particular to a multi-agent model training method, apparatus, electronic device, computer-readable storage medium, and computer program product.
Background
Artificial intelligence (AI) comprises the theories, methods, technologies, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines can perceive, reason, and make decisions.
In the related art, horizontal federated learning usually has several participants and one collaborator train a machine learning model. Its goal is to use each party's limited data to jointly train a global model while keeping the data secure. Because the global model is trained on the data of all participants, its performance can approach that of a model trained on the pooled data and is significantly better than that of a model each participant obtains from its own data alone. However, multi-agent models are used very differently from traditional machine learning models, so federated learning cannot be applied to the validation of a multi-party multi-agent model by following the training procedure of traditional federated machine learning.
Summary
The embodiments of the present application provide a multi-agent model training method, apparatus, electronic device, computer-readable storage medium, and computer program product that can improve model prediction accuracy while keeping local data secure.
An embodiment of the present application provides a multi-agent model training method based on a federated learning system. The system includes a collaborator device and at least two participant devices. The method is executed by a participant device and includes:
inputting, by the participant device, a training parameter value of a predictable parameter into a local multi-agent model and, with the training parameter value fixed, inputting each of a plurality of parameter value groups into the multi-agent model for prediction to obtain a plurality of prediction results;
wherein each parameter value group includes a parameter value of at least one unpredictable parameter;
determining an influence factor for each parameter value group based on the plurality of prediction results and the actual result corresponding to each prediction result;
aggregating the parameter values of each unpredictable parameter based on the parameter value groups and their corresponding influence factors to obtain an intermediate parameter value for each unpredictable parameter;
sending the obtained intermediate parameter values to the collaborator device, wherein the intermediate parameter values are used to trigger the collaborator device to aggregate the intermediate parameter values sent by a plurality of participant devices to obtain a target parameter value for each unpredictable parameter; and
receiving the target parameter value for each unpredictable parameter returned by the collaborator device, and updating the multi-agent model based on the target parameter values.
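The participant-side steps above can be sketched in a few lines of Python. This is a minimal illustration, not the claimed implementation: this passage does not say how the influence factor is computed, so the sketch assumes it is the inverse mean squared error between a prediction and the actual result, normalised so that the factors sum to 1. The names `influence_factors`, `local_aggregate`, and `ParamGroup` are hypothetical.

```python
from typing import Dict, List, Sequence

# One candidate value per unpredictable parameter (hypothetical representation).
ParamGroup = Dict[str, float]


def influence_factors(predictions: Sequence[Sequence[float]],
                      actual: Sequence[float]) -> List[float]:
    """Assumed rule: inverse mean squared error, normalised to sum to 1."""
    eps = 1e-12  # avoids division by zero for a perfect prediction
    raw = []
    for pred in predictions:
        mse = sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)
        raw.append(1.0 / (mse + eps))
    total = sum(raw)
    return [w / total for w in raw]


def local_aggregate(groups: Sequence[ParamGroup],
                    factors: Sequence[float]) -> ParamGroup:
    """Weighted sum of each unpredictable parameter over all value groups."""
    names = groups[0].keys()
    return {name: sum(f * g[name] for g, f in zip(groups, factors))
            for name in names}
```

For example, two parameter value groups whose predictions are equally far from the actual result receive equal influence factors, so the intermediate value of each parameter is the plain average of the two candidates.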
An embodiment of the present application further provides a multi-agent model training apparatus, the apparatus including:
an acquisition module, configured to cause the participant device to input a training parameter value of a predictable parameter into a local multi-agent model and, with the training parameter value fixed, to input each of a plurality of parameter value groups into the multi-agent model for prediction to obtain a plurality of prediction results, wherein each parameter value group includes a parameter value of at least one unpredictable parameter;
a comparison module, configured to determine an influence factor for each parameter value group based on the plurality of prediction results and the actual result corresponding to each prediction result;
an aggregation module, configured to aggregate the parameter values of each unpredictable parameter based on the parameter value groups and their corresponding influence factors to obtain an intermediate parameter value for each unpredictable parameter;
a sending module, configured to send the obtained intermediate parameter values to the collaborator device, wherein the intermediate parameter values are used to trigger the collaborator device to aggregate the intermediate parameter values sent by a plurality of participant devices to obtain a target parameter value for each unpredictable parameter; and
an update module, configured to receive the target parameter value for each unpredictable parameter returned by the collaborator device and to update the multi-agent model based on the target parameter values.
An embodiment of the present application provides an electronic device, including:
a memory, configured to store executable instructions; and
a processor, configured to implement the multi-agent model training method provided by the embodiments of the present application when executing the executable instructions stored in the memory.
An embodiment of the present application provides a computer-readable storage medium storing executable instructions that, when executed by a processor, implement the multi-agent model training method provided by the embodiments of the present application.
An embodiment of the present application provides a computer program product including a computer program that, when executed by a processor, implements the multi-agent model training method provided by the embodiments of the present application.
The embodiments of the present application have the following beneficial effects:
In the related art, a multi-agent model can only be trained by the data owner alone. In contrast, with the training method, apparatus, electronic device, computer-readable storage medium, and computer program product provided by the embodiments of the present application, which are based on a horizontal federated learning architecture, each participant aggregates the unpredictable parameters locally to obtain intermediate parameter values and sends them to the collaborator; the collaborator aggregates the received intermediate parameter values a second time to obtain target parameter values, which are then used to update the multi-agent model. In this way, when multiple participants train multi-agent models that serve the same purpose, the values of the unpredictable parameters are jointly optimized, yielding a multi-agent model whose simulation results better match the real data. Local data remains secure, the data-island problem in the field of multi-agent models is solved, and multiple participants can model jointly, which improves model prediction accuracy.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an implementation scenario of the multi-agent model training method provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 3 is a diagram comparing the validation process of a multi-agent model with the training process of a machine learning model, provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of the multi-agent model training method provided by an embodiment of the present application;
FIG. 5 is an optional schematic flowchart of the multi-agent model training method provided by an embodiment of the present application;
FIG. 6A is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application;
FIG. 6B is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application;
FIG. 7A is an optional schematic flowchart of the multi-agent model training method provided by an embodiment of the present application;
FIG. 7B is an optional schematic flowchart of the multi-agent model training method provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of the multi-agent model prediction method provided by an embodiment of the present application;
FIG. 9 is a schematic flowchart of the multi-agent model training method provided by an embodiment of the present application;
FIG. 10 shows a horizontal federated learning method for a multi-agent model provided by an embodiment of the present application;
FIG. 11 is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of the multi-agent model training apparatus provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of the multi-agent model prediction apparatus provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limiting the application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the application.
In the following description, "some embodiments" describes a subset of all possible embodiments. It is to be understood that "some embodiments" may refer to the same subset or to different subsets of all possible embodiments, and that these may be combined with one another where no conflict arises.
In the following description, the terms "first/second/third" merely distinguish similar objects and do not imply a particular ordering of those objects. It is to be understood that, where permitted, "first/second/third" may be interchanged in a specific order or sequence, so that the embodiments of the application described herein can be practiced in an order other than the one illustrated or described herein.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the technical field to which this application belongs. The terms used herein serve only to describe the embodiments of the present application and are not intended to limit the application.
Before the embodiments of the present application are described in further detail, the nouns and terms involved in the embodiments are explained. The following explanations apply to these nouns and terms.
1) Federated learning is a method of performing machine learning by uniting different participants (also called parties, data owners, or clients). In federated learning, a participant does not need to expose its own data to other participants or to the coordinator (also called a parameter server or aggregation server), so federated learning can protect user privacy and keep data secure.
Horizontal federated learning applies when the participants' data features overlap substantially but their users overlap little. The portion of data whose features are the same across participants, but whose users are not entirely the same, is taken out for joint machine learning. For example, consider two banks in different regions. Their user groups come from their respective regions and have very little intersection, but their businesses are similar, so most of the user data features they record are the same. Horizontal federated learning can help the two banks build a joint model to predict their customers' behavior.
2) Agent-based simulation or agent-based modeling (ABS or ABM) is a computational model for simulating the actions and interactions of agents (independent individuals or collective groups such as organizations and teams). A multi-agent model is a microscopic model that reproduces and predicts complex phenomena by simulating the simultaneous actions and interactions of multiple agents. The process is one of emergence from a low (micro) level to a high (macro) level. ABS can simulate phenomena such as urban traffic and the spread of disease. For example, ABS can simulate the spread of the novel coronavirus to help predict how the epidemic will develop and to analyze how well different interventions suppress it. Such a scenario usually involves three parts: (i) a population model close to the real distribution; (ii) a model of the social network within the population; and (iii) a disease transmission model. Based on these three models and their corresponding parameters, the development trend of the epidemic can be simulated for a given initial number of infections.
In addition to the parameters obtained from data and the empirical parameters in the model (called predictable parameters), there are parameters whose values cannot be determined directly (called unpredictable parameters). Their values have to be obtained by validation on real data. The validation step on real data is similar to the training step in machine learning: it optimizes the values of the unpredictable parameters so that the results simulated by the model are as close as possible to the real data. A common way to determine these parameters is an optimization-based method such as Nelder-Mead optimization.
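To make the idea of validation concrete, the following toy sketch calibrates one unpredictable parameter of a very small agent-based epidemic model. Everything in it is an assumption made for illustration: the model (random contacts, a fixed recovery time), the parameter name `beta`, and the use of a plain grid search in place of the Nelder-Mead method mentioned above.

```python
import random
from typing import List, Sequence


def simulate(beta: float, n_agents: int = 200, contacts: int = 4,
             recovery_days: int = 5, days: int = 20, seed: int = 0) -> List[int]:
    """Toy agent-based SIR model; returns the infected count after each day.

    State per agent: 0 = susceptible, >0 = infected (days left), -1 = recovered.
    """
    rng = random.Random(seed)
    state = [0] * n_agents
    state[0] = recovery_days  # one initial infection
    curve = []
    for _ in range(days):
        infected = [i for i, s in enumerate(state) if s > 0]
        newly = set()
        for _agent in infected:
            for _contact in range(contacts):
                j = rng.randrange(n_agents)  # random contact
                if state[j] == 0 and rng.random() < beta:
                    newly.add(j)
        for i in infected:
            state[i] -= 1
            if state[i] == 0:
                state[i] = -1  # recovered
        for j in newly:
            state[j] = recovery_days
        curve.append(sum(1 for s in state if s > 0))
    return curve


def calibrate(observed: Sequence[int], candidates: Sequence[float]) -> float:
    """Pick the candidate beta whose simulated curve is closest to the data."""
    def error(beta: float) -> int:
        sim = simulate(beta, days=len(observed))
        return sum((s - o) ** 2 for s, o in zip(sim, observed))
    return min(candidates, key=error)
```

Here the predictable parameters (`n_agents`, `contacts`, `recovery_days`) stay fixed while candidate values of the unpredictable parameter are tried one by one, which mirrors the validation step described above. A real calibration would use an optimizer and average over several random seeds.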
3) Homomorphic encryption (HE) is a symmetric encryption algorithm. Its purpose is to provide an encryption scheme under which addition and multiplication can be performed on ciphertexts, such that the result of operating on the ciphertexts equals exactly the ciphertext that would be obtained by performing the intended operation on the plaintexts and then encrypting the result. Homomorphic encryption ensures that a data processor can process the ciphertext of the data directly without learning the plaintext it is processing. This property protects users' data and privacy, so homomorphic encryption is used in many real-world scenarios to keep data secure.
An encryption function that is both additively and multiplicatively homomorphic is called fully homomorphic. With such a function, various operations can be carried out on encrypted data (addition, subtraction, multiplication, division, polynomial evaluation, exponentials, logarithms, trigonometric functions, and so on).
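The additive case can be illustrated with a toy version of the Paillier cryptosystem. The source does not name a scheme; Paillier, which is a public-key scheme, is swapped in here purely for illustration, and the tiny primes make the example insecure by design. The function names are hypothetical.

```python
import math
import random


def keygen(p: int = 17, q: int = 19):
    """Return (public n, private (lam, mu)) for tiny demonstration primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, requires Python 3.8+
    return n, (lam, mu)


def encrypt(n: int, m: int, rng=None) -> int:
    """Encrypt m (0 <= m < n) with generator g = n + 1 and random r."""
    rng = rng or random.Random()
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return c1 * c2 % (n * n)
```

With these definitions, decrypting `add_encrypted(n, encrypt(n, 12), encrypt(n, 30))` yields 42, even though the party that multiplied the two ciphertexts never saw 12 or 30.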
The applicant found that a constructed agent-based simulation (ABS) model can be applied to different regions. It is only necessary to adjust its predictable parameters (such as the age distribution and sex ratio of the population) to the conditions of the target region and then obtain the values of the unpredictable parameters through validation; the model can then be used to predict and analyze the subsequent development of the epidemic in the target region. In general, the larger the region covered by the simulation and the more agents used to build the model, the better the model performs and the more accurately it reflects the real state of the system. However, data on population distribution, population activity, and the epidemic situation in each region may involve privacy or security issues and is therefore sensitive. Usually only credible local institutions are authorized to view such data, and it cannot be pooled in one place for training or validation. Each institution can therefore only run validation simulations on its own limited data, so the values obtained for the unpredictable parameters are often not optimal, which degrades the model and may bias its predictions.
On this basis, the embodiments of the present application provide a multi-agent model training method, apparatus, electronic device, computer-readable storage medium, and computer program product that allow multiple participant devices, coordinated by a collaborator device, to jointly train a multi-agent model while keeping local data secure, thereby solving the data-island problem in the field of multi-agent models.
Based on the above explanation of the nouns and terms involved in the embodiments of the present application, the implementation scenario of the multi-agent model training method provided by the embodiments is described below. Referring to FIG. 1, which is a schematic diagram of that implementation scenario, participant devices 200-1, 200-2, ..., 200-n are connected to a collaborator device 400 through a network 300 to support an exemplary application. The participant devices 200-1, 200-2, ..., 200-n may belong to institutions, such as hospitals, that store the predictable parameters, the unpredictable parameters, and the real values of the prediction target. The collaborator device 400 may belong to a credible institution. The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 assist one another in federated learning so that the participant devices obtain a multi-agent model. The network 300 may be a wide area network, a local area network, or a combination of the two, and transmits data over wireless or wired links.
The participant devices (including participant devices 200-1, 200-2, ..., 200-n) are configured to: input a training parameter value of a predictable parameter into a local multi-agent model and, with the training parameter value fixed, input each of a plurality of parameter value groups into the multi-agent model for prediction to obtain a plurality of prediction results, wherein each parameter value group includes a parameter value of at least one unpredictable parameter; determine an influence factor for each parameter value group based on the plurality of prediction results and the actual result corresponding to each prediction result; aggregate the parameter values of each unpredictable parameter based on the parameter value groups and their corresponding influence factors to obtain an intermediate parameter value for each unpredictable parameter; and send the obtained intermediate parameter values to the collaborator device.
The collaborator device (including collaborator device 400) is configured to aggregate the intermediate parameter values sent by the plurality of participant devices to obtain a target parameter value for each unpredictable parameter, and to send the target parameter values to the participant devices.
The participant devices (including participant devices 200-1, 200-2, ..., 200-n) are further configured to receive the target parameter value for each unpredictable parameter returned by the collaborator device and to update the multi-agent model based on the target parameter values.
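The exchange between participants and collaborator can be sketched as follows. This passage only says that the collaborator "aggregates" the intermediate values, so the unweighted mean used here is an assumption, as are the class `ParticipantModel` and the function `collaborator_aggregate`; transport over the network and any encryption are omitted.

```python
from typing import Dict, List


def collaborator_aggregate(intermediates: List[Dict[str, float]]) -> Dict[str, float]:
    """Assumed rule: unweighted mean of the participants' intermediate values."""
    names = intermediates[0].keys()
    k = len(intermediates)
    return {name: sum(m[name] for m in intermediates) / k for name in names}


class ParticipantModel:
    """Hypothetical stand-in for a participant's local multi-agent model."""

    def __init__(self, predictable: Dict[str, float],
                 unpredictable: Dict[str, float]) -> None:
        self.predictable = dict(predictable)      # stays local and fixed
        self.unpredictable = dict(unpredictable)  # jointly optimised

    def update(self, target: Dict[str, float]) -> None:
        """Overwrite the unpredictable parameters with the target values."""
        self.unpredictable.update(target)


# One round: each participant reports its intermediate values, the
# collaborator aggregates them, and every participant applies the result.
participants = [
    ParticipantModel({"population": 1000.0}, {"beta": 0.3}),
    ParticipantModel({"population": 5000.0}, {"beta": 0.5}),
]
target = collaborator_aggregate([p.unpredictable for p in participants])
for p in participants:
    p.update(target)
```

Only the values of the unpredictable parameters leave each participant; the predictable parameters and the real observations used for validation never do.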
In practical applications, the trained multi-agent model can be applied to modeling the recent worldwide COVID-19 epidemic, enabling joint modeling across multiple cities, regions, and countries, improving the prediction accuracy of the model, and providing more accurate data to the public and to policymakers.
In practical applications, the participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 may be independent physical servers, server clusters or distributed systems composed of multiple physical servers, or cloud servers that provide basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms. The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 may also be, without limitation, smartphones, tablet computers, notebook computers, desktop computers, smart speakers, smart watches, and the like. The participant devices 200-1, 200-2, ..., 200-n and the collaborator device 400 may be connected directly or indirectly through wired or wireless communication, which is not limited in this application.
The hardware structure of the electronic device that implements the multi-agent model training method provided by the embodiments of the present application is described in detail below. The electronic device includes, but is not limited to, a server or a terminal. Referring to FIG. 2, which is a schematic structural diagram of the electronic device provided by an embodiment of the present application, the electronic device 200 shown in FIG. 2 includes at least one processor 210, a memory 250, at least one network interface 220, and a user interface 230. The components of the electronic device 200 are coupled together through a bus system 240. It can be understood that the bus system 240 enables connection and communication between these components. In addition to a data bus, the bus system 240 includes a power bus, a control bus, and a status signal bus. For clarity, however, the various buses are all labeled as bus system 240 in FIG. 2.
The processor 210 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
The user interface 230 includes one or more output devices 231 that enable the presentation of media content, including one or more speakers and/or one or more visual displays. The user interface 230 also includes one or more input devices 232, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, and other input buttons and controls.
The memory 250 may be removable, non-removable, or a combination of the two. Exemplary hardware devices include solid-state memory, hard disk drives, optical disc drives, and the like. The memory 250 optionally includes one or more storage devices physically located away from the processor 210.
The memory 250 includes volatile memory or non-volatile memory, and may include both. The non-volatile memory may be read-only memory (ROM), and the volatile memory may be random access memory (RAM). The memory 250 described in the embodiments of the present application is intended to include any suitable type of memory.
In some embodiments, the memory 250 can store data to support various operations. Examples of such data include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 251, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, and a driver layer, used to implement various basic services and to process hardware-based tasks;
A network communication module 252, used to reach other computing devices via one or more (wired or wireless) network interfaces 220, where exemplary network interfaces 220 include Bluetooth, Wireless Fidelity (WiFi), Universal Serial Bus (USB), and the like;
An input processing module 253, used to detect one or more user inputs or interactions from one of the one or more input devices 232 and to translate the detected inputs or interactions.
In some embodiments, the multi-agent model training apparatus provided by the embodiments of the present application may be implemented in software. FIG. 2 shows a multi-agent model training apparatus 254 stored in the memory 250, which may be software in the form of a program, a plug-in, or the like, and which includes the following software modules: an acquisition module 2541, a comparison module 2542, an aggregation module 2543, a sending module 2544, and an update module 2545. These modules are logical, so they can be combined arbitrarily or further split according to the functions implemented. The function of each module is described below.
In other embodiments, the multi-agent model training apparatus provided by the embodiments of the present application may be implemented in a combination of software and hardware. As an example, the apparatus may be a processor in the form of a hardware decoding processor that is programmed to execute the multi-agent model training method provided by the embodiments of the present application. For example, the processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
Based on the above description of the implementation scenario and the electronic device for the multi-agent model training method of the embodiments of the present application, the multi-agent model training method provided by the embodiments of the present application is described below. It should be noted that the training process of the multi-agent model in the embodiments of the present application differs significantly from the training process of a traditional machine learning model. Referring to FIG. 3, FIG. 3 is a diagram comparing the verification process of the multi-agent model provided by an embodiment of the present application with the training process of a machine learning model. As shown in FIG. 3, the process of obtaining a fully updated multi-agent model specifically includes constructing an initial multi-agent model (model construction), verifying the multi-agent model (verification process), and testing the multi-agent model (testing process). Constructing the initial multi-agent model refers to initializing the model parameters, presetting a loss function (used to update the multi-agent model), and so on; the verification process refers to updating the unpredictable parameters in the model through a preset number of iteration rounds; and the testing process refers to testing the correctness of the multi-agent model by modifying the output results of the model.
The process of obtaining a converged machine learning model, by contrast, specifically includes constructing an initial machine learning model, training the machine learning model, and testing the machine learning model, where the training stage iteratively updates the machine learning model over a preset number of rounds using training sample data. It should be noted that the verification process of the multi-agent model on real data is similar to the training process in machine learning: the values of the unpredictable parameters are optimized so that the results predicted by the model are as close as possible to the real data.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of the multi-agent model training method provided by an embodiment of the present application. The method includes:
Step 101, the participant device inputs the training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values held fixed, inputs a plurality of parameter value groups into the multi-agent model for prediction to obtain a plurality of prediction results, where each parameter value group includes a parameter value of at least one unpredictable parameter.
In actual implementation, the values of the predictable parameters are determined according to the local conditions of each party. For example, they may be the age, occupation, gender, and daily travel trajectories of local residents, or the gender, age, and occupation of people infected with a target disease, the number of infections, and the movement trajectories of the infected people. Here, the training parameter values of the predictable parameters are different predictable parameters obtained according to the training purpose of the local multi-agent model; that is, while a given multi-agent model is being trained and optimized, the values of the predictable parameters are fixed. As an example, if the multi-agent model is used to predict the number of local deaths from a disease, then the total number of local residents and the residents' gender, age, and so on are predictable parameters that remain fixed while the multi-agent model is trained and optimized. Correspondingly, when the use of the multi-agent model changes, only the predictable parameters need to be adjusted to put the model to another use. For example, when the model is used to predict the number of deaths in another region, the predictable parameters are adjusted to the total number of residents of that region and their gender, age, and so on. Alternatively, if the multi-agent model is used to predict the transmission probability of a disease, the fixed predictable parameter may be the number of contacts between healthy users and sick users; correspondingly, a new disease transmission probability can be determined by changing this predictable parameter, that is, the number of contacts between healthy users and sick users.
In the embodiments of the present application, a parameter value group includes a parameter value of at least one unpredictable parameter. The value of an unpredictable parameter cannot be derived from existing data or experience. It has to be obtained by feeding the unpredictable parameter into the model and comparing the resulting predicted value with the corresponding real value. In other words, the value of the unpredictable parameter is adjusted until the model result matches the actual prediction target, the optimal value is determined, and the accuracy of the simulation result is verified on test data. Put differently, suitable values of the unpredictable parameters are chosen so that the simulation results of the model conform as closely as possible to the real data (or its distribution).
In some embodiments, for the process of inputting a plurality of parameter value groups into the multi-agent model for prediction to obtain a plurality of prediction results, refer to FIG. 5. FIG. 5 is an optional schematic flowchart of the multi-agent model training method provided by an embodiment of the present application. Based on FIG. 4, step 101 may also be implemented as follows:
Step 1011, obtain the number of unpredictable parameters, and determine the number of parameter value groups based on the number of unpredictable parameters.
In actual implementation, the number of unpredictable parameters that need to be optimized is determined first, and the number of parameter value groups is then determined from it. As an example, when the number of unpredictable parameters to be optimized is n, the number of parameter value groups may be n+1.
Step 1012, based on the number of parameter value groups, determine the parameter values of the unpredictable parameters in each parameter value group.
In actual implementation, once the number of parameter value groups has been determined, parameter values of the unpredictable parameters are selected to match that number. Continuing the above example, when the number of parameter value groups is n+1, n+1 parameter values are selected as the values of the unpredictable parameters in the respective parameter value groups. When n is 3, there are 4 parameter value groups A, B, C, and D, and the parameter values of the unpredictable parameters include $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$, and $D(a_4, b_4, c_4, d_4)$.
It should be noted that selecting the parameter values of the unpredictable parameters includes obtaining the parameter type of each unpredictable parameter in the parameter value group, determining the corresponding parameter value range according to that parameter type, and then determining the parameter value of each unpredictable parameter within its range. Here, an unpredictable parameter may be the transmission coefficient of a disease, or the influence of weather, age, gender, and so on upon disease transmission. For example, when one of the unpredictable parameters to be optimized is the transmission coefficient of a disease, the value range of that parameter is determined to be 0 to K, and its parameter value is then selected at random from that range. Continuing the above example, if a is an unpredictable parameter to be optimized with a value range of 0 to K, then $a_1$, $a_2$, $a_3$, and $a_4$ are all parameter values in $(0, K)$.
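Steps 1011 and 1012 can be illustrated with a minimal sketch. It assumes that each unpredictable parameter is a real number drawn uniformly from its range; the parameter names, the ranges, and the function `build_parameter_value_groups` are hypothetical and do not come from the application.

```python
import random


def build_parameter_value_groups(param_ranges, seed=None):
    """Return n+1 parameter value groups for n unpredictable parameters.

    param_ranges maps a (hypothetical) parameter name to its (low, high)
    value range. Uniform sampling is an assumption of this sketch.
    """
    rng = random.Random(seed)
    n = len(param_ranges)               # step 1011: count the parameters
    groups = []
    for _ in range(n + 1):              # n parameters -> n+1 groups
        # step 1012: draw one value per parameter from its own range
        group = {name: rng.uniform(low, high)
                 for name, (low, high) in param_ranges.items()}
        groups.append(group)
    return groups


# Hypothetical ranges; K = 1.0 for the transmission coefficient.
ranges = {
    "transmission_coefficient": (0.0, 1.0),
    "weather_effect": (0.0, 0.5),
    "age_effect": (0.0, 2.0),
}
groups = build_parameter_value_groups(ranges, seed=0)
for label, group in zip("ABCD", groups):
    print(label, group)
```

With three parameters the sketch yields four groups, matching the n+1 rule stated in step 1011.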
Step 1013, input the parameter values of the unpredictable parameters in each parameter value group into the multi-agent model for prediction, and obtain a plurality of prediction results corresponding to the plurality of parameter value groups.
Continuing the above example, $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$, and $D(a_4, b_4, c_4, d_4)$ are each input into the multi-agent model for prediction, yielding the prediction result for group A, the prediction result for group B, the prediction result for group C, and the prediction result for group D.
Step 102, based on the plurality of prediction results and the actual result corresponding to each prediction result, determine the impact factor of each parameter value group.
Here, the impact factor can be used to characterize the degree of influence of the unpredictable parameters in each parameter value group, that is, the degree of influence of each parameter value group.
In some embodiments, determining the impact factor of each parameter value group based on the plurality of prediction results and the corresponding actual results includes: determining the prediction accuracy of each parameter value group from the prediction result of that group and the corresponding actual result, and using the prediction accuracy of each parameter value group as its impact factor. Here, the prediction accuracy may serve as the weight of each parameter value group.
In other embodiments, determining the impact factor of each parameter value group based on the plurality of prediction results and the corresponding actual results includes: determining the loss value of each parameter value group from the prediction result of that group and the corresponding actual result, and determining the impact factor of each parameter value group from its loss value. In actual implementation, the reciprocal of the loss value may be used as the impact factor of the corresponding parameter value group, so that a larger loss value gives a smaller reciprocal and hence a smaller impact factor. Alternatively, the loss value itself may be used as the impact factor, so that a larger loss value gives a larger impact factor. The embodiments of the present application do not limit the manner in which the impact factor of a parameter value group is determined from its loss value.
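The loss-based variant of step 102 can be sketched as follows. The text does not fix a loss function, so mean squared error is used here purely as an assumption, as is the small constant `eps` that guards against division by zero; the function names are hypothetical.

```python
def mean_squared_loss(predicted, actual):
    """Mean squared error between two equally long sequences (assumed loss)."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def impact_factors(predictions, actual, eps=1e-12):
    """Return (factors, losses), one entry per parameter value group.

    The impact factor is the reciprocal of the loss, so a larger loss
    gives a smaller factor. eps is an assumption of this sketch.
    """
    losses = [mean_squared_loss(p, actual) for p in predictions]
    factors = [1.0 / (loss + eps) for loss in losses]
    return factors, losses


# Hypothetical prediction results of two parameter value groups.
actual = [1.0, 3.0]
predictions = [[1.0, 2.0], [2.0, 4.0]]
factors, losses = impact_factors(predictions, actual)
print(losses)   # the first group has the smaller loss
print(factors)  # and therefore the larger impact factor
```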
Step 103, based on each parameter value group and its impact factor, aggregate the parameter values of each unpredictable parameter to obtain the intermediate parameter value of each unpredictable parameter.
In some embodiments, when the impact factor of each parameter value group is a weight, the weight of each parameter value group is multiplied by the parameter values of the unpredictable parameters in that group to obtain a product for each group, the products of all groups are summed, and the sum is used as the intermediate parameter value of the unpredictable parameters. Continuing the above example, the parameter value groups are $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$, and $D(a_4, b_4, c_4, d_4)$, with corresponding weights $x$, $y$, $z$, and $k$. The intermediate parameter value $P$ of the unpredictable parameters is then

$$P = \big(a_1 x + a_2 y + a_3 z + a_4 k,\; b_1 x + b_2 y + b_3 z + b_4 k,\; c_1 x + c_2 y + c_3 z + c_4 k,\; d_1 x + d_2 y + d_3 z + d_4 k\big).$$
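The weighted aggregation above amounts to a weighted sum taken separately for each unpredictable parameter. A minimal sketch, with hypothetical numbers and a hypothetical function name; the text does not state whether the weights are normalized, so the sketch does not normalize them:

```python
def weighted_aggregate(groups, weights):
    """Weighted sum of parameter value groups, component by component.

    groups:  list of equally long lists, one per parameter value group.
    weights: one weight per group (x, y, z, k in the example above).
    """
    dim = len(groups[0])
    return [sum(w * g[i] for g, w in zip(groups, weights))
            for i in range(dim)]


# Two hypothetical groups with two unpredictable parameters each.
groups_ab = [[1.0, 2.0], [3.0, 4.0]]
weights_ab = [0.25, 0.75]
P = weighted_aggregate(groups_ab, weights_ab)
print(P)
```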
In some embodiments, when the impact factor of each parameter value group is related to the loss value of that group, the plurality of parameter value groups are sorted by their impact factors to obtain a sorting result; a target number of parameter value groups is selected from the plurality of parameter value groups based on the sorting result, where the target number is smaller than the number of parameter value groups; the average of the parameter values of the unpredictable parameters in the selected groups is obtained; and the average is used as the intermediate parameter value of the unpredictable parameters.
In actual implementation, when the impact factor is the reciprocal of the loss value, the parameter value groups are sorted by loss value in descending or ascending order, and the target number of parameter value groups is then selected from the sorted groups. Here, the target number is smaller than the number of parameter value groups.
Continuing the above example, the parameter value groups are $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$, and $D(a_4, b_4, c_4, d_4)$. Based on the loss values, the optimal model parameter value group A, the worst model parameter value group D, and the other model parameter value groups B and C are determined. The parameter values of the unpredictable parameters in the selected target number of parameter value groups are then aggregated, that is, $a_1$, $a_2$, $a_3$, $a_4$ are aggregated, $b_1$, $b_2$, $b_3$, $b_4$ are aggregated, $c_1$, $c_2$, $c_3$, $c_4$ are aggregated, and $d_1$, $d_2$, $d_3$, $d_4$ are aggregated.
Here, aggregating the parameter values of the unpredictable parameters in the selected target number of parameter value groups includes obtaining the average of those parameter values and using the average as the intermediate parameter value of the unpredictable parameters. As an example, when n parameters are being optimized, n parameter value groups are selected from the n+1 parameter value groups, and the parameter values of each unpredictable parameter across the n selected groups are averaged to give the intermediate parameter value of that unpredictable parameter.
It should be noted that, after the average of the parameter values of the unpredictable parameters in the target number of parameter value groups has been obtained, the average can also be used to update the multi-agent model. The average and the selected parameter value groups are then aggregated again: a target number of parameter value groups is selected once more, the parameter values of the unpredictable parameters in the newly selected groups are averaged, the multi-agent model is updated again, and the aggregation is repeated. The process iterates in this way, and the average obtained in the last aggregation is used as the intermediate parameter value of the unpredictable parameters. Each participant thus locally iterates the optimization of its own unpredictable parameters for a preset number of rounds and obtains its own final average, that is, its intermediate parameter value.
Continuing the above example, the parameter value groups are $A(a_1, b_1, c_1, d_1)$, $B(a_2, b_2, c_2, d_2)$, $C(a_3, b_3, c_3, d_3)$, and $D(a_4, b_4, c_4, d_4)$. Based on the loss values, the optimal model parameter value group A, the worst model parameter value group D, and the other model parameter value groups B and C are determined, and the geometric center point of the optimal model parameter value group and the other model parameter value groups is then computed. Referring to FIG. 6A, FIG. 6A is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application. Here the geometric center point P of the three parameter value groups A, B, and C is computed as the component-wise average, $P = [(a_1 + a_2 + a_3)/3,\ (b_1 + b_2 + b_3)/3,\ (c_1 + c_2 + c_3)/3,\ (d_1 + d_2 + d_3)/3]$. After the geometric center point P has been obtained, the model parameters are updated based on the model parameter value group corresponding to P, and A, B, C, and P are then fed into the updated model for simulation to obtain the prediction results corresponding to the four model parameter value groups.
Referring to FIG. 6B, FIG. 6B is an optional schematic diagram of the aggregation of unpredictable parameters of a multi-agent model provided by an embodiment of the present application. According to the loss values, the optimal model parameter value group, the worst model parameter value group, and the other model parameter value groups are determined again among the four groups A, B, C, and P, and the geometric center point of the optimal model parameter value group and the other model parameter value groups is computed again. The process is repeated in this way, so that each participant locally iterates the optimization of its own unpredictable parameters for a preset number of rounds and obtains its own final geometric center point, that is, its intermediate parameter value.
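The local iteration illustrated by FIG. 6A and FIG. 6B can be sketched as the loop below. The loss function stands in for a full simulation of the multi-agent model, which is not reproduced here; the quadratic toy loss, the starting groups, and the function names are assumptions of the sketch.

```python
def centroid(groups):
    """Component-wise average of a list of parameter value groups."""
    dim = len(groups[0])
    return [sum(g[i] for g in groups) / len(groups) for i in range(dim)]


def local_optimize(groups, loss_fn, rounds):
    """Repeat: rank by loss, drop the worst group, add the centroid of the rest."""
    groups = [list(g) for g in groups]
    center = None
    for _ in range(rounds):
        ranked = sorted(groups, key=loss_fn)  # best (lowest loss) first
        kept = ranked[:-1]                    # discard the worst group
        center = centroid(kept)               # geometric center point P
        groups = kept + [center]              # P joins the pool for the next round
    return center


# Toy loss (assumption): squared distance to a hypothetical optimum.
optimum = [1.0, 2.0]


def toy_loss(group):
    return sum((v - o) ** 2 for v, o in zip(group, optimum))


start = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [5.0, 5.0]]
first = local_optimize(start, toy_loss, rounds=1)
final = local_optimize(start, toy_loss, rounds=5)
print(first, toy_loss(first))
print(final, toy_loss(final))
```

In the first round the group `[5.0, 5.0]` has the largest loss and is dropped, so the first center is the average of the remaining three groups.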
In this way, aggregating the parameter values of the unpredictable parameters of the parameter value groups as described above generates no additional simulation workload, that is, no new global values of the unpredictable parameters are produced, so the participants do not need to simulate any new values. Compared with single-party local optimization, the optimal values of the unpredictable parameters can be found faster and more stably, which reduces the number of simulations and the amount of model computation.
In some embodiments, when the impact factor of each parameter value group is a weight, the plurality of parameter value groups may also be sorted by their weights, and a target number of parameter value groups may be selected from them based on the sorting result, where the target number is smaller than the number of parameter value groups. The weight of each selected parameter value group is then multiplied by the parameter values of the unpredictable parameters in that group to obtain a product for each group, the products are summed, and the sum is used as the intermediate parameter value of the unpredictable parameters.
It should be noted that, as a further way of aggregating the parameter values of the unpredictable parameters based on the parameter value groups and their impact factors, the plurality of parameter value groups may be sorted by loss value, and a target number of parameter value groups may be selected from them based on the sorting result, where the target number is smaller than the number of parameter value groups. The weight of each selected parameter value group is then multiplied by the parameter values of the unpredictable parameters in that group to obtain a product for each group, the products are summed, and the sum is used as the intermediate parameter value of the unpredictable parameters. The embodiments of the present application do not limit the manner in which the parameter values of the unpredictable parameters are aggregated based on the parameter value groups and their impact factors.
Step 104, send the obtained intermediate parameter values to the coordinator device, where the intermediate parameter values are used to trigger the coordinator device to aggregate the intermediate parameter values sent by a plurality of participant devices and obtain the target parameter value of each unpredictable parameter.
In actual implementation, after the intermediate parameter values have been obtained, privacy protection is applied to the intermediate parameter value of each unpredictable parameter to obtain privacy-protected intermediate parameter values. The privacy protection may take the form of obfuscating the intermediate parameter values, for example by adding noise or by differential privacy processing. What the coordinator device receives is therefore the parameter values obtained after at least two participant devices have applied privacy processing to their intermediate parameter values. It should be understood that, when the coordinator device aggregates the intermediate parameter values of at least two participant devices, the noise cancels out and does not affect the aggregation result. Alternatively, the privacy protection may take the form of homomorphic encryption of the intermediate parameter values.
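One way in which added noise can cancel during aggregation is pairwise additive masking, where each pair of participants agrees on a random mask that one adds and the other subtracts. The application does not say which noise scheme it uses, so the sketch below is only an illustration under that assumption. For brevity the masks are generated in one place; in a real deployment each pair would derive its mask from a shared secret.

```python
import random


def masked_uploads(values, seed=0):
    """Mask each participant's vector so that the masks cancel in the sum.

    values: list of per-participant intermediate parameter vectors.
    Pairwise masking and the mask range are assumptions of this sketch.
    """
    rng = random.Random(seed)
    count = len(values)
    dim = len(values[0])
    masked = [list(v) for v in values]
    for i in range(count):
        for j in range(i + 1, count):
            for d in range(dim):
                mask = rng.uniform(-1.0, 1.0)
                masked[i][d] += mask   # participant i adds the pair's mask
                masked[j][d] -= mask   # participant j subtracts the same mask
    return masked


# Hypothetical intermediate parameter values of three participants.
plain = [[0.2, 1.0], [0.4, 3.0], [0.6, 2.0]]
uploads = masked_uploads(plain)
plain_sum = [sum(col) for col in zip(*plain)]
masked_sum = [sum(col) for col in zip(*uploads)]
print(uploads)                 # individual uploads are obscured
print(plain_sum, masked_sum)   # the sums agree up to rounding
```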
In actual implementation, the coordinator can aggregate the intermediate parameter values sent by the plurality of participant devices in several ways. For example, it may compute the geometric center (the average) of the intermediate parameter values sent by the participants, or it may randomly select the center points uploaded by some of the participants and average them. Alternatively, in addition to its geometric center point, each participant may upload the loss value of its optimal or worst model parameter value group, or the average loss value of all its model parameter value groups other than the worst one; the coordinator then sorts the participants by loss value, selects several of the better center points, and averages them to obtain a new center point. The embodiments of the present application do not limit the process by which the coordinator performs the parameter aggregation operation.
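Two of the coordinator-side options listed above, plain averaging and averaging only the center points with the lowest uploaded losses, can be sketched together. The function name, the `top_k` argument, and the sample numbers are hypothetical.

```python
def aggregate_centers(centers, losses=None, top_k=None):
    """Average the uploaded center points.

    If losses and top_k are both given, only the top_k center points with
    the lowest uploaded loss are averaged; otherwise all are averaged.
    """
    if losses is not None and top_k is not None:
        order = sorted(range(len(centers)), key=lambda i: losses[i])
        centers = [centers[i] for i in order[:top_k]]
    dim = len(centers[0])
    return [sum(c[i] for c in centers) / len(centers) for i in range(dim)]


# Hypothetical uploads from three participants.
centers = [[1.0, 1.0], [3.0, 3.0], [10.0, 10.0]]
uploaded_losses = [0.1, 0.2, 5.0]
target_all = aggregate_centers(centers)
target_best = aggregate_centers(centers, uploaded_losses, top_k=2)
print(target_all)
print(target_best)
```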
Step 105: receive the target parameter value corresponding to each unpredictable parameter returned by the coordinator device, and update the multi-agent model based on the target parameter values.
It should be noted that a participant may update the multi-agent model based on the target parameter values in either of two ways.
In some embodiments, referring to FIG. 7A, which is an optional schematic flowchart of the multi-agent model training method provided by an embodiment of the present application, the whole training process is completed in two stages. The first stage is local training of the multi-agent model; once the model meets the convergence condition, the intermediate parameter values at convergence are uploaded to the coordinator device (the parameter aggregation device), where they trigger the coordinator device to perform the second-stage parameter aggregation. To suit preliminary or rapid modeling scenarios, the second-stage parameter aggregation may be performed only once, after which the whole model is considered converged.
In other embodiments, referring to FIG. 7B, which is an optional schematic flowchart of the multi-agent model training method provided by an embodiment of the present application, a participant may instead perform the local parameter aggregation of the multi-agent model only once before uploading the intermediate parameter values to the coordinator device, where they trigger the coordinator device to perform a single second-stage parameter aggregation. The coordinator device then returns the aggregated target parameter values to the participant devices, which update their local models. Based on the updated model, each participant continues simulating with its local multi-agent model and again uploads intermediate parameter values to the coordinator device; this process repeats until the local multi-agent model converges.
It should be noted that, in the second update mode above, after a participant device obtains the target parameter values, it updates its local multi-agent model based on them and then inputs the target parameter values, together with the target number of parameter value groups selected before the model update, into the updated local multi-agent model. The target parameter values and those previously selected groups are then aggregated: the target number of parameter value groups is selected once more, and the parameter values of the unpredictable parameters in the newly selected groups are averaged and sent to the coordinator device as the intermediate parameter values. The above process then continues.
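The re-selection step of the second update mode can be sketched as follows. This is a minimal illustration under stated assumptions: the candidates are ranked by a loss function (the source only says that the target number of groups is selected again), and toy_loss is a made-up stand-in for running the simulation and scoring it against real data. The names next_intermediate and toy_loss are hypothetical.

```python
def next_intermediate(target, selected_groups, loss_fn):
    """Re-select k groups from the old selection plus the returned target,
    then average them to form the next intermediate parameter value."""
    k = len(selected_groups)
    candidates = [list(g) for g in selected_groups] + [list(target)]
    candidates.sort(key=loss_fn)   # assumption: lower loss is better
    reselected = candidates[:k]    # keep the target number of groups
    dim = len(target)
    mean = [sum(g[d] for g in reselected) / k for d in range(dim)]
    return reselected, mean

def toy_loss(group):
    # Hypothetical stand-in for "simulate, then compare with actual results".
    return (group[0] - 0.3) ** 2 + (group[1] - 0.7) ** 2

selected = [[0.0, 1.0], [0.0, 0.0]]   # groups kept before the update
target = [0.5, 0.5]                   # value returned by the coordinator
reselected, intermediate = next_intermediate(target, selected, toy_loss)
```

In this toy run the returned target has the lowest loss, so it displaces the weaker of the two previously selected groups before the average is taken.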
In some embodiments, after the multi-agent model has been trained, it can be put to further use by changing the actual parameter values of the predictable parameters, where the actual parameter values differ from the training parameter values of the predictable parameters. As an example, the predictable parameters include the sex, age, and occupation of persons infected with a target disease and the number of infected persons; the actual parameter values may then be the sex, age, and occupation of persons infected with the target disease in a target area and the number of infected persons there. Inputting these actual parameter values into the updated multi-agent model for prediction yields the number of deaths caused by the target disease in the target area.
In this way, using the multi-agent model to predict disease-related data improves prediction accuracy, so that the disease situation can be grasped in time, medical resources can be dispatched quickly, and disease prevention and control can be carried out promptly.
By applying the above embodiments of the present application, in contrast to the related art, in which a multi-agent model can only be trained by the data owner alone, each participant locally aggregates the unpredictable parameters to obtain intermediate parameter values, sends them to the coordinator, and updates the multi-agent model based on the target parameter values that the coordinator returns after performing a second aggregation of the received intermediate parameter values. In this way, when multiple participants train multi-agent models that serve the same purpose, the values of the unpredictable parameters are jointly optimized, yielding a multi-agent model whose simulation results agree better with the real data. Local data remain secure, the data-silo problem in the field of multi-agent models is addressed, and joint modeling among multiple participants is achieved, thereby improving the prediction accuracy of the model.
Having described the training method for the multi-agent model provided by the embodiments of the present application, the application of the trained multi-agent model is described next. Taking disease spread prediction as a practical scenario, the prediction method based on the multi-agent model provided by the embodiments of the present application is introduced. Referring to FIG. 8, which is a schematic flowchart of the prediction method, the method includes:
Step 201: the participant device obtains actual parameter values of the predictable parameters, where the actual parameter values differ from the training parameter values of the predictable parameters.
In actual implementation, obtaining the actual parameter values of the predictable parameters includes obtaining the total number of residents in the target area; the sex, age, and occupation of the residents; the sex, age, and occupation of the persons infected with the target disease; and the movement trajectories of the infected persons. Here, the target area may be a city or a country, and the target disease may be a new, highly transmissible disease. A person infected with the target disease may be at least one imported case who entered the target area from outside it, or a local spreader who moves freely within the target area without being subject to disease control.
Step 202: input the actual parameter values into the updated multi-agent model for prediction to obtain the corresponding prediction result.
In actual implementation, the obtained total number of residents in the target area; the sex, age, and occupation of the residents; the sex, age, and occupation of the persons infected with the target disease; and the movement trajectories of the infected persons are input into the updated multi-agent model. The model can then predict the effect of the infected persons on the residents of the target area, that is, the number of new infections in the target area caused by the infected persons.
In this way, once specific values of the predictable parameters have been obtained, the updated multi-agent model can, compared with the model before the update, more accurately predict the effect of the infected persons on the target area, namely the number of people infected. Medical resources can then be prepared adequately and infected persons treated in time, avoiding a rise in mortality caused by insufficient medical resources.
In some embodiments, the updated multi-agent model can also be used to predict urban traffic conditions, that is, to predict the number of congested vehicles on a target road section of a target area during a target time period in the future. Specifically, this includes obtaining the actual parameter values of the predictable parameters, namely the travel trajectories of the population in the target area, the distribution of office areas, holiday dates, and so on. Here, the target area may be any of the different central districts of a city. In actual implementation, the obtained travel trajectories, office area distribution, holiday dates, and so on are input into the updated multi-agent model, which can then predict the number of congested vehicles on the target road section of the target area during the target time period. In this way, once specific values of the predictable parameters have been obtained, the updated multi-agent model can, compared with the model before the update, more accurately predict congestion on the target road section during the target time period, so that traffic control measures can be taken in time.
By applying the above embodiments of the present application, in contrast to the related art, in which a multi-agent model can only be trained by the data owner alone, each participant locally aggregates the unpredictable parameters to obtain intermediate parameter values, sends them to the coordinator, and updates the multi-agent model based on the target parameter values that the coordinator returns after performing a second aggregation of the received intermediate parameter values. In this way, when multiple participants train multi-agent models that serve the same purpose, the values of the unpredictable parameters are jointly optimized, yielding a multi-agent model whose simulation results agree better with the real data. Local data remain secure, the data-silo problem in the field of multi-agent models is addressed, and joint modeling among multiple participants is achieved, thereby improving the prediction accuracy of the model.
Next, the training of the multi-agent model provided by the embodiments of the present application is described using horizontal federated learning as an example scenario. In horizontal federated learning there is usually one coordinator and at least two participants; that is, the model is trained jointly by one coordinator device and at least two participant devices. Both the participant devices and the coordinator device may be servers or terminals. Referring to FIG. 9, which is a schematic flowchart of the training method for the multi-agent model provided by an embodiment of the present application, the method includes:
Step 301: each participant device initializes its local multi-agent model.
Here, in the horizontal federated learning scenario, the participants act as data holders whose data sets overlap relatively little in users but relatively heavily in user features, and each participant holds the labels of its own users. For example, the participants may be hospitals in different regions: the users they reach are residents of different regions (that is, the samples differ), but their business is the same (that is, the features are the same). Correspondingly, the coordinator device may be an institution with public credibility.
Referring to FIG. 10, which shows a horizontal federated learning method for a multi-agent model provided by an embodiment of the present application, one coordinator device and n participant devices are shown, and all participants have the same structure and work in the same way. In this embodiment, every participant device holds the same multi-agent model, its own private predictable parameters X_{1,E}, …, X_{N,E}, its own unpredictable parameters X_{1,V}, …, X_{N,V}, and the target variables Y_{1,gt}, …, Y_{N,gt} that each party's local multi-agent model simulates. In a specific implementation, the local multi-agent model is initialized by determining the values of the predictable parameters X_E, the structure of the multi-agent model, and the prediction target Y_gt, and by selecting the unpredictable parameters X_V.
Step 302: input the parameter values of the predictable parameters into the local multi-agent model.
Continuing with FIG. 10, each participant inputs its private predictable parameters X_{1,E}, …, X_{N,E} into its local ABS model.
Step 303: with the parameter values of the predictable parameters held fixed, input multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results.
As an example, consider optimizing two parameters (a, b). Each participant initializes three groups of values, each containing one value for each of the two parameters, so that each group can be regarded as a point. Continuing with FIG. 10, the unpredictable parameters X_{1,V}, …, X_{N,V} are input into the local ABS model; in this example X_{1,V} corresponds to parameter a and X_{2,V} to parameter b, and the three groups initialized by each participant are [a_1, b_1], [a_2, b_2], and [a_3, b_3]. Each of the three groups is substituted into the model for simulation, yielding one model prediction result per group.
Step 304: compare each of the multiple prediction results with the corresponding actual result.
Continuing the example, if the multi-agent model is used to predict the local death toll, then the actual local death toll over a given period is the actual result. Comparing the multiple prediction results with the corresponding actual result then means comparing the death tolls predicted for [a_1, b_1], [a_2, b_2], and [a_3, b_3] with the actual local death toll.
Step 305: based on the comparison results, determine the loss value corresponding to each parameter value group.
In actual implementation, the mean squared error (MSE) can usually be used as the loss function to compute the loss value corresponding to each parameter value group.
Step 306: sort the multiple loss values to obtain the optimal model parameter value group, the worst model parameter value group, and the other model parameter value groups.
Continuing the example, the loss values of the prediction results corresponding to [a_1, b_1], [a_2, b_2], and [a_3, b_3] are determined and sorted, giving the optimal model parameter value group [a_1, b_1], the worst model parameter value group [a_2, b_2], and the other model parameter value group [a_3, b_3].
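Steps 304 to 306 can be sketched as follows. This is a minimal illustration with made-up numbers: the prediction series stand in for outputs of the multi-agent simulation, and the names mse and rank_groups are hypothetical.

```python
def mse(predicted, actual):
    """Mean squared error between a predicted and an actual series."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def rank_groups(groups, predictions, actual):
    """Sort parameter value groups by loss; return (best, worst, others)."""
    losses = [mse(pred, actual) for pred in predictions]
    order = sorted(range(len(groups)), key=lambda i: losses[i])
    ranked = [(groups[i], losses[i]) for i in order]
    return ranked[0], ranked[-1], ranked[1:-1]

# Three groups [a_i, b_i] and the series each one produced (illustrative).
groups = [[0.1, 0.5], [0.9, 0.2], [0.3, 0.4]]
predictions = [[10.0, 12.0], [30.0, 35.0], [14.0, 15.0]]
actual = [11.0, 12.0]   # e.g. observed death toll in two periods

best, worst, others = rank_groups(groups, predictions, actual)
```

With these numbers the first group is the optimal one, the second the worst, and the third falls in between, mirroring the roles of [a_1, b_1], [a_2, b_2], and [a_3, b_3] in the example above.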
Step 307: aggregate the parameter values of the unpredictable parameters of all model parameter value groups other than the worst one to obtain the intermediate parameter value corresponding to each unpredictable parameter.
As an example, aggregating the parameter values of the unpredictable parameters may consist of computing the geometric center point of the optimal model parameter value group and the other model parameter value groups. Referring to FIG. 11, which is an optional schematic diagram of unpredictable-parameter aggregation for a multi-agent model provided by an embodiment of the present application, and continuing the example, the geometric center point C of the optimal model parameter value group [a_1, b_1] and the other model parameter value group [a_3, b_3] is computed as C = [(a_1 + a_3)/2, (b_1 + b_3)/2].
It should be noted that, after the geometric center point C is obtained, the model parameters are updated based on the model parameter value group [(a_1 + a_3)/2, (b_1 + b_3)/2] corresponding to C, and [a_1, b_1], [a_3, b_3], and [(a_1 + a_3)/2, (b_1 + b_3)/2] are substituted into the updated model for further simulation, yielding one prediction result per group. Steps 304 to 307 are then repeated. In this way, each participant locally and iteratively optimizes its own unpredictable parameters for N_L rounds and obtains its final geometric center point C_{i,V}^{t+1}, which is the intermediate parameter value.
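The local iteration described above (drop the worst group, take the geometric center of the rest, and carry the center into the next round) can be sketched as follows. This is a minimal illustration, not a definitive implementation: toy_loss is a made-up stand-in for simulating with the multi-agent model and computing the loss against real data, and the names centroid, local_optimize, and toy_loss are hypothetical.

```python
def centroid(points):
    """Coordinate-wise arithmetic mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def local_optimize(groups, loss_fn, num_rounds):
    """Run num_rounds of: rank by loss, drop the worst group, add the centroid."""
    if num_rounds < 1:
        raise ValueError("num_rounds must be at least 1")
    groups = [list(g) for g in groups]   # work on a copy
    center = None
    for _ in range(num_rounds):
        groups.sort(key=loss_fn)         # best first, worst last
        kept = groups[:-1]               # all groups except the worst
        center = centroid(kept)          # geometric center point C
        groups = kept + [center]         # C replaces the worst group
    return center

def toy_loss(group):
    # Hypothetical loss with its minimum at (0.3, 0.7).
    return (group[0] - 0.3) ** 2 + (group[1] - 0.7) ** 2

start = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
one_round = local_optimize(start, toy_loss, 1)
three_rounds = local_optimize(start, toy_loss, 3)
```

Because each new point is an average of existing points, this scheme can only move inside the convex hull of the initial groups, so the initial groups need to bracket the region of interest.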
Step 308: send the intermediate parameter values to the coordinator device.
Continuing with FIG. 10, each of the n participant devices sends its final geometric center point C_{i,V}^{t+1} to the coordinator device.
Step 309: the coordinator device aggregates the received intermediate parameter values to obtain the target parameter value corresponding to each unpredictable parameter.
As an example, three specific aggregation methods are listed to illustrate in detail how the coordinator aggregates the received intermediate parameter values: a) a typical method is to take the geometric center (centroid) of all uploaded points, that is, C_{Server,V}^{t+1} = centroid(C_{1,V}^{t+1}, …, C_{N,V}^{t+1}); b) the center points uploaded by a randomly selected subset of participants are averaged, for example K randomly selected parties with K < N, so that C_{Server,V}^{t+1} = centroid(C_{1,V}^{t+1}, …, C_{K,V}^{t+1}); c) in addition to its geometric center point, each participant uploads the loss value of its best or worst point, or the average loss value of all points other than the worst one; the participants are ranked by loss value and the best K center points, K < N, are averaged to obtain a new center point, C_{Server,V}^{t+1} = centroid(C_{1,V}^{t+1}, …, C_{K,V}^{t+1}).
For example, the coordinator device aggregates the received geometric center points by taking the centroid of C_1, …, C_n: if C_1 = [x_1, y_1], …, C_n = [x_n, y_n], then C_{Server,V}^{t+1} = [(x_1 + … + x_n)/n, (y_1 + … + y_n)/n].
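The three aggregation methods a) to c) can be sketched as follows. This is a minimal illustration with made-up center points and loss values; the function names are hypothetical, and the random selection in b) uses a seeded generator only to keep the sketch reproducible.

```python
import random

def centroid(points):
    """Coordinate-wise arithmetic mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def aggregate_all(centers):
    """a) Centroid of the center points uploaded by all participants."""
    return centroid(centers)

def aggregate_random_subset(centers, k, seed=None):
    """b) Centroid of k randomly selected center points."""
    rng = random.Random(seed)
    return centroid(rng.sample(centers, k))

def aggregate_best_k(centers, losses, k):
    """c) Centroid of the k center points with the lowest uploaded loss."""
    order = sorted(range(len(centers)), key=lambda i: losses[i])
    return centroid([centers[i] for i in order[:k]])

centers = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]   # uploaded C_1, C_2, C_3
losses = [0.5, 0.1, 0.9]                         # uploaded loss values

target_all = aggregate_all(centers)
target_best2 = aggregate_best_k(centers, losses, k=2)
target_random2 = aggregate_random_subset(centers, k=2, seed=1)
```

Method c) needs the extra loss value from each participant, which is why the source has participants upload it alongside the center point.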
Step 310: send the target parameter values to each participant device.
Continuing with FIG. 10, the coordinator device sends the target parameter value C_{Server,V}^{t+1} corresponding to each unpredictable parameter, obtained through aggregation, to the n participant devices.
Step 311: update the multi-agent model based on the target parameter values.
In actual implementation, after a participant device obtains the target parameter values, that is, the optimized unpredictable parameters, it optimizes its local multi-agent model according to these unpredictable parameters.
By applying the above embodiments of the present application, in contrast to the related art, in which a multi-agent model can only be trained by the data owner alone, each participant locally aggregates the unpredictable parameters to obtain intermediate parameter values, sends them to the coordinator, and updates the multi-agent model based on the target parameter values that the coordinator returns after performing a second aggregation of the received intermediate parameter values. In this way, when multiple participants train multi-agent models that serve the same purpose, the values of the unpredictable parameters are jointly optimized, yielding a multi-agent model whose simulation results agree better with the real data. Local data remain secure, the data-silo problem in the field of multi-agent models is addressed, and joint modeling among multiple participants is achieved, thereby improving the prediction accuracy of the model.
The training apparatus 254 for the multi-agent model provided by the embodiments of the present application is described next. Referring to FIG. 12, which is a schematic structural diagram of the training apparatus 254, the apparatus includes:
an acquisition module 2541, configured for the participant device to input training parameter values of the predictable parameters into the local multi-agent model and, with the training parameter values held fixed, to input multiple parameter value groups into the multi-agent model for prediction to obtain multiple prediction results, where each parameter value group includes a parameter value of at least one unpredictable parameter;
a comparison module 2542, configured to determine an impact factor for each parameter value group based on the multiple prediction results and the actual result corresponding to each prediction result;
an aggregation module 2543, configured to aggregate the parameter values of each unpredictable parameter based on the parameter value groups and their impact factors to obtain the intermediate parameter value corresponding to each unpredictable parameter;
a sending module 2544, configured to send the obtained intermediate parameter values to the coordinator device, where the intermediate parameter values trigger the coordinator device to aggregate the intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each unpredictable parameter;
an update module 2545, configured to receive the target parameter value corresponding to each unpredictable parameter returned by the coordinator device, and to update the multi-agent model based on the target parameter values.
In some embodiments, the acquisition module 2541 is further configured to: obtain the number of unpredictable parameters and determine the number of parameter value groups based on it; determine, based on the number of parameter value groups, the parameter values of the unpredictable parameters in each parameter value group; and input the parameter values of the unpredictable parameters in each parameter value group into the multi-agent model for prediction to obtain multiple prediction results corresponding to the multiple parameter value groups.
In some embodiments, the acquisition module 2541 is further configured to: obtain the parameter type of each unpredictable parameter in the parameter value group; determine the corresponding parameter value range according to the parameter type of each unpredictable parameter; and determine the parameter value of each unpredictable parameter according to its parameter value range.
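The initialization performed by the acquisition module can be sketched as follows. The sketch rests on several assumptions that the source does not state: the number of groups is taken as the number of unpredictable parameters plus one (consistent with the two-parameter, three-group example), values are drawn uniformly from the range of each type, and the type names and ranges in TYPE_RANGES are purely illustrative. The name init_groups is hypothetical.

```python
import random

# Hypothetical mapping from parameter type to its admissible value range.
TYPE_RANGES = {
    "probability": (0.0, 1.0),
    "days": (1.0, 14.0),
}

def init_groups(param_types, seed=None):
    """Create parameter value groups for the given unpredictable parameters."""
    rng = random.Random(seed)
    num_groups = len(param_types) + 1   # assumption: n parameters, n + 1 groups
    groups = []
    for _ in range(num_groups):
        # One value per parameter, drawn uniformly from that type's range.
        groups.append([rng.uniform(*TYPE_RANGES[t]) for t in param_types])
    return groups

param_types = ["probability", "days"]
groups = init_groups(param_types, seed=0)
```

Any other rule for the group count or sampling distribution would fit the same interface; only the range lookup by parameter type follows directly from the description.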
In some embodiments, the comparison module 2542 is further configured to: determine the prediction accuracy corresponding to each parameter value group based on the prediction result for that group and the corresponding actual result; and use the prediction accuracy corresponding to each parameter value group as its impact factor.
In some embodiments, the aggregation module 2543 is further configured to: multiply the prediction accuracy corresponding to each parameter value group by the parameter value of the unpredictable parameter to obtain a product for each parameter value group; sum the products over the parameter value groups to obtain an accumulated result; and use the accumulated result as the intermediate parameter value of the unpredictable parameter.
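The accuracy-weighted aggregation can be sketched as follows. With normalize=False the function performs exactly the multiply-and-accumulate described above; the normalize option, which rescales the accuracies to sum to one so the result stays within the range of the inputs, is an added assumption and not part of the source. The name weighted_aggregate and the numbers are hypothetical.

```python
def weighted_aggregate(groups, accuracies, normalize=False):
    """Sum of accuracy * parameter value over all groups, per parameter."""
    weights = list(accuracies)
    if normalize:
        # Assumption (not in the source): rescale weights to sum to one.
        total = sum(weights)
        weights = [w / total for w in weights]
    dim = len(groups[0])
    return [sum(w * g[d] for w, g in zip(weights, groups)) for d in range(dim)]

groups = [[1.0, 2.0], [3.0, 4.0]]   # two parameter value groups
accuracies = [0.5, 0.25]            # prediction accuracy of each group

intermediate = weighted_aggregate(groups, accuracies)
balanced = weighted_aggregate(groups, [0.5, 0.5], normalize=True)
```

When the accuracies already sum to one, as in the second call, both variants give the same weighted average.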
In some embodiments, the comparison module 2542 is further configured to: determine the loss value corresponding to each parameter value group based on the prediction result for that group and the corresponding actual result; and determine the impact factor of each parameter value group based on its loss value.
In some embodiments, the aggregation module 2543 is further configured to: sort the multiple parameter value groups by their impact factors to obtain a sorting result; select, based on the sorting result, a target number of parameter value groups from the multiple parameter value groups, where the target number is smaller than the number of parameter value groups; and aggregate the parameter values of each unpredictable parameter based on the selected parameter value groups to obtain the intermediate parameter value corresponding to each unpredictable parameter.
In some embodiments, the aggregation module 2543 is further configured to: obtain the average of the parameter values of the unpredictable parameter over the target number of parameter value groups; and use the average as the intermediate parameter value of the unpredictable parameter.
In some embodiments, the sending module 2544 is further configured to: apply privacy protection to the intermediate parameter value of each unpredictable parameter to obtain privacy-protected intermediate parameter values; and send the privacy-protected intermediate parameter values to the coordinator device, where they trigger the coordinator device to aggregate the privacy-protected intermediate parameter values sent by multiple participant devices to obtain the target parameter value corresponding to each unpredictable parameter.
In some embodiments, the apparatus further includes a second acquisition module 1210 and a prediction module 1220. The second acquisition module 1210 is configured to acquire an actual parameter value of the predictable parameter, the actual parameter value being different from the training parameter value of the predictable parameter; the prediction module 1220 is configured to input the actual parameter value into the updated multi-agent model for prediction to obtain a corresponding prediction result.
In some embodiments, the predictable parameters include the gender, age, and occupation of persons infected with a target disease, and the number of infected persons. The second acquisition module 1210 is further configured to acquire the gender, age, and occupation of persons infected with the target disease in a target area, and the number of infected persons; the prediction module 1220 is further configured to input the gender, age, and occupation of the infected persons in the target area, and the number of infected persons, into the updated multi-agent model to predict the number of deaths caused by the target disease in the target area.
By applying the above embodiments of the present application, in contrast to the related art, in which a multi-agent model can only be trained by the data owner alone, each participant locally aggregates the unpredictable parameters to obtain intermediate parameter values and sends them to the coordinator; the coordinator performs a secondary aggregation of the received intermediate parameter values and returns target parameter values, which are used to update the multi-agent model. In this way, when multiple participants train multi-agent models that serve the same purpose, the values of the unpredictable parameters are jointly optimized, yielding a multi-agent model whose simulation results agree better with real data. The security of local data is preserved, the data-silo problem in the field of multi-agent models is addressed, and joint modeling among multiple participants is achieved, thereby improving the prediction accuracy of the model.
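The overall round (local aggregation at each participant, secondary aggregation at the coordinator, then a model update) can be sketched with a toy example. Everything concrete here is an assumption made for illustration: `toy_model` is a one-line stand-in for a real multi-agent simulation, the accuracy score `1 / (1 + |error|)` and its normalization are one possible choice of impact factor, the coordinator uses a plain average, and privacy protection is omitted for brevity.

```python
from typing import Dict, List, Tuple

Params = Dict[str, float]


def toy_model(infected: float, params: Params) -> float:
    """Hypothetical stand-in for a multi-agent simulation: predicted deaths."""
    return infected * params["fatality_rate"]


def local_intermediate(infected: float, observed_deaths: float, candidates: List[Params]) -> Params:
    """Participant side: score every candidate group, then take a weighted mean."""
    # Accuracy score in (0, 1]; it equals 1 when prediction and observation match.
    scores = [1.0 / (1.0 + abs(toy_model(infected, c) - observed_deaths)) for c in candidates]
    total = sum(scores)
    return {
        name: sum(s * c[name] for s, c in zip(scores, candidates)) / total
        for name in candidates[0]
    }


def coordinator_aggregate(intermediates: List[Params]) -> Params:
    """Coordinator side: secondary aggregation, here a plain average."""
    count = len(intermediates)
    return {name: sum(p[name] for p in intermediates) / count for name in intermediates[0]}


def federated_round(local_data: List[Tuple[float, float]], candidates: List[Params]) -> Params:
    """Run one round; only intermediate values, never local data, reach the coordinator."""
    intermediates = [local_intermediate(inf, deaths, candidates) for inf, deaths in local_data]
    return coordinator_aggregate(intermediates)


# Two hypothetical participants, each holding (infected, observed deaths) locally.
local_data = [(1000.0, 20.0), (2000.0, 44.0)]
candidates = [{"fatality_rate": 0.01}, {"fatality_rate": 0.02}, {"fatality_rate": 0.03}]
target_params = federated_round(local_data, candidates)
# Each participant would then update its local model with target_params.
```

In this sketch the first participant's data favor a fatality rate of 0.02 and the second participant's data pull the estimate slightly higher, so the jointly optimized value reflects both data sets without either being shared.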
The multi-agent-model-based prediction apparatus 1200 provided by the embodiments of the present application is described below. Referring to FIG. 13, which is a schematic structural diagram of the multi-agent-model-based prediction apparatus 1200 provided by an embodiment of the present application, the apparatus 1200 includes:
a second acquisition module 1210, configured to acquire an actual parameter value of the predictable parameter, the actual parameter value being different from the training parameter value of the predictable parameter; and
a prediction module 1220, configured to input the actual parameter value into the updated multi-agent model for prediction to obtain a corresponding prediction result.
By applying the above embodiments of the present application, in contrast to the related art, in which a multi-agent model can only be trained by the data owner alone, each participant locally aggregates the unpredictable parameters to obtain intermediate parameter values and sends them to the coordinator; the coordinator performs a secondary aggregation of the received intermediate parameter values and returns target parameter values, which are used to update the multi-agent model. In this way, when multiple participants train multi-agent models that serve the same purpose, the values of the unpredictable parameters are jointly optimized, yielding a multi-agent model whose simulation results agree better with real data. The security of local data is preserved, the data-silo problem in the field of multi-agent models is addressed, and joint modeling among multiple participants is achieved, thereby improving the prediction accuracy of the model.
An embodiment of the present application further provides an electronic device, the electronic device including:
a memory, configured to store executable instructions; and
a processor, configured to implement the multi-agent model training method provided by the embodiments of the present application when executing the executable instructions stored in the memory.
An embodiment of the present application further provides a computer program product including a computer program which, when executed by a processor, implements the multi-agent model training method provided by the embodiments of the present application.
An embodiment of the present application further provides a computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to perform the multi-agent model training method provided by the embodiments of the present application.
In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or a CD-ROM; it may also be any device that includes one or any combination of the above memories.
In some embodiments, the executable instructions may take the form of a program, software, a software module, a script, or code, written in any form of programming language (including compiled or interpreted languages, and declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to a file in a file system. They may be stored as part of a file that holds other programs or data, for example in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (for example, files that store one or more modules, subroutines, or code sections).
As an example, the executable instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
In summary, with the embodiments of the present application, when multiple participants train multi-agent models that serve the same purpose, the values of the unpredictable parameters are jointly optimized, yielding a multi-agent model whose simulation results agree better with real data. The security of local data is preserved, the data-silo problem in the field of multi-agent models is addressed, and joint modeling among multiple participants is achieved, thereby improving the prediction accuracy of the model.
The above are merely embodiments of the present application and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and scope of the present application falls within the scope of protection of the present application.

Claims (15)

  1. A method for training a multi-agent model, based on a federated learning system, the system comprising a coordinator device and at least two participant devices, the method being performed by a participant device and comprising:
    inputting, by the participant device, training parameter values of predictable parameters into a local multi-agent model, and, with the training parameter values held fixed, inputting a plurality of parameter value groups into the multi-agent model respectively for prediction to obtain a plurality of prediction results;
    wherein each parameter value group comprises a parameter value of at least one unpredictable parameter;
    determining an impact factor of each parameter value group based on the plurality of prediction results and the actual result corresponding to each prediction result;
    aggregating the parameter values of each unpredictable parameter based on the parameter value groups and the corresponding impact factors to obtain an intermediate parameter value corresponding to each unpredictable parameter;
    sending the obtained intermediate parameter values to the coordinator device, wherein the intermediate parameter values are used to trigger the coordinator device to aggregate the intermediate parameter values sent by a plurality of participant devices to obtain a target parameter value corresponding to each unpredictable parameter; and
    receiving the target parameter value corresponding to each unpredictable parameter returned by the coordinator device, and updating the multi-agent model based on the target parameter values.
  2. The method according to claim 1, wherein inputting the plurality of parameter value groups into the multi-agent model respectively for prediction to obtain the plurality of prediction results comprises:
    obtaining the number of unpredictable parameters, and determining the number of parameter value groups based on the number of unpredictable parameters;
    determining the parameter values of the unpredictable parameters in each parameter value group based on the number of parameter value groups; and
    inputting the parameter values of the unpredictable parameters in each parameter value group into the multi-agent model respectively for prediction to obtain a plurality of prediction results corresponding to the plurality of parameter value groups.
  3. The method according to claim 2, wherein determining the parameter values of the unpredictable parameters in each parameter value group comprises:
    obtaining the parameter type of each unpredictable parameter in the parameter value group;
    determining a corresponding parameter value range according to the parameter type of each unpredictable parameter; and
    determining the parameter value of each unpredictable parameter according to the parameter value range of that unpredictable parameter.
  4. The method according to claim 1, wherein determining the impact factor of each parameter value group based on the plurality of prediction results and the actual result corresponding to each prediction result comprises:
    determining a prediction accuracy corresponding to each parameter value group based on the prediction result corresponding to that parameter value group and the corresponding actual result; and
    using the prediction accuracy corresponding to each parameter value group as the corresponding impact factor.
  5. The method according to claim 4, wherein aggregating the parameter values of each unpredictable parameter based on the parameter value groups and the corresponding impact factors to obtain the intermediate parameter value corresponding to each unpredictable parameter comprises:
    performing the following operations for any one of the unpredictable parameters in the parameter value groups:
    multiplying the prediction accuracy corresponding to each parameter value group by the parameter value of the unpredictable parameter in that group to obtain a product result corresponding to each parameter value group;
    accumulating the product results corresponding to the parameter value groups to obtain an accumulation result; and
    using the accumulation result as the intermediate parameter value of the unpredictable parameter.
  6. The method according to claim 1, wherein determining the impact factor of each parameter value group based on the plurality of prediction results and the actual result corresponding to each prediction result comprises:
    determining a loss value corresponding to each parameter value group based on the prediction result corresponding to that parameter value group and the corresponding actual result; and
    determining the impact factor of each parameter value group based on the loss value corresponding to that parameter value group.
  7. The method according to claim 1, wherein aggregating the parameter values of each unpredictable parameter based on the parameter value groups and the corresponding impact factors to obtain the intermediate parameter value corresponding to each unpredictable parameter comprises:
    sorting the plurality of parameter value groups based on the impact factor of each parameter value group to obtain a sorting result;
    selecting, based on the sorting result, a target number of parameter value groups from the plurality of parameter value groups, wherein the target number is smaller than the number of the plurality of parameter value groups; and
    aggregating the parameter values of each unpredictable parameter based on the selected target number of parameter value groups to obtain the intermediate parameter value corresponding to each unpredictable parameter.
  8. The method according to claim 7, wherein aggregating the parameter values of each unpredictable parameter based on the selected target number of parameter value groups to obtain the intermediate parameter value corresponding to each unpredictable parameter comprises:
    performing the following operations for any one of the unpredictable parameters in the parameter value groups:
    obtaining the average of the parameter values of the unpredictable parameter across the target number of parameter value groups; and
    using the average as the intermediate parameter value of the unpredictable parameter.
  9. The method according to claim 1, wherein sending the obtained intermediate parameter values to the coordinator device comprises:
    applying privacy protection to the intermediate parameter value of each unpredictable parameter to obtain privacy-protected intermediate parameter values; and
    sending the privacy-protected intermediate parameter values to the coordinator device, wherein the intermediate parameter values are used to trigger the coordinator device to aggregate the privacy-protected intermediate parameter values sent by a plurality of participant devices to obtain the target parameter value corresponding to each unpredictable parameter.
  10. The method according to claim 1, wherein the method further comprises:
    acquiring an actual parameter value of the predictable parameter, the actual parameter value being different from the training parameter value of the predictable parameter; and
    inputting the actual parameter value into the updated multi-agent model for prediction to obtain a corresponding prediction result.
  11. The method according to claim 10, wherein the predictable parameters comprise the gender, age, and occupation of persons infected with a target disease, and the number of infected persons;
    acquiring the actual parameter value of the predictable parameter comprises:
    acquiring the gender, age, and occupation of persons infected with the target disease in a target area, and the number of infected persons; and
    inputting the actual parameter value into the updated multi-agent model for prediction to obtain the corresponding prediction result comprises:
    inputting the gender, age, and occupation of the persons infected with the target disease in the target area, and the number of infected persons, into the updated multi-agent model to predict the number of deaths caused by the target disease in the target area.
  12. An apparatus for training a multi-agent model, the apparatus comprising:
    an acquisition module, configured to cause a participant device to input training parameter values of predictable parameters into a local multi-agent model and, with the training parameter values held fixed, to input a plurality of parameter value groups into the multi-agent model respectively for prediction to obtain a plurality of prediction results, wherein each parameter value group comprises a parameter value of at least one unpredictable parameter;
    a comparison module, configured to determine an impact factor of each parameter value group based on the plurality of prediction results and the actual result corresponding to each prediction result;
    an aggregation module, configured to aggregate the parameter values of each unpredictable parameter based on the parameter value groups and the corresponding impact factors to obtain an intermediate parameter value corresponding to each unpredictable parameter;
    a sending module, configured to send the obtained intermediate parameter values to a coordinator device, wherein the intermediate parameter values are used to trigger the coordinator device to aggregate the intermediate parameter values sent by a plurality of participant devices to obtain a target parameter value corresponding to each unpredictable parameter; and
    an update module, configured to receive the target parameter value corresponding to each unpredictable parameter returned by the coordinator device, and to update the multi-agent model based on the target parameter values.
  13. An electronic device, comprising:
    a memory, configured to store executable instructions; and
    a processor, configured to implement the method according to any one of claims 1 to 11 when executing the executable instructions stored in the memory.
  14. A computer-readable storage medium storing executable instructions which, when executed by a processor, implement the method according to any one of claims 1 to 11.
  15. A computer program product, comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 11.
PCT/CN2021/142157 2021-08-25 2021-12-28 Multi-agent model training method, apparatus, electronic device, storage medium and program product WO2023024378A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110981895.1A CN113658689A (en) 2021-08-25 2021-08-25 Multi-agent model training method and device, electronic equipment and storage medium
CN202110981895.1 2021-08-25

Publications (1)

Publication Number Publication Date
WO2023024378A1 true WO2023024378A1 (en) 2023-03-02

Family

ID=78492853

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/142157 WO2023024378A1 (en) 2021-08-25 2021-12-28 Multi-agent model training method, apparatus, electronic device, storage medium and program product

Country Status (2)

Country Link
CN (1) CN113658689A (en)
WO (1) WO2023024378A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116935136A (en) * 2023-08-02 2023-10-24 深圳大学 Federal learning method for processing classification problem of class imbalance medical image

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113658689A (en) * 2021-08-25 2021-11-16 深圳前海微众银行股份有限公司 Multi-agent model training method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239845A (en) * 2016-03-29 2017-10-10 中国石油化工股份有限公司 The construction method of effect of reservoir development forecast model
CN109871702A (en) * 2019-02-18 2019-06-11 深圳前海微众银行股份有限公司 Federal model training method, system, equipment and computer readable storage medium
CN110797124A (en) * 2019-10-30 2020-02-14 腾讯科技(深圳)有限公司 Model multi-terminal collaborative training method, medical risk prediction method and device
EP3742229A1 (en) * 2019-05-21 2020-11-25 ASML Netherlands B.V. Systems and methods for adjusting prediction models between facility locations
CN112584347A (en) * 2020-09-28 2021-03-30 西南电子技术研究所(中国电子科技集团公司第十研究所) UAV heterogeneous network multi-dimensional resource dynamic management method
CN113095512A (en) * 2021-04-23 2021-07-09 深圳前海微众银行股份有限公司 Federal learning modeling optimization method, apparatus, medium, and computer program product
CN113658689A (en) * 2021-08-25 2021-11-16 深圳前海微众银行股份有限公司 Multi-agent model training method and device, electronic equipment and storage medium

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10733515B1 (en) * 2017-02-21 2020-08-04 Amazon Technologies, Inc. Imputing missing values in machine learning models
CN109118013A (en) * 2018-08-29 2019-01-01 黑龙江工业学院 A kind of management data prediction technique, readable storage medium storing program for executing and forecasting system neural network based
CN110263936B (en) * 2019-06-14 2023-04-07 深圳前海微众银行股份有限公司 Horizontal federal learning method, device, equipment and computer storage medium
CN110826725B (en) * 2019-11-07 2022-10-04 深圳大学 Intelligent agent reinforcement learning method, device and system based on cognition
CN111737749A (en) * 2020-06-28 2020-10-02 南方电网科学研究院有限责任公司 Measuring device alarm prediction method and device based on federal learning
CN112132277A (en) * 2020-09-21 2020-12-25 平安科技(深圳)有限公司 Federal learning model training method and device, terminal equipment and storage medium
CN112329940A (en) * 2020-11-02 2021-02-05 北京邮电大学 Personalized model training method and system combining federal learning and user portrait
CN112289448A (en) * 2020-11-06 2021-01-29 新智数字科技有限公司 Health risk prediction method and device based on joint learning
CN112257873A (en) * 2020-11-11 2021-01-22 深圳前海微众银行股份有限公司 Training method, device, system, equipment and storage medium of machine learning model
CN112447299A (en) * 2020-12-01 2021-03-05 平安科技(深圳)有限公司 Medical care resource prediction model training method, device, equipment and storage medium
CN112700010A (en) * 2020-12-30 2021-04-23 深圳前海微众银行股份有限公司 Feature completion method, device, equipment and storage medium based on federal learning
US11017322B1 (en) * 2021-01-28 2021-05-25 Alipay Labs (singapore) Pte. Ltd. Method and system for federated learning
CN113112321A (en) * 2021-03-10 2021-07-13 深兰科技(上海)有限公司 Intelligent energy body method, device, electronic equipment and storage medium
CN113095508A (en) * 2021-04-23 2021-07-09 深圳前海微众银行股份有限公司 Regression model construction optimization method, device, medium, and computer program product

Also Published As

Publication number Publication date
CN113658689A (en) 2021-11-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21954893

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE