WO2021147373A1 - Method and device for implementing model update - Google Patents
Method and device for implementing model update
- Publication number
- WO2021147373A1 (application PCT/CN2020/119432)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- difference
- training
- service device
- model
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
Definitions
- the embodiments of the present application relate to the computer field, and specifically relate to a method and equipment for implementing model update.
- Federated learning is an emerging foundational artificial intelligence (AI) technology. It was originally used to solve the problem of end users updating a model locally, and its design goal is to carry out efficient machine learning among multiple nodes while protecting the privacy of terminal data and personal data. At present, the purpose of federated learning has been extended to jointly building an AI model without sharing data, so as to improve the effect of the AI model.
- In federated learning, each client trains locally on its local data and uploads the difference information of the trained local model to the service device.
- The service device then performs a joint update according to the difference information uploaded by the clients, generates a new central model, and sends the new model to the clients.
- Each client trains again locally based on the new model and uploads the model trained over multiple batches to the service device. After several such rounds, the central model on the service device converges, indicating that the federated learning is complete.
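The round described above can be sketched in a few lines of Python. This is a minimal illustration, not the method claimed in the application: models are plain lists of floats, the names `local_difference` and `apply_differences` are hypothetical, and averaging the uploaded differences is an assumed update rule.

```python
from typing import List

Vector = List[float]


def local_difference(first_model: Vector, second_model: Vector) -> Vector:
    """Difference information: the trained model minus the model it started from."""
    return [s - f for f, s in zip(first_model, second_model)]


def apply_differences(first_model: Vector, differences: List[Vector]) -> Vector:
    """Update the central model with the mean of the uploaded differences.

    Plain averaging is an assumption made for this sketch.
    """
    n = len(differences)
    mean_diff = [sum(column) / n for column in zip(*differences)]
    return [f + d for f, d in zip(first_model, mean_diff)]


# One round with two clients.
first_model = [0.0, 0.0]
second_models = [[1.0, 2.0], [3.0, 0.0]]  # results of local training
diffs = [local_difference(first_model, m) for m in second_models]
third_model = apply_differences(first_model, diffs)
```

Only the differences travel to the service device; the clients' training data never leaves the client.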
- After the service device receives the difference information uploaded by the clients, it issues training information for the next round of training, such as the number of training batches or the learning rate. However, this training information is set by the service device in advance: for example, it is fixed, or it is increased or decreased according to a preset rule. Each time a client updates its training model, the best time for the client to upload the model after training is related to the new model and to the previous training result. If the training information does not change accordingly, or the service device only adjusts it incrementally or decrementally, the client may miss the best upload opportunity, which affects the accuracy of the trained model.
- The embodiments of the present application provide a method and device for implementing model update, in which first training information is obtained according to the difference information of at least two clients, so that the accuracy of the trained model can be improved.
- the first aspect of the embodiments of the present application provides a method for implementing model update.
- When a client completes training, it sends difference information to the service device; that is, the service device receives at least two pieces of difference information sent by at least two clients. The difference information is the difference, relative to the first model, of the second model that the client obtains by training based on the first model, and the first model is sent by the service device to the client.
- The service device calculates first difference consistency information according to the difference information uploaded by the at least two clients. The first difference consistency information is used to indicate the degree of consistency of the difference information uploaded by the at least two clients.
- After receiving the difference information uploaded by the at least two clients, the service device updates the first model according to that difference information to obtain the third model.
- The service device calculates the first training information according to the first difference consistency information. The first training information is used by the client to train the third model.
- After obtaining the first training information, the service device sends the first training information and the third model to the at least two clients.
- In the embodiments of the present application, the service device calculates the first training information according to the difference information uploaded by at least two clients and sends it to the clients. Each client trains the third model according to the first training information, which can improve the accuracy of the trained model.
- The service device may specifically calculate the first difference consistency information according to the following formula:
- $GCR_t$ represents the first difference consistency information;
- $g_i$ represents the difference information uploaded by the $i$-th client among the at least two clients;
- $i$ is a positive integer greater than or equal to 1.
- The service device calculates the first difference consistency information according to a specific formula, which improves the feasibility of the solution.
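The formula itself is not reproduced in this text. As a hedged illustration only, one plausible consistency measure is the ratio of the norm of the summed differences to the sum of their norms, which is 1 when all clients move in the same direction and approaches 0 when their differences cancel out. The function name `difference_consistency` and the choice of this ratio are assumptions of this sketch, not the formula of the application.

```python
import math
from typing import List


def difference_consistency(differences: List[List[float]]) -> float:
    """Assumed measure: ||sum_i g_i|| / sum_i ||g_i||, a value in [0, 1]."""
    summed = [sum(column) for column in zip(*differences)]
    norm_of_sum = math.sqrt(sum(x * x for x in summed))
    sum_of_norms = sum(math.sqrt(sum(x * x for x in g)) for g in differences)
    if sum_of_norms == 0.0:
        # No client changed the model, so there is nothing to compare.
        return 0.0
    return norm_of_sum / sum_of_norms


aligned = difference_consistency([[1.0, 0.0], [1.0, 0.0]])
opposed = difference_consistency([[1.0, 0.0], [-1.0, 0.0]])
```

With two clients whose differences point the same way the measure is 1; with exactly opposite differences it is 0.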
- The first training information includes at least one of a first training batch number or a first learning rate. The first training batch number is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate when training the model.
- The specific content of the first training information is defined, which improves the feasibility of the solution.
- If the first training information includes the first training batch number, the first training batch number calculated by the service device according to the first difference consistency information needs to meet the following conditions:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $E_{t-1}$ represents the second training batch number, that is, the number of training batches sent by the service device to the client before the service device obtains the first training batch number;
- $E_t$ represents the first training batch number.
- If the first training information includes the first learning rate, the first learning rate calculated by the service device according to the first difference consistency information needs to meet the following conditions:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $lr_{t-1}$ represents the second learning rate, that is, the learning rate sent by the service device to the client before the service device obtains the first learning rate;
- $lr_t$ represents the first learning rate.
- The specific conditions under which the service device calculates the first learning rate are defined, which improves the feasibility of the solution.
- The service device may calculate the first training information according to the first difference consistency information and first performance determination information, where the first performance determination information is used to indicate the weight of each performance determination factor when the service device calculates the first training information.
- The first training information is calculated according to the first performance determination information, which improves the feasibility of the solution.
- The first performance determination information includes at least one of an accuracy rate weight and a communication cost weight. The accuracy rate weight is used to indicate the weight of the model accuracy rate when the service device calculates the first training information, and the communication cost weight is used to indicate the weight of the communication resources when the service device calculates the first training information.
- The accuracy rate weight and the communication cost weight are defined and can be set according to actual application conditions, which improves the operability of the solution.
- The service device can calculate the first training batch number according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $E_{t-1}$ represents the second training batch number, that is, the number of training batches that the service device sent to the client last time;
- $E_t$ represents the first training batch number;
- $\alpha$ represents the accuracy rate weight;
- $\beta$ represents the communication cost weight;
- $a$, $b$, $c$ and $d$ are positive numbers not exceeding 6;
- if $E_t$ calculated by the formula is not a positive integer, its value is rounded to a positive integer.
- The first training batch number is calculated according to a specific formula, which improves the feasibility of the solution.
- The service device can calculate the first learning rate according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $lr_{t-1}$ represents the second learning rate;
- $lr_t$ represents the first learning rate;
- $\alpha$ represents the accuracy rate weight;
- $a_1$, $b_1$ and $b_2$ are positive numbers not exceeding 6.
- The first learning rate is calculated according to a specific formula, which improves the feasibility of the solution.
- After the service device calculates the first difference consistency information, it may correct it by the exponential moving average method to obtain corrected first difference consistency information.
- The first difference consistency information is corrected by the exponential moving average method, and the first training information is calculated from the corrected first difference consistency information, thereby improving the accuracy of model training.
- The service device may correct the first difference consistency information according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $GCR'$ represents the corrected first difference consistency information;
- the exponential moving average parameter takes a value in the range 0.9 to 0.99.
- The first difference consistency information is corrected by a specific formula, which improves the feasibility of the solution.
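The correction formula is not reproduced in this text. The sketch below assumes the standard exponential moving average form, in which the parameter weights the previous value and its complement weights the current value; the function name `smooth_consistency` and the parameter name `decay` are hypothetical, and the default of 0.95 is simply a value inside the stated range of 0.9 to 0.99.

```python
def smooth_consistency(previous: float, current: float, decay: float = 0.95) -> float:
    """Assumed EMA form: corrected = decay * previous + (1 - decay) * current."""
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie between 0 and 1")
    return decay * previous + (1.0 - decay) * current


# A sudden jump from 0.5 to 1.0 is damped to a small step.
corrected = smooth_consistency(previous=0.5, current=1.0, decay=0.9)
```

A parameter close to 1 makes the corrected value change slowly, so a single noisy round has little effect on the training information derived from it.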
- the second aspect of the embodiments of the present application provides a method for implementing model update.
- The service device receives the difference information uploaded by at least two clients. The difference information uploaded by any one of the at least two clients is the difference, relative to the first model, of the second model that the client obtains by training based on the first model, and the first model is received by the client from the service device.
- The service device updates the first model, that is, the central model of the service device, according to the difference information uploaded by the at least two clients, to obtain the third model.
- The service device obtains the difference information between the first model and the third model (hereinafter referred to as the difference information of the service device) according to the first model and the third model.
- The service device calculates target difference consistency information according to the difference information of the service device and the difference information uploaded by a first client of the at least two clients. The target difference consistency information is used to indicate the degree of consistency between the difference information of the service device and the difference information of the first client.
- The service device calculates the target training information according to the target difference consistency information. The target training information is used by the client to train the third model.
- The service device sends the target training information and the third model to the first client.
- The training information of each individual client is calculated separately, which improves the accuracy of the trained model.
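The per-client comparison in this aspect can be illustrated as follows. The text does not give the measure, so cosine similarity between the service device's difference and one client's difference is used here purely as an assumed stand-in; `cosine_consistency` is a hypothetical name.

```python
import math
from typing import List


def cosine_consistency(server_diff: List[float], client_diff: List[float]) -> float:
    """Assumed measure: cosine of the angle between the two difference vectors."""
    dot = sum(s * c for s, c in zip(server_diff, client_diff))
    server_norm = math.sqrt(sum(s * s for s in server_diff))
    client_norm = math.sqrt(sum(c * c for c in client_diff))
    if server_norm == 0.0 or client_norm == 0.0:
        return 0.0
    return dot / (server_norm * client_norm)


first_model = [0.0, 0.0]
third_model = [2.0, 1.0]
# Difference information of the service device: third model minus first model.
server_diff = [t - f for f, t in zip(first_model, third_model)]

same_direction = cosine_consistency(server_diff, [4.0, 2.0])
orthogonal = cosine_consistency([1.0, 0.0], [0.0, 3.0])
```

Because the measure is computed against one client at a time, each client can receive training information tailored to how closely its own update matches the central update.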
- The first training information includes at least one of a first training batch number or a first learning rate. The first training batch number is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate when training the model.
- The specific content of the first training information is defined, which improves the feasibility of the solution.
- If the first training information includes the first training batch number, the first training batch number calculated by the service device according to the first difference consistency information needs to meet the following conditions:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $E_{t-1}$ represents the second training batch number, that is, the number of training batches sent by the service device to the client before the service device obtains the first training batch number;
- $E_t$ represents the first training batch number.
- If the first training information includes the first learning rate, the first learning rate calculated by the service device according to the first difference consistency information needs to meet the following conditions:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $lr_{t-1}$ represents the second learning rate, that is, the learning rate sent by the service device to the client before the service device obtains the first learning rate;
- $lr_t$ represents the first learning rate.
- The specific conditions under which the service device calculates the first learning rate are defined, which improves the feasibility of the solution.
- The service device may calculate the first training information according to the first difference consistency information and first performance determination information, where the first performance determination information is used to indicate the weight of each performance determination factor when the service device calculates the first training information.
- The first training information is calculated according to the first performance determination information, which improves the feasibility of the solution.
- The first performance determination information includes at least one of an accuracy rate weight and a communication cost weight. The accuracy rate weight is used to indicate the weight of the model accuracy rate when the service device calculates the first training information, and the communication cost weight is used to indicate the weight of the communication resources when the service device calculates the first training information.
- The accuracy rate weight and the communication cost weight are defined and can be set according to actual application conditions, which improves the operability of the solution.
- The service device can calculate the first training batch number according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $E_{t-1}$ represents the second training batch number, that is, the number of training batches that the service device sent to the client last time;
- $E_t$ represents the first training batch number;
- $\alpha$ represents the accuracy rate weight;
- $\beta$ represents the communication cost weight;
- $a$, $b$, $c$ and $d$ are positive numbers not exceeding 6;
- if $E_t$ calculated by the formula is not a positive integer, its value is rounded to a positive integer.
- The first training batch number is calculated according to a specific formula, which improves the feasibility of the solution.
- The service device can calculate the first learning rate according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $lr_{t-1}$ represents the second learning rate;
- $lr_t$ represents the first learning rate;
- $\alpha$ represents the accuracy rate weight;
- $a_1$, $b_1$ and $b_2$ are positive numbers not exceeding 6.
- The first learning rate is calculated according to a specific formula, which improves the feasibility of the solution.
- After the service device calculates the first difference consistency information, it may correct it by the exponential moving average method to obtain corrected first difference consistency information.
- The first difference consistency information is corrected by the exponential moving average method, and the first training information is calculated from the corrected first difference consistency information, thereby improving the accuracy of model training.
- The service device may correct the first difference consistency information according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $GCR'$ represents the corrected first difference consistency information;
- the exponential moving average parameter takes a value in the range 0.9 to 0.99.
- The first difference consistency information is corrected by a specific formula, which improves the feasibility of the solution.
- the third aspect of the embodiments of the present application provides a method for implementing model update.
- the client receives the first model sent by the service device, and the client trains based on the first model to obtain the second model.
- the client sends the difference information of the second model relative to the first model to the service device.
- the client receives the third model and the first training information sent by the service device.
- the third model is obtained by the service device updating the first model according to the difference information sent by the client and the difference information sent by other clients.
- The first training information is obtained by the service device according to the difference information sent by the client and the difference information sent by other clients.
- The client trains the third model according to the first training information.
- The client receives the first training information sent by the service device. Because the first training information is calculated from the clients' difference information, training the third model according to it improves the accuracy of the trained model.
- The first training information includes at least one of a first training batch number or a first learning rate. The first training batch number is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate for training.
- The first training information may specifically include the number of training batches and the learning rate, which improves the feasibility of the solution.
- When the client is a site analysis device, the site analysis device trains the third model according to the first training information and a first training sample to obtain a fourth model.
- the first training sample includes the feature data of the network device of the site network corresponding to the site analysis device.
- the third model can be trained through the feature data obtained by the network device of the site network, which improves the feasibility of the solution.
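How a client might consume the first training information can be sketched as below. This is an illustration under stated assumptions: the model is a single scalar weight fitted by gradient descent on a squared error, each training batch is treated as one pass over all local samples, and `TrainingInfo` and `train` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrainingInfo:
    """First training information as received from the service device."""
    batches: int          # first training batch number
    learning_rate: float  # first learning rate


def train(weight: float, samples: List[Tuple[float, float]], info: TrainingInfo) -> float:
    """Fit y = weight * x by gradient descent on the mean squared error."""
    for _ in range(info.batches):
        gradient = sum(2.0 * (weight * x - y) * x for x, y in samples) / len(samples)
        weight -= info.learning_rate * gradient
    return weight


third_model = 0.0                    # model received from the service device
samples = [(1.0, 2.0), (2.0, 4.0)]   # local feature data, here y = 2x
fourth_model = train(third_model, samples, TrainingInfo(batches=50, learning_rate=0.1))
```

The client does not choose the number of batches or the learning rate itself; both come from the service device and change from round to round.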
- A fourth aspect of the embodiments of the present application provides a service device.
- the receiving unit is configured to receive difference information uploaded by at least two clients.
- the difference information uploaded by a client is the difference, relative to the first model, of the second model obtained by the client through training based on the first model.
- the first model is received by the client from the service device, and the second model is obtained by the client through training based on the first model;
- the calculation unit is configured to calculate according to the difference information uploaded by at least two clients to obtain the first difference consistency information, where the first difference consistency information is used to indicate the degree of consistency of the difference information uploaded by the at least two clients;
- the update unit is configured to update the first model according to the difference information uploaded by at least two clients to obtain the third model;
- the calculation unit is further configured to calculate according to the first difference agreement information to obtain the first training information, and the first training information is used to train the third model;
- the sending unit is configured to send the first training information and the third model to at least two clients.
- The calculation unit is specifically configured to calculate the first difference consistency information according to the following formula:
- $GCR_t$ represents the first difference consistency information;
- $g_i$ represents the difference information uploaded by the $i$-th client among the at least two clients;
- $i$ is a positive integer greater than or equal to 1.
- The first training information includes at least one of a first training batch number or a first learning rate. The first training batch number is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate for training.
- The calculation unit is specifically configured to calculate, according to the first difference consistency information, a first training batch number that meets the following conditions:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $E_{t-1}$ represents the second training batch number, that is, the number of training batches sent by the service device to the client before the service device obtains the first training batch number;
- $E_t$ represents the first training batch number.
- The calculation unit is specifically configured to calculate, according to the first difference consistency information, a first learning rate that meets the following conditions:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $lr_{t-1}$ represents the second learning rate, that is, the learning rate sent by the service device to the client before the service device obtains the first learning rate;
- $lr_t$ represents the first learning rate.
- The calculation unit is specifically configured to calculate the first training information according to the first difference consistency information and first performance determination information, where the first performance determination information is used to indicate the weight of each performance determination factor when the service device calculates the first training information.
- The first performance determination information includes at least one of an accuracy rate weight and a communication cost weight.
- The accuracy rate weight is used to indicate the weight of the model accuracy rate when the service device calculates the first training information.
- The communication cost weight is used to indicate the weight of the communication resources when the service device calculates the first training information.
- The calculation unit specifically calculates the first training batch number according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $E_{t-1}$ represents the second training batch number, that is, the number of training batches that the service device sent to the client last time;
- $E_t$ represents the first training batch number;
- $\alpha$ represents the accuracy rate weight;
- $\beta$ represents the communication cost weight;
- $a$, $b$, $c$ and $d$ are positive numbers not exceeding 6;
- if $E_t$ calculated by the formula is not a positive integer, its value is rounded to a positive integer.
- The calculation unit specifically calculates the first learning rate according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $lr_{t-1}$ represents the second learning rate;
- $lr_t$ represents the first learning rate;
- $\alpha$ represents the accuracy rate weight;
- $\beta$ represents the communication cost weight;
- $a_1$, $b_1$ and $b_2$ are positive numbers not exceeding 6.
- The service device further includes:
- a correction unit, used to correct the first difference consistency information by the exponential moving average method to obtain corrected first difference consistency information.
- The calculation unit is specifically configured to correct the first difference consistency information by using the exponential moving average parameter according to the following formula:
- $GCR_{t-1}$ represents the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information;
- $GCR_t$ represents the first difference consistency information;
- $GCR'$ represents the corrected first difference consistency information;
- the exponential moving average parameter takes a value in the range 0.9 to 0.99.
- the value of the exponential moving average parameter may be 0.95.
- a fifth aspect of the embodiments of the present application provides a service device.
- the receiving unit is configured to receive difference information uploaded by at least two clients.
- the difference information uploaded by a client is the difference, relative to the first model, of the second model obtained by the client through training based on the first model.
- the first model is received by the client from the service device;
- the update unit updates the first model according to the difference information uploaded by at least two clients to obtain the third model;
- the processing unit is used to obtain the difference information between the first model and the third model according to the first model and the third model (hereinafter referred to as the difference information of the service device);
- the calculation unit is configured to calculate according to the difference information of the service device and the difference information of the first client of the at least two clients to obtain target difference consistency information, where the target difference consistency information is used to indicate the degree of consistency between the difference information of the service device and the difference information of the client;
- the calculation unit is also used to calculate according to the target difference consistent information to obtain target training information, and the target training information is used to train the third model;
- the sending unit is configured to send target training information and the third model to the first client.
- a sixth aspect of the embodiments of the present application provides a client.
- the training unit is used for training based on the first model to obtain the second model, and the first model is received from the service device;
- a sending unit configured to send difference information of the second model relative to the first model to the service device
- the receiving unit is configured to receive the third model and the first training information sent by the service device.
- the third model is obtained by the service device updating the first model according to the difference information sent by the client and the difference information sent by other clients.
- the first training information is obtained by the service device according to the difference information sent by the client and the difference information sent by other clients;
- the training unit is also used to train the third model according to the first training information.
- the first training information includes at least one of the first training batch number or the first learning rate, the first training batch number is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate.
- the client is a site analysis device
- the training unit is further configured to train a third model according to the first training information and the first training sample to obtain a fourth model, and the first training sample includes the feature data of the network device of the site network corresponding to the site analysis device.
- a seventh aspect of the embodiments of the present application provides a terminal.
- the processor is connected to the memory and the input and output equipment;
- the processor executes the methods described in the implementation manners of the first aspect and the third aspect of the present application.
- An eighth aspect of the embodiments of the present application provides a service device.
- the processor is connected to the memory and the input and output equipment;
- the processor executes the method described in the implementation manner of the second aspect of the present application.
- a ninth aspect of the embodiments of the present application provides a computer storage medium in which instructions are stored.
- when the instructions are executed on a computer, the computer executes the method described in the implementation manners of the first to third aspects of the present application.
- the tenth aspect of the embodiments of the present application provides a computer program product.
- when the computer program product is executed on a computer, the computer executes the method described in the implementation manners of the first to third aspects of the present application.
- the service device receives the difference information uploaded by at least two clients, obtains the first training information according to the difference information of the at least two clients, and sends the first training information to the client.
- because the service device calculates the first training information from the difference information uploaded by the at least two clients, and the difference information is related to the result of the clients' previous training, the client can improve the accuracy of the model when it trains using the first training information.
- Figure 1 is a framework diagram of joint training provided by an embodiment of this application.
- FIG. 2 is a schematic flowchart of a model data processing method provided by an embodiment of the application
- FIG. 3 is a schematic diagram of another flow of a model data processing method provided by an embodiment of the application.
- FIG. 4 is a schematic diagram of another flow chart of a model data processing method provided by an embodiment of the application.
- FIG. 5 is a schematic structural diagram of a terminal provided by an embodiment of the application.
- FIG. 6 is a schematic diagram of another structure of a terminal provided by an embodiment of this application.
- FIG. 7 is a schematic structural diagram of a service device provided by an embodiment of the application.
- FIG. 8 is a schematic diagram of another structure of a terminal provided by an embodiment of the application.
- FIG. 9 is a schematic diagram of another structure of a terminal provided by an embodiment of the application.
- FIG. 10 is a schematic diagram of another structure of a terminal provided by an embodiment of this application.
- FIG. 11 is a schematic diagram of another structure of a service device provided by an embodiment of the application.
- the embodiments of the application provide a data processing method in which, when updating the trained model, the service device obtains first training information according to the difference information uploaded by the clients and sends it to the client. When the client trains using the first training information, the accuracy of the model can be improved.
- Federated Learning is an emerging basic artificial intelligence technology. It was originally used to solve the problem of Android mobile phone users updating models locally. Its design goal is to carry out high-efficiency machine learning among multiple parties or multiple computing nodes while protecting the privacy of terminal data and personal data and ensuring legal compliance. At present, the purpose of joint learning has been expanded to jointly building AI models without data sharing in order to improve the effect of the AI models.
- machine learning algorithms have been widely used in many fields. From the perspective of learning methods, machine learning algorithms can be divided into supervised learning algorithms, unsupervised learning algorithms, semi-supervised learning algorithms, and reinforcement learning algorithms.
- Supervised learning algorithm refers to the ability to learn an algorithm or establish a model based on training data, and use this algorithm or model to infer new instances.
- Training data also called training samples, is composed of input data and expected output.
- the expected output of the model of a machine learning algorithm (also called a machine learning model) is called a label, which can be a predicted classification result (called a classification label).
- the difference between the unsupervised learning algorithm and the supervised learning algorithm is that the training samples of the unsupervised learning algorithm do not have a given label, and the machine learning algorithm model obtains certain results by analyzing the training samples.
- in a semi-supervised learning algorithm, part of the training samples are labeled and the other part is unlabeled, and the unlabeled data far outnumbers the labeled data.
- Reinforcement learning algorithms try to maximize the expected benefits through continuous attempts in the environment, and through the rewards or punishments given by the environment, generate the choices that can obtain the greatest benefits.
- Figure 1 is a schematic diagram of the framework of a method for implementing model update provided by this application.
- the framework of the model update method includes multiple devices, including service devices and clients.
- the number of service devices and clients in Figure 1 is only an example and does not limit the application scenarios of the model update method provided in this application.
- the service device may refer to any device that supports multi-model aggregation or combination, or cloud platform, server, public cloud, etc., which is not specifically limited here.
- the client can be any device that supports local training, such as a mobile phone, tablet, computer, switch, optical line terminal (OLT), optical network terminal (ONT), or router, which is not specifically limited here.
- the service device and the client can be connected through a wired network or a wireless network. If they are connected through a wired network, the connection is generally an optical fiber network; it is understandable that they can also be connected through other wired networks, which is not specifically limited here. If they are connected through a wireless network, the connection can be Bluetooth or Wi-Fi; it is understandable that they can also be connected through other wireless networks, which is not specifically limited here.
- the service device delivers the model to the client. After the client receives the model, it trains the model locally. After the training is completed, the client sends either the trained model or the difference information to the service device. After the service device receives the model or difference information sent by the clients, it updates the central model and sends the updated central model to the clients for the next round of training.
- the learning rate can also be referred to as the step size, that is, the learning rate in the embodiment of the present application can also be replaced with the step size, which is not specifically limited here.
- the application scenarios are not specifically limited, as long as the specific application scenarios can apply the framework.
- the model update method can be applied to a variety of application scenarios. Therefore, the embodiment of the present application schematically enumerates implementation manners of several specific scenarios, which are described separately below.
- FIG. 2 is a schematic flowchart of an embodiment of the method for implementing model update provided by this application.
- step 201 the service device receives difference information uploaded by at least two clients.
- when a client completes training, it sends difference information to the service device; that is, the service device receives at least two pieces of difference information sent by at least two clients. The difference information is the difference information between the first model and the second model, where the first model is sent by the service device to the client and the second model is obtained by the client through training based on the first model.
- the first model is incrementally trained to obtain the second model.
- the difference information may be gradient information or other types of information, as long as it can indicate the difference information between the model before and after the training of the client, which is not specifically limited here.
- the difference information may specifically be a matrix composed of difference values of multiple parameters.
- the first model includes 4 parameters, and the matrix composed of the values of these 4 parameters is [a1, b1, c1, d1]; the second model also includes these 4 parameters, and the matrix composed of their values is [a2, b2, c2, d2]; then the matrix composed of the differences of these 4 parameters is [a2-a1, b2-b1, c2-c1, d2-d1], that is, the difference information is [a2-a1, b2-b1, c2-c1, d2-d1].
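- as a minimal sketch of the example above, assuming each model is held as a flat list of parameter values, the difference information could be computed as follows; the function name `difference_information` and the numeric values are hypothetical and not part of the application:

```python
from typing import List


def difference_information(first_model: List[float], second_model: List[float]) -> List[float]:
    """Return the element-wise difference of the second model relative to the first model."""
    if len(first_model) != len(second_model):
        raise ValueError("both models must contain the same parameters")
    return [after - before for before, after in zip(first_model, second_model)]


# Illustrative values only: [a1, b1, c1, d1] and [a2, b2, c2, d2].
first_model = [1.0, 2.0, 3.0, 4.0]
second_model = [1.5, 1.0, 3.0, 6.0]

# [a2-a1, b2-b1, c2-c1, d2-d1]
diff = difference_information(first_model, second_model)
```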
- step 202 the service device calculates the first difference agreement information according to the at least two pieces of difference information.
- when the service device receives the difference information uploaded by at least two clients, the service device calculates the first difference agreement information from the at least two pieces of difference information, and the first difference agreement information indicates the degree of agreement between the difference information of the at least two clients.
- the first difference agreement information is gradient agreement information.
- the first difference agreement information can be calculated by the following formula to obtain the gradient agreement information:
- GCR_t represents the first difference agreement information
- g_i represents the difference information from the i-th client.
- the formula is:
- step 203 the service device uses the exponential moving average method to correct the first difference agreement information, and obtains the corrected first difference agreement information.
- the service device may correct the first difference agreement information by using an exponential moving average method to obtain the corrected first difference agreement information.
- the correction process can be understood as the service device applying exponential moving average (EMA) smoothing to the first difference agreement information; that is, the service device smooths the first difference agreement information with EMA to obtain the corrected first difference agreement information.
- the corrected first difference agreement information can be obtained through the following formula:
- GCR_{t-1} represents the second difference agreement information
- the second difference agreement information is the difference agreement information calculated by the service device before the service device obtains the first difference agreement information
- GCR' represents the corrected first difference agreement information
- the value range of the exponential moving average coefficient can be 0.9 to 0.99, and further, the value can be 0.95, which is not specifically limited here.
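- the smoothing described in step 203 could be sketched as below. The formula itself is not reproduced in this text, so the standard exponential moving average form is assumed here, with the coefficient (named `beta` for illustration only) weighting the previous value GCR_{t-1} and (1 - beta) weighting the current value GCR_t; the function name and sample values are likewise hypothetical:

```python
def ema_correct(gcr_prev: float, gcr_curr: float, beta: float = 0.95) -> float:
    """Smooth the current agreement value with the previous one.

    Assumed form (not confirmed by the source text):
        corrected = beta * gcr_prev + (1 - beta) * gcr_curr
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * gcr_prev + (1.0 - beta) * gcr_curr


# Illustrative values: previous round 0.4, current round 0.8.
corrected = ema_correct(gcr_prev=0.4, gcr_curr=0.8, beta=0.95)
```

With a coefficient in the stated range of 0.9 to 0.99, the corrected value stays close to the previous round's value, so a single noisy round changes the training information only slightly.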
- step 204 the service device obtains the first training information according to the corrected first difference agreement information and the first performance determination information.
- after the service device obtains the corrected first difference agreement information, the service device calculates the first training information according to the corrected first difference agreement information and the first performance determination information generated by the service device.
- the first performance determination information is preset by the service device, and the first performance determination information is used to indicate the weight of the performance determination factor when calculating the next training information.
- the first performance determination information may include an accuracy rate weight and a communication cost weight.
- α and β respectively represent the accuracy rate weight and the communication cost weight. It is understood that in actual application, other symbols may also be used to indicate the accuracy rate weight and the communication cost weight, which are not specifically limited here.
- Acc represents the accuracy of the model
- Acc_max represents the maximum value of the model accuracy
- Cost represents the value of the communication cost actually spent in the joint learning process, that is, the value of the communication resources consumed
- Cost_max represents the maximum value of the communication cost
- P represents the performance of the joint learning process, that is, the larger the value of P, the better the performance of the joint learning.
- α + β = 1, that is, when α is larger and β is smaller, the training of the model is more inclined to consider the improvement of model accuracy.
- this formula can be used to evaluate the effect of joint learning.
- the service device can calculate the first training batch information according to the following formula:
- GCR_{t-1} represents the second difference agreement information
- the second difference agreement information is the difference agreement information calculated by the service device before the service device obtains the first difference agreement information, that is, the difference agreement information calculated by the service device based on the difference information uploaded by at least two clients last time
- GCR_t represents the first difference agreement information
- E_{t-1} represents the second training batch number, that is, the number of training batches sent by the service device to the client before the first training batch information
- E_t represents the first training batch number information, that is, the number of training batches this time
- α represents the accuracy rate weight
- β represents the communication cost weight, and a, b, c, and d are positive integers not exceeding 6. For example, the value of a is 3, the value of b is 4, the value of c is 2, and the value of d is 4.
- the first training batch information can also be obtained through other formulas, as long as the first training batch information meets the following conditions:
- GCR t GCR t-1
- E t E t-1
- E t is a positive integer
- the service device can calculate the first learning rate according to the following formula:
- GCR_{t-1} represents the second difference agreement information
- the second difference agreement information is the difference agreement information calculated by the service device before the service device obtains the first difference agreement information, that is, the difference agreement information calculated from the difference information that the service device received from at least two clients last time
- GCR_t represents the first difference agreement information
- lr_{t-1} represents the second learning rate
- lr_t represents the first learning rate, that is, the learning rate that the service device sends to the client this time
- α represents the second performance determination information
- β represents the third performance determination information
- the first learning rate can also be obtained by other formulas, as long as the calculation result of the formula meets the following conditions:
- step 205 the service device updates the central model.
- the service device updates the central model of the service device according to the difference information received from the client to obtain the updated model.
- the central model is the first model, and the updated central model is the third model.
- the service device can count the number L of clients that send difference information to the service device, where L is less than or equal to N. When the ratio of L to N is greater than a threshold K, where K is greater than 0 and less than or equal to 1 and can specifically be a value greater than 0.5, such as 0.8, the service device updates the first model with the multiple pieces of difference information received to obtain the third model, and distributes the third model to the N clients respectively.
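- the participation check described above could be sketched as follows; the function and parameter names are hypothetical, and the default threshold of 0.8 is simply the example value given in the text:

```python
def should_update(num_reporting: int, num_clients: int, threshold: float = 0.8) -> bool:
    """Return True when the share of reporting clients (L / N) exceeds the threshold K."""
    if num_clients <= 0 or not 0 <= num_reporting <= num_clients:
        raise ValueError("require 0 <= L <= N and N > 0")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("K must be greater than 0 and less than or equal to 1")
    return num_reporting / num_clients > threshold


# 9 of 10 clients reported: 0.9 > 0.8, so the central model is updated.
update_now = should_update(num_reporting=9, num_clients=10)
```

Note that the comparison is strict, following the wording "greater than the threshold K", so exactly 8 of 10 clients would not trigger an update with K = 0.8.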
- to update the central model according to the difference information and obtain the updated model, the service device can use the following method:
- obtain the mean value of the difference information uploaded by at least two clients, and update the first model with the mean value to obtain the third model. For example, update the first model based on the difference information uploaded by client 1 and client 2, where the difference information uploaded by client 1 and client 2 is respectively [a2-a1, b2-b1, c2-c1, d2-d1] and [a3-a1, b3-b1, c3-c1, d3-d1]; the mean value of the difference information uploaded by these two clients is [(a2-a1+a3-a1)/2, (b2-b1+b3-b1)/2, (c2-c1+c3-c1)/2, (d2-d1+d3-d1)/2], and the first model is updated with this mean value.
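- a minimal sketch of this mean-based update follows. It assumes that "updating the first model with the mean value" means adding the mean difference to the parameters of the first model, which the text does not state explicitly; the function names and numeric values are hypothetical:

```python
from typing import List, Sequence


def mean_difference(differences: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of the difference information uploaded by the clients."""
    if len(differences) < 2:
        raise ValueError("difference information from at least two clients is required")
    length = len(differences[0])
    if any(len(d) != length for d in differences):
        raise ValueError("all difference vectors must have the same length")
    return [sum(values) / len(differences) for values in zip(*differences)]


def update_central_model(first_model: Sequence[float],
                         differences: Sequence[Sequence[float]]) -> List[float]:
    """Assumed update rule: third model = first model + mean difference."""
    mean = mean_difference(differences)
    return [param + delta for param, delta in zip(first_model, mean)]


first_model = [1.0, 2.0, 3.0, 4.0]        # [a1, b1, c1, d1], illustrative values
client_1_diff = [0.5, -1.0, 0.0, 2.0]     # difference information from client 1
client_2_diff = [1.5, 1.0, -2.0, 0.0]     # difference information from client 2

mean = mean_difference([client_1_diff, client_2_diff])
third_model = update_central_model(first_model, [client_1_diff, client_2_diff])
```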
- step 206 the service device sends the first training information to the client.
- after the service device updates the central model, the service device delivers the updated model to the client and sends the first training information to the client.
- after the client receives the updated model and the first training information, the client updates its local model according to the updated model issued by the service device, updates its local training batch number and learning rate, and trains again; that is, it performs further model training on the updated model issued by the service device according to the received training information (such as the number of training batches and the learning rate).
- when the training is over, the client sends difference information to the service device again, and steps 201 to 206 are repeated until the central model converges and the joint learning training ends.
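- the client side of this loop could be sketched as follows. This is only an illustration under stated assumptions: the training batch number is interpreted as the number of gradient steps, the local model is a small linear model trained with plain gradient descent on squared error, and the names `TrainingInfo` and `train_locally` as well as all numeric values are hypothetical:

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Batch = Tuple[Sequence[float], float]  # (input features, expected output)


@dataclass
class TrainingInfo:
    num_batches: int       # training batch number received from the service device
    learning_rate: float   # learning rate received from the service device


def train_locally(weights: Sequence[float], batches: Sequence[Batch],
                  info: TrainingInfo) -> List[float]:
    """Run info.num_batches gradient steps on a linear model with squared error."""
    if not batches or info.num_batches <= 0:
        raise ValueError("need at least one batch and a positive batch number")
    w = list(weights)
    for step in range(info.num_batches):
        x, y = batches[step % len(batches)]
        error = sum(wi * xi for wi, xi in zip(w, x)) - y
        # Gradient of (w.x - y)^2 with respect to w is 2 * error * x.
        w = [wi - info.learning_rate * 2.0 * error * xi for wi, xi in zip(w, x)]
    return w


third_model = [0.0, 0.0]  # updated model issued by the service device
batches = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
info = TrainingInfo(num_batches=4, learning_rate=0.25)

fourth_model = train_locally(third_model, batches, info)
# Difference information to upload in the next round.
difference = [after - before for before, after in zip(third_model, fourth_model)]
```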
- the first performance determination information may not be set.
- the first performance determination information is not substituted into the formula as a parameter.
- step 203 is an optional step.
- the service device can directly calculate the first training information according to the first difference consistency information calculated in step 202.
- there is no fixed order of execution between step 205 and steps 202 to 204.
- the service device obtains the first training information from the difference information uploaded by at least two clients, and sends the first training information to the client.
- because the service device calculates the first training information required for the client's next round of model training according to the difference information uploaded by the clients, and the difference information is related to the results of the clients' previous training, the client can improve the accuracy of the model when it uses the first training information for training.
- FIG. 3 is a schematic flowchart of another embodiment of the model update method provided by this application.
- step 301 the service device receives difference information uploaded by at least two clients.
- step 302 the service device updates the central model according to the difference information uploaded by at least two clients.
- Step 301 and step 302 in this embodiment are similar to step 201 and step 205 in the embodiment shown in FIG. 2 and will not be repeated here.
- step 303 the service device calculates the target difference agreement information based on the difference information between the central model before the update and the updated central model and the difference information uploaded by the client.
- when the service device has received the difference information uploaded by at least two clients and needs to calculate the target training information of one of the clients, the service device first updates the central model according to the difference information uploaded by the at least two clients, and then obtains the difference information between the central model before the update and the updated central model. The service device then calculates the target difference agreement information from the difference information uploaded by that client and the difference information between the central model before the update and the updated central model.
- the target difference agreement information indicates the degree of consistency between the difference information uploaded by the client and the difference information of the central model before and after the update of the service device.
- the central model is the first model, and the updated central model is the third model.
- the target difference agreement information is gradient agreement information.
- the target difference agreement information can be calculated by the following formula to obtain the gradient agreement information:
- GCR_t represents the target difference agreement information
- g_3 represents the difference information from the client
- g_4 represents the difference information between the central model before the update of the service device and the central model after the update.
- step 304 the service device uses the exponential moving average method to modify the target difference agreement information to obtain the corrected target difference agreement information.
- step 305 the service device obtains target training information according to the corrected target difference agreement information and the first performance determination information.
- step 306 the service device sends target training information and the updated central model to at least two clients.
- Step 304 to step 306 in this embodiment are similar to step 203, step 204, and step 206 in the embodiment shown in FIG. 2 and will not be repeated here.
- the service device may separately determine target training information for each of the at least two clients. For any one of the clients, the target difference agreement information obtained in step 303 is the target difference agreement information corresponding to that client, which is obtained according to the difference information of that client; when the target difference agreement information is corrected in step 304, the difference agreement information corresponding to that client calculated in the previous round can be used as the second difference agreement information of step 203; in step 305, the target training information is obtained in the same way as the first training information in step 204; accordingly, in step 306, the target training information corresponding to the client is sent to that client.
- after the client receives the updated model and the target training information, the client updates its local model according to the updated model issued by the service device, updates its local training batch number and learning rate, and then trains again. When the training is over, the client sends difference information to the service device again, and steps 301 to 306 are repeated until the central model converges and the joint learning training ends.
- the first performance determination information may not be set, and when the target training information is obtained by calculation, the first performance determination information is not substituted into the formula as a parameter.
- step 304 is an optional step.
- the service device can directly calculate the target training information according to the target difference agreement information obtained in step 303.
- the service device calculates the target training information of a single client through the difference information of a single client, and sends the target training information to the client.
- when the client uses this target training information to train the model, the accuracy of the model trained by that individual client is improved compared with target training information calculated uniformly for all clients.
- the service device may perform the above steps to calculate the target training information of the client and send it to the client.
- FIG. 4 is a schematic flowchart of another embodiment of the model update method provided by this application.
- the following description takes as an example the case where the service device is a cloud device and the client is a site analysis device.
- the model training system involved in the model data processing method includes multiple site analysis devices, that is, multiple site networks.
- the site network can be a core network or an edge network.
- The users of each site network can be operators or enterprise customers.
- Different site networks can be different networks divided according to corresponding dimensions, for example, they can be networks of different regions, networks of different operators, networks of different services, and different network domains.
- Multiple site analysis devices can correspond to multiple site networks one-to-one.
- Each site analysis device is used to provide data analysis services for the corresponding site network, and each site analysis device can be located in the corresponding site network or outside the corresponding site network.
- step 401 the cloud device sends the first model to the site analysis device.
- the cloud device obtains the first model and sends the first model to the site analysis device.
- the cloud device may also send target training information to the site analysis device, where the target training information is used to indicate the parameter information of the site analysis device when training the first model.
- step 402 the network device sends the first characteristic data to the site analysis device.
- the network device sends first characteristic data to the site analysis device, where the first characteristic data refers to data generated by the network device.
- for example, when the network device is a camera, the characteristic data of the camera may be image data collected and generated by the camera.
- when the network device is a sound recorder, the characteristic data of the sound recorder may be sound data collected and generated by the sound recorder.
- when the network device is a switch, the characteristic data of the switch can be key performance indicator (KPI) data, and the KPI data can be statistical information generated when the switch forwards traffic, such as the number of outgoing message bytes, the number of outgoing messages, the queue depth, throughput information, and the number of lost packets.
- step 403 the site analysis device uses the first characteristic data and the target training information to train the first model to obtain the second model, and obtains the difference information between the first model and the second model.
- the site analysis device uses the first characteristic data and the target training information to train the first model to obtain the second model, and then compares the first model with the second model to obtain the difference information between the first model and the second model.
- the difference information may specifically be a matrix composed of difference values of multiple parameters.
- the first model includes 4 parameters, and the matrix composed of the values of these 4 parameters is [a1, b1, c1, d1]; the second model also includes these 4 parameters, and the matrix composed of their values is [a2, b2, c2, d2]; then the matrix composed of the differences of these 4 parameters is [a2-a1, b2-b1, c2-c1, d2-d1], that is, the difference information is [a2-a1, b2-b1, c2-c1, d2-d1].
- step 404 the site analysis device sends the difference information to the cloud device.
- the site analysis device After the site analysis device obtains the difference information, the site analysis device sends the difference information to the cloud device.
- step 405 the cloud device calculates the first difference agreement information according to at least two pieces of difference information.
- when the cloud device receives at least two pieces of difference information uploaded by at least two site analysis devices, the cloud device calculates the first difference agreement information from the difference information, and the first difference agreement information indicates the degree of agreement between the difference information of the at least two site analysis devices.
- the first difference agreement information is gradient agreement information.
- the first difference agreement information can be calculated by the following formula to obtain the gradient agreement information:
- GCR_t represents the first difference agreement information
- g_i represents the difference information from the i-th site analysis device.
- the formula is:
- Step 406: the cloud device uses the exponential moving average method to correct the first difference agreement information and obtain the corrected first difference agreement information.
- the cloud device may correct the first difference agreement information by using an exponential moving average method to obtain the corrected first difference agreement information.
- the correction process can be understood as the cloud device applying exponential moving average (EMA) smoothing to the first difference agreement information; the smoothed value is the corrected first difference agreement information.
- the corrected first difference agreement information can be obtained through a formula in which:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the cloud device before it obtains the first difference agreement information;
- GCR' represents the corrected first difference agreement information;
- the coefficient of the formula can take a value from 0.9 to 0.99, and further, can be 0.95, which is not specifically limited here.
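As a rough illustration of the correction step, the sketch below applies the textbook form of exponential moving average smoothing. It assumes that the coefficient in the 0.9 to 0.99 range weights the earlier value GCR_{t-1} and that the remainder weights the new value GCR_t; the text here does not confirm that assignment, and the names `ema_correct` and `coeff` are hypothetical.

```python
def ema_correct(gcr_prev, gcr_now, coeff=0.95):
    """Smooth the newest agreement value against the previous one (textbook EMA)."""
    if not 0.0 < coeff < 1.0:
        raise ValueError("coeff must lie strictly between 0 and 1")
    # Assumption: coeff weights the previous value, (1 - coeff) the new one.
    return coeff * gcr_prev + (1.0 - coeff) * gcr_now


corrected = ema_correct(gcr_prev=0.8, gcr_now=0.4)
print(corrected)  # close to 0.78: the new value moves the estimate only slightly
```

With a coefficient this close to 1, a single noisy round changes the corrected value only a little, which is the usual reason for smoothing a per-round statistic.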
- Step 407: the cloud device calculates the first training information according to the corrected first difference agreement information and the first performance determination information.
- After the cloud device obtains the corrected first difference agreement information, it calculates the first training information according to the corrected first difference agreement information and the first performance determination information generated by the cloud device.
- the first performance determination information is preset by the cloud device, and the first performance determination information is used to indicate the weight of each parameter when calculating the next training information.
- the first performance determination information may include an accuracy rate weight and a communication cost weight.
- α and β respectively represent the accuracy rate weight and the communication cost weight. It can be understood that in actual applications other symbols may also be used to indicate the accuracy rate weight and the communication cost weight, which is not specifically limited here.
- Acc represents the accuracy of the model;
- Acc_max represents the maximum value of the model accuracy;
- Cost represents the communication cost actually spent in the joint learning process, that is, the amount of communication resources consumed;
- Cost_max represents the maximum value of the communication cost, that is, the communication cost of joint learning when the site analysis device uploads to the cloud device after every training of the model;
- P represents the performance of the joint learning process; the larger the value of P, the better the performance of joint learning.
- α + β = 1; when α is larger and β is smaller, model training is more inclined toward improving model accuracy.
- this formula can also be used to evaluate the effect of joint learning.
- the cloud device can calculate the first training batch information according to a formula in which:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the cloud device before it obtains the first difference agreement information, that is, calculated from the difference information that at least two site analysis devices uploaded last time;
- GCR_t represents the first difference agreement information;
- E_{t-1} represents the second training batch information, that is, the training batch information sent by the cloud device to the site analysis device before the first training batch information;
- E_t represents the first training batch information, that is, the training batch information for this round;
- α represents the accuracy rate weight;
- β represents the communication cost weight;
- a, b, c, and d are positive integers not exceeding 6; for example, a is 3, b is 4, c is 2, and d is 4.
- the first training batch information can also be obtained through other formulas, as long as the first training batch information meets the following conditions:
- GCR t GCR t-1
- E t E t-1
- E t is a positive integer
- the service device can calculate the first learning rate according to a formula in which:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the cloud device before it obtains the first difference agreement information, that is, calculated from the difference information that the service device received from at least two clients last time;
- GCR_t represents the first difference agreement information;
- lr_{t-1} represents the second learning rate;
- lr_t represents the first learning rate, that is, the learning rate that the cloud device is about to deliver to the site analysis device for the next round of training;
- ⁇ represents the second performance determination information
- ⁇ represents the third performance determination information
- the first learning rate can also be obtained by other formulas, as long as the calculation result of the formula meets the following conditions:
- Step 408: the cloud device updates the central model.
- the cloud device updates the central model of the cloud device according to the difference information received from the site analysis device to obtain the updated model.
- the cloud device can count the number L of site analysis devices that have sent difference information to it, where L is less than or equal to N. When the ratio of L to N is greater than a threshold K, the cloud device updates the first model with the received pieces of difference information to obtain the third model, and delivers the third model to each of the N site analysis devices. K is greater than 0 and less than or equal to 1, and can be a value greater than 0.5, such as 0.8.
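The participation check can be sketched as a single comparison. The function and argument names below are hypothetical; the default threshold of 0.8 is the example value given for K.

```python
def enough_reports(num_reporting, num_sites, threshold=0.8):
    """Return True when the share of reporting devices (L / N) exceeds the threshold K."""
    if num_sites <= 0:
        raise ValueError("num_sites (N) must be positive")
    if not 0 <= num_reporting <= num_sites:
        raise ValueError("num_reporting (L) must lie between 0 and N")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold (K) must be greater than 0 and at most 1")
    return num_reporting / num_sites > threshold


print(enough_reports(9, 10))  # True: 0.9 is greater than 0.8
print(enough_reports(8, 10))  # False: the ratio must be strictly greater than K
```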
- to update the central model according to the difference information and obtain the updated model, the cloud device can use the following method:
- obtain the average value of the difference information uploaded by at least two site analysis devices, and update the first model with the average value to obtain the third model. For example, the first model is updated based on the difference information uploaded by site analysis device 1 and site analysis device 2, which is [a2-a1, b2-b1, c2-c1, d2-d1] and [a3-a1, b3-b1, c3-c1, d3-d1] respectively; the average value of the difference information uploaded by the two site analysis devices is [(a2-a1+a3-a1)/2, (b2-b1+b3-b1)/2, (c2-c1+c3-c1)/2, (d2-d1+d3-d1)/2], and this average value is used to update the first model.
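The averaging step can be sketched as follows. The text says the first model is updated "with the average value" without spelling out the operation; this sketch assumes the average difference is added to the first model's parameters. The function names and sample values are hypothetical.

```python
def mean_difference(diffs):
    """Element-wise average of the difference information from several devices."""
    if len(diffs) < 2:
        raise ValueError("at least two pieces of difference information are needed")
    return [sum(column) / len(diffs) for column in zip(*diffs)]


def apply_difference(model, diff):
    """Assumed update rule: add the (averaged) difference to each parameter."""
    return [p + d for p, d in zip(model, diff)]


first_model = [1.0, 2.0, 3.0, 4.0]   # [a1, b1, c1, d1]
site1 = [0.5, -1.0, 0.0, 2.0]        # difference uploaded by site analysis device 1
site2 = [1.5, 1.0, 1.0, 0.0]         # difference uploaded by site analysis device 2

mean = mean_difference([site1, site2])
third_model = apply_difference(first_model, mean)
print(mean)         # [1.0, 0.0, 0.5, 1.0]
print(third_model)  # [2.0, 2.0, 3.5, 5.0]
```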
- Step 409: the cloud device sends the first training information to the site analysis device.
- After the cloud device updates the central model, it delivers the updated model to the site analysis device and sends the first training information to the site analysis device.
- After the site analysis device receives the updated model and the first training information, it updates its local model according to the updated model delivered by the cloud device, and updates the local number of training batches and the learning rate before training. When training is over, the site analysis device sends difference information to the cloud device again, and steps 406 to 410 are repeated until the central model converges and model training ends.
- the first performance determination information may not be set, and when the first training information is calculated, the first performance determination information is not substituted into the formula as a parameter.
- step 406 is an optional step.
- the service device can directly calculate the first training information according to the first difference agreement information calculated in step 405.
- the site analysis device obtains the first feature data sent by the network device, and uses the first feature data to train the model, which improves the feasibility of the solution.
- the complete model data of the updated central model may be sent, and the complete model data includes the model structure data (such as the functional form of the model) and model parameter data (including the values of the model parameters) of the updated central model.
- the client can directly load the complete model data to obtain the updated central model, and replace the central model in the client (that is, the central model received in the previous round).
- the model parameter data can be used to replace the model parameter data of the central model in the client to obtain the updated central model.
- the difference information between the model parameter data of the center model after the update and the model parameter data of the center model before the update may be sent.
- the client uses the difference information to modify the model parameter data of the central model in the client to obtain the updated central model. For example, if the value of a certain parameter of the central model in the client is a, and the difference information corresponding to that parameter is +b, the client updates the value of that parameter to a+b.
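Two of the delivery options above, replacing the parameter data and applying difference information, can be sketched on the client side as follows. This is only an illustration: it assumes parameters are kept in a dictionary keyed by name, and the function name `apply_delivery` and the `kind` labels are hypothetical.

```python
def apply_delivery(local_params, kind, payload):
    """Return the client's updated parameter values.

    kind == "params": payload holds full parameter values and replaces the local ones.
    kind == "diff":   payload holds per-parameter differences added to the local ones.
    """
    if kind == "params":
        return dict(payload)
    if kind == "diff":
        # Parameters without an entry in the payload are left unchanged.
        return {name: value + payload.get(name, 0.0)
                for name, value in local_params.items()}
    raise ValueError(f"unknown delivery kind: {kind!r}")


local = {"w": 2.0, "bias": 1.0}
print(apply_delivery(local, "diff", {"w": 0.5}))                 # {'w': 2.5, 'bias': 1.0}
print(apply_delivery(local, "params", {"w": 3.0, "bias": 0.0}))  # {'w': 3.0, 'bias': 0.0}
```

Sending only differences keeps the message small when few parameters change, at the cost of requiring the client to still hold the previous central model.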
- FIG. 5 is a schematic structural diagram of an embodiment of the service device provided by this application.
- the receiving unit 501 is configured to receive difference information uploaded by at least two clients. The difference information uploaded by a client is the difference information of the second model relative to the first model, where the first model is received by the client from the service device, and the second model is obtained by the client through training based on the first model;
- the calculation unit 502 is configured to calculate according to the difference information uploaded by at least two clients to obtain first difference consistency information, where the first difference consistency information is used to indicate the degree of consistency of the difference information uploaded by the at least two clients;
- the updating unit 503 is configured to update the first model according to the difference information uploaded by at least two clients to obtain the third model;
- the calculation unit 502 is further configured to calculate according to the first difference agreement information to obtain first training information, and the first training information is used to train the third model;
- the sending unit 504 is configured to send the first training information and the third model to at least two clients.
- the operations performed by each unit of the service device are similar to those described for the service device in the method embodiments shown in FIG. 2 or FIG. 4, and are not repeated here.
- FIG. 6 is a schematic structural diagram of an embodiment of the service device provided by this application.
- the receiving unit 601 is configured to receive difference information uploaded by at least two clients. The difference information uploaded by a client is the difference information of the second model relative to the first model, where the first model is received by the client from the service device, and the second model is obtained by the client through training based on the first model;
- the calculating unit 602 is configured to calculate according to the difference information uploaded by at least two clients to obtain first difference consistency information, where the first difference consistency information is used to indicate the degree of consistency of the difference information uploaded by the at least two clients;
- the updating unit 605 is configured to update the first model according to the difference information uploaded by at least two clients to obtain the third model;
- the calculation unit 602 is further configured to calculate according to the first difference agreement information to obtain first training information, and the first training information is used to train the third model;
- the sending unit 604 is configured to send the first training information and the third model to at least two clients.
- the calculation unit 602 is specifically configured to calculate the first difference agreement information according to the following formula:
- GCR t represents the first difference consistent information
- g i represents the difference information uploaded by the i-th client among at least two clients
- i is a positive integer greater than or equal to 1.
- the first training information includes at least one of the first training batch number or the first learning rate; the first training batch number is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate.
- the calculation unit 602 is specifically configured to calculate, according to the first difference agreement information, first training information that meets the following conditions:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the service device before it obtains the first difference agreement information;
- GCR_t represents the first difference agreement information;
- E_{t-1} represents the second training batch number, which is the number of training batches sent by the service device to the client before the service device obtains the first training batch number;
- E_t represents the first training batch number.
- the calculation unit 602 is specifically configured to calculate, according to the first difference agreement information, first training information that meets the following conditions:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the service device before it obtains the first difference agreement information;
- GCR_t represents the first difference agreement information;
- lr_{t-1} represents the second learning rate, which is the learning rate sent by the service device to the client before the service device obtains the first learning rate;
- lr_t represents the first learning rate.
- the calculation unit 602 is specifically configured to calculate the first training information according to the first difference agreement information and the first performance determination information, where the first performance determination information indicates the weight of each performance determination factor when the service device calculates the first training information.
- the first performance determination information includes at least one of an accuracy rate weight and a communication cost weight.
- the accuracy rate weight indicates the weight of the model accuracy rate when the service device calculates the first training information, and the communication cost weight indicates the weight of the communication resources when the service device calculates the first training information.
- the calculation unit 602 specifically calculates the first training batch number according to a formula in which:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the service device before it obtains the first difference agreement information;
- GCR_t represents the first difference agreement information;
- E_{t-1} represents the second training batch number, which is the number of training batches sent by the service device to the client last time;
- E_t represents the first training batch information;
- α represents the accuracy rate weight;
- β represents the communication cost weight;
- a, b, c and d are positive numbers not exceeding 6;
- if E_t calculated by the formula is not a positive integer, the value of E_t is made a positive integer by rounding.
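The rounding rule for E_t can be sketched as below. The text only says "by rounding"; this sketch assumes rounding to the nearest integer with a floor of 1, so that the result is always a positive integer. The function name `to_batch_count` is hypothetical. Note that Python's built-in `round` rounds exact halves to the nearest even integer.

```python
def to_batch_count(raw_value):
    """Turn a raw formula result into a positive integer number of training batches."""
    # Assumption: round to nearest, then enforce a minimum of one batch.
    return max(1, round(raw_value))


print(to_batch_count(3.6))  # 4
print(to_batch_count(0.2))  # 1 (a value that rounds to 0 is raised to 1)
print(to_batch_count(5))    # 5
```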
- the calculation unit 602 specifically calculates the first learning rate according to a formula in which:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the service device before it obtains the first difference agreement information;
- GCR_t represents the first difference agreement information;
- lr_{t-1} represents the second learning rate;
- lr_t represents the first learning rate;
- α represents the accuracy rate weight;
- β represents the communication cost weight;
- a_1, b_1 and b_2 are positive numbers not exceeding 6.
- the service device further includes:
- the correction unit 603, configured to correct the first difference agreement information by using the exponential moving average method to obtain the corrected first difference agreement information.
- the corrected first difference agreement information may be obtained by a formula in which:
- GCR_{t-1} represents the second difference agreement information, which is the difference agreement information calculated by the service device before it obtains the first difference agreement information;
- GCR_t represents the first difference agreement information;
- GCR' represents the corrected first difference agreement information;
- the coefficient of the formula takes a value from 0.9 to 0.99;
- further, the coefficient may be 0.95.
- the operations performed by each unit of the service device are similar to those described for the service device in the method embodiments shown in FIG. 2 or FIG. 4, and are not repeated here.
- FIG. 7 is a schematic structural diagram of an embodiment of the service device provided by this application.
- the receiving unit 701 is configured to receive difference information uploaded by at least two clients, where the difference information uploaded by a client is the difference information of the second model, obtained by training based on the first model, relative to the first model, and the first model is received by the client from the service device;
- the updating unit 702 is configured to update the first model according to the difference information uploaded by the at least two clients to obtain the third model;
- the processing unit 703 is configured to obtain difference information between the first model and the third model according to the first model and the third model (hereinafter referred to as the difference information of the service device);
- the calculation unit 704 is configured to calculate, according to the difference information of the service device and the difference information of a first client of the at least two clients, target difference agreement information, which indicates the degree of agreement between the difference information of the service device and the difference information of the first client;
- the calculation unit 704 is further configured to calculate according to the target difference consistent information to obtain target training information, and the target training information is used to train the third model;
- the sending unit 705 is configured to send target training information and the third model to the first client.
- the operations performed by each unit of the service device are similar to those described for the service device in the method embodiment shown in FIG. 3, and are not repeated here.
- FIG. 8 is a schematic structural diagram of another embodiment of a terminal provided by this application.
- the training unit 801 is used for training based on the first model to obtain a second model, and the first model is received from the service device;
- the sending unit 802 is configured to send difference information of the second model relative to the first model to the service device;
- the receiving unit 803 is configured to receive the third model and the first training information sent by the service device, where the third model is obtained by the service device updating the first model according to the difference information sent by the client and the difference information sent by other clients, and the first training information is obtained by the service device according to the difference information sent by the client and the difference information sent by other clients;
- the training unit 801 is also used to train the third model according to the first training information.
- the operations performed by each unit of the client are similar to those described for the client in the method embodiments shown in FIG. 2 or FIG. 4, and are not repeated here.
- FIG. 9 is a schematic structural diagram of another embodiment of a terminal provided by this application.
- the training unit 901 is used for training based on the first model to obtain a second model, and the first model is received from the service device;
- the sending unit 902 is configured to send difference information of the second model relative to the first model to the service device;
- the receiving unit 903 is configured to receive the third model and the first training information sent by the service device, where the third model is obtained by the service device updating the first model according to the difference information sent by the client and the difference information sent by other clients, and the first training information is obtained by the service device according to the difference information sent by the client and the difference information sent by other clients;
- the training unit 901 is also used to train the third model according to the first training information.
- the first training information includes at least one of the first training batch number or the first learning rate; the first training batch number is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate.
- the client is a site analysis device
- the training unit 901 is further configured to train a third model according to the first training information and the first training sample to obtain a fourth model.
- the first training sample includes the feature data of the network device of the site network corresponding to the site analysis device.
- the operations performed by each unit of the client are similar to those described for the client in the method embodiments shown in FIG. 2 or FIG. 4, and are not repeated here.
- FIG. 10 is a schematic structural diagram of another embodiment of a terminal provided by this application.
- the terminal includes a processor 1001, a memory 1002, a bus 1005, and an interface 1004.
- the processor 1001 is connected to the memory 1002 and an interface 1004.
- the bus 1005 is respectively connected to the processor 1001, the memory 1002, and the interface 1004.
- the interface 1004 is used for receiving or sending
- the processor 1001 is a single-core or multi-core central processing unit, or a specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
- the memory 1002 may be a random access memory (Random Access Memory, RAM) or a non-volatile memory (non-volatile memory), such as at least one hard disk memory.
- the memory 1002 is used to store computer execution instructions. Specifically, the program 1003 may be included in the computer-executable instructions.
- the processor 1001 may perform operations performed by the terminal in the embodiment shown in FIG. 2 or FIG. 4, and details are not described herein again.
- FIG. 11 is a schematic structural diagram of another embodiment of a terminal provided by this application.
- the terminal includes a processor 1101, a memory 1102, a bus 1105, and an interface 1104.
- the processor 1101 is connected to the memory 1102 and an interface 1104.
- the bus 1105 is connected to the processor 1101, the memory 1102, and the interface 1104, respectively, and the interface 1104 is used for receiving or sending
- the processor 1101 is a single-core or multi-core central processing unit, or a specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
- the memory 1102 may be a random access memory (Random Access Memory, RAM), or a non-volatile memory (non-volatile memory), such as at least one hard disk memory.
- the memory 1102 is used to store computer execution instructions. Specifically, the program 1103 may be included in the computer execution instruction.
- the processor 1101 can perform operations performed by the terminal in the foregoing embodiment shown in FIG. 3, and details are not described herein again.
- the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a computer, the method flow related to the service device in any of the foregoing method embodiments is implemented.
- the processor mentioned in the service device in the above embodiments of this application may be a central processing unit (CPU) or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).
- the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
- the number of processors in the service device in the above embodiments of the present application may be one or multiple, and may be adjusted according to actual application scenarios. This is only an exemplary description and is not limited.
- the number of memories in the embodiment of the present application may be one or multiple, and may be adjusted according to actual application scenarios. This is only an exemplary description and is not limited.
- the memory or readable storage medium mentioned in the service device in the above embodiments in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
- the volatile memory may be random access memory (RAM), which is used as an external cache. Forms of RAM include static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
- where the service device includes a processor (or processing unit) and a memory, the processor in this application can be integrated with the memory, or the processor and the memory can be connected through an interface; this can be adjusted according to the actual application scenario and is not limited.
- the embodiment of the present application also provides a computer program or a computer program product including a computer program.
- When the computer program is executed on a computer, the computer is enabled to implement the method flow related to the service device in any of the foregoing method embodiments.
- the computer program product includes one or more computer instructions.
- the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.) means.
- the computer-readable storage medium may be any available medium that can be stored by a computer or a data storage device such as a server or a data center integrated with one or more available media.
- the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
- the disclosed system, device, and method can be implemented in other ways.
- the device embodiments described above are merely illustrative.
- the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of the present application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which can be a personal computer, a server, or another service device, etc.) execute all or part of the steps of the methods described in the various embodiments in Figures 2 to 6 of this application.
- the storage medium includes: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk or optical disk and other media that can store program code.
- the word “if” as used herein can be interpreted as “when”, “in response to determination”, or “in response to detection”.
- the phrase “if determined” or “if detected (statement or event)” can be interpreted as “when determined”, “in response to determination”, “when detected (statement or event)”, or “in response to detection (statement or event)”.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
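A method for implementing model update, used in a joint update framework. The method includes: a service device receives difference information uploaded by at least two clients and calculates first difference consistency information from it, where the first difference consistency information indicates the degree of consistency of the difference information uploaded by the at least two clients; the service device calculates first training information from the first difference consistency information, where the first training information is used to train a third model, and the third model is obtained by the service device updating a first model according to the difference information uploaded by the at least two clients; and the service device sends the first training information to the at least two clients. The method can improve model accuracy.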
Description
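This application claims priority to Chinese Patent Application No. 202010077143.8, filed with the Chinese Patent Office on January 23, 2020 and entitled “Method and device for implementing model update”, which is incorporated herein by reference in its entirety.
The embodiments of this application relate to the computer field, and specifically to a method and a device for implementing model update.
Federated learning (FL) is an emerging basic artificial intelligence technology. It was originally used to solve the problem of end users updating a model locally. Its design goal is to carry out efficient machine learning among multiple participants or multiple computing nodes while protecting the privacy of terminal data and personal data. The purpose of federated learning has since been extended to jointly building an artificial intelligence (AI) model without sharing data, so as to improve the effect of the AI model.
In federated learning, each client trains locally on its local data and uploads the difference information of the trained local model to the service device. The service device then performs a joint update using the difference information uploaded by the clients, generates a new central model, and delivers the new model to the clients. Each client trains locally again on the new model and, after training for several batches, uploads the result to the service device. After several such rounds the central model converges on the service device, which marks the end of this federated learning process.
After the service device receives the difference information uploaded by the clients, it delivers training information for the next round of training, such as the number of training batches or the learning rate. This training information, however, is set in advance by the service device: it is either fixed or follows an increasing or decreasing schedule. The best moment for a client to upload its model after training on the new model depends on the result of the previous round of training. If the training information does not change accordingly, or the service device merely increases or decreases it, the client may miss the best upload moment, which affects the accuracy of the trained model.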
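Summary
The embodiments of this application provide a method and an apparatus for implementing model update, in which first training information is obtained from the difference information of at least two clients; when a client trains with the first training information, the accuracy of the trained model can be improved.
A first aspect of the embodiments of this application provides a method for implementing model update.
During joint training, a client sends difference information to the service device when it finishes training, so the service device receives at least two pieces of difference information from at least two clients. Each piece of difference information is the difference of a second model, obtained by the client through training based on a first model, relative to the first model; the first model is sent to the client by the service device.
The service device performs a calculation on the difference information uploaded by the at least two clients to obtain first difference consistency information, which indicates the degree of consistency of the difference information uploaded by the at least two clients.
After receiving the difference information uploaded by the at least two clients, the service device updates the first model according to that difference information to obtain a third model.
The service device calculates first training information from the first difference consistency information; the first training information is used by the clients to train the third model.
After obtaining the first training information, the service device sends the first training information and the third model to the at least two clients.
In this embodiment of this application, the service device calculates the first training information from the difference information uploaded by the at least two clients and delivers it to the clients. Because the first training information is calculated from the previous round's difference information, a client that trains the third model with it can improve the accuracy of the trained model.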
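Based on the first aspect of the embodiments of this application, in a first implementation of the first aspect, the service device may specifically calculate the first difference consistency information according to the following formula:
[formula not reproduced in this text]
where $GCR_t$ denotes the first difference consistency information, $g_i$ denotes the difference information uploaded by the $i$-th client of the at least two clients, and $i$ is an integer greater than or equal to 1.
In this embodiment of this application, the service device calculates the first difference consistency information according to a specific formula, which improves the implementability of the solution.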
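Based on the first aspect or the first implementation of the first aspect, in a second implementation of the first aspect, the first training information includes at least one of a first number of training batches or a first learning rate. The first number of training batches is used by a client to determine the number of batches to train, and the first learning rate is used by the client to determine the learning rate when training the model.
This embodiment of this application specifies what the first training information contains, which improves the implementability of the solution.
Based on the first aspect through the second implementation of the first aspect, in a third implementation of the first aspect, if the first training information includes the first number of training batches, the first training information calculated by the service device from the first difference consistency information needs to satisfy the following conditions:
if $GCR_t < GCR_{t-1}$, then $E_t < E_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $E_t = E_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $E_t > E_{t-1}$;
where $GCR_{t-1}$ denotes second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes a second number of training batches, that is, the number of training batches the service device delivered to the clients before it obtains the first number of training batches; and $E_t$ denotes the first number of training batches.
This embodiment of this application specifies the conditions under which the service device calculates the first training information, which improves the implementability of the solution.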
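Based on the first aspect through the third implementation of the first aspect, in a fourth implementation of the first aspect, if the first training information includes the first learning rate, the first learning rate calculated by the service device from the first difference consistency information needs to satisfy the following conditions:
if $GCR_t < GCR_{t-1}$, then $lr_t < lr_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $lr_t = lr_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $lr_t > lr_{t-1}$;
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes a second learning rate, that is, the learning rate the service device delivered to the clients before it obtains the first learning rate; and $lr_t$ denotes the first learning rate.
This embodiment of this application specifies the conditions under which the service device calculates the first learning rate, which improves the implementability of the solution.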
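Based on the first aspect through the fourth implementation of the first aspect, in a fifth implementation of the first aspect, the service device may calculate the first training information from the first difference consistency information and first performance determination information, where the first performance determination information indicates the weights of the performance determination factors used when the service device calculates the first training information.
In this embodiment of this application, the first training information is calculated using the first performance determination information, which improves the implementability of the solution.
Based on the first aspect through the fifth implementation of the first aspect, in a sixth implementation of the first aspect, the first performance determination information includes at least one of an accuracy weight and a communication cost weight. When the service device calculates the first training information, the accuracy weight indicates the weight given to model accuracy, and the communication cost weight indicates the weight given to the use of communication resources.
This embodiment of this application specifies the accuracy weight and the communication cost weight. When the first training information is calculated, the two weights can be set according to the actual application, which improves the operability of the solution.
Based on the first aspect through the sixth implementation of the first aspect, in a seventh implementation of the first aspect, if the first training information includes the first number of training batches and the first performance determination information includes the accuracy weight and the communication cost weight, the service device may calculate the first number of training batches according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes the second number of training batches, that is, the number of training batches the service device last sent to the clients; $E_t$ denotes the first number of training batches; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a$, $b$, $c$, and $d$ are positive numbers not exceeding 6.
If the value of $E_t$ calculated by the formula is not a positive integer, it is rounded so that $E_t$ takes a positive integer value.
In this embodiment of this application, the first number of training batches is calculated according to a specific formula, which improves the implementability of the solution.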
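Based on the first aspect through the seventh implementation of the first aspect, in an eighth implementation of the first aspect, if the first training information includes the first learning rate, the service device may calculate the first learning rate according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes the second learning rate, that is, the learning rate the service device delivered to the clients before it obtains the first learning rate; $lr_t$ denotes the first learning rate; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a_1$, $b_1$, and $b_2$ are positive numbers not exceeding 6.
In this embodiment of this application, the first learning rate is calculated according to a specific formula, which improves the implementability of the solution.
Based on the first aspect through the eighth implementation of the first aspect, in a ninth implementation of the first aspect, after the service device calculates the first difference consistency information, it may correct the first difference consistency information by the exponential moving average method to obtain corrected first difference consistency information.
In this embodiment of this application, the first difference consistency information is corrected by the exponential moving average method and the first training information is calculated from the corrected value, which improves the accuracy of model training.
Based on the first aspect through the ninth implementation of the first aspect, in a tenth implementation of the first aspect, the service device may correct the first difference consistency information according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; and $GCR'$ denotes the corrected first difference consistency information.
In this embodiment of this application, the first difference consistency information is corrected according to a specific formula, which improves the implementability of the solution.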
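A second aspect of the embodiments of this application provides a method for implementing model update.
During joint training, the service device receives difference information uploaded by at least two clients. The difference information uploaded by any one of the at least two clients is the difference of a second model, obtained through training based on a first model, relative to the first model; the first model is received by that client from the service device.
The service device updates the first model, that is, the central model of the service device, according to the difference information uploaded by the at least two clients to obtain a third model.
From the first model and the third model, the service device obtains the difference information between the first model and the third model (hereinafter referred to as the difference information of the service device).
The service device performs a calculation on the difference information of the service device and the difference information uploaded by a first client of the at least two clients to obtain target difference consistency information, which indicates the degree of consistency between the difference information of the service device and the difference information of the first client.
The service device calculates target training information from the target difference consistency information; the target training information is used by the client to train the third model.
The service device sends the target training information and the third model to the first client.
In this embodiment of this application, training information is calculated separately for an individual client, which improves the accuracy of model training.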
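Based on the second aspect of the embodiments of this application, in a first implementation of the second aspect, the first training information includes at least one of a first number of training batches or a first learning rate. The first number of training batches is used by a client to determine the number of batches to train, and the first learning rate is used by the client to determine the learning rate when training the model.
This embodiment of this application specifies what the first training information contains, which improves the implementability of the solution.
Based on the second aspect or the first implementation of the second aspect, in a second implementation of the second aspect, if the first training information includes the first number of training batches, the first training information calculated by the service device from the first difference consistency information needs to satisfy the following conditions:
if $GCR_t < GCR_{t-1}$, then $E_t < E_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $E_t = E_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $E_t > E_{t-1}$;
where $GCR_{t-1}$ denotes second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes a second number of training batches, that is, the number of training batches the service device delivered to the client before it obtains the first number of training batches; and $E_t$ denotes the first number of training batches.
This embodiment of this application specifies the conditions under which the service device calculates the first training information, which improves the implementability of the solution.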
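Based on the second aspect through the second implementation of the second aspect, in a third implementation of the second aspect, if the first training information includes the first learning rate, the first learning rate calculated by the service device from the first difference consistency information needs to satisfy the following conditions:
if $GCR_t < GCR_{t-1}$, then $lr_t < lr_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $lr_t = lr_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $lr_t > lr_{t-1}$;
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes a second learning rate, that is, the learning rate the service device delivered to the client before it obtains the first learning rate; and $lr_t$ denotes the first learning rate.
This embodiment of this application specifies the conditions under which the service device calculates the first learning rate, which improves the implementability of the solution.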
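Based on the second aspect through the third implementation of the second aspect, in a fourth implementation of the second aspect, the service device may calculate the first training information from the first difference consistency information and first performance determination information, where the first performance determination information indicates the weights of the performance determination factors used when the service device calculates the first training information.
In this embodiment of this application, the first training information is calculated using the first performance determination information, which improves the implementability of the solution.
Based on the second aspect through the fourth implementation of the second aspect, in a fifth implementation of the second aspect, the first performance determination information includes at least one of an accuracy weight and a communication cost weight. When the service device calculates the first training information, the accuracy weight indicates the weight given to model accuracy, and the communication cost weight indicates the weight given to the use of communication resources.
This embodiment of this application specifies the accuracy weight and the communication cost weight. When the first training information is calculated, the two weights can be set according to the actual application, which improves the operability of the solution.
Based on the second aspect through the fifth implementation of the second aspect, in a sixth implementation of the second aspect, if the first training information includes the first number of training batches and the first performance determination information includes the accuracy weight and the communication cost weight, the service device may calculate the first number of training batches according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes the second number of training batches, that is, the number of training batches the service device last sent to the client; $E_t$ denotes the first number of training batches; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a$, $b$, $c$, and $d$ are positive numbers not exceeding 6.
If the value of $E_t$ calculated by the formula is not a positive integer, it is rounded so that $E_t$ takes a positive integer value.
In this embodiment of this application, the first number of training batches is calculated according to a specific formula, which improves the implementability of the solution.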
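Based on the second aspect through the sixth implementation of the second aspect, in a seventh implementation of the second aspect, if the first training information includes the first learning rate, the service device may calculate the first learning rate according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes the second learning rate, that is, the learning rate the service device delivered to the client before it obtains the first learning rate; $lr_t$ denotes the first learning rate; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a_1$, $b_1$, and $b_2$ are positive numbers not exceeding 6.
In this embodiment of this application, the first learning rate is calculated according to a specific formula, which improves the implementability of the solution.
Based on the second aspect through the seventh implementation of the second aspect, in an eighth implementation of the second aspect, after the service device calculates the first difference consistency information, it may correct the first difference consistency information by the exponential moving average method to obtain corrected first difference consistency information.
In this embodiment of this application, the first difference consistency information is corrected by the exponential moving average method and the first training information is calculated from the corrected value, which improves the accuracy of model training.
Based on the second aspect through the eighth implementation of the second aspect, in a ninth implementation of the second aspect, the service device may correct the first difference consistency information according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; and $GCR'$ denotes the corrected first difference consistency information.
In this embodiment of this application, the first difference consistency information is corrected according to a specific formula, which improves the implementability of the solution.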
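A third aspect of the embodiments of this application provides a method for implementing model update.
During federated learning, a client receives a first model sent by the service device and trains based on the first model to obtain a second model.
The client sends the difference information of the second model relative to the first model to the service device.
The client receives a third model and first training information sent by the service device. The third model is obtained by the service device updating the first model according to the difference information sent by this client and the difference information sent by other clients; the first training information is obtained by the service device from the difference information sent by this client and the difference information sent by other clients.
The client trains the third model according to the first training information.
In this embodiment of this application, the client receives the first training information from the service device, and that information is calculated from the clients' difference information; training the third model according to it therefore improves the accuracy of the trained model.
Based on the third aspect of the embodiments of this application, in a first implementation of the third aspect, the first training information includes at least one of a first number of training batches or a first learning rate. The first number of training batches is used by the client to determine the number of batches to train, and the first learning rate is used by the client to determine the learning rate for training.
This embodiment of this application specifies that the first training information may include the number of training batches and the learning rate, which improves the implementability of the solution.
Based on the third aspect or the first implementation of the third aspect, in a second implementation of the third aspect, when the client is a site analysis device, the site analysis device trains the third model according to the first training information and a first training sample to obtain a fourth model, where the first training sample includes feature data of the network devices of the site network corresponding to the site analysis device.
In this embodiment of this application, when the client is a site analysis device, the third model can be trained with feature data obtained from the network devices of the site network, which improves the implementability of the solution.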
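A fourth aspect of the embodiments of this application provides a service device, including:
a receiving unit, configured to receive difference information uploaded by at least two clients, where the difference information uploaded by a client is the difference of a second model, obtained by the client through training based on a first model, relative to the first model; the first model is received by the client from the service device, and the second model is obtained by the client through training based on the first model;
a calculation unit, configured to perform a calculation on the difference information uploaded by the at least two clients to obtain first difference consistency information, which indicates the degree of consistency of the difference information uploaded by the at least two clients;
an update unit, configured to update the first model according to the difference information uploaded by the at least two clients to obtain a third model;
the calculation unit being further configured to calculate first training information from the first difference consistency information, where the first training information is used to train the third model; and
a sending unit, configured to send the first training information and the third model to the at least two clients.
Optionally, the calculation unit is specifically configured to calculate the first difference consistency information according to the following formula:
[formula not reproduced in this text]
where $GCR_t$ denotes the first difference consistency information, $g_i$ denotes the difference information uploaded by the $i$-th client of the at least two clients, and $i$ is an integer greater than or equal to 1.
Optionally, the first training information includes at least one of a first number of training batches or a first learning rate. The first number of training batches is used by a client to determine the number of batches to train, and the first learning rate is used by the client to determine the learning rate for training.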
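Optionally, if the first training information includes the first number of training batches, the first training information calculated by the calculation unit from the first difference consistency information satisfies the following conditions:
if $GCR_t < GCR_{t-1}$, then $E_t < E_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $E_t = E_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $E_t > E_{t-1}$;
where $GCR_{t-1}$ denotes second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes a second number of training batches, that is, the number of training batches the service device delivered to the clients before it obtains the first number of training batches; and $E_t$ denotes the first number of training batches.
Optionally, if the first training information includes the first learning rate, the first training information calculated by the calculation unit from the first difference consistency information satisfies the following conditions:
if $GCR_t < GCR_{t-1}$, then $lr_t < lr_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $lr_t = lr_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $lr_t > lr_{t-1}$;
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes a second learning rate, that is, the learning rate the service device delivered to the clients before it obtains the first learning rate; and $lr_t$ denotes the first learning rate.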
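Optionally, the calculation unit is specifically configured to calculate the first training information from the first difference consistency information and first performance determination information, where the first performance determination information indicates the weights of the performance determination factors used when the service device calculates the first training information.
Optionally, the first performance determination information includes at least one of an accuracy weight and a communication cost weight. When the service device calculates the first training information, the accuracy weight indicates the weight given to model accuracy, and the communication cost weight indicates the weight given to the use of communication resources.
Optionally, if the first training information includes the first number of training batches and the first performance determination information includes the accuracy weight and the communication cost weight, the calculation unit calculates the first number of training batches according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes the second number of training batches, that is, the number of training batches the service device last sent to the clients; $E_t$ denotes the first number of training batches; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a$, $b$, $c$, and $d$ are positive numbers not exceeding 6.
If the value of $E_t$ calculated by the formula is not a positive integer, it is rounded so that $E_t$ takes a positive integer value.
Optionally, if the first training information includes the first learning rate, the calculation unit calculates the first learning rate according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes the second learning rate, that is, the learning rate the service device delivered to the clients before it obtains the first learning rate; $lr_t$ denotes the first learning rate; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a_1$, $b_1$, and $b_2$ are positive numbers not exceeding 6.
Optionally, the service device further includes:
a correction unit, configured to correct the first difference consistency information by the exponential moving average method to obtain corrected first difference consistency information.
Optionally, the calculation unit is specifically configured to correct the first difference consistency information by exponential moving average according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; $GCR_t$ denotes the first difference consistency information; and $GCR'$ denotes the corrected first difference consistency information.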
本申请实施例第五方面提供了一种服务设备。
接收单元,用于接收至少两个客户端上传的差异信息,客户端上传的差异信息是基于第一模型训练得到的第二模型相对于第一模型的差异信息,第一模型由客户端接收自服务设备;
更新单元,根据至少两个客户端上传的差异信息更新第一模型,得到第三模型;
处理单元,用于根据第一模型和第三模型得到第一模型和第三模型的差异信息(后续称为服务设备的差异信息);
计算单元,用于根据服务设备的差异信息和所述至少两个客户端中的第一客户端的差异信息计算以得到目标差异一致信息,目标差异一致信息用于表示服务设备模型的差异信息和客户端的差异信息的一致程度;
计算单元还用于根据目标差异一致信息计算以得到目标训练信息,目标训练信息用于训练第三模型;
发送单元,用于向所述第一客户端发送目标训练信息和第三模型。
本申请实施例第六方面提供了一种客户端。
训练单元,用于基于第一模型进行训练得到第二模型,第一模型接收自服务设备;
发送单元,用于向服务设备发送第二模型相对于第一模型的差异信息;
接收单元,用于接收服务设备发送的第三模型和第一训练信息,第三模型为服务设备根据该客户端发送的差异信息和其它客户端发送的差异信息更新第一模型得到的,第一训练信息为服务设备根据该客户端发送的差异信息和其它客户端发送的差异信息得到的;
训练单元还用于根据第一训练信息训练第三模型。
可选地,第一训练信息包括第一训练批次数或第一学习率中的至少一个,第一训练批次数用于客户端确定训练的批次数,第一学习率用于客户端确定训练的学习率。
Optionally, the client is a site analysis device;
训练单元还用于根据第一训练信息和第一训练样本训练第三模型得到第四模型,第一训练样本包括局点分析设备所对应的局点网络的网络设备的特征数据。
本申请实施例第七方面提供了一种终端。
处理器、存储器、输入输出设备;
处理器与存储器、输入输出设备相连;
处理器执行如本申请第一方面和第三方面实施方式所述的方法。
本申请实施例第八方面提供了一种服务设备。
处理器、存储器、输入输出设备;
处理器与存储器、输入输出设备相连;
处理器执行如本申请第二方面实施方式所述的方法。
本申请实施例第九方面提供了一种计算机存储介质,所述计算机存储介质中存储有指令,所述指令在所述计算机上执行时,使得计算机执行如本申请第一方面至第三方面实施方式所述的方法。
本申请实施例第十方面提供了一种计算机程序产品,所述计算机程序产品在计算机上执行时,使得所述计算机执行如本申请第一方面至第三方面实施方式所述的方法。
从以上技术方案可以看出,本申请实施例具有以下优点:
服务设备通过至少两个客户端上传的差异信息,服务设备根据至少两个客户端的差异信息得到第一训练信息,并将第一训练信息下发给客户端,在这个过程中,服务设备根据至少两个客户端上传的差异信息计算相关的第一训练信息,因为差异信息和之前客户端训练的结果相关,因此客户端使用该第一训练信息训练时可以提升模型的准确率。
图1为本申请实施例提供的联合训练框架图;
图2为本申请实施例提供的模型数据处理方法一个流程示意图;
图3为本申请实施例提供的模型数据处理方法另一流程示意图;
图4为本申请实施例提供的模型数据处理方法另一流程示意图;
图5为本申请实施例提供的终端一个结构示意图;
图6为本申请实施例提供的终端另一结构示意图;
图7为本申请实施例提供的服务设备一个结构示意图;
图8为本申请实施例提供的终端另一结构示意图;
图9为本申请实施例提供的终端另一结构示意图;
图10为本申请实施例提供的终端另一结构示意图;
图11为本申请实施例提供的服务设备另一结构示意图。
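The embodiments of this application provide a data processing method in which, when a trained model is updated, first training information can be obtained from the difference information uploaded by the clients and delivered to a client; when the client trains with the first training information, the accuracy of the model can be improved.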
为了便于读者理解,本申请实施例对提供的模型更新方法所涉及的学习算法进行简单介绍。
联合学习(FL,Federated Learning)是一种新兴的人工智能基础技术,原本用于解决安卓手机终端用户在本地更新模型的问题,其设计目标是保护终端数据和个人数据隐私、保证合法合规的前提下,在多参与方或多计算结点之间开展高效率的机器学习。目前,联合学习的宗旨被拓展为在数据不共享的情况下共同建设AI模型,提升AI模型效果。
机器学习算法作为AI领域的一个重要分支,在众多领域得到了广泛的应用。从学习方法的角度,机器学习算法可以分为监督式学习算法、非监督式学习算法、半监督式学习算法、强化学习算法几大类。监督式学习算法,是指可以基于训练数据学习一个算法或建立一个模式,并以此算法或模式推测新的实例。训练数据,也称训练样本,是由输入数据和预期输出组成。机器学习算法的模型,也称机器学习模型,其预期输出,称为标签,其可以是一个预测的分类结果(称作分类标签)。非监督式学习算法与监督式学习算法的区别在于,非监督式学习算法的训练样本没有给定标签,机器学习算法模型通过分析训练样本,从而得到一定的成果。半监督学习算法,其训练样本一部分带有标签,另一部分没有标签,而无标签的数据远远多于有标签的数据。强化学习算法通过不断在环境中尝试,以取得最大化的预期利益,通过环境给予的奖励或惩罚,产生能获得最大利益的选择。
请参阅图1,为本申请提供的一个实现模型更新的方法的框架示意图。
如图1所示,该模型更新方法的框架包括多个设备,该多个设备包括服务设备和客户端,图1中服务设备和客户端的数量仅用作示意,不作为对本申请实施例提供的模型更新方法所涉及的应用场景的限制。
其中,服务设备可以指任何支持多模型汇聚、联合的设备,或者云平台、服务器、公有云等,具体此处不做限定。客户端可以是任何支持本地训练的设备,例如,手机,平板,电脑,交换机、光线路终端(optical line terminal,OLT)、光网络设备(optical network terminal,ONT)、路由器等,具体此处不做限定。
服务设备和客户端可以通过有线网络或者无线网络连接,如果是通过有线网络连接,一般的连接形式为光纤网络,可以理解的是,还可以通过其他有线网络连接,具体此处不做限定。如果是通过无线网络连接,可以是通过蓝牙连接,无线网wi-fi连接,可以理解的是,还可以通过其他无线网络连接,具体此处不做限定。
在该模型更新方法框架下,服务设备会向客户端下发模型,客户端接收到模型之后,会在本地进行训练,训练完成后,客户端会将训练好的模型发送给服务设备,或者将差异信息发送给服务设备,服务设备接收到客户端发送的模型或者差异信息之后,服务设备会更新中心模型,服务设备更新完中心模型后,又会将更新之后的中心模型下发给客户端,进行下一次的训练。
在实际应用过程中,学习率也可以称为步长,即本申请实施例中学习率也可以替换为步长,具体此处不做限定。
可以理解的是,在该模型更新方法的框架下,并不具体限制应用场景,只要具体的应用场景可以应用本框架即可。
下面结合图1的模型更新方法的框架,对本申请实施例中的模型更新方法进行描述:
本申请实施例中,模型更新方法可以应用于多种应用场景,因此本申请实施例示意性的列举几种具体场景的实施方式,下面分别进行描述。
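Refer to Figure 2, which is a schematic flowchart of an embodiment of the method for implementing model update provided in this application.
In step 201, the service device receives difference information uploaded by at least two clients.
During joint training, a client sends difference information to the service device when it finishes training, so the service device receives at least two pieces of difference information from at least two clients. The difference information is the difference between a first model and a second model. The first model is sent to the client by the service device, and the second model is obtained by the client through training based on the first model, for example by incrementally training the first model. In practice, the difference information may be gradient information or another type of information, as long as it expresses the difference between the client's model before training and after training; this is not limited here.
For example, the difference information may be a matrix composed of the differences of multiple parameters. Suppose the first model has four parameters whose values form the matrix [a1, b1, c1, d1], and the second model has the same four parameters with values [a2, b2, c2, d2]. The differences of the four parameters then form the matrix [a2-a1, b2-b1, c2-c1, d2-d1], which is the difference information.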
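The parameter-difference example above can be sketched in a few lines of Python. This is only an illustration: the function name `model_difference` and the numeric values are hypothetical, and a model is assumed to be a flat list of parameter values.

```python
from typing import List


def model_difference(first_model: List[float], second_model: List[float]) -> List[float]:
    """Element-wise difference of the trained (second) model relative to the first model."""
    if len(first_model) != len(second_model):
        raise ValueError("both models must have the same number of parameters")
    return [after - before for before, after in zip(first_model, second_model)]


# Hypothetical values standing in for [a1, b1, c1, d1] and [a2, b2, c2, d2].
first_model = [1.0, 2.0, 3.0, 4.0]
second_model = [1.5, 1.0, 3.0, 6.0]
difference = model_difference(first_model, second_model)
```

Only `difference` would need to be uploaded; the service device already holds the first model.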
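In step 202, the service device calculates first difference consistency information from the at least two pieces of difference information.
When the service device receives the difference information uploaded by the at least two clients, it calculates first difference consistency information from the at least two pieces of difference information. The first difference consistency information indicates the degree of consistency among the difference information of the at least two clients.
In practice, when the difference information is gradient information, the first difference consistency information is gradient consistency information. It may be calculated according to the following formula:
[formula not reproduced in this text]
where $GCR_t$ denotes the first difference consistency information and $g_i$ denotes the difference information from the $i$-th client. For example, when difference information is received from two clients, the formula becomes:
[formula not reproduced in this text]
It can be understood that other variant formulas may also be used to calculate the first difference consistency information; this is not limited here.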
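Because the formula itself is not reproduced in this text, the following Python sketch does not claim to be the formula used in this application. It shows one plausible consistency measure, chosen here purely as an assumption: the norm of the summed difference vectors divided by the sum of their norms. The value is 1 when all clients move in the same direction and 0 when their differences cancel out. The names `l2_norm` and `consistency_ratio` are hypothetical.

```python
import math
from typing import Sequence


def l2_norm(vector: Sequence[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def consistency_ratio(differences: Sequence[Sequence[float]]) -> float:
    """Assumed measure: ||sum of g_i|| / sum of ||g_i||, a value in [0, 1]."""
    if len(differences) < 2:
        raise ValueError("difference information from at least two clients is required")
    summed = [sum(column) for column in zip(*differences)]
    total = sum(l2_norm(g) for g in differences)
    if total == 0.0:
        return 0.0  # no client changed the model, so there is no direction to compare
    return l2_norm(summed) / total
```

Any other measure that grows as the clients' difference information becomes more aligned could be substituted without changing the rest of the flow.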
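In step 203, the service device corrects the first difference consistency information by the exponential moving average method to obtain corrected first difference consistency information.
After the service device calculates the first difference consistency information from the difference information, it may correct it by the exponential moving average method to obtain corrected first difference consistency information.
In practice, this correction can be understood as the service device applying exponential moving average (EMA) smoothing to the first difference consistency information. For example, the corrected first difference consistency information may be obtained according to the following formula:
[formula not reproduced in this text]
It can be understood that, in practice, other variant formulas may also be used to calculate the corrected first difference consistency information; this is not limited here.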
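As a rough illustration of EMA smoothing, the sketch below uses the standard exponential moving average recurrence. The smoothing factor and its default value of 0.9 are assumptions made for this example; the coefficients of the formula in this application are not reproduced in this text. The function name `ema_smooth` is hypothetical.

```python
def ema_smooth(current: float, previous_smoothed: float, smoothing: float = 0.9) -> float:
    """Standard EMA: keep `smoothing` of the previous value, blend in the rest from the new one."""
    if not 0.0 <= smoothing <= 1.0:
        raise ValueError("smoothing must lie in [0, 1]")
    return smoothing * previous_smoothed + (1.0 - smoothing) * current
```

With a factor close to 1, a single noisy round changes the corrected value only slightly, which keeps the derived training information from oscillating between rounds.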
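In step 204, the service device obtains first training information from the corrected first difference consistency information and first performance determination information.
After the service device obtains the corrected first difference consistency information, it calculates the first training information from the corrected first difference consistency information and the first performance determination information generated by the service device.
The first performance determination information is preset by the service device and indicates the weights of the performance determination factors used when the next training information is calculated.
Optionally, the first performance determination information may include an accuracy weight and a communication cost weight.
Correspondingly, the higher the accuracy weight is set, the higher the accuracy in subsequent model training.
Correspondingly, the higher the communication cost weight is set, the lower the upload frequency in subsequent model training, that is, the fewer communication resources are used for transmission between the clients and the service device.
The embodiments of this application use $\alpha$ and $\beta$ to denote the accuracy weight and the communication cost weight respectively. It can be understood that, in practice, other symbols may be used; this is not limited here.
For illustration, the values of $\alpha$ and $\beta$ may be set according to the following formula:
[formula not reproduced in this text]
Further, the performance of federated learning may be evaluated with the following formula:
[formula not reproduced in this text]
where $Acc$ denotes the model accuracy; $Acc_{max}$ denotes the maximum model accuracy; $Cost$ denotes the communication cost actually spent during federated learning, that is, the communication resources consumed; $Cost_{max}$ denotes the maximum communication cost, that is, the cost incurred when a client uploads to the service device after every single training pass; and $P$ denotes the performance of the federated learning process, where a larger $P$ means better performance. Here $\alpha + \beta = 1$. A larger $\alpha$ and smaller $\beta$ mean that training leans towards improving model accuracy, so more communication resources are consumed; a smaller $\alpha$ and larger $\beta$ mean that training leans towards communication cost, so fewer communication resources are consumed.
For example, $\alpha = 0.5$, $\beta = 0.5$ indicates that model accuracy and communication cost are weighted equally; $\alpha = 0.75$, $\beta = 0.25$ indicates a higher weight for model accuracy and a lower weight for communication cost; and $\alpha = 0.25$, $\beta = 0.75$ indicates a lower weight for model accuracy and a higher weight for communication cost.
Further, this formula can also be used to evaluate the effect of federated learning: the larger the value of $P$, the better the effect; the smaller the value of $P$, the worse the effect.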
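When the first training information includes the first number of training batches, the service device may calculate the first number of training batches according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; in other words, it is the value the service device calculated from the difference information uploaded by the at least two clients in the previous round. $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes the second number of training batches, that is, the number of training batches the service device delivered to the clients before the first number of training batches; $E_t$ denotes the first number of training batches, that is, the number of training batches for this round; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a$, $b$, $c$, and $d$ are positive numbers not exceeding 6, for example $a = 3$, $b = 4$, $c = 2$, and $d = 4$.
When the value of $E_t$ calculated by the service device is not a positive integer, it is rounded so that $E_t$ takes a positive integer value.
For example, when the calculated value of $E_t$ is 3.3, it may be rounded down to 3 or rounded up to 4; the rounding method is not limited here.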
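It can be understood that, in practice, the first number of training batches may also be obtained with other formulas, as long as it satisfies the following conditions:
if $GCR_t < GCR_{t-1}$, then $E_t < E_{t-1}$, and $E_t$ is a positive integer;
if $GCR_t = GCR_{t-1}$, then $E_t = E_{t-1}$, and $E_t$ is a positive integer;
if $GCR_t > GCR_{t-1}$, then $E_t > E_{t-1}$, and $E_t$ is a positive integer.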
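Since any formula is acceptable as long as these conditions hold, a candidate formula can be checked against them directly. The Python sketch below is such a checker; the function names are hypothetical and the checker is only an illustration of the three conditions, not part of the described method.

```python
def same_direction(gcr_t: float, gcr_prev: float, value_t: float, value_prev: float) -> bool:
    """True when the value moves in the same direction as the consistency information."""
    def sign(x: float) -> int:
        return (x > 0) - (x < 0)
    return sign(gcr_t - gcr_prev) == sign(value_t - value_prev)


def valid_batch_count(gcr_t: float, gcr_prev: float, e_t: object, e_prev: int) -> bool:
    """Check that E_t is a positive integer and follows the direction of GCR_t."""
    if not isinstance(e_t, int) or e_t <= 0:
        return False
    return same_direction(gcr_t, gcr_prev, e_t, e_prev)
```

For instance, `valid_batch_count(0.8, 0.6, 5, 4)` accepts an increase from 4 to 5 batches when consistency rises, while the same increase is rejected when consistency falls.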
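When the first training information includes the first learning rate, the service device may calculate the first learning rate according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the service device before it obtains the first difference consistency information; in other words, it is the value the service device calculated from the difference information uploaded by the at least two clients in the previous round. $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes the second learning rate, that is, the learning rate the service device delivered to the clients before the first learning rate; $lr_t$ denotes the first learning rate, that is, the learning rate the service device is about to deliver to the clients this round; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a_1$, $b_1$, and $b_2$ are positive numbers not exceeding 6, for example $a_1 = 0.1$, $b_1 = 2$, and $b_2 = 4$.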
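It can be understood that, in practice, the first learning rate may also be obtained with other formulas, as long as the result satisfies the following conditions:
if $GCR_t < GCR_{t-1}$, then $lr_t < lr_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $lr_t = lr_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $lr_t > lr_{t-1}$.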
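One simple rule that meets these three conditions is to scale the previous learning rate by the ratio of the two consistency values. The Python sketch below shows that rule as an assumption for illustration only; it is not the formula of this application, it ignores the weights α and β, and the name `next_learning_rate` is hypothetical.

```python
def next_learning_rate(lr_prev: float, gcr_t: float, gcr_prev: float) -> float:
    """Assumed rule: lr_t = lr_{t-1} * GCR_t / GCR_{t-1}, for strictly positive inputs."""
    if lr_prev <= 0.0 or gcr_t <= 0.0 or gcr_prev <= 0.0:
        raise ValueError("learning rate and consistency values must be positive")
    return lr_prev * (gcr_t / gcr_prev)
```

When consistency is unchanged the ratio is exactly 1 and the learning rate is left as it was; a rise or fall in consistency moves the learning rate in the same direction.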
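In step 205, the service device updates the central model.
The service device updates its central model according to the difference information received from the clients to obtain an updated model.
The central model is the first model, and the updated central model is the third model.
When N clients are connected to the service device, the service device may count the number L of clients that have sent it difference information, where L is less than or equal to N. When the ratio of L to N is greater than a threshold K, where K is greater than 0 and less than or equal to 1 and may specifically be a value greater than 0.5, such as 0.8, the service device updates the first model with the received pieces of difference information to obtain the third model, and delivers the third model to each of the N clients.
For example, the service device may update the central model according to the difference information as follows:
Obtain the mean of the difference information uploaded by the at least two clients, and update the first model with the mean to obtain the third model. For instance, suppose the first model is updated based on the difference information uploaded by client 1 and client 2, which is [a2-a1, b2-b1, c2-c1, d2-d1] and [a3-a1, b3-b1, c3-c1, d3-d1] respectively. The mean of the difference information uploaded by the two clients is [(a2-a1+a3-a1)/2, (b2-b1+b3-b1)/2, (c2-c1+c3-c1)/2, (d2-d1+d3-d1)/2], and the first model is updated with this mean.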
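The participation threshold and the mean-based update can be sketched as follows. The text does not say how the mean is applied to the first model, so adding it to the parameters is an assumption here; the function names and the default threshold of 0.8 (taken from the example value of K) are likewise illustrative.

```python
from typing import List, Optional, Sequence


def mean_difference(differences: Sequence[Sequence[float]]) -> List[float]:
    """Parameter-wise mean of the uploaded difference information."""
    if not differences:
        raise ValueError("no difference information received")
    count = len(differences)
    return [sum(column) / count for column in zip(*differences)]


def update_first_model(
    first_model: Sequence[float],
    differences: Sequence[Sequence[float]],
    total_clients: int,
    threshold: float = 0.8,
) -> Optional[List[float]]:
    """Return the third model, or None while too few clients (L/N <= K) have reported."""
    if total_clients <= 0:
        raise ValueError("total_clients must be positive")
    if len(differences) / total_clients <= threshold:
        return None
    mean = mean_difference(differences)
    # Assumption: the mean difference is added to the first model's parameters.
    return [value + delta for value, delta in zip(first_model, mean)]
```

Returning `None` below the threshold lets the caller keep waiting for more uploads instead of updating on a small, possibly unrepresentative set of clients.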
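In step 206, the service device sends the first training information to the clients.
After updating the central model, the service device delivers the updated model to the clients and sends them the first training information.
In practice, after a client receives the updated model and the first training information, it updates its local model according to the updated model delivered by the service device, updates its local number of training batches and learning rate, and then trains; that is, it further trains the updated model according to the received training information (such as the number of training batches and the learning rate). When training ends, the client again sends difference information to the service device, and steps 201 to 206 are repeated until the central model converges and federated learning training ends.
In this embodiment of this application, the first performance determination information need not be set in step 204; in that case it is not substituted into the formula as a parameter when the first training information is calculated.
In this embodiment of this application, step 203 is optional. When step 203 is not performed, the service device may calculate the first training information directly from the first difference consistency information calculated in step 202.
It should be noted that, in this embodiment of this application, there is no required order between step 205 and steps 202 to 204.
In this embodiment, the service device obtains the first training information from the difference information uploaded by at least two clients and delivers it to the clients. In this process, the service device calculates the first training information that the clients need for the next round of training from the difference information they uploaded. Because the difference information is related to the result of the clients' previous training, a client that trains with the first training information can improve the accuracy of the model.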
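Refer to Figure 3, which is a schematic flowchart of another embodiment of the model update method provided in this application.
In step 301, the service device receives difference information uploaded by at least two clients.
In step 302, the service device updates the central model according to the difference information uploaded by the at least two clients.
Steps 301 and 302 in this embodiment are similar to steps 201 and 205 in the embodiment shown in Figure 2, and are not repeated here.
In step 303, the service device calculates target difference consistency information from the difference information between the central model before and after the update and the difference information uploaded by a client.
When the service device has received the difference information uploaded by at least two clients and needs to calculate the target training information of one of those clients, it first updates the central model according to the difference information uploaded by the at least two clients, and then obtains the difference information between the central model before the update and the central model after the update. The service device performs a calculation on the difference information uploaded by that client and the difference information between the central model before and after the update to obtain target difference consistency information, which indicates the degree of consistency between the two.
The central model is the first model, and the updated central model is the third model.
In practice, when the difference information is gradient information, the target difference consistency information is gradient consistency information. It may be calculated according to the following formula:
[formula not reproduced in this text]
where $GCR_t$ denotes the target difference consistency information, $g_3$ denotes the difference information from the client, and $g_4$ denotes the difference information between the service device's central model before the update and after the update.
It can be understood that other variant formulas may also be used to calculate the target difference consistency information; this is not limited here.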
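The per-client calculation can be sketched as a loop that compares each client's difference information with the service device's own difference information. As the formula is not reproduced in this text, the measure below (norm of the sum over sum of the norms, applied to the pair) is an assumption, and all names and values are hypothetical.

```python
import math
from typing import Dict, Sequence


def pair_consistency(g_client: Sequence[float], g_server: Sequence[float]) -> float:
    """Assumed measure for two vectors: ||g3 + g4|| / (||g3|| + ||g4||)."""
    def norm(vector: Sequence[float]) -> float:
        return math.sqrt(sum(x * x for x in vector))
    denominator = norm(g_client) + norm(g_server)
    if denominator == 0.0:
        return 0.0
    return norm([a + b for a, b in zip(g_client, g_server)]) / denominator


# Difference of the central model before and after the update (third minus first model).
server_difference = [1.0, 0.0]
client_differences: Dict[str, Sequence[float]] = {
    "client_1": [2.0, 0.0],  # same direction as the central update
    "client_2": [0.0, 2.0],  # orthogonal to the central update
}
target_consistency = {
    name: pair_consistency(g, server_difference)
    for name, g in client_differences.items()
}
```

Each client ends up with its own consistency value, which is what allows the service device to derive a separate piece of target training information per client.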
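In step 304, the service device corrects the target difference consistency information by the exponential moving average method to obtain corrected target difference consistency information.
In step 305, the service device obtains target training information from the corrected target difference consistency information and first performance determination information.
In step 306, the service device sends the target training information and the updated central model to the at least two clients.
Steps 304 to 306 in this embodiment are similar to steps 203, 204, and 206 in the embodiment shown in Figure 2, and are not repeated here.
In this embodiment, the service device may determine target training information separately for each of the at least two clients. For any one of these clients: what is obtained in step 303 is the target difference consistency information corresponding to that client, calculated from that client's difference information; in step 304, the target difference consistency information is corrected in the way the first difference consistency information is corrected in step 203, where the difference consistency information calculated for that client in the previous round may serve as the second difference consistency information of step 203; in step 305, the target training information is obtained in the way the first training information is obtained in step 204; and correspondingly, in step 306, the target training information corresponding to that client is sent to that client.
In practice, after a client receives the updated model and the target training information, it updates its local model according to the updated model delivered by the service device, updates its local number of training batches and learning rate, and then trains. When training ends, the client again sends difference information to the service device, and steps 301 to 306 are repeated until the central model converges and federated learning training ends.
In this embodiment of this application, the first performance determination information need not be set in step 305; in that case it is not substituted into the formula as a parameter when the target training information is calculated.
In this embodiment of this application, step 304 is optional. When step 304 is not performed, the service device may calculate the target training information directly from the target difference consistency information obtained in step 303.
In this embodiment, the service device uses the difference information of an individual client to calculate target training information for that client and delivers it to the client. Compared with training information calculated uniformly for all clients, this improves the accuracy with which an individual client trains the model. In a specific implementation, the service device may perform the above steps for each of the clients to calculate that client's target training information and send it to that client.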
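Refer to Figure 4, which is a schematic flowchart of another embodiment of the model update method provided in this application.
In this embodiment, the service device is a cloud device and the client is a site analysis device.
In this application scenario, the model training system involved in the model data processing method includes multiple site analysis devices, and therefore multiple site networks. A site network may be a core network or an edge network, and the user of each site network may be an operator or an enterprise customer. Different site networks may be networks divided along some dimension, for example networks in different regions, networks of different operators, different service networks, or different network domains. The site analysis devices may correspond one-to-one to the site networks. Each site analysis device provides a data analysis service for its site network and may be located inside or outside that network.
In step 401, the cloud device sends a first model to the site analysis device.
During model training, the cloud device obtains the first model and sends it to the site analysis device.
It should be noted that the cloud device may also send target training information to the site analysis device; the target training information indicates the parameter information the site analysis device uses when training the first model.
In step 402, a network device sends first feature data to the site analysis device.
The network device sends first feature data to the site analysis device; the first feature data is data generated by the network device.
For example, when the network device is a camera, its feature data may be the image data it captures; when the network device is a sound recorder, its feature data may be the sound data it records; when the network device is a switch, its feature data may be KPI data, which may be statistics generated when the switch forwards traffic, such as the number of outgoing packet bytes, the number of outgoing packets, queue depth, throughput information, and the number of dropped packets.
In step 403, the site analysis device trains the first model with the first feature data and the target training information to obtain a second model, and obtains the difference information between the first model and the second model.
After receiving the first feature data, the site analysis device trains the first model with the first feature data and the target training information to obtain the second model, and then compares the first model with the second model to obtain their difference information.
For example, the difference information may be a matrix composed of the differences of multiple parameters. Suppose the first model has four parameters whose values form the matrix [a1, b1, c1, d1], and the second model has the same four parameters with values [a2, b2, c2, d2]. The differences of the four parameters then form the matrix [a2-a1, b2-b1, c2-c1, d2-d1], which is the difference information.
In step 404, the site analysis device sends the difference information to the cloud device.
After obtaining the difference information, the site analysis device sends it to the cloud device.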
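In step 405, the cloud device calculates first difference consistency information from at least two pieces of difference information.
When the cloud device receives at least two pieces of difference information uploaded by at least two site analysis devices, it performs a calculation on them to obtain first difference consistency information, which indicates the degree of consistency among the difference information of the site analysis devices.
In practice, when the difference information is gradient information, the first difference consistency information is gradient consistency information. It may be calculated according to the following formula:
[formula not reproduced in this text]
where $GCR_t$ denotes the first difference consistency information and $g_i$ denotes the difference information from the $i$-th site analysis device. For example, when difference information is received from two site analysis devices, the formula becomes:
[formula not reproduced in this text]
It can be understood that other variant formulas may also be used to calculate the first difference consistency information; this is not limited here.
In step 406, the cloud device corrects the first difference consistency information by the exponential moving average method to obtain corrected first difference consistency information.
After the cloud device calculates the first difference consistency information from the difference information, it may correct it by the exponential moving average method to obtain corrected first difference consistency information.
In practice, this correction can be understood as the cloud device applying exponential moving average (EMA) smoothing to the first difference consistency information. For example, the corrected first difference consistency information may be obtained according to the following formula:
[formula not reproduced in this text]
It can be understood that, in practice, other variant formulas may also be used to calculate the corrected first difference consistency information; this is not limited here.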
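In step 407, the cloud device calculates first training information from the corrected first difference consistency information and first performance determination information.
After the cloud device obtains the corrected first difference consistency information, it calculates the first training information from the corrected first difference consistency information and the first performance determination information generated by the cloud device.
The first performance determination information is preset by the cloud device and indicates the weights of the parameters used when the next training information is calculated.
Optionally, the first performance determination information may include an accuracy weight and a communication cost weight.
Correspondingly, the higher the accuracy weight is set, the higher the accuracy in subsequent model training.
Correspondingly, the higher the communication cost weight is set, the lower the upload frequency in subsequent model training, that is, the fewer communication resources are used for transmission between the site analysis devices and the cloud device.
The embodiments of this application use $\alpha$ and $\beta$ to denote the accuracy weight and the communication cost weight respectively. It can be understood that, in practice, other symbols may be used; this is not limited here.
For illustration, the values of $\alpha$ and $\beta$ may be set according to the following formula:
[formula not reproduced in this text]
where $Acc$ denotes the model accuracy; $Acc_{max}$ denotes the maximum model accuracy; $Cost$ denotes the communication cost actually spent during federated learning, that is, the communication resources consumed; $Cost_{max}$ denotes the maximum communication cost, that is, the cost incurred when a site analysis device uploads to the cloud device after every single training pass; and $P$ denotes the performance of the federated learning process, where a larger $P$ means better performance. Here $\alpha + \beta = 1$. A larger $\alpha$ and smaller $\beta$ mean that training leans towards improving model accuracy, so more communication resources are consumed; a smaller $\alpha$ and larger $\beta$ mean that training leans towards communication cost, so fewer communication resources are consumed.
For example, $\alpha = 0.5$, $\beta = 0.5$ indicates that model accuracy and communication cost are weighted equally; $\alpha = 0.75$, $\beta = 0.25$ indicates a higher weight for model accuracy and a lower weight for communication cost; and $\alpha = 0.25$, $\beta = 0.75$ indicates a lower weight for model accuracy and a higher weight for communication cost.
Further, this formula can also be used to evaluate the effect of federated learning: the larger the value of $P$, the better the effect; the smaller the value of $P$, the worse the effect.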
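When the first training information includes the first number of training batches, the cloud device may calculate the first number of training batches according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the cloud device before it obtains the first difference consistency information; in other words, it is the value the cloud device calculated from the difference information uploaded by at least two site analysis devices in the previous round. $GCR_t$ denotes the first difference consistency information; $E_{t-1}$ denotes the second number of training batches, that is, the number of training batches the cloud device delivered to the site analysis devices before the first number of training batches; $E_t$ denotes the first number of training batches, that is, the number of training batches for this round; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a$, $b$, $c$, and $d$ are positive numbers not exceeding 6, for example $a = 3$, $b = 4$, $c = 2$, and $d = 4$.
When the value of $E_t$ calculated by the cloud device is not a positive integer, it is rounded so that $E_t$ takes a positive integer value.
For example, when the calculated value of $E_t$ is 3.3, it may be rounded down to 3 or rounded up to 4; the rounding method is not limited here.
It can be understood that, in practice, the first number of training batches may also be obtained with other formulas, as long as it satisfies the following conditions:
if $GCR_t < GCR_{t-1}$, then $E_t < E_{t-1}$, and $E_t$ is a positive integer;
if $GCR_t = GCR_{t-1}$, then $E_t = E_{t-1}$, and $E_t$ is a positive integer;
if $GCR_t > GCR_{t-1}$, then $E_t > E_{t-1}$, and $E_t$ is a positive integer.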
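When the first training information includes the first learning rate, the cloud device may calculate the first learning rate according to the following formula:
[formula not reproduced in this text]
where $GCR_{t-1}$ denotes the second difference consistency information, that is, the difference consistency information calculated by the cloud device before it obtains the first difference consistency information; in other words, it is the value the cloud device calculated from the difference information uploaded by at least two site analysis devices in the previous round. $GCR_t$ denotes the first difference consistency information; $lr_{t-1}$ denotes the second learning rate, that is, the learning rate the cloud device delivered to the site analysis devices before the first learning rate; $lr_t$ denotes the first learning rate, that is, the learning rate the cloud device is about to deliver to the site analysis devices this round; $\alpha$ denotes the accuracy weight; $\beta$ denotes the communication cost weight; and $a_1$, $b_1$, and $b_2$ are positive numbers not exceeding 6, for example $a_1 = 0.1$, $b_1 = 2$, and $b_2 = 4$.
It can be understood that, in practice, the first learning rate may also be obtained with other formulas, as long as the result satisfies the following conditions:
if $GCR_t < GCR_{t-1}$, then $lr_t < lr_{t-1}$;
if $GCR_t = GCR_{t-1}$, then $lr_t = lr_{t-1}$;
if $GCR_t > GCR_{t-1}$, then $lr_t > lr_{t-1}$.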
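In step 408, the cloud device updates the central model.
The cloud device updates its central model according to the difference information received from the site analysis devices to obtain an updated model.
When N site analysis devices are connected to the cloud device, the cloud device may count the number L of site analysis devices that have sent it difference information, where L is less than or equal to N. When the ratio of L to N is greater than a threshold K, where K is greater than 0 and less than or equal to 1 and may specifically be a value greater than 0.5, such as 0.8, the cloud device updates the first model with the received pieces of difference information to obtain a third model, and delivers the third model to each of the N site analysis devices.
For example, the cloud device may update the central model according to the difference information as follows:
Obtain the mean of the difference information uploaded by at least two site analysis devices, and update the first model with the mean to obtain the third model. For instance, suppose the first model is updated based on the difference information uploaded by site analysis device 1 and site analysis device 2, which is [a2-a1, b2-b1, c2-c1, d2-d1] and [a3-a1, b3-b1, c3-c1, d3-d1] respectively. The mean of the difference information uploaded by the two site analysis devices is [(a2-a1+a3-a1)/2, (b2-b1+b3-b1)/2, (c2-c1+c3-c1)/2, (d2-d1+d3-d1)/2], and the first model is updated with this mean.
In step 409, the cloud device sends the first training information to the site analysis devices.
After updating the central model, the cloud device delivers the updated model to the site analysis devices and sends them the first training information.
In practice, after a site analysis device receives the updated model and the first training information, it updates its local model according to the updated model delivered by the cloud device, updates its local number of training batches and learning rate, and then trains. When training ends, the site analysis device again sends difference information to the cloud device, and steps 402 to 409 are repeated until the central model converges and model training ends.
In this embodiment of this application, the first performance determination information need not be set in step 407; in that case it is not substituted into the formula as a parameter when the first training information is calculated.
In this embodiment of this application, step 406 is optional. When step 406 is not performed, the cloud device may calculate the first training information directly from the first difference consistency information calculated in step 405.
In this embodiment, the site analysis device obtains the first feature data sent by the network device and trains the model with it, which improves the implementability of the solution.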
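The updated central model may be sent in steps 206, 306, and 409 in several ways.
One way is to send the complete model data of the updated central model, which includes its model structure data (such as the functional form of the model) and its model parameter data (including the values of the model parameters). Correspondingly, after receiving the complete model data, the client can load it directly to obtain the updated central model, replacing the central model held by the client (that is, the central model received in the previous round).
Another way is to send only the model parameter data of the updated central model. Correspondingly, after receiving it, the client can use it to replace the model parameter data of the central model held by the client, thereby obtaining the updated central model.
A further way is to send the difference information of the model parameter data of the updated central model relative to the model parameter data of the central model before the update. Correspondingly, after receiving the difference information, the client uses it to modify the model parameter data of the central model it holds, thereby obtaining the updated central model. For example, if a parameter of the central model held by the client has the value a and the difference information for that parameter is +b, the client obtains the updated value a+b for that parameter.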
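On the client side, the second and third delivery options can be sketched as follows. The function names are hypothetical and the model is assumed to be a flat list of parameter values; the sketch only illustrates the a+b rule from the example above.

```python
from typing import List, Sequence


def apply_parameter_replacement(local_params: Sequence[float], new_params: Sequence[float]) -> List[float]:
    """Option 2: the received parameter data simply replaces the local parameter data."""
    if len(local_params) != len(new_params):
        raise ValueError("parameter counts do not match")
    return list(new_params)


def apply_parameter_delta(local_params: Sequence[float], delta: Sequence[float]) -> List[float]:
    """Option 3: each local parameter a becomes a + b, where b is its received difference."""
    if len(local_params) != len(delta):
        raise ValueError("parameter counts do not match")
    return [a + b for a, b in zip(local_params, delta)]
```

The delta option requires the client to still hold the central model from the previous round, whereas the first two options do not depend on the client's current state.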
上面对本申请实施例中的模型数据处理方法进行了描述,下面对本申请实施例中的服务设备进行描述,请参阅图5,为本申请提供的服务设备的一个实施例的结构示意图。
接收单元501,用于接收至少两个客户端上传的差异信息,客户端上传的差异信息是客户端基于第一模型训练得到的第二模型相对于第一模型的差异信息,第一模型由客户端接收自服务设备,第二模型为客户端基于第一模型进行训练得到;
计算单元502,用于根据至少两个客户端上传的差异信息计算以得到第一差异一致信息,第一差异一致信息用于表示至少两个客户端上传的差异信息的一致程度;
更新单元503,用于根据至少两个客户端上传的差异信息更新第一模型得到第三模型;
计算单元502还用于根据第一差异一致信息计算以得到第一训练信息,第一训练信息用于训练第三模型;
发送单元504,用于向至少两个客户端发送第一训练信息和第三模型。
本实施例中,服务设备各单元所执行的操作与前述图2或图4所示实施例中服务设备所执行的方法描述的类似,此处不再赘述。
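Refer to Figure 6, which is a schematic structural diagram of another embodiment of the service device provided in this application.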
接收单元601,用于接收至少两个客户端上传的差异信息,客户端上传的差异信息是客户端基于第一模型训练得到的第二模型相对于第一模型的差异信息,第一模型由客户端接收自服务设备,第二模型为客户端基于第一模型进行训练得到;
计算单元602,用于根据至少两个客户端上传的差异信息计算以得到第一差异一致信息,第一差异一致信息用于表示至少两个客户端上传的差异信息的一致程度;
更新单元605,用于根据至少两个客户端上传的差异信息更新第一模型得到第三模型;
计算单元602还用于根据第一差异一致信息计算以得到第一训练信息,第一训练信息用于训练第三模型;
发送单元604,用于向至少两个客户端发送第一训练信息和第三模型。
可选地,计算单元602具体用于根据以下公式计算得到第一差异一致信息:
GCR
t表示第一差异一致信息,g
i表示至少两个客户端中第i个客户端上传的差异信息,i为大于等于1的正整数。
可选地,第一训练信息包括第一训练批次数或第一学习率中的至少一个,第一训练批次数用于客户端确定训练的批次数,第一学习率用于客户端确定训练的学习率。
可选地,若第一训练信息包括第一训练批次数,则计算单元602具体用于根据第一差异一致信息计算得到的第一训练信息满足下列条件:
若GCR
t<GCR
t-1,则E
t<E
t-1;
若GCR
t=GCR
t-1,则E
t=E
t-1;
若GCR
t>GCR
t-1,则E
t>E
t-1。
GCR
t-1表示第二差异一致信息,第二差异一致信息为在服务设备得到第一差异一致信息之前,服务设备计算得到的差异一致信息,GCR
t表示第一差异一致信息,E
t-1表示第二训练批次数,第二训练批次数表示服务设备得到第一批次数之前服务设备下发给客户端的训练批次数,E
t表示第一训练批次数。
可选地,若第一训练信息包括第一学习率,则计算单元602根据第一差异一致信息计算得到第一训练信息满足下列条件:
若GCR
t<GCR
t-1,则lr
t<lr
t-1;
若GCR
t=GCR
t-1,则lr
t=lr
t-1;
若GCR
t>GCR
t-1,则lr
t>lr
t-1;
GCR
t-1表示第二差异一致信息,第二差异一致信息为在服务设备得到第一差异一致信息之前服务设备计算得到的差异一致信息,GCR
t表示第一差异一致信息,lr
t-1表示第二学习率,第二学习率为在服务设备得到第一学习率之前服务设备下发给客户端的学习率信息,lr
t表示第一学习率。
可选地,计算单元602具体用于根据第一差异一致信息和第一性能判定信息计算得到第一训练信息,第一性能判定信息用于指示服务设备计算第一训练信息时性能判定因素的权重。
可选地,第一性能判定信息包括准确率权重和通信代价权重中的至少一个,准确率权重用于指示服务设备计算第一训练信息时的模型准确率的权重,通信代价权重用于指示服务设备计算第一训练信息时使用通信资源的权重。
可选地,若第一训练信息包括第一训练批次信息,且第一判定性能信息包括准确率权重和通信代价权重,则计算单元602具体根据以下公式计算得到第一训练批次数:
GCR
t-1表示第二差异一致信息,第二差异一致信息为在服务设备得到第一差异一致信息之前服务设备计算得到的差异一致信息,GCR
t表示第一差异一致信息,E
t-1表示第二训练批次数,第二训练批次数表示服务端上一次向客户端发送的训练批次数,E
t表示第一训练批信息,α表示准确率权重,β表示通信代价权重,a、b、c和d为不超过6的正数;
若通过公式计算得到的E
t为非正整数,则通过取整的方式使得Et的取值为正整数。
可选地,若第一训练信息包括第一学习率,则计算单元602具体根据以下公式计算得到第一学习率:
GCR
t-1表示第二差异一致信息,第二差异一致信息为在服务设备得到第一差异一致信息之前服务设备计算得到的差异一致信息,GCR
t表示第一差异一致信息,lr
t-1表示第二学习率,第二学习率为在服务设备得到第一学习率之前服务设备下发给客户端的学习率信息,lr
t表示第一学习率,α表示准确率权重,β表示通信代价权重,a
1、b
1和b
2为不超过6的正数。
可选地,服务设备还包括:
修正单元603,用于利用指数移动平均法修正第一差异一致信息,得到修正后的第一差异一致信息。
可选地,计算单元具体用于利用指数移动参数修正第一差异一致信息可以通过以下公式得到,包括:
GCR
t-1表示第二差异一致信息,第二差异一致信息为在服务设备得到第一差异一致信息之前服务设备计算得到的差异一致信息,GCR
t表示第一差异一致信息,GCR'表示修正后的第一差异一致信息;
本实施例中,服务设备各单元所执行的操作与前述图2或图4所示实施例中服务设备所执行的方法描述的类似,此处不再赘述。
请参阅图7,为本申请提供的服务设备的一个实施例的结构示意图。
接收单元701,用于接收至少两个客户端上传的差异信息,客户端上传的差异信息是基于第一模型训练得到的第二模型相对于第一模型的差异信息,第一模型由客户端接收自服务设备;
更新单元702,根据至少两个客户端上传的差异信息更新第一模型,得到第三模型;
处理单元703,用于根据第一模型和第三模型得到第一模型和第三模型的差异信息(后续称为服务设备的差异信息);
计算单元704,用于根据服务设备的差异信息和所述至少两个客户端中的第一客户端的差异信息计算以得到目标差异一致信息,目标差异一致信息用于表示服务设备模型的差 异信息和第一客户端的差异信息的一致程度;
计算单元704还用于根据目标差异一致信息计算以得到目标训练信息,目标训练信息用于训练第三模型;
发送单元705,用于向所述第一客户端发送目标训练信息和第三模型。
本实施例中,服务设备各单元所执行的操作与前述图3所示实施例中服务设备所执行的方法描述的类似,此处不再赘述。
请参阅图8,为本申请提供的终端的另一实施例的结构示意图。
训练单元801,用于基于第一模型进行训练得到第二模型,第一模型接受自服务设备;
发送单元802,用于向服务设备发送第二模型相对于第一模型的差异信息;
接收单元803,用于接收服务设备发送的第三模型和第一训练信息,第三模型为服务设备根据该客户端发送的差异信息和其它客户端发送的差异信息更新第一模型得到的,第一训练信息为服务设备根据该客户端发送的差异信息和其它客户端发送的差异信息得到的;
训练单元801还用于根据第一训练信息训练第三模型。
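In this embodiment, the operations performed by the units of the client are similar to those described for the client in the embodiments shown in Figure 2 or Figure 4, and are not repeated here.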
请参阅图9,为本申请提供的终端的另一实施例的结构示意图。
训练单元901,用于基于第一模型进行训练得到第二模型,第一模型接受自服务设备;
发送单元902,用于向服务设备发送第二模型相对于第一模型的差异信息;
接收单元903,用于接收服务设备发送的第三模型和第一训练信息,第三模型为服务设备根据该客户端发送的差异信息和其它客户端发送的差异信息更新第一模型得到的,第一训练信息为服务设备根据该客户端发送的差异信息和其它客户端发送的差异信息得到的;
训练单元901还用于根据第一训练信息训练第三模型。
可选地,第一训练信息包括第一训练批次数或第一学习率中的至少一个,第一训练批次数用于客户端确定训练的批次数,第一学习率用于客户端确定训练的学习率。
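Optionally, the client is a site analysis device;
the training unit 901 is further configured to train the third model according to the first training information and a first training sample to obtain a fourth model, where the first training sample includes feature data of the network devices of the site network corresponding to the site analysis device.
In this embodiment, the operations performed by the units of the client are similar to those described for the client in the embodiments shown in Figure 2 or Figure 4, and are not repeated here.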
请参阅图10,为本申请提供的终端的另一实施例的结构示意图。
终端中包括处理器1001、存储器1002、总线1005、接口等设备1004,处理器1001与存储器1002、接口1004相连,总线1005分别连接处理器1001、存储器1002以及接口1004,接口1004用于接收或者发送数据,处理器1001是单核或多核中央处理单元,或者为特定集成电路,或者为被配置成实施本发明实施例的一个或多个集成电路。存储器1002 可以为随机存取存储器(Random Access Memory,RAM),也可以为非易失性存储器(non-volatile memory),例如至少一个硬盘存储器。存储器1002用于存储计算机执行指令。具体的,计算机执行指令中可以包括程序1003。
本实施例中,该处理器1001可以执行前述图2或者图4所示实施例中终端所执行的操作,具体此处不再赘述。
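Refer to Figure 11, which is a schematic structural diagram of another embodiment of the service device provided in this application.
The service device includes a processor 1101, a memory 1102, a bus 1105, and an interface 1104. The processor 1101 is connected to the memory 1102 and the interface 1104; the bus 1105 connects the processor 1101, the memory 1102, and the interface 1104; and the interface 1104 is used to receive or send data. The processor 1101 is a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention. The memory 1102 may be a random access memory (RAM) or a non-volatile memory, such as at least one hard disk memory. The memory 1102 is used to store computer-executable instructions, which may specifically include a program 1103.
In this embodiment, the processor 1101 can perform the operations performed by the service device in the embodiment shown in Figure 3, which are not repeated here.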
本申请实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被计算机执行时实现上述任一方法实施例中与服务设备相关的方法流程。
应理解,本申请以上实施例中的服务设备中提及的处理器,或者本申请上述实施例提供的处理器,可以是中央处理单元(central processing unit,CPU),还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
还应理解,本申请中以上实施例中的服务设备中的处理器的数量可以是一个,也可以是多个,可以根据实际应用场景调整,此处仅仅是示例性说明,并不作限定。本申请实施例中的存储器的数量可以是一个,也可以是多个,可以根据实际应用场景调整,此处仅仅是示例性说明,并不作限定。
还应理解,本申请实施例中以上实施例中的服务设备提及的存储器或可读存储介质等,可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随 机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
还需要说明的是,当服务设备包括处理器(或处理单元)与存储器时,本申请中的处理器可以是与存储器集成在一起的,也可以是处理器与存储器通过接口连接,可以根据实际应用场景调整,并不作限定。
本申请实施例还提供了一种计算机程序或包括计算机程序的一种计算机程序产品,该计算机程序在某一计算机上执行时,将会使所述计算机实现上述任一方法实施例中与服务设备的方法流程。
在上述图2-图4中各个实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。
所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存储的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘Solid State Disk(SSD))等。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备 (可以是个人计算机,服务器,或者其他服务设备等)执行本申请图2至图6中各个实施例所述方法的全部或部分步骤。而该存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的术语在适当情况下可以互换,这仅仅是描述本申请的实施例中对相同属性的对象在描述时所采用的区分方式。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,以便包含一系列单元的过程、方法、系统、产品或设备不必限于那些单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
本申请各实施例中提供的消息/帧/信息、模块或单元等的名称仅为示例,可以使用其他名称,只要消息/帧/信息、模块或单元等的作用相同即可。
在本申请实施例中使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本发明。在本申请实施例中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。还应当理解,在本申请的描述中,除非另有说明,“/”表示前后关联的对象是一种“或”的关系,例如,A/B可以表示A或B;本申请中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,其中A,B可以是单数或者复数。
取决于语境,如在此所使用的词语“如果”或“若”可以被解释成为“在……时”或“当……时”或“响应于确定”或“响应于检测”。类似地,取决于语境,短语“如果确定”或“如果检测(陈述的条件或事件)”可以被解释成为“当确定时”或“响应于确定”或“当检测(陈述的条件或事件)时”或“响应于检测(陈述的条件或事件)”。
以上所述,以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。
Claims (36)
- 一种实现模型更新的方法,其特征在于,包括:服务设备接收至少两个客户端上传的差异信息,所述客户端上传的差异信息是所述客户端基于第一模型训练得到的第二模型相对于所述第一模型的差异信息,所述第一模型由客户端接收自服务设备,所述第二模型为所述客户端基于第一模型进行训练得到;所述服务设备根据所述至少两个客户端上传的差异信息计算得到第一差异一致信息,所述第一差异一致信息用于表示所述至少两个客户端上传的差异信息的一致程度;所述服务设备根据所述至少两个客户端上传的差异信息更新所述第一模型得到第三模型;所述服务设备根据所述第一差异一致信息计算得到第一训练信息,所述第一训练信息用于训练所述第三模型;所述服务设备向所述至少两个客户端发送所述第一训练信息和所述第三模型。
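- The method according to claim 1 or 2, wherein the first training information comprises at least one of a first training batch count and a first learning rate, the first training batch count is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate for training.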
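- The method according to claim 3, wherein, if the first training information comprises the first training batch count, the first training information calculated by the service device from the first difference consistency information satisfies the following conditions: if GCR_t < GCR_{t-1}, then E_t < E_{t-1}; if GCR_t = GCR_{t-1}, then E_t = E_{t-1}; if GCR_t > GCR_{t-1}, then E_t > E_{t-1}; where GCR_{t-1} denotes second difference consistency information, which is difference consistency information calculated by the service device before the service device obtains the first difference consistency information; GCR_t denotes the first difference consistency information; E_{t-1} denotes a second training batch count, which is the training batch count delivered by the service device to the client before the service device obtains the first training batch count; and E_t denotes the first training batch count.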
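- The method according to claim 3 or 4, wherein, if the first training information comprises the first learning rate, the first training information calculated by the service device from the first difference consistency information satisfies the following conditions: if GCR_t < GCR_{t-1}, then lr_t < lr_{t-1}; if GCR_t = GCR_{t-1}, then lr_t = lr_{t-1}; if GCR_t > GCR_{t-1}, then lr_t > lr_{t-1}; where GCR_{t-1} denotes second difference consistency information, which is difference consistency information calculated by the service device before the service device obtains the first difference consistency information; GCR_t denotes the first difference consistency information; lr_{t-1} denotes a second learning rate, which is the learning rate delivered by the service device to the client before the service device obtains the first learning rate; and lr_t denotes the first learning rate.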
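- The method according to claim 4 or 5, wherein calculating, by the service device, the first training information from the first difference consistency information comprises: calculating, by the service device, the first training information from the first difference consistency information and first performance determination information, wherein the first performance determination information indicates the weights of the performance determination factors used when the service device calculates the first training information.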
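- The method according to claim 6, wherein the first performance determination information comprises at least one of an accuracy weight and a communication cost weight, the accuracy weight indicates the weight given to model accuracy when the service device calculates the first training information, and the communication cost weight indicates the weight given to the use of communication resources when the service device calculates the first training information.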
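- The method according to claim 7, wherein, if the first training information comprises the first training batch count and the first performance determination information comprises the accuracy weight and the communication cost weight, the service device calculates the first training batch count according to the following formula: where GCR_{t-1} denotes second difference consistency information, which is difference consistency information calculated by the service device before the service device obtains the first difference consistency information; GCR_t denotes the first difference consistency information; E_{t-1} denotes a second training batch count, which is the training batch count that the service device last sent to the client; E_t denotes the first training batch count; α denotes the accuracy weight; β denotes the communication cost weight; and a, b, c, and d are positive numbers not exceeding 6; and if E_t calculated by the formula is not a positive integer, E_t is rounded so that its value is a positive integer.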
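- The method according to any one of claims 1 to 9, wherein, after the service device calculates the first difference consistency information from the difference information, the method further comprises: correcting, by the service device, the first difference consistency information by using an exponential moving average method, to obtain corrected first difference consistency information.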
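- A model data processing method, comprising: receiving, by a service device, difference information uploaded by at least two clients, wherein the difference information uploaded by each client is the difference information of a second model, obtained through training based on a first model, relative to the first model, and the first model is received by the client from the service device; updating, by the service device, the first model according to the difference information uploaded by the at least two clients to obtain a third model; obtaining, by the service device, first difference information between the first model and the third model from the first model and the third model; calculating, by the service device, target difference consistency information from the first difference information and the difference information of a first client among the at least two clients, wherein the target difference consistency information indicates the degree of consistency between the difference information of the service device and the difference information of the first client; calculating, by the service device, target training information from the target difference consistency information, wherein the target training information is used to train the third model; and sending, by the service device, the target training information and the third model to the first client.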
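- A method for implementing model update, comprising: training, by a client, based on a first model to obtain a second model, wherein the first model is received from a service device; sending, by the client to the service device, the difference information of the second model relative to the first model; receiving, by the client, a third model and first training information sent by the service device, wherein the third model is obtained by the service device by updating the first model according to the difference information sent by the client and the difference information sent by other clients, and the first training information is obtained by the service device from the difference information sent by the client and the difference information sent by the other clients; and training, by the client, the third model according to the first training information.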
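- The method according to claim 14, wherein the first training information comprises at least one of a first training batch count and a first learning rate, the first training batch count is used to determine the number of training batches, and the first learning rate is used to determine the learning rate for training.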
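- The method according to claim 14 or 15, wherein the client is a site analysis device; and the site analysis device trains the third model according to the first training information and a first training sample to obtain a fourth model, wherein the first training sample comprises feature data of network devices in the site network corresponding to the site analysis device.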
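- A service device, comprising: a receiving unit, configured to receive difference information uploaded by at least two clients, wherein the difference information uploaded by each client is the difference information of a second model, obtained by the client through training based on a first model, relative to the first model, the first model is received by the client from the service device, and the second model is obtained by the client through training based on the first model; a calculation unit, configured to calculate first difference consistency information from the difference information uploaded by the at least two clients, wherein the first difference consistency information indicates the degree of consistency of the difference information uploaded by the at least two clients; an updating unit, configured to update the first model according to the difference information uploaded by the at least two clients to obtain a third model; wherein the calculation unit is further configured to calculate first training information from the first difference consistency information, and the first training information is used to train the third model; and a sending unit, configured to send the first training information and the third model to the at least two clients.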
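- The service device according to claim 17 or 18, wherein the first training information comprises at least one of a first training batch count and a first learning rate, the first training batch count is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate for training.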
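- The service device according to claim 19, wherein, if the first training information comprises the first training batch count, the first training information that the calculation unit calculates from the first difference consistency information satisfies the following conditions: if GCR_t < GCR_{t-1}, then E_t < E_{t-1}; if GCR_t = GCR_{t-1}, then E_t = E_{t-1}; if GCR_t > GCR_{t-1}, then E_t > E_{t-1}; where GCR_{t-1} denotes second difference consistency information, which is difference consistency information calculated by the service device before the service device obtains the first difference consistency information; GCR_t denotes the first difference consistency information; E_{t-1} denotes a second training batch count, which is the training batch count delivered by the service device to the client before the service device obtains the first training batch count; and E_t denotes the first training batch count.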
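- The service device according to claim 19 or 20, wherein, if the first training information comprises the first learning rate, the first training information that the calculation unit calculates from the first difference consistency information satisfies the following conditions: if GCR_t < GCR_{t-1}, then lr_t < lr_{t-1}; if GCR_t = GCR_{t-1}, then lr_t = lr_{t-1}; if GCR_t > GCR_{t-1}, then lr_t > lr_{t-1}; where GCR_{t-1} denotes second difference consistency information, which is difference consistency information calculated by the service device before the service device obtains the first difference consistency information; GCR_t denotes the first difference consistency information; lr_{t-1} denotes a second learning rate, which is the learning rate delivered by the service device to the client before the service device obtains the first learning rate; and lr_t denotes the first learning rate.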
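- The service device according to claim 20 or 21, wherein the calculation unit is specifically configured to calculate the first training information from the first difference consistency information and first performance determination information, wherein the first performance determination information indicates the weights of the performance determination factors used when the service device calculates the first training information.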
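- The service device according to claim 22, wherein the first performance determination information comprises at least one of an accuracy weight and a communication cost weight, the accuracy weight indicates the weight given to model accuracy when the service device calculates the first training information, and the communication cost weight indicates the weight given to the use of communication resources when the service device calculates the first training information.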
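- The service device according to claim 23, wherein, if the first training information comprises the first training batch count and the first performance determination information comprises the accuracy weight and the communication cost weight, the calculation unit calculates the first training batch count according to the following formula: where GCR_{t-1} denotes second difference consistency information, which is difference consistency information calculated by the service device before the service device obtains the first difference consistency information; GCR_t denotes the first difference consistency information; E_{t-1} denotes a second training batch count, which is the training batch count that the service device last sent to the client; E_t denotes the first training batch count; α denotes the accuracy weight; β denotes the communication cost weight; and a, b, c, and d are positive numbers not exceeding 6; and if E_t calculated by the formula is not a positive integer, E_t is rounded so that its value is a positive integer.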
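- The service device according to any one of claims 17 to 25, further comprising: a correction unit, configured to correct the first difference consistency information by using an exponential moving average method, to obtain corrected first difference consistency information.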
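- A service device, comprising: a receiving unit, configured to receive difference information uploaded by at least two clients, wherein the difference information uploaded by each client is the difference information of a second model, obtained through training based on a first model, relative to the first model, and the first model is received by the client from the service device; an updating unit, configured to update the first model according to the difference information uploaded by the at least two clients to obtain a third model; a processing unit, configured to obtain, from the first model and the third model, the difference information of the service device between the first model and the third model; a calculation unit, configured to calculate target difference consistency information from the difference information of the service device and the difference information of a first client among the at least two clients, wherein the target difference consistency information indicates the degree of consistency between the difference information of the service device model and the difference information of the first client; wherein the calculation unit is further configured to calculate target training information from the target difference consistency information, and the target training information is used to train the third model; and a sending unit, configured to send the target training information and the third model to the first client.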
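- A client, comprising: a training unit, configured to train based on a first model to obtain a second model, wherein the first model is received from a service device; a sending unit, configured to send, to the service device, the difference information of the second model relative to the first model; and a receiving unit, configured to receive a third model and first training information sent by the service device, wherein the third model is obtained by the service device by updating the first model according to the difference information sent by the client and the difference information sent by other clients, and the first training information is obtained by the service device from the difference information sent by the client and the difference information sent by the other clients; wherein the training unit is further configured to train the third model according to the first training information.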
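- The client according to claim 30, wherein the first training information comprises at least one of a first training batch count and a first learning rate, the first training batch count is used by the client to determine the number of training batches, and the first learning rate is used by the client to determine the learning rate for training.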
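- The client according to claim 30 or 31, wherein the client is a site analysis device; and the training unit is further configured to train the third model according to the first training information and a first training sample to obtain a fourth model, wherein the first training sample comprises feature data of network devices in the site network corresponding to the site analysis device.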
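- A client, comprising: a processor, a memory, and an input/output device, wherein the processor is connected to the memory and the input/output device, and the processor performs the method according to any one of claims 1 to 12 or claims 14 to 16.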
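- A service device, comprising: a processor, a memory, and an input/output device, wherein the processor is connected to the memory and the input/output device, and the processor performs the method according to claim 13.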
- A computer storage medium, storing instructions that, when executed on a computer, cause the computer to perform the method according to any one of claims 1 to 16.
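- A computer program product, wherein, when the computer program product is executed on a computer, the computer is caused to perform the method according to any one of claims 1 to 16.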
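The server-side round recited in the claims above (receive client differences, measure their consistency, update the model, derive the next batch count and learning rate) can be illustrated with a minimal sketch. The claims do not fix a concrete consistency measure, aggregation rule, or adjustment formula in this text, so everything below is an assumption made for illustration: `consistency` uses mean pairwise cosine similarity rescaled to [0, 1], `aggregate` adds the element-wise mean of the differences, and `next_training_info` scales by the ratio GCR_t / GCR_{t-1}. All function names and the default `decay` are hypothetical. The only properties taken from the claims are the exponential-moving-average correction and the monotonic relationship between GCR and E (or lr).

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two flattened parameter-difference vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero update carries no direction
    return dot / (norm_u * norm_v)


def consistency(diffs: Sequence[Vector]) -> float:
    """ASSUMED measure: mean pairwise cosine similarity, rescaled to [0, 1]."""
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two client updates are required")
    sims = [cosine(diffs[i], diffs[j]) for i in range(n) for j in range(i + 1, n)]
    return (sum(sims) / len(sims) + 1.0) / 2.0


def ema(previous: float, current: float, decay: float = 0.9) -> float:
    """Exponential moving average correction; the decay value is an assumption."""
    return decay * previous + (1.0 - decay) * current


def aggregate(first_model: Vector, diffs: Sequence[Vector]) -> List[float]:
    """ASSUMED update rule: add the element-wise mean of the client differences."""
    n = len(diffs)
    return [w + sum(d[k] for d in diffs) / n for k, w in enumerate(first_model)]


def next_training_info(
    gcr_t: float, gcr_prev: float, e_prev: int, lr_prev: float
) -> Tuple[int, float]:
    """Hypothetical ratio rule that moves E and lr in the same direction as GCR."""
    if gcr_prev <= 0.0 or gcr_t < 0.0:
        raise ValueError("expects gcr_prev > 0 and gcr_t >= 0")
    if gcr_t == gcr_prev:
        return e_prev, lr_prev
    ratio = gcr_t / gcr_prev
    if gcr_t > gcr_prev:
        # Force a strict increase even when rounding would leave E unchanged.
        e_t = max(e_prev + 1, math.ceil(e_prev * ratio))
    else:
        # Force a strict decrease, but never below one batch.
        e_t = max(1, min(e_prev - 1, math.floor(e_prev * ratio)))
    return e_t, lr_prev * ratio


# Example round with two clients (values chosen only for illustration).
client_diffs = [[0.2, -0.1], [0.4, -0.3]]
gcr = ema(0.5, consistency(client_diffs))
third_model = aggregate([1.0, 1.0], client_diffs)
e_t, lr_t = next_training_info(gcr, 0.5, e_prev=5, lr_prev=0.01)
```

Because the batch count is floored at 1, the strict decrease cannot hold when the previous count is already 1; a real implementation would need to decide how to treat that edge case. For the per-client variant, the same `cosine` helper could compare the difference between the first and third models with a single client's difference, again only as an assumed measure.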
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP20914895.6A EP4083832A4 (en) | 2020-01-23 | 2020-09-30 | METHOD AND DEVICE FOR IMPLEMENTING MODEL UPDATE |
US17/871,084 US20220366310A1 (en) | 2020-01-23 | 2022-07-22 | Method for implementing model update and device thereof |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010077143.8A CN113159332B (zh) | 2020-01-23 | 2020-01-23 | Method for implementing model update and device thereof |
CN202010077143.8 | 2020-01-23 |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/871,084 Continuation US20220366310A1 (en) | 2020-01-23 | 2022-07-22 | Method for implementing model update and device thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021147373A1 (zh) | 2021-07-29 |
Family
ID=76882155
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/119432 WO2021147373A1 (zh) | 2020-01-23 | 2020-09-30 | Method for implementing model update and device thereof |
Country Status (4)
Country | Link |
---|---|
US (1) | US20220366310A1 (zh) |
EP (1) | EP4083832A4 (zh) |
CN (1) | CN113159332B (zh) |
WO (1) | WO2021147373A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
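CN114844889B (zh) * | 2022-04-14 | 2023-07-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for updating a video processing model, electronic device, and storage medium |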
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180307748A1 (en) * | 2015-07-03 | 2018-10-25 | Sap Se | Adaptive adjustment of network responses to client requests in digital networks |
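CN110008696A (zh) * | 2019-03-29 | 2019-07-12 | Wuhan University | User data reconstruction attack method for deep federated learning |
CN110197084A (zh) * | 2019-06-12 | 2019-09-03 | Shanghai Lianxi Biotechnology Co., Ltd. | Medical data joint learning system and method based on trusted computing and privacy protection |
CN110598870A (zh) * | 2019-09-02 | 2019-12-20 | WeBank Co., Ltd. | Federated learning method and apparatus |
CN110719158A (zh) * | 2019-09-11 | 2020-01-21 | Nanjing University of Aeronautics and Astronautics | Edge computing privacy protection system and protection method based on joint learning |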
2020
- 2020-01-23: CN application CN202010077143.8A, patent CN113159332B (zh), status: Active
- 2020-09-30: EP application EP20914895.6A, patent EP4083832A4 (en), status: Pending
- 2020-09-30: WO application PCT/CN2020/119432, patent WO2021147373A1 (zh), status: unknown
2022
- 2022-07-22: US application US17/871,084, patent US20220366310A1 (en), status: Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP4083832A4 |
Also Published As
Publication number | Publication date |
---|---|
CN113159332A (zh) | 2021-07-23 |
CN113159332B (zh) | 2024-01-30 |
EP4083832A1 (en) | 2022-11-02 |
EP4083832A4 (en) | 2023-06-21 |
US20220366310A1 (en) | 2022-11-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20914895; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 2020914895; Country of ref document: EP; Effective date: 20220727 |
 | NENP | Non-entry into the national phase | Ref country code: DE |