CN113705823A - Model training method based on federated learning and electronic device - Google Patents

Info

Publication number
CN113705823A
CN113705823A
Authority
CN
China
Prior art keywords
model, training, data, fused, cloud
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010446725.9A
Other languages
Chinese (zh)
Inventor
王妍
刘宏马
郭文静
苗磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202010446725.9A priority Critical patent/CN113705823A/en
Publication of CN113705823A publication Critical patent/CN113705823A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Abstract

Embodiments of this application relate to the technical field of artificial intelligence and disclose a model training method based on federated learning, together with an electronic device. The method includes: a first electronic device sends a model to be trained to second electronic devices; the first electronic device receives models to be fused from one or more second electronic devices, where each model to be fused is obtained by a second electronic device training the model to be trained on its local training sample data set; the first electronic device cross-validates the models to be fused using a cross-validation data set to obtain cross-validation results; the first electronic device assigns fusion weights to the one or more models to be fused according to the cross-validation results; and the first electronic device performs weighted model fusion according to the fusion weight of each model to be fused to obtain an updated cloud model. The technical solution provided by the embodiments of this application improves the robustness of the cloud model obtained by cloud-side fusion.

Description

Model training method based on federated learning and electronic device
Technical Field
The present application relates to the technical field of artificial intelligence, and in particular to a model training method based on federated learning and an electronic device.
Background
As users' awareness of data security and privacy grows and leaks of users' personal data are exposed ever more frequently, the protection of user data security and privacy is continuously strengthening, posing new challenges to the data processing and model training practices of traditional artificial intelligence (AI).
To meet these challenges, federated learning has been proposed as a model training paradigm. In federated learning, the training data of the model is stored on each client device and is not uploaded to the cloud server; each client device trains the model on its local data and uploads the resulting model parameters to the cloud server. Specifically, the cloud server first issues a model to be trained to each client device; after receiving the model, each client device trains it using locally stored training data to obtain new model parameters; the client device then encrypts the new model parameters and uploads them to the cloud server; finally, the cloud server performs model fusion and updating according to the model parameters uploaded by the client devices to obtain a new model.
However, in existing federated learning, the new model obtained by fusion on the cloud server has poor robustness.
Disclosure of Invention
The present application provides a model training method based on federated learning and an electronic device, aiming to solve the problem that, in existing federated learning, the new model obtained by fusion on the cloud server has poor robustness.
In a first aspect, an embodiment of the present application provides a model training method based on federated learning. The method includes: a first electronic device sends a model to be trained to second electronic devices; the first electronic device then receives models to be fused from one or more second electronic devices, where each model to be fused is obtained by a second electronic device training the model to be trained on its local training sample data set; the first electronic device then cross-validates each model to be fused using a cross-validation data set to obtain a cross-validation result; finally, after assigning fusion weights to the one or more models to be fused according to the cross-validation results, the first electronic device performs weighted model fusion according to the fusion weight of each model to be fused to obtain an updated cloud model.
The first electronic device is a cloud device, such as a server. Each second electronic device is a terminal or client device, such as a mobile phone or a tablet. A model to be fused is a model uploaded from the end side to the cloud; since such a model is typically used to provide personalized services for a user, it may be called a personalized model, and may also be referred to as an end-side model or an end-side personalized model.
It should be noted that a model to be fused may be a model that has been trained on the end side but not verified, or one that has been both trained and verified on the end side.
Thus, in the embodiment of the present application, before the models to be fused uploaded by the end-side devices are fused, they are cross-validated; that is, the credibility and contribution of each model are judged through cross-validation. Different fusion weights are then assigned to the models to be fused according to the cross-validation results, i.e., according to each model's confidence level. Finally, the models to be fused are weighted and fused according to these weights to obtain the updated cloud model. This improves the robustness of the cloud model obtained by cloud-side fusion.
In some possible implementations of the first aspect, after receiving the models to be fused sent by the one or more second electronic devices, the first electronic device may further determine a training result for each model to be fused according to a preset sufficient-training condition, where the training result is either sufficient training or insufficient training.
In this case, assigning a fusion weight to each model to be fused according to the cross-validation results may specifically include: the first electronic device determines a first weight for each model to be fused according to its cross-validation result; for each model to be fused, the first electronic device adjusts the first weight according to the training result to obtain a second weight; and the first electronic device uses the second weight of each model to be fused as its fusion weight.
Specifically, after acquiring the models to be fused uploaded from the end side, the cloud may statistically analyze them, counting data such as each model's end-side training count and end-side training data volume, and then determine from these statistics whether each model to be fused has been sufficiently trained. This result can assist in determining the fusion weight of each model to be fused.
Specifically, when assigning fusion weights according to the cross-validation results, the fusion weight of each model may be determined by combining its cross-validation result with the result of whether it has been sufficiently trained.
The sufficient-training condition may involve the end-side training count and the end-side training data volume, both of which can be obtained by analyzing the uploaded parameters of the models to be fused. Whether each model to be fused has been sufficiently trained can then be judged from these statistics.
In other words, in some possible implementations of the first aspect, determining the training result of each model to be fused according to the preset sufficient-training condition may include: if the end-side training count of the model to be fused is greater than a preset count threshold and its end-side data volume is greater than a preset volume threshold, the first electronic device determines the training result of that model to be sufficient training; if the end-side training count is less than the preset count threshold and/or the end-side data volume is less than the preset volume threshold, the first electronic device determines the training result of that model to be insufficient training.
Here, the end-side training count is the number of times the second electronic device has trained the model to be trained, and the end-side data volume is the size of the second electronic device's local training sample data set.
The preset count threshold and the preset volume threshold may be set according to actual application requirements; for example, a count threshold of 1000 and a volume threshold of 100.
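The sufficient-training check above can be sketched as follows; this is a minimal illustration, with the function name invented here and the default thresholds of 1000 and 100 taken from the example values:

```python
def is_sufficiently_trained(train_count, data_volume,
                            count_threshold=1000, volume_threshold=100):
    """Sufficient training requires BOTH the end-side training count and the
    end-side training data volume to exceed their preset thresholds
    (threshold defaults are the example values, not fixed by the method)."""
    return train_count > count_threshold and data_volume > volume_threshold
```

A model failing either condition (or both) is treated as insufficiently trained, matching the and/or branch above.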
After determining whether each model to be fused has been sufficiently trained, this result may be combined with the cross-validation result to assign fusion weights. In this process, a first weight is assigned to each model to be fused according to its cross-validation result, and the first weight is then adjusted according to whether the model was sufficiently trained, yielding a second weight. The second weight is the fusion weight of the model to be fused.
Specifically, in some possible implementations of the first aspect, adjusting the first weight according to the training result to obtain the second weight may include: if the training result of the model to be fused is sufficient training, the first electronic device adds a preset value to the first weight to obtain the second weight; if the training result is insufficient training, the first electronic device subtracts the preset value from the first weight to obtain the second weight.
The preset value can be set according to the actual application requirement, for example, the preset value is 0.001.
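The adjustment of the first weight by the preset value can be sketched as follows (names are illustrative; the 0.001 default is the example value above):

```python
def second_weight(first_weight, sufficiently_trained, preset_value=0.001):
    """Add the preset value to the first weight for a sufficiently trained
    model; subtract it for an insufficiently trained one."""
    if sufficiently_trained:
        return first_weight + preset_value
    return first_weight - preset_value
```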
In some possible implementations of the first aspect, determining the first weight of each model to be fused according to the cross-validation results may include: the first electronic device sums the cross-validation results of all models to be fused; the first electronic device then takes, as the first weight of each model to be fused, the ratio of that model's cross-validation result to the sum.
In a specific application, fusion weights may also be assigned based on the cross-validation results alone, rather than by combining the cross-validation results with whether the models to be fused have been sufficiently trained.
That is, in some possible implementations of the first aspect, assigning a fusion weight to each model to be fused according to the cross-validation results may instead include: the first electronic device sums the cross-validation results of all models to be fused; and the first electronic device takes the ratio of each model's cross-validation result to this sum as that model's fusion weight.
At this time, fusion weights are assigned to the models based only on the model cross-validation results.
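The ratio-based weighting used in both implementations above (as the first weight in the combined scheme, or directly as the fusion weight here) can be sketched as:

```python
def ratio_weights(cv_results):
    """Each model's weight is its cross-validation result divided by the sum
    of all models' cross-validation results, so the weights sum to 1."""
    total = sum(cv_results)
    return [result / total for result in cv_results]
```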
It should be noted that, compared with assigning fusion weights based only on the cross-validation results, assigning weights by combining the cross-validation results with whether the models to be fused have been sufficiently trained yields more accurate fusion weights.
In some possible implementations of the first aspect, cross-validating the models to be fused using the cross-validation data set may include: the first electronic device divides the cross-validation data set into sub-cross-validation data sets; the first electronic device validates each model to be fused on each sub-cross-validation data set to obtain sub-validation results, where each sub-cross-validation data set yields one sub-validation result; and the first electronic device derives the cross-validation result of each model to be fused from its sub-validation results.
As an example and not by way of limitation, suppose the cross-validation data set is divided into 3 sub-cross-validation data sets, and for a certain model A to be fused, the 3 sub-validation results corresponding to the 3 sub-cross-validation data sets are 90%, 85%, and 95% respectively. For simplicity of calculation, the average of the 3 sub-validation results is computed: (90% + 85% + 95%) / 3 = 90%. This average is taken as the cross-validation result of model A to be fused, i.e., 90%.
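The sub-validation procedure with averaging, as in the 90%/85%/95% example, can be sketched as follows; `evaluate` stands in for any function returning a model's accuracy on a data subset:

```python
def cross_validate(evaluate, dataset, k=3):
    """Divide the cross-validation dataset into k sub-cross-validation
    datasets, obtain one sub-validation result per subset, and average the
    sub-validation results into the model's cross-validation result."""
    size = len(dataset) // k
    subsets = [dataset[i * size:(i + 1) * size] for i in range(k)]
    subsets[-1] = subsets[-1] + dataset[k * size:]  # leftovers join the last subset
    sub_results = [evaluate(subset) for subset in subsets]
    return sum(sub_results) / len(sub_results)
```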
In some possible implementations of the first aspect, the method may further include: the first electronic device acquires cloud verification data uploaded by a second electronic device, where the cloud verification data is generated by the second electronic device through data simulation on training samples; and the first electronic device constructs the cross-validation data set from the cloud verification data and cloud data, where the cloud data is data stored locally in the cloud.
It should be noted that an end-side device may or may not perform the data simulation process. If it does not, the cloud device constructs the cross-validation data set from cloud data alone. If it does, and it uploads the data obtained through simulation to the cloud for constructing the cloud cross-validation data set, the cloud device can construct the cross-validation data set by combining the uploaded cloud verification data with the cloud's local data.
The cloud data refers to data stored locally in the cloud; it may be derived by the cloud from the uploaded parameters of the models to be fused, or it may be the training data used to train the model to be trained.
Compared with the prior art, constructing the cross-validation data set by combining the end-side uploaded cloud verification data with the cloud data enriches the scenarios and complexity of model validation, further improving the robustness of the new model obtained by cloud fusion.
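Under these assumptions, building the cross-validation data set from the two sources is a simple combination (a sketch; real construction may add sampling or deduplication not specified here):

```python
def build_cross_validation_set(cloud_data, cloud_verification_data=None):
    """Use cloud-local data alone when no end-side verification data was
    uploaded; otherwise combine both sources into one data set."""
    if not cloud_verification_data:
        return list(cloud_data)
    return list(cloud_data) + list(cloud_verification_data)
```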
In some possible implementations of the first aspect, after the first electronic device performs cross-validation on the model to be fused by using the cross-validation dataset to obtain a cross-validation result, the method may further include the following steps: if the model structure of the model to be fused is different from the model structure of the model to be fused trained in the previous round, the first electronic equipment determines a target cloud model from the pre-stored cloud models according to the model structure of the model to be fused; the method comprises the steps that model verification is conducted on a target cloud model through first electronic equipment; if the verification is passed, the first electronic device takes the target cloud model as an updated cloud model;
and if the model structure of the model to be fused is the same as that of the model to be fused trained in the previous round, the first electronic equipment performs the step of distributing fusion weight to each model to be fused according to the cross validation result.
Specifically, after the cloud performs credibility cross-validation on the models to be fused, it may directly proceed to the weighted model fusion step, or it may first judge whether the model structure of the models to be fused has changed and perform weighted fusion only when the structure is unchanged. If the model structure has changed, a new cloud model can instead be obtained through model search. In this way, an updated cloud model can still be obtained when the model structure changes.
In some possible implementation manners of the first aspect, the performing, by the first electronic device, model weighted fusion on the to-be-fused models according to the fusion weight of each to-be-fused model, and the obtaining of the updated cloud model may include: the method comprises the steps that first electronic equipment determines whether a model to be fused needs to be retrained or not according to preset retraining conditions; if retraining is needed, the first electronic equipment trains the model to be fused in a weighted federal meta-learning mode according to the fusion weight to obtain a fusion model; performing model verification on the fusion model, and if the verification is passed, taking the fusion model as an updated cloud model;
if the retraining is not needed, the first electronic equipment performs weighted fusion on the model to be fused according to the fusion weight to obtain a fusion model; and performing model verification on the fusion model, and if the verification is passed, taking the fusion model as an updated cloud model.
It should be noted that the retraining condition is set mainly according to the training task and the characteristics of the model. For some simple models, such as random forests, direct weighted fusion or weighted averaging already yields a good model, so retraining is unnecessary; for models with complex structures, such as deep learning models, direct weighted fusion or weighted averaging yields a poor model, so weighted training must be performed again.
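For the branch without retraining, direct weighted fusion can be sketched as a normalized weighted average over flat parameter vectors (a simplification: real models carry structured parameters, and the weighted federated meta-learning branch is not shown):

```python
def weighted_fuse(param_vectors, fusion_weights):
    """Fuse models parameter-by-parameter: each fused parameter is the
    weighted sum of the corresponding parameters of the models to be fused,
    with fusion weights normalized to sum to 1."""
    total = sum(fusion_weights)
    norm = [w / total for w in fusion_weights]
    dim = len(param_vectors[0])
    return [sum(norm[m] * param_vectors[m][i] for m in range(len(param_vectors)))
            for i in range(dim)]
```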
In a second aspect, an embodiment of the present application provides a model training method based on federated learning. The method may include: a second electronic device receives the model to be trained sent by the first electronic device; the second electronic device then performs data validity verification on its training sample data set to obtain a verified training sample data set; the second electronic device then incrementally trains the model to be trained on the verified training sample data set to obtain an end-side model; finally, the second electronic device performs model verification on the end-side model, and the end-side model that passes model verification is sent to the first electronic device as the model to be fused.
In existing federated learning, by contrast, neither the validity of the local training data nor the trained model is verified when a model is trained on local data. Here, after model training is complete, the client device also performs model verification on the trained model and uploads it to the cloud only after the verification passes. This guarantees the quality of the models uploaded to the cloud and further improves the robustness of the cloud-fused new model.
In some possible implementations of the second aspect, the data validity verification performed by the second electronic device may include: the second electronic device classifies the training sample data set, identifying first valid data, redundant data, and noise data; the second electronic device removes the redundant data; the second electronic device repairs the noise data to obtain second valid data; and the second electronic device forms the verified training sample data set from the first valid data and the second valid data.
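The classify / remove / repair steps can be sketched as follows, using exact duplicates as redundant data and out-of-range values as noise repaired by clipping; both criteria are illustrative stand-ins, as the method does not fix concrete ones:

```python
def verify_training_data(samples, lo=0.0, hi=1.0):
    """Classify samples into first valid data (in range, first occurrence),
    redundant data (exact duplicates, removed), and noise data (outside the
    assumed [lo, hi] range, repaired by clipping into second valid data)."""
    seen, verified = set(), []
    for x in samples:
        if x in seen:                      # redundant data: remove
            continue
        seen.add(x)
        if lo <= x <= hi:                  # first valid data: keep
            verified.append(x)
        else:                              # noise data: repair by clipping
            verified.append(min(max(x, lo), hi))
    return verified
```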
In some possible implementation manners of the second aspect, the performing, by the second electronic device, incremental training on the model to be trained according to the verified training sample data set, and the obtaining the end-side model may include:
if the type of the model training task is different from that of the model training task of the previous round of training, the second electronic equipment performs incremental training on the model to be trained in a multi-task incremental learning mode according to the verified training sample data set to obtain an end-side model;
if the type of the model training task is the same as that of the model training task of the previous round of training, and the type of the base model of the model to be trained is a machine learning model, the second electronic equipment performs incremental training on the model to be trained in a machine incremental learning mode according to the verified training sample data set to obtain an end-side model;
and if the type of the model training task is the same as that of the model training task of the previous round of training, and the type of the base model of the model to be trained is a non-machine learning model, the second electronic equipment performs incremental training on the model to be trained in a neural network incremental learning mode according to the verified training sample data set to obtain the end-side model.
By way of example and not limitation, when the model training task is a classification task, if the current model training task is a 3-classification task and the previous round of model training task is a 2-classification task, the type of the model training task is considered to be changed.
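The selection among the three incremental training modes described above can be sketched as a simple dispatch (the mode labels are descriptive strings, not defined API names):

```python
def choose_incremental_mode(task_type_changed, base_model_is_machine_learning):
    """Pick the incremental training mode: multi-task incremental learning
    when the task type changed; otherwise machine incremental learning for a
    machine-learning base model, else neural network incremental learning."""
    if task_type_changed:
        return "multi-task incremental learning"
    if base_model_is_machine_learning:
        return "machine incremental learning"
    return "neural network incremental learning"
```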
It should be noted that this implementation selects the optimal or most suitable incremental training mode according to the type of the end-side model and the end-side model training task, further improving the training efficiency and training effect of the end-side model and allowing the model to converge quickly.
In some possible implementation manners of the second aspect, the incrementally training the model to be trained by the second electronic device through neural network incremental learning according to the verified training sample data set, and the process of obtaining the end-side model may include:
if the model to be trained meets the first type of condition, the second electronic equipment conducts incremental training on the model to be trained through small sample learning and convolutional layer updating according to the verified training sample data set to obtain an end-side model;
and if the model to be trained meets the second type of conditions, the second electronic equipment performs incremental training on the model to be trained through knowledge distillation and convolutional layer solidification according to the verified training sample data set to obtain the end side model.
In some possible implementations of the second aspect, the method may further include a data enhancement process, and the data enhancement process may specifically include: the second electronic equipment acquires sample data at the end side; the second electronic equipment performs data simulation according to the end-side sample data to generate training data and cloud verification data used for constructing a cloud cross verification data set; the second electronic equipment uploads the cloud verification data to the first electronic equipment; and the second electronic equipment constructs a training sample data set according to the end-side sample data and the training data.
It should be noted that, in this implementation, data enhancement generates a large number of training samples by simulation from a small number of samples; training the model on the sample data generated by data simulation improves model training efficiency and makes the trained model more personalized.
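The data enhancement process can be sketched as follows, using small random jitter as one plausible simulation strategy (the method does not fix a specific one) and splitting the simulated data between training data and cloud verification data:

```python
import random

def enhance(end_side_samples, factor=10, jitter=0.01, seed=0):
    """Simulate `factor` new samples per end-side sample by adding small
    random jitter, split the simulated data into training data and cloud
    verification data, and build the training sample data set from the
    end-side samples plus the simulated training data."""
    rng = random.Random(seed)
    simulated = [x + rng.uniform(-jitter, jitter)
                 for x in end_side_samples for _ in range(factor)]
    half = len(simulated) // 2
    training_data, cloud_verification_data = simulated[:half], simulated[half:]
    training_set = list(end_side_samples) + training_data
    return training_set, cloud_verification_data
```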
In a third aspect, an embodiment of the present application provides a model training apparatus based on federated learning, where the apparatus may include:
the model to be trained sending module is used for sending the model to be trained to the second electronic equipment;
the model receiving module is used for receiving the model to be fused sent by one or more second electronic devices, wherein the model to be fused is obtained by the second electronic devices through training the model to be trained according to a local training sample data set;
the cross validation module is used for performing cross validation on the model to be fused by using the cross validation data set to obtain a cross validation result;
the weight distribution module is used for distributing fusion weights to one or more models to be fused according to the cross validation result;
and the model fusion module is used for performing model weighted fusion on the models to be fused according to the fusion weight of each model to be fused to obtain an updated cloud model.
In some possible implementations of the third aspect, the apparatus may further include:
the statistical analysis module is used for determining the training result of each model to be fused according to the preset sufficient training condition, wherein the training result comprises sufficient training and insufficient training;
at this time, the weight assignment module is specifically configured to: determining a first weight of each model to be fused according to a cross validation result; aiming at each model to be fused, adjusting the first weight according to a training result to obtain a second weight; and taking the second weight of each model to be fused as a fusion weight.
In some possible implementation manners of the third aspect, the statistical analysis module is specifically configured to:
if the number of training times of the end side of the model to be fused is larger than a preset number threshold value and the data volume of the end side of the model to be fused is larger than a preset threshold value, determining that the training result of the model to be fused is full training;
if the number of training times of the end side of the model to be fused is smaller than a preset number threshold value and/or the data volume of the end side of the model to be fused is smaller than a preset threshold value, determining that the training result of the model to be fused is insufficient training;
the number of training times of the model to be trained on the second electronic device is the number of training times of the model to be trained on the second electronic device, and the data size of the end side is the local training sample data size of the second electronic device.
In some possible implementation manners of the third aspect, the weight assignment module is specifically configured to: if the training result of the model to be fused is full training, adding the first weight and a preset numerical value to obtain a second weight; and if the training result of the model to be fused is insufficient training, subtracting the preset value from the first weight to obtain a second weight.
In some possible implementation manners of the third aspect, the weight assignment module is specifically configured to: adding the cross validation results of each model to be fused to obtain a cross validation result sum; and taking the ratio of the cross validation result of each model to be fused to the sum of the cross validation results as the first weight of the model to be fused.
In some possible implementation manners of the third aspect, the weight assignment module is specifically configured to: adding the cross validation results of each model to be fused to obtain a cross validation result sum; and taking the ratio of the cross validation result of each model to be fused to the sum of the cross validation results as the fusion weight of the models to be fused.
In some possible implementations of the third aspect, the cross-validation module is specifically configured to: dividing the cross-validation dataset into sub-cross-validation datasets; verifying each model to be fused by using each sub cross verification data set respectively to obtain sub verification results, wherein one sub cross verification data set corresponds to one sub verification result; and obtaining the cross validation result of each model to be fused according to the sub validation result of each model to be fused.
In some possible implementations of the third aspect, the apparatus may further include:
the cross validation data set building module is used for obtaining cloud validation data uploaded by the second electronic device, and the cloud validation data are generated by the second electronic device through data simulation of the training sample; and constructing a cross validation data set according to the cloud validation data and the cloud data, wherein the cloud data is data stored in the cloud local.
In some possible implementations of the third aspect, the apparatus may further include:
the model searching module is used for determining a target cloud model from the pre-stored cloud models according to the model structure of the model to be fused if the model structure of the model to be fused is different from the model structure of the model to be fused trained in the previous round; performing model verification on the target cloud model; if the verification is passed, taking the target cloud model as an updated cloud model; and if the model structure of the model to be fused is the same as that of the model to be fused trained in the previous round, distributing fusion weight to each model to be fused according to the cross validation result.
In some possible implementation manners of the third aspect, the model fusion module is specifically configured to: determining whether the model to be fused needs to be retrained according to a preset retraining condition; if retraining is needed, training the model to be fused in a weighted federal meta-learning mode according to the fusion weight to obtain a fusion model; performing model verification on the fusion model, and if the verification is passed, taking the fusion model as an updated cloud model; if retraining is not needed, performing weighted fusion on the model to be fused according to the fusion weight to obtain a fusion model; and performing model verification on the fusion model, and if the verification is passed, taking the fusion model as an updated cloud model.
The federal learning-based model training apparatus has the function of implementing the federal learning-based model training method of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function, and the modules may be software and/or hardware.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/modules, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and reference may be made to the part of the embodiment of the method specifically, and details are not described here.
In a fourth aspect, an embodiment of the present application provides a model training apparatus based on federal learning, which may include the following modules:
the model receiving module to be trained is used for receiving the model to be trained sent by the first electronic equipment;
the data validity verification module is used for verifying the data validity of the training sample data set to obtain a verified training sample data set;
the model training module is used for carrying out incremental training on the model to be trained according to the verified training sample data set to obtain an end-side model;
the model verification module is used for performing model verification on the end-side model;
and the model to be fused uploading module is used for transmitting the end-side model passing the model verification as the model to be fused to the first electronic equipment.
In some possible implementation manners of the fourth aspect, the data validity verification module is specifically configured to: carrying out data classification on the training sample data set, and determining first effective data, redundant data and noise data in the training sample data; removing redundant data; carrying out noise data restoration on the noise data to obtain second effective data; and forming a verified training sample data set based on the first valid data and the second valid data.
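A toy sketch of the data validity verification step above. The classification and repair rules here are stand-ins chosen only for illustration: "redundant data" is taken to mean exact duplicate samples, and "noise data" to mean samples with out-of-range labels, repaired by clipping. The actual classification and noise-repair methods are not fixed by the original.

```python
def verify_training_data(samples, label_range=(0.0, 1.0)):
    """Classify (x, label) samples into valid, redundant, and noise data;
    drop redundant samples, repair noisy labels, and return the verified
    training sample data set (first valid data + repaired second valid data)."""
    seen, verified = set(), []
    lo, hi = label_range
    for x, y in samples:
        if (x, y) in seen:            # redundant data: remove
            continue
        seen.add((x, y))
        if not lo <= y <= hi:         # noise data: repair by clipping the label
            y = min(max(y, lo), hi)
        verified.append((x, y))
    return verified
```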
In some possible implementations of the fourth aspect, the model training module is specifically configured to:
if the type of the model training task is different from that of the model training task of the previous round of training, performing incremental training on the model to be trained in a multi-task incremental learning mode according to the verified training sample data set to obtain an end-side model;
if the type of the model training task is the same as that of the model training task of the previous round of training and the type of the base model of the model to be trained is a machine learning model, performing incremental training on the model to be trained in a machine incremental learning mode according to the verified training sample data set to obtain an end-side model;
and if the type of the model training task is the same as that of the model training task of the previous round of training and the type of the base model of the model to be trained is a non-machine learning model, performing incremental training on the model to be trained in a neural network incremental learning mode according to the verified training sample data set to obtain the end-side model.
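The three-way branching above can be expressed as a small mode selector. The string labels are illustrative only; the original does not define concrete identifiers for these training modes.

```python
def select_training_mode(task_type, prev_task_type, base_model_type):
    """Choose the incremental-training mode per the branching described above."""
    if task_type != prev_task_type:
        # Task type changed since the previous round of training.
        return "multi-task incremental learning"
    if base_model_type == "machine learning":
        return "machine incremental learning"
    # Same task type, non-machine-learning base model.
    return "neural network incremental learning"
```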
In some possible implementations of the fourth aspect, the model training module is specifically configured to:
if the model to be trained meets the first type of condition, the second electronic equipment conducts incremental training on the model to be trained through small sample learning and convolutional layer updating according to the verified training sample data set to obtain an end-side model;
and if the model to be trained meets the second type of conditions, the second electronic equipment performs incremental training on the model to be trained through knowledge distillation and convolutional layer solidification according to the verified training sample data set to obtain the end-side model.
In some possible implementations of the fourth aspect, the apparatus may further include:
the data enhancement module is used for acquiring sample data at the end side; performing data simulation according to the sample data at the end side, and generating training data and cloud verification data for constructing a cloud cross verification data set; uploading the cloud verification data to the first electronic equipment; and constructing a training sample data set according to the end-side sample data and the training data.
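A toy sketch of the data enhancement module's flow, under heavy assumptions: "data simulation" is reduced to jittering numeric features with small random noise, and the generated data is split evenly between local training data and cloud verification data. The jitter amplitude, counts, and split are all invented for illustration.

```python
import random

def augment(end_side_samples, n_per_sample=5, noise=0.05, seed=0):
    """Simulate extra (x, label) data by jittering each real end-side sample,
    then split the generated data into training data and cloud-verification
    data; the training sample data set combines real and simulated samples."""
    rng = random.Random(seed)
    generated = [
        (x + rng.uniform(-noise, noise), y)
        for x, y in end_side_samples
        for _ in range(n_per_sample)
    ]
    half = len(generated) // 2
    training = list(end_side_samples) + generated[:half]
    cloud_verification = generated[half:]   # would be uploaded to the cloud
    return training, cloud_verification
```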
The federal learning-based model training apparatus has the function of implementing the federal learning-based model training method of the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function, and the modules may be software and/or hardware.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/modules, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and reference may be made to the part of the embodiment of the method specifically, and details are not described here.
In a fifth aspect, an embodiment of the present application provides a method for model training based on federal learning, where the method may include:
the first electronic device sends the model to be trained to the second electronic device.
The second electronic device performs data validity verification on a training sample data set to obtain a verified training sample data set; then performs incremental training on the model to be trained according to the verified training sample data set to obtain an end-side model; then performs model verification on the end-side model; and then transmits the end-side model that passes the model verification to the first electronic device as the model to be fused.
The first electronic device performs cross validation on the models to be fused by using a cross validation data set to obtain a cross validation result; then assigns a fusion weight to the one or more models to be fused according to the cross validation result; and finally performs model weighted fusion on the models to be fused according to the fusion weight of each model to be fused to obtain an updated cloud model.
In some possible implementation manners of the fifth aspect, the second electronic device performs data validity verification on the training sample data set, and the process of obtaining the verified training sample data set may include:
the second electronic equipment carries out data classification on the training sample data set, and determines first effective data, redundant data and noise data in the training sample data; the second electronic equipment removes redundant data; the second electronic equipment repairs the noise data to obtain second effective data; and the second electronic equipment forms a verified training sample data set based on the first valid data and the second valid data.
In some possible implementation manners of the fifth aspect, the performing, by the second electronic device, incremental training on the model to be trained according to the verified training sample data set, and the obtaining the end-side model may include:
if the type of the model training task is different from that of the model training task of the previous round of training, the second electronic equipment performs incremental training on the model to be trained in a multi-task incremental learning mode according to the verified training sample data set to obtain an end-side model;
if the type of the model training task is the same as that of the model training task of the previous round of training, and the type of the base model of the model to be trained is a machine learning model, the second electronic equipment performs incremental training on the model to be trained in a machine incremental learning mode according to the verified training sample data set to obtain an end-side model;
and if the type of the model training task is the same as that of the model training task of the previous round of training, and the type of the base model of the model to be trained is a non-machine learning model, the second electronic equipment performs incremental training on the model to be trained in a neural network incremental learning mode according to the verified training sample data set to obtain the end-side model.
Further, in some possible implementation manners of the fifth aspect, the performing, by the second electronic device, incremental training on the model to be trained in a neural network incremental learning manner according to the verified training sample data set to obtain the end-side model, including:
if the model to be trained meets the first type of condition, the second electronic equipment conducts incremental training on the model to be trained through small sample learning and convolutional layer updating according to the verified training sample data set to obtain an end-side model; and if the model to be trained meets the second type of conditions, the second electronic equipment performs incremental training on the model to be trained through knowledge distillation and convolutional layer solidification according to the verified training sample data set to obtain the end-side model.
In some possible implementations of the fifth aspect, the second electronic device may also perform a data simulation process. The data simulation process may include:
the second electronic equipment acquires sample data at the end side; the second electronic equipment performs data simulation according to the end-side sample data to generate training data and cloud verification data used for constructing a cloud cross verification data set; the second electronic equipment uploads the cloud verification data to the first electronic equipment; and the second electronic equipment constructs a training sample data set according to the end-side sample data and the training data.
In some possible implementations of the fifth aspect, the method may further include: the first electronic equipment determines a training result of each model to be fused according to a preset full training condition, wherein the training result comprises full training and insufficient training.
At this time, the process of allocating, by the first electronic device, the fusion weight to each model to be fused according to the cross validation result may include: the first electronic equipment determines a first weight of each model to be fused according to a cross validation result; the first electronic equipment adjusts the first weight according to the training result aiming at each model to be fused to obtain a second weight; and the first electronic equipment takes the second weight of each model to be fused as a fusion weight.
In some possible implementation manners of the fifth aspect, the determining, by the first electronic device, the training result of each model to be fused according to a preset sufficient training condition may include: if the number of training times of the end side of the model to be fused is larger than a preset number threshold value and the data volume of the end side of the model to be fused is larger than a preset threshold value, the first electronic device determines that the training result of the model to be fused is full training; and if the number of training times of the end side of the model to be fused is smaller than a preset number threshold value and/or the data volume of the end side of the model to be fused is smaller than a preset threshold value, the first electronic equipment determines that the training result of the model to be fused is insufficient training.
The end-side training count is the number of times the model to be trained has been trained on the second electronic device, and the end-side data volume is the amount of local training sample data on the second electronic device.
In some possible implementation manners of the fifth aspect, the adjusting, by the first electronic device, the first weight according to the training result to obtain the second weight may include: if the training result of the model to be fused is full training, the first electronic device adds a preset value to the first weight to obtain the second weight; and if the training result of the model to be fused is insufficient training, the first electronic device subtracts the preset value from the first weight to obtain the second weight.
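A minimal sketch of the weight adjustment by training sufficiency: a model counts as fully trained when both its end-side training count and its end-side data volume exceed their thresholds, and its first weight is raised by a preset value, otherwise lowered. The threshold and delta values are placeholders, not values from the original.

```python
def adjust_weight(first_weight, train_count, data_size,
                  count_threshold=10, data_threshold=1000, delta=0.05):
    """Return the second weight: first weight plus delta for a fully
    trained model, minus delta for an insufficiently trained one."""
    fully_trained = train_count > count_threshold and data_size > data_threshold
    return first_weight + delta if fully_trained else first_weight - delta
```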
In some possible implementations of the fifth aspect, the determining, by the first electronic device, the first weight of each model to be fused according to the cross-validation result may include: the first electronic device adds the cross validation results of the models to be fused to obtain a cross validation result sum; and the first electronic device takes the ratio of the cross validation result of each model to be fused to the sum of the cross validation results as the first weight of the model to be fused.
In some possible implementations of the fifth aspect, the process of assigning, by the first electronic device, a fusion weight to each model to be fused according to the cross-validation result may include: the first electronic equipment adds the cross validation results of each model to be fused to obtain a cross validation result sum; and the first electronic equipment takes the ratio of the cross validation result of each model to be fused to the sum of the cross validation results as the fusion weight of the model to be fused.
In some possible implementations of the fifth aspect, the cross-verifying the model to be fused by using the cross-verification dataset by the first electronic device, and obtaining the cross-verification result may include: the first electronic device divides the cross-validation dataset into sub-cross-validation datasets; the first electronic device respectively uses each sub-cross-validation dataset to validate each model to be fused to obtain sub-validation results, wherein one sub-cross-validation dataset corresponds to one sub-validation result; and the first electronic device obtains the cross validation result of each model to be fused according to the sub-validation results of that model.
In some possible implementations of the fifth aspect, the method may further include: the method comprises the steps that a first electronic device obtains cloud verification data uploaded by a second electronic device, wherein the cloud verification data are generated by the second electronic device through data simulation of a training sample; the first electronic device constructs a cross validation data set according to the cloud validation data and the cloud data, wherein the cloud data is data stored in the cloud local.
In some possible implementations of the fifth aspect, after the first electronic device performs cross-validation on the model to be fused by using the cross-validation dataset to obtain a cross-validation result, the method may further include: if the model structure of the model to be fused is different from the model structure of the model to be fused trained in the previous round, the first electronic equipment determines a target cloud model from the pre-stored cloud models according to the model structure of the model to be fused; the method comprises the steps that model verification is conducted on a target cloud model through first electronic equipment; if the verification is passed, the first electronic device takes the target cloud model as an updated cloud model; and if the model structure of the model to be fused is the same as that of the model to be fused trained in the previous round, the first electronic equipment performs the step of distributing fusion weight to each model to be fused according to the cross validation result.
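A sketch of the structure-change handling above, under two assumptions not in the original: pre-stored cloud models are keyed by their model structure, and model verification is a caller-supplied predicate. The `"fuse"` sentinel merely signals that normal weighted fusion should proceed.

```python
def resolve_cloud_model(fused_structure, prev_structure, stored_models, verify):
    """If the structure of the models to be fused changed since the previous
    round, look up a matching pre-stored cloud model and verify it; if the
    structure is unchanged, continue with weight assignment and fusion."""
    if fused_structure != prev_structure:
        target = stored_models.get(fused_structure)
        if target is not None and verify(target):
            return target        # verified target becomes the updated cloud model
        return None              # no usable pre-stored model found
    return "fuse"                # structures match: proceed to weighted fusion
```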
In some possible implementation manners of the fifth aspect, the performing, by the first electronic device, model weighted fusion on the to-be-fused models according to the fusion weight of each to-be-fused model, and the obtaining of the updated cloud model may include: the method comprises the steps that first electronic equipment determines whether a model to be fused needs to be retrained or not according to preset retraining conditions; if retraining is needed, the first electronic equipment trains the model to be fused in a weighted federal meta-learning mode according to the fusion weight to obtain a fusion model; performing model verification on the fusion model, and if the verification is passed, taking the fusion model as an updated cloud model; if the retraining is not needed, the first electronic equipment performs weighted fusion on the model to be fused according to the fusion weight to obtain a fusion model; and performing model verification on the fusion model, and if the verification is passed, taking the fusion model as an updated cloud model.
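The plain weighted-fusion branch (no retraining) can be sketched as a per-parameter weighted average, assuming each model is given as a name-to-value parameter dictionary with identical keys; the weighted federal meta-learning branch and the final model verification are not sketched here.

```python
def weighted_fuse(models, weights):
    """Fuse models to be fused by a weighted average of each parameter,
    using the fusion weight assigned to each model."""
    fused = {}
    for name in models[0]:
        fused[name] = sum(w * m[name] for m, w in zip(models, weights))
    return fused
```

Used with the normalized weights from the cross-validation step, this yields the candidate updated cloud model that is then submitted to model verification.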
In a sixth aspect, an embodiment of the present application provides a model training apparatus based on federal learning, where the apparatus includes a first electronic device and a second electronic device, and the apparatus is configured to implement the functions of the model training method based on federal learning in the fifth aspect.
In a seventh aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the method according to any one of the first aspect or the second aspect.
In an eighth aspect, embodiments of the present application provide a federal learning based model training system, which may include a first electronic device and a second electronic device.
The first electronic device is configured to perform the model training method according to any one of the first aspect, and the second electronic device is configured to perform the model training method according to any one of the second aspect.
In a ninth aspect, embodiments of the present application provide a computer-readable storage medium, which stores a computer program that, when executed by a processor, implements the method according to any one of the first aspect or the second aspect.
In a tenth aspect, embodiments of the present application provide a chip system, where the chip system includes a processor, the processor is coupled with a memory, and the processor executes a computer program stored in the memory to implement the method according to the first aspect or the second aspect. The chip system can be a single chip or a chip module consisting of a plurality of chips.
In an eleventh aspect, embodiments of the present application provide a computer program product, which, when run on an electronic device, causes the electronic device to perform the method of any one of the first aspect or the second aspect.
It is to be understood that, for the beneficial effects of the third aspect to the eleventh aspect, reference may be made to the description of the first aspect or the second aspect, and details are not repeated here.
Drawings
FIG. 1 is a block diagram illustrating a model training system architecture according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a data simulation process provided in an embodiment of the present application;
fig. 3 is a schematic diagram of end-side data validity verification provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of an end-side model training process provided by an embodiment of the present application;
fig. 5 is a schematic diagram of a device-cloud collaboration process provided in an embodiment of the present application;
FIG. 6 is a schematic diagram of a model training process provided in an embodiment of the present application;
FIG. 7 is a schematic diagram of a model training process provided by an embodiment of the present application;
FIG. 8 is another schematic diagram of a model training process provided by an embodiment of the present application;
FIG. 9 is a further schematic diagram of a model training process provided in an embodiment of the present application;
FIG. 10 is a schematic illustration of model training provided by an embodiment of the present application;
fig. 11 is a schematic diagram of regression prediction effect provided in the embodiment of the present application;
fig. 12 is a schematic diagram of a picture recommendation provided in an embodiment of the present application;
fig. 13 is an interaction diagram of an end-side device and a cloud device provided in an embodiment of the present application;
fig. 14 is a schematic view of an electronic device according to an embodiment of the present application.
Detailed Description
The inventor has found that, in the prior art, when a model is trained through federal learning, the cloud server does not perform reliability verification on the models uploaded by client devices before model fusion; an averaging update strategy is adopted during fusion, and different weights are not assigned to the models according to their confidence levels. As a result, the robustness of the new model obtained by fusion at the cloud server is poor, and the effect of the cloud aggregation model cannot be guaranteed.
In addition, in the existing federal learning, when the client device trains the model by using the local training data, the validity of the local training data is not verified, and the model obtained by training is not verified. Therefore, the effect of uploading the model to the cloud cannot be guaranteed, and the robustness of the new model fused by the cloud server is poor.
In addition, in the existing federal learning, the number of training samples of local training data of the client device is small, so that the training efficiency is low, and the model is difficult to converge quickly.
To address the above problems in conventional federal learning, an embodiment of the present application provides a model training scheme based on federal learning. In the embodiment of the application, the cloud device verifies the reliability of the models uploaded by client devices before model fusion, so as to judge the credibility and contribution degree of each model; calculates the weight corresponding to each model according to the reliability verification result, that is, assigns a different weight to each model according to its confidence level; and finally performs weighted fusion on the models according to their corresponding weights to obtain a new cloud aggregation model. In this way, the robustness of the new model obtained by cloud aggregation can be improved.
Further, in this embodiment of the application, the client device may perform data validity verification on the local training data, and then train the model using the training data after validity verification. After the model training is completed, the client device can also perform model verification on the trained model, and after the verification is passed, the trained model is uploaded to the cloud. Therefore, the effect of uploading the model to the cloud can be ensured, and the robustness of the new model with the cloud integration is further improved.
Further, in the embodiment of the application, the client device may perform data enhancement on a small number of training samples to generate a large number of training samples through simulation, and then train the model by using the training samples generated through data simulation, so that the model training efficiency is improved, and the trained model is more personalized.
In order to better describe the technical solutions of the embodiments of the present application, the following will explain the relevant contents in detail.
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application.
The following first describes a system architecture to which embodiments of the present application may relate.
Referring to fig. 1, which is a schematic block diagram of a model training system architecture provided in an embodiment of the present application, the system includes a cloud device 11 and client devices 12, where the cloud device 11 and the client devices 12 are communicatively connected through a communication network 13, which may be, for example, the internet. The number and type of client devices 12 may be arbitrary. Typically there are at least two client devices 12, that is, at least two client devices may participate in model training. The type of client device may be, but is not limited to, a mobile phone, a tablet, a vehicle-mounted terminal, a computer, and the like; by way of example and not limitation, the client devices 12 in fig. 1 include a mobile phone, a tablet, and the like. The cloud device 11 is typically a cloud server. The cloud device 11 may send a model to be trained to the client devices 12. The model is generally an AI model, and the type of the AI model may be arbitrary; by way of example and not limitation, it may be a deep learning model or a machine learning model. More specifically, the AI model may differ according to the actual application scenario. Because the AI model may provide personalized services for the user, it may also be referred to as a user personalization model. For example, in a voice wake-up scenario, the AI model is a voice wake-up model; in a charging management and control scenario, the AI model is a charging management and control model; and in a news recommendation scenario, the AI model is a news recommendation model.
It should be noted that the model to be trained may be the model updated by the cloud device after the previous round of model training. That is to say, the cloud device and the client devices may perform multiple rounds of model training; in one round, the cloud device issues the model obtained by the previous round of fusion and updating to the client devices as the model to be trained. After a client device trains the model to be trained by using local data, it uploads the trained model to the cloud, and the cloud device performs model fusion and updating according to the models uploaded by the client devices to obtain an updated model. This cycle repeats for one or more rounds of model training. Of course, in the first round of model training, the model to be trained may be an initial model with universality that is trained in advance by the cloud device.
For example, in a voice wake-up scenario, the model to be trained is a voice wake-up model. Before the first round of model training, the cloud server may collect a large amount of legally compliant voice wake-up data, where the voice wake-up data may include voice data of multiple users, collected on the premise of obtaining user authorization. The cloud trains the constructed voice wake-up model based on the collected voice wake-up data, and takes the trained model as the initial voice wake-up model. The initial voice wake-up model recognizes the wake-up voice of standard Mandarin Chinese well. However, for Mandarin spoken with a dialect or local accent, the voice wake-up effect may be poor. Alternatively, in some cases where there is a large amount of background noise, the initial voice wake-up model may not work well. After the cloud issues the initial voice wake-up model to a user's mobile phone, the mobile phone can use the initial voice wake-up model to perform wake-up voice recognition, and in this process, after obtaining explicit user authorization, the mobile phone may collect end-side data related to voice wake-up. The end-side data may include voice data corresponding to successful voice wake-up, voice data corresponding to failed voice wake-up followed by successful wake-up by other means (e.g., successful face wake-up after a voice wake-up failure), and so on.
After receiving the model to be trained issued by the cloud device, the client device may train the model to be trained by using the training sample data set on the end side.
For example, in a news recommendation scenario, the model to be trained is a news recommendation model, which can recommend news matching the user's preferences according to data such as the user's news browsing records. In this case, after obtaining explicit user authorization, the mobile phone may collect relevant end-side data in a data dotting manner, where the relevant end-side data may include the period of time when the user browses news (e.g., morning or evening), the news category (e.g., sports news or entertainment news), and the like. After obtaining explicit user authorization, the mobile phone can form training sample data for the news recommendation model from the relevant end-side data.
In some embodiments, after collecting a small amount of training sample data, the client device may compose a training sample data set based on the small amount of sample data, and train the model to be trained using the training sample data set. However, when the model is trained by using the training sample data set including a small number of samples, the model training efficiency is low due to less sample data, and it is difficult to quickly converge to the personalized model meeting the personalized requirements of the user.
In other embodiments, to improve model training efficiency and allow the model to converge quickly to a personalized model, the client device 12 may, after collecting a small amount of sample data, perform data enhancement on it to generate a large amount of training sample data through data simulation. A training sample data set is then formed based on the training sample data generated by data simulation, and that set is used for training the model to be trained. In this way, a large amount of training sample data is generated through data enhancement and then used for training, so that model training efficiency can be improved, the model can converge rapidly, and the trained model can be more personalized.
By way of example and not limitation, the following is presented in conjunction with the data simulation process schematic shown in FIG. 2.
As shown in fig. 2, the client device obtains an initial small amount of sample data from an end-side data pool, which may include, but is not limited to, historical behavior data, dotting data, and the like. The client device can then perform preliminary data analysis on this sample data, where preliminary analysis may refer to operations such as classification, clustering, or regression. Based on the analysis result, the user behavior on the client device side is modeled with reinforcement learning, and the resulting model is used to generate AI simulation data. The generated data can be added to the training data set as training data for end-side model training, or added to the cloud verification data, which can later be used to construct a cloud cross-validation data set. Next, whether the amount of training data is sufficient is judged: if not, AI simulation data is generated again; if so, the training data is preprocessed to obtain the end-side data, and data preparation is complete.
It should be noted that, by performing data simulation on a small amount of sample data, training sample data with more sample data is generated, so that the training sample data set on one side of the client device is expanded, the model training efficiency on one side of the client device is further improved, and the model trained on one side of the client device is more personalized.
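As a minimal illustration of the simulation loop above, the sketch below keeps generating synthetic samples from a handful of seed samples until the target data volume is reached. The Gaussian jitter is purely an assumption standing in for the reinforcement-learning-based generator described in the text.

```python
import random

def simulate_training_data(seed_samples, target_size, noise_scale=0.05):
    """Hypothetical sketch of the data-simulation loop of fig. 2: keep
    generating synthetic samples from a small seed set until the training
    data volume is judged sufficient."""
    simulated = list(seed_samples)
    while len(simulated) < target_size:           # "is the data volume sufficient?"
        base = random.choice(seed_samples)        # pick a real end-side sample
        # perturb each numeric feature slightly to create a new sample
        simulated.append([x + random.gauss(0.0, noise_scale) for x in base])
    return simulated

seeds = [[0.2, 0.7], [0.9, 0.1]]                  # a "small amount" of end-side data
train_set = simulate_training_data(seeds, target_size=100)
print(len(train_set))                             # 100
```

A real system would replace the jitter with the learned user-behavior model, but the sufficiency check and the append-until-full loop are the same.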
In addition, after a small amount of sample data is obtained, or a large amount of sample data is generated through data simulation, data validity verification can be performed on the training sample data before the model is trained, so as to obtain an optimal training data set; the model is then trained using that optimal set. It should be understood that the training sample data may be sample data obtained after data enhancement, or sample data without data enhancement.
Data validity verification on training sample data may include, but is not limited to, data classification, redundant data removal, and noise data repair. Data classification means classifying the training sample data to identify the valid data, redundant data, and noise data within it. Redundant data refers to data duplicated in the data set, i.e., multiple identical records. Noise data refers to meaningless data in the data set, generally data containing errors or anomalies. Valid data refers to data meeting the data quality specification; its quality can be evaluated through the data's integrity, consistency, and accuracy.
Redundant data may be removed, and noise data may be repaired, converting it into valid data. Finally, an optimal training data set is formed from the valid data, which may include both valid data obtained by repairing noise data and valid data identified during data classification.
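A minimal sketch of the steps just described, assuming simple stand-ins for the application-specific pieces: exact duplicates are treated as redundant data and dropped, while a caller-supplied predicate and repair function handle noise data.

```python
def verify_validity(samples, is_noise, repair):
    """Sketch of data validity verification: drop redundant (duplicate)
    records, repair noisy ones, and keep the valid data as the optimal
    training set. `is_noise` and `repair` are assumed, application-specific
    callables."""
    seen, optimal = set(), []
    for s in samples:
        if s in seen:                 # redundant data: an exact repeat
            continue
        seen.add(s)
        if is_noise(s):               # noise data: repair into valid data
            s = repair(s)
        optimal.append(s)
    return optimal

# toy example: negative readings are treated as noise, repaired by abs()
data = [1.0, 2.0, 2.0, -3.0]
optimal_set = verify_validity(data, is_noise=lambda x: x < 0, repair=abs)
print(optimal_set)  # [1.0, 2.0, 3.0]
```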
Of course, after the training sample data is obtained and before data validity verification is performed, a data cleaning operation can be applied to preliminarily clean the training sample data, screening out data that is obviously erroneous, incomplete, or wrongly formatted.
By way of example and not limitation, the following is described in connection with the end-side data validity verification diagram shown in fig. 3.
As shown in fig. 3, after the client device acquires the end-side data, data cleaning may be performed first. The end-side data may be a large amount of sample data obtained by data enhancement, or a small amount of sample data without data enhancement. Then end-side data validity verification is performed. This process may include classifying the training sample data in an active-learning manner to identify the valid data, redundant data, and noise data within it. Specifically, in the active-learning process, supervised or unsupervised learning may be applied depending on whether labels are present: whether the sample data has a label is judged; if so, the data is classified through supervised learning; if not, the unlabeled data is first automatically labeled through unsupervised learning and then classified to obtain a data classification result. In a specific application, the labels of the training sample data may be embodied as 0 and 1. For example, when the training sample data is voice data in a voice wake-up scenario, one label (e.g., 1) indicates that the sample is voice data corresponding to a voice wake-up, and the other (e.g., 0) indicates that it is not.
After the data classification result is obtained, whether redundant data exists in the end-side data can be judged; if so, the redundant data is stored in a data cache region. If not, whether noise data exists is judged; if so, the noise data is repaired and then whether valid data exists is judged; if there is no noise data, whether valid data exists is judged directly. If valid data exists in the end-side data, an end-side training set, which is the optimal training sample data set, is formed from it. If no valid data exists, the end-side data is stored in the data cache region.
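The labelled/unlabelled branch of fig. 3 can be sketched as follows. The (features, label) pair format and the threshold-based auto-labeller are placeholder assumptions standing in for the supervised and unsupervised learning steps.

```python
def label_and_classify(samples):
    """Sketch of the active-learning branch in fig. 3: labelled samples pass
    through as-is (stand-in for supervised classification); unlabelled ones
    are auto-labelled first (stand-in for unsupervised learning). Labels
    follow the assumed voice wake-up convention: 1 = wake-up speech,
    0 = non-wake-up speech."""
    labelled = []
    for features, label in samples:
        if label is None:
            # placeholder auto-labeller: high mean feature energy -> wake-up
            label = 1 if sum(features) / len(features) > 0.5 else 0
        labelled.append((features, label))
    return labelled

samples = [([0.9, 0.8], 1), ([0.1, 0.2], None)]
labelled = label_and_classify(samples)
print(labelled)  # [([0.9, 0.8], 1), ([0.1, 0.2], 0)]
```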
It should be noted that the end-side data in fig. 3 may refer to training sample data for training the end-side model, and the end-side data in fig. 2 may include cloud verification data for constructing a cloud cross-verification data set and training sample data for training the end-side model. Of course, in some other embodiments, the end-side data in fig. 3 may also include cloud verification data used for constructing a cloud cross-validation dataset and training sample data used for training the end-side model, that is, in addition to performing data validity verification on the training sample data, data validity verification may also be performed on the cloud verification data.
It should be noted that, before model training, data validity verification is performed on training sample data to obtain an optimal training data set, and then the optimal training data set is used to train the model to ensure the effect of the model uploaded to the cloud by the client device as much as possible, so that the robustness of the new cloud-fused model is further improved.
After the client device obtains the set of training sample data, the model may be trained using the training sample data. At this time, the training sample data set may be the end-side training set in fig. 3, that is, the training sample data set used in the end-side model training is a data set subjected to data validity verification. Of course, the training sample data set may also be a data set without data validity verification. The training sample data set may be a data set obtained by data enhancement, or may be a data set without data enhancement. In a specific application, the training process of the end-side model may be different according to different training sample data sets. It will be appreciated that the model training approach on the client device is incremental training.
By way of example and not limitation, the following is described in connection with the end-side model training process schematic shown in FIG. 4.
As shown in fig. 4, after the client device acquires the end-side training set, it first determines whether the model training task has changed, which can be done by checking whether the task has increased or decreased. Specifically, the current model training task is compared with the previous one. For example, if the previous training task was a three-class classification task and the current one is a four-class classification task, the current model training task is determined to have increased.
If the model training task has changed, model incremental training is performed in a multi-task incremental learning manner to obtain a trained model. The multi-task incremental learning algorithm may be, for example, LwF (Learning without Forgetting) or EWC (Elastic Weight Consolidation).
If the model training task has not changed, the type of the base model is judged, and different incremental learning modes are adopted accordingly. Base model types may include deep learning models and machine learning models. If the base model is a machine learning model, incremental training is performed in a machine-learning incremental training manner to obtain a trained model; if it is not a machine learning model, the model is incrementally trained in a neural-network incremental learning manner.
In the neural-network incremental learning mode, the difficulty of updating the model can be judged first, specifically by examining the structure of the model and the size of the training sample data. If the model is difficult to update, it is incrementally trained through meta-learning with convolutional-layer updating; conversely, if the model is easy to update, it is incrementally trained through knowledge distillation with convolutional-layer freezing.
When judging how difficult the model is to update, a sample-data-size threshold and a layer-count threshold can be preset: the size of the training data is judged by comparing the amount of training sample data with the sample-data-size threshold, and the complexity of the model structure is judged by comparing the number of layers in the model to be trained with the layer-count threshold. For example, if the model to be trained has only 3 fully-connected layers, its structure is judged to be simple; if it has 50 fully-connected layers, its structure can be judged to be complex. Generally, if the model structure is complex and the amount of training sample data is small, the model is judged difficult to update; conversely, if the structure is simple and the data amount is large, the model is judged easy to update.
It should be noted that if the end-side training set is a data set obtained by data enhancement, the training set contains a large amount of training sample data, so the model can be judged easy to update, and model incremental training is performed using knowledge distillation with convolutional-layer freezing. If the end-side training set has not undergone data enhancement, the training set contains less training sample data and the model can be judged difficult to update, so incremental training is performed through meta-learning with convolutional-layer updating; specifically, few-shot learning within meta-learning can be used. Of course, whether the model is difficult to update must also be judged in combination with whether the model structure is complex.
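The routing logic of fig. 4 can be condensed into a single decision function. The concrete thresholds below are illustrative assumptions, since the text only states that they are preset.

```python
def choose_incremental_strategy(task_changed, base_model_type,
                                n_layers, n_samples,
                                layer_threshold=10, sample_threshold=1000):
    """Sketch of the incremental-training routing in fig. 4."""
    if task_changed:
        return "multi-task incremental learning (e.g. LwF / EWC)"
    if base_model_type == "machine_learning":
        return "machine-learning incremental training"
    # neural-network incremental learning: judge update difficulty
    complex_model = n_layers > layer_threshold
    small_data = n_samples < sample_threshold
    if complex_model and small_data:      # hard to update
        return "meta-learning + convolutional-layer update"
    return "knowledge distillation + convolutional-layer freezing"

# complex model (50 layers) trained on little data -> judged hard to update
print(choose_incremental_strategy(False, "deep_learning", 50, 100))
# meta-learning + convolutional-layer update
```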
Of course, in other embodiments, the end-side model training process may differ from the above; for example, the judgment of whether the task has changed, or of how difficult the model is to update, may be omitted. By contrast, the training process of fig. 4 adopts different incremental training modes depending on whether the task has changed, so the model can still be incrementally trained when the task changes. In addition, different incremental training modes are adopted according to the base model type, so that even with a small number of samples, model incremental training can proceed through meta-learning, i.e., few-shot learning with convolutional-layer updating, to obtain a trained model. In other words, the end-side training procedure of fig. 4 selects the optimal or most appropriate incremental training mode according to the end-side model type and training task, further improving training efficiency and effect and enabling the model to converge quickly.
After performing end-side incremental training, the client device can verify the trained model and upload it to the cloud only after it passes verification; this ensures the quality of the uploaded model as much as possible and further improves the robustness of the new model fused at the cloud. Model verification means evaluating the trained model on a corresponding verification data set to obtain a model verification result. When the result reaches a preset condition, verification is judged to have passed; the end-side personalized model of fig. 4 is obtained and uploaded to the cloud. Conversely, when the result does not reach the condition, verification is judged to have failed, and the end-side training set can be reused for further model incremental training.
In a specific application, the model verification result is generally expressed as the accuracy of the model, and a corresponding accuracy threshold is preset: if the model's accuracy exceeds the threshold, verification is judged to have passed; otherwise, it is judged to have failed. For example, in a voice wake-up scenario with a threshold of 90%, the incrementally trained voice wake-up model is evaluated on the verification data set and achieves a wake-up success rate of 91%; since 91% is greater than 90%, verification passes, and the model can be uploaded to the cloud.
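A sketch of this accuracy gate, with the 90% threshold taken from the example:

```python
def passes_verification(model_accuracy, threshold=0.90):
    """Upload gate sketch: the trained end-side model is uploaded only if
    its accuracy on the verification set exceeds the preset threshold."""
    return model_accuracy > threshold

print(passes_verification(0.91))  # True  -> upload to cloud
print(passes_verification(0.88))  # False -> retrain on the end-side set
```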
Of course, in some other embodiments, after obtaining the trained end-side model, the client device may also directly upload the trained model to the cloud without performing model verification.
After the client device obtains, through end-side model training, a personalized model that can be uploaded, it uploads the model to the cloud device. The cloud device can count the received personalized models and perform model fusion once their number reaches a certain threshold.
After the cloud device obtains a sufficient amount of personalized models, reliability cross-validation can be performed on the personalized models based on the cross-validation data set, then the fusion weight of each personalized model is determined according to the model reliability cross-validation result, and finally model weighting fusion is performed according to the fusion weight corresponding to each personalized model to obtain a new cloud model after fusion updating.
In some embodiments, the cloud cross-validation data set may be formed from cloud verification data generated by the client devices during data simulation together with cloud data. Here, the cloud verification data is the cloud verification data shown in fig. 2. The cloud data is generated by the cloud device according to the parameters or characteristic information of the personalized models uploaded by the client devices: the cloud device receives the personalized model uploaded by each client device, generates corresponding cloud data from each model's characteristics or parameters together with data pre-stored in the cloud, and constructs the cross-validation data set from the generated cloud data and the cloud verification data produced by the client devices.
The data pre-stored by the cloud may refer to data used by the cloud for training the initial model. For example, in a voice wake scenario, the model to be trained is a voice wake model. The cloud may first collect a large amount of speech data as a training data set for training the initial speech arousal model. At this time, the training data set used by the cloud to train the initial voice wakeup model is data pre-stored by the cloud.
In other embodiments, the client device may not perform the data simulation process and generate the corresponding cloud verification data. At this time, the cross validation data set of the cloud may be formed by only the cloud data. The cloud data is data generated by the cloud device according to parameters or characteristic information of the personalized model uploaded by the client device and data pre-stored in the cloud.
In still other embodiments, the cross-validation data set of the cloud may also be composed of only data pre-stored by the cloud.
Compared with the prior art, constructing the cross-validation data set from both the cloud verification data generated by the client devices and the cloud data generated by the cloud enriches the scenarios and complexity of model validation, further improving the robustness of the new model obtained by cloud fusion.
After the cross-validation data set is constructed, the cloud device can use it to perform model-reliability cross-validation on each personalized model and obtain a cross-validation result. The process may include: dividing the constructed cross-validation data set into a plurality of sub-validation data sets, and validating each end-side personalized model on each sub-validation data set to obtain a per-set validation result; then, for each end-side personalized model, computing a combined result from the per-set validation results and using it as that model's cross-validation result.
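A sketch of this per-model procedure, assuming an `evaluate` callable that returns the model's success rate on one sub-validation set:

```python
def cross_validate(evaluate, sub_validation_sets):
    """Sketch of model-reliability cross-validation: evaluate one end-side
    personalized model on each sub-validation set, then combine the per-set
    results (here, by arithmetic mean) into a single cross-validation score."""
    results = [evaluate(subset) for subset in sub_validation_sets]
    return sum(results) / len(results)

# toy example: 3 sub-validation sets with known per-set success rates
per_set_rate = {"set1": 0.90, "set2": 0.80, "set3": 0.70}
score = cross_validate(lambda name: per_set_rate[name], ["set1", "set2", "set3"])
print(round(score, 2))  # 0.8
```

This mirrors the worked voice wake-up example later in the text, where per-set rates of 90%, 80%, and 70% average to 80%.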
Model-reliability cross-validation means cross-validating the models in order to identify those with higher reliability. For example, in a voice wake-up scenario, the end-side personalized model is a voice wake-up model, and the cross-validation result is expressed as a voice wake-up success rate. After each model's success rate is obtained through cross-validation, the models with higher success rates are selected for weighted fusion. By way of example and not limitation, with an accuracy threshold of 95%, the voice wake-up models with a success rate of 95% or higher are selected for weighted fusion. Alternatively, the models can be selected by a quantity threshold; by way of example and not limitation, the 100 models with the highest voice wake-up success rates are selected for weighted fusion.
Of course, after the model-reliability cross-validation, the cloud device may also skip screening the end-side personalized models by cross-validation result and instead use all of them for model weighted fusion.
After the cross validation data set is used by the cloud device for cross validation of model reliability, a corresponding cross validation result can be obtained, then the fusion weight of each personalized model is calculated according to the cross validation result, and finally model weighting fusion is carried out according to the fusion weight, so that an updated new model is obtained.
Of course, in some other embodiments, besides calculating the model fusion weight from the cross-validation result, the personalized models may also be statistically analyzed, with the fusion weight determined jointly from the statistical analysis result. The statistical analysis may count, for each client device's uploaded personalized model, the number of end-side training iterations, the size of the end-side training data set, and so on, to obtain a statistical analysis result. After obtaining this result, the cloud device can determine the fusion weight of each end-side personalized model by combining the statistical analysis result with the model-reliability cross-validation result.
That is, the fusion weight of the end-side personalized model may be determined only according to the model reliability cross-validation result, or may be determined by combining the model reliability cross-validation result and the statistical analysis result.
The process of determining the fusion weights of the end-side personalized model based only on the cross-validation result of model reliability is described below.
After constructing the cloud cross-validation data set and obtaining the end-side personalized models uploaded by the client devices, the cloud device divides the cross-validation data set into a plurality of sub-validation data sets; the number of sub-validation data sets can be set according to actual needs and the application scenario. Each end-side personalized model is then validated on each sub-validation data set to obtain a validation result. Since there are multiple sub-validation data sets, each end-side personalized model has multiple validation results; these are combined by weighted or arithmetic averaging, and the average is used as the model's cross-validation result.
After the model cross-validation result of each end-side personalized model is obtained, the end-side personalized model for model weighted fusion can be screened out according to the preset screening conditions. The screening condition may be a number threshold or a model accuracy threshold. For example, according to the level of the model cross-validation result, the top 100 end-side personalized models are selected, or the end-side personalized models with the model accuracy higher than 90% are selected, and the selected models are used for model weighted fusion.
After the models for weighted fusion have been screened out based on the model cross-validation results, each selected model may be assigned a corresponding weight based on its cross-validation result.
For example, in a voice wakeup scenario, the cloud acquires 1000 voice wakeup models, and divides the cloud cross validation dataset into 3 sub validation datasets, where each sub validation dataset includes voice data of a user.
The 1000 voice wake-up models are validated on each of the 3 sub-validation data sets to obtain their voice wake-up success rates, which serve as the model validation results. Each voice wake-up model thus has 3 corresponding success rates, one per sub-validation data set.

For convenience of calculation, suppose the 3 success rates of one voice wake-up model are 90%, 80%, and 70%. Their average, (90% + 80% + 70%) / 3 = 240% / 3 = 80%, is taken as that model's voice wake-up success rate; that is, the model's voice wake-up success rate is 80%.

In the same way, the voice wake-up success rates of all 1000 voice wake-up models are calculated.
After the voice wake-up success rates of the 1000 voice wake-up models are calculated, the models to be used for weighted fusion can be screened out of the 1000 according to their success rates; the screened models are those with high reliability. For convenience of calculation, the 10 models with the highest success rates are screened out for model fusion. Their voice wake-up success rates are 95%, 96%, 94%, 90%, 92%, 92%, 96%, 90%, 90% and 90%, respectively.

Corresponding weights are then assigned to the 10 voice wake-up models based on these 10 success rates. For simplicity of calculation, each model's weight is its success rate divided by the sum of all 10 success rates. The sum is 95% + 96% + 94% + 90% + 92% + 92% + 96% + 90% + 90% + 90% = 925%; the weights are therefore 95/925, 96/925, 94/925, 90/925, 92/925, 92/925, 96/925, 90/925, 90/925 and 90/925, i.e., approximately 0.103, 0.104, 0.102, 0.097, 0.099, 0.099, 0.104, 0.097, 0.097 and 0.097. With these weights assigned, the 10 voice wake-up models can then be fused by weighting according to their different weights to obtain a new voice wake-up model.
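The weight assignment in this example can be reproduced in a few lines: the ten success rates listed above are normalized by their sum.

```python
def fusion_weights(success_rates):
    """Normalize each model's success rate by the sum of all selected
    models' success rates, so the fusion weights sum to 1."""
    total = sum(success_rates)
    return [r / total for r in success_rates]

rates = [95, 96, 94, 90, 92, 92, 96, 90, 90, 90]   # the 10 screened models
weights = fusion_weights(rates)
print([round(w, 3) for w in weights])
# [0.103, 0.104, 0.102, 0.097, 0.099, 0.099, 0.104, 0.097, 0.097, 0.097]
```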
After the voice wake-up success rates of the 1000 voice wake-up models are calculated, instead of screening models by success rate as described above, all 1000 voice wake-up models may be used for weighted fusion. In that case, each of the 1000 models is assigned a corresponding weight; the process is similar to that for the 10 models above, i.e., the sum of all models' success rates is computed, and each model's weight is the ratio of its success rate to that sum. The detailed process is not repeated here.
After the determination of the fusion weights of the end-side personalized model only based on the model cross-validation results is described, the determination of the fusion weights of the end-side personalized model in combination with the model cross-validation results and the statistical analysis results is described next below.
After the cloud acquires the end-side personalized models, statistical analysis can be performed on them to obtain a statistical analysis result. By way of example and not limitation, the number of training iterations and the amount of training sample data of each end-side personalized model are counted, and whether each model is sufficiently trained is determined according to a preset sufficient-training condition. The condition can be set according to actual needs and the application scenario. For example, the condition may be: the number of training iterations is greater than or equal to 1000, and the amount of training sample data is greater than or equal to 100. That is, if a certain end-side personalized model meets both criteria, it is considered sufficiently trained; otherwise, it is considered insufficiently trained.
After the cloud constructs the cloud cross-validation data set, the weights of the end-side personalized models can be calculated according to the process shown above for determining fusion weights from the model cross-validation result alone. Here, however, the weights are further adjusted according to whether each model is sufficiently trained, yielding the final fusion weights.
Adjusting the weights according to training sufficiency means: if a model is sufficiently trained, a preset value is added to the fusion weight calculated from its cross-validation result; conversely, if it is insufficiently trained, the preset value is subtracted.
For example, in a voice wakeup scenario, the cloud acquires 1000 voice wakeup models, and divides the cloud cross validation dataset into 3 sub validation datasets, where each sub validation dataset includes voice data of a user.
The 1000 voice wake-up models are validated on each of the 3 sub-validation data sets to obtain their voice wake-up success rates, which serve as the model validation results. Each voice wake-up model thus has 3 corresponding success rates, one per sub-validation data set.

For convenience of calculation, suppose the 3 success rates of one voice wake-up model are 90%, 80%, and 70%. Their average, (90% + 80% + 70%) / 3 = 240% / 3 = 80%, is taken as that model's voice wake-up success rate; that is, the model's voice wake-up success rate is 80%.

In the same way, the voice wake-up success rates of all 1000 voice wake-up models are calculated.
After the voice wake-up success rates of the 1000 voice wake-up models are calculated, the models to be used for weighted fusion can be screened out of the 1000 according to their success rates; the screened models are those with high reliability. For convenience of calculation, the 10 models with the highest success rates are screened out for model fusion. Their voice wake-up success rates are 95%, 96%, 94%, 90%, 92%, 92%, 96%, 90%, 90% and 90%, respectively.

Corresponding weights are then assigned to the 10 voice wake-up models based on these 10 success rates. For simplicity of calculation, each model's weight is its success rate divided by the sum of all 10 success rates. The sum is 95% + 96% + 94% + 90% + 92% + 92% + 96% + 90% + 90% + 90% = 925%; the weights are therefore 95/925, 96/925, 94/925, 90/925, 92/925, 92/925, 96/925, 90/925, 90/925 and 90/925. Finally, the weights of the 10 voice wake-up models are obtained: approximately 0.103, 0.104, 0.102, 0.097, 0.099, 0.099, 0.104, 0.097, 0.097 and 0.097.
Then, whether the 10 voice wake-up models have been sufficiently trained is judged from the statistical analysis results. Suppose the conditions for sufficient training are: the number of training runs is at least 1000 and the amount of training sample data is at least 100. Judged against these conditions, the models with success rates 95%, 96%, 94%, 90% and 92% (the first five) are found to be fully trained, while the models with success rates 92%, 96%, 90%, 90% and 90% (the last five) are not.
The weight of each fully trained voice wake-up model is increased by 0.001, and the weight of each insufficiently trained model is decreased by 0.001. The weights adjusted according to training sufficiency are: 0.104, 0.105, 0.103, 0.098, 0.100, 0.098, 0.103, 0.096, 0.096 and 0.096. These are the final fusion weights of the 10 voice wake-up models, and model weighted fusion is subsequently performed with the adjusted fusion weights.
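By way of example and not limitation, the weight normalization and the training-sufficiency adjustment described above can be sketched as follows. This is an illustrative Python sketch; the function name is an assumption, while the success rates, the ±0.001 adjustment and the resulting weights follow the worked example.

```python
def fusion_weights(success_rates, fully_trained, delta=0.001):
    """Normalize success rates into weights, then raise the weight of each
    sufficiently trained model by delta and lower the others by delta."""
    total = sum(success_rates)
    weights = [r / total for r in success_rates]
    return [w + delta if ok else w - delta
            for w, ok in zip(weights, fully_trained)]

# The 10 screened models; per the example, the first five are fully trained.
rates = [95, 96, 94, 90, 92, 92, 96, 90, 90, 90]
trained = [True] * 5 + [False] * 5
final = fusion_weights(rates, trained)
# rounded to 3 decimals: 0.104, 0.105, 0.103, 0.098, 0.100,
#                        0.098, 0.103, 0.096, 0.096, 0.096
```

Note that after the ±0.001 adjustment the weights no longer sum exactly to 1; in the example this small deviation is ignored for simplicity.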
Of course, in a specific application, the adjustment value applied to the weight according to whether a model is sufficiently trained may be set according to the actual application scenario.
After the voice wake-up success rates of the 1000 voice wake-up models have been calculated, besides the weight assignment described above, the models need not be screened by success rate at all; instead, all 1000 voice wake-up models can be used for model weighted fusion. In that case each of the 1000 models must be assigned a corresponding weight. The process is similar to that for the 10 models above: the sum of the voice wake-up success rates of all models is calculated, and the ratio of each model's success rate to that sum gives its weight. The weights are then adjusted according to the preset sufficient-training conditions and the statistical analysis results to obtain the final fusion weights. The detailed process is not repeated here.
After the cloud device determines the fusion weights of the end-side personalized models by the above calculation method, model weighted fusion can be performed according to these weights to obtain a new cloud model.
In other embodiments, after cross-validating model reliability, the cloud device may further determine whether the structure of the personalized model uploaded by the client device has changed. If the model structure has not changed, model weighted fusion is performed. If it has changed, the model structure is analyzed, specifically its depth and width; a cloud model search is then performed according to that structural analysis, and the model found by the search is used as the new cloud model. In this way, a new cloud model can still be generated, by means of the cloud model search technique, when the structure of the end-side model has changed.
It should be noted that, in fig. 4, if the client device determines that the model training task has changed, it performs incremental training on the model using multitask incremental learning, which may change the structure of the end-side model.
In other embodiments, after obtaining the weighted-fused model, the cloud device may perform model verification on it; if verification passes, the model is used as the new cloud model, and if verification fails, whether the structure of the end-side personalized model has changed may be judged again.
By way of example and not limitation, the following is presented in conjunction with the end cloud coordination process diagram shown in fig. 5.
As shown in fig. 5, the diagram is divided into a cloud side 51 and an end side 52: the part above the dotted line is the cloud side 51, and the part below it is the end side 52. The cloud side 51 refers to the cloud device side, and the end side 52 refers to the client device side. Specifically, the cloud side generally comprises a cloud software platform and cloud devices, which work together to provide cloud services; for example, the cloud side may refer to the server side as opposed to the client side. The end side generally refers to the terminal device side, for example a user's mobile phone or computer. The cloud side and the end side are connected through a communication network, generally the internet; for example, a mobile phone acting as a client communicates with the server side over the internet.
The end side holds the end-side model and cloud verification data. The end-side model is the model, obtained by the client device through model training, that can be uploaded to the cloud; it may be a model that has passed model verification or one that has not been verified. The cloud verification data may be data generated by the client device through data simulation on a small amount of sample data; the generated cloud verification data is uploaded to the cloud.
Of course, in other embodiments, if the client device does not perform the data simulation process, there is no cloud verification data. In that case, the cloud-side cross-validation data set may consist only of cloud data generated by the cloud.
After acquiring the end-side models uploaded by the end side, the cloud side can store them in the cloud's end-side model pool. Cloud data is generated according to the feature information carried by the end-side models and the cloud storage data, and a cross-validation data set for model reliability verification is constructed from this cloud-generated data and the cloud verification data uploaded by the end side. The cloud storage data refers to data stored in advance by the cloud. The cloud device then uses the constructed cross-validation data set to cross-validate the reliability of the models. Meanwhile, the cloud device also performs training statistical analysis on the end-side models. The fusion weight of each end-side model is determined by combining the reliability cross-validation result with the statistical analysis result.
After the cloud side performs reliability cross-validation on an end-side model, it can judge whether the structure of the end-side model has changed. If the structure has changed, the structure can be analyzed, specifically the depth and width of the model, for example how many convolutional layers and how many fully connected layers it has. A cloud model search is then carried out to find a model with a better effect to serve as the new cloud model. If the structure of the end-side model has not changed, model weighted fusion is carried out by combining the end-side model training statistical analysis with the model reliability cross-validation.
In model weighted fusion, whether the model needs to be retrained is judged according to preset conditions; the retraining conditions are mainly set according to the training task and the model characteristics. For structurally simple models, the model obtained by direct weighted fusion or weighted averaging already performs well, so no retraining is needed, as with random forests. For structurally complex models, direct weighted fusion or weighted averaging yields poor results, so weighted retraining is required, as with deep learning models.
The cloud side determines whether the model needs to be retrained: if so, the model is trained with weighted federated meta-learning; if not, it is weighted-fused or weighted-averaged using the determined model fusion weights. Weighted federated meta-learning means the following: according to the fusion weights of the end-side personalized models, their model parameters are aggregated, weighted by the corresponding fusion weights, to obtain new model parameters; these new parameters are taken as the initial model parameters for federated meta-learning, training then proceeds from these initial parameters, and the trained model is used as the new cloud model.
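By way of example and not limitation, the weighted aggregation of end-side model parameters described above can be sketched as follows. This illustrative Python sketch treats each model's parameters as a flat list of numbers; the names are assumptions, not the patent's implementation.

```python
def weighted_aggregate(param_sets, weights):
    """Weighted aggregation of per-model parameter vectors: the i-th new
    parameter is the fusion-weight-weighted sum of the models' i-th parameters."""
    n = len(param_sets[0])
    return [sum(w * p[i] for p, w in zip(param_sets, weights))
            for i in range(n)]

# Two toy end-side models with fusion weights 0.6 and 0.4:
init_params = weighted_aggregate([[1.0, 2.0], [3.0, 4.0]], [0.6, 0.4])
# init_params is approximately [1.8, 2.8]; these would serve as the
# initial parameters for the federated meta-learning training step.
```

In practice the parameters would be tensors per layer rather than flat lists, but the weighting is applied element-wise in the same way.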
After the cloud side obtains a model through model weighted fusion or through cloud model search, it can perform model verification on the obtained model; if verification passes, the model is used as the new cloud model, and if verification fails, the flow can return to the step of judging whether the end-side model structure has changed. After the cloud side obtains the new cloud model, it can issue the new cloud model to the end side, and the end side can continue incremental training on it.
It should be noted that multiple rounds of model training may be performed between the client device and the cloud device. In one round of model training, the client device may perform multiple end-side model training.
To better describe the system architecture and corresponding flow that may be involved in embodiments of the present application, reference will be made below to a model training process diagram shown in fig. 6.
As shown in fig. 6, the cloud 61 issues the model 62 to be trained to the client devices 64; the models 62 received by the plurality of client devices 64 are all the same. After receiving the model 62 issued by the cloud 61, a client device 64 may perform an end-side self-learning process to obtain a trained model, then perform model effect verification and other processing to obtain a personalized model 65 that can be uploaded to the cloud 61. Because each client device 64 trains the model 62 with its own local training sample data, the resulting personalized models 65 differ. The personalized models 65 are uploaded to the cloud 61, and the cloud 61 performs model fusion updating on them to obtain the fusion-updated cloud model 63.
It should be noted that the end-side self-learning process may be the above-mentioned process in which the end-side device trains the model issued by the cloud using the local training sample data set. The method can comprise the processes of end-side data collection, end-side data simulation generation, data validity verification, end-side incremental training, model effect verification and the like. The process of how the model is trained on the end side can be referred to the corresponding contents of fig. 2, fig. 3 and fig. 4 above.
The process of the cloud 61 performing cloud model fusion according to the personalized model 65 to obtain a new model can refer to the content corresponding to fig. 5 above, and details are not repeated here.
It should be noted that the client device 64 may train the model multiple times before obtaining the personalized model 65 uploaded to the cloud 61; one reason for multiple trainings may be that a trained model failed model verification.
The foregoing describes a system architecture that may be involved in the embodiments of the present application, and possible flows of the client device and the cloud device.
The following provides an exemplary description of a process flow that may be involved in embodiments of the present application.
First, a schematic block diagram of the model training process shown in fig. 7 will be described. As shown in fig. 7, it includes a cloud side 71 and an end side 72.
The end side 72 refers to the client device side, which may include an end-side data pool, end-side data preparation, end-side data validity verification, end-side model training, and the end-side model.
The end-side data pool is the place on the client device side for storing end-side data; it may include historical behavior data, dotting data, privacy data and the like recorded by the client device, and the data in the end-side data pool may be recorded by the end side only after obtaining legal and explicit user authorization.
End-side data preparation refers to obtaining end-side data based on the sample data in the end-side data pool. In some embodiments, end-side data with a large sample volume may be generated by data simulation: a small amount of sample data is taken from the end-side data pool and analyzed, end-side behavior is modeled through reinforcement learning, and the established model is then used for AI simulation data generation to obtain training data and cloud verification data, thereby obtaining the end-side data. The process of preparing end-side data by data simulation is shown in fig. 2 and is not repeated here. In other embodiments, if the client device does not perform the data simulation process, a small amount of sample data may be taken from the end-side data pool and preprocessed to obtain the end-side data.
End-side data validity verification means verifying the validity of the end-side data obtained in the end-side data preparation step. The data validity verification process is shown in fig. 3 above and is not repeated here. Of course, in some other embodiments this verification may be omitted; in that case, once end-side data preparation is complete, a training data set may be formed from the end-side data and used for end-side model training.
The end-side model training refers to training an end-side model issued by a cloud terminal by a client device side. The end-side model training process may be the model training process corresponding to fig. 4 above, and will not be described herein again.
After the end-side model (i.e., the user personalized model) is obtained by the end-side model training, the end-side 72 may coordinate each client device to upload the trained user personalized model to the cloud side through the end-cloud control platform.
The cloud side 71 includes judging whether the number of end-side models is sufficient, end-side model cross-validation, cloud-side model weighted fusion, and the like. Judging whether the number of end-side models is sufficient means that the cloud side continuously receives the end-side models uploaded by the client devices and counts the number received; this number is compared with a preset number threshold, and if it is greater than the threshold, the quantity is judged sufficient, otherwise insufficient. The preset number threshold may be set according to actual application requirements and is not limited here.
Upon determining that the number of end-side models is sufficient, the cloud side 71 may perform end-side model cross-validation using the constructed cloud-side cross-validation data set. The cloud model is then weight-fused according to the different weights assigned to the end-side models to obtain a new cloud model, which is issued to the client devices again. For the end-side model cross-validation, fusion-weight calculation and model weighted-fusion processes, refer to the corresponding contents above, not repeated here.
As can be seen from the above, in some embodiments the client device side may or may not perform the data simulation process. Correspondingly, the cloud device side can construct the cross-validation data set from the cloud verification data generated by the client devices together with data generated by the cloud, or from cloud-generated data alone.
The following introduces the scheme in which the client device side does not perform the data simulation process and the cloud device side establishes the cross-validation data set from data generated by the cloud itself.
Referring to fig. 8, another schematic diagram of the model training process is shown, which includes a cloud side 81 and an end side 82, as shown in fig. 8.
The flow of the end side 82 may include:
the client device obtains the end-side data, which may include user behavior data and may specifically be collected through data dotting and the like. The end-side data may be collected by the end side only after obtaining legal and explicit user authorization. After acquiring the end-side data, the client device may preprocess it, where preprocessing may include operations such as data cleaning. Next, the client device may perform end-side data validity verification on the preprocessed data; for the specific process, refer to the relevant contents corresponding to fig. 3, not repeated here. Of course, end-side model training may also proceed directly without data validity verification.
After performing data validity verification, the client device obtains a training data set and performs incremental training on the model based on it. For the model incremental training process, refer to the relevant contents of fig. 4, not repeated here. It should be noted that, since no data simulation is performed in this embodiment, the sample size of the training data set is small; in that case, based on the training process corresponding to fig. 4, the model is judged difficult to update, and few-shot learning from meta-learning is used for model incremental training.
After performing model incremental training, the client device can carry out model verification on the trained model; once verification passes, a personalized model is obtained and uploaded to the cloud-side model pool. Model verification on the client device side may mean: the model obtained after incremental training is verified with a validation data set to obtain a model verification result, which may be the accuracy of the model, such as the wake-up success rate of a voice wake-up model. If the accuracy of the model is higher than a preset threshold, model verification is considered passed; if it is lower, verification is considered failed. Models that pass verification are uploaded to the cloud; models that fail are not.
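By way of example and not limitation, the client-side verification check described above can be sketched as follows. The function name and the 0.9 threshold are illustrative assumptions for this sketch.

```python
def passes_validation(accuracy, threshold=0.9):
    """A model is uploaded to the cloud only if its validation accuracy
    (e.g. voice wake-up success rate) reaches the preset threshold."""
    return accuracy >= threshold

# A model with a 93% wake-up success rate passes; one with 85% does not
# and would be retrained on the end side before another upload attempt.
```

The threshold would in practice be set per training task, as with the other preset thresholds in this application.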
The flow of the cloud side 81 may include:
because the client device does not perform the data simulation process in this embodiment, no cloud verification data is generated for constructing the cloud cross-validation data set. In this case, the cloud device may construct the cross-validation data set from cloud data generated on the basis of the cloud storage data, where the cloud storage data refers to the aforementioned data pre-stored by the cloud. Of course, the cross-validation data set can also be constructed directly from the data stored in advance in the cloud.
Then, the cloud device can perform cross validation on the reliability of the end-side model by using the cross validation data set, count the training times, the training data amount and the like of the personalized model, and calculate the fusion weight of each personalized model by combining the cross validation result and the statistical analysis result. And then, the cloud equipment performs weighted average on the model parameters according to the fusion weight to obtain a new cloud model. The cloud-side process may refer to relevant contents corresponding to fig. 5, which is not described herein again.
In this embodiment, before model fusion, the cloud side performs reliability cross validation on the model uploaded on the end side according to the cloud cross validation data set, and different weights are assigned to different end side models according to the reliability cross validation result and the statistical analysis result, so that the robustness of a new model obtained by cloud aggregation is improved. Further, on the client side, through processes of end-side data validity verification, small sample learning, model verification and the like, the effect of the model uploaded to the cloud side can be guaranteed as far as possible, and further robustness of the new model obtained through cloud fusion is improved.
Of course, in other embodiments, the end-side model training process may not include the end-side data validity verification process of FIG. 7, or alternatively, may not include the client device-side model verification process of FIG. 7.
In other embodiments, the client device side may perform a data simulation process, and the cross-validation dataset of the cloud device side is formed by cloud-validation data generated by the client device and data generated by the cloud. This scheme will be described below in conjunction with yet another schematic of the model training process shown in FIG. 9.
As shown in fig. 9, it includes a cloud side 91 and an end side 92.
The flow of end side 92 may include:
the client device obtains the end-side data, which may include user behavior data collected by the end side after obtaining legal and explicit user authorization. The client device then performs data simulation on the acquired end-side data so as to generate a large amount of sample data from a small amount. The data generated by the simulation serves as training data and cloud verification data: the cloud verification data is uploaded to the cloud, and the training data undergoes data preprocessing. For the data simulation process, refer specifically to fig. 2 above, not repeated here.
After the training sample data generated by data simulation is subjected to data preprocessing, the client device can perform data validity verification on the training data subjected to data preprocessing to obtain a training data set. The process of data validity verification may specifically refer to fig. 3 above, and is not described herein again.
After obtaining the training data set, the client device may incrementally train the model based on the training data set. The model increment training process on the client side can refer to fig. 4 above, and is not described in detail here. It should be noted that, in this embodiment, a data simulation process is performed, so that the sample size of the training data set is large, and at this time, based on the model training process corresponding to fig. 4, it can be determined that the model is easy to update, and then model incremental training is performed by using a knowledge distillation and convolutional layer solidification method.
After model increment training, the client device can perform model verification on the trained model, and after the model verification is passed, an individualized model can be obtained and then uploaded to the cloud-side model pool.
The cloud side process may include:
in this embodiment, since the end side performs the data simulation process, the cloud receives the cloud verification data uploaded by the end side, which can be used to construct the cloud cross-validation data set. The cloud device may construct the cross-validation data set from the cloud verification data together with the cloud storage data, i.e. the data pre-stored by the cloud mentioned above; it may also construct it from the cloud verification data together with cloud-generated data, where the cloud-generated data is generated by the cloud according to the feature parameter information of the end-side models and the cloud pre-stored data. Only one of these options is shown in fig. 9.
After constructing the cross-validation data set, the cloud device uses it to cross-validate the reliability of the end-side models, counts the number of training runs, the amount of training data and so on of each end-side personalized model, and calculates the fusion weight of each end-side model by combining the cross-validation result with the statistical analysis result. The model parameters are then weighted-averaged according to the fusion weights to obtain a new cloud model. For the cloud-side fusion-weight calculation and model weighted-fusion processes, refer to fig. 5 above, not repeated here.
To better describe the corresponding embodiment of fig. 9, the following description is made with reference to another schematic diagram of model training shown in fig. 10.
As shown in fig. 10, the end-side data is expanded into a large amount of sample data through data simulation. Of the data generated by the simulation, one part is uploaded to the cloud to form the cloud verification data, and the other part forms the end-side training data. After the end side forms the training sample data through the data simulation process, it may perform data validity verification on the training sample data set (not shown in the figure), and then carry out end-side model incremental training. Because the data simulation process has been performed, the training sample volume is large and the model can be judged easy to update; incremental training is therefore carried out using convolutional-layer solidification and knowledge distillation. During the end-side model incremental training, a distillation loss and a regression loss are obtained. The distillation loss is weighted by λ and the regression loss by 1 - λ, and the total loss is computed from the two losses and their corresponding weights.
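By way of example and not limitation, the total-loss computation described above can be written as follows; the concrete value of λ and the loss values used below are illustrative assumptions.

```python
def total_loss(distillation_loss, regression_loss, lam):
    """Total loss = lam * distillation loss + (1 - lam) * regression loss."""
    return lam * distillation_loss + (1.0 - lam) * regression_loss

# With lam = 0.25, a distillation loss of 0.8 and a regression loss of 0.4:
loss = total_loss(0.8, 0.4, lam=0.25)
# loss is approximately 0.25 * 0.8 + 0.75 * 0.4 = 0.5
```

Choosing λ trades off how strongly the student model follows the teacher (distillation term) against how strongly it fits the end-side training labels (regression term).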
After the trained model is obtained at the end side, the model effect verification can be performed on the trained model. After the model effect verification is passed, the trained model can be uploaded to the cloud.
And after the cloud obtains the model uploaded at the end side, performing cross validation on the model by using a cross validation data set, and calculating the fusion weight of each model according to the cross validation result. And after the fusion weight of each model is obtained through weight calculation, performing model weighted fusion by using weighted federal meta-learning to obtain a cloud model after fusion updating.
It should be noted that, in this embodiment, compared with the embodiment corresponding to fig. 8, a difference is that the embodiment performs an end-side data simulation process, and the cross validation data set of the cloud includes cloud validation data generated by the end-side data simulation process.
Based on this, in this embodiment, before model fusion is performed on one side of the cloud, reliability cross validation is performed on the model uploaded on the end side according to the cloud cross validation data set, and different weights are assigned to different end side models according to the reliability cross validation result and the statistical analysis result, so that the robustness of a new model obtained by cloud aggregation is improved. Further, on the client side, through processes of validity verification of data on the client side, model verification and the like, the effect of the model uploaded to the cloud side can be guaranteed as far as possible, and the robustness of the new model obtained by cloud fusion is further improved.
In addition, in the embodiment, through the end-side data simulation process, the end-side training data set is expanded, the training efficiency of the end-side model is improved, and the end-side model obtained through training is more personalized. In addition, the cross validation data set comprises a cloud validation data set, so that the scenes and the complexity of model validation are enriched, and the robustness of a new model obtained by cloud fusion can be further improved.
It should be noted that the solutions shown in the embodiments corresponding to fig. 8 and fig. 9 above are only examples, and in a specific application, some steps may be added or reduced correspondingly, or some steps may be replaced by other implementations, so as to obtain a new solution, but these solutions through replacement or recombination all fall within the protection scope of the embodiments of the present application.
For example, the end-side model training process in figs. 8 and 9 may be the incremental training process of fig. 4. In other embodiments, the training process in figs. 8 and 9 may instead use the existing end-side training process of federated learning. By comparison, the model incremental training process corresponding to fig. 4 selects the optimal incremental training mode according to the model training task and the model type, which can further improve end-side training efficiency and training effect. Still further, the model incremental training process in fig. 4 may omit the judgment of whether the end-side training task has changed, and the like.
For example, in the cloud-side flows of figs. 8 and 9, the judgment of whether the model structure has changed may be omitted, or the model obtained by weighted fusion may not be verified. However, retaining the model-structure-change judgment still allows a new cloud model to be generated, by combining cloud model search, when the end-side model structure has changed; and adding model verification after model weighted fusion helps ensure the effect of the new cloud model as far as possible.
In other embodiments, the end-side flow may be as shown in fig. 8 or fig. 9, while the cloud-side flow uses the existing federated learning flow instead of the cloud-side flow corresponding to figs. 8 and 9. If only the end-side flow of fig. 8 or fig. 9 is used and the cloud-side flow is the existing federated learning one, then, because the end side performs the data validity verification process, the end-side model verification process and the model incremental training process corresponding to fig. 4, the effect of the end-side models uploaded to the cloud can still be ensured as far as possible, and the robustness of the new model obtained by cloud-side fusion can still be improved. By comparison, when the cloud side additionally performs reliability cross-validation on the end-side models and assigns different fusion weights to different end-side models according to the reliability cross-validation result and the statistical analysis result, the robustness of the new cloud-side fused model can be improved further.
Having introduced the system architecture provided by the embodiments of the present application and the related processes that may be involved on the client device and the cloud device, the following describes application scenarios that the embodiments of the present application may involve.
The embodiments of the present application apply to a wide range of scenarios and are not limited to any specific field or algorithm. By way of example and not limitation, the embodiments of the present application may be applied to biometric identification, for example, scenarios such as fingerprint identification, voiceprint identification, and Face ID unlocking; in some embodiments, with the combination of 5G (5th-Generation, fifth generation mobile communication technology) and IoT (Internet of Things) technology, they may also be applied to multi-terminal data ecosystem construction, AI capability improvement, and scenarios such as power consumption management, charging management, and voice wake-up. The related content in the voice wake-up scenario has been shown above; the processes for the other scenarios are similar and are not described herein again.
In the embodiments of the present application, continuous end-side self-learning makes the model's predictions increasingly accurate, and end-cloud cooperation yields a more robust model, so that the model can predict accurately even with a small amount of sample data. The regression prediction effect diagram shown in fig. 11 is described below; fig. 11 may correspond to the scheme of the embodiment of fig. 8 described above.
In the embodiment corresponding to fig. 8, no data simulation is performed on the client device side, so the amount of training sample data is small. In this case, the end-side self-learning process may include: end-side data collection, end-side data validity verification, end-side model incremental training, and model effect verification. Because the amount of training sample data is small, the model may be determined to be difficult to update; in that case, the model is incrementally trained using convolutional-layer updating and meta-learning. The end-cloud cooperation process is a process of model training performed jointly by the cloud and the end side: the cloud issues the model to be trained to the clients, each client device trains it through end-side self-learning and uploads the trained model to the cloud, and the cloud fuses the received models to obtain a new cloud model. Fig. 11 illustrates the technical effect on a regression prediction problem, in which there is a difference between the model's predicted value and the true value.
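The choice of incremental training mode under scarce data can be sketched as follows. This is an illustrative Python sketch only: the function name, the `model_type` argument, and the sample-count threshold of 100 are assumptions not given in the embodiment.

```python
def select_incremental_training_mode(num_samples, model_type,
                                     few_shot_threshold=100):
    """Pick an end-side incremental training mode (illustrative policy).

    `model_type` mirrors the mode selection by training task and model
    type described for fig. 4, but this sketch branches only on data
    volume: with few samples the full model is hard to update reliably,
    so only the convolutional layers are updated, combined with
    meta-learning; otherwise an ordinary full fine-tune is used.
    """
    if num_samples < few_shot_threshold:
        return ("conv_layer_update", "meta_learning")
    return ("full_fine_tune",)

print(select_incremental_training_mode(30, "cnn"))    # scarce data
print(select_incremental_training_mode(5000, "cnn"))  # ample data
```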
As shown in fig. 11, the striped bars are true values, while the blank and dotted bars are regression predictions: the blank bars are predictions before end-cloud cooperation, and the dotted bars are predictions after end-cloud cooperation. The height of a bar represents the magnitude of the value, so the height difference between a true-value bar and a prediction bar shows how far the prediction deviates from the truth. The horizontal axis of fig. 11 is a time axis, with time nodes such as July 1, August 1, September 1, October 1, November 1, and December 1 shown in the figure.
Two end-side self-learning processes are performed on August 1 and September 1, that is, the client device performs incremental training twice on the model to be trained issued by the cloud, and another end-side self-learning process is performed around October 1; in total, three end-side self-learning processes are performed between August 1 and November 1. After each round of end-side self-learning, the height difference between the true-value bar and the prediction bar becomes gradually smaller, that is, the regression prediction gets closer to the true value: with continual end-side self-learning, regression prediction accuracy keeps increasing. However, at some time points the prediction still deviates greatly from the true value, that is, the robustness of the model is low.
An end-cloud cooperation process is performed on November 1: multiple client devices upload their trained models to the cloud, the cloud performs reliability cross-validation on the received models, assigns different weights according to the cross-validation results, and finally performs weighted model fusion with the assigned weights to obtain a new cloud model. The cloud then issues the fused new model to the end side. When a client device performs regression prediction with the fused, updated model, the difference between the true value and the prediction stays stable and does not suddenly grow. That is, a more robust model is obtained through the end-cloud cooperation process, and this model can also predict accurately in scenarios with little data.
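The weighted fusion step can be sketched as follows. This is a minimal Python sketch in which each model is represented as a flat list of parameter values; normalising the cross-validation accuracies so they sum to one is only one plausible weighting rule, not a formula prescribed by the embodiment.

```python
def fusion_weights(cv_accuracies):
    """Map each end-side model's cross-validation accuracy to a
    fusion weight, normalised so the weights sum to 1."""
    total = sum(cv_accuracies)
    return [a / total for a in cv_accuracies]

def weighted_fuse(models, weights):
    """Element-wise weighted average of model parameters; every model
    is a flat parameter list of equal length."""
    return [sum(w * m[i] for m, w in zip(models, weights))
            for i in range(len(models[0]))]

# Three illustrative end-side models; the last validated poorly.
models = [[0.2, 0.8], [0.4, 0.6], [1.0, 0.0]]
weights = fusion_weights([0.9, 0.8, 0.3])
fused = weighted_fuse(models, weights)   # new cloud model parameters
```

A poorly validated model thus contributes less to the fused cloud model, which is what limits the damage an unreliable end-side model can do to cloud-side robustness.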
The following description refers to the picture recommendation diagram shown in fig. 12, which is based on the technical solutions of the embodiments shown in fig. 9 and fig. 10.

In the embodiments corresponding to fig. 9 and fig. 10, the end side performs a data simulation process, so the amount of end-side training sample data is large. The end-side self-learning process may include end-side data collection, data simulation, data validity verification, end-side model incremental training, model effect verification, and the like.
As shown in fig. 12, the leftmost and middle phones each display four pictures: picture 221, picture 222, picture 223, and picture 224. Before end-side self-learning, the picture recommended by the system is picture 222, while the picture the user selects is picture 221. From this selection behavior, the user's picture-category preference is obtained, namely that the user prefers smiling pictures of the category of picture 221. After the user's operation data is obtained, a large amount of end-side operation data is generated through the data simulation process; training sample data is formed from this end-side operation data, and the picture recommendation model is then trained with it.
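The data simulation step can be sketched as follows. This is an illustrative Python sketch only: jitter-based perturbation of the few collected samples is one plausible way to realise data simulation, and the function name, jitter magnitude, and sample format are assumptions.

```python
import random

def simulate_samples(seed_samples, n_target, jitter=0.05, rng=None):
    """Expand a small set of collected (features, label) samples into a
    large simulated training set by adding small random perturbations
    to the features while keeping the label unchanged."""
    rng = rng or random.Random(0)
    simulated = []
    while len(simulated) < n_target:
        feats, label = rng.choice(seed_samples)
        noisy = [f + rng.uniform(-jitter, jitter) for f in feats]
        simulated.append((noisy, label))
    return simulated

# Two real user-preference samples expanded into 200 training samples.
seeds = [([0.1, 0.9], "smile"), ([0.8, 0.2], "neutral")]
train_set = simulate_samples(seeds, n_target=200)
```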
After end-side self-learning, the system recommends picture 221 to the user through the picture recommendation model. That is, end-side self-learning yields a picture recommendation model that better reflects the user's personal preferences.
The end side uploads the trained models to the cloud, and the cloud aggregates the large number of end-side personalized models into a new picture recommendation model. Picture recommendation is then performed with the model after end-cloud cooperation, that is, the picture recommendation model after cloud fusion and update. For example, the rightmost phone in the figure displays four pictures: picture 225, picture 226, picture 227, and picture 228. After end-cloud cooperation, the system recommends picture 225 to the user.
Note that picture 225 is a smiling picture at a side angle, while picture 221 is a smiling picture at a frontal angle. The contrast between picture 225 and picture 221 is intended to show that the model obtained through end-cloud cooperation maintains its effect in complex scenes: fig. 12 shows that, even when the angle changes, the model after end-cloud cooperation can still recommend smiling pictures that match the user's preference.
The model training process will be described below with reference to the schematic interaction diagram of the end-side device and the cloud device shown in fig. 13.
As shown in fig. 13, the interactive process may include the following steps:
step S1301, the cloud device sends the model to be trained to the end-side device.
Step S1302, the end-side device performs data simulation on a small number of collected samples to generate a large number of training samples.
It can be understood that the data simulation process of the end-side device may refer to the corresponding content above, and will not be described herein again.
Step S1303, the end-side device uploads the cloud verification data generated through data simulation to the cloud device.
Step S1304, the end-side device constructs a training sample data set from the training samples generated by data simulation, and performs data validity verification on the training sample data set to obtain a verified training sample data set.
It is understood that the data validity verification process may refer to the corresponding contents above, and will not be described herein.
Step S1305, the end-side device trains the model to be trained with the verified training sample data set to obtain the end-side model, and performs model verification on the end-side model to obtain a verified end-side personalized model.
Step S1306, the end-side device uploads the end-side personalized model to the cloud device.

Step S1307, the cloud device constructs a cloud cross-validation data set from the cloud verification data uploaded by the end-side devices.

Step S1308, the cloud device performs reliability cross-validation on each end-side personalized model with the cross-validation data set to obtain a cross-validation result.

Step S1309, the cloud device performs statistical analysis on each end-side personalized model to obtain a statistical analysis result indicating whether the end-side personalized model has been sufficiently trained.

Step S1310, the cloud device assigns a fusion weight to each end-side personalized model according to the cross-validation result and the statistical analysis result.

Step S1311, the cloud device performs weighted model fusion according to the fusion weights and the end-side personalized models to obtain an updated cloud model.
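Steps S1308 to S1310, combining the cross-validation result with the statistical analysis result to assign fusion weights, can be sketched as follows. This is an illustrative Python policy only: down-weighting an under-trained model by a fixed penalty factor before normalisation is an assumption, as the embodiment does not prescribe a particular combination rule.

```python
def assign_fusion_weights(cv_scores, fully_trained, penalty=0.5):
    """Combine each end-side model's cross-validation score with a
    statistical 'sufficiently trained' flag: models flagged as
    under-trained are multiplied by `penalty` before the weights are
    normalised to sum to 1 (illustrative policy)."""
    raw = [s * (1.0 if ok else penalty)
           for s, ok in zip(cv_scores, fully_trained)]
    total = sum(raw)
    return [r / total for r in raw]

# Two equally accurate models, but the second is under-trained and is
# therefore down-weighted relative to the first.
weights = assign_fusion_weights([0.9, 0.9, 0.6], [True, False, True])
```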
It should be noted that, the same or similar contents in this embodiment and the above embodiments may be referred to each other, and are not described herein again.
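The end-side model verification of step S1305 can be sketched as follows. Accepting a retrained model only if it does not regress on a held-out validation set is an assumed acceptance criterion; the function and variable names here are illustrative.

```python
def accuracy(model, dataset):
    """Fraction of samples the model labels correctly; `model` is any
    callable mapping a feature list to a label."""
    hits = sum(1 for feats, label in dataset if model(feats) == label)
    return hits / len(dataset)

def verify_end_side_model(new_model, old_model, val_set, margin=0.0):
    """Accept (and upload) the retrained model only if it is at least
    as accurate as the current model on a held-out validation set."""
    return accuracy(new_model, val_set) >= accuracy(old_model, val_set) + margin

# Tiny illustrative validation set and two threshold "models".
val = [([1.0], "pos"), ([-1.0], "neg"), ([0.5], "pos"), ([-0.5], "neg")]
old = lambda feats: "pos" if feats[0] > 0.6 else "neg"   # misclassifies 0.5
new = lambda feats: "pos" if feats[0] > 0.0 else "neg"   # classifies all four
print(verify_end_side_model(new, old, val))  # retrained model passes
```

Gating uploads this way is what lets the end side ensure, as much as possible, the effect of the personalized models that reach the cloud.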
Embodiments of the present application also provide an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the model training method of any of the embodiments above.
In the embodiments of the present application, the first electronic device is a cloud device, generally a server. The second electronic device is a terminal device, which may be, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, an ultra-mobile personal computer (UMPC), a personal digital assistant (PDA), and the like.
By way of example and not limitation, as shown in fig. 14, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a Subscriber Identity Module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It is to be understood that the illustrated structure of the embodiment of the present application does not specifically limit the electronic device 100. In other embodiments of the present application, electronic device 100 may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 110 may include one or more processing units, such as: the processor 110 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors.
The controller may be, among other things, a neural center and a command center of the electronic device 100. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 110, thereby increasing the efficiency of the system.
In some embodiments, processor 110 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The I2C interface is a bi-directional synchronous serial bus that includes a serial data line (SDA) and a Serial Clock Line (SCL). In some embodiments, processor 110 may include multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, the charger, the flash, the camera 193, etc. through different I2C bus interfaces, respectively. For example: the processor 110 may be coupled to the touch sensor 180K via an I2C interface, such that the processor 110 and the touch sensor 180K communicate via an I2C bus interface to implement the touch functionality of the electronic device 100.
The I2S interface may be used for audio communication. In some embodiments, processor 110 may include multiple sets of I2S buses. The processor 110 may be coupled to the audio module 170 via an I2S bus to enable communication between the processor 110 and the audio module 170.
The PCM interface may also be used for audio communication, sampling, quantizing and encoding analog signals. In some embodiments, the audio module 170 and the wireless communication module 160 may be coupled by a PCM bus interface. Both the I2S interface and the PCM interface may be used for audio communication.
The UART interface is a universal serial data bus used for asynchronous communications. The bus may be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, a UART interface is generally used to connect the processor 110 with the wireless communication module 160. For example: the processor 110 communicates with a bluetooth module in the wireless communication module 160 through a UART interface to implement a bluetooth function.
MIPI interfaces may be used to connect processor 110 with peripheral devices such as display screen 194, camera 193, and the like. The MIPI interface includes a Camera Serial Interface (CSI), a Display Serial Interface (DSI), and the like. In some embodiments, processor 110 and camera 193 communicate through a CSI interface to implement the capture functionality of electronic device 100. The processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the electronic device 100.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal and may also be configured as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 110 with the camera 193, the display 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and the like.
The USB interface 130 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 130 may be used to connect a charger to charge the electronic device 100, to transmit data between the electronic device 100 and peripheral devices, or to connect earphones and play audio through them. The interface may also be used to connect other electronic devices, such as AR devices.
It should be understood that the interface connection relationship between the modules illustrated in the embodiments of the present application is only an illustration, and does not limit the structure of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The charging management module 140 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive charging input from a wired charger via the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive a wireless charging input through a wireless charging coil of the electronic device 100. The charging management module 140 may also supply power to the electronic device through the power management module 141 while charging the battery 142.
The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 and provides power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, battery state of health (leakage, impedance), etc. In some other embodiments, the power management module 141 may also be disposed in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may be disposed in the same device.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 150 may provide a solution including 2G/3G/4G/5G wireless communication applied to the electronic device 100. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a Low Noise Amplifier (LNA), and the like. The mobile communication module 150 may receive the electromagnetic wave from the antenna 1, filter, amplify, etc. the received electromagnetic wave, and transmit the electromagnetic wave to the modem processor for demodulation. The mobile communication module 150 may also amplify the signal modulated by the modem processor, and convert the signal into electromagnetic wave through the antenna 1 to radiate the electromagnetic wave. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the same device as at least some of the modules of the processor 110.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, etc.) or displays an image or video through the display screen 194. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be provided in the same device as the mobile communication module 150 or other functional modules, independent of the processor 110.
The wireless communication module 160 may provide a solution for wireless communication applied to the electronic device 100, including Wireless Local Area Networks (WLANs) (e.g., wireless fidelity (Wi-Fi) networks), bluetooth (bluetooth, BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), Infrared (IR), and the like. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering processing on electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, perform frequency modulation and amplification on the signal, and convert the signal into electromagnetic waves through the antenna 2 to radiate the electromagnetic waves.
In some embodiments, antenna 1 of electronic device 100 is coupled to mobile communication module 150 and antenna 2 is coupled to wireless communication module 160, so that electronic device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc. The GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a satellite based augmentation system (SBAS).
The electronic device 100 implements display functions via the GPU, the display screen 194, and the application processor. The GPU is a microprocessor for image processing, and is connected to the display screen 194 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
The electronic device 100 may implement a shooting function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like.
The ISP is used to process the data fed back by the camera 193. For example, when a photo is taken, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converting into an image visible to naked eyes. The ISP can also carry out algorithm optimization on the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 193.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP where it is converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, the electronic device 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
The digital signal processor is used for processing digital signals, and can process digital image signals and other digital signals. For example, when the electronic device 100 selects a frequency bin, the digital signal processor is used to perform fourier transform or the like on the frequency bin energy.
Video codecs are used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 may play or record video in a variety of encoding formats, such as: moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.
The NPU is a neural-network (NN) computing processor that processes input information quickly by using a biological neural network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. Applications such as intelligent recognition of the electronic device 100 can be realized through the NPU, for example: image recognition, face recognition, speech recognition, text understanding, and the like.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The processor 110 executes various functional applications of the electronic device 100 and data processing by executing instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The storage data area may store data (such as audio data, phone book, etc.) created during use of the electronic device 100, and the like. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (UFS), and the like.
The electronic device 100 may implement audio functions via the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as music playing, recording, etc.
The audio module 170 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
The speaker 170A, also called a "horn", is used to convert the audio electrical signal into an acoustic signal. The electronic apparatus 100 can listen to music through the speaker 170A or listen to a handsfree call.
The receiver 170B, also called "earpiece", is used to convert the electrical audio signal into an acoustic signal. When the electronic apparatus 100 receives a call or voice information, it can receive voice by placing the receiver 170B close to the ear of the person.
The microphone 170C, also referred to as a "mic", is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can input a sound signal into the microphone 170C by speaking with the mouth close to it. The electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which, in addition to collecting sound signals, can implement a noise reduction function. In other embodiments, the electronic device 100 may be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify sound sources, implement directional recording, and so on.
The headphone interface 170D is used to connect a wired headphone. The headset interface 170D may be the USB interface 130, or may be a 3.5mm open mobile electronic device platform (OMTP) standard interface, a cellular telecommunications industry association (cellular telecommunications industry association of the USA, CTIA) standard interface.
The pressure sensor 180A is used to sense a pressure signal and convert it into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194. There are many types of pressure sensor 180A, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates of conductive material; when a force acts on the pressure sensor 180A, the capacitance between the electrodes changes, and the electronic device 100 determines the strength of the pressure from the change in capacitance. When a touch operation acts on the display screen 194, the electronic device 100 detects the intensity of the touch operation through the pressure sensor 180A, and may also calculate the touched position from the sensor's detection signal. In some embodiments, touch operations applied to the same touch position but with different intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than a first pressure threshold acts on the SMS application icon, an instruction to view the message is executed; when a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the SMS application icon, an instruction to compose a new message is executed.
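The threshold behavior on the SMS icon can be sketched as follows. This is an illustrative Python sketch: the threshold value of 0.5 and the instruction names are hypothetical, as the embodiment specifies only that a first pressure threshold separates the two instructions.

```python
FIRST_PRESSURE_THRESHOLD = 0.5  # hypothetical normalised pressure value

def sms_icon_action(touch_pressure):
    """Map the touch intensity on the SMS application icon to an
    operation instruction, per the threshold behaviour above."""
    if touch_pressure < FIRST_PRESSURE_THRESHOLD:
        return "view_message"
    return "compose_new_message"

print(sms_icon_action(0.2))  # light press
print(sms_icon_action(0.8))  # firm press
```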
The gyroscope sensor 180B may be used to determine the motion attitude of the electronic device 100. In some embodiments, the angular velocity of the electronic device 100 about three axes (i.e., the x, y, and z axes) may be determined by the gyroscope sensor 180B. The gyroscope sensor 180B may be used for image stabilization during photographing. For example, when the shutter is pressed, the gyroscope sensor 180B detects the shake angle of the electronic device 100, calculates the distance the lens module needs to compensate according to the shake angle, and lets the lens counteract the shake of the electronic device 100 through reverse movement, thereby achieving image stabilization. The gyroscope sensor 180B may also be used for navigation and motion-sensing gaming scenarios.
The barometric pressure sensor 180C is used to measure air pressure. In some embodiments, the electronic device 100 calculates altitude from the barometric pressure values measured by the barometric pressure sensor 180C, to aid positioning and navigation.
The magnetic sensor 180D includes a Hall sensor. The electronic device 100 may use the magnetic sensor 180D to detect the opening and closing of a flip holster. In some embodiments, when the electronic device 100 is a flip phone, the electronic device 100 may detect the opening and closing of the flip cover according to the magnetic sensor 180D. Features such as automatic unlocking upon flip opening can then be set according to the detected open or closed state of the holster or of the flip cover.
The acceleration sensor 180E may detect the magnitude of acceleration of the electronic device 100 in various directions (typically along three axes), and can detect the magnitude and direction of gravity when the electronic device 100 is stationary. It can also be used to recognize the attitude of the electronic device, for applications such as landscape/portrait switching and pedometers.
The distance sensor 180F is used to measure distance. The electronic device 100 may measure distance by infrared or laser. In some embodiments, in a shooting scenario, the electronic device 100 may use the distance sensor 180F to measure distance for fast focusing.
The proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light outward through the light emitting diode and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, the electronic device 100 can determine that there is an object nearby; when insufficient reflected light is detected, it can determine that there is no object nearby. The electronic device 100 can use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear during a call, so as to automatically turn off the screen to save power. The proximity light sensor 180G may also be used in holster mode and pocket mode to automatically unlock and lock the screen.
The ambient light sensor 180L is used to sense the ambient light level. Electronic device 100 may adaptively adjust the brightness of display screen 194 based on the perceived ambient light level. The ambient light sensor 180L may also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 180L may also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket to prevent accidental touches.
The fingerprint sensor 180H is used to collect a fingerprint. The electronic device 100 can use the collected fingerprint characteristics to unlock with the fingerprint, access the application lock, take a photo with the fingerprint, answer an incoming call with the fingerprint, and so on.
The temperature sensor 180J is used to detect temperature. In some embodiments, the electronic device 100 implements a temperature processing strategy using the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, the electronic device 100 heats the battery 142 when the temperature is below another threshold, to avoid an abnormal shutdown of the electronic device 100 caused by low temperature. In still other embodiments, when the temperature is below a further threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
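The temperature processing strategy above amounts to a simple threshold policy. The following sketch uses made-up threshold values and action names; only the three-threshold structure comes from the description:

```python
def thermal_policy(temp_c, high=45.0, low=0.0, very_low=-10.0):
    """Illustrative temperature processing strategy.

    The thresholds (45, 0, -10 degrees Celsius) and action names are
    assumptions for illustration; the patent only describes the three
    threshold-triggered behaviors.
    """
    if temp_c > high:
        return "throttle_cpu"            # reduce nearby processor performance
    if temp_c < very_low:
        return "boost_battery_voltage"   # avoid abnormal low-temperature shutdown
    if temp_c < low:
        return "heat_battery"            # warm the battery 142
    return "normal"

print(thermal_policy(50.0))   # throttle_cpu
print(thermal_policy(-5.0))   # heat_battery
```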
The touch sensor 180K is also referred to as a "touch panel." The touch sensor 180K may be disposed on the display screen 194; together they form a touch screen, also called a "touchscreen." The touch sensor 180K is used to detect a touch operation acting on or near it, and can pass the detected touch operation to the application processor to determine the type of touch event. Visual output associated with the touch operation may be provided through the display screen 194. In other embodiments, the touch sensor 180K may be disposed on a surface of the electronic device 100 at a position different from that of the display screen 194.
The bone conduction sensor 180M may acquire a vibration signal. In some embodiments, the bone conduction sensor 180M may acquire the vibration signal of the vibrating bone mass of the human vocal part. The bone conduction sensor 180M may also contact the human pulse to receive a blood pressure pulsation signal. In some embodiments, the bone conduction sensor 180M may also be disposed in a headset to form a bone conduction headset. The audio module 170 may parse out a voice signal based on the vibration signal of the vocal-part bone mass acquired by the bone conduction sensor 180M, so as to implement a voice function. The application processor may parse out heart rate information based on the blood pressure pulsation signal acquired by the bone conduction sensor 180M, so as to implement a heart rate detection function.
The keys 190 include a power key, a volume key, and the like. The keys 190 may be mechanical keys or touch keys. The electronic device 100 may receive key input and generate key signal input related to user settings and function control of the electronic device 100.
The motor 191 may generate a vibration cue. The motor 191 may be used for incoming-call vibration cues as well as for touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing) may correspond to different vibration feedback effects, and the motor 191 may likewise produce different vibration feedback effects for touch operations applied to different areas of the display screen 194. Different application scenarios (e.g., time reminders, receiving messages, alarm clock, games) may also correspond to different vibration feedback effects. The touch vibration feedback effect may additionally support customization.
Indicator 192 may be an indicator light that may be used to indicate a state of charge, a change in charge, or a message, missed call, notification, etc.
The SIM card interface 195 is used to connect a SIM card. A SIM card can be connected to or disconnected from the electronic device 100 by inserting it into or removing it from the SIM card interface 195. The electronic device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 195 may support a Nano SIM card, a Micro SIM card, a SIM card, and so on. Multiple cards can be inserted into the same SIM card interface 195 at the same time; the types of the cards may be the same or different. The SIM card interface 195 may also be compatible with different types of SIM cards and with external memory cards. The electronic device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the electronic device 100 uses an eSIM, i.e., an embedded SIM card, which can be embedded in the electronic device 100 and cannot be separated from it.
The software system of the electronic device 100 may employ a layered architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture.
By way of example and not limitation, processor 110 in electronic device 100 may invoke and execute a model training program in internal memory 121 to implement the steps of the model training method in any of the embodiments above.
For example, if the electronic device is embodied as a cloud server, the internal memory 121 of the cloud server stores a model training program and various models. The models stored in the internal memory 121 may include, but are not limited to: the model to be trained, the personalized models uploaded from the end side, and the model obtained through cloud fusion and updating.
In a specific application, after receiving the end-side personalized models uploaded by the end-side devices and the cloud verification data uploaded from the end side, the cloud server can store both in the internal memory 121. When subsequently constructing the cloud cross-validation data set, the processor 110 of the cloud server may execute the model training program to read the cloud verification data from the internal memory 121, so as to construct the cross-validation data set from the cloud verification data. When performing cross-validation with the cross-validation data set, the processor 110 of the cloud server may execute the model training program stored in the internal memory 121 to cross-validate each end-side personalized model using the cross-validation data set to obtain a cross-validation result, and store the cross-validation result in the internal memory 121. Then, the processor 110 of the cloud server may further execute the model training program stored in the internal memory 121 to allocate fusion weights to the different end-side personalized models according to the cross-validation result, perform weighted fusion according to the fusion weight of each end-side personalized model to obtain the cloud fusion-updated model, and store that model in the internal memory 121.
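The weighted-fusion step the processor performs can be sketched as a parameter-wise weighted average of the uploaded models. The dict-of-arrays model representation and the function name below are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def fuse_models(models, weights):
    """Weighted fusion of end-side models: parameter-wise weighted average.

    models  -- list of dicts mapping parameter name -> np.ndarray
               (an assumed representation of each uploaded model)
    weights -- list of fusion weights, assumed to sum to 1
    """
    fused = {}
    for name in models[0]:
        fused[name] = sum(w * m[name] for w, m in zip(weights, models))
    return fused

# Two toy end-side models, each with a single parameter tensor "w".
m1 = {"w": np.array([1.0, 2.0])}
m2 = {"w": np.array([3.0, 4.0])}
fused = fuse_models([m1, m2], [0.25, 0.75])
print(fused["w"])  # [2.5 3.5]
```

The second model's larger fusion weight (0.75) pulls the fused parameters toward it, which is exactly the effect the cross-validation-based weighting is meant to produce.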
Of course, in some other embodiments, the functions of the internal memory 121 may also be implemented by an external memory; in this case, the processor 110 of the cloud server reads data on the external memory through the external memory interface 120.
For another example, the electronic device may be an end-side device, e.g., a cell phone, a tablet, or a wearable device. The end-side device includes a processor 110 and an internal memory 121, which may store: the model training program, the model to be trained issued by the cloud server, the collected training sample data, the trained model, and the like.
After receiving the model to be trained issued by the cloud server, the end-side device may write it into the internal memory 121. Similarly, after acquiring training sample data by means such as data dotting, the end-side device writes the acquired training sample data into the internal memory 121. The processor 110 of the end-side device implements the data simulation process, the data validity verification process, the end-side model incremental training process, the model verification process, and the like by executing the model training program stored in the internal memory 121. For these processes, reference may be made to the corresponding embodiments above; details are not repeated herein.
Taking the data validity verification process as an example, the processor 110 of the end-side device executes the model training program to read end-side data from the internal memory 121, and then classifies the end-side data to determine valid data, redundant data, and noise data; the redundant data are stored in the data buffer, and the noise data are repaired and then stored in the internal memory 121.
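The classification step of data validity verification might look like the following sketch. The criteria used here (a duplicate of an already-seen sample is redundant; a sample with a missing label is noise) are illustrative assumptions, since the patent does not fix the classification rules:

```python
def verify_data_validity(samples, seen):
    """Split end-side samples into valid, redundant, and noise sets.

    samples -- list of (features, label) pairs
    seen    -- set of sample keys observed so far (assumed dedup state)

    Criteria are illustrative: duplicates are redundant, unlabeled
    samples are noise, everything else is valid.
    """
    valid, redundant, noise = [], [], []
    for features, label in samples:
        key = (tuple(features), label)
        if label is None:
            noise.append((features, label))   # would be repaired, then stored
        elif key in seen:
            redundant.append((features, label))  # kept in the data buffer
        else:
            seen.add(key)
            valid.append((features, label))
    return valid, redundant, noise

samples = [([1, 2], "a"), ([1, 2], "a"), ([3, 4], None)]
valid, redundant, noise = verify_data_validity(samples, set())
print(len(valid), len(redundant), len(noise))  # 1 1 1
```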
In addition, the embodiment of the application also provides a model training device, which comprises a processing module and a storage module. The storage module may be configured to store a model training program and a model, and the processing module may be configured to execute the model training program stored on the storage module to implement the model training method in any of the above embodiments.
By way of example and not limitation, the storage module may be the internal memory 121 and the processing module may be the processor 110.
In some embodiments, the model training device may be a cloud server, or may be a device integrated on the cloud server. At this time, the processing module of the model training device can execute the model training program on the storage module to realize the cloud cross validation data set construction process, the model cross validation process, the model fusion weight distribution process, the model weighting fusion process and the like.
The storage module of the model training device stores the model training program and various models. The stored models may include, but are not limited to: the model to be trained, the personalized models uploaded from the end side, and the model obtained through cloud fusion and updating.
For example, after receiving the end-side personalized models and the cloud verification data, the model training apparatus may store both in the storage module. When subsequently constructing the cloud cross-validation data set, the processing module of the model training apparatus can execute the model training program to read the cloud verification data from the storage module, so as to construct the cross-validation data set from the cloud verification data. When performing cross-validation with the cross-validation data set, the processing module of the model training apparatus may execute the model training program stored on the storage module to cross-validate each end-side personalized model using the cross-validation data set to obtain a cross-validation result, and store the cross-validation result in the storage module. In addition, the processing module of the model training apparatus can also execute the model training program stored on the storage module to allocate fusion weights to the different end-side personalized models according to the cross-validation result, perform weighted fusion according to the fusion weight of each end-side personalized model to obtain the cloud fusion-updated model, and store that model in the storage module.
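One plausible weight-allocation scheme consistent with this description — each model's share of the summed cross-validation scores, then adjusted by a preset value according to whether the model was sufficiently trained — can be sketched as follows. The score values and the adjustment delta are illustrative assumptions:

```python
def fusion_weights(val_scores, sufficiently_trained, delta=0.05):
    """Allocate fusion weights from cross-validation scores.

    First weight: each model's score divided by the sum of all scores.
    Second weight: add delta for a sufficiently trained model, subtract
    delta otherwise (delta is an illustrative preset value).
    """
    total = sum(val_scores)
    first = [s / total for s in val_scores]
    second = [w + delta if ok else w - delta
              for w, ok in zip(first, sufficiently_trained)]
    return second

# Model A: score 0.9, sufficiently trained; model B: score 0.6, not.
w = fusion_weights([0.9, 0.6], [True, False])
print(w)  # first weights 0.6 and 0.4 become roughly 0.65 and 0.35
```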
In other embodiments, the model training apparatus may be an end-side device or an apparatus integrated on the end-side device. At this time, the processing module of the model training apparatus executes the model training program on the storage module to implement the above data simulation process, data validity verification process, end-side model increment training process, model verification process, and the like.
The storage module of the model training apparatus may store content including, but not limited to: the model training program, the model to be trained issued by the cloud, the collected training sample data, the trained model, and the like.
After receiving the model to be trained issued by the cloud server, the model training device can write it into the storage module. Similarly, after acquiring training sample data by means such as data dotting, the model training device writes the acquired training sample data into the storage module. The processing module of the model training device can implement the data simulation process, the data validity verification process, the end-side model incremental training process, the model verification process, and the like by executing the model training program stored on the storage module. For these processes, reference may be made to the corresponding embodiments above; details are not repeated herein.
Taking the data validity verification process as an example, the processing module of the model training device executes the model training program to read end-side data from the storage module, and then classifies the end-side data to determine valid data, redundant data, and noise data; the redundant data are stored in the storage module, and the noise data are stored in the storage module after noise repair.
It should be noted that, the same or similar points as those in the above embodiments may be referred to the corresponding contents in the above embodiments, and are not repeated herein.
The embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the model training method in the foregoing embodiments.
The embodiment of the present application further provides a computer program product which, when run on an electronic device, enables the electronic device to implement the steps of the model training method in the above embodiments.
An embodiment of the present application further provides a chip system, where the chip system includes a processor, the processor is coupled to a memory, and the processor executes a computer program stored in the memory to implement the model training method in any of the above embodiments. The chip system can be a single chip or a chip module consisting of a plurality of chips.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and are not to be understood as indicating or implying relative importance. Reference throughout this specification to "one embodiment" or "some embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in various places throughout this specification do not necessarily all refer to the same embodiment, but rather mean "one or more but not all embodiments" unless specifically stated otherwise.
In the above embodiments, the descriptions of the respective embodiments are focused, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in detail in a certain embodiment.
Finally, it should be noted that: the above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (17)

1. A method for model training based on federal learning, the method comprising:
the method comprises the steps that a first electronic device sends a model to be trained to a second electronic device;
the first electronic equipment receives a model to be fused sent by one or more second electronic equipment, wherein the model to be fused is obtained by the second electronic equipment through training the model to be trained according to a local training sample data set;
the first electronic equipment carries out cross validation on the model to be fused by using a cross validation data set to obtain a cross validation result;
the first electronic equipment distributes fusion weight to one or more models to be fused according to the cross validation result;
and the first electronic equipment performs model weighted fusion on the models to be fused according to the fusion weight of each model to be fused to obtain an updated cloud model.
2. The method of claim 1, wherein after the first electronic device receives the model to be fused sent by one or more of the second electronic devices, the method further comprises:
the first electronic equipment determines a training result of each model to be fused according to a preset sufficient training condition, wherein the training result comprises sufficient training and insufficient training;
the first electronic device distributes fusion weight to each model to be fused according to the cross validation result, and the method comprises the following steps:
the first electronic equipment determines a first weight of each model to be fused according to the cross validation result;
the first electronic equipment adjusts the first weight according to the training result aiming at each model to be fused to obtain a second weight;
the first electronic device takes the second weight of each model to be fused as the fusion weight.
3. The method of claim 2, wherein the determining, by the first electronic device, the training result of each model to be fused according to a preset sufficient training condition includes:
if the number of end-side training times of the model to be fused is greater than a preset number threshold and the end-side data amount of the model to be fused is greater than a preset threshold, the first electronic device determines that the training result of the model to be fused is sufficient training;
if the number of end-side training times of the model to be fused is smaller than the preset number threshold, and/or the end-side data amount of the model to be fused is smaller than the preset threshold, the first electronic device determines that the training result of the model to be fused is insufficient training;
wherein the number of end-side training times is the number of times the second electronic device has trained the model to be trained, and the end-side data amount is the amount of training sample data local to the second electronic device.
4. The method of claim 2, wherein the first electronic device adjusting the first weight according to the training result to obtain a second weight comprises:
if the training result of the model to be fused is sufficient training, the first electronic device adds a preset value to the first weight to obtain the second weight;
and if the training result of the model to be fused is insufficient training, subtracting the preset value from the first weight by the first electronic equipment to obtain the second weight.
5. The method of claim 2, wherein the first electronic device determining a first weight for each of the models to be fused based on the cross-validation results comprises:
the first electronic equipment adds the cross validation results of the models to be fused to obtain a cross validation result sum;
and the first electronic equipment takes the ratio of the cross validation result of each model to be fused and the sum of the cross validation results as a first weight of the model to be fused.
6. The method of claim 1, wherein the first electronic device assigns a fusion weight to each model to be fused according to the cross-validation result, comprising:
the first electronic equipment adds the cross validation results of the models to be fused to obtain a cross validation result sum;
and the first electronic equipment takes the ratio of the cross validation result of each model to be fused and the sum of the cross validation results as the fusion weight of the model to be fused.
7. The method of claim 1, wherein the first electronic device cross-verifies the model to be fused using a cross-verification dataset to obtain a cross-verification result, comprising:
the first electronic device dividing the cross-validation dataset into sub-cross-validation datasets;
the first electronic device verifies each model to be fused by using each sub-cross verification data set respectively to obtain sub-verification results, wherein one sub-cross verification data set corresponds to one sub-verification result;
and the first electronic equipment obtains the cross validation result of each model to be fused according to the sub validation result of each model to be fused.
8. The method of claim 1, wherein the method further comprises:
the first electronic device acquires cloud verification data uploaded by the second electronic device, wherein the cloud verification data are generated by the second electronic device through data simulation of a training sample;
the first electronic device constructs the cross validation data set according to the cloud validation data and cloud data, wherein the cloud data is data stored in a cloud local area.
9. The method of claim 1, wherein after the first electronic device cross-verifies the model to be fused using a cross-verification dataset, resulting in a cross-verification result, the method further comprises:
if the model structure of the model to be fused is different from that of the model to be fused trained in the previous round, the first electronic device determines a target cloud model from pre-stored cloud models according to the model structure of the model to be fused;
the first electronic equipment carries out model verification on the target cloud model;
if the verification is passed, the first electronic device takes the target cloud model as an updated cloud model;
and if the model structure of the model to be fused is the same as that of the model to be fused trained in the previous round, the first electronic device performs the step of distributing fusion weight to each model to be fused according to the cross validation result.
10. The method according to any one of claims 1 to 9, wherein the performing, by the first electronic device, model weighted fusion on the models to be fused according to the fusion weight of each model to be fused to obtain an updated cloud model, includes:
the first electronic equipment determines whether the model to be fused needs to be retrained according to a preset retraining condition;
if retraining is needed, the first electronic equipment trains the model to be fused in a weighted federal meta-learning mode according to the fusion weight to obtain a fusion model; performing model verification on the fusion model, and if the model passes the verification, taking the fusion model as the updated cloud model;
if not, the first electronic device performs weighted fusion on the model to be fused according to the fusion weight to obtain a fusion model; and performing model verification on the fusion model, and if the model passes the verification, taking the fusion model as the updated cloud model.
11. A method for model training based on federal learning, the method comprising:
the second electronic equipment receives the model to be trained sent by the first electronic equipment;
the second electronic equipment performs data validity verification on the training sample data set to obtain a verified training sample data set;
the second electronic equipment performs incremental training on the model to be trained according to the verified training sample data set to obtain an end-side model;
the second electronic device performs model verification on the end-side model;
and the second electronic equipment transmits the end-side model passing the model verification as the model to be fused to the first electronic equipment.
12. The method of claim 11, wherein the second electronic device performs data validity verification on the training sample data set to obtain a verified training sample data set, comprising:
the second electronic equipment performs data classification on the training sample data set, and determines first valid data, redundant data and noise data in the training sample data;
the second electronic device removing the redundant data;
the second electronic equipment repairs the noise data to obtain second effective data;
the second electronic device composes the verified training sample data set based on the first valid data and the second valid data.
13. The method of claim 11, wherein the second electronic device performs incremental training on the model to be trained according to the verified training sample data set to obtain an end-side model, and the method comprises:
if the type of the model training task is different from the type of the model training task of the previous round of training, the second electronic equipment performs incremental training on the model to be trained in a multi-task incremental learning mode according to the verified training sample data set to obtain the end-side model;
if the type of the model training task is the same as that of the model training task of the previous round of training, and the type of the base model of the model to be trained is a machine learning model, the second electronic device performs incremental training on the model to be trained in a machine incremental learning mode according to the verified training sample data set to obtain the end-side model;
and if the type of the model training task is the same as that of the model training task of the previous round of training, and the type of the base model of the model to be trained is a non-machine learning model, the second electronic equipment performs incremental training on the model to be trained in a neural network incremental learning mode according to the verified training sample data set to obtain the end-side model.
14. The method of claim 13, wherein the second electronic device performs incremental training on the model to be trained according to the verified training sample data set by means of neural network incremental learning to obtain the end-side model, and the method includes:
if the model to be trained meets the first class condition, the second electronic equipment performs incremental training on the model to be trained through small sample learning and convolutional layer updating according to the verified training sample data set to obtain the end-side model;
and if the model to be trained meets a second type of condition, the second electronic equipment performs incremental training on the model to be trained through knowledge distillation and convolutional layer solidification according to the verified training sample data set to obtain the end-side model.
15. The method of claim 11, wherein the method further comprises:
the second electronic device acquires end-side sample data;
the second electronic device performs data simulation based on the end-side sample data to generate training data and cloud verification data used for constructing a cloud cross-validation data set;
the second electronic device uploads the cloud verification data to the first electronic device;
and the second electronic device constructs the training sample data set from the end-side sample data and the training data.
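The data-simulation step of claim 15 could look like the sketch below. The perturbation-based augmentation, the split ratio, and the function name are assumptions; the patent does not specify how the simulation is performed:

```python
import random

def simulate_from_samples(end_side_samples, n_aug=2, noise=0.05, val_ratio=0.3, seed=0):
    """Generate simulated data from end-side samples, then split it into
    training data (kept on-device) and cloud verification data (uploaded)."""
    rng = random.Random(seed)
    simulated = []
    for features, label in end_side_samples:
        for _ in range(n_aug):
            perturbed = [v + rng.uniform(-noise, noise) for v in features]
            simulated.append((perturbed, label))
    rng.shuffle(simulated)
    k = int(len(simulated) * val_ratio)
    cloud_verification = simulated[:k]   # uploaded to build the cloud cross-validation set
    training_data = simulated[k:]
    # Per the claim, the training sample set combines end-side samples and simulated data.
    training_set = list(end_side_samples) + training_data
    return training_set, cloud_verification
```

With two end-side samples and `n_aug=2`, four simulated samples are produced; 30% (one sample) is held out for cloud verification and the rest joins the training set.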
16. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 10 or 11 to 15 when executing the computer program.
17. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 10 or 11 to 15.
CN202010446725.9A 2020-05-22 2020-05-22 Model training method based on federal learning and electronic equipment Pending CN113705823A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010446725.9A CN113705823A (en) 2020-05-22 2020-05-22 Model training method based on federal learning and electronic equipment

Publications (1)

Publication Number Publication Date
CN113705823A true CN113705823A (en) 2021-11-26

Family

ID=78646559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010446725.9A Pending CN113705823A (en) 2020-05-22 2020-05-22 Model training method based on federal learning and electronic equipment

Country Status (1)

Country Link
CN (1) CN113705823A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871702A (en) * 2019-02-18 2019-06-11 深圳前海微众银行股份有限公司 Federal model training method, system, equipment and computer readable storage medium
CN110222762A (en) * 2019-06-04 2019-09-10 恒安嘉新(北京)科技股份公司 Object prediction method, apparatus, equipment and medium
CN110442457A (en) * 2019-08-12 2019-11-12 北京大学深圳研究生院 Model training method, device and server based on federation's study


Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114048864A (en) * 2022-01-11 2022-02-15 中兴通讯股份有限公司 Method for managing federal learning data, electronic device and storage medium
CN114465722A (en) * 2022-01-29 2022-05-10 深圳前海微众银行股份有限公司 Information processing method, apparatus, device, storage medium, and program product
CN114465722B (en) * 2022-01-29 2024-04-02 深圳前海微众银行股份有限公司 Information processing method, apparatus, device, storage medium, and program product
CN114529228A (en) * 2022-04-24 2022-05-24 南京鼎研电力科技有限公司 Risk early warning method and system for power monitoring system supply chain
CN115022316B (en) * 2022-05-20 2023-08-11 阿里巴巴(中国)有限公司 End cloud collaborative data processing system, method, equipment and computer storage medium
CN115022316A (en) * 2022-05-20 2022-09-06 阿里巴巴(中国)有限公司 End cloud cooperative data processing system, method, equipment and computer storage medium
CN114827289A (en) * 2022-06-01 2022-07-29 深圳大学 Communication compression method, system, electronic device and storage medium
CN114827289B (en) * 2022-06-01 2023-06-13 深圳大学 Communication compression method, system, electronic device and storage medium
CN115271033B (en) * 2022-07-05 2023-11-21 西南财经大学 Medical image processing model construction and processing method based on federal knowledge distillation
CN115271033A (en) * 2022-07-05 2022-11-01 西南财经大学 Medical image processing model construction and processing method based on federal knowledge distillation
WO2024011456A1 (en) * 2022-07-13 2024-01-18 Oppo广东移动通信有限公司 Data processing method and apparatus, communication method and apparatus, and terminal device and network device
CN116049862A (en) * 2023-03-13 2023-05-02 杭州海康威视数字技术股份有限公司 Data protection method, device and system based on asynchronous packet federation learning
CN117349670A (en) * 2023-10-25 2024-01-05 杭州汇健科技有限公司 Tumor detection model training system, method, equipment and storage medium
CN117349670B (en) * 2023-10-25 2024-04-12 杭州汇健科技有限公司 Tumor detection model training system, method, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN113705823A (en) Model training method based on federal learning and electronic equipment
EP3923634B1 (en) Method for identifying specific position on specific route and electronic device
CN111669515B (en) Video generation method and related device
US11868463B2 (en) Method for managing application permission and electronic device
WO2021104485A1 (en) Photographing method and electronic device
US20220180485A1 (en) Image Processing Method and Electronic Device
CN113364971A (en) Image processing method and device
WO2021052139A1 (en) Gesture input method and electronic device
CN111625670A (en) Picture grouping method and device
CN114242037A (en) Virtual character generation method and device
WO2022073417A1 (en) Fusion scene perception machine translation method, storage medium, and electronic device
CN111103922A (en) Camera, electronic equipment and identity verification method
CN115589051B (en) Charging method and terminal equipment
CN112651510A (en) Model updating method, working node and model updating system
CN115718913A (en) User identity identification method and electronic equipment
CN113574525A (en) Media content recommendation method and equipment
CN114424927A (en) Sleep monitoring method and device, electronic equipment and computer readable storage medium
CN113468929A (en) Motion state identification method and device, electronic equipment and storage medium
CN114444705A (en) Model updating method and device
CN111768765A (en) Language model generation method and electronic equipment
CN113536834A (en) Pouch detection method and device
WO2022214004A1 (en) Target user determination method, electronic device and computer-readable storage medium
CN114283195B (en) Method for generating dynamic image, electronic device and readable storage medium
US20230402150A1 (en) Adaptive Action Evaluation Method, Electronic Device, and Storage Medium
CN114079725B (en) Video anti-shake method, terminal device, and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination