
Model training method, device, equipment and storage medium based on federated learning

Info

Publication number
CN113642707A
Authority
CN
China
Prior art keywords
training
matrix
model parameter
iteration
learning
Prior art date
Legal status
Granted
Application number
CN202110924484.9A
Other languages
Chinese (zh)
Other versions
CN113642707B (en)
Inventor
张玉君
张卫军
钱勇
黎奉薪
Current Assignee
Shenzhen Pingan Zhihui Enterprise Information Management Co ltd
Original Assignee
Shenzhen Pingan Zhihui Enterprise Information Management Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Pingan Zhihui Enterprise Information Management Co ltd filed Critical Shenzhen Pingan Zhihui Enterprise Information Management Co ltd
Priority to CN202110924484.9A priority Critical patent/CN113642707B/en
Publication of CN113642707A publication Critical patent/CN113642707A/en
Application granted granted Critical
Publication of CN113642707B publication Critical patent/CN113642707B/en
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367: Ontology
    • G06F 17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10: Complex mathematical operations
    • G06F 17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G06N 20/20: Ensemble learning
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30: Computing systems specially adapted for manufacturing


Abstract

The application relates to the technical field of artificial intelligence, and discloses a model training method, device, equipment and storage medium based on federated learning. The method includes: acquiring the number of single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federated learning; taking the number of target clients as the number of participating training institutions; obtaining the total number of training samples from all the single model training sample numbers; calculating the model parameter summary matrix of the ith iteration according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summary matrix of the (i-1)th iteration; determining the learning rate to be updated according to the number of participating training institutions; and sending the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration to each target client. The method improves the effect of model training and is suitable for application scenarios in which data is restricted by confidentiality.

Description

Model training method, device, equipment and storage medium based on federated learning
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a model training method, device, equipment and storage medium based on federated learning.
Background
Person-post matching, that is, matching a candidate to a position, is an important link for organizations in the field of recruitment. Because it demands a great deal of a recruiter's time and energy and requires strong professional knowledge, a few organizations currently automate person-post matching by combining it with deep learning methods. However, not all organizations are able to build a deep-learning-based person-post matching model: some lack data accumulation, others lack development resources, and so on, so many technical development services for person-post matching have appeared on the market. These services face the following problems: (1) lack of training samples: a single institution does not have enough training samples to train a deep learning model, so the person-post matching model performs unsatisfactorily; (2) data restricted by confidentiality: to protect business secrets, an institution cannot send resume and position data outside the institution, so the person-post matching model is only applicable to that institution and is difficult to apply more widely; (3) a single data source: because institution data cannot leave the institution, the person-post matching model must be trained inside the institution, which greatly limits the flexibility of model training and updating and requires a dedicated maintenance team for development from time to time.
Disclosure of Invention
The main purpose of the application is to provide a model training method, device, equipment and storage medium based on federated learning, aiming to solve the technical problem that a deep-learning-based person-post matching model in the prior art performs unsatisfactorily because of a lack of training samples, a single data source, and confidentiality restrictions on data.
To achieve the above object, the application provides a model training method based on federated learning, the method including:
obtaining the number of single model training samples and a single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning, wherein the target clients are used for performing model training on a local network model by adopting a local training sample set, and extracting the number of the single model training samples and the single model parameter matrix according to the trained local network model;
acquiring the number of the target clients as the number of participating training institutions;
adding the number of all the single model training samples to obtain the total number of the training samples;
obtaining a model parameter summarizing matrix of the i-1 st iteration of the mth round of federal learning, and calculating the model parameter summarizing matrix of the i-th iteration according to the number of the participating training mechanisms, all the single model parameter matrices, all the single model training sample numbers and the model parameter summarizing matrix of the i-1 st iteration;
calculating the learning rate according to the number of participating training institutions to obtain the learning rate to be updated;
sending the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration to each target client, wherein the target clients are further used for updating parameters of the local network model according to the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration;
and adding 1 to the i, and repeatedly executing the step of obtaining the number of the single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning until the end condition of the federal learning is met.
Further, the calculation formula of the model parameter summary matrix W_all[i] of the ith iteration is:
[Equation: W_all[i] is calculated from W_all[i-1], the single model parameter matrices w[a][i], and the single model training sample numbers sp_n[a]; the formula appears only as an image in the source]
where W_all[i-1] is the model parameter summary matrix of the (i-1)th iteration of the mth round of federated learning, n is the number of participating training institutions of the (i-1)th iteration of the mth round of federated learning, w[a][i] is the single model parameter matrix sent by the a-th target client in the ith iteration of the mth round of federated learning, and sp_n[a] is the number of single model training samples sent by the a-th target client in the ith iteration of the mth round of federated learning.
Further, the calculation formula of the learning rate to be updated, b[i], is:
b[i] = 1/(i · n²)
where n is the number of participating training institutions and i is the iteration index of the ith iteration of the mth round of federated learning.
Further, the target client is further configured to perform parameter updating on the local network model according to the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration, which includes:
performing model parameter matrix calculation according to the last updated gradient data of the client to be updated, the single model parameter matrix, the total number of training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration by using the client to be updated to obtain a model parameter matrix to be updated, wherein the client to be updated is any one of the target clients;
updating the parameters of the local network model corresponding to the client to be updated according to the model parameter matrix to be updated;
wherein the calculation formula of the model parameter matrix to be updated, w_g, is:
[Equation: w_g is computed from w[r][i], W_all[i], sp_all[i], b[i], and grad[i]; the formula appears only as an image in the source]
where w[r][i] is the single model parameter matrix of the client r to be updated for the ith iteration of the mth round of federated learning, W_all[i] is the model parameter summary matrix of the ith iteration, sp_all[i] is the total number of training samples of the ith iteration, b[i] is the learning rate to be updated of the ith iteration, and grad[i] is the last-update gradient data of the local network model corresponding to the client to be updated for the ith iteration of the mth round of federated learning.
Further, the federated learning end condition includes: the number of iterations of the mth round of federated learning reaches the preset number of federated-learning single-round iterations, or the deviations of the model parameter summary matrices of three adjacent iterations of the mth round of federated learning meet the preset federated learning convergence condition;
the calculation formula epoch of the preset federal learning single-round iteration times is as follows:
epoch=g(5+log2n)
wherein g (5+ log)2n) is for 5+ log2n is rounded up, n being the number of participating training institutions, log2n is a logarithmic function with base n being 2;
the deviation of the model parameter summary matrix of the m-th round of federal learning three adjacent times meets the preset federal learning convergence condition, and comprises the following steps:
the deviation of the model parameter summary matrix of the ith iteration and the (i-1) th iteration of the mth round of federal learning and the deviation of the model parameter summary matrix of the (i-1) th iteration and the (i-2) th iteration of the mth round of federal learning are both smaller than a federal learning deviation threshold value;
wherein the calculation formula of the deviation w_p[i] between the model parameter summary matrices of the ith and (i-1)th iterations of the mth round of federated learning is:
w_p[i] = avg(W_all[i] − W_all[i−1])
where W_all[i] is the model parameter summary matrix of the ith iteration of the mth round of federated learning, W_all[i−1] is the model parameter summary matrix of the (i−1)th iteration, and avg() computes the average of the absolute values of the elements of a matrix.
Further, before the step of obtaining the number of single model training samples and the single model parameter matrix sent by each target client in the ith iteration of the mth round of federal learning, the method further includes:
obtaining a plurality of training samples to be processed through a client to be processed, wherein the training samples to be processed comprise: the resume to be processed, the recruitment post information and the resume screening calibration value, wherein the client to be processed is any one of the target clients;
acquiring one training sample to be processed from a plurality of training samples to be processed as a target training sample to be processed;
respectively performing structural analysis on the resume to be processed and the recruitment post information corresponding to the training sample to be processed to obtain a structural data pair to be processed;
local training sample generation is carried out on the resume screening calibration value corresponding to the target to-be-processed training sample according to the to-be-processed structured data, and a to-be-stored local training sample corresponding to the target to-be-processed training sample is obtained;
repeatedly executing the step of obtaining one training sample to be processed from a plurality of training samples to be processed as a target training sample to be processed until the obtaining of the training sample to be processed is completed;
and taking all the local training samples to be stored as the local training sample set corresponding to the client to be processed.
Further, the step of performing local training sample generation on the resume screening calibration value corresponding to the target to-be-processed training sample according to the to-be-processed structured data to obtain a to-be-stored local training sample corresponding to the target to-be-processed training sample includes:
standardizing the structural data pair to be processed to obtain a standardized structural data pair;
acquiring a preset knowledge graph, and extracting information of the standardized structured data pair according to the preset knowledge graph to obtain an information set pair to be converted;
carrying out vector conversion of a tensor space on the information set pair to be converted to obtain a tensor space vector pair;
and performing local training sample generation on the resume screening calibration value corresponding to the target to-be-processed training sample according to the tensor space vector to obtain a to-be-stored local training sample corresponding to the target to-be-processed training sample.
The application also provides a model training device based on federal learning, the device includes:
the data acquisition module is used for acquiring the number of single model training samples and a single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning, wherein the target clients are used for performing model training on a local network model by adopting a local training sample set and extracting the number of the single model training samples and the single model parameter matrix according to the local network model after training;
the participating-training-institution number determining module, used for acquiring the number of the target clients as the number of participating training institutions;
the training sample total number determining module is used for adding the number of all the single model training samples to obtain the total number of the training samples;
the model parameter summarizing matrix determining module is used for obtaining a model parameter summarizing matrix of the i-1 st iteration of the mth round of federal learning, and calculating the model parameter summarizing matrix of the i-th iteration according to the number of the participating training mechanisms, all the single model parameter matrices, all the single model training sample numbers and the model parameter summarizing matrix of the i-1 st iteration;
the to-be-updated learning rate determining module, used for calculating the learning rate according to the number of participating training institutions to obtain the learning rate to be updated;
the parameter updating module is used for sending the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration to each target client, wherein the target client is further used for updating parameters of the local network model according to the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration;
and the cyclic execution module is used for adding 1 to the i, and repeatedly executing the step of obtaining the number of the single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning until the federal learning end condition is met.
The present application further proposes a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of any of the above methods when executing the computer program.
The present application also proposes a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of any of the above.
According to the model training method, device, equipment and storage medium based on federated learning, the number of single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federated learning are obtained; the number of target clients is taken as the number of participating training institutions; the numbers of all the single model training samples are added to obtain the total number of training samples; the model parameter summary matrix of the (i-1)th iteration of the mth round of federated learning is obtained, and the model parameter summary matrix of the ith iteration is calculated according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summary matrix of the (i-1)th iteration; the learning rate is calculated according to the number of participating training institutions to obtain the learning rate to be updated; and the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration are sent to each target client, where each target client further updates the parameters of its local network model according to its last-update gradient data, its single model parameter matrix, the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration. Federated learning increases the diversity of data sources and improves the effect of model training, so that institutions lacking training samples can also use the model; and because the target clients participating in training only send the number of single model training samples and the single model parameter matrix, without uploading detailed training-sample data, the scheme is suitable for application scenarios in which data is restricted by confidentiality.
Drawings
FIG. 1 is a schematic flow chart illustrating a federated learning-based model training method according to an embodiment of the present application;
FIG. 2 is a block diagram illustrating the structure of a model training apparatus based on federated learning according to an embodiment of the present application;
fig. 3 is a block diagram illustrating a structure of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Referring to fig. 1, an embodiment of the present application provides a method for model training based on federal learning, where the method includes:
S1: obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federated learning, wherein the target clients are used for performing model training on a local network model by adopting a local training sample set, and extracting the number of single model training samples and the single model parameter matrix from the trained local network model;
S2: acquiring the number of the target clients as the number of participating training institutions;
S3: adding the numbers of all the single model training samples to obtain the total number of training samples;
S4: obtaining the model parameter summary matrix of the (i-1)th iteration of the mth round of federated learning, and calculating the model parameter summary matrix of the ith iteration according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summary matrix of the (i-1)th iteration;
S5: calculating the learning rate according to the number of participating training institutions to obtain the learning rate to be updated;
S6: sending the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration to each target client, wherein the target clients are further used for updating the parameters of the local network model according to the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration;
S7: adding 1 to i, and repeatedly executing the step of obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federated learning, until the federated learning end condition is met.
In this embodiment, the number of single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federated learning are obtained; the number of target clients is taken as the number of participating training institutions; the numbers of all the single model training samples are added to obtain the total number of training samples; the model parameter summary matrix of the (i-1)th iteration of the mth round of federated learning is obtained, and the model parameter summary matrix of the ith iteration is calculated according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summary matrix of the (i-1)th iteration; the learning rate is calculated according to the number of participating training institutions to obtain the learning rate to be updated; and the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration are sent to each target client, where each target client further updates the parameters of its local network model according to its last-update gradient data, its single model parameter matrix, the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration. Federated learning increases the diversity of data sources and improves the effect of model training, so that institutions lacking training samples can also use the model; and because the target clients participating in training only send the number of single model training samples and the single model parameter matrix, without uploading detailed training-sample data, the scheme is suitable for application scenarios in which data is restricted by confidentiality.
For S1, through the communication connection between the target clients and the aggregation end, each target client of the ith iteration of the mth round of federated learning sends its local number of single model training samples and its single model parameter matrix to the aggregation end, which increases the diversity of data sources; since detailed training-sample data does not need to be uploaded, the scheme is suitable for application scenarios in which data is restricted by confidentiality. That is, in the ith iteration of the mth round of federated learning, each target client corresponds to one single model training sample number and one single model parameter matrix. The sample numbers and parameter matrices aggregated by the aggregation end at any one time are all data of the same iteration of the same round of federated learning.
Optionally, obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federated learning can be understood as follows: each target client uses a homomorphic encryption algorithm to encrypt its number of single model training samples and its single model parameter matrix for the ith iteration of the mth round of federated learning, and sends the encrypted sample count and parameter matrix to the aggregation end. The aggregation end can then perform addition and multiplication operations without knowing the actual content of the encrypted sample counts and parameter matrices, further ensuring the confidentiality of federated learning.
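As an illustration of these additive operations, the following minimal Python sketch uses the open-source phe (python-paillier) library; the patent does not name a concrete homomorphic encryption scheme, and the client sample counts shown are hypothetical.

from phe import paillier

# In practice the key pair would be generated and distributed out of band,
# so that the aggregation end never holds the private key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each target client encrypts its single model training sample count.
client_sample_counts = [120, 80, 200]  # hypothetical values
encrypted_counts = [public_key.encrypt(c) for c in client_sample_counts]

# The aggregation end adds ciphertexts (and could multiply by plaintext
# scalars) without seeing the underlying values.
encrypted_total = sum(encrypted_counts, public_key.encrypt(0))

assert private_key.decrypt(encrypted_total) == sum(client_sample_counts)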
And the target client is a client participating in the mth round of federal learning. Each client corresponds to a local network model.
And the aggregation end aggregates the training samples according to the number of the single model training samples and the single model parameter matrix sent by each target client to obtain the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix.
Optionally, the local network model is a person-post matching model based on a CNN (convolutional neural network). The local network model comprises, in sequence: a plurality of convolutional layers, pooling layers, and fully-connected layers. It is to be understood that the local network model may also be another model, which is not limited herein.
The local training sample set includes a plurality of local training samples. The local training sample comprises a sample data to be trained and a sample data calibration value. In the same local training sample, the sample data calibration value is an accurate calibration result of the sample data to be trained. For example, when the local network model is a post matching model, the sample data to be trained is data obtained according to resume and recruitment post information, and the calibration value of the sample data is an accurate result of resume screening according to the recruitment post information, which is not specifically limited in this example.
It is to be understood that the number of target clients participating in each round of federated learning is greater than a preset participating-institution threshold, and the number of local training samples in each target client's local training sample set is greater than a preset training-sample threshold. That is, only clients whose local training sample sets contain more local training samples than the preset training-sample threshold can participate in federated learning, and a round of federated learning starts only when the number of participating target clients is greater than the preset participating-institution threshold.
Optionally, the preset participating-institution threshold is set to 10, and the preset training-sample threshold is set to 50. It is understood that both thresholds may be set to other values, which are not limited herein.
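A minimal sketch of this gating logic, assuming the optional threshold values above; the function and field names are illustrative, not taken from the patent.

# Preset thresholds from this embodiment; both may be set to other values.
PARTICIPATING_INSTITUTION_THRESHOLD = 10
TRAINING_SAMPLE_THRESHOLD = 50

def eligible_clients(clients):
    # Only clients with enough local training samples may participate.
    return [c for c in clients if len(c["local_samples"]) > TRAINING_SAMPLE_THRESHOLD]

def can_start_round(clients):
    # A round of federated learning starts only with enough eligible clients.
    return len(eligible_clients(clients)) > PARTICIPATING_INSTITUTION_THRESHOLD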
Any one of the target clients serves as a client to be trained. Through the client to be trained: the local training sample set corresponding to the client to be trained is obtained as the target local training sample set; the target local training sample set is used to perform the ith model training of the mth round of federated learning on the local network model corresponding to the client to be trained; the actual number of local training samples used during model training is taken as the number of single model training samples corresponding to the client to be trained; the parameter matrix of the model is extracted from the local network model after the ith iterative training of the mth round of federated learning is finished, as the single model parameter matrix corresponding to the client to be trained; and the number of single model training samples and the single model parameter matrix corresponding to the client to be trained are sent to the aggregation end. That is, the number of single model training samples corresponding to the client to be trained is less than or equal to the number of local training samples in the target local training sample set.
The step of performing the ith model training of the m-th round of federal learning on the local network model corresponding to the client to be trained by using the target local training sample set comprises the following steps: and performing ith model training of m-th federal learning on the local network model corresponding to the client to be trained by adopting the target local training sample set until a preset local training end condition is met, and taking the local network model meeting the preset local training end condition as the local network model after the ith iterative training of the m-th federal learning corresponding to the client to be trained is finished.
The preset local training end condition comprises the following steps: and the local training samples in the target local training sample set are completely trained once, or the loss values of the local network model corresponding to the client to be trained for a plurality of times continuously meet the preset local training convergence condition.
For example, the preset local training convergence condition is: loss values of the local network model corresponding to the client to be trained for 3 consecutive times are all smaller than a preset loss value threshold, which is not specifically limited in this example.
Optionally, the preset loss value threshold is set to 0.0001. It is understood that the preset loss value threshold may also be set to other values, which are not limited herein.
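For illustration, the preset local training convergence condition can be checked as sketched below; the window of 3 consecutive losses and the 0.0001 threshold follow the examples above, and the names are assumptions.

LOSS_THRESHOLD = 0.0001  # preset loss value threshold (this embodiment)

def locally_converged(recent_losses, window=3):
    # True when the last `window` loss values are all below the threshold.
    return (len(recent_losses) >= window
            and all(loss < LOSS_THRESHOLD for loss in recent_losses[-window:]))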
For S2, the aggregation end counts the number of target clients of the ith iteration of the mth round of federated learning and takes the counted number as the number of participating training institutions.
For S3, the number of all the single-model training samples is added through the aggregation end, and the added data is used as the total number of the training samples. That is, the total number of training samples is the total number of local training samples taken at the ith iteration of the mth round of federated learning.
For S4, through the aggregation end, the model parameter summary matrix of the i-1 st iteration of the mth round of federal learning may be obtained from the database, or the model parameter summary matrix of the i-1 st iteration of the mth round of federal learning may be obtained from the cache.
And calculating the model parameter summarizing matrix of the i-1 th iteration of the mth round of federal learning according to the number of the training mechanisms participating in the training, all the single model parameter matrixes, all the single model training samples and the model parameter summarizing matrix of the i-1 th iteration through a gathering end, and taking the calculated data as the model parameter summarizing matrix of the i-th iteration of the mth round of federal learning. That is, each time the model parameter summary matrix is calculated, the calculation is performed on the basis of the model parameter summary matrix of the previous iteration.
For S5, optionally, a learning rate is calculated according to the number of participating training institutions, and the calculated learning rate is used as the learning rate to be updated, where the learning rate calculation formula k[i] is:
k[i] = 1/(d · n²)
where n is the number of participating training institutions and d is a constant.
Optionally, the learning rate of a deep learning model is related to its learning speed: the larger the learning rate, the faster the model learns, but the step may be too large and the optimum of the model's gradient optimization may be missed; the smaller the learning rate, the slower the model learns, but the step may be too small, so the model may converge to a local optimum and the global optimum of the gradient optimization may be missed. Therefore, a preset learning rate calculation formula is adopted and the aggregation end calculates the learning rate according to the number of participating training institutions, realizing dynamic generation of the learning rate, which helps federated learning first lock quickly onto the approximate position of the global optimal solution and then search gradually with small iterative steps. The value of the preset learning rate calculation formula decreases as the number of iterations increases, so the learning rate becomes smaller in later iterations, providing a basis for federated learning that first locks quickly onto the approximate position of the global optimal solution and then searches gradually with small iterative steps.
And S6, sending the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration to each target client through a gathering end.
And each target client updates the parameters of the corresponding local network model according to the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration, so that the ith iteration of the mth round of federal learning is completed.
For S7, adding 1 to i, and using the updated i for the next iteration of the mth round of federal learning, and repeatedly executing steps S1 to S7 until the federal learning end condition is met. When the federal learning end condition is met, the mth round of federal learning is ended, and the local network model corresponding to the target client at the moment can be used for predicting actual production of each organization.
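The following sketch condenses steps S1 to S7 on the aggregation end. The patent does not fix an implementation, so the transport and mathematical helpers are passed in as callables and every name here is an assumption.

def run_round(collect, aggregate, learning_rate, broadcast, should_stop, w_all_prev):
    """One round (m) of federated learning as seen by the aggregation end."""
    history = [w_all_prev]  # model parameter summary matrices seen so far
    i = 1
    while True:
        updates = collect(i)                         # S1: (sample count, matrix) per client
        n = len(updates)                             # S2: participating training institutions
        sp_all = sum(count for count, _ in updates)  # S3: total number of training samples
        w_all = aggregate(updates, history[-1], n)   # S4: summary matrix of iteration i
        b = learning_rate(i, n)                      # S5: learning rate to be updated
        broadcast(sp_all, b, w_all)                  # S6: clients update their local models
        history.append(w_all)
        if should_stop(i, n, history):               # federated learning end condition
            return w_all
        i += 1                                       # S7: next iteration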
It is understood that the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration may also be sent to a client not participating in the mth round of federal learning, so that an organization lacking training samples can also adopt the results of federal learning.
Optionally, for a new client, the model parameter summary matrix of the ith iteration is directly used as a parameter of the local network model corresponding to the new client, and the local network model of the new client after the parameter update is subjected to actual production prediction.
In one embodiment, the calculation formula of the model parameter summary matrix W_all[i] of the ith iteration is:
[Equation: W_all[i] is calculated from W_all[i-1], the single model parameter matrices w[a][i], and the single model training sample numbers sp_n[a]; the formula appears only as an image in the source]
where W_all[i-1] is the model parameter summary matrix of the (i-1)th iteration of the mth round of federated learning, n is the number of participating training institutions of the (i-1)th iteration of the mth round of federated learning, w[a][i] is the single model parameter matrix sent by the a-th target client in the ith iteration of the mth round of federated learning, and sp_n[a] is the number of single model training samples sent by the a-th target client in the ith iteration of the mth round of federated learning.
According to the embodiment, the model parameter summarizing matrix of the ith iteration is calculated according to the number of the participating training mechanisms, all the single model parameter matrixes, all the single model training sample numbers and the model parameter summarizing matrix of the (i-1) th iteration, so that the feature aggregation of the local network models of all the target clients is realized.
The calculation of the model parameter summarizing matrix of the ith iteration of the mth round of federal learning is carried out on the basis of the model parameter summarizing matrix of the (i-1) th iteration of the mth round of federal learning, so that successive iteration calculation is realized, and the continuity of the model parameter summarizing matrix is improved.
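Because the aggregation formula itself appears only as an image in the source, the sketch below shows one natural sample-weighted reading that builds on the previous iteration's summary matrix as the text requires; it is an assumption, not the patented formula.

def aggregate(updates, w_all_prev, n):
    # updates: list of (sp_n[a], w[a][i]) pairs, with matrices as numpy arrays;
    # w_all_prev: W_all[i-1]; n is kept for interface symmetry. The averaging
    # with the previous summary matrix is an assumed way of "building on" it.
    sp_total = sum(count for count, _ in updates)
    weighted = sum(count * matrix for count, matrix in updates) / sp_total
    return (w_all_prev + weighted) / 2.0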
In one embodiment, the calculation formula of the learning rate to be updated, b[i], is:
b[i] = 1/(i · n²)
where n is the number of participating training institutions and i is the iteration index of the ith iteration of the mth round of federated learning.
In this embodiment, the learning rate is calculated according to the number of participating training institutions, realizing dynamic generation of the learning rate, which helps federated learning first lock quickly onto the approximate position of the global optimal solution and then search gradually with small iterative steps.
Because the value of 1/(i · n²) decreases as i and n increase, the learning rate becomes smaller as iteration proceeds, providing a basis for federated learning that first locks quickly onto the approximate position of the global optimal solution and then searches gradually with small iterative steps.
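A direct transcription of this schedule, with example values for an assumed n = 10 participating training institutions:

def learning_rate(i, n):
    # b[i] = 1 / (i * n^2)
    return 1.0 / (i * n ** 2)

# With n = 10: b[1] = 0.01, b[2] = 0.005, b[5] = 0.002 -- the rate shrinks
# as the iteration index i grows.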
In an embodiment, the target client is further configured to perform a parameter update on the local network model according to the total number of training samples, the learning rate to be updated, and the model parameter aggregation matrix of the ith iteration, where the step includes:
S61: performing model parameter matrix calculation, by the client to be updated, according to the last-update gradient data of the client to be updated, the single model parameter matrix, the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration, to obtain the model parameter matrix to be updated, wherein the client to be updated is any one of the target clients;
S62: updating the parameters of the local network model corresponding to the client to be updated according to the model parameter matrix to be updated;
wherein the calculation formula of the model parameter matrix to be updated, w_g, is:
[Equation: w_g is computed from w[r][i], W_all[i], sp_all[i], b[i], and grad[i]; the formula appears only as an image in the source]
where w[r][i] is the single model parameter matrix of the client r to be updated for the ith iteration of the mth round of federated learning, W_all[i] is the model parameter summary matrix of the ith iteration, sp_all[i] is the total number of training samples of the ith iteration, b[i] is the learning rate to be updated of the ith iteration, and grad[i] is the last-update gradient data of the local network model corresponding to the client to be updated for the ith iteration of the mth round of federated learning.
In this embodiment, the model parameter matrix calculation is performed according to the last-update gradient data and single model parameter matrix of the client to be updated, the total number of training samples, the learning rate to be updated, and the model parameter summary matrix of the ith iteration, so that the calculated model parameter matrix to be updated is associated with the client's own last-update gradient data and single model parameter matrix, realizing personalized updating of the parameters of each target client's local network model.
for S61, inputting, by the client to be updated, the last update gradient data of the client to be updated of the i-th iteration of the m-th round of federated learning, the single model parameter matrix of the client to be updated of the i-th iteration of the m-th round of federated learning, the training sample total number of the i-th iteration of the m-th round of federated learning, the learning rate to be updated of the i-th iteration of the m-th round of federated learning, and the model parameter summary matrix of the i-th iteration of the m-th round of federated learning into the calculation formula wgAnd calculating a model parameter matrix, and taking the calculated model parameter matrix as the model parameter matrix to be updated.
The last-update gradient data of the client to be updated is the gradient data from the last local training step of the local network model corresponding to the client to be updated during the ith iterative training of the mth round of federated learning; that is, it is the gradient data corresponding to the single model parameter matrix of the client to be updated.
The single model parameter matrix of the client to be updated is extracted from the local network model after the ith iterative training of the mth round of federated learning corresponding to the client to be updated is finished.
For step S62, the model parameter matrix to be updated is updated to the local network model corresponding to the client to be updated.
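Since the update formula w_g likewise appears only as an image in the source, the sketch below combines the five named quantities in one plausible way; the normalization by sp_all and the sign conventions are assumptions, not the patented formula.

def client_update(w_ri, w_all, sp_all, b, grad):
    # w_ri: w[r][i]; w_all: W_all[i]; sp_all: total training samples;
    # b: learning rate to be updated; grad: last-update gradient data.
    pull_toward_summary = (w_all - w_ri) / sp_all  # assumed normalization
    return w_ri + pull_toward_summary - b * grad   # assumed combination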
In one embodiment, the federated learning end condition includes: the number of iterations of the mth round of federated learning reaches the preset number of federated-learning single-round iterations, or the deviations of the model parameter summary matrices of three adjacent iterations of the mth round of federated learning meet the preset federated learning convergence condition;
the calculation formula of the preset number of federated-learning single-round iterations, epoch, is:
epoch = g(5 + log₂ n)
where g(5 + log₂ n) denotes rounding 5 + log₂ n up to the nearest integer, n is the number of participating training institutions, and log₂ n is the base-2 logarithm of n;
the deviation of the model parameter summary matrix of the m-th round of federal learning three adjacent times meets the preset federal learning convergence condition, and comprises the following steps:
the deviation of the model parameter summary matrix of the ith iteration and the (i-1) th iteration of the mth round of federal learning and the deviation of the model parameter summary matrix of the (i-1) th iteration and the (i-2) th iteration of the mth round of federal learning are both smaller than a federal learning deviation threshold value;
wherein the calculation formula of the deviation w_p[i] between the model parameter summary matrices of the ith and (i-1)th iterations of the mth round of federated learning is:
w_p[i] = avg(W_all[i] − W_all[i−1])
where W_all[i] is the model parameter summary matrix of the ith iteration of the mth round of federated learning, W_all[i−1] is the model parameter summary matrix of the (i−1)th iteration, and avg() computes the average of the absolute values of the elements of a matrix.
Since the larger the number of iterations, the larger the computing resource overhead of the aggregation end and the target clients, in order to ensure that the training process is flexibly adjusted according to the number of participating training institutions, this embodiment uses g(5 + log₂ n) as the preset number of federated-learning single-round iterations, and uses the condition that the deviation between the model parameter summary matrices of the ith and (i-1)th iterations of the mth round of federated learning and the deviation between those of the (i-1)th and (i-2)th iterations are both smaller than the federated learning deviation threshold as the convergence condition, preventing a local network model that cannot converge from being trained indefinitely.
Optionally, the federated learning deviation threshold is set to 0.001. It is understood that it may be set to other values, which are not specifically limited herein.
Wherein the calculation formula of the deviation w_p[i−1] between the model parameter summary matrices of the (i−1)th and (i−2)th iterations of the mth round of federated learning is:
w_p[i−1] = avg(W_all[i−1] − W_all[i−2])
where W_all[i−1] is the model parameter summary matrix of the (i−1)th iteration of the mth round of federated learning, W_all[i−2] is the model parameter summary matrix of the (i−2)th iteration, and avg() computes the average of the absolute values of the elements of a matrix.
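Putting both end conditions together, a sketch under the stated settings; numpy is assumed for the element-wise deviation, and the history argument is an assumed representation of the round's summary matrices.

import math
import numpy as np

DEVIATION_THRESHOLD = 0.001  # federated learning deviation threshold (this embodiment)

def single_round_iterations(n):
    # epoch = g(5 + log2 n): round 5 + log2(n) up to the nearest integer.
    return math.ceil(5 + math.log2(n))

def deviation(w_curr, w_prev):
    # w_p = avg(W_all[i] - W_all[i-1]): mean absolute element-wise difference.
    return float(np.mean(np.abs(w_curr - w_prev)))

def should_stop(i, n, history):
    # history holds this round's summary matrices, oldest first.
    if i >= single_round_iterations(n):
        return True
    return (len(history) >= 3
            and deviation(history[-1], history[-2]) < DEVIATION_THRESHOLD
            and deviation(history[-2], history[-3]) < DEVIATION_THRESHOLD)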
In an embodiment, before the step of obtaining the number of single model training samples and the single model parameter matrix sent by each target client in the ith iteration of the mth round of federal learning, the method further includes:
S11: obtaining a plurality of training samples to be processed through a client to be processed, wherein each training sample to be processed comprises: a resume to be processed, recruitment post information, and a resume screening calibration value, and the client to be processed is any one of the target clients;
S12: acquiring one training sample to be processed from the plurality of training samples to be processed as the target training sample to be processed;
S13: respectively performing structured parsing on the resume to be processed and the recruitment post information corresponding to the target training sample to be processed to obtain a structured data pair to be processed;
S14: performing local training sample generation on the resume screening calibration value corresponding to the target training sample to be processed according to the structured data pair to be processed, to obtain the local training sample to be stored corresponding to the target training sample to be processed;
S15: repeatedly executing the step of acquiring one training sample to be processed from the plurality of training samples to be processed as the target training sample to be processed, until the acquisition of training samples to be processed is completed;
S16: taking all the local training samples to be stored as the local training sample set corresponding to the client to be processed.
In this embodiment, the resume to be processed and the recruitment post information corresponding to the target training sample to be processed are respectively subjected to structured parsing to obtain the structured data pair to be processed, and local training sample generation is performed on the resume screening calibration value corresponding to the target training sample to be processed according to the structured data pair to be processed, realizing automated, low-cost construction of training samples, ensuring the consistency of the local training samples across all target clients of federated learning, and improving the accuracy of federated learning.
For S11, the client to be processed may obtain a plurality of training samples to be processed from a database, from user input, or from a third-party application system.
Each of the training samples to be processed comprises: a resume to be processed, a recruiting post information and a resume screening calibration value.
The resume to be processed is data of a resume of an applicant.
The recruiting position information is the description information of a recruiting position.
And in the same training sample to be processed, the resume screening calibration value is used for accurately calibrating whether the resume to be processed conforms to the recruitment post information.
The resume screening calibration value is a calibrated label. For example, it takes the value 1 or 0: a value of 1 indicates that the resume to be processed matches the recruitment post information, and a value of 0 indicates that it does not.
For step S12, one to-be-processed training sample is sequentially obtained from a plurality of to-be-processed training samples, and the obtained to-be-processed training sample is used as a target to-be-processed training sample.
For step S13, a text parser is adopted to perform structured parsing on the resume to be processed corresponding to the training sample to be processed of the target, so as to obtain structured data of the resume to be processed; adopting a text analyzer to perform structured analysis on the recruitment position information corresponding to the target training sample to be processed to obtain structured data of the recruitment position to be processed; and taking the resume structured data to be processed and the recruitment position structured data to be processed as a data pair, and taking the data pair as the structured data pair to be processed.
A resume parser may be selected as the text parser; the implementation of the resume parser is not described herein.
For example, the resume structured data to be processed includes:
Name: Zhang San
Education: bachelor's degree
Most recent company: Company A
Most recent position: project management
Years of working experience: 3 years
Skills: Java, Office, which is not specifically limited by this example.
For example, the recruitment position structured data to be processed is:
Position title: project manager
Education requirement: bachelor's degree
Company size: more than 10000 people
Position experience requirement: project-management-related positions
Skills: Java, Office, and English, which is not specifically limited by this example.
For step S14, the sample data to be trained of the local training sample to be stored corresponding to the target training sample to be processed is generated according to the structured data pair to be processed, and the sample data calibration value of that local training sample to be stored is determined according to the resume screening calibration value corresponding to the target training sample to be processed.
For S15, steps S12 to S15 are repeatedly performed until the acquisition of the training sample to be processed is completed.
For S16, all the local training samples to be stored are used as a set, and the set is used as the local training sample set corresponding to the client to be processed.
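A condensed sketch of steps S11 to S16 on a client to be processed; the two callables stand in for the text parser of S13 and the sample generation of S14 (detailed as S141 to S144 below), so all names are assumptions.

def build_local_training_samples(raw_samples, parse_structured, make_sample):
    # raw_samples: (resume to be processed, recruitment post information,
    # resume screening calibration value) triples.
    local_set = []
    for resume, post_info, label in raw_samples:               # S12 / S15
        structured_pair = parse_structured(resume, post_info)  # S13
        local_set.append(make_sample(structured_pair, label))  # S14
    return local_set                                           # S16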
In an embodiment, the step of performing local training sample generation on the resume screening calibration value corresponding to the target to-be-processed training sample according to the to-be-processed structured data to obtain a to-be-stored local training sample corresponding to the target to-be-processed training sample includes:
S141: standardizing the structured data pair to be processed to obtain a standardized structured data pair;
S142: acquiring a preset knowledge graph, and performing information extraction on the standardized structured data pair according to the preset knowledge graph to obtain an information set pair to be converted;
S143: performing tensor-space vector conversion on the information set pair to be converted to obtain a tensor space vector pair;
S144: performing local training sample generation on the resume screening calibration value corresponding to the target training sample to be processed according to the tensor space vector pair, to obtain the local training sample to be stored corresponding to the target training sample to be processed.
In this embodiment, the to-be-processed structured data pair is subjected to standardization processing, information extraction and tensor space vector conversion in sequence, and finally, local training sample generation is performed on the resume screening calibration value corresponding to the target to-be-processed training sample according to the tensor space vector pair, so that the local network model can be trained quickly.
And for S141, sequentially performing entity standardization and format conversion of continuous features on the resume structured data of the to-be-processed structured data pair, sequentially performing entity standardization and format conversion of continuous features on the recruitment position structured data of the to-be-processed structured data pair, and taking the resume structured data subjected to the entity standardization and format conversion of the continuous features and the recruitment position structured data subjected to the entity standardization and format conversion of the continuous features as the standardized structured data pair.
Entity standardization is entity alignment. For example, "Beida" in the graduation school field is standardized to "Peking University", which is not limited in detail herein.
Entities, i.e., entities in triples, are abstractions of objective individuals.
The format conversion of a continuous feature converts the feature's format according to preset rules. For example, the length of study in the education experience is a continuous feature: an undergraduate program is converted to 4 or 5 years, and a master's program to 2, 2.5, or 3 years, which is not limited in detail herein.
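To make step S141 concrete, the following Python sketch performs entity alignment with a hand-written alias table and converts the continuous length-of-study feature with simple rules; both tables are illustrative assumptions rather than mappings prescribed by this application.

ENTITY_ALIASES = {"Beida": "Peking University"}  # entity alignment table

# Format-conversion rule for the continuous feature "length of study".
DEGREE_YEARS = {"bachelor's degree": 4.0, "master's degree": 2.5}

def standardize(record: dict) -> dict:
    out = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = ENTITY_ALIASES.get(value, value)  # align entity names
        out[key] = value
    if out.get("education") in DEGREE_YEARS:
        out["study_years"] = DEGREE_YEARS[out["education"]]
    return out

standardized_resume = standardize(
    {"education": "bachelor's degree", "graduation_school": "Beida"})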
For S142, the preset knowledge graph may be obtained from the database, from a third-party application system, or from input by the user.
The preset knowledge graph is a knowledge graph of entities determined according to the application scenario of the present application.
Each keyword in the preset knowledge graph is searched for in the resume structured data of the standardized structured data pair, and all entities found in the resume structured data are taken as the resume information set; each keyword in the preset knowledge graph is likewise searched for in the recruitment position structured data of the standardized structured data pair, and all entities found in the recruitment position structured data are taken as the recruitment position information set; the resume information set and the recruitment position information set are taken as the to-be-converted information set pair.
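A minimal sketch of this lookup, assuming the preset knowledge graph can be reduced to a set of entity keywords for searching purposes (the keyword set below is illustrative):

KG_KEYWORDS = {"java", "office", "english",
               "project management", "project manager"}

def extract_entities(record: dict) -> set:
    # Collect every knowledge-graph entity found in a structured record.
    found = set()
    for value in record.values():
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, str) and item.lower() in KG_KEYWORDS:
                found.add(item.lower())
    return found

# Build the to-be-converted information set pair.
info_set_pair = (extract_entities({"skills": ["java", "office"]}),
                 extract_entities({"skills": ["java", "office", "english"]}))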
For S143, carrying out vector conversion of a tensor space on the resume information set of the information set pair to be converted to obtain a resume vector; carrying out vector conversion of tensor space on the recruitment position information set of the information set pair to be converted to obtain a recruitment position vector; and taking the resume vector and the recruitment position vector as a tensor space vector pair.
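One plausible realization of this conversion, assuming each entity carries a fixed-dimension embedding and an information set maps to the mean of its entity embeddings (the embedding table and dimension below are assumptions):

import numpy as np

DIM = 8
rng = np.random.default_rng(0)
EMBEDDINGS = {e: rng.normal(size=DIM)
              for e in ("java", "office", "english", "project management")}

def to_tensor_vector(info_set) -> np.ndarray:
    # Average the embeddings of the entities present in the set.
    vecs = [EMBEDDINGS[e] for e in info_set if e in EMBEDDINGS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

tensor_vector_pair = (to_tensor_vector({"java", "office"}),
                      to_tensor_vector({"java", "office", "english"}))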
For step S144, the tensor space vector pair is taken as the to-be-trained sample data of the to-be-stored local training sample corresponding to the target to-be-processed training sample; and the resume screening calibration value corresponding to the target to-be-processed training sample is taken as the sample data calibration value of the to-be-stored local training sample corresponding to the target to-be-processed training sample.
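Continuing the sketches above, the to-be-stored local training sample can then be assembled as a data/label pair; the dictionary layout is an illustrative assumption.

import numpy as np

# Stand-in for the output of the S143 sketch above.
tensor_vector_pair = (np.zeros(8), np.zeros(8))

local_training_sample = {
    "sample_data": tensor_vector_pair,  # tensor space vector pair
    "calibration_value": 1,             # 1 = resume matches the position
}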
Referring to fig. 2, the present application further provides a federated-learning-based model training apparatus, the apparatus comprising:
the data acquisition module 100 is configured to acquire the number of single model training samples and a single model parameter matrix sent by each target client of an ith iteration of mth round of federal learning, where the target client is configured to perform model training on a local network model by using a local training sample set, and extract the number of single model training samples and the single model parameter matrix according to the local network model after training;
a participating training institution number determining module 200, configured to obtain the number of the target clients as the number of participating training institutions;
a training sample total number determining module 300, configured to add the number of all the single-model training samples to obtain a total number of training samples;
a model parameter summary matrix determination module 400, configured to obtain the model parameter summary matrix of the (i-1)-th iteration of the m-th round of federated learning, and calculate the model parameter summary matrix of the i-th iteration according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summary matrix of the (i-1)-th iteration;
a to-be-updated learning rate determining module 500, configured to calculate the learning rate according to the number of participating training institutions to obtain the learning rate to be updated;
a parameter updating module 600, configured to send the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the ith iteration to each target client, where the target client is further configured to perform parameter updating on the local network model according to the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the ith iteration;
and the loop execution module 700 is configured to add 1 to i, and repeatedly execute the step of obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning until the federal learning end condition is met.
In this embodiment, the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning are obtained; the number of target clients is obtained as the number of participating training institutions; the numbers of all the single model training samples are added to obtain the total number of training samples; the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning is obtained, and the model parameter summarizing matrix of the i-th iteration is calculated according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summarizing matrix of the (i-1)-th iteration; the learning rate is calculated according to the number of participating training institutions to obtain the learning rate to be updated; and the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration are sent to each target client, where the target clients further update the parameters of their local network models according to their last-updated gradient data, their single model parameter matrices, the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration. Adopting federated learning increases the diversity of data sources and improves the model training effect, so that an institution lacking training samples can still use the model; and the target clients participating in training only need to send the number of single model training samples and the single model parameter matrix, without uploading the detailed data of the training samples, so the scheme is suitable for application scenarios in which data is restricted.
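For orientation, the server-side flow described above can be condensed into the following sketch. The learning-rate rule transcribes the formula given in claim 3 below; the aggregation step is an assumption, because the exact summarizing formula appears only as an image in the source, so a sample-count-weighted average blended with the previous summarizing matrix is used as a plausible stand-in.

import numpy as np

def server_iteration(i, client_updates, w_all_prev):
    # One server-side iteration of one round of federated learning.
    # client_updates: list of (sample_count, parameter_matrix) pairs,
    # one per target client.
    n = len(client_updates)                       # participating institutions
    sp_all = sum(sp for sp, _ in client_updates)  # total training samples
    # Assumed aggregation: weighted average of the single model parameter
    # matrices, blended with the previous summarizing matrix.
    weighted = sum(sp * np.asarray(w) for sp, w in client_updates) / sp_all
    w_all = (np.asarray(w_all_prev) + weighted) / 2
    b = 1.0 / (i * n ** 2)                        # learning rate (claim 3)
    # (sp_all, b, w_all) are then broadcast to every target client.
    return sp_all, b, w_all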
Referring to fig. 3, an embodiment of the present application also provides a computer device, which may be a server and whose internal structure may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing data involved in the federated-learning-based model training method. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a federated-learning-based model training method comprising the following steps: obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning, wherein the target clients are used for performing model training on a local network model by adopting a local training sample set, and extracting the number of single model training samples and the single model parameter matrix from the trained local network model; acquiring the number of the target clients as the number of participating training institutions; adding the numbers of all the single model training samples to obtain the total number of training samples; obtaining the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning, and calculating the model parameter summarizing matrix of the i-th iteration according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summarizing matrix of the (i-1)-th iteration; calculating the learning rate according to the number of participating training institutions to obtain the learning rate to be updated; sending the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration to each target client, wherein the target clients are further used for updating the parameters of the local network model according to the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration; and adding 1 to i, and repeatedly executing the step of obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning until the federated learning end condition is met.
In this embodiment, the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning are obtained; the number of target clients is obtained as the number of participating training institutions; the numbers of all the single model training samples are added to obtain the total number of training samples; the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning is obtained, and the model parameter summarizing matrix of the i-th iteration is calculated according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summarizing matrix of the (i-1)-th iteration; the learning rate is calculated according to the number of participating training institutions to obtain the learning rate to be updated; and the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration are sent to each target client, where the target clients further update the parameters of their local network models according to their last-updated gradient data, their single model parameter matrices, the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration. Adopting federated learning increases the diversity of data sources and improves the model training effect, so that an institution lacking training samples can still use the model; and the target clients participating in training only need to send the number of single model training samples and the single model parameter matrix, without uploading the detailed data of the training samples, so the scheme is suitable for application scenarios in which data is restricted.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements a federated-learning-based model training method comprising the following steps: obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning, wherein the target clients are used for performing model training on a local network model by adopting a local training sample set, and extracting the number of single model training samples and the single model parameter matrix from the trained local network model; acquiring the number of the target clients as the number of participating training institutions; adding the numbers of all the single model training samples to obtain the total number of training samples; obtaining the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning, and calculating the model parameter summarizing matrix of the i-th iteration according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summarizing matrix of the (i-1)-th iteration; calculating the learning rate according to the number of participating training institutions to obtain the learning rate to be updated; sending the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration to each target client, wherein the target clients are further used for updating the parameters of the local network model according to the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration; and adding 1 to i, and repeatedly executing the step of obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning until the federated learning end condition is met.
According to the federated-learning-based model training method executed as described above, the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning are obtained; the number of target clients is obtained as the number of participating training institutions; the numbers of all the single model training samples are added to obtain the total number of training samples; the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning is obtained, and the model parameter summarizing matrix of the i-th iteration is calculated according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summarizing matrix of the (i-1)-th iteration; the learning rate is calculated according to the number of participating training institutions to obtain the learning rate to be updated; and the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration are sent to each target client, where the target clients further update the parameters of their local network models according to their last-updated gradient data, their single model parameter matrices, the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration. Adopting federated learning increases the diversity of data sources and improves the model training effect, so that an institution lacking training samples can still use the model; and the target clients participating in training only need to send the number of single model training samples and the single model parameter matrix, without uploading the detailed data of the training samples, so the scheme is suitable for application scenarios in which data is restricted.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by instructing the relevant hardware through a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium provided herein and used in the examples may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a(n) ..." does not exclude the presence of other like elements in the process, apparatus, article, or method that includes the element.
The above description is only a preferred embodiment of the present application and is not intended to limit its scope; all equivalent structural or process changes made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of patent protection of the present application.

Claims (10)

1. A method for model training based on federal learning, the method comprising:
obtaining the number of single model training samples and a single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning, wherein the target clients are used for performing model training on a local network model by adopting a local training sample set, and extracting the number of the single model training samples and the single model parameter matrix according to the trained local network model;
acquiring the number of the target clients as the number of participating training institutions;
adding the numbers of all the single model training samples to obtain the total number of training samples;
obtaining a model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning, and calculating the model parameter summarizing matrix of the i-th iteration according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summarizing matrix of the (i-1)-th iteration;
calculating the learning rate according to the number of participating training institutions to obtain the learning rate to be updated;
sending the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration to each target client, wherein the target clients are further used for updating parameters of the local network model according to the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration;
and adding 1 to the i, and repeatedly executing the step of obtaining the number of the single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning until the end condition of the federal learning is met.
2. The method of claim 1, wherein the calculation formula of the model parameter summarizing matrix W_all[i] of the i-th iteration is:
(formula not reproduced in the source text; it computes W_all[i] from W_all[i-1], n, w[a][i], and sp_n[a])
wherein W_all[i-1] is the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning, n is the number of participating training institutions of the (i-1)-th iteration of the m-th round of federated learning, w[a][i] is the single model parameter matrix sent by the a-th target client in the i-th iteration of the m-th round of federated learning, and sp_n[a] is the number of single model training samples sent by the a-th target client in the i-th iteration of the m-th round of federated learning.
3. The method of claim 1, wherein the calculation formula of the learning rate to be updated, b[i], is:
b[i] = 1/(i·n²)
where n is the number of participating training institutions and i is the iteration index of the i-th iteration of the m-th round of federated learning.
4. The method of claim 1, wherein the target client being further configured to perform parameter updating on the local network model according to the total number of training samples, the learning rate to be updated, and the model parameter summarizing matrix of the i-th iteration includes:
performing model parameter matrix calculation according to the last updated gradient data of the client to be updated, the single model parameter matrix, the total number of training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration by using the client to be updated to obtain a model parameter matrix to be updated, wherein the client to be updated is any one of the target clients;
updating the parameters of the local network model corresponding to the client to be updated according to the model parameter matrix to be updated;
wherein the calculation formula of the to-be-updated model parameter matrix w_g is:
(formula not reproduced in the source text; it computes w_g from w[r][i], W_all[i], sp_all[i], b[i], and grad[i])
where w[r][i] is the single model parameter matrix of the to-be-updated client r for the i-th iteration of the m-th round of federated learning, W_all[i] is the model parameter summarizing matrix of the i-th iteration of the m-th round of federated learning, sp_all[i] is the total number of training samples of the i-th iteration of the m-th round of federated learning, b[i] is the learning rate to be updated of the i-th iteration of the m-th round of federated learning, and grad[i] is the last-updated gradient data of the local network model corresponding to the to-be-updated client for the i-th iteration of the m-th round of federated learning.
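Since the claim-4 formula is likewise only an image in the source, the following sketch shows one plausible form combining the listed terms; the blend of the local and summarizing matrices and the role of sp_all[i] are assumptions, not the patented formula.

import numpy as np

def client_update(w_r_i, w_all_i, sp_all_i, b_i, grad_i):
    # Assumption: pull the local matrix toward the summarizing matrix,
    # then take a gradient step scaled by the broadcast learning rate.
    # sp_all_i could additionally weight the blend; its exact role is
    # not recoverable from the text.
    blended = (np.asarray(w_r_i) + np.asarray(w_all_i)) / 2
    return blended - b_i * np.asarray(grad_i)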
5. The method of claim 1, wherein the federated learning end condition includes: the number of iterations of the m-th round of federated learning reaches the preset number of single-round federated learning iterations, or the deviations of the model parameter summarizing matrices of three adjacent iterations of the m-th round of federated learning meet the preset federated learning convergence condition;
the calculation formula of the preset number of single-round federated learning iterations, epoch, is:
epoch = g(5 + log2(n))
wherein g(5 + log2(n)) rounds 5 + log2(n) up to the nearest integer, n is the number of participating training institutions, and log2(n) is the base-2 logarithm of n;
the condition that the deviations of the model parameter summarizing matrices of three adjacent iterations of the m-th round of federated learning meet the preset federated learning convergence condition includes:
the deviation between the model parameter summarizing matrices of the i-th and (i-1)-th iterations of the m-th round of federated learning and the deviation between the model parameter summarizing matrices of the (i-1)-th and (i-2)-th iterations of the m-th round of federated learning are both smaller than the federated learning deviation threshold;
wherein the calculation formula of the deviation w_p[i] between the model parameter summarizing matrices of the i-th and (i-1)-th iterations of the m-th round of federated learning is:
w_p[i] = avg(W_all[i] - W_all[i-1])
where W_all[i] is the model parameter summarizing matrix of the i-th iteration of the m-th round of federated learning, W_all[i-1] is the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning, and avg() computes the average of the absolute values of the entries of the matrix.
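The end condition of claim 5 is fully specified, so it can be transcribed directly; only the deviation threshold is left as a caller-supplied parameter.

import math
import numpy as np

def max_iterations(n: int) -> int:
    # epoch = g(5 + log2(n)), where g rounds up to the nearest integer.
    return math.ceil(5 + math.log2(n))

def deviation(w_all_i, w_all_prev) -> float:
    # w_p[i] = avg(|W_all[i] - W_all[i-1]|), taken entrywise.
    return float(np.mean(np.abs(np.asarray(w_all_i) - np.asarray(w_all_prev))))

def should_stop(i, n, w_hist, threshold) -> bool:
    # w_hist holds [W_all[i-2], W_all[i-1], W_all[i]].
    if i >= max_iterations(n):
        return True
    return (deviation(w_hist[2], w_hist[1]) < threshold and
            deviation(w_hist[1], w_hist[0]) < threshold)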
6. The federated-learning-based model training method of claim 1, wherein, before the step of obtaining the number of single model training samples and the single model parameter matrix sent by each target client of the i-th iteration of the m-th round of federated learning, the method further comprises:
obtaining a plurality of training samples to be processed through a client to be processed, wherein the training samples to be processed comprise: the resume to be processed, the recruitment post information and the resume screening calibration value, wherein the client to be processed is any one of the target clients;
acquiring one training sample to be processed from a plurality of training samples to be processed as a target training sample to be processed;
respectively performing structural analysis on the resume to be processed and the recruitment post information corresponding to the training sample to be processed to obtain a structural data pair to be processed;
performing local training sample generation on the resume screening calibration value corresponding to the target to-be-processed training sample according to the to-be-processed structured data pair, to obtain a to-be-stored local training sample corresponding to the target to-be-processed training sample;
repeatedly executing the step of obtaining one training sample to be processed from a plurality of training samples to be processed as a target training sample to be processed until the obtaining of the training sample to be processed is completed;
and taking all the local training samples to be stored as the local training sample set corresponding to the client to be processed.
7. The method of claim 6, wherein the step of performing local training sample generation on the resume screening calibration value corresponding to the target to-be-processed training sample according to the to-be-processed structured data pair, to obtain a to-be-stored local training sample corresponding to the target to-be-processed training sample, comprises:
standardizing the structural data pair to be processed to obtain a standardized structural data pair;
acquiring a preset knowledge graph, and extracting information of the standardized structured data pair according to the preset knowledge graph to obtain an information set pair to be converted;
carrying out vector conversion of a tensor space on the information set pair to be converted to obtain a tensor space vector pair;
and performing local training sample generation on the resume screening calibration value corresponding to the target to-be-processed training sample according to the tensor space vector pair, to obtain a to-be-stored local training sample corresponding to the target to-be-processed training sample.
8. A federal learning-based model training apparatus, the apparatus comprising:
the data acquisition module is used for acquiring the number of single model training samples and a single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning, wherein the target clients are used for performing model training on a local network model by adopting a local training sample set and extracting the number of the single model training samples and the single model parameter matrix according to the local network model after training;
the participating training institution number determining module is used for acquiring the number of the target clients as the number of participating training institutions;
the training sample total number determining module is used for adding the number of all the single model training samples to obtain the total number of the training samples;
the model parameter summarizing matrix determining module is used for obtaining the model parameter summarizing matrix of the (i-1)-th iteration of the m-th round of federated learning, and calculating the model parameter summarizing matrix of the i-th iteration according to the number of participating training institutions, all the single model parameter matrices, all the single model training sample numbers, and the model parameter summarizing matrix of the (i-1)-th iteration;
the to-be-updated learning rate determining module is used for calculating the learning rate according to the number of participating training institutions to obtain the learning rate to be updated;
the parameter updating module is used for sending the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration to each target client, wherein the target client is further used for updating parameters of the local network model according to the total number of the training samples, the learning rate to be updated and the model parameter summarizing matrix of the ith iteration;
and the cyclic execution module is used for adding 1 to the i, and repeatedly executing the step of obtaining the number of the single model training samples and the single model parameter matrix sent by each target client of the ith iteration of the mth round of federal learning until the federal learning end condition is met.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202110924484.9A 2021-08-12 2021-08-12 Model training method, device, equipment and storage medium based on federal learning Active CN113642707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110924484.9A CN113642707B (en) 2021-08-12 2021-08-12 Model training method, device, equipment and storage medium based on federal learning

Publications (2)

Publication Number Publication Date
CN113642707A true CN113642707A (en) 2021-11-12
CN113642707B CN113642707B (en) 2023-08-18

Family

ID=78421075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110924484.9A Active CN113642707B (en) 2021-08-12 2021-08-12 Model training method, device, equipment and storage medium based on federal learning

Country Status (1)

Country Link
CN (1) CN113642707B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021121106A1 (en) * 2019-12-20 2021-06-24 深圳前海微众银行股份有限公司 Federated learning-based personalized recommendation method, apparatus and device, and medium
CN112288100A (en) * 2020-12-29 2021-01-29 支付宝(杭州)信息技术有限公司 Method, system and device for updating model parameters based on federal learning
CN112862011A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Model training method and device based on federal learning and federal learning system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114219095A (en) * 2021-11-15 2022-03-22 浙江大华技术股份有限公司 Training method and device of machine learning model and readable storage medium
CN114219095B (en) * 2021-11-15 2024-05-10 浙江大华技术股份有限公司 Training method and device for machine learning model and readable storage medium
CN114417138A (en) * 2021-12-27 2022-04-29 海信集团控股股份有限公司 Health information recommendation method and equipment
CN114417138B (en) * 2021-12-27 2024-04-02 海信集团控股股份有限公司 Health information recommendation method and equipment

Also Published As

Publication number Publication date
CN113642707B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
CN108446769B (en) Knowledge graph relation inference method, knowledge graph relation inference device, computer equipment and storage medium
CN109783604B (en) Information extraction method and device based on small amount of samples and computer equipment
US10360482B1 (en) Crowd-sourced artificial intelligence image processing services
CN111598213B (en) Network training method, data identification method, device, equipment and medium
CN112016295B (en) Symptom data processing method, symptom data processing device, computer equipment and storage medium
WO2021169364A1 (en) Semantic emotion analysis method and apparatus, device, and storage medium
CN112259247B (en) Method, device, equipment and medium for confrontation network training and medical data supplement
WO2022178948A1 (en) Model distillation method and apparatus, device, and storage medium
CN112434217A (en) Position information recommendation method, system, computer equipment and storage medium
CN112699923A (en) Document classification prediction method and device, computer equipment and storage medium
Mesquita et al. Embarrassingly parallel MCMC using deep invertible transformations
CN113642707A (en) Model training method, device, equipment and storage medium based on federal learning
Djaneye-Boundjou et al. Gradient-based discrete-time concurrent learning for standalone function approximation
CN112766485B (en) Named entity model training method, device, equipment and medium
CN112541739B (en) Method, device, equipment and medium for testing question-answer intention classification model
CN111079175B (en) Data processing method, data processing device, computer readable storage medium and computer equipment
CN113947185B (en) Task processing network generation method, task processing device, electronic equipment and storage medium
CN113935554B (en) Model training method in delivery system, resource delivery method and device
CN113610215B (en) Task processing network generation method, task processing device and electronic equipment
CN114548297A (en) Data classification method, device, equipment and medium based on domain self-adaption
CN113657496A (en) Information matching method, device, equipment and medium based on similarity matching model
CN113673698B (en) Distillation method, device, equipment and storage medium suitable for BERT model
CN112989788A (en) Method, device, equipment and medium for extracting relation triples
Yu et al. A homoscedasticity test for the accelerated failure time model
CN112559671B (en) ES-based text search engine construction method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant