CN112101578A - Distributed language relationship recognition method, system and device based on federal learning - Google Patents
- Publication number: CN112101578A
- Application number: CN202011285430.4A
- Authority: CN (China)
- Legal status: Granted (the legal status listed is an assumption, not a legal conclusion)
Classifications
- G06N 20/20 — Machine learning; ensemble learning
- G06F 18/24 — Pattern recognition; classification techniques
- G06F 18/25 — Pattern recognition; fusion techniques
- G06F 30/27 — Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
Abstract
The invention belongs to the field of data identification and relates to a distributed language relationship recognition method, system, and device based on federal learning, aiming at the problem that multiple participants find it difficult to build a joint model without sharing data. In the method, each local platform acquires data to be recognized and labeled local training data, and trains a local classifier model on the local training data; the trained local classifier models are then aggregated into a global classifier model, each local classifier model is re-initialized from the global model, and the re-initialized local classifier recognizes the data to be recognized to produce a language relationship prediction result. The invention enables each local platform to use data and build machine learning models jointly with other platforms without sharing data, improves collaboration in training natural language recognition models, mitigates the data-island problem, and reduces the total amount of training data each platform needs.
Description
Technical Field
The invention belongs to the field of data identification, and particularly relates to a distributed language relationship recognition method, system, and device based on federal learning.
Background
In practice, most enterprises hold small amounts of data of uneven quality, which is insufficient to support artificial intelligence applications. At the same time, data protection regulations at home and abroad are being strengthened step by step, so letting data flow freely only under safe and compliant conditions has become a clear trend. Data held by companies often has great potential value for both users and enterprises; yet because of competing interests, two companies, or even two departments of the same company, are usually unwilling to provide their private data to each other, so data frequently sits in isolated islands. Federal learning emerged to address these data-island and privacy-protection issues. Federal machine learning (federal learning), also known as joint learning or alliance learning, is a machine learning framework that can effectively help multiple organizations use data and build machine learning models jointly while satisfying user privacy, data security, and regulatory requirements. As a distributed machine learning paradigm, federal learning can effectively resolve the data-island problem: participants model jointly without sharing data, technically breaking down data islands and enabling collaborative artificial intelligence.
Disclosure of Invention
To solve the above problems in the prior art, namely how to let multiple participants model jointly without sharing data and thus break through the data-island barrier to collaborative artificial intelligence, the invention provides a distributed language relationship recognition method based on federal learning, comprising the following steps:
Step S100, each local platform acquires data to be identified and local training data with labels;
Step S200, each local platform trains a first local classifier model through the labeled local training data to obtain a second local classifier model;
Step S300, generate a global classifier model by a weighted average method based on the second local classifier models of the local platforms;
Step S400, initialize the second local classifier model of each local platform based on the global classifier model, generating a third local classifier model;
Step S500, perform language relationship recognition on the data to be recognized through the third local classifier model, obtaining a language relationship prediction result of the data to be recognized.
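The five steps above amount to one round of federated averaging. A minimal pure-Python sketch of the step-S300 aggregation follows; the flat parameter layout and the weighting by local sample count are assumptions of this sketch (the claims only specify "a weighted average method"):

```python
def federated_average(local_models, sample_counts):
    """Weighted average of per-platform parameters (step S300).

    local_models: list of dicts mapping parameter name -> list of floats
    sample_counts: number of labeled training sentences on each platform
    """
    total = sum(sample_counts)
    global_model = {}
    for name in local_models[0]:
        size = len(local_models[0][name])
        # each coordinate is averaged with weight n_j / total
        global_model[name] = [
            sum(m[name][i] * n for m, n in zip(local_models, sample_counts)) / total
            for i in range(size)
        ]
    return global_model

# Two platforms holding 1 and 3 labeled sentences respectively
platforms = [{"W": [1.0, 0.0]}, {"W": [3.0, 2.0]}]
global_model = federated_average(platforms, [1, 3])
# weights 0.25 / 0.75 -> W = [2.5, 1.5]
```

In step S400, the resulting `global_model` would simply overwrite each platform's local parameters before local recognition resumes.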
Further, step S200 includes:
Step S210, let t = 1, encode the labeled local training data through a BERT model, generating 1 sentence characterization representation and a plurality of entity characterization representations for each sentence of the labeled local training data;
Step S220, select the t-th sentence of the labeled local data, select 2 entity characterization representations, and splice the 2 selected entity representations with the sentence representation to generate a sentence representation x containing entity information:

x = [s; e1; e2]

where s is the sentence characterization representation, e1 is the 1st selected entity characterization representation, e2 is the 2nd selected entity characterization representation, s, e1, e2 ∈ R^d, R denotes the real space, and d denotes the dimension of each characterization representation;
Step S230, based on the sentence representation x containing entity information, obtain the predicted language relation p of the labeled local training data through the first local classifier model:

o(x) = softmax(Wx + b),   p = argmax_r o_r(x)

where o(x) is the predicted relation distribution of the sentence representation x, the final predicted relation p is the relation with the maximum value in the distribution, r denotes a relation label, S denotes the sentence set (s ∈ S), θ = {W, b} denotes the model parameters, W and b are trainable parameters adjusted during training, and softmax denotes the standard softmax classifier;
Step S240, let t = t + 1 and jump to step S220; adjust the trainable parameters W and b through a stochastic gradient descent algorithm until the local loss function of the model is smaller than a preset first threshold, obtaining the second local classifier model.
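Steps S220 and S230 can be sketched as follows in pure Python. The toy dimensions and weight values are illustrative assumptions; in the patent, s, e1, and e2 would come from a BERT encoder rather than being hand-written vectors:

```python
import math

def sentence_with_entities(s, e1, e2):
    """Step S220: splice x = [s; e1; e2] into one (3d)-dimensional representation."""
    return s + e1 + e2

def softmax(z):
    m = max(z)  # shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict_relation(x, W, b):
    """Step S230: o = softmax(Wx + b); p is the argmax over the distribution."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + br
              for row, br in zip(W, b)]
    dist = softmax(logits)
    return dist, max(range(len(dist)), key=dist.__getitem__)

# Toy d = 2 representations and two relation labels
s, e1, e2 = [0.5, -0.1], [1.0, 0.0], [0.0, 1.0]
x = sentence_with_entities(s, e1, e2)          # len(x) == 3 * d == 6
W = [[0.2] * 6, [0.4] * 6]                     # one row of W per relation label
dist, p = predict_relation(x, W, [0.0, 0.0])   # p == 1 (second relation wins)
```

Concatenating the two marked entity representations with the sentence vector is what makes the classifier entity-aware: the same sentence with different marked entity pairs yields different x and can receive different relation predictions.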
Further, step S300 further includes step S300B of obtaining a global classifier model by knowledge distillation based on the second local classifier models of the respective local platforms.
Further, the global classifier model is obtained by knowledge distillation based on each second local classifier model, the method comprising:
Step S310B, each local platform acquires labeled global server data;
Step S320B, predict the labeled global server data through the second local classifier model of each local platform, obtaining a set of local prediction relation distributions over the global data;
Step S330B, aggregate the set of local prediction relation distributions, and optimize, through a stochastic gradient descent algorithm, the global loss function L over the aggregated distributions and the labeled global server data until the global loss function value L is smaller than a preset second threshold, obtaining the global classifier model;
the global loss function L being:

L = − Σ_{x ∈ D_v} Σ_i q_i(x) log o_i(x)

where D_v is the validation set, i is an index over relation labels, o_i(x) is the predicted probability assigned to relation i for the sentence representation x containing entity information, and q_i(x) is the softened distribution aggregated from the second local classifier models:

q_i(x) = exp(z_i(x)/τ) / Σ_r exp(z_r(x)/τ)

where τ is the temperature parameter used to control the distillation distribution, r is another relation index different from i, and z_i(x) is:

z_i(x) = (1/J) Σ_{j=1}^{J} z_i^(j)(x)

where z_i^(j)(x) is the value at label position i output by the j-th second local classifier model for the sentence representation x, aggregated over the sentence representations containing entity information; s denotes a sentence, j is the index of a local classifier model, and J is the number of local classifier models.
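The distillation objective above (average the J local models' label-position values, soften with temperature τ, and take cross-entropy against the model being optimized) can be sketched in pure Python. Softening the student distribution with the same τ is a common knowledge-distillation convention assumed here, not something the claims state explicitly:

```python
import math

def softened(logits, tau):
    """Temperature-softened softmax: q_i = exp(z_i/tau) / sum_r exp(z_r/tau)."""
    exps = [math.exp(z / tau) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(local_logits, student_logits, tau=2.0):
    """Cross-entropy between the aggregated teacher distribution q and the
    student's softened distribution (student softening is a sketch assumption)."""
    J = len(local_logits)
    # z_i = (1/J) * sum_j z_i^(j): average the label-position values over models
    avg = [sum(z[i] for z in local_logits) / J
           for i in range(len(local_logits[0]))]
    q = softened(avg, tau)              # teacher soft labels
    p = softened(student_logits, tau)   # student distribution
    return -sum(qi * math.log(pi) for qi, pi in zip(q, p))

teachers = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]   # J = 2 local classifiers
loss_match = distillation_loss(teachers, [1.75, 0.75, -0.75])  # student == average
loss_other = distillation_loss(teachers, [0.0, 0.0, 2.0])
```

Because cross-entropy H(q, p) is minimized exactly when p = q, `loss_match` (student logits equal to the teacher average) is the smallest value attainable for this teacher; any other student distribution gives a strictly larger loss.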
In another aspect of the present invention, a distributed language relationship recognition system based on federal learning is provided, the system comprising: a data acquisition module, a local training module, an aggregation module, a local classifier optimization module, and a data identification module;
the data acquisition module is configured so that each local platform acquires data to be identified and local training data with labels;
the local training module is configured to train a first local classifier model through local training data with labels by each local platform to obtain a second local classifier model;
the aggregation module is configured to generate a global classifier model through a weighted average method based on the second local classifier models of the local platforms;
the local classifier optimization module is configured to initialize the second local classifier model of each local platform based on the global classifier model and generate a third local classifier model;
the data identification module is configured to perform language relationship identification on the data to be identified through the third local classifier model, and obtain a language relationship prediction result of the data to be identified.
Further, the local training module comprises: a sentence coding unit, a characterization representation unit, a sentence prediction unit, and an iterative updating unit;
the sentence coding unit lets t = 1 and encodes the labeled local training data through a BERT model, generating 1 sentence characterization representation and a plurality of entity characterization representations for each sentence of the labeled local training data;
the characterization representation unit randomly selects 2 entity characterization representations for each sentence of the labeled local data and splices them with the sentence characterization representation to generate a sentence representation x containing entity information:

x = [s; e1; e2]

where s is the sentence characterization representation, e1 is the 1st selected entity characterization representation, e2 is the 2nd selected entity characterization representation, s, e1, e2 ∈ R^d, R denotes the real space, and d denotes the dimension of each characterization representation;
the sentence prediction unit is configured to obtain, based on the sentence representation x containing entity information, the predicted relation p of the labeled local training data through the first local classifier model:

o(x) = softmax(Wx + b),   p = argmax_r o_r(x)

where o(x) is the predicted relation distribution of each sentence representation x containing entity information, the final predicted relation p is the relation with the maximum value in the distribution, r denotes a relation label, S denotes the sentence set, θ = {W, b} denotes the model parameters, W and b are trainable parameters adjusted during training, and softmax denotes the standard softmax classifier;
the iterative updating unit lets t = t + 1 and jumps back to the characterization representation unit, adjusting the trainable parameters W and b through a stochastic gradient descent algorithm until the local loss function of the model is smaller than a preset first threshold, obtaining the second local classifier model.
Further, the aggregation module further includes: and obtaining a global classifier model by knowledge distillation based on the second local classifier models of the local platforms.
Further, the global classifier model is obtained by knowledge distillation based on the second local classifier model of each local platform, the method comprising:
Step S310B, acquire labeled global server data;
Step S320B, predict the labeled global server data through each second local classifier model, obtaining a set of local prediction relation distributions over the global data;
Step S330B, aggregate the set of local prediction relation distributions, and optimize, through a stochastic gradient descent algorithm, the global loss function L over the aggregated distributions and the labeled global server data until the global loss function value L is smaller than a preset second threshold, obtaining the global classifier model;
the global loss function L being:

L = − Σ_{x ∈ D_g} Σ_i q_i(x) log o_i(x)

where D_l is the labeled local training data, D_g is the global data, i is an index over relation labels, o_i(x) is the predicted relation distribution of the sentence representation x containing local entity information, and q_i(x) is the prediction relation distribution of the aggregated second local classifier models:

q_i(x) = exp(z_i(x)/τ) / Σ_r exp(z_r(x)/τ)

where τ is the temperature parameter used to control the distillation distribution, r is another relation index different from i, and z_i(x) is:

z_i(x) = (1/J) Σ_{j=1}^{J} z_i^(j)(x)

where z_i^(j)(x) is the value at label position i output by the j-th second local classifier model, aggregated over sentence representations containing local entity information; s denotes a sentence, j is the index of a local classifier model, and J is the number of local classifier models.
In a third aspect of the present invention, a storage device is provided, in which a plurality of programs are stored, the programs being adapted to be loaded and executed by a processor to implement the above distributed language relationship recognition method based on federal learning.
In a fourth aspect of the present invention, a processing apparatus is provided, comprising a processor and a storage device; the processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; the programs are adapted to be loaded and executed by the processor to implement the above distributed language relationship recognition method based on federal learning.
The invention has the beneficial effects that:
(1) According to the distributed language relationship recognition method based on federal learning, the locally trained classifier models are aggregated into a global classifier model, and each local classifier model is then initialized from the global classifier model, so that a local platform can use data and build machine learning models jointly with other platforms without sharing data, improving collaboration in training natural language recognition models.
(2) Through the federal learning mode, each local classification model can be trained collaboratively without sending data off-platform, which alleviates the data-island problem, breaks down data barriers, and reduces the total amount of training data each platform needs to train its model.
(3) The knowledge distillation technique reduces the transmission cost of federal learning and improves the learning efficiency of the model.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
fig. 1 is a flow chart of a distributed language relationship recognition method based on federal learning according to an embodiment of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
The invention discloses a distributed language relationship recognition method based on federal learning, which comprises the following steps:
Step S100, each local platform acquires data to be identified and local training data with labels;
Step S200, each local platform trains a first local classifier model through the labeled local training data to obtain a second local classifier model;
Step S300, generate a global classifier model by a weighted average method based on the second local classifier models of the local platforms;
Step S400, initialize the second local classifier model of each local platform based on the global classifier model, generating a third local classifier model;
Step S500, perform language relationship recognition on the data to be recognized through the third local classifier model, obtaining a language relationship prediction result of the data to be recognized.
In order to more clearly illustrate the distributed language relationship identification method based on federal learning of the present invention, the following describes each step in the embodiment of the method of the present invention in detail with reference to fig. 1.
The distributed language relationship recognition method based on federal learning in the embodiment of the invention comprises the following steps S100-S500, and the steps are described in detail as follows:
Step S100, each local platform acquires data to be identified and local training data with labels;
Step S200, each local platform trains a first local classifier model through the labeled local training data to obtain a second local classifier model;
in this embodiment, step S200 includes:
Step S210, let t = 1, encode the labeled local training data through a BERT model, generating 1 sentence characterization representation and a plurality of entity characterization representations for each sentence of the labeled local training data; in this embodiment, the BERT model (Bidirectional Encoder Representations from Transformers) is a deep Transformer encoding model;
Step S220, select the t-th sentence of the labeled local data, select 2 entity characterization representations, and splice the 2 selected entity representations with the sentence representation to generate a sentence representation x containing entity information, as shown in equation (1):

x = [s; e1; e2]    (1)

where s is the sentence characterization representation, e1 is the 1st selected entity characterization representation, e2 is the 2nd selected entity characterization representation, s, e1, e2 ∈ R^d, R denotes the real space, and d denotes the dimension of each characterization representation;
in this embodiment, the entities are selected words: one sentence contains a plurality of words, and 2 entities are marked in each sentence;
Step S230, based on the sentence representation x containing entity information, obtain the predicted language relation p of the labeled local training data through the first local classifier model, as shown in equation (2):

o(x) = softmax(Wx + b),   p = argmax_r o_r(x)    (2)

where o(x) is the predicted relation distribution of each sentence representation x containing entity information, the final predicted relation p is the relation with the maximum value in the distribution, r denotes a relation label, S denotes the sentence set, θ = {W, b} denotes the model parameters, W and b are trainable parameters adjusted during training, and softmax denotes the standard softmax classifier;
the invention may also use other classifier models; softmax is used only as an example to facilitate understanding and does not limit the invention to this specific choice.
Step S240, let t = t + 1 and jump to step S220; adjust the trainable parameters W and b through a stochastic gradient descent algorithm until the local loss function of the model is smaller than a preset first threshold, obtaining the second local classifier model.
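The step-S240 loop (adjust W and b by stochastic gradient descent until the local loss falls below the first threshold) can be sketched in pure Python. The toy data, learning rate, and threshold values are assumptions of this sketch:

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def sgd_train(data, n_rel, d, lr=0.5, threshold=0.05, max_epochs=500):
    """Fit W (n_rel x d) and b by SGD on cross-entropy loss, stopping once
    the average loss drops below the preset first threshold (step S240)."""
    W = [[0.0] * d for _ in range(n_rel)]
    b = [0.0] * n_rel
    for _ in range(max_epochs):
        total = 0.0
        for x, y in data:
            p = softmax([sum(w * xi for w, xi in zip(row, x)) + br
                         for row, br in zip(W, b)])
            total += -math.log(p[y])
            # gradient of cross-entropy w.r.t. the logits is (p - one_hot(y))
            for r in range(n_rel):
                g = p[r] - (1.0 if r == y else 0.0)
                b[r] -= lr * g
                for k in range(d):
                    W[r][k] -= lr * g * x[k]
        if total / len(data) < threshold:
            break
    return W, b

# Two toy "sentence representations" (d = 2) with two relation labels
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
W, b = sgd_train(data, n_rel=2, d=2)
```

On this separable toy set the loop converges quickly; in the patent's setting, x would be the (3d)-dimensional spliced representation from step S220 and the threshold would be a tuning choice of each platform.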
Step S300, generating a global classifier model by a weighted average method based on the second local classifier models of the local platforms;
in this embodiment, step S300 further includes step S300B of obtaining a global classifier model by knowledge distillation based on the second local classifier models of the respective local platforms.
In this embodiment, the global classifier model is obtained by knowledge distillation based on the second local classifier model of each local platform, the method comprising:
Step S310B, each local platform acquires labeled global server data;
Step S320B, predict the labeled global server data through the second local classifier model of each local platform, obtaining a set of local prediction relation distributions over the global data;
Step S330B, aggregate the set of local prediction relation distributions, and optimize, through a stochastic gradient descent algorithm, the global loss function L over the aggregated distributions and the labeled global server data until the global loss function value L is smaller than a preset second threshold, obtaining the global classifier model;
the global loss function L is shown in equation (3):

L = − Σ_{x ∈ D_g} Σ_i q_i(x) log o_i(x)    (3)

where D_l is the labeled local training data, D_g is the global data, i is an index over relation labels, o_i(x) is the predicted relation distribution of the sentence representation x containing local entity information, and q_i(x) is the prediction relation distribution of the aggregated second local classifier models, as shown in equation (4):

q_i(x) = exp(z_i(x)/τ) / Σ_r exp(z_r(x)/τ)    (4)

where τ is the temperature parameter used to control the distillation distribution, r is another relation index different from i, and z_i(x) is shown in equation (5):

z_i(x) = (1/J) Σ_{j=1}^{J} z_i^(j)(x)    (5)

where z_i^(j)(x) is the value at label position i output by the j-th second local classifier model, aggregated over sentence representations containing local entity information; s denotes a sentence, j is the index of a local classifier model, and J is the number of local classifier models.
The I2B2 dataset was used as the training and testing corpus; it contains 10231 training instances and 19114 test instances.
The effectiveness of the proposed method is demonstrated by comparing it against prior-art methods. The results are shown in Table 1:
Table 1. Comparison of the effects of the prior art and the embodiment of the present invention.
The first part of the table (first three rows) shows the effect of traditional centralized methods on the corpus with standard annotations; the second part (last three rows) shows results trained in the federal learning mode. The experimental results show that the proposed federal training method outperforms the previous methods, which proves its effectiveness.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
A distributed language relationship recognition system based on federated learning according to a second embodiment of the present invention comprises: a data acquisition module, a local training module, an aggregation module, a local classifier optimization module, and a data identification module;
the data acquisition module is configured so that each local platform acquires data to be identified and local training data with labels;
the local training module is configured to train a first local classifier model through local training data with labels by each local platform to obtain a second local classifier model;
In this embodiment, the local training module includes: a sentence coding unit, a characterization representation unit, a sentence prediction unit, and an iterative updating unit;
the sentence coding unit lets t = 1 and encodes the labeled local training data through a BERT model, generating 1 sentence characterization representation and a plurality of entity characterization representations for each sentence of the labeled local training data;
the characterization representation unit selects 2 entity characterization representations for each sentence of the labeled local data and splices them with the sentence characterization representation to generate a sentence representation x containing entity information, as shown in equation (6):

x = [s; e1; e2]    (6)

where s is the sentence characterization representation, e1 is the 1st selected entity characterization representation, e2 is the 2nd selected entity characterization representation, s, e1, e2 ∈ R^d, R denotes the real space, and d denotes the dimension of each characterization representation;
the sentence prediction unit is configured to obtain, based on the sentence representation x containing entity information, the predicted relation p of the labeled local training data through the first local classifier model, as shown in equation (7):

o(x) = softmax(Wx + b),   p = argmax_r o_r(x)    (7)

where o(x) is the predicted relation distribution of each sentence representation x containing entity information, the final predicted relation p is the relation with the maximum value in the distribution, r denotes a relation label, S denotes the sentence set, θ = {W, b} denotes the model parameters, W and b are trainable parameters adjusted during training, and softmax denotes the standard softmax classifier;
the iterative updating unit lets t = t + 1 and jumps back to the characterization representation unit, adjusting the trainable parameters W and b through a stochastic gradient descent algorithm until the local loss function of the model is smaller than a preset first threshold, obtaining the second local classifier model.
The aggregation module is configured to generate a global classifier model through a weighted average method based on the second local classifier models of the local platforms;
in this embodiment, the aggregation module further includes: and obtaining a global classifier model by knowledge distillation based on the second local classifier models of the local platforms.
In this embodiment, the second local classifier model based on each local platform obtains the global classifier model by knowledge distillation, and the method includes:
step S310B, acquiring global server data with labels;
step S320B, the global server data with labels is predicted through each second local classifier model respectively, and a global data local prediction relation is obtainedCollecting;
step S330B, aggregating the set of global-data local prediction relations, and optimizing, through a stochastic gradient descent algorithm, the global loss function L of the labeled global server data until the global loss function value L is smaller than a preset second threshold, to obtain the global classifier model;
the global loss function L is shown in formula (8):

L = − Σ_{x∈D̃} Σ_i P̄_i(x) · log P̃_i(x)    (8)
wherein D is the labeled local training data, D̃ is the global data, i is a class index, P̃_i(x) is the predicted relation distribution of the sentence representation x containing entity information under the model being optimized, and P̄_i(x) is the predicted relation distribution of the aggregated model of the second local classifier models, as shown in formula (9):
P̄_i = exp(z̄_i / τ) / Σ_r exp(z̄_r / τ)    (9)

wherein τ is the temperature parameter used to control the smoothness of the distilled distribution, r is a class index different from i, and the aggregated logit z̄_i is shown in formula (10):
z̄_i(s) = (1/J) · Σ_{j=1…J} z_i^j(s)    (10)

wherein z̄_i(s) is the value at label position i of the model aggregated over the second local classifier models, obtained by averaging, over the local classifiers, the predicted relation distributions of the sentence representations containing entity information, s denotes a sentence, j is the index of a local classifier model, and J is the number of local classifier models.
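The distillation-based aggregation of formulas (9) and (10) — average the J local classifiers' logits, soften them with temperature τ, and train the global model against the soft targets — can be sketched as follows. This is a hedged numpy illustration with made-up dimensions; the exact form of the patent's loss (8) may differ.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def aggregated_teacher_distribution(local_logits, tau):
    """Average the J second-local-classifier logits for one sentence
    (cf. formula (10)) and soften with temperature tau (cf. formula (9))."""
    z_bar = np.mean(local_logits, axis=0)     # shape: (num_relations,)
    return softmax(z_bar / tau)

def distillation_loss(student_probs, teacher_probs):
    """Cross-entropy of the global (student) model against the aggregated
    soft targets, for a single global-server example (cf. formula (8))."""
    return -np.sum(teacher_probs * np.log(student_probs + 1e-12))

# J = 3 hypothetical local classifiers, 4 relation classes.
rng = np.random.default_rng(1)
local_logits = rng.normal(size=(3, 4))
teacher = aggregated_teacher_distribution(local_logits, tau=2.0)
student = softmax(rng.normal(size=4))
loss = distillation_loss(student, teacher)
```

A larger τ flattens the teacher distribution, transferring more of the relative (non-argmax) information between classes to the student.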
The local classifier optimization module is configured to initialize the second local classifier model of each local platform based on the global classifier model and generate a third local classifier model;
the data identification module is configured to perform language relationship identification on the data to be identified through the third local classifier model, and obtain a language relationship prediction result of the data to be identified.
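Putting the modules together, one federated round (local training, weighted-average aggregation, re-initialization from the global model, prediction on data to be identified) can be sketched end to end. This is an illustrative numpy toy with linear classifiers standing in for the BERT-based models; all sizes and data are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def local_train(W, b, data, lr=0.1, steps=50):
    """Second local classifier: a few SGD epochs on one platform's labeled data."""
    for _ in range(steps):
        for x, y in data:
            p = softmax(W @ x + b)
            g = p.copy(); g[y] -= 1.0        # softmax + cross-entropy gradient
            W = W - lr * np.outer(g, x)
            b = b - lr * g
    return W, b

rng = np.random.default_rng(2)
dim, classes, platforms = 6, 3, 2

# S100: each platform holds its own labeled local training data.
local_data = [[(rng.normal(size=dim), int(rng.integers(classes))) for _ in range(8)]
              for _ in range(platforms)]

# S200: train a second local classifier per platform from a shared first model.
W0, b0 = rng.normal(scale=0.1, size=(classes, dim)), np.zeros(classes)
trained = [local_train(W0.copy(), b0.copy(), d) for d in local_data]

# S300: weighted-average aggregation into the global classifier model.
Wg = sum(W for W, _ in trained) / platforms
bg = sum(b for _, b in trained) / platforms

# S400/S500: the third local classifier, initialized from the global model,
# predicts the relation of data to be identified.
x_new = rng.normal(size=dim)
prediction = int(np.argmax(softmax(Wg @ x_new + bg)))
```

Only parameters leave each platform; the labeled local training data never does, which is the federated-learning property the method relies on.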
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiments, and will not be described herein again.
It should be noted that, the distributed language relationship identification system based on federal learning provided in the foregoing embodiment is only illustrated by the division of the foregoing functional modules, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the modules or steps in the embodiment of the present invention are further decomposed or combined, for example, the modules in the foregoing embodiment may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.
A storage apparatus according to a third embodiment of the present invention stores therein a plurality of programs, which are adapted to be loaded and executed by a processor to implement the distributed language relationship recognition method based on federal learning described above.
A processing apparatus according to a fourth embodiment of the present invention includes a processor and a storage device; the processor is adapted to execute various programs; the storage device is adapted to store a plurality of programs; and the programs are adapted to be loaded and executed by the processor to implement the distributed language relationship recognition method based on federal learning described above.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Those of skill in the art would appreciate that the various illustrative modules and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that programs corresponding to the software modules and method steps may be located in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. To clearly illustrate this interchangeability of electronic hardware and software, various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as electronic hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
Claims (10)
1. A distributed language relationship recognition method based on federated learning is characterized in that the method comprises the following steps:
s100, each local platform acquires data to be identified and local training data with labels;
s200, each local platform trains a first local classifier model through the local training data with the labels to obtain a second local classifier model;
step S300, generating a global classifier model by a weighted average method based on the second local classifier models of the local platforms;
step S400, initializing the second local classifier model of each local platform based on the global classifier model, and generating a third local classifier model;
step S500, performing language relationship recognition on the data to be recognized through the third local classifier model, and obtaining a language relationship prediction result of the data to be recognized.
2. The distributed language relationship recognition method based on federal learning as claimed in claim 1, wherein step S200 comprises:
step S210, let t =1, perform an encoding operation on the tagged local training data through a BERT model, and generate 1 sentence-characterized representation and a plurality of entity-characterized representations for each sentence of the tagged local training data;
step S220, selecting the t-th sentence of the labeled local data, selecting 2 entity characterization representations, and concatenating the selected 2 entity characterization representations with the sentence characterization representation to generate a sentence representation containing entity information:
wherein, in the sentence representation containing entity information x = [s; e1; e2] ∈ R^(3d), s is the sentence characterization representation, e1 is the 1st selected entity characterization representation, e2 is the 2nd selected entity characterization representation, R denotes the real number space, and d denotes the dimension of each characterization representation;
step S230, based on the sentence representation x containing entity information, acquiring the predicted language relation p of the labeled local training data through the first local classifier model:

P(y | x, θ) = softmax(W·x + b)

wherein P(y | x, θ) is the predicted relation distribution of each sentence representation x containing entity information, the final predicted relation p is the relation with the maximum value in the predicted relation distribution, y denotes the relation label, S denotes the sentence set, θ denotes the model parameters, W and b are trainable parameters adjusted during training, and softmax denotes the standard softmax classifier;
step S240, letting t = t + 1 and jumping to step S220, adjusting the trainable parameters W and b through a stochastic gradient descent algorithm until the local loss function of the model is smaller than a preset first threshold, to obtain the second local classifier model.
3. The distributed language relationship recognition method based on federal learning as claimed in claim 2, wherein step S300 further comprises step S300B of obtaining a global classifier model by knowledge distillation based on the second local classifier models of the respective local platforms.
4. The distributed language relationship recognition method based on federal learning as claimed in claim 3, wherein the second local classifier model based on each local platform obtains a global classifier model by knowledge distillation, and the method comprises:
step S310B, each local platform acquires global server data with labels;
step S320B, predicting the labeled global server data through the second local classifier model of each local platform respectively, to obtain a set of global-data local prediction relations;
step S330B, aggregating the set of global-data local prediction relations, and optimizing, through a stochastic gradient descent algorithm, the global loss function L of the labeled global server data until the global loss function value L is smaller than a preset second threshold, to obtain the global classifier model;
the global loss function L is:

L = − Σ_{x∈D̃} Σ_i P̄_i(x) · log P̃_i(x)
wherein D is the labeled local training data, D̃ is the global data, i is a class index, P̃_i(x) is the predicted relation distribution of the sentence representation x containing entity information under the model being optimized, and P̄_i(x) is the predicted relation distribution of the aggregated model of the second local classifier models:

P̄_i = exp(z̄_i / τ) / Σ_r exp(z̄_r / τ)
wherein τ is the temperature parameter used to control the smoothness of the distilled distribution, r is a class index different from i, and the aggregated logit z̄_i is:

z̄_i(s) = (1/J) · Σ_{j=1…J} z_i^j(s)
wherein z̄_i(s) is the value at label position i of the model aggregated over the second local classifier models, obtained by averaging, over the local classifiers, the predicted relation distributions of the sentence representations containing entity information, s denotes a sentence, j is the index of a local classifier model, and J is the number of local classifier models.
5. A distributed language relationship recognition system based on federal learning, the system comprising: the system comprises a data acquisition module, a local training module, an aggregation module, a local classifier optimization module and a data identification module;
the data acquisition module is configured such that each local platform acquires data to be identified and labeled local training data;
the local training module is configured to train a first local classifier model through local training data with labels by each local platform to obtain a second local classifier model;
the aggregation module is configured to generate a global classifier model through a weighted average method based on the second local classifier models of the local platforms;
the local classifier optimization module is configured to initialize the second local classifier model of each local platform based on the global classifier model and generate a third local classifier model;
the data identification module is configured to perform language relationship identification on the data to be identified through the third local classifier model, and obtain a language relationship prediction result of the data to be identified.
6. The distributed language relationship recognition system based on federated learning of claim 5, wherein the local training module comprises: a sentence coding unit, a characterization representation unit, a sentence prediction unit and an iterative updating unit;
the sentence coding unit sets t = 1, performs an encoding operation on the labeled local training data through a BERT model, and generates 1 sentence characterization representation and a plurality of entity characterization representations for each sentence of the labeled local training data;
the characterization representation unit selects 2 entity characterization representations for each sentence of the labeled local data, and concatenates the selected 2 entity characterization representations with the sentence characterization representation to generate a sentence representation containing entity information:
wherein, in the sentence representation containing entity information x = [s; e1; e2] ∈ R^(3d), s is the sentence characterization representation, e1 is the 1st selected entity characterization representation, e2 is the 2nd selected entity characterization representation, R denotes the real number space, and d denotes the dimension of each characterization representation;
the sentence prediction unit is configured to obtain, based on the sentence representation x containing entity information, the predicted relation p of the labeled local training data through the first local classifier model:

P(y | x, θ) = softmax(W·x + b)

wherein P(y | x, θ) is the predicted relation distribution of each sentence representation x containing entity information, the final predicted relation p is the relation with the maximum value in the predicted relation distribution, y denotes the relation label, S denotes the sentence set, θ denotes the model parameters, W and b are trainable parameters adjusted during training, and softmax denotes the standard softmax classifier;
the iterative updating unit sets t = t + 1 and jumps back to the characterization representation unit, adjusting the trainable parameters W and b through a stochastic gradient descent algorithm until the local loss function of the model is smaller than a preset first threshold, to obtain the second local classifier model.
7. The distributed language relationship recognition system based on federated learning of claim 6, wherein the aggregation module is further configured to obtain a global classifier model by knowledge distillation based on the second local classifier models of the local platforms.
8. The distributed language relationship recognition system based on federal learning as claimed in claim 7, wherein the second local classifier model based on each local platform obtains a global classifier model by knowledge distillation by:
step S310B, acquiring global server data with labels;
step S320B, predicting the labeled global server data through each second local classifier model respectively, to obtain a set of global-data local prediction relations;
step S330B, aggregating the set of global-data local prediction relations, and optimizing, through a stochastic gradient descent algorithm, the global loss function L of the labeled global server data until the global loss function value L is smaller than a preset second threshold, to obtain the global classifier model;
the global loss function L is:

L = − Σ_{x∈D̃} Σ_i P̄_i(x) · log P̃_i(x)
wherein D is the labeled local training data, D̃ is the global data, i is a class index, P̃_i(x) is the predicted relation distribution of the sentence representation x containing entity information under the model being optimized, and P̄_i(x) is the predicted relation distribution of the aggregated model of the second local classifier models:

P̄_i = exp(z̄_i / τ) / Σ_r exp(z̄_r / τ)
wherein τ is the temperature parameter used to control the smoothness of the distilled distribution, r is a class index different from i, and the aggregated logit z̄_i is:

z̄_i(s) = (1/J) · Σ_{j=1…J} z_i^j(s)
wherein z̄_i(s) is the value at label position i of the model aggregated over the second local classifier models, obtained by averaging the predicted relation distributions of the sentence representations containing entity information, s denotes a sentence, j is the index of a local classifier model, and J is the number of local classifier models.
9. A storage device having stored therein a plurality of programs, wherein the programs are adapted to be loaded and executed by a processor to implement the distributed language relationship recognition based on federated learning method of any of claims 1-4.
10. A processing apparatus comprising a processor adapted to execute various programs, and a storage apparatus adapted to store a plurality of programs, wherein the programs are adapted to be loaded and executed by the processor to implement the federal learning based distributed language relationship identification method as claimed in any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011285430.4A CN112101578B (en) | 2020-11-17 | 2020-11-17 | Distributed language relationship recognition method, system and device based on federal learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112101578A true CN112101578A (en) | 2020-12-18 |
CN112101578B CN112101578B (en) | 2021-02-23 |
Family
ID=73784706
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011285430.4A Active CN112101578B (en) | 2020-11-17 | 2020-11-17 | Distributed language relationship recognition method, system and device based on federal learning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103559181A (en) * | 2013-11-14 | 2014-02-05 | 苏州大学 | Establishment method and system for bilingual semantic relation classification model |
CN108280058A (en) * | 2018-01-02 | 2018-07-13 | 中国科学院自动化研究所 | Relation extraction method and apparatus based on intensified learning |
CN108345583A (en) * | 2017-12-28 | 2018-07-31 | 中国科学院自动化研究所 | Event recognition and sorting technique based on multi-lingual attention mechanism and device |
US20200057810A1 (en) * | 2017-12-11 | 2020-02-20 | Abbyy Production Llc | Information object extraction using combination of classifiers |
CN111737552A (en) * | 2020-06-04 | 2020-10-02 | 中国科学院自动化研究所 | Method, device and equipment for extracting training information model and acquiring knowledge graph |
CN111831829A (en) * | 2020-06-12 | 2020-10-27 | 广州多益网络股份有限公司 | Entity relationship extraction method and device for open domain and terminal equipment |
2020-11-17: application CN202011285430.4A granted as patent CN112101578B (status: Active)
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022178719A1 (en) * | 2021-02-24 | 2022-09-01 | 华为技术有限公司 | Horizontal federated learning-based training method, apparatus and system |
WO2022227212A1 (en) * | 2021-04-25 | 2022-11-03 | 平安科技(深圳)有限公司 | Federated learning-based speech representation model training method and apparatus, device, and medium |
CN113537509A (en) * | 2021-06-28 | 2021-10-22 | 南方科技大学 | Collaborative model training method and device |
WO2023005133A1 (en) * | 2021-07-28 | 2023-02-02 | 深圳前海微众银行股份有限公司 | Federated learning modeling optimization method and device, and readable storage medium and program product |
CN113657607A (en) * | 2021-08-05 | 2021-11-16 | 浙江大学 | Continuous learning method for federal learning |
CN113657607B (en) * | 2021-08-05 | 2024-03-22 | 浙江大学 | Continuous learning method for federal learning |
CN117540829A (en) * | 2023-10-18 | 2024-02-09 | 广西壮族自治区通信产业服务有限公司技术服务分公司 | Knowledge sharing large language model collaborative optimization method and system |
CN117540829B (en) * | 2023-10-18 | 2024-05-17 | 广西壮族自治区通信产业服务有限公司技术服务分公司 | Knowledge sharing large language model collaborative optimization method and system |
Also Published As
Publication number | Publication date |
---|---|
CN112101578B (en) | 2021-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112101578B (en) | Distributed language relationship recognition method, system and device based on federal learning | |
CN104781837B (en) | System and method for forming predictions using event-based sentiment analysis | |
US20170323212A1 (en) | Agent aptitude prediction | |
CN109033105A (en) | The method and apparatus for obtaining judgement document's focus | |
CN113139628B (en) | Sample image identification method, device and equipment and readable storage medium | |
CN109948140B (en) | Word vector embedding method and device | |
CN111444410A (en) | Associated transaction mining and identifying method and device based on knowledge graph | |
CN108959236A (en) | Medical literature disaggregated model training method, medical literature classification method and its device | |
CN111428448A (en) | Text generation method and device, computer equipment and readable storage medium | |
CN112925914B (en) | Data security grading method, system, equipment and storage medium | |
CN107665221A (en) | The sorting technique and device of keyword | |
CN112084342A (en) | Test question generation method and device, computer equipment and storage medium | |
CN113850666A (en) | Service scheduling method, device, equipment and storage medium | |
CN113505273B (en) | Data sorting method, device, equipment and medium based on repeated data screening | |
US7644049B2 (en) | Decision forest based classifier for determining predictive importance in real-time data analysis | |
CN116756281A (en) | Knowledge question-answering method, device, equipment and medium | |
CN108830302B (en) | Image classification method, training method, classification prediction method and related device | |
CN113724055B (en) | Commodity attribute mining method and device | |
US20160162538A1 (en) | Platform for consulting solution | |
CN115293867A (en) | Financial reimbursement user portrait optimization method, device, equipment and storage medium | |
CN110555143A (en) | Question automatic answering method and computer storage medium | |
CN114387088A (en) | Loan risk identification method and device based on knowledge graph | |
CN113536111A (en) | Insurance knowledge content recommendation method and device and terminal equipment | |
CN108241650A (en) | The training method and device of training criteria for classification | |
CN115496638B (en) | Student course score analysis management method and system based on smart campus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |