WO2022121183A1

WO2022121183A1 - Text model training method, recognition method, apparatus, device and storage medium

Info

Publication number: WO2022121183A1
Application number: PCT/CN2021/084297
Authority: WO
Inventors: 李志韬; 王健宗; 程宁
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-12-11
Filing date: 2021-03-31
Publication date: 2022-06-16
Also published as: CN112734050A

Abstract

A federated learning-based text model training method, a text model recognition method, an apparatus, a computer device, and computer-readable storage medium. The federated learning-based text model training method comprises: acquiring a data set to be trained, and on the basis of the data set, training a preset language model to obtain model parameter information of the preset language model (S101); encrypting the model parameter information and uploading same to a preset aggregated federated model so as to obtain aggregated model parameter information which is returned after the preset aggregated federated model performs federated learning on the model parameter information (S102); and on the basis of the aggregated model parameter information, updating the preset language model to obtain a corresponding text model (S103). A plurality of models are jointly trained on the basis of protecting data privacy, the accuracy with which infringing text is predicted is improved, and the training time for the models is reduced.

Description

Training method, recognition method, device, equipment and storage medium of text model

This application claims the priority of the Chinese patent application filed on December 11, 2020, with the application number of 202011446681.6 and the invention titled "Text Model Training Method, Recognition Method, Apparatus, Equipment and Storage Medium", all of which are The contents are incorporated herein by reference.

technical field

The present application relates to the technical field of artificial intelligence, and in particular, to a method for training a text model based on federated learning, a method, apparatus, computer equipment and computer-readable storage medium for identifying a text model.

Background technique

The inventor realizes that the traditional method of detecting illegal content is to hire professionals to screen, label and filter. Although AI filtering is currently introduced, and semantic recognition and classification technologies are used, different enterprise platforms receive different illegal content. However, considering the privacy, insecurity and inability to spread and share these illegal content data, it is difficult to achieve joint modeling.

technical problem

The purpose of this application is to solve the technical problem that the existing data set is uploaded to the cloud as model training data, which is prone to data set leakage, which damages the safety of users, and the obtained training model predicts the technical problem of inaccurate content of violations.

technical solutions

In a first aspect, the present application provides a method for training a text model based on federated learning, and the method for training a text model based on federated learning includes the following steps:

Acquire the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model;

Encrypting and uploading the model parameter information to a preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;

The preset language model is updated based on the aggregated model parameter information to obtain a corresponding text model.

In a second aspect, the present application provides a method for identifying a text model based on federated learning, and the method for identifying a text model based on federated learning includes the following steps:

Get the text to be predicted;

Based on the text encoding model and the to-be-predicted text, acquiring second text semantic vector information of the text-encoding model outputting the to-be-predicted text;

obtaining, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;

According to the label information, it is determined whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the above-mentioned training method of the text model based on federated learning.

In a third aspect, the present application also provides a training device for a text model based on federated learning, and the device for training a text model based on federated learning includes:

a first obtaining module, configured to obtain the data of the set to be trained, train a preset language model based on the data of the set to be trained, and obtain model parameter information of the preset language model;

a second acquisition module, configured to encrypt and upload the model parameter information to a preset aggregate federation model, so as to acquire aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;

A generating module is configured to update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.

In a fourth aspect, the present application further provides a training device for a text model based on federated learning, and the device for training a text model based on federated learning includes:

a first obtaining module, used for obtaining the text to be predicted;

a second acquiring module, configured to acquire, based on the text encoding model and the to-be-predicted text, the second text semantic vector information of the text-encoding model outputting the to-be-predicted text;

a third acquiring module, configured to acquire, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;

A determination module, configured to determine whether the text to be predicted violates the rules according to the label information, wherein the text encoding model and the text recognition model are obtained by the above-mentioned training method of the text model based on federated learning.

In a fifth aspect, the present application also provides a computer device, the computer device comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable When the instructions are executed by the processor, the following steps are implemented:

In a sixth aspect, the present application also provides a computer device comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions When the instructions are executed by the processor, the following steps are implemented:

Get the text to be predicted;

According to the label information, it is determined whether the text to be predicted violates the rules.

In a seventh aspect, the present application further provides a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium, wherein when the computer-readable instructions are executed by a processor, the following steps are implemented:

In an eighth aspect, the present application further provides a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium, wherein when the computer-readable instructions are executed by a processor, the following steps are implemented:

Get the text to be predicted;

beneficial effect

The present application provides a method for training a text model based on federated learning, a method, device, computer equipment, and a computer-readable storage medium for identifying a text model. By acquiring the data of the to-be-trained set, the preset language is trained based on the data of the to-be-trained set. model, obtain the model parameter information of the preset language model; encrypt the model parameter information and upload it to the preset aggregate federation model to obtain the preset aggregate federation model and perform federated learning on the model parameter information and return it based on the aggregation model parameter information; update the preset language model based on the aggregation model parameter information to obtain the corresponding text model, realize the joint training of multiple models on the basis of protecting data privacy, and improve the accuracy of predicting illegal text and reduce The training time of the model.

Description of drawings

1 is a schematic flowchart of a training method for a federated learning-based text model provided by an embodiment of the present application;

2 is a schematic flowchart of sub-steps of the training method of the federated learning-based text model in FIG. 1;

3 is a schematic diagram of encrypting multiple first model parameter information and multiple second model parameter information provided by an embodiment of the present application and uploading it to a preset aggregate federation model;

FIG. 4 is a schematic flowchart of sub-steps of the training method of the federated learning-based text model in FIG. 1;

5 is a schematic flowchart of a method for identifying a text model based on federated learning provided by an embodiment of the present application;

6 is a schematic block diagram of a training apparatus for a federated learning-based text model provided by an embodiment of the present application;

7 is a schematic block diagram of a recognition apparatus based on a federated learning text model provided by an embodiment of the present application;

FIG. 8 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.

Embodiments of the present invention

Embodiments of the present application provide a federated learning-based text model training method, a text model recognition method, an apparatus, a computer device, and a computer-readable storage medium. The training method of the text model based on federated learning and the recognition method of the text model based on federated learning can be applied to computer equipment, and the computer equipment can be electronic equipment such as notebook computers, desktop computers, and servers.

Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and features in the embodiments may be combined with each other without conflict.

Please refer to FIG. 1 , which is a schematic flowchart of a training method for a federated learning-based text model according to an embodiment of the present application.

As shown in FIG. 1 , the training method of the text model based on federated learning includes steps S101 to S103.

Step S101: Acquire the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model.

Exemplarily, the data of the to-be-trained set is obtained, where the data of the to-be-trained set includes a plurality of texts to be trained, wherein the to-be-trained texts include illegal content, for example, the content that includes illegal words such as obscene, violent, insulting, and special words. Text, the data of the to-be-trained set is stored as a preset storage path or a preset blockchain. When the data of the to-be-trained set is acquired, the preset language model is trained by the data of the to-be-trained set, and the model parameter information of the preset language model is obtained, wherein the preset language model includes a preset neural network model, wherein the There are multiple preset language models, the specific number is not limited, and the preset language models are located at the user end.

In an embodiment, specifically, referring to FIG. 2 , step S101 includes: sub-step S1011 to sub-step S1022.

Sub-step S1011: Train the preset pre-training language model based on the text to be trained, obtain the first semantic vector information corresponding to the text to be trained output by the preset pre-training language model, and obtain the pre-trained language model after training. Set the first model parameter information of the pre-trained language model.

Exemplarily, the preset language model includes a preset pre-training language model and a preset dual propagation model, and model parameters of the preset language model include a first model parameter and a second model parameter. Obtain the to-be-trained text in the to-be-trained set data, train the preset pre-trained language model shown in Figure 3 through the to-be-trained text, and obtain the preset pre-trained language model to output the first semantic vector information that the to-be-trained text cannot , where the text to be trained includes a violating word and a label value marked for the violating word, and the label value may be 1-10, or a numerical value between 1-100. For example, the label value of the offending word is 1, 5, 10, etc.; or the label value is 1, 20, 50, 100, etc. The preset pre-trained language model is a preset BERT model, and the full name of the BERT model is Bidirectional Encoder Representations From Transformer, the role of the BERT model obtains the rich semantic information of the text, that is, the semantic representation of the text. Wherein, the preset pre-trained language model is located at the user end, and one user end can be set with at least one pre-trained language model. The word in the text to be trained is extracted through the hidden layer of the pre-trained language model, the semantic vector information of the word is obtained through the weight matrix of the hidden layer, and the semantic vector information of the illegal word is used as the first semantic vector information, and output through the output layer.

After acquiring the first semantic vector information of the text to be trained output by the preset pre-training model, acquire the first model parameter information of the current preset pre-training model. The features of the text to be trained are extracted by the network layer in the preset pre-training model, and the gradient value of the text to be trained is obtained. For example, the vector feature information of the word is obtained by the full weight matrix of the hidden layer in the pre-trained language model, and the vector feature information of the word is obtained by the full weight matrix of the hidden layer in the pre-trained language model, The corresponding gradient value is obtained according to the vector feature information. The model parameters of the preset pre-training language model are updated by the gradient value of the text to be trained, so as to obtain the updated first model parameter information of the preset pre-training language model. Wherein, when there are multiple preset pre-trained language models, first model parameter information and first semantic vector information of each preset pre-trained language model are obtained respectively.

Sub-step S1021: Train the preset dual propagation model based on the first semantic vector information, and acquire second model parameter information of the preset dual propagation model after training.

Exemplarily, the preset dual propagation model is a BiLSTM model (Bi-directional Long Short-Term Memory), which is composed of forward LSTM and backward LSTM. After obtaining the preset pre-trained language model and outputting the first semantic vector information corresponding to the text to be trained, the preset dual propagation model is trained by the first semantic vector information, and the second model parameter information of the pre-trained dual propagation model is obtained, for example , and perform feature extraction on the first semantic vector information by using the network layer in the preset double propagation model to obtain the gradient value of the label value corresponding to the first semantic vector information. For example, the vector feature information of the label value is obtained by the full weight matrix of the hidden layer in the preset dual propagation model, and the vector feature information of the label value is obtained by the full weight matrix of the hidden layer in the preset dual propagation model. According to the vector feature information to get the corresponding gradient value. The model parameters of the preset dual propagation model are updated by the gradient value of the text to be trained, so as to obtain the updated second model parameter information of the preset dual propagation model. Wherein, when there are multiple preset dual propagation models, the second model parameter information of each preset dual propagation model is obtained respectively.

Step S102: Encrypt and upload the model parameter information to a preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information.

Exemplarily, the preset aggregate federation model is located in the server, sends an upload request to the server, receives the encryption public key sent by the server, encrypts the model parameters of each preset language model with the encryption public key, and encrypts the encrypted Model parameters are sent to the server. When receiving the encrypted model parameters, the server decrypts each encrypted model parameter, and obtains the decrypted model parameters of each preset language model. Each model parameter is learned through the preset aggregation federation model in the server, the corresponding aggregation model parameter is obtained, and the obtained aggregation model parameter is returned to each preset language model. Among them, the aggregate federation model includes the aggregated horizontal federation model, the aggregated vertical federation model, and the aggregated federation migration model.

It should be noted that federated learning refers to the method of machine learning modeling by uniting different clients or participants. In federated learning, clients do not need to expose their own data to other clients and coordinators (also known as servers), so federated learning can well protect user privacy and data security, and can solve the problem of data silos . Federated learning has the following advantages: data isolation, data will not be leaked to the outside, to meet the needs of user privacy protection and data security; it can ensure that the quality of the federated learning model is lossless, and there will be no negative transfer, ensuring that the federated learning model is better than a split independent model. The effect is good; it can ensure that each client can perform encrypted exchange of information and model parameters while maintaining independence, and grow at the same time.

In one embodiment, the model parameter information includes first model parameter information and second model parameter information; the model parameter information is encrypted and uploaded to a preset aggregate federation model to obtain the preset aggregate federation The aggregated model parameter information returned after the model performs federated learning on the model parameter information, includes: encrypting the first model parameter information and uploading it to a preset aggregated federated model, and obtaining the information about the first model from the preset aggregated federated model. The first aggregated model parameter information returned after the horizontal federated learning of model parameter information is performed; the second model parameter information is encrypted and uploaded to the preset aggregated federated model, and the second model is obtained from the preset aggregated federated model. The parameter information is the second aggregated model parameter information returned after the horizontal federated learning is performed.

Exemplarily, the public key sent by the server is received, wherein the number of the public key is multiple. For example, when the number of the public keys is two, that is, the first public key and the second public key. The first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model are encrypted respectively by using the received public key. For example, when the first public key and the second public key are received, the first public key and the second public key respectively correspond to the first model parameter information of each preset pre-trained language model and the The second model parameter information is encrypted.

After encrypting the first model parameter information of each preset pre-training language model and the second model parameter information of each preset dual propagation model with the public key, as shown in FIG. 3 , each preset pre-training language model and preset The double propagation model adopts a construction method of inadvertent transmission, establishes a secret communication channel, and passes the encrypted first model parameter information of each preset pre-trained language model and the second model parameter of each preset double propagation model through the The secret communication channel is sent to the server. The first public key and the second public key respectively encrypt the first model parameter information of each preset pre-trained language model, and the first public key and the second public key respectively encrypt the second model of each preset dual propagation model When the parameter information is encrypted, the first model parameter information of each preset pre-training language model encrypted by the first public key and the second model parameters of each preset pre-training language model encrypted by the second public key are encrypted through the secret communication channel, and the second model parameters of each preset dual propagation model encrypted by the first public key and the second model parameter information of each preset dual propagation model encrypted by the second public key are sent to the server.

The server decrypts the received encrypted first model parameter information of each preset pre-trained language model and second model parameter information of each preset dual propagation model. For example, receive the first model parameter information of each preset pre-training language model encrypted by the first public key, the first model parameter information of each preset pre-training language model encrypted by the second public key, and the first public key encryption. When the first model parameter information of each preset dual propagation model and the first model parameter information of each preset dual propagation model encrypted by the second public key are randomly encrypted by the private key for each preset pretraining of the first public key The first model parameter information of the language model and the first model parameter information of each preset pre-trained language model encrypted by the second public key, and the first model parameter information of each preset dual propagation model encrypted by the first public key and The first model parameter information of each preset double propagation model encrypted by the second public key is decrypted. The private key corresponds to the first public key or the second public key, that is, the private key decrypts the first public key or decrypts the second public key. After decrypting the first model parameter information of the preset pre-training language model encrypted by the first public key and the first model parameter information of each preset pre-training language model encrypted by the second public key with the private key, each preset pre-training is obtained. The first model parameter information of the language model, the first model parameter information of each preset double propagation model and the first model parameter information of each preset double propagation model encrypted by the second public key are decrypted by the private key, and each preset double propagation model is obtained. First model parameter information of the propagation model.

The parameters corresponding to the intersection features of the first model parameter information of each preset pre-trained language model are learned through the horizontal federated learning mechanism in the server, and the parameters corresponding to the intersection features are calculated on average to obtain the corresponding first aggregation model parameters, and the The first aggregated model parameter is returned to each preset pretrained language model. The parameters corresponding to the intersection features of the second model parameter information of each preset dual-propagation model are learned through the horizontal federated learning mechanism in the server, and the parameters corresponding to the intersection features are calculated on average to obtain the corresponding second aggregation model parameters, and the first The two-aggregation model parameters are returned to each of the preset dual-propagation models.

Step S103: Update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.

As an example, after receiving the aggregated model parameter information returned by the aggregated federation model, by updating the aggregated model parameter information with the model parameter information of each preset language model, each preset language model after updating the aggregated model parameter information is generated corresponding to each preset language model. text model.

In an embodiment, specifically, referring to FIG. 4 , step S103 includes: sub-step S1031 to sub-step S1032.

Sub-step S1031: Update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information, and generate a corresponding text encoding model.

In an exemplary example, by updating the first model parameter information of the preset pre-trained language model with the first aggregate model parameter information returned by the aggregate federation model, a corresponding text encoding model is generated from the updated preset pre-trained language model. Wherein, when there are multiple preset pre-trained language models and they are on different user terminals, the second model parameter information of each preset pre-trained language model is updated by updating the first aggregate model parameter information returned by the aggregate federation model, and the The updated preset pre-trained language models generate corresponding text encoding models respectively.

Sub-step S1032: Update the second model parameters of the preset dual propagation model based on the second aggregation model parameters, and generate a corresponding text recognition model.

As an example, by updating the second model parameter information of the preset dual propagation model with the second aggregation model parameter information returned by the aggregation federation model, a corresponding text recognition model is generated from the updated preset dual propagation model, wherein the preset When there are multiple dual propagation models and they are on different clients, the second model parameter information of each preset dual propagation model is updated by updating the second model parameter information of each preset dual propagation model returned by the second aggregation model parameter information returned by the aggregation federation model. The dual propagation models generate corresponding text recognition models respectively.

In an embodiment, before generating the corresponding text encoding model and/or generating the corresponding text recognition model, the method includes: determining whether the preset pre-trained language model and/or the preset dual propagation model is in a convergent state. If it is determined that the preset pre-trained language model and/or the preset dual propagation model are in a convergent state, then the preset pre-trained language model is used as a text encoding model and/or the preset dual propagation model is used as text Recognition model; if the preset pre-training language model and/or the preset dual propagation model are not in a convergent state, then train the preset pre-training language model and/or the preset pre-training language model according to the preset sample data to be trained A dual propagation model is set to obtain third model parameter information of the preset pre-trained language model and/or fourth model parameter information of the preset dual propagation model after training.

Exemplarily, it is determined whether the preset pretrained language model and/or the preset dual propagation model is in a convergent state. For example, the first aggregation model parameter information is compared with the previously recorded first aggregation model parameter information, if the first aggregation model parameter information is the same as the previously recorded first aggregation model parameter information, or, the first aggregation If the difference between the model parameter information and the previously recorded first aggregated model parameter information is less than the preset difference, it is determined that the preset pre-trained language model is in a convergent state; and/or, the second aggregated model parameter information and the previously recorded The second aggregation model parameter information is compared, if the second aggregation model parameter information is the same as the previously recorded second aggregation model parameter information, or, the second aggregation model parameter information and the previously recorded second aggregation model parameter information If the difference is smaller than the preset difference, it is determined that the preset dual propagation model is in a convergent state.

For example, the first aggregation model parameter information is compared with the previously recorded first aggregation model parameter information, if the first aggregation model parameter information is different from the previously recorded first aggregation model parameter information, or the first aggregation model parameter information If the difference between the aggregated model parameter information and the previously recorded first aggregated model parameter information is greater than or equal to a preset difference, it is determined that the preset pretrained language model is not in a convergent state; and/or, the second aggregated model parameter The information is compared with the previously recorded second aggregation model parameter information, if the second aggregation model parameter information is not the same as the previously recorded second aggregation model parameter information, or, the second aggregation model parameter information and the previously recorded No. If the difference between the parameter information of the two aggregation models is greater than or equal to the preset difference, it is determined that the preset dual propagation model is not in a convergent state.

If it is determined that the preset pre-trained language model is in a convergent state, the preset pre-trained language model is used as the text encoding model; and/or, if it is determined that the preset dual-propagation model is in a convergent state, the preset dual-propagation model is used as Text recognition model.

If it is determined that the preset pre-trained language model is not in a convergent state, continue to train the preset pre-trained language model according to the preset sample data to be trained, and obtain the third model parameter information and the second semantics of the pre-trained pre-trained language model after training vector information; and/or, if it is determined that the preset dual propagation model is not in a convergent state, continue to train the preset dual propagation model according to the second semantic vector information, obtain the trained fourth model parameter information, and use the third model The parameter information and/or the fourth model parameter information is uploaded to the aggregate federated model for federated learning.

In the embodiment of the present application, the preset language model is trained through the data of the to-be-trained set to obtain the model parameter information of the preset language model, and the model parameter information is federated by the aggregated federated model to obtain the aggregated model parameter information, and the The aggregated model parameter information updates the model parameter information of the preset language model, generates a corresponding text model, realizes joint training of multiple models on the basis of protecting data privacy, improves the accuracy of predicting illegal text and reduces the training time of the model.

Please refer to FIG. 5 , which is a schematic flowchart of a method for identifying a text model based on federated learning provided by an embodiment of the present application.

As shown in FIG. 5 , the recognition method of the text model based on federated learning includes steps S201 to S204.

Step S201, acquiring the text to be predicted.

Exemplarily, the to-be-predicted text is acquired, where the to-be-predicted text contains violating words or non-violating words, and is a sentence or a short sentence sent by a user detected through a network.

Step S202 , based on the text encoding model and the to-be-predicted text, obtain second text semantic vector information of the to-be-predicted text output by the text encoding model.

Exemplarily, semantic prediction is performed on the to-be-predicted text through a text encoding model to obtain second text semantic vector information of the to-be-predicted text. For example, the semantic vector of each word in the text to be predicted is extracted through the hidden layer of the text encoding model, and the obtained semantic vectors are combined to obtain the second text semantic vector information of the text to be predicted.

Step S203 , based on the text recognition model and the second text semantic vector information, obtain label information of the text recognition model outputting the second text semantic vector information.

Exemplarily, the second text semantic vector information is predicted by a text recognition model to obtain label information of the second text semantic vector information. For example, the semantic vector of each word in the second text semantic vector information is extracted through the hidden layer of the text recognition model, and the semantic vector of each word is mapped to obtain the label information of the second text semantic vector information.

Step S204 , according to the label information, determine whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the above-mentioned training method of the text model based on federated learning.

Exemplarily, when the label information is obtained, it is determined whether the text to be predicted violates the rules based on the label information. For example, when the tag information is a tag value, the tag value is compared with the preset tag value, and if the tag value is greater than or equal to the preset tag value, it is determined that the text to be predicted is illegal content; if the tag value is determined If the value is less than the preset label value, it is determined that the text to be predicted is not illegal content, wherein the text encoding model and the text recognition model are obtained by the above-mentioned training method of the text model based on federated learning.

In this embodiment of the present application, the second text semantic vector information of the text to be predicted is obtained through a text encoding model, the label information of the second text semantic vector information is obtained through a text recognition model, and whether the to-be-predicted text is illegal is determined through the label information content, wherein the text encoding model and the text recognition model are obtained through federated learning, thereby improving the accuracy of the text encoding model and the text recognition model.

Please refer to FIG. 6 , which is a schematic block diagram of a training apparatus for a federated learning-based text model provided by an embodiment of the present application.

As shown in FIG. 6 , the training apparatus 400 for a text model based on federated learning includes: a first acquisition module 401 , a second acquisition module 402 , and a generation module 403 .

The first obtaining module 401 is configured to obtain the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model;

The second acquiring module 402 is configured to encrypt and upload the model parameter information to a preset aggregate federation model to acquire aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;

The generating module 403 is configured to update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.

Wherein, the first obtaining module 401 is also specifically used for:

Train the preset pre-training language model based on the text to be trained, obtain the first semantic vector information corresponding to the text to be trained output by the preset pre-training language model, and obtain the preset pre-training language after training first model parameter information of the model;

The preset dual propagation model is trained based on the first semantic vector information, and second model parameter information of the preset dual propagation model after training is acquired.

Wherein, the second obtaining module 402 is also specifically used for:

Encrypting and uploading the first model parameter information to a preset aggregate federation model, and obtaining first aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information;

The second model parameter information is encrypted and uploaded to the preset aggregate federation model, and the second aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information is obtained.

Wherein, the generating module 403 is also specifically used for:

Update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information to generate a corresponding text encoding model;

The second model parameters of the preset dual propagation model are updated based on the second aggregation model parameters to generate a corresponding text recognition model.

Wherein, the generating module 403 is also specifically used for:

determining whether the preset pre-trained language model and/or the preset dual propagation model is in a convergent state;

If it is determined that the preset pretrained language model and/or the preset dual propagation model is in a convergent state, the preset pretrained language model is used as a text encoding model and/or the preset dual propagation model is used as a text recognition model Model;

If the preset pre-trained language model and/or the preset dual propagation model is not in a convergent state, the preset pre-trained language model and/or the preset dual propagation model are trained according to the preset sample data to be trained model, and obtain the third model parameter information of the preset pre-trained language model and/or the fourth model parameter information of the preset dual propagation model after training.

It should be noted that those skilled in the art can clearly understand that, for the convenience and brevity of the description, the specific working process of the device and each module and unit described above can be implemented with reference to the aforementioned training method of the text model based on federated learning. The corresponding process in the example will not be repeated here.

Please refer to FIG. 7 , which is a schematic block diagram of a recognition apparatus based on a federated learning text model provided by an embodiment of the present application.

As shown in FIG. 7 , the recognition apparatus 500 based on the federated learning text model includes: a first acquisition module 501 , a second acquisition module 502 , a third acquisition module 503 , and a determination module 504 .

The first obtaining module 501 is used to obtain the text to be predicted;

A second acquiring module 502, configured to acquire, based on the text encoding model and the text to be predicted, the second text semantic vector information of the text encoding model outputting the text to be predicted;

A third obtaining module 503, configured to obtain, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;

The determination module 504 is configured to determine whether the text to be predicted violates the rules according to the label information, wherein the text encoding model and the text recognition model are obtained by the above-mentioned training method of the text model based on federated learning.

It should be noted that those skilled in the art can clearly understand that, for the convenience and brevity of the description, the specific working process of the above-described device and each module and unit can be implemented with reference to the aforementioned recognition method of the text model based on federated learning. The corresponding process in the example will not be repeated here.

The apparatuses provided by the above embodiments may be implemented in the form of computer-readable instructions, and the computer-readable instructions may be executed on a computer device as shown in FIG. 8 .

Please refer to FIG. 8 , FIG. 8 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application. The computer device may be a terminal.

As shown in FIG. 8 , the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a computer-readable storage medium and an internal memory.

The computer-readable storage medium can be non-volatile or volatile, and the computer-readable storage medium can store an operating system and computer-readable instructions. When executed, the computer-readable instructions can cause the processor to execute any one of the federated learning-based text model training methods and federated learning-based text model recognition methods.

The processor is used to provide computing and control capabilities to support the operation of the entire computer equipment.

The internal memory provides an environment for the execution of computer-readable instructions in the computer-readable storage medium. When the computer-readable instructions are executed by the processor, the processor can execute any federated learning-based text model training method and federation-based training method. Recognition methods for learned text models.

Wherein, in one embodiment, the processor is configured to execute computer-readable instructions stored in the memory to implement the steps of the training method and the identification method of the present application.

Embodiments of the present application further provide a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium, and the method implemented when the computer-readable instructions are executed may refer to the federated learning-based method of the present application. Various embodiments of a method for training a text model and a method for recognizing a text model based on federated learning.

The blockchain referred to in this application is a new application mode of computer technologies such as preset pre-training language model, preset dual propagation model, text encoding model and text recognition model storage, point-to-point transmission, consensus mechanism, encryption algorithm and so on. Blockchain, essentially a decentralized database, is a series of data blocks associated with cryptographic methods. Each data block contains a batch of network transaction information to verify its Validity of information (anti-counterfeiting) and generation of the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Claims

A training method for a federated learning-based text model, including:

Acquire the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model;

Encrypting and uploading the model parameter information to a preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;

The preset language model is updated based on the aggregated model parameter information to obtain a corresponding text model.
The method for training a text model based on federated learning according to claim 1, wherein the data of the to-be-trained set comprises text to be trained, and the preset language model includes a preset pre-trained language model and a preset dual propagation model, The model parameter information includes first model parameter information and second model parameter information;

The training of the preset language model based on the data of the to-be-trained set to obtain model parameter information of the preset language model includes:

Train the preset pre-training language model based on the text to be trained, obtain the first semantic vector information corresponding to the text to be trained output by the preset pre-training language model, and obtain the preset pre-training language after training first model parameter information of the model;

The preset dual propagation model is trained based on the first semantic vector information, and second model parameter information of the preset dual propagation model after training is acquired.
The method for training a text model based on federated learning according to claim 1, wherein the model parameter information includes first model parameter information and second model parameter information;

Encrypting and uploading the model parameter information to the preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information, including:

Encrypting and uploading the first model parameter information to a preset aggregate federation model, and obtaining first aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information;

The second model parameter information is encrypted and uploaded to the preset aggregate federation model, and the second aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information is obtained.
The method for training a text model based on federated learning according to claim 3, wherein the preset language model includes a preset pre-training language model and a preset dual propagation model, and the text model includes a text encoding model and a text recognition model Model;

The updating of the preset language model to be trained based on the aggregated model parameter information to obtain a corresponding text model, including:

Update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information to generate a corresponding text encoding model;

The second model parameters of the preset dual propagation model are updated based on the second aggregation model parameters to generate a corresponding text recognition model.
The method for training a text model based on federated learning as claimed in claim 4, wherein, before generating the corresponding text encoding model and/or generating the corresponding text recognition model, comprising:

determining whether the preset pretrained language model and/or the preset dual propagation model is in a convergent state;

If it is determined that the preset pretrained language model and/or the preset dual propagation model is in a convergent state, the preset pretrained language model is used as a text encoding model and/or the preset dual propagation model is used as a text recognition model Model;

If the preset pre-trained language model and/or the preset dual propagation model is not in a convergent state, the preset pre-trained language model and/or the preset dual propagation model are trained according to the preset sample data to be trained model, and obtain the third model parameter information of the preset pre-trained language model and/or the fourth model parameter information of the preset dual propagation model after training.
A recognition method of a text model based on federated learning, including:

Get the text to be predicted;

Based on the text encoding model and the to-be-predicted text, acquiring second text semantic vector information of the text-encoding model outputting the to-be-predicted text;

obtaining, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;

According to the label information, it is determined whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the training method of the federated learning-based text model according to claims 1-5.
A training device for a federated learning-based text model, comprising:

a first obtaining module, configured to obtain the data of the set to be trained, train a preset language model based on the data of the set to be trained, and obtain model parameter information of the preset language model;

a second acquisition module, configured to encrypt and upload the model parameter information to a preset aggregate federation model, so as to acquire aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;

A generating module is configured to update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
A recognition device based on a federated learning text model, comprising:

a first obtaining module, used for obtaining the text to be predicted;

a second acquiring module, configured to acquire, based on the text encoding model and the to-be-predicted text, the second text semantic vector information of the text-encoding model outputting the to-be-predicted text;

a third acquiring module, configured to acquire, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;

A determination module, configured to determine whether the to-be-predicted text violates the rules according to the label information, wherein the text encoding model and the text recognition model are those of the federated learning-based text model according to claims 1-5. obtained by the training method.
A computer device, wherein the computer device includes a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions are executed by the processor When executed, implement the following steps:

Acquire the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model;

Encrypting and uploading the model parameter information to a preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;

The preset language model is updated based on the aggregated model parameter information to obtain a corresponding text model.
The computer device according to claim 9, wherein the data of the to-be-trained set includes text to be trained, the preset language model includes a preset pre-trained language model and a preset dual propagation model, and the model parameter information includes the first a model parameter information and a second model parameter information;

The training of the preset language model based on the data of the to-be-trained set to obtain model parameter information of the preset language model includes:

Train the preset pre-training language model based on the text to be trained, obtain the first semantic vector information corresponding to the text to be trained output by the preset pre-training language model, and obtain the preset pre-training language after training first model parameter information of the model;

The preset dual propagation model is trained based on the first semantic vector information, and second model parameter information of the preset dual propagation model after training is acquired.
The computer device of claim 9, wherein the model parameter information includes first model parameter information and second model parameter information;

Encrypting and uploading the model parameter information to the preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information, including:

Encrypting and uploading the first model parameter information to a preset aggregate federation model, and obtaining first aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information;

The second model parameter information is encrypted and uploaded to the preset aggregate federation model, and the second aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information is obtained.
The computer device of claim 11, wherein the preset language model includes a preset pre-training language model and a preset dual propagation model, and the text model includes a text encoding model and a text recognition model;

The updating of the preset language model to be trained based on the aggregated model parameter information to obtain a corresponding text model, including:

Update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information to generate a corresponding text encoding model;

The second model parameters of the preset dual propagation model are updated based on the second aggregation model parameters to generate a corresponding text recognition model.
The computer device according to claim 12, wherein, before generating the corresponding text encoding model and/or generating the corresponding text recognition model, comprising:

determining whether the preset pretrained language model and/or the preset dual propagation model is in a convergent state;

If it is determined that the preset pretrained language model and/or the preset dual propagation model is in a convergent state, the preset pretrained language model is used as a text encoding model and/or the preset dual propagation model is used as a text recognition model Model;

If the preset pre-trained language model and/or the preset dual propagation model is not in a convergent state, the preset pre-trained language model and/or the preset dual propagation model are trained according to the preset sample data to be trained model, and obtain the third model parameter information of the preset pre-trained language model and/or the fourth model parameter information of the preset dual propagation model after training.
A computer device, wherein the computer device includes a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions are executed by the processor When executed, implement the following steps:

Get the text to be predicted;

Based on the text encoding model and the to-be-predicted text, acquiring second text semantic vector information of the text-encoding model outputting the to-be-predicted text;

obtaining, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;

According to the label information, it is determined whether the text to be predicted violates the rules.
A computer-readable storage medium, wherein computer-readable instructions are stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:

Acquire the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model;

Encrypting and uploading the model parameter information to a preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;

The preset language model is updated based on the aggregated model parameter information to obtain a corresponding text model.
The computer-readable storage medium of claim 15, wherein the to-be-trained set data includes to-be-trained text, the preset language model includes a preset pre-trained language model and a preset dual propagation model, and the model parameters The information includes first model parameter information and second model parameter information;

The training of the preset language model based on the data of the to-be-trained set to obtain model parameter information of the preset language model includes:

Train the preset pre-training language model based on the text to be trained, obtain the first semantic vector information corresponding to the text to be trained output by the preset pre-training language model, and obtain the preset pre-training language after training first model parameter information of the model;

The preset dual propagation model is trained based on the first semantic vector information, and second model parameter information of the preset dual propagation model after training is acquired.
The computer-readable storage medium of claim 15, wherein the model parameter information includes first model parameter information and second model parameter information;

Encrypting and uploading the model parameter information to the preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information, including:

Encrypting and uploading the first model parameter information to a preset aggregate federation model, and obtaining first aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information;

The second model parameter information is encrypted and uploaded to the preset aggregate federation model, and the second aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information is obtained.
The computer-readable storage medium of claim 17, wherein the preset language model includes a preset pretrained language model and a preset dual propagation model, and the text model includes a text encoding model and a text recognition model;

The updating of the preset language model to be trained based on the aggregated model parameter information to obtain a corresponding text model, including:

Update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information to generate a corresponding text encoding model;

The second model parameters of the preset dual propagation model are updated based on the second aggregation model parameters to generate a corresponding text recognition model.
The computer-readable storage medium according to claim 18, wherein, before generating the corresponding text encoding model and/or generating the corresponding text recognition model, comprising:

determining whether the preset pretrained language model and/or the preset dual propagation model is in a convergent state;

If it is determined that the preset pretrained language model and/or the preset dual propagation model is in a convergent state, the preset pretrained language model is used as a text encoding model and/or the preset dual propagation model is used as a text recognition model Model;

If the preset pre-trained language model and/or the preset dual propagation model is not in a convergent state, the preset pre-trained language model and/or the preset dual propagation model are trained according to the preset sample data to be trained model, and obtain the third model parameter information of the preset pre-trained language model and/or the fourth model parameter information of the preset dual propagation model after training.
A computer-readable storage medium, wherein computer-readable instructions are stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:

Get the text to be predicted;

Based on the text encoding model and the to-be-predicted text, acquiring second text semantic vector information of the text-encoding model outputting the to-be-predicted text;

obtaining, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;

According to the label information, it is determined whether the text to be predicted violates the rules.