WO2022121183A1 - Text model training method, recognition method, apparatus, device and storage medium - Google Patents


Info

Publication number
WO2022121183A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
preset
text
parameter information
trained
Prior art date
Application number
PCT/CN2021/084297
Other languages
French (fr)
Chinese (zh)
Inventor
李志韬
王健宗
程宁
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2022121183A1 publication Critical patent/WO2022121183A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular to a federated learning-based text model training method, a text model recognition method, an apparatus, computer equipment, and a computer-readable storage medium.
  • the inventor realized that the traditional approach to detecting violating content is to hire professionals to screen, label, and filter it. Although AI-based filtering using semantic recognition and classification technologies has since been introduced, different enterprise platforms receive different violating content, and because such data is private, insecure to expose, and cannot be disseminated or shared, joint modeling is difficult to achieve.
  • the purpose of this application is to solve the technical problem that, when an existing data set is uploaded to the cloud as model training data, the data set is prone to leakage, endangering user security, and the resulting trained model predicts violating content inaccurately.
  • the present application provides a method for training a text model based on federated learning, and the method includes the following steps: acquiring training set data, and training a preset language model based on the training set data to obtain model parameter information of the preset language model; encrypting the model parameter information and uploading it to a preset aggregated federated model, so as to obtain aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information; and updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
  • the present application provides a method for identifying a text model based on federated learning, and the method includes the following steps: acquiring text to be predicted; obtaining, based on a text encoding model and the text to be predicted, second text semantic vector information of the text to be predicted output by the text encoding model; obtaining, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model; and determining, according to the label information, whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the above-mentioned federated learning-based text model training method.
  • the present application also provides a federated learning-based text model training apparatus, which includes:
  • a first acquisition module, configured to acquire training set data, train a preset language model based on the training set data, and obtain model parameter information of the preset language model;
  • a second acquisition module, configured to encrypt the model parameter information and upload it to a preset aggregated federated model, so as to acquire aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information;
  • a generation module, configured to update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
  • the present application further provides a federated learning-based text model recognition apparatus, which includes:
  • a first acquisition module, configured to acquire the text to be predicted;
  • a second acquisition module, configured to obtain, based on a text encoding model and the text to be predicted, second text semantic vector information of the text to be predicted output by the text encoding model;
  • a third acquisition module, configured to obtain, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model;
  • a determination module, configured to determine, according to the label information, whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the above-mentioned federated learning-based text model training method.
  • the present application also provides a computer device, the computer device comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein when the computer-readable instructions are executed by the processor, the following steps are implemented:
  • the preset language model is updated based on the aggregated model parameter information to obtain a corresponding text model.
  • the present application also provides a computer device comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein when the computer-readable instructions are executed by the processor, the following steps are implemented:
  • according to the label information, it is determined whether the text to be predicted violates the rules.
  • the present application further provides a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium, wherein when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • the preset language model is updated based on the aggregated model parameter information to obtain a corresponding text model.
  • the present application further provides a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium, wherein when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • according to the label information, it is determined whether the text to be predicted violates the rules.
  • the present application provides a federated learning-based text model training method, a text model recognition method, an apparatus, computer equipment, and a computer-readable storage medium. Training set data is acquired, and a preset language model is trained on it to obtain the model parameter information of the preset language model; the model parameter information is encrypted and uploaded to a preset aggregated federated model to obtain the aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information; and the preset language model is updated with the aggregated model parameter information to obtain the corresponding text model. This realizes joint training of multiple models while protecting data privacy, improves the accuracy of predicting violating text, and reduces model training time.
  • FIG. 1 is a schematic flowchart of a training method for a federated learning-based text model provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of sub-steps of the training method of the federated learning-based text model in FIG. 1;
  • FIG. 3 is a schematic diagram of encrypting multiple first model parameter information and multiple second model parameter information provided by an embodiment of the present application and uploading it to a preset aggregate federation model;
  • FIG. 4 is a schematic flowchart of sub-steps of the training method of the federated learning-based text model in FIG. 1;
  • FIG. 5 is a schematic flowchart of a method for identifying a text model based on federated learning provided by an embodiment of the present application
  • FIG. 6 is a schematic block diagram of a training apparatus for a federated learning-based text model provided by an embodiment of the present application
  • FIG. 7 is a schematic block diagram of a federated learning-based text model recognition apparatus provided by an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
  • Embodiments of the present application provide a federated learning-based text model training method, a text model recognition method, an apparatus, a computer device, and a computer-readable storage medium.
  • the training method of the text model based on federated learning and the recognition method of the text model based on federated learning can be applied to computer equipment, and the computer equipment can be electronic equipment such as notebook computers, desktop computers, and servers.
  • FIG. 1 is a schematic flowchart of a training method for a federated learning-based text model according to an embodiment of the present application.
  • the training method of the text model based on federated learning includes steps S101 to S103.
  • Step S101 Acquire the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model.
  • the training set data is obtained, where it includes multiple texts to be trained and the texts to be trained include violating content, for example, text containing violating words such as obscene, violent, insulting, or other special words.
  • the training set data is stored in a preset storage path or in a preset blockchain.
  • the preset language model is trained with the training set data, and the model parameter information of the preset language model is obtained, wherein the preset language model includes a preset neural network model, there are multiple preset language models (the specific number is not limited), and the preset language models are located at the client.
  • step S101 includes: sub-step S1011 to sub-step S1022.
  • Sub-step S1011 Train the preset pre-trained language model based on the text to be trained, obtain the first semantic vector information corresponding to the text to be trained output by the preset pre-trained language model, and acquire the first model parameter information of the preset pre-trained language model after training.
  • the preset language model includes a preset pre-training language model and a preset dual propagation model
  • model parameters of the preset language model include a first model parameter and a second model parameter.
  • the label value annotated for a violating word may be, for example, 1, 5, or 10 on a 1-10 scale, or 1, 20, 50, or 100 on a 1-100 scale.
  • the preset pre-trained language model is a preset BERT model; BERT stands for Bidirectional Encoder Representations from Transformers, and the role of the BERT model is to capture the rich semantic information of text, that is, a semantic representation of the text.
  • the preset pre-trained language model is located at the client, and each client can host at least one pre-trained language model.
  • words in the text to be trained are extracted through the hidden layer of the pre-trained language model; the semantic vector information of each word is obtained through the hidden layer's weight matrix, the semantic vector information of the violating words is taken as the first semantic vector information, and it is output through the output layer.
  • after the first semantic vector information of the text to be trained output by the preset pre-training model is acquired, the first model parameter information of the current preset pre-training model is acquired.
  • the features of the text to be trained are extracted by the network layer in the preset pre-training model, and the gradient value of the text to be trained is obtained.
  • the vector feature information of each word is obtained through the full weight matrix of the hidden layer in the pre-trained language model, and the corresponding gradient value is obtained from the vector feature information.
  • the model parameters of the preset pre-training language model are updated by the gradient value of the text to be trained, so as to obtain the updated first model parameter information of the preset pre-training language model.
  • first model parameter information and first semantic vector information of each preset pre-trained language model are obtained respectively.
  • Sub-step S1021 Train the preset dual propagation model based on the first semantic vector information, and acquire second model parameter information of the preset dual propagation model after training.
  • the preset dual propagation model is a BiLSTM model (Bi-directional Long Short-Term Memory), which is composed of forward LSTM and backward LSTM.
  • the preset dual propagation model is trained with the first semantic vector information, and the second model parameter information of the trained dual propagation model is obtained. For example, feature extraction is performed on the first semantic vector information by the network layer in the preset dual propagation model to obtain the gradient value of the label value corresponding to the first semantic vector information.
  • the vector feature information of the label value is obtained through the full weight matrix of the hidden layer in the preset dual propagation model.
  • the model parameters of the preset dual propagation model are updated with the obtained gradient value, so as to obtain the updated second model parameter information of the preset dual propagation model.
  • the second model parameter information of each preset dual propagation model is obtained respectively.
  • Step S102 Encrypt and upload the model parameter information to a preset aggregate federation model to obtain aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information.
  • the preset aggregate federation model is located in the server; an upload request is sent to the server, the encryption public key sent by the server is received, the model parameters of each preset language model are encrypted with the encryption public key, and the encrypted model parameters are sent to the server.
  • the server decrypts each encrypted model parameter, and obtains the decrypted model parameters of each preset language model.
  • Each model parameter is learned through the preset aggregation federation model in the server, the corresponding aggregation model parameter is obtained, and the obtained aggregation model parameter is returned to each preset language model.
  • the aggregate federation model includes an aggregated horizontal federated learning model, an aggregated vertical federated learning model, and an aggregated federated transfer learning model.
  • federated learning refers to the method of machine learning modeling by uniting different clients or participants.
  • clients do not need to expose their own data to other clients or to the coordinator (also known as the server), so federated learning can well protect user privacy and data security, and can solve the problem of data silos.
  • federated learning has the following advantages: data is isolated and will not be leaked to the outside, meeting the needs of user privacy protection and data security; the quality of the federated learning model is lossless and there is no negative transfer, ensuring that the federated model performs better than separately trained independent models; and each client can exchange information and model parameters in encrypted form while remaining independent, so all clients improve together.
  • the model parameter information includes first model parameter information and second model parameter information; encrypting the model parameter information and uploading it to the preset aggregated federated model to obtain the returned aggregated model parameter information includes: encrypting the first model parameter information and uploading it to the preset aggregated federated model, and obtaining the first aggregated model parameter information returned after the preset aggregated federated model performs horizontal federated learning on the first model parameter information; and encrypting the second model parameter information and uploading it to the preset aggregated federated model, and obtaining the second aggregated model parameter information returned after horizontal federated learning is performed on the second model parameter information.
  • the public keys sent by the server are received, where there are multiple public keys.
  • for example, there are two public keys, namely a first public key and a second public key.
  • the first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model are encrypted respectively by using the received public key.
  • after the first public key and the second public key are received, they are used respectively to encrypt the first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model.
  • each preset pre-trained language model and preset dual propagation model adopts an oblivious-transfer construction to establish a secret communication channel, and the encrypted first model parameter information of each preset pre-trained language model and the encrypted second model parameter information of each preset dual propagation model are sent to the server through the secret communication channel.
  • the first public key and the second public key each encrypt both the first model parameter information of every preset pre-trained language model and the second model parameter information of every preset dual propagation model, and the resulting ciphertexts are all sent to the server through the secret communication channel.
  • the server decrypts the received encrypted first model parameter information of each preset pre-trained language model and the encrypted second model parameter information of each preset dual propagation model. For example, upon receiving the first model parameter information and the second model parameter information encrypted with the first public key and with the second public key, the server randomly selects a private key to decrypt them, where the private key corresponds to either the first public key or the second public key, that is, it decrypts ciphertexts produced under the first public key or under the second public key. After decryption, the server obtains the first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model.
  • the parameters corresponding to the intersecting features of the first model parameter information of each preset pre-trained language model are learned through the horizontal federated learning mechanism in the server and averaged to obtain the corresponding first aggregated model parameters, which are returned to each preset pre-trained language model.
  • likewise, the parameters corresponding to the intersecting features of the second model parameter information of each preset dual propagation model are learned through the horizontal federated learning mechanism in the server and averaged to obtain the corresponding second aggregated model parameters, which are returned to each preset dual propagation model (a code sketch of this averaging step appears at the end of this section).
  • Step S103 Update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
  • after the aggregated model parameter information is applied, each updated preset language model generates a corresponding text model.
  • step S103 includes: sub-step S1031 to sub-step S1032.
  • Sub-step S1031 Update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information, and generate a corresponding text encoding model.
  • a corresponding text encoding model is generated from the updated preset pre-trained language model.
  • the first model parameter information of each preset pre-trained language model is updated with the first aggregated model parameter information returned by the aggregate federation model, and the updated preset pre-trained language models generate corresponding text encoding models respectively.
  • Sub-step S1032 Update the second model parameters of the preset dual propagation model based on the second aggregation model parameters, and generate a corresponding text recognition model.
  • a corresponding text recognition model is generated from the updated preset dual propagation model. For example, the second model parameter information of each preset dual propagation model is updated with the second aggregated model parameter information returned by the aggregate federation model, and the updated preset dual propagation models generate corresponding text recognition models respectively.
  • before generating the corresponding text encoding model and/or the corresponding text recognition model, the method includes: determining whether the preset pre-trained language model and/or the preset dual propagation model is in a convergent state (see the convergence-check sketch at the end of this section). If it is determined that the preset pre-trained language model and/or the preset dual propagation model is in a convergent state, the preset pre-trained language model is used as the text encoding model and/or the preset dual propagation model is used as the text recognition model; if not, the preset pre-trained language model and/or the preset dual propagation model is trained according to preset sample data to be trained, to obtain third model parameter information of the preset pre-trained language model and/or fourth model parameter information of the preset dual propagation model after training.
  • the first aggregated model parameter information is compared with the previously recorded first aggregated model parameter information; if they are the same, or if the difference between them is less than a preset difference, it is determined that the preset pre-trained language model is in a convergent state. And/or, the second aggregated model parameter information is compared with the previously recorded second aggregated model parameter information; if they are the same, or if the difference between them is less than the preset difference, it is determined that the preset dual propagation model is in a convergent state.
  • conversely, if the first aggregated model parameter information differs from the previously recorded first aggregated model parameter information, or the difference between them is greater than or equal to the preset difference, it is determined that the preset pre-trained language model is not in a convergent state; and/or, if the second aggregated model parameter information differs from the previously recorded second aggregated model parameter information, or the difference between them is greater than or equal to the preset difference, it is determined that the preset dual propagation model is not in a convergent state.
  • if it is determined that the preset pre-trained language model is in a convergent state, it is used as the text encoding model; and/or, if it is determined that the preset dual propagation model is in a convergent state, it is used as the text recognition model.
  • if the preset pre-trained language model is not in a convergent state, it continues to be trained according to the preset sample data to be trained, and the third model parameter information and second semantic vector information of the trained pre-trained language model are obtained; and/or, if the preset dual propagation model is not in a convergent state, it continues to be trained according to the second semantic vector information, the trained fourth model parameter information is obtained, and the third model parameter information and/or the fourth model parameter information is uploaded to the aggregate federated model for federated learning.
  • the preset language model is trained with the training set data to obtain its model parameter information, the model parameter information undergoes federated learning in the aggregated federated model to obtain aggregated model parameter information, and the aggregated model parameter information updates the model parameter information of the preset language model to generate a corresponding text model; this realizes joint training of multiple models while protecting data privacy, improves the accuracy of predicting violating text, and reduces model training time.
  • FIG. 5 is a schematic flowchart of a method for identifying a text model based on federated learning provided by an embodiment of the present application.
  • the recognition method of the text model based on federated learning includes steps S201 to S204.
  • Step S201 acquiring the text to be predicted.
  • the text to be predicted is acquired, where it may contain violating or non-violating words and is, for example, a sentence or short phrase sent by a user and detected over the network.
  • Step S202 based on the text encoding model and the to-be-predicted text, obtain second text semantic vector information of the to-be-predicted text output by the text encoding model.
  • semantic prediction is performed on the to-be-predicted text through a text encoding model to obtain second text semantic vector information of the to-be-predicted text.
  • the semantic vector of each word in the text to be predicted is extracted through the hidden layer of the text encoding model, and the obtained semantic vectors are combined to obtain the second text semantic vector information of the text to be predicted.
  • Step S203 based on the text recognition model and the second text semantic vector information, obtain label information of the text recognition model outputting the second text semantic vector information.
  • the second text semantic vector information is predicted by a text recognition model to obtain label information of the second text semantic vector information.
  • the semantic vector of each word in the second text semantic vector information is extracted through the hidden layer of the text recognition model, and the semantic vector of each word is mapped to obtain the label information of the second text semantic vector information.
  • Step S204 According to the label information, determine whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the above-mentioned federated learning-based text model training method.
  • when the label information is obtained, it is determined based on the label information whether the text to be predicted violates the rules. For example, when the label information is a label value, the label value is compared with a preset label value; if the label value is greater than or equal to the preset label value, the text to be predicted is determined to be violating content; if the label value is less than the preset label value, the text to be predicted is determined not to be violating content. The text encoding model and the text recognition model are obtained by the above-mentioned federated learning-based text model training method.
  • the second text semantic vector information of the text to be predicted is obtained through the text encoding model, the label information of the second text semantic vector information is obtained through the text recognition model, and whether the text to be predicted is violating content is determined from the label information; because the text encoding model and the text recognition model are obtained through federated learning, the accuracy of both models is improved.
  • FIG. 6 is a schematic block diagram of a training apparatus for a federated learning-based text model provided by an embodiment of the present application.
  • the training apparatus 400 for a text model based on federated learning includes: a first acquisition module 401 , a second acquisition module 402 , and a generation module 403 .
  • the first obtaining module 401 is configured to obtain the data of the to-be-trained set, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model;
  • the second acquiring module 402 is configured to encrypt and upload the model parameter information to a preset aggregate federation model to acquire aggregate model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;
  • the generating module 403 is configured to update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
  • the first obtaining module 401 is also specifically used for:
  • the preset dual propagation model is trained based on the first semantic vector information, and second model parameter information of the preset dual propagation model after training is acquired.
  • the second obtaining module 402 is also specifically used for:
  • the second model parameter information is encrypted and uploaded to the preset aggregate federation model, and the second aggregate model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information is obtained.
  • the generating module 403 is also specifically used for:
  • the second model parameters of the preset dual propagation model are updated based on the second aggregation model parameters to generate a corresponding text recognition model.
  • the generating module 403 is also specifically used for:
  • if the preset pre-trained language model and/or the preset dual propagation model is in a convergent state, the preset pre-trained language model is used as the text encoding model and/or the preset dual propagation model is used as the text recognition model;
  • otherwise, the preset pre-trained language model and/or the preset dual propagation model is trained according to the preset sample data to be trained, and the third model parameter information of the preset pre-trained language model and/or the fourth model parameter information of the preset dual propagation model after training is obtained.
  • FIG. 7 is a schematic block diagram of a recognition apparatus based on a federated learning text model provided by an embodiment of the present application.
  • the recognition apparatus 500 based on the federated learning text model includes: a first acquisition module 501 , a second acquisition module 502 , a third acquisition module 503 , and a determination module 504 .
  • the first obtaining module 501 is used to obtain the text to be predicted
  • a second acquiring module 502 configured to acquire, based on the text encoding model and the text to be predicted, the second text semantic vector information of the text encoding model outputting the text to be predicted;
  • a third obtaining module 503, configured to obtain, based on the text recognition model and the second text semantic vector information, the label information of the text recognition model outputting the second text semantic vector information;
  • the determination module 504 is configured to determine whether the text to be predicted violates the rules according to the label information, wherein the text encoding model and the text recognition model are obtained by the above-mentioned training method of the text model based on federated learning.
  • the apparatuses provided by the above embodiments may be implemented in the form of computer-readable instructions, and the computer-readable instructions may be executed on a computer device as shown in FIG. 8 .
  • FIG. 8 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
  • the computer device may be a terminal.
  • the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a computer-readable storage medium and an internal memory.
  • the computer-readable storage medium can be non-volatile or volatile, and the computer-readable storage medium can store an operating system and computer-readable instructions.
  • the computer-readable instructions can cause the processor to execute any one of the federated learning-based text model training methods and federated learning-based text model recognition methods.
  • the processor is used to provide computing and control capabilities to support the operation of the entire computer equipment.
  • the internal memory provides an environment for the execution of computer-readable instructions in the computer-readable storage medium.
  • the processor can execute any federated learning-based text model training method and any federated learning-based text model recognition method.
  • the processor is configured to execute computer-readable instructions stored in the memory to implement the steps of the training method and the identification method of the present application.
  • Embodiments of the present application further provide a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium; for the method implemented when the computer-readable instructions are executed, reference may be made to the embodiments of the federated learning-based text model training method and recognition method of the present application.
  • the blockchain referred to in this application, which can store the preset pre-trained language model, the preset dual propagation model, the text encoding model, and the text recognition model, is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms.
  • blockchain, essentially a decentralized database, is a chain of data blocks linked by cryptographic methods; each data block contains a batch of network transaction information used to verify the validity (anti-counterfeiting) of its information and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
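As referenced in the horizontal federated learning passage above, the server-side averaging of parameters over intersecting features can be illustrated with a minimal Python sketch. Plain NumPy arrays stand in for real model parameters, decryption is assumed to have already happened, and all names are illustrative assumptions rather than code from this application.

```python
import numpy as np

def aggregate_horizontal(client_params):
    """Average the parameters whose names intersect across all clients,
    mirroring the horizontal federated learning step described above."""
    # Intersection of parameter names across clients (the "intersecting features").
    shared = set.intersection(*(set(p) for p in client_params))
    # Element-wise mean over clients for every shared parameter.
    return {name: np.mean([p[name] for p in client_params], axis=0)
            for name in shared}

# Toy example: two clients upload (already decrypted) parameter dictionaries.
client_a = {"encoder.weight": np.array([1.0, 2.0]), "head.bias": np.array([0.5])}
client_b = {"encoder.weight": np.array([3.0, 4.0]), "head.bias": np.array([1.5])}
print(aggregate_horizontal([client_a, client_b]))
# {'encoder.weight': array([2., 3.]), 'head.bias': array([1.])}
```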
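The convergence test described above, which compares the newly returned aggregated parameters with the previously recorded ones against a preset difference, might look like the following sketch; the preset difference value is an assumed placeholder.

```python
import numpy as np

def is_converged(current, previous, preset_difference=1e-4):
    """Return True when every aggregated parameter has moved less than the
    preset difference since the previous round (the convergent state)."""
    if previous is None:  # first round: nothing recorded yet
        return False
    return all(np.max(np.abs(current[name] - previous[name])) < preset_difference
               for name in current)
```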

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)

Abstract

A federated learning-based text model training method, a text model recognition method, an apparatus, a computer device, and computer-readable storage medium. The federated learning-based text model training method comprises: acquiring a data set to be trained, and on the basis of the data set, training a preset language model to obtain model parameter information of the preset language model (S101); encrypting the model parameter information and uploading same to a preset aggregated federated model so as to obtain aggregated model parameter information which is returned after the preset aggregated federated model performs federated learning on the model parameter information (S102); and on the basis of the aggregated model parameter information, updating the preset language model to obtain a corresponding text model (S103). A plurality of models are jointly trained on the basis of protecting data privacy, the accuracy with which infringing text is predicted is improved, and the training time for the models is reduced.

Description

Text Model Training Method, Recognition Method, Apparatus, Device, and Storage Medium

This application claims priority to the Chinese patent application filed with the China Patent Office on December 11, 2020, with application number 202011446681.6 and entitled "Text Model Training Method, Recognition Method, Apparatus, Device and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field

The present application relates to the technical field of artificial intelligence, and in particular to a federated learning-based text model training method, a text model recognition method, an apparatus, a computer device, and a computer-readable storage medium.
Background Art

The inventor realized that the traditional approach to detecting violating content is to hire professionals to screen, label, and filter it. Although AI-based filtering using semantic recognition and classification technologies has since been introduced, different enterprise platforms receive different violating content, and because such data is private, insecure to expose, and cannot be disseminated or shared, joint modeling is difficult to achieve.
Technical Problem

This application aims to solve the technical problem that, when existing data sets are uploaded to the cloud as model training data, the data sets are prone to leakage, endangering user security, and the resulting trained models predict violating content inaccurately.
Technical Solutions

In a first aspect, the present application provides a federated learning-based text model training method, comprising the following steps (a client-side sketch of one round follows these steps):

acquiring training set data, and training a preset language model based on the training set data to obtain model parameter information of the preset language model;

encrypting the model parameter information and uploading it to a preset aggregated federated model, so as to obtain aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information; and

updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
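To make the three steps above concrete, here is a minimal, self-contained client-side sketch of one training round. The function bodies are stubs: real local training and public-key encryption would replace them, and every name here is a hypothetical placeholder rather than an API defined by this application.

```python
import numpy as np

def train_preset_language_model(train_texts):
    # Step 1 (stub): local training would run gradient updates here and
    # return the model parameter information (name -> weight array).
    rng = np.random.default_rng(0)
    return {"encoder.weight": rng.normal(size=4), "head.bias": rng.normal(size=1)}

def upload_to_aggregated_federated_model(params):
    # Step 2 (stub): a real client would encrypt `params` with the server's
    # public key before uploading; the server performs federated learning and
    # returns aggregated model parameter information. Here we simply echo.
    return params

train_texts = ["example text one", "example text two"]
params = train_preset_language_model(train_texts)          # step 1
aggregated = upload_to_aggregated_federated_model(params)  # step 2
text_model_params = aggregated                             # step 3: update the model
```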
In a second aspect, the present application provides a federated learning-based text model recognition method, comprising the following steps (a toy rendering of the flow follows these steps):

acquiring text to be predicted;

obtaining, based on a text encoding model and the text to be predicted, second text semantic vector information of the text to be predicted output by the text encoding model;

obtaining, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model; and

determining, according to the label information, whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the above federated learning-based text model training method.
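The four recognition steps can be rendered as the following toy sketch, assuming PyTorch and randomly initialized weights purely for illustration; in practice a trained text encoding model and text recognition model, obtained by the training method above, would be substituted. The preset label value is an assumed threshold.

```python
import torch
import torch.nn as nn

class ToyTextRecognitionModel(nn.Module):
    """Illustrative stand-in for the BiLSTM-based text recognition model."""
    def __init__(self, dim=16):
        super().__init__()
        self.bilstm = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * dim, 1)  # maps pooled features to one label value

    def forward(self, semantic_vectors):
        out, _ = self.bilstm(semantic_vectors)
        return self.head(out.mean(dim=1))  # one label value per text

# Steps S201-S202 (stub): random tensors stand in for the second text
# semantic vector information produced by the text encoding model.
semantic_vectors = torch.randn(1, 8, 16)  # (batch, tokens, dim)

# Step S203: run the recognition model to obtain the label information.
label_value = ToyTextRecognitionModel()(semantic_vectors).item()

# Step S204: compare with a preset label value to decide whether it violates.
PRESET_LABEL_VALUE = 0.5
is_violating = label_value >= PRESET_LABEL_VALUE
```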
In a third aspect, the present application further provides a federated learning-based text model training apparatus, comprising:

a first acquisition module, configured to acquire training set data, train a preset language model based on the training set data, and obtain model parameter information of the preset language model;

a second acquisition module, configured to encrypt the model parameter information and upload it to a preset aggregated federated model, so as to acquire aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information; and

a generation module, configured to update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
In a fourth aspect, the present application further provides a federated learning-based text model recognition apparatus, comprising:

a first acquisition module, configured to acquire the text to be predicted;

a second acquisition module, configured to obtain, based on a text encoding model and the text to be predicted, second text semantic vector information of the text to be predicted output by the text encoding model;

a third acquisition module, configured to obtain, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model; and

a determination module, configured to determine, according to the label information, whether the text to be predicted violates the rules, wherein the text encoding model and the text recognition model are obtained by the above federated learning-based text model training method.
In a fifth aspect, the present application further provides a computer device, comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein when the computer-readable instructions are executed by the processor, the following steps are implemented:

acquiring training set data, and training a preset language model based on the training set data to obtain model parameter information of the preset language model;

encrypting the model parameter information and uploading it to a preset aggregated federated model, so as to obtain aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information; and

updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
In a sixth aspect, the present application further provides a computer device, comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein when the computer-readable instructions are executed by the processor, the following steps are implemented:

acquiring text to be predicted;

obtaining, based on a text encoding model and the text to be predicted, second text semantic vector information of the text to be predicted output by the text encoding model;

obtaining, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model; and

determining, according to the label information, whether the text to be predicted violates the rules.
In a seventh aspect, the present application further provides a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:

acquiring training set data, and training a preset language model based on the training set data to obtain model parameter information of the preset language model;

encrypting the model parameter information and uploading it to a preset aggregated federated model, so as to obtain aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information; and

updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
In an eighth aspect, the present application further provides a computer-readable storage medium, where computer-readable instructions are stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:

acquiring text to be predicted;

obtaining, based on a text encoding model and the text to be predicted, second text semantic vector information of the text to be predicted output by the text encoding model;

obtaining, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model; and

determining, according to the label information, whether the text to be predicted violates the rules.
Beneficial Effects

The present application provides a federated learning-based text model training method, a text model recognition method, an apparatus, a computer device, and a computer-readable storage medium. Training set data is acquired, and a preset language model is trained on it to obtain the model parameter information of the preset language model; the model parameter information is encrypted and uploaded to a preset aggregated federated model to obtain the aggregated model parameter information returned after the preset aggregated federated model performs federated learning on the model parameter information; and the preset language model is updated with the aggregated model parameter information to obtain the corresponding text model. This enables joint training of multiple models while protecting data privacy, improves the accuracy of predicting violating text, and reduces model training time.
Brief Description of the Drawings

FIG. 1 is a schematic flowchart of a federated learning-based text model training method provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of sub-steps of the training method in FIG. 1;

FIG. 3 is a schematic diagram of encrypting multiple pieces of first model parameter information and multiple pieces of second model parameter information and uploading them to a preset aggregated federated model, provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of sub-steps of the training method in FIG. 1;

FIG. 5 is a schematic flowchart of a federated learning-based text model recognition method provided by an embodiment of the present application;

FIG. 6 is a schematic block diagram of a federated learning-based text model training apparatus provided by an embodiment of the present application;

FIG. 7 is a schematic block diagram of a federated learning-based text model recognition apparatus provided by an embodiment of the present application;

FIG. 8 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
Embodiments of the Present Invention

Embodiments of the present application provide a federated learning-based text model training method, a text model recognition method, an apparatus, a computer device, and a computer-readable storage medium. The training method and the recognition method can be applied to a computer device, which may be an electronic device such as a notebook computer, a desktop computer, or a server.

Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The embodiments described below and the features in those embodiments may be combined with each other where no conflict arises.
Please refer to FIG. 1, which is a schematic flowchart of a federated learning-based text model training method provided by an embodiment of the present application.

As shown in FIG. 1, the federated learning-based text model training method includes steps S101 to S103.

Step S101: Acquire training set data, train a preset language model based on the training set data, and obtain model parameter information of the preset language model.
示范性的,获取待训练集数据,该待训练集数据包括多个待训练文本,其中,该待训练文本包括违规内容,例如,该包括淫秽、暴力、侮辱、特殊字词等违规字词的文本,该待训练集数据存储为预置存储路径或预置区块链中。在获取到该待训练集数据时,通过该待训练集数据训练预置语言模型,得到该预置语言模型的模型参数信息,其中,该预置语言模型包括预置神经网络模型,其中,该预置语言模型为多个,具体数量不做限定,且该预置语言模型位于用户端。Exemplarily, the data of the to-be-trained set is obtained, where the data of the to-be-trained set includes a plurality of texts to be trained, wherein the to-be-trained texts include illegal content, for example, the content that includes illegal words such as obscene, violent, insulting, and special words. Text, the data of the to-be-trained set is stored as a preset storage path or a preset blockchain. When the data of the to-be-trained set is acquired, the preset language model is trained by the data of the to-be-trained set, and the model parameter information of the preset language model is obtained, wherein the preset language model includes a preset neural network model, wherein the There are multiple preset language models, the specific number is not limited, and the preset language models are located at the user end.
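A minimal sketch of what such to-be-trained set data might look like is given below; the field names and the 1-10 label scale (described under sub-step S1011 below) are illustrative assumptions, not a format prescribed by the present application.

```python
# Hypothetical shape of the to-be-trained set data: each entry pairs a text
# (which may contain violating words) with an annotated label value.
train_set = [
    {"text": "a sentence containing a violating word", "label": 9},
    {"text": "an ordinary, non-violating sentence",    "label": 1},
]
```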
In an embodiment, specifically, referring to FIG. 2, step S101 includes sub-step S1011 to sub-step S1021.
Sub-step S1011: train the preset pre-trained language model based on the to-be-trained text, acquire first semantic vector information, output by the preset pre-trained language model, corresponding to the to-be-trained text, and acquire first model parameter information of the trained preset pre-trained language model.
Exemplarily, the preset language model includes a preset pre-trained language model and a preset dual propagation model, and the model parameters of the preset language model include first model parameters and second model parameters. The to-be-trained text in the to-be-trained set data is acquired, the preset pre-trained language model shown in FIG. 3 is trained with it, and the first semantic vector information corresponding to the to-be-trained text, output by the preset pre-trained language model, is acquired. The to-be-trained text includes violating words and label values annotated for those words; a label value may be a number between 1 and 10 or between 1 and 100, for example 1, 5 and 10, or 1, 20, 50 and 100. The preset pre-trained language model is a preset BERT model (Bidirectional Encoder Representations from Transformers), whose role is to obtain a representation of the text rich in semantic information, that is, a semantic representation of the text. The preset pre-trained language model is located on the client side, and one client may be provided with at least one pre-trained language model. The words of the to-be-trained text are extracted through the hidden layers of the preset pre-trained language model, the semantic vector information of each word is obtained through the weight matrices of the hidden layers, the semantic vector information of the violating words is taken as the first semantic vector information, and it is output through the output layer.
After the first semantic vector information of the to-be-trained text output by the preset pre-trained model is acquired, the first model parameter information of the current preset pre-trained model is acquired. Features of the to-be-trained text are extracted through the network layers of the preset pre-trained model to obtain gradient values for the to-be-trained text; for example, the vector feature information of a word is obtained through the full weight matrix of a hidden layer of the preset pre-trained language model, and the corresponding gradient value is obtained from that vector feature information. The model parameters of the preset pre-trained language model are updated with the gradient values of the to-be-trained text, yielding the updated first model parameter information of the preset pre-trained language model. When there are multiple preset pre-trained language models, the first model parameter information and the first semantic vector information of each model are acquired separately.
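The following is a minimal sketch of sub-step S1011, assuming the Hugging Face transformers package as the preset BERT model and the bert-base-chinese checkpoint; these choices, and reading the first model parameter information off as the encoder's current weights, are assumptions for illustration rather than the application's prescribed implementation.

```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
bert = BertModel.from_pretrained("bert-base-chinese")

def first_semantic_vectors(texts):
    """Run the preset BERT model and return its hidden-layer semantic vectors."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return bert(**batch).last_hidden_state  # shape: (batch, seq_len, 768)

# After a local training step, the first model parameter information is read
# off as the encoder's current weights.
first_model_params = {n: p.detach().clone() for n, p in bert.named_parameters()}
```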
Sub-step S1021: train the preset dual propagation model based on the first semantic vector information, and acquire second model parameter information of the trained preset dual propagation model.
Exemplarily, the preset dual propagation model is a BiLSTM (Bi-directional Long Short-Term Memory) model, composed of a forward LSTM and a backward LSTM. After the preset pre-trained language model outputs the first semantic vector information corresponding to the to-be-trained text, the preset dual propagation model is trained with the first semantic vector information to obtain the second model parameter information of the trained model. For example, features of the first semantic vector information are extracted through the network layers of the preset dual propagation model to obtain the gradient values of the label values corresponding to the first semantic vector information: the vector feature information of a label value is obtained through the full weight matrix of a hidden layer of the preset dual propagation model, and the corresponding gradient value is obtained from that vector feature information. The model parameters of the preset dual propagation model are updated with these gradient values, yielding the updated second model parameter information. When there are multiple preset dual propagation models, the second model parameter information of each model is acquired separately.
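A minimal sketch of the preset dual propagation model follows, assuming PyTorch; the hidden size, the pooling of the final time step and the linear scoring head are illustrative assumptions.

```python
import torch.nn as nn

class DualPropagationModel(nn.Module):
    """A BiLSTM (forward + backward LSTM) over the first semantic vectors."""
    def __init__(self, input_size=768, hidden_size=256, num_labels=10):
        super().__init__()
        self.bilstm = nn.LSTM(input_size, hidden_size,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_size, num_labels)  # both directions

    def forward(self, semantic_vectors):        # (batch, seq_len, 768)
        out, _ = self.bilstm(semantic_vectors)  # (batch, seq_len, 512)
        return self.head(out[:, -1, :])         # one score per label value
```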
Step S102: encrypt the model parameter information and upload it to a preset aggregate federation model, so as to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information.
Exemplarily, the preset aggregate federation model is located in a server. An upload request is sent to the server, the encryption public key sent by the server is received, the model parameters of each preset language model are encrypted with the public key, and the encrypted model parameters are sent to the server. On receiving the encrypted model parameters, the server decrypts each of them to obtain the decrypted model parameters of each preset language model. The server's preset aggregate federation model learns from the model parameters to obtain the corresponding aggregated model parameters, which are returned to each preset language model. Aggregate federation models include aggregated horizontal federation models, aggregated vertical federation models and aggregated federated transfer models.
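The round trip of step S102 can be sketched as below; the additive mask standing in for the public-key encryption and the plain averaging standing in for the aggregate federation model are placeholders for exposition, not a secure or prescribed scheme.

```python
import secrets

def encrypt(key, value):   # placeholder additive mask, NOT a real cipher
    return value + key

def decrypt(key, value):
    return value - key

def federated_round(client_params):
    """One round: clients encrypt and upload their parameters, the server
    decrypts and aggregates them, and the aggregate goes back to every client."""
    key = secrets.randbelow(1 << 16)  # stands in for the server-issued public key
    uploaded = [{k: encrypt(key, v) for k, v in p.items()} for p in client_params]
    decrypted = [{k: decrypt(key, v) for k, v in p.items()} for p in uploaded]
    return {k: sum(p[k] for p in decrypted) / len(decrypted) for k in decrypted[0]}

print(federated_round([{"w": 0.8, "b": 0.1}, {"w": 0.6, "b": 0.3}]))
# -> roughly {'w': 0.7, 'b': 0.2}, up to floating-point rounding
```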
It should be noted that federated learning refers to machine-learning modelling performed jointly by different clients or participants. In federated learning, a client does not need to expose its own data to other clients or to the coordinator (also called the server); federated learning therefore protects user privacy well, guarantees data security, and solves the data-silo problem. Federated learning has the following advantages: data are isolated and never leak outside, meeting the requirements of user privacy protection and data security; the quality of the federated model is lossless and no negative transfer occurs, so the federated model outperforms separately trained independent models; and each client can exchange information and model parameters in encrypted form while remaining independent, so that all clients improve together.
In an embodiment, the model parameter information includes first model parameter information and second model parameter information, and encrypting the model parameter information and uploading it to the preset aggregate federation model to obtain the aggregated model parameter information returned after federated learning includes: encrypting the first model parameter information and uploading it to the preset aggregate federation model, and obtaining first aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information; and encrypting the second model parameter information and uploading it to the preset aggregate federation model, and obtaining second aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information.
Exemplarily, the public keys sent by the server are received, the number of public keys being multiple, for example two: a first public key and a second public key. The first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model are encrypted with the received public keys; for example, when the first public key and the second public key are received, each of them is used to encrypt the first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model.
After the first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model have been encrypted with the public keys, as shown in FIG. 3, each preset pre-trained language model and each preset dual propagation model establishes a secret communication channel using a construction based on oblivious transfer, and the encrypted first model parameter information of each preset pre-trained language model and the encrypted second model parameter information of each preset dual propagation model are sent to the server through this channel. When the first and second public keys each encrypt the first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model, the parameter information encrypted under the first public key and the parameter information encrypted under the second public key are both sent to the server through the secret communication channel.
The server decrypts the received encrypted first model parameter information of each preset pre-trained language model and the received encrypted second model parameter information of each preset dual propagation model. For example, on receiving the first model parameter information of each preset pre-trained language model encrypted under the first public key and under the second public key, together with the second model parameter information of each preset dual propagation model encrypted under the first public key and under the second public key, the server uses a private key to decrypt them, where the private key corresponds to the first public key or the second public key, that is, the private key decrypts ciphertexts produced under the first public key or under the second public key. Decrypting with the private key yields the first model parameter information of each preset pre-trained language model and the second model parameter information of each preset dual propagation model.
The parameters corresponding to the intersection features of the first model parameter information of all preset pre-trained language models are learned through the server's horizontal federated learning mechanism, and the parameters corresponding to the intersection features are averaged to obtain the corresponding first aggregated model parameters, which are returned to each preset pre-trained language model. Likewise, the parameters corresponding to the intersection features of the second model parameter information of all preset dual propagation models are learned, averaged to obtain the corresponding second aggregated model parameters, and returned to each preset dual propagation model.
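A sketch of the averaging over intersection features follows, assuming the client parameters arrive as PyTorch state dicts; interpreting the intersection features as the parameter names every client shares is one plausible reading, not the application's definition.

```python
import torch

def horizontal_aggregate(state_dicts):
    """Average only the parameters (intersection features) that every
    client's model shares, as in the horizontal federated learning step."""
    shared = set(state_dicts[0]).intersection(*state_dicts[1:])
    return {name: torch.stack([sd[name] for sd in state_dicts]).mean(dim=0)
            for name in shared}
```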
Step S103: update the preset language model based on the aggregated model parameter information to obtain the corresponding text model.
Exemplarily, when the aggregated model parameter information returned by the aggregate federation model is received, the model parameter information of each preset language model is updated with the aggregated model parameter information, and each preset language model updated in this way yields the corresponding text model.
In an embodiment, specifically, referring to FIG. 4, step S103 includes sub-step S1031 to sub-step S1032.
Sub-step S1031: update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information, and generate the corresponding text encoding model.
Exemplarily, the first model parameter information of the preset pre-trained language model is updated with the first aggregated model parameter information returned by the aggregate federation model, and the updated preset pre-trained language model yields the corresponding text encoding model. When there are multiple preset pre-trained language models located on different clients, the first model parameter information of each of them is updated with the first aggregated model parameter information returned by the aggregate federation model, and each updated model yields a corresponding text encoding model.
Sub-step S1032: update the second model parameters of the preset dual propagation model based on the second aggregated model parameters, and generate the corresponding text recognition model.
Exemplarily, the second model parameter information of the preset dual propagation model is updated with the second aggregated model parameter information returned by the aggregate federation model, and the updated preset dual propagation model yields the corresponding text recognition model. When there are multiple preset dual propagation models located on different clients, the second model parameter information of each of them is updated with the second aggregated model parameter information, and each updated model yields a corresponding text recognition model.
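Sub-steps S1031 and S1032 reduce to overwriting the local weights with the returned aggregates; a minimal sketch assuming PyTorch modules (the model and parameter names in the usage comments are the assumed names from the earlier sketches):

```python
def apply_aggregated(model, aggregated):
    """Overwrite the local model's weights with the aggregated model
    parameter information returned by the server."""
    state = model.state_dict()
    state.update(aggregated)        # only the returned parameters are replaced
    model.load_state_dict(state)

# apply_aggregated(bert, first_aggregated)     # -> text encoding model
# apply_aggregated(bilstm, second_aggregated)  # -> text recognition model
```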
In an embodiment, before the corresponding text encoding model and/or the corresponding text recognition model is generated, the method includes: determining whether the preset pre-trained language model and/or the preset dual propagation model is in a convergent state; if it is determined that the preset pre-trained language model and/or the preset dual propagation model is in a convergent state, taking the preset pre-trained language model as the text encoding model and/or the preset dual propagation model as the text recognition model; and if the preset pre-trained language model and/or the preset dual propagation model is not in a convergent state, training it with preset to-be-trained sample data to obtain third model parameter information of the trained preset pre-trained language model and/or fourth model parameter information of the trained preset dual propagation model.
Exemplarily, whether the preset pre-trained language model and/or the preset dual propagation model is in a convergent state is determined as follows. The first aggregated model parameter information is compared with the previously recorded first aggregated model parameter information; if the two are identical, or their difference is smaller than a preset difference, the preset pre-trained language model is determined to be in a convergent state. And/or, the second aggregated model parameter information is compared with the previously recorded second aggregated model parameter information; if the two are identical, or their difference is smaller than the preset difference, the preset dual propagation model is determined to be in a convergent state.
Conversely, if the first aggregated model parameter information differs from the previously recorded first aggregated model parameter information, or their difference is greater than or equal to the preset difference, the preset pre-trained language model is determined not to be in a convergent state; and/or, if the second aggregated model parameter information differs from the previously recorded second aggregated model parameter information, or their difference is greater than or equal to the preset difference, the preset dual propagation model is determined not to be in a convergent state.
If the preset pre-trained language model is determined to be in a convergent state, it is taken as the text encoding model; and/or, if the preset dual propagation model is determined to be in a convergent state, it is taken as the text recognition model.
If the preset pre-trained language model is determined not to be in a convergent state, it continues to be trained with the preset to-be-trained sample data to obtain third model parameter information and second semantic vector information of the trained model; and/or, if the preset dual propagation model is determined not to be in a convergent state, it continues to be trained with the second semantic vector information to obtain fourth model parameter information of the trained model. The third model parameter information and/or the fourth model parameter information is then uploaded to the aggregate federation model for federated learning.
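The convergence test described above can be sketched as follows, assuming tensor-valued aggregated parameters; the default value of the preset difference is an arbitrary assumption for illustration.

```python
def is_converged(current, previous, preset_difference=1e-4):
    """True when every aggregated parameter (a tensor) changed by less than
    the preset difference since the previously recorded round."""
    if previous is None:  # nothing recorded yet, so keep training
        return False
    return all((current[k] - previous[k]).abs().max().item() < preset_difference
               for k in current)
```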
In the embodiments of the present application, the preset language model is trained with the to-be-trained set data to obtain its model parameter information, the aggregate federation model performs federated learning on the model parameter information to obtain aggregated model parameter information, and the model parameter information of the preset language model is updated with the aggregated model parameter information to generate the corresponding text model. Multiple models are thereby trained jointly while data privacy is protected, the accuracy of predicting violating text is improved, and the training time of the models is reduced.
Please refer to FIG. 5, which is a schematic flowchart of a method for recognizing a text model based on federated learning provided by an embodiment of the present application.
As shown in FIG. 5, the method for recognizing a text model based on federated learning includes steps S201 to S204.
Step S201: acquire the to-be-predicted text.
Exemplarily, the to-be-predicted text is acquired; it may or may not contain violating words, and is, for example, a sentence or short phrase sent by a user and detected over the network.
Step S202: based on the text encoding model and the to-be-predicted text, acquire second text semantic vector information of the to-be-predicted text output by the text encoding model.
Exemplarily, semantic prediction is performed on the to-be-predicted text by the text encoding model to obtain its second text semantic vector information; for example, the semantic vector of each word in the to-be-predicted text is extracted through the hidden layers of the text encoding model, and the obtained semantic vectors are combined into the second text semantic vector information of the to-be-predicted text.
Step S203: based on the text recognition model and the second text semantic vector information, acquire label information of the second text semantic vector information output by the text recognition model.
Exemplarily, the second text semantic vector information is predicted by the text recognition model to obtain its label information; for example, the semantic vector of each word in the second text semantic vector information is extracted through the hidden layers of the text recognition model and mapped to obtain the label information of the second text semantic vector information.
Step S204: determine, according to the label information, whether the to-be-predicted text violates the rules, where the text encoding model and the text recognition model are obtained by the above method for training a text model based on federated learning.
Exemplarily, when the label information is acquired, whether the to-be-predicted text violates the rules is determined based on it. For example, when the label information is a label value, the label value is compared with a preset label value: if the label value is greater than or equal to the preset label value, the to-be-predicted text is determined to be violating content; if the label value is smaller than the preset label value, the to-be-predicted text is determined not to be violating content. Here the text encoding model and the text recognition model are obtained by the above method for training a text model based on federated learning.
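An end-to-end sketch of steps S201 to S204 follows, reusing the encoder and recognizer shapes sketched earlier; the preset label value and the mapping from class index to label value are assumptions for illustration.

```python
PRESET_LABEL_VALUE = 5   # assumed threshold on the 1-10 label scale

def is_violating(text, tokenizer, encoder, recognizer):
    """Encode the to-be-predicted text, score it, compare with the threshold."""
    batch = tokenizer([text], truncation=True, return_tensors="pt")
    semantic = encoder(**batch).last_hidden_state  # second text semantic vectors
    label_value = recognizer(semantic).argmax(dim=-1).item() + 1  # class -> 1-10
    return label_value >= PRESET_LABEL_VALUE
```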
In the embodiments of the present application, the second text semantic vector information of the to-be-predicted text is obtained through the text encoding model, the label information of the second text semantic vector information is obtained through the text recognition model, and whether the to-be-predicted text is violating content is determined from the label information. Both the text encoding model and the text recognition model are obtained through federated learning, which improves their accuracy.
Please refer to FIG. 6, which is a schematic block diagram of an apparatus for training a text model based on federated learning provided by an embodiment of the present application.
As shown in FIG. 6, the apparatus 400 for training a text model based on federated learning includes a first acquisition module 401, a second acquisition module 402 and a generation module 403.
The first acquisition module 401 is configured to acquire to-be-trained set data, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model.
The second acquisition module 402 is configured to encrypt the model parameter information and upload it to a preset aggregate federation model, so as to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information.
The generation module 403 is configured to update the preset language model based on the aggregated model parameter information to obtain the corresponding text model.
The first acquisition module 401 is specifically further configured to:
train the preset pre-trained language model based on the to-be-trained text, acquire the first semantic vector information corresponding to the to-be-trained text output by the preset pre-trained language model, and acquire the first model parameter information of the trained preset pre-trained language model; and
train the preset dual propagation model based on the first semantic vector information, and acquire the second model parameter information of the trained preset dual propagation model.
The second acquisition module 402 is specifically further configured to:
encrypt the first model parameter information and upload it to the preset aggregate federation model, and obtain the first aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information; and
encrypt the second model parameter information and upload it to the preset aggregate federation model, and obtain the second aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information.
The generation module 403 is specifically further configured to:
update the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information, and generate the corresponding text encoding model; and
update the second model parameters of the preset dual propagation model based on the second aggregated model parameters, and generate the corresponding text recognition model.
The generation module 403 is specifically further configured to:
determine whether the preset pre-trained language model and/or the preset dual propagation model is in a convergent state;
if it is determined that the preset pre-trained language model and/or the preset dual propagation model is in a convergent state, take the preset pre-trained language model as the text encoding model and/or the preset dual propagation model as the text recognition model; and
if the preset pre-trained language model and/or the preset dual propagation model is not in a convergent state, train the preset pre-trained language model and/or the preset dual propagation model with the preset to-be-trained sample data to obtain the third model parameter information of the trained preset pre-trained language model and/or the fourth model parameter information of the trained preset dual propagation model.
It should be noted that, as those skilled in the art can clearly understand, for convenience and brevity of description, the specific working processes of the apparatus and of each module and unit described above may refer to the corresponding processes in the foregoing embodiments of the method for training a text model based on federated learning, and are not repeated here.
Please refer to FIG. 7, which is a schematic block diagram of an apparatus for recognizing a text model based on federated learning provided by an embodiment of the present application.
As shown in FIG. 7, the apparatus 500 for recognizing a text model based on federated learning includes a first acquisition module 501, a second acquisition module 502, a third acquisition module 503 and a determination module 504.
The first acquisition module 501 is configured to acquire the to-be-predicted text.
The second acquisition module 502 is configured to acquire, based on the text encoding model and the to-be-predicted text, the second text semantic vector information of the to-be-predicted text output by the text encoding model.
The third acquisition module 503 is configured to acquire, based on the text recognition model and the second text semantic vector information, the label information of the second text semantic vector information output by the text recognition model.
The determination module 504 is configured to determine, according to the label information, whether the to-be-predicted text violates the rules, where the text encoding model and the text recognition model are obtained by the above method for training a text model based on federated learning.
It should be noted that, as those skilled in the art can clearly understand, for convenience and brevity of description, the specific working processes of the apparatus and of each module and unit described above may refer to the corresponding processes in the foregoing embodiments of the method for recognizing a text model based on federated learning, and are not repeated here.
The apparatuses provided by the above embodiments may be implemented in the form of computer-readable instructions that can run on a computer device as shown in FIG. 8.
Please refer to FIG. 8, which is a schematic block diagram of the structure of a computer device provided by an embodiment of the present application. The computer device may be a terminal.
As shown in FIG. 8, the computer device includes a processor, a memory and a network interface connected through a system bus, where the memory may include a computer-readable storage medium and an internal memory.
The computer-readable storage medium may be non-volatile or volatile, and may store an operating system and computer-readable instructions. When executed, the computer-readable instructions cause the processor to perform any one of the methods for training a text model based on federated learning and the methods for recognizing a text model based on federated learning.
The processor provides computing and control capabilities and supports the operation of the entire computer device.
The internal memory provides an environment for running the computer-readable instructions stored in the computer-readable storage medium. When executed by the processor, the computer-readable instructions cause the processor to perform any one of the methods for training a text model based on federated learning and the methods for recognizing a text model based on federated learning.
In an embodiment, the processor is configured to run the computer-readable instructions stored in the memory to implement the steps of the training method and the recognition method of the present application.
Embodiments of the present application further provide a computer-readable storage medium storing computer-readable instructions; for the method implemented when the computer-readable instructions are executed, reference may be made to the embodiments of the method for training a text model based on federated learning and the method for recognizing a text model based on federated learning of the present application.
The blockchain referred to in the present application is a novel application mode of computer technologies such as storage of the preset pre-trained language model, the preset dual propagation model, the text encoding model and the text recognition model, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods; each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.

Claims (20)

  1. A method for training a text model based on federated learning, comprising:
    acquiring to-be-trained set data, training a preset language model based on the to-be-trained set data, and obtaining model parameter information of the preset language model;
    encrypting the model parameter information and uploading it to a preset aggregate federation model, so as to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information; and
    updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
  2. The method for training a text model based on federated learning according to claim 1, wherein the to-be-trained set data comprises to-be-trained text, the preset language model comprises a preset pre-trained language model and a preset dual propagation model, and the model parameter information comprises first model parameter information and second model parameter information;
    the training a preset language model based on the to-be-trained set data to obtain model parameter information of the preset language model comprises:
    training the preset pre-trained language model based on the to-be-trained text, acquiring first semantic vector information, output by the preset pre-trained language model, corresponding to the to-be-trained text, and acquiring the first model parameter information of the trained preset pre-trained language model; and
    training the preset dual propagation model based on the first semantic vector information, and acquiring the second model parameter information of the trained preset dual propagation model.
  3. The method for training a text model based on federated learning according to claim 1, wherein the model parameter information comprises first model parameter information and second model parameter information;
    the encrypting the model parameter information and uploading it to a preset aggregate federation model, so as to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information, comprises:
    encrypting the first model parameter information and uploading it to the preset aggregate federation model, and obtaining first aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information; and
    encrypting the second model parameter information and uploading it to the preset aggregate federation model, and obtaining second aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information.
  4. The method for training a text model based on federated learning according to claim 3, wherein the preset language model comprises a preset pre-trained language model and a preset dual propagation model, and the text model comprises a text encoding model and a text recognition model;
    the updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model comprises:
    updating the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information to generate the corresponding text encoding model; and
    updating the second model parameters of the preset dual propagation model based on the second aggregated model parameters to generate the corresponding text recognition model.
  5. The method for training a text model based on federated learning according to claim 4, wherein before the generating the corresponding text encoding model and/or generating the corresponding text recognition model, the method comprises:
    determining whether the preset pre-trained language model and/or the preset dual propagation model is in a convergent state;
    if it is determined that the preset pre-trained language model and/or the preset dual propagation model is in a convergent state, taking the preset pre-trained language model as the text encoding model and/or taking the preset dual propagation model as the text recognition model; and
    if the preset pre-trained language model and/or the preset dual propagation model is not in a convergent state, training the preset pre-trained language model and/or the preset dual propagation model according to preset to-be-trained sample data to obtain third model parameter information of the trained preset pre-trained language model and/or fourth model parameter information of the trained preset dual propagation model.
  6. A method for recognizing a text model based on federated learning, comprising:
    acquiring to-be-predicted text;
    acquiring, based on a text encoding model and the to-be-predicted text, second text semantic vector information of the to-be-predicted text output by the text encoding model;
    acquiring, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model; and
    determining, according to the label information, whether the to-be-predicted text violates the rules, wherein the text encoding model and the text recognition model are obtained by the method for training a text model based on federated learning according to any one of claims 1 to 5.
  7. An apparatus for training a text model based on federated learning, comprising:
    a first acquisition module, configured to acquire to-be-trained set data, train a preset language model based on the to-be-trained set data, and obtain model parameter information of the preset language model;
    a second acquisition module, configured to encrypt the model parameter information and upload it to a preset aggregate federation model, so as to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information; and
    a generation module, configured to update the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
  8. An apparatus for recognizing a text model based on federated learning, comprising:
    a first acquisition module, configured to acquire to-be-predicted text;
    a second acquisition module, configured to acquire, based on a text encoding model and the to-be-predicted text, second text semantic vector information of the to-be-predicted text output by the text encoding model;
    a third acquisition module, configured to acquire, based on a text recognition model and the second text semantic vector information, label information of the second text semantic vector information output by the text recognition model; and
    a determination module, configured to determine, according to the label information, whether the to-be-predicted text violates the rules, wherein the text encoding model and the text recognition model are obtained by the method for training a text model based on federated learning according to any one of claims 1 to 5.
  9. A computer device, comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    acquiring to-be-trained set data, training a preset language model based on the to-be-trained set data, and obtaining model parameter information of the preset language model;
    encrypting the model parameter information and uploading it to a preset aggregate federation model, so as to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information; and
    updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
  10. The computer device according to claim 9, wherein the to-be-trained set data comprises to-be-trained text, the preset language model comprises a preset pre-trained language model and a preset dual propagation model, and the model parameter information comprises first model parameter information and second model parameter information;
    the training a preset language model based on the to-be-trained set data to obtain model parameter information of the preset language model comprises:
    training the preset pre-trained language model based on the to-be-trained text, acquiring first semantic vector information, output by the preset pre-trained language model, corresponding to the to-be-trained text, and acquiring the first model parameter information of the trained preset pre-trained language model; and
    training the preset dual propagation model based on the first semantic vector information, and acquiring the second model parameter information of the trained preset dual propagation model.
  11. The computer device according to claim 9, wherein the model parameter information comprises first model parameter information and second model parameter information;
    the encrypting the model parameter information and uploading it to a preset aggregate federation model, so as to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information, comprises:
    encrypting the first model parameter information and uploading it to the preset aggregate federation model, and obtaining first aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information; and
    encrypting the second model parameter information and uploading it to the preset aggregate federation model, and obtaining second aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information.
  12. The computer device according to claim 11, wherein the preset language model comprises a preset pre-trained language model and a preset dual propagation model, and the text model comprises a text encoding model and a text recognition model;
    the updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model comprises:
    updating the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information to generate the corresponding text encoding model; and
    updating the second model parameters of the preset dual propagation model based on the second aggregated model parameters to generate the corresponding text recognition model.
  13. The computer device of claim 12, wherein, before the generating of the corresponding text encoding model and/or the generating of the corresponding text recognition model, the steps further include:
    determining whether the preset pre-trained language model and/or the preset dual propagation model is in a converged state;
    if it is determined that the preset pre-trained language model and/or the preset dual propagation model is in a converged state, taking the preset pre-trained language model as the text encoding model and/or taking the preset dual propagation model as the text recognition model;
    if the preset pre-trained language model and/or the preset dual propagation model is not in a converged state, training the preset pre-trained language model and/or the preset dual propagation model according to preset to-be-trained sample data, and obtaining third model parameter information of the trained preset pre-trained language model and/or fourth model parameter information of the trained preset dual propagation model.
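Claim 13 gates model generation on a convergence check but leaves the criterion open. The sketch below assumes convergence is declared when the loss improvement falls below a tolerance; `tol` and the loss probe are assumptions, not part of the claim.

```python
def train_until_converged(step_fn, max_rounds=100, tol=1e-4):
    """step_fn() runs one train/aggregate/update round and returns the
    current loss; stop once the loss stops improving (converged state)."""
    prev = float("inf")
    for round_id in range(1, max_rounds + 1):
        loss = step_fn()
        if abs(prev - loss) < tol:
            # Converged: the preset models become the text encoding /
            # text recognition models.
            return round_id, loss
        prev = loss  # not converged: keep training on further sample data,
                     # yielding the third / fourth model parameter information
    return max_rounds, prev
```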
  14. A computer device, comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    acquiring text to be predicted;
    based on a text encoding model and the text to be predicted, obtaining second text semantic vector information of the text to be predicted output by the text encoding model;
    based on a text recognition model and the second text semantic vector information, obtaining label information of the second text semantic vector information output by the text recognition model;
    determining, according to the label information, whether the text to be predicted violates the rules.
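For the prediction steps of claim 14, a minimal sketch reusing the stand-in models from the claim 10 sketch; mapping label index 1 to "violating" is an assumption, since the claim does not fix the label set.

```python
import torch

def predict_violation(text_ids, text_encoder, text_recognizer):
    with torch.no_grad():
        vectors = text_encoder(text_ids)    # second text semantic vector info
        logits = text_recognizer(vectors)
        labels = logits.argmax(dim=-1)      # label information
    return labels == 1                      # True where the text violates

# e.g. predict_violation(torch.randint(0, 1000, (4, 12)), encoder, dpm)
```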
  15. A computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    acquiring to-be-trained set data, training a preset language model based on the to-be-trained set data, and obtaining model parameter information of the preset language model;
    encrypting and uploading the model parameter information to a preset aggregate federation model to obtain aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information;
    updating the preset language model based on the aggregated model parameter information to obtain a corresponding text model.
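Read together, claim 15's three steps form one federated round per client. The sketch below ties together the helpers from the earlier sketches; `FederationServer` is a hypothetical handle to the preset aggregate federation model, shown degenerate (a single client) only to keep the example short.

```python
class FederationServer:
    """Hypothetical handle to the preset aggregate federation model."""
    def aggregate(self, *encrypted_uploads):
        return horizontal_fedavg(list(encrypted_uploads))

def client_round(texts, labels, encoder, dpm, server):
    first, second = local_training_round(texts, labels, encoder, dpm)  # step 1
    agg_first = server.aggregate(encrypt(first))                       # step 2
    agg_second = server.aggregate(encrypt(second))
    apply_aggregated(encoder, agg_first)                               # step 3
    apply_aggregated(dpm, agg_second)
    return encoder, dpm   # the text encoding model and text recognition model
```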
  16. The computer-readable storage medium of claim 15, wherein the to-be-trained set data includes text to be trained, the preset language model includes a preset pre-trained language model and a preset dual propagation model, and the model parameter information includes first model parameter information and second model parameter information;
    the training of the preset language model based on the to-be-trained set data to obtain the model parameter information of the preset language model includes:
    training the preset pre-trained language model based on the text to be trained, obtaining first semantic vector information, corresponding to the text to be trained, output by the preset pre-trained language model, and obtaining first model parameter information of the trained preset pre-trained language model;
    training the preset dual propagation model based on the first semantic vector information, and obtaining second model parameter information of the trained preset dual propagation model.
  17. The computer-readable storage medium of claim 15, wherein the model parameter information includes first model parameter information and second model parameter information;
    the encrypting and uploading of the model parameter information to the preset aggregate federation model to obtain the aggregated model parameter information returned after the preset aggregate federation model performs federated learning on the model parameter information includes:
    encrypting and uploading the first model parameter information to the preset aggregate federation model, and obtaining first aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the first model parameter information;
    encrypting and uploading the second model parameter information to the preset aggregate federation model, and obtaining second aggregated model parameter information returned after the preset aggregate federation model performs horizontal federated learning on the second model parameter information.
  18. The computer-readable storage medium of claim 17, wherein the preset language model includes a preset pre-trained language model and a preset dual propagation model, and the text model includes a text encoding model and a text recognition model;
    the updating of the preset language model based on the aggregated model parameter information to obtain the corresponding text model includes:
    updating the first model parameter information of the preset pre-trained language model based on the first aggregated model parameter information to generate a corresponding text encoding model;
    updating the second model parameter information of the preset dual propagation model based on the second aggregated model parameter information to generate a corresponding text recognition model.
  19. The computer-readable storage medium of claim 18, wherein, before the generating of the corresponding text encoding model and/or the generating of the corresponding text recognition model, the steps further include:
    determining whether the preset pre-trained language model and/or the preset dual propagation model is in a converged state;
    if it is determined that the preset pre-trained language model and/or the preset dual propagation model is in a converged state, taking the preset pre-trained language model as the text encoding model and/or taking the preset dual propagation model as the text recognition model;
    if the preset pre-trained language model and/or the preset dual propagation model is not in a converged state, training the preset pre-trained language model and/or the preset dual propagation model according to preset to-be-trained sample data, and obtaining third model parameter information of the trained preset pre-trained language model and/or fourth model parameter information of the trained preset dual propagation model.
  20. A computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    acquiring text to be predicted;
    based on a text encoding model and the text to be predicted, obtaining second text semantic vector information of the text to be predicted output by the text encoding model;
    based on a text recognition model and the second text semantic vector information, obtaining label information of the second text semantic vector information output by the text recognition model;
    determining, according to the label information, whether the text to be predicted violates the rules.
PCT/CN2021/084297 2020-12-11 2021-03-31 Text model training method, recognition method, apparatus, device and storage medium WO2022121183A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011446681.6A CN112734050A (en) 2020-12-11 2020-12-11 Text model training method, text model recognition device, text model equipment and storage medium
CN202011446681.6 2020-12-11

Publications (1)

Publication Number Publication Date
WO2022121183A1 (en) 2022-06-16

Family

ID=75599292

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084297 WO2022121183A1 (en) 2020-12-11 2021-03-31 Text model training method, recognition method, apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN112734050A (en)
WO (1) WO2022121183A1 (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114118530A (en) * 2021-11-04 2022-03-01 杭州经纬信息技术股份有限公司 Prediction method and device based on multi-household power consumption prediction model
CN115049440A (en) * 2022-07-11 2022-09-13 中国工商银行股份有限公司 Method and device for predicting activity delivery information and electronic equipment


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190340534A1 (en) * 2016-09-26 2019-11-07 Google Llc Communication Efficient Federated Learning
CN109543030A (en) * 2018-10-12 2019-03-29 平安科技(深圳)有限公司 Customer service machine conference file classification method and device, equipment, storage medium
CN110457585A (en) * 2019-08-13 2019-11-15 腾讯科技(深圳)有限公司 Method for pushing, device, system and the computer equipment of negative text
CN111669757A (en) * 2020-06-15 2020-09-15 国家计算机网络与信息安全管理中心 Terminal fraud call identification method based on conversation text word vector
CN111966875A (en) * 2020-08-18 2020-11-20 中国银行股份有限公司 Sensitive information identification method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117540829A (en) * 2023-10-18 2024-02-09 广西壮族自治区通信产业服务有限公司技术服务分公司 Knowledge sharing large language model collaborative optimization method and system
CN117540829B (en) * 2023-10-18 2024-05-17 广西壮族自治区通信产业服务有限公司技术服务分公司 Knowledge sharing large language model collaborative optimization method and system

Also Published As

Publication number Publication date
CN112734050A (en) 2021-04-30

Similar Documents

Publication Publication Date Title
Shen et al. Privacy-preserving image retrieval for medical IoT systems: A blockchain-based approach
CN110399742B (en) Method and device for training and predicting federated migration learning model
WO2022121183A1 (en) Text model training method, recognition method, apparatus, device and storage medium
CN108833093A (en) Determination method, apparatus, equipment and the storage medium of account key
WO2021208701A1 (en) Method, apparatus, electronic device, and storage medium for generating annotation for code change
CN111612167B (en) Combined training method, device, equipment and storage medium of machine learning model
CN113704781B (en) File secure transmission method and device, electronic equipment and computer storage medium
CN111553443B (en) Training method and device for referee document processing model and electronic equipment
WO2022072415A1 (en) Privacy preserving machine learning using secure multi-party computation
CN112446791A (en) Automobile insurance grading method, device, equipment and storage medium based on federal learning
Cao et al. Generative steganography based on long readable text generation
CN112489742B (en) Prescription circulation processing method and device
CN112364376A (en) Attribute agent re-encryption medical data sharing method
CN112073235B (en) Multifunctional mutual-help system of virtual machine
WO2022076826A1 (en) Privacy preserving machine learning via gradient boosting
CN112149174A (en) Model training method, device, equipment and medium
CN113779355A (en) Network rumor source tracing evidence obtaining method and system based on block chain
CN112149141B (en) Model training method, device, equipment and medium
CN117669582A (en) Engineering consultation processing method and device based on deep learning and electronic equipment
CN109711178A (en) A kind of storage method of key-value pair, device, equipment and storage medium
JP6467063B2 (en) Secret authentication code adding apparatus, secret authentication code adding method, and program
CN115205089B (en) Image encryption method, training method and device of network model and electronic equipment
KR102517001B1 (en) System and method for processing digital signature on a blockchain network
CN116743743A (en) Metadata universe data sharing method and system
CN109889342A (en) Interface testing method for authenticating, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21901899

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21901899

Country of ref document: EP

Kind code of ref document: A1