CN115688903A - Training method of text recognition model, text recognition method, medium, and apparatus - Google Patents

Training method of text recognition model, text recognition method, medium, and apparatus

Info

Publication number
CN115688903A
Authority
CN
China
Prior art keywords
text
recognition model
text recognition
actual
sample
Prior art date
Legal status
Pending
Application number
CN202211347800.1A
Other languages
Chinese (zh)
Inventor
杜嘉晨
周蓝珺
潘树燊
Current Assignee
Tencent Music Entertainment Technology Shenzhen Co Ltd
Original Assignee
Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority to CN202211347800.1A
Publication of CN115688903A

Landscapes

  • Machine Translation (AREA)

Abstract

The application relates to a training method of a text recognition model. The method comprises the following steps: acquiring a sample text training set and an updated text recognition model; the sample text training set comprises a sample text sequence and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained; inputting the sample text training set into the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set; and training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information and the difference between the actual emotion type and the predicted emotion type to obtain the trained text recognition model. By adopting the method, the emotion recognition accuracy of the text can be improved.

Description

Training method of text recognition model, text recognition method, medium, and apparatus
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a training method for a text recognition model, a text recognition method, a computer device, a storage medium, and a computer program product.
Background
Text sentiment analysis is an important technology in the field of natural language processing. It is widely applied in scenarios with online comment communities, such as e-commerce communities, news communities, and music platforms, to help users understand the sentiment conveyed by comments.
In conventional technology, the emotion of a text is usually recognized according to the association between the text and known emotion tags. However, this approach requires the emotion tags of the text to be set in advance; when the emotion tag to which a text belongs has not been set in advance, the accuracy of the resulting emotion recognition is low.
Disclosure of Invention
In view of the above, to solve the technical problems described above, it is necessary to provide a training method for a text recognition model, a text recognition method, a computer device, a computer-readable storage medium, and a computer program product that can improve the emotion recognition accuracy of text.
In a first aspect, the present application provides a method for training a text recognition model. The method comprises the following steps:
acquiring a sample text training set and an updated text recognition model; the sample text training set comprises a sample text sequence and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained;
inputting the sample text training set into the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set;
and training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information and the difference between the actual emotion type and the predicted emotion type to obtain a trained text recognition model.
In one embodiment, the updated text recognition model is obtained by:
acquiring a pre-training language model; the type of the model parameter in the pre-training language model is the same as that of the model parameter in the text recognition model to be trained;
and updating the model parameters in the text recognition model to be trained according to the model parameters in the pre-training language model to obtain the updated text recognition model.
In one embodiment, the model parameters in the pre-trained language model comprise text mapping parameters, semantic mapping parameters and character prediction parameters;
the method for updating the model parameters in the text recognition model to be trained according to the model parameters in the pre-training language model to obtain the updated text recognition model comprises the following steps:
and updating the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the text recognition model to be trained according to the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the pre-training language model respectively to obtain the updated text recognition model.
In one embodiment, inputting the sample text training set to the updated text recognition model to obtain the prediction object information, the prediction capability information, and the prediction emotion type of the sample text training set includes:
performing text mapping processing on the sample text training set according to the updated text mapping parameters of the text recognition model to obtain a text vector of the sample text training set;
performing semantic mapping processing on the text vector of the sample text training set according to the semantic mapping parameters of the updated text recognition model to obtain the semantic vector of the sample text training set;
and performing character prediction processing on the semantic vector of the sample text training set according to the updated character prediction parameters of the text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set.
In one embodiment, training the updated text recognition model according to a difference between the actual object information and the predicted object information, a difference between the actual capability information and the predicted capability information, and a difference between the actual emotion type and the predicted emotion type to obtain a trained text recognition model, including:
obtaining a first loss value according to the difference between the actual object information and the predicted object information, obtaining a second loss value according to the difference between the actual capability information and the predicted capability information, and obtaining a third loss value according to the difference between the actual emotion type and the predicted emotion type;
according to the first loss value, the second loss value and the third loss value, respectively carrying out gradient updating on a text mapping parameter, a semantic mapping parameter and a character prediction parameter in the updated text recognition model until a training end condition is reached;
and taking the updated text recognition model which reaches the training end condition as the trained text recognition model.
In one embodiment, obtaining a sample text training set comprises:
acquiring a preset data spacer, the sample text sequence and actual object information, actual capability information and actual emotion type of the sample text sequence;
and according to the preset data spacer, carrying out fusion processing on the sample text sequence, the actual object information, the actual capability information and the actual emotion type to obtain the sample text training set.
In a second aspect, the present application provides a text recognition method. The method comprises the following steps:
acquiring a text to be recognized;
inputting the text to be recognized into a trained text recognition model to obtain target object information, target capability information and a target emotion type of the text to be recognized as an emotion analysis result of the text to be recognized; the trained text recognition model is obtained by training the updated text recognition model through a sample text training set; the sample text training set comprises a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; and the updated text recognition model is obtained by updating the model parameters in the text recognition model to be trained.
In one embodiment, inputting the text to be recognized into a trained text recognition model to obtain target object information, target capability information and target emotion type of the text to be recognized includes:
performing text mapping processing on the text to be recognized according to the text mapping parameters of the trained text recognition model to obtain a text vector of the text to be recognized;
performing semantic mapping processing on the text to be recognized according to the semantic mapping parameters of the trained text recognition model to obtain a semantic vector of the text to be recognized;
and performing character prediction processing on the semantic vector of the text to be recognized according to the character prediction parameters of the trained text recognition model to obtain the target object information, the target capability information and the target emotion type of the text to be recognized.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program:
acquiring a sample text training set and an updated text recognition model; the sample text training set comprises a sample text sequence and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained;
inputting the sample text training set into the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set;
and training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information and the difference between the actual emotion type and the predicted emotion type to obtain a trained text recognition model.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a sample text training set and an updated text recognition model; the sample text training set comprises a sample text sequence and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained;
inputting the sample text training set into the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set;
and training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information and the difference between the actual emotion type and the predicted emotion type to obtain a trained text recognition model.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprising a computer program which when executed by a processor performs the steps of:
acquiring a sample text training set and an updated text recognition model; the sample text training set comprises a sample text sequence and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained;
inputting the sample text training set into the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set;
and training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information, and the difference between the actual emotion type and the predicted emotion type to obtain the trained text recognition model.
The training method of the text recognition model, the computer device, the storage medium and the computer program product acquire a sample text training set and an updated text recognition model; the sample text training set comprises a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating the model parameters in the text recognition model to be trained, so that the updated text recognition model has higher emotion recognition accuracy even before being trained. The sample text training set is input into the updated text recognition model to obtain the prediction object information, prediction capability information and predicted emotion type of the sample text training set. The updated text recognition model is trained according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information, and the difference between the actual emotion type and the predicted emotion type to obtain the trained text recognition model. As a result, the trained text recognition model can predict not only the emotion type of a text but also the prediction object information and prediction capability information of the text, without the object information and capability information of the text having to be marked in advance, which improves the emotion recognition accuracy of the text while also improving the emotion recognition efficiency.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating a method for training a text recognition model in one embodiment;
FIG. 2 is a schematic diagram of a training method and a text recognition method for a text recognition model in one embodiment;
FIG. 3 is a flowchart illustrating the steps of obtaining a sample text training set in one embodiment;
FIG. 4 is a flowchart illustrating a text recognition method according to one embodiment;
FIG. 5 is a diagram illustrating a process of obtaining emotion analysis results for a text to be recognized in one embodiment;
FIG. 6 is a flowchart illustrating a method for training a text recognition model in accordance with another embodiment;
FIG. 7 is a diagram illustrating an exemplary implementation of a text recognition method;
FIG. 8 is a schematic hardware configuration diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In an embodiment, as shown in FIG. 1, a method for training a text recognition model is provided. This embodiment is illustrated by applying the method to a terminal; it is to be understood that the method may also be applied to a server, or to a system including a terminal and a server and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:
step S101, a sample text training set and an updated text recognition model are obtained; the sample text training set comprises a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained.
The sample text training set refers to a text data set used for training a text recognition model.
The sample text sequence refers to a sample text whose object information, capability information and emotion type need to be identified, and it comprises a plurality of characters with an ordering relationship. For example, the sample text sequence may be represented as X = [x_1, ..., x_N], where x_N is a character of the sample text sequence and N is the number of characters of the sample text sequence.
The actual object information refers to an object actually described by the sample text sequence, that is, the object described by the sample text sequence is referred to as actual object information in order to be distinguished from object information (i.e., predicted object information) recognized by a text recognition model described below.
The actual capability information refers to the capability in a certain aspect, or a certain attribute, of the object actually described by the sample text sequence. For example, it may describe the authoring capability, shooting capability, or singing capability of the object. That is, in order to distinguish it from the capability information (i.e., predicted capability information) recognized by the text recognition model described below, the capability information of the object described by the sample text sequence is referred to as actual capability information.
The actual emotion type refers to the emotion type actually represented by the sample text sequence. For example, the emotion type may be Positive (Positive), negative (Negative), or Neutral (Neutral); the emotion type may also be represented using discrete values, such as [ -1,0,1], where-1 represents that the emotion polarity of the text is negative, 0 represents that the emotion polarity of the text is neutral, and 1 represents that the emotion polarity of the text is positive. That is, the emotion type represented by the sample text sequence is referred to as an actual emotion type for distinguishing from the emotion type (i.e., predicted emotion type) identified by the text recognition model described below.
For example, assume that the sample text sequence is "The songs composed by Zhang San are truly top-notch, leading the entire era!", where "Zhang San" belongs to the actual object information, "creation capability" belongs to the actual capability information, and the actual emotion type is positive.
Specifically, the terminal acquires a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; and carrying out data preprocessing on the sample text sequence, and the actual object information, the actual capability information and the actual emotion type of the sample text sequence to obtain a sample text training set. Further, the terminal updates the model parameters in the text recognition model to be trained according to the model parameters in the pre-training language model to obtain an updated text recognition model.
The pre-training language model refers to a trained language model whose model parameters are of the same type as those of the text recognition model to be trained. For example, the pre-trained language model and the text recognition model to be trained may each be a language model based on a Transformer text generator, such as a GPT-2 (Language Models are Unsupervised Multitask Learners) language model, a CPM (Chinese Pre-trained Model) model, or another unidirectional pre-trained language model.
The text recognition model refers to a model used for generating the object information, capability information, and emotion type of text data.
The text recognition model to be trained refers to an untrained text recognition model.
The updated text recognition model refers to the text recognition model with updated model parameters.
And S102, inputting the sample text training set into the updated text recognition model to obtain the prediction object information, the prediction ability information and the prediction emotion type of the sample text training set.
The predicted object information refers to an object described by a sample text sequence generated by the updated text recognition model.
The predictive capability information refers to a certain aspect or a certain attribute of an object described by the sample text sequence generated by the updated text recognition model.
The predicted emotion type refers to the emotion type represented by the sample text sequence generated by the updated text recognition model.
Specifically, the terminal inputs the sample text training set into the updated text recognition model, and the updated text recognition model performs text generation processing on the sample text sequences in the sample text training set using its model parameters (i.e., the model parameters taken from the pre-training language model), so as to obtain the prediction object information, prediction capability information, and predicted emotion type of the sample text sequences.
Step S103, training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual ability information and the predicted ability information, and the difference between the actual emotion type and the predicted emotion type, so as to obtain the trained text recognition model.
The trained text recognition model refers to a text recognition model obtained after the updated text recognition model completes iterative training.
Specifically, loss values of each model parameter are respectively constructed according to the difference between the actual object information and the predicted object information in the sample text training set, the difference between the actual capability information and the predicted capability information in the sample text training set, and the difference between the actual emotion type and the predicted emotion type in the sample text training set; iteratively updating the model parameters in the updated text recognition model based on the loss values of the model parameters; and when the updated text recognition model is converged, obtaining the trained text recognition model.
Fig. 2 is a schematic diagram illustrating a principle of the training method and the text recognition method of the text recognition model, as shown in fig. 2, in the present application, an aspect-level emotion recognition task is converted into a text generation task, and first, a sample text sequence, actual object information, actual capability information, and an actual emotion type of the sample text sequence are fused into a sample text training set through data preprocessing, that is, the sample text training set can be regarded as a complete text data; then, using the model parameters in the pre-training language model to correspondingly update the model parameters in the text recognition model to be trained to obtain an updated text recognition model; and training the updated text recognition model according to the sample text training set to obtain the trained text recognition model.
It should be noted that the text recognition model in the present application is essentially a text generator (or a text generation model); that is, the text content following the sample text sequence (for example, the prediction object information, prediction capability information, and predicted emotion type) is predicted by the text recognition model. In this way, the text recognition model of the present application recognizes the emotion type of the sample text sequence while generating the text information of its predicted emotion type.
In the training method of the text recognition model, a sample text training set and an updated text recognition model are acquired; the sample text training set comprises a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating the model parameters in the text recognition model to be trained, so that the updated text recognition model has higher emotion recognition accuracy even before being trained. The sample text training set is input into the updated text recognition model to obtain the prediction object information, prediction capability information and predicted emotion type of the sample text training set. The updated text recognition model is trained according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information, and the difference between the actual emotion type and the predicted emotion type to obtain the trained text recognition model. As a result, the trained text recognition model can predict not only the emotion type of a text but also the prediction object information and prediction capability information of the text, without the object information and capability information of the text having to be marked in advance, which improves the emotion recognition efficiency of the text while also improving its emotion recognition accuracy.
In one embodiment, the updated text recognition model is obtained by: acquiring a pre-training language model; the type of the model parameter in the pre-training language model is the same as that of the model parameter in the text recognition model to be trained; and updating the model parameters in the text recognition model to be trained according to the model parameters in the pre-training language model to obtain an updated text recognition model.
Specifically, the terminal constructs a text recognition model to be trained; the terminal can directly obtain a pre-training language model with the same type as the model parameters of the text recognition model to be trained from a third-party platform or a database, and can also train the pre-training language model to be trained to obtain the pre-training language model; and then, correspondingly updating the numerical values of the model parameters in the text recognition model to be trained by using the numerical values of the model parameters in the pre-training language model, namely, initializing the text recognition model to be trained by using the model parameters in the pre-training language model to obtain the updated text recognition model. It can be understood that the trained model parameters in the pre-trained language model are directly assigned to the model parameters in the untrained text recognition model to be trained, so that the text recognition model to be trained also has better text generation capability under the untrained condition, and then the model training is performed on the updated text recognition model, so that the performance of the trained text recognition model can be further improved, and the emotion recognition accuracy of the trained text recognition model to the text to be recognized is greatly improved.
In one embodiment, the model parameters in the pre-trained language model include text mapping parameters, semantic mapping parameters, and character prediction parameters.
The text mapping parameter refers to a mapping parameter capable of mapping a text into a text vector. For example, the text mapping parameter may be an Embedding mapping function.
The semantic mapping parameters refer to mapping parameters capable of mapping text vectors into semantic vectors. For example, the semantic mapping parameter may be a mapping function of a multi-layered stacked neural network.
The character prediction parameter refers to a mapping parameter capable of predicting a next character of the text vector. For example, the character prediction parameters may be a mapping function used in a Transformer text generator for character prediction.
In the above embodiment, the model parameters in the text recognition model to be trained are updated according to the model parameters in the pre-training language model to obtain the updated text recognition model, which specifically includes the following: the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the text recognition model to be trained are updated according to the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the pre-training language model, respectively, to obtain the updated text recognition model.
Specifically, the terminal extracts a text mapping parameter, a semantic mapping parameter and a character prediction parameter from a pre-training language model; and then respectively updating the text mapping parameters in the text recognition model to be trained into the text mapping parameters in the pre-training language model, updating the semantic mapping parameters in the text recognition model to be trained into the semantic mapping parameters in the pre-training language model, and updating the character prediction parameters in the text recognition model to be trained into the character prediction parameters in the pre-training language model, thereby obtaining the updated text recognition model.
In this embodiment, the updated text recognition model is obtained by updating the corresponding model parameters (the text mapping parameter, the semantic mapping parameter and the character prediction parameter) in the text recognition model to be trained respectively according to the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the pre-trained language model, so that the updated text recognition model has higher accuracy rate even without training, thereby further improving the accuracy rate of the text processing of the trained text recognition model.
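As an illustration of this parameter initialization, the following Python sketch (not part of the claimed method; the class, the GRU encoder used as a stand-in for a unidirectional Transformer stack, and all dimensions are assumptions) copies the text mapping, semantic mapping and character prediction parameters of a pre-trained model into a model to be trained:

```python
import torch.nn as nn

class TextRecognitionModel(nn.Module):
    """Minimal stand-in with the three parameter groups named in the description."""
    def __init__(self, vocab_size=100, d_e=32, d_h=32):
        super().__init__()
        self.text_mapping = nn.Embedding(vocab_size, d_e)           # text mapping parameters (Emb)
        self.semantic_mapping = nn.GRU(d_e, d_h, batch_first=True)  # semantic mapping parameters (Enc)
        self.char_prediction = nn.Linear(d_h, vocab_size)           # character prediction parameters

pretrained = TextRecognitionModel()   # stands in for the trained pre-training language model
to_train = TextRecognitionModel()     # the text recognition model to be trained

# Initialize the model to be trained with the pre-trained parameters of the same type,
# i.e. assign the pre-trained values to the corresponding untrained parameters.
to_train.load_state_dict(pretrained.state_dict())
```

In practice the copied parameters would come from an actual unidirectional pre-trained language model such as GPT-2 or CPM, rather than a randomly initialized stand-in.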
In an embodiment, in step S102, the sample text training set is input to the updated text recognition model to obtain the prediction object information, the prediction capability information, and the predicted emotion type of the sample text training set, which specifically includes the following: performing text mapping processing on the sample text training set according to the text mapping parameters of the updated text recognition model to obtain text vectors of the sample text training set; performing semantic mapping processing on the text vectors of the sample text training set according to the semantic mapping parameters of the updated text recognition model to obtain the semantic vectors of the sample text training set; and performing character prediction processing on the semantic vectors of the sample text training set according to the character prediction parameters of the updated text recognition model to obtain the prediction object information, prediction capability information and predicted emotion type of the sample text training set.
Specifically, the terminal converts each character of the sample text sequence in the sample text training set into a text vector through the text mapping parameters of the updated text recognition model; converts the text vector of the sample text sequence into a semantic vector of the sample text sequence through the semantic mapping parameters of the updated text recognition model; and performs character prediction processing on the semantic vector of the sample text sequence through the character prediction parameters of the updated text recognition model to obtain the next character after the semantic vector of the sample text sequence. Character prediction processing is then performed on the semantic vector together with that next character to obtain the character after it; in this way, the result of each round of character prediction processing, together with the inputs of that round, is used as the input of the next round, until the updated text recognition model outputs the prediction object information of the sample text training set. Similarly, character prediction processing is performed on the semantic vector and the prediction object information through the character prediction function, and the characters of the capability information are predicted in turn to obtain the prediction capability information; finally, character prediction processing is performed on the semantic vector, the prediction object information and the prediction capability information through the character prediction function to obtain the predicted emotion type.
In practical applications, assume that the sample text sequence is represented as X = [x_1, ..., x_N], where x_N is a character of the sample text sequence and N is the number of characters of the sample text sequence; assume that the actual object information of the sample text sequence is represented as the character sequence T = [t_1, ..., t_|T|], where t_|T| is a character of the actual object information and |T| is the number of characters of the actual object information; assume that the actual capability information is represented as the character sequence A = [a_1, ..., a_|A|], where a_|A| is a character of the actual capability information and |A| is the number of characters of the actual capability information; and assume that the actual emotion type is represented as P ∈ {-1, 0, 1}, where -1 indicates that the emotion polarity of the sample text sequence is negative, 0 indicates that it is neutral, and 1 indicates that it is positive. [X, T, A, P] is taken as the sample text training set.
The terminal takes the sample text training set [X, T, A, P] as the input of the updated text recognition model, and converts each character of the sample text sequence X into a text vector through the text mapping function Emb(·) of the updated text recognition model, where the text vector E can be obtained as follows:
E = Emb(X), E ∈ R^(N×d_e)
where d_e denotes the dimension of a text vector and R^(N×d_e) denotes the N×d_e-dimensional real vector space.
The text vector E is then converted into a semantic vector through the semantic mapping function Enc(·) of the updated text recognition model, where the semantic vector H can be obtained as follows:
H = Enc(E), H ∈ R^(N×d_h)
where d_h denotes the dimension of a semantic vector and R^(N×d_h) denotes the N×d_h-dimensional real vector space.
The probability of the next character following the semantic vector H is then predicted through the character prediction function of the updated text recognition model. Since the sample text training set is [X, T, A, P], the predicted next character is the first character of T, and the predicted first character of T is denoted t'_1. The character prediction function then takes the semantic vector H and t'_1 as input and predicts the second character of T, denoted t'_2, and so on, until the updated text recognition model has predicted all of the prediction object information T'. Similarly, character prediction processing is performed on the semantic vector H and the prediction object information T' through the character prediction function to obtain the prediction capability information A', and finally character prediction processing is performed on the semantic vector H, the prediction object information T' and the prediction capability information A' through the character prediction function to obtain the predicted emotion type P'.
In this embodiment, a text vector of a sample text training set is obtained by performing text mapping processing on the sample text training set according to the updated text mapping parameters of the text recognition model; performing semantic mapping processing on the text vector of the sample text training set according to the semantic mapping parameters of the updated text recognition model to obtain the semantic vector of the sample text training set; according to the updated character prediction parameters of the text recognition model, character prediction processing is carried out on semantic vectors of the sample text training set to obtain predicted object information, predicted capability information and predicted emotion types of the sample text training set, and the aspect-level emotion recognition task of the sample text training set is reasonably converted into a text generation task, so that the trained text recognition model can predict the emotion types of texts and can predict the predicted object information and predicted capability information of the texts without marking the object information and capability information of the texts in advance, the emotion recognition efficiency of the texts is improved, errors caused when the model respectively processes the object information and the capability information are reduced, and the emotion recognition accuracy of the texts is improved.
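To make the three-stage forward pass and the character-by-character prediction loop concrete, here is a minimal PyTorch sketch (an illustration only; the GRU encoder, the greedy decoding, and all names and sizes are assumptions standing in for the actual Emb(·), Enc(·) and character prediction function):

```python
import torch
import torch.nn as nn

class TextRecognitionModel(nn.Module):
    """Sketch of the forward pass described above: Emb -> Enc -> character prediction."""
    def __init__(self, vocab_size=100, d_e=32, d_h=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_e)       # text mapping: characters -> text vectors E
        self.enc = nn.GRU(d_e, d_h, batch_first=True)  # semantic mapping: E -> semantic vectors H
        self.head = nn.Linear(d_h, vocab_size)         # character prediction: H -> next-character logits

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        E = self.emb(token_ids)   # [batch, seq, d_e]
        H, _ = self.enc(E)        # [batch, seq, d_h]
        return self.head(H)       # next-character logits at every position

    @torch.no_grad()
    def generate(self, prefix_ids: torch.Tensor, steps: int) -> torch.Tensor:
        """Greedy generation: each predicted character is fed back as input for the next round,
        which is how T', A' and P' are produced one character at a time after X."""
        ids = prefix_ids
        for _ in range(steps):
            next_id = self.forward(ids)[:, -1].argmax(dim=-1, keepdim=True)
            ids = torch.cat([ids, next_id], dim=1)
        return ids

model = TextRecognitionModel()
x = torch.randint(0, 100, (1, 12))       # toy stand-in for the encoded sample text sequence X
print(model.generate(x, steps=8).shape)  # X extended by 8 generated characters
```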
In an embodiment, in step S103, the updated text recognition model is trained according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information, and the difference between the actual emotion type and the predicted emotion type to obtain a trained text recognition model, which specifically includes the following: obtaining a first loss value according to the difference between the actual object information and the predicted object information, obtaining a second loss value according to the difference between the actual capability information and the predicted capability information, and obtaining a third loss value according to the difference between the actual emotion type and the predicted emotion type; according to the first loss value, the second loss value and the third loss value, respectively performing gradient updates on the text mapping parameters, the semantic mapping parameters and the character prediction parameters in the updated text recognition model until a training end condition is reached; and taking the updated text recognition model that has reached the training end condition as the trained text recognition model.
Wherein the first loss value is a loss value calculated by a first loss function (a loss function for optimizing text mapping parameters in the updated text recognition model).
Wherein the second loss value is a loss value calculated by a second loss function (a loss function for optimizing semantic mapping parameters in the updated text recognition model).
Wherein the third loss value is a loss value calculated by a third loss function (a loss function for optimizing the character prediction parameters in the updated text recognition model).
Specifically, the terminal constructs the first loss value according to the difference between the actual object information and the predicted object information, constructs the second loss value according to the difference between the actual capability information and the predicted capability information, and constructs the third loss value according to the difference between the actual emotion type and the predicted emotion type. The first loss value, the second loss value and the third loss value may be optimized based on a gradient descent optimization algorithm or a variant thereof, such as an adaptive gradient (Ada-type) optimizer or the Adam (Adaptive Moment Estimation) optimizer. The terminal iteratively optimizes the text mapping parameters, the semantic mapping parameters and the character prediction parameters in the updated text recognition model according to the first loss value, the second loss value and the third loss value until the updated text recognition model converges after iterative optimization, and the text recognition model at convergence is taken as the trained text recognition model.
In this embodiment, a first loss value is obtained according to a difference between the actual object information and the predicted object information, a second loss value is obtained according to a difference between the actual capability information and the predicted capability information, and a third loss value is obtained according to a difference between the actual emotion type and the predicted emotion type; according to the first loss value, the second loss value and the third loss value, respectively carrying out gradient updating on a text mapping parameter, a semantic mapping parameter and a character prediction parameter in the updated text recognition model until a training end condition is reached; the text recognition model which reaches the training end condition and is updated again is used as the trained text recognition model, so that the trained text recognition model learns the actual object information, the actual capability information and the actual emotion type of the sample text training set, the trained text recognition model can predict the emotion type of the text and the predicted object information and the predicted capability information of the text, and the accuracy and the efficiency of emotion recognition of the text by the trained text recognition model are improved.
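A minimal sketch of the three loss values and the gradient update follows, under assumed shapes and segment boundaries (the cross-entropy losses, the Adam optimizer and the linear head are illustrative choices, not the patent's specified implementation):

```python
import torch
import torch.nn as nn

vocab_size, d_h, seq_len = 100, 32, 20
model_head = nn.Linear(d_h, vocab_size)               # stand-in for the updated text recognition model
H = torch.randn(1, seq_len, d_h)                      # semantic vectors of the fused sample [X, T, A, P]
targets = torch.randint(0, vocab_size, (1, seq_len))  # actual characters of the fused sample

# Hypothetical positions whose predictions correspond to T, A and P respectively
t_span, a_span, p_span = slice(10, 14), slice(14, 18), slice(18, 20)

logits = model_head(H)
ce = nn.CrossEntropyLoss()
loss_t = ce(logits[:, t_span].reshape(-1, vocab_size), targets[:, t_span].reshape(-1))  # first loss (object)
loss_a = ce(logits[:, a_span].reshape(-1, vocab_size), targets[:, a_span].reshape(-1))  # second loss (capability)
loss_p = ce(logits[:, p_span].reshape(-1, vocab_size), targets[:, p_span].reshape(-1))  # third loss (emotion type)

optimizer = torch.optim.Adam(model_head.parameters(), lr=1e-4)  # Adam, one of the variants mentioned above
optimizer.zero_grad()
(loss_t + loss_a + loss_p).backward()  # gradient update driven by the three differences
optimizer.step()
```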
In an embodiment, in step S101, a sample text training set is acquired, which specifically includes the following: acquiring a preset data spacer, a sample text sequence, and the actual object information, actual capability information and actual emotion type of the sample text sequence; and according to the preset data spacer, carrying out fusion processing on the sample text sequence, the actual object information, the actual capability information and the actual emotion type to obtain a sample text training set.
The data spacer is a symbol for distinguishing different character data in the sample text training set.
Specifically, the terminal acquires a preset data spacer, a sample text sequence, and the actual object information, actual capability information and actual emotion type of the sample text sequence; the terminal fuses the sample text sequence and the actual object information through the preset data spacer to obtain a first fusion result, fuses the first fusion result and the actual capability information through the preset data spacer to obtain a second fusion result, and fuses the second fusion result and the actual emotion type through the preset data spacer to obtain the sample text training set.
FIG. 3 is a flowchart illustrating the steps of obtaining a sample text training set. As shown in FIG. 3, the data spacer may be set as [sep]. The terminal acquires the sample text sequence X = [x_1, ..., x_N], the actual object information T = [t_1, ..., t_|T|] of the sample text sequence, the actual capability information A = [a_1, ..., a_|A|], and the actual emotion type P, and sequentially fuses X, T, A and P through the data spacer [sep] to obtain the sample text training set S. The sample text training set S may be represented as follows:
S = X [sep] T [sep] A [sep] P
In this embodiment, the preset data spacer, the sample text sequence, and the actual object information, actual capability information and actual emotion type of the sample text sequence are acquired; the sample text sequence, the actual object information, the actual capability information and the actual emotion type are fused according to the preset data spacer to obtain the sample text training set. In the subsequent steps, the updated text recognition model is trained on the basis of this sample text training set, so that the trained text recognition model learns the actual object information, actual capability information and actual emotion type of the sample text training set, which improves the accuracy and efficiency of emotion recognition on text by the trained text recognition model.
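For illustration, a minimal Python sketch of this fusion step (the function name and the [sep] spacer value are assumptions consistent with the example above):

```python
SEP = "[sep]"  # the preset data spacer assumed in this sketch

def build_training_sample(text: str, obj: str, ability: str, sentiment: int) -> str:
    """Fuse X, T, A and P into a single training string S = X[sep]T[sep]A[sep]P."""
    return SEP.join([text, obj, ability, str(sentiment)])

# Hypothetical example mirroring the one in the description
sample = build_training_sample(
    "The songs composed by Zhang San are truly top-notch, leading the entire era!",
    "Zhang San",           # actual object information T
    "creation capability", # actual capability information A
    1,                     # actual emotion type P (1 = positive)
)
print(sample)
```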
In an embodiment, as shown in FIG. 4, a text recognition method is provided. This embodiment is illustrated by applying the method to a terminal; it is to be understood that the method may also be applied to a server, or to a system including a terminal and a server and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:
step S401, a text to be recognized is obtained.
The text to be recognized refers to text data needing to recognize object information, capability information and emotion types.
Step S402, inputting the text to be recognized into the trained text recognition model to obtain target object information, target capability information and target emotion type of the text to be recognized as emotion analysis results of the text to be recognized; the trained text recognition model is obtained by training the updated text recognition model through a sample text training set; the sample text training set comprises a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained.
The target object information refers to an object described by a text sequence to be recognized, which is generated by a trained text recognition model.
The target capability information refers to a certain aspect or a certain attribute of an object described by a text sequence to be recognized, which is generated by a trained text recognition model. For example, the capability information may describe an aspect of authoring capability, an aspect of shooting capability, or an aspect of singing capability of the object, or the like.
The target emotion type refers to the emotion type represented by a text sequence to be recognized generated by the trained text recognition model. For example, the emotion type may be Positive (Positive), negative (Negative), or Neutral (Neutral); the emotion type may also be represented using discrete values, such as [ -1,0,1], where-1 represents that the emotion polarity of the text is negative, 0 represents that the emotion polarity of the text is neutral, and 1 represents that the emotion polarity of the text is positive.
Specifically, the terminal obtains a sample text training set carrying the sample text sequence, and the actual object information, the actual capability information and the actual emotion type of the sample text sequence, and trains the text recognition model to be trained through the sample text training set to obtain the trained text recognition model. Further, as shown in fig. 2, the terminal inputs the text to be recognized into the trained text recognition model for text generation processing, generates and obtains target object information, target capability information and target emotion type of the text to be recognized, and uses the target object information, the target capability information and the target emotion type as emotion analysis results of the text to be recognized.
FIG. 5 is a schematic diagram illustrating a process of obtaining the emotion analysis result of a text to be recognized. As shown in FIG. 5, assume that the text S to be recognized is "The songs composed by Zhang San are truly top-notch, leading the entire era!". The text S to be recognized is input into the trained text recognition model to obtain the emotion analysis result output by the trained text recognition model. The emotion analysis result can be regarded as a text sequence comprising the target object information, the target capability information and the target emotion type, where the target object information is "Zhang San", the target capability information is "creation capability", and the target emotion type is 1, i.e., the emotion polarity of the text to be recognized is positive.
The text recognition method comprises the steps of obtaining a text to be recognized; inputting the text to be recognized into the trained text recognition model to obtain target object information, target capability information and target emotion type of the text to be recognized as emotion analysis results of the text to be recognized; the trained text recognition model is obtained by training the updated text recognition model through a sample text training set; the sample text training set comprises a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating the model parameters in the text recognition model to be trained. By adopting the method, the trained text recognition model can predict the emotion type of the text, can predict the prediction object information and the prediction capability information of the text without marking the object information and the capability information of the text in advance, and improves the emotion recognition efficiency of the text to be recognized while improving the emotion recognition accuracy of the text to be recognized.
In an embodiment, in step S402, the text to be recognized is input to the trained text recognition model, so as to obtain the target object information, the target capability information, and the target emotion type of the text to be recognized, which specifically include the following contents: performing text mapping processing on the text to be recognized according to the text mapping parameters of the trained text recognition model to obtain a text vector of the text to be recognized; according to the semantic mapping parameters of the trained text recognition model, performing semantic mapping processing on the text to be recognized to obtain a semantic vector of the text to be recognized; and performing character prediction processing on the semantic vector of the text to be recognized according to the character prediction parameters of the trained text recognition model to obtain target object information, target capability information and target emotion types of the text to be recognized.
Specifically, the terminal converts each character of the text to be recognized into a text vector through the text mapping parameters of the trained text recognition model; converts the text vector of the text to be recognized into a semantic vector through the semantic mapping parameters of the trained text recognition model; and performs character prediction processing on the semantic vector of the text to be recognized through the character prediction parameters of the trained text recognition model to obtain the next character after the semantic vector of the text to be recognized. Character prediction processing is then performed on the semantic vector of the text to be recognized together with that next character to obtain the character after it; in this way, the result of each round of character prediction processing, together with the inputs of that round, is used as the input of the next round, until the trained text recognition model outputs the target object information of the text to be recognized. Similarly, character prediction processing is performed on the semantic vector of the text to be recognized and the target object information through the character prediction function, and the characters of the capability information are predicted in turn to obtain the target capability information; finally, character prediction processing is performed on the semantic vector of the text to be recognized, the target object information and the target capability information through the character prediction function to obtain the target emotion type.
In this embodiment, a text vector of a text to be recognized is obtained by performing text mapping processing on the text to be recognized according to a text mapping parameter of a trained text recognition model; according to the semantic mapping parameters of the trained text recognition model, performing semantic mapping processing on the text to be recognized to obtain a semantic vector of the text to be recognized; according to the character prediction parameters of the trained text recognition model, the semantic vector of the text to be recognized is subjected to character prediction processing to obtain target object information, target capability information and a target emotion type of the text to be recognized, and the aspect level emotion recognition task of the text to be recognized is reasonably converted into a text generation task, so that the trained text recognition model can predict the target emotion type of the text to be recognized, predict the target object information and the target capability information of the text to be recognized, reduce errors caused when the model processes the object information and the capability information respectively, and improve the emotion recognition accuracy of the text.
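As a small illustration of how the generated continuation can be turned into the emotion analysis result, the following Python sketch splits an assumed generated suffix by the [sep] spacer (the generated string is hard-coded here; in practice it would come from the trained model's character-by-character generation):

```python
SEP = "[sep]"  # assumed data spacer, consistent with the training sketch above

def parse_emotion_analysis(generated: str):
    """Split the generated continuation T[sep]A[sep]P into the three prediction targets."""
    obj, ability, sentiment = generated.split(SEP)
    return obj.strip(), ability.strip(), int(sentiment)

# Stand-in for the characters generated by the trained text recognition model
# after the text to be recognized.
generated_suffix = "Zhang San[sep]creation capability[sep]1"
print(parse_emotion_analysis(generated_suffix))  # ('Zhang San', 'creation capability', 1)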
In an embodiment, as shown in FIG. 6, another method for training a text recognition model is provided. This embodiment is described by taking application of the method to a terminal as an example, and the method includes the following steps:
step S601, acquiring a preset data spacer, a sample text sequence and actual object information, actual capability information and actual emotion types of the sample text sequence; and according to the preset data interval symbol, carrying out fusion processing on the sample text sequence, the actual object information, the actual capability information and the actual emotion type to obtain a sample text training set.
Step S602, updating the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the text recognition model to be trained according to the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the pre-training language model respectively to obtain an updated text recognition model.
Step S603, performing text mapping processing on the sample text training set according to the text mapping parameters of the updated text recognition model to obtain a text vector of the sample text training set.
Step S604, according to the semantic mapping parameters of the updated text recognition model, performing semantic mapping processing on the text vectors of the sample text training set to obtain the semantic vectors of the sample text training set.
Step S605, performing character prediction processing on the semantic vector of the sample text training set according to the character prediction parameters of the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set.
Step S606, a first loss value is obtained according to the difference between the actual object information and the predicted object information, a second loss value is obtained according to the difference between the actual capability information and the predicted capability information, and a third loss value is obtained according to the difference between the actual emotion type and the predicted emotion type.
Step S607, according to the first loss value, the second loss value, and the third loss value, respectively performing gradient update on the text mapping parameter, the semantic mapping parameter, and the character prediction parameter in the updated text recognition model until reaching the training end condition.
Step S608, the text recognition model obtained when the training end condition is reached is used as the trained text recognition model; an illustrative code sketch of steps S601 to S608 is given below.
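The following Python sketch pulls steps S601 to S608 together. Every concrete choice in it is an assumption made for illustration rather than a requirement of this disclosure: the Transformer-based modules standing in for the text mapping, semantic mapping and character prediction parameters, the hypothetical "[SEP]" string used as the preset data separator, the boolean segment masks used to locate the three label spans, and teacher-forced cross-entropy with the Adam optimizer as one possible concrete form of the three loss values and the gradient update.

```python
# Hedged sketch of steps S601-S608, not the patented implementation. The module
# layout, the "[SEP]" separator string, the segment masks and the use of
# cross-entropy with Adam are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextRecognitionModel(nn.Module):
    """Three parameter groups mirroring the description: text mapping,
    semantic mapping and character prediction."""

    def __init__(self, vocab_size, dim=256, heads=4, layers=2):
        super().__init__()
        self.text_mapping = nn.Embedding(vocab_size, dim)                 # text mapping parameters
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.semantic_mapping = nn.TransformerEncoder(block, layers)      # semantic mapping parameters
        self.character_prediction = nn.Linear(dim, vocab_size)            # character prediction parameters

    def forward(self, ids):
        t = ids.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf"), device=ids.device), diagonal=1)
        text_vec = self.text_mapping(ids)                                 # step S603: text vectors
        sem_vec = self.semantic_mapping(text_vec, mask=causal)            # step S604: semantic vectors
        return self.character_prediction(sem_vec)                         # step S605: character logits


def build_training_sample(text, obj, capability, emotion, sep="[SEP]"):
    """Step S601: fuse the sample text sequence with its actual object information,
    actual capability information and actual emotion type via the preset data separator."""
    return f"{text}{sep}{obj}{sep}{capability}{sep}{emotion}"


def init_from_pretrained(model, pretrained):
    """Step S602: update the three parameter groups of the model to be trained
    from the same-typed parameter groups of a pre-training language model."""
    model.text_mapping.load_state_dict(pretrained.text_mapping.state_dict())
    model.semantic_mapping.load_state_dict(pretrained.semantic_mapping.state_dict())
    model.character_prediction.load_state_dict(pretrained.character_prediction.state_dict())
    return model


def train_step(model, optimizer, batch):
    """Steps S603-S607: forward pass, three loss values, one gradient update.
    `batch` is assumed to carry token ids plus boolean masks marking which
    positions hold the object, capability and emotion label characters."""
    logits = model(batch["input_ids"])                                    # (B, T, vocab)
    per_token = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),                      # predict the character at t+1
        batch["input_ids"][:, 1:].reshape(-1),
        reduction="none",
    )
    # Step S606: first, second and third loss values from the three label segments.
    loss_obj = per_token[batch["object_mask"][:, 1:].reshape(-1)].mean()
    loss_cap = per_token[batch["capability_mask"][:, 1:].reshape(-1)].mean()
    loss_emo = per_token[batch["emotion_mask"][:, 1:].reshape(-1)].mean()
    loss = loss_obj + loss_cap + loss_emo
    # Step S607: gradient update of the text mapping, semantic mapping and
    # character prediction parameters.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)


# One possible driver loop (all names and hyper-parameters are assumptions):
#   model = init_from_pretrained(TextRecognitionModel(vocab_size=8000), pretrained_lm)
#   optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
#   for batch in loader:                               # until the training end condition is reached
#       loss = train_step(model, optimizer, batch)     # step S608: the final model is the trained one
```

In this reading, the three loss values come from the same character-level cross-entropy restricted to the object, capability and emotion segments of the fused training sample, so a single backward pass updates all three parameter groups at once.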
The training method of the text recognition model can achieve the following beneficial effects: (1) because its model parameters are initialized from the pre-training language model, the updated text recognition model already achieves a relatively high emotion recognition accuracy even before training; (2) the trained text recognition model can predict not only the emotion type of a text but also the predicted object information and predicted capability information of the text, so the object information and capability information of the text do not need to be annotated in advance, which improves the emotion recognition efficiency of the text while also improving its emotion recognition accuracy.
In order to describe the text recognition method provided by the embodiments of the present disclosure more clearly, the method is explained in detail below with a specific embodiment. The text recognition method can be applied to a terminal and specifically includes the following steps:
Fig. 7 is a schematic view of an application scene of the text recognition method. As shown in fig. 7, the text recognition method of the present application may be applied to various application scenes with online comments, such as e-commerce comment platforms, news comment communities and music comment platforms. A user may enter a personal comment on the terminal for the goods or works displayed on the terminal, and the comment is displayed on a comment page. The terminal obtains the comment text from the comment page, inputs the comment text into the trained text recognition model to obtain the target object information, target capability information and target emotion type of the comment text, and displays these as a comment summary of the comment text on the comment page for users to view.
In this embodiment, the trained text recognition model can predict not only the target emotion type of the text to be recognized but also its target object information and target capability information, which reduces the errors introduced when the object information and the capability information are processed separately and improves the emotion recognition accuracy of the text. By extracting the target object information, target capability information and target emotion type of each comment text in an online comment application scene, the method also helps users quickly understand the emotion preferences of other users and the core information of each comment.
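As a usage illustration for this scenario, the snippet below assembles a comment summary from the model output. The `recognize` helper is the decoding sketch given earlier, and the character-level vocabulary handling and the example summary text are assumptions, not details taken from this disclosure.

```python
# Hypothetical end-to-end use on a comment page; `recognize` is the decoding helper
# sketched earlier, and the vocabulary handling and example output are assumptions.
def summarize_comment(model, vocab, comment_text):
    """Produce the comment summary displayed on the comment page."""
    char_of = {i: c for c, i in vocab.items()}                 # id -> character
    text_ids = [vocab[c] for c in comment_text if c in vocab]  # comment text -> character ids
    fields = recognize(model, vocab, text_ids)                 # (object, capability, emotion) id lists
    object_str, capability_str, emotion_str = ("".join(char_of[i] for i in ids) for ids in fields)
    # e.g. a music-platform comment might yield the summary "vocals | pitch control | positive"
    return f"{object_str} | {capability_str} | {emotion_str}"
```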
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated otherwise, the execution of these steps is not strictly limited to the order shown, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the embodiments described above may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen and an input device connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication can be realized through Wi-Fi, a mobile cellular network, NFC (near field communication) or other technologies. The computer program, when executed by the processor, implements a training method of a text recognition model and/or a text recognition method. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a key, a trackball or a touchpad arranged on the housing of the computer device, or an external keyboard, touchpad or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than those shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It should be noted that, the user information (including but not limited to user device information, user personal information, actual object information, etc.) and data (including but not limited to data for analysis, stored data, presented data, text data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, etc., without limitation.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments only express several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent application. It should be noted that those of ordinary skill in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (11)

1. A method for training a text recognition model, the method comprising:
acquiring a sample text training set and an updated text recognition model; the sample text training set comprises a sample text sequence and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained;
inputting the sample text training set into the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set;
and training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information, and the difference between the actual emotion type and the predicted emotion type to obtain the trained text recognition model.
2. The method of claim 1, wherein the updated text recognition model is obtained by:
acquiring a pre-training language model; the type of the model parameter in the pre-training language model is the same as the type of the model parameter in the text recognition model to be trained;
and updating the model parameters in the text recognition model to be trained according to the model parameters in the pre-training language model to obtain the updated text recognition model.
3. The method of claim 2, wherein the model parameters in the pre-trained language model include text mapping parameters, semantic mapping parameters, and character prediction parameters;
the updating the model parameters in the text recognition model to be trained according to the model parameters in the pre-training language model to obtain the updated text recognition model includes:
and updating the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the text recognition model to be trained according to the text mapping parameter, the semantic mapping parameter and the character prediction parameter of the pre-training language model respectively to obtain the updated text recognition model.
4. The method of claim 1, wherein inputting the sample text training set to the updated text recognition model to obtain the predicted object information, the predicted capability information, and the predicted emotion type of the sample text training set comprises:
performing text mapping processing on the sample text training set according to the text mapping parameters of the updated text recognition model to obtain a text vector of the sample text training set;
performing semantic mapping processing on the text vector of the sample text training set according to the semantic mapping parameters of the updated text recognition model to obtain the semantic vector of the sample text training set;
and performing character prediction processing on the semantic vector of the sample text training set according to the character prediction parameters of the updated text recognition model to obtain prediction object information, prediction capability information and prediction emotion types of the sample text training set.
5. The method of claim 1, wherein the training the updated text recognition model according to the difference between the actual object information and the predicted object information, the difference between the actual capability information and the predicted capability information, and the difference between the actual emotion type and the predicted emotion type to obtain a trained text recognition model comprises:
obtaining a first loss value according to the difference between the actual object information and the predicted object information, obtaining a second loss value according to the difference between the actual capability information and the predicted capability information, and obtaining a third loss value according to the difference between the actual emotion type and the predicted emotion type;
according to the first loss value, the second loss value and the third loss value, respectively carrying out gradient updating on a text mapping parameter, a semantic mapping parameter and a character prediction parameter in the updated text recognition model until a training end condition is reached;
and taking the text recognition model which reaches the training end condition as a trained text recognition model.
6. The method of claim 1, wherein obtaining a training set of sample texts comprises:
acquiring a preset data separator, the sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence;
and performing fusion processing on the sample text sequence, the actual object information, the actual capability information and the actual emotion type according to the preset data separator to obtain the sample text training set.
7. A method of text recognition, the method comprising:
acquiring a text to be recognized;
inputting the text to be recognized into a trained text recognition model to obtain target object information, target capability information and a target emotion type of the text to be recognized as an emotion analysis result of the text to be recognized; the trained text recognition model is obtained by training the updated text recognition model through a sample text training set; the sample text training set comprises a sample text sequence, and actual object information, actual capability information and actual emotion types of the sample text sequence; the updated text recognition model is obtained by updating model parameters in the text recognition model to be trained.
8. The method of claim 7, wherein the inputting the text to be recognized into the trained text recognition model to obtain the target object information, the target capability information and the target emotion type of the text to be recognized comprises:
performing text mapping processing on the text to be recognized according to the text mapping parameters of the trained text recognition model to obtain a text vector of the text to be recognized;
performing semantic mapping processing on the text to be recognized according to the semantic mapping parameters of the trained text recognition model to obtain a semantic vector of the text to be recognized;
and performing character prediction processing on the semantic vector of the text to be recognized according to the character prediction parameters of the trained text recognition model to obtain the target object information, the target capability information and the target emotion type of the text to be recognized.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 8.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
11. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 8 when executed by a processor.
CN202211347800.1A 2022-10-31 2022-10-31 Training method of text recognition model, text recognition method, medium, and apparatus Pending CN115688903A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211347800.1A CN115688903A (en) 2022-10-31 2022-10-31 Training method of text recognition model, text recognition method, medium, and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211347800.1A CN115688903A (en) 2022-10-31 2022-10-31 Training method of text recognition model, text recognition method, medium, and apparatus

Publications (1)

Publication Number Publication Date
CN115688903A (en) 2023-02-03

Family

ID=85045789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211347800.1A Pending CN115688903A (en) 2022-10-31 2022-10-31 Training method of text recognition model, text recognition method, medium, and apparatus

Country Status (1)

Country Link
CN (1) CN115688903A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination