CN111709234A - Training method and device of text processing model and electronic equipment


Info

Publication number
CN111709234A
Authority
CN
China
Prior art keywords
replacement
sentence
original
original sentence
sentences
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010465386.9A
Other languages
Chinese (zh)
Other versions
CN111709234B (en)
Inventor
陈亮宇
刘家辰
肖欣延
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010465386.9A
Publication of CN111709234A
Application granted
Publication of CN111709234B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms

Abstract

The application discloses a training method and apparatus for a text processing model, and an electronic device, relating to the technical field of natural language processing. The specific implementation scheme is as follows: acquiring an original sentence set, wherein the original sentence set comprises a plurality of original sentences; performing word segmentation on each original sentence to determine the entries contained in each original sentence; replacing at least one of the entries contained in each original sentence with a synonym to generate a plurality of replacement sentences corresponding to the original sentences; and training an initial text processing model using the plurality of original sentences and the corresponding plurality of replacement sentences. With this training method, the trained text processing model can directly polish input text without depending on a dictionary, the computation cost is low, and the text polishing effect of the text processing model is improved.

Description

Training method and device of text processing model and electronic equipment
Technical Field
The application relates to the technical field of computers, in particular to the technical field of natural language processing, and provides a training method and apparatus for a text processing model, and an electronic device.
Background
Text polishing is an important technology in assisted writing that can help authors find better wording.
In the related art, text polishing is usually realized by combining a dictionary with a language model. However, this approach is computationally expensive and depends heavily on dictionary quality, resulting in poor polishing quality.
Disclosure of Invention
The application provides a training method and apparatus for a text processing model, an electronic device, and a storage medium.
According to an aspect of the present application, there is provided a training method for a text processing model, including: acquiring an original sentence set, wherein the original sentence set comprises a plurality of original sentences; performing word segmentation on each original sentence to determine the entries contained in each original sentence; replacing at least one of the entries contained in each original sentence with a synonym to generate a plurality of replacement sentences corresponding to the plurality of original sentences; and training an initial text processing model using the plurality of original sentences and the corresponding plurality of replacement sentences.
According to another aspect of the present application, there is provided a training apparatus for a text processing model, including: the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring an original sentence set, and the original sentence set comprises a plurality of original sentences; the determining module is used for performing word segmentation processing on each original sentence to determine each entry contained in each original sentence; a replacing module, configured to replace at least one entry in the entries included in each original sentence with a synonym, so as to generate a plurality of replacement sentences corresponding to the plurality of original sentences, respectively; and the training module is used for training the initial text processing model by utilizing the original sentences and the corresponding replacement sentences.
According to still another aspect of the present application, there is provided an electronic apparatus including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of training a text processing model as described above.
According to yet another aspect of the present application, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the training method of a text processing model as described above.
The technical scheme of the application addresses the problems of realizing text polishing by combining a dictionary with a language model, namely the large amount of computation and the poor polishing quality caused by heavy dependence on dictionary quality. Synonym replacement is performed on some of the entries in each original sentence in the original sentence set to generate a plurality of replacement sentences corresponding to each original sentence; the initial text processing model then generates, from each replacement sentence, the original sentence corresponding to it, and the model is trained on this task. In this way, the initial text processing model learns to generate the high-quality original sentence corresponding to a low-quality replacement sentence, which realizes the training of the initial text processing model, so that the trained text processing model can directly polish input text without depending on a dictionary, the computation cost is low, and the text polishing effect of the text processing model is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
Fig. 1 is a schematic flowchart of a training method for a text processing model according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of another training method for a text processing model according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of yet another training method for a text processing model according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a training apparatus for a text processing model according to an embodiment of the present application;
Fig. 5 is a block diagram of an electronic device for implementing a training method for a text processing model according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are omitted below for clarity and conciseness.
The embodiment of the application provides a training method for a text processing model, aimed at the problems in the related art that text polishing realized by combining a dictionary with a language model requires a large amount of computation and yields poor polishing quality due to heavy dependence on dictionary quality.
The following describes in detail a method, an apparatus, an electronic device, and a storage medium for training a text processing model provided in the present application with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a training method for a text processing model according to an embodiment of the present disclosure.
As shown in fig. 1, the method for training a text processing model includes the following steps:
Step 101, obtaining an original sentence set, wherein the original sentence set comprises a plurality of original sentences.
The original sentences may be high-quality corpus data obtained from networks, documents, and the like.
In the embodiment of the present application, high-quality corpus data can be obtained from information articles, encyclopedia data, excellent written works and other data as original sentences, and the obtained large number of original sentences are used to form an original sentence set.
Step 102, performing word segmentation processing on each original sentence to determine each entry contained in each original sentence.
In the embodiment of the present application, some of the entries in each original sentence may be replaced to generate replacement sentences that are synonymous with the original sentence but expressed differently, which serve as the corpus for training the text processing model. Therefore, word segmentation may be performed on each original sentence in the original sentence set to determine the entries contained in each original sentence.
Step 103, replacing at least one entry in the entries contained in each original sentence with a synonym, so as to generate a plurality of replacement sentences corresponding to the original sentences respectively.
In the embodiment of the present application, after determining the entry included in each original sentence, synonyms of some entries in each original sentence may be determined by using a pre-established synonym library, and then some entries in each original sentence are replaced by the synonyms, so as to generate a plurality of replacement sentences corresponding to a plurality of original sentences, respectively. One original sentence may correspond to one alternative sentence, or may correspond to a plurality of alternative sentences.
As a possible implementation manner, a random manner may be adopted, a preset number of terms to be replaced are selected from each original sentence, a synonym corresponding to each term to be replaced is obtained from a preset synonym library, and then the synonym corresponding to each term to be replaced is used to replace the term to be replaced in each original sentence, so as to generate a replacement sentence corresponding to each original sentence.
It should be noted that, in actual use, when generating replacement sentences for an original sentence, the number of entries replaced in each original sentence may be determined according to actual needs and the specific application scenario, which is not limited in the embodiments of the present application. For example, the number of entries replaced in each original sentence may be 1-2.
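As an illustration only (not part of the patent disclosure), the following Python sketch shows this random synonym-substitution step. The tiny SYNONYMS lexicon and the helper name are hypothetical; in practice a curated synonym library for the target language would stand in for them.

```python
import random

# Hypothetical synonym lexicon; any preset synonym library could replace it.
SYNONYMS = {
    "beautiful": ["pretty", "lovely"],
    "happy": ["glad", "joyful"],
}

def make_replacement_sentence(tokens, num_to_replace=2, rng=random):
    """Randomly replace up to `num_to_replace` entries with a synonym."""
    replaceable = [i for i, t in enumerate(tokens) if t in SYNONYMS]
    chosen = rng.sample(replaceable, min(num_to_replace, len(replaceable)))
    out = list(tokens)
    for i in chosen:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

# Example: one original sentence yields one replacement sentence per call.
original = ["the", "flowers", "are", "beautiful", "today"]
print(make_replacement_sentence(original, num_to_replace=1))
```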
Step 104, training the initial text processing model using the plurality of original sentences and the corresponding plurality of replacement sentences.
In the embodiment of the present application, after the replacement sentences corresponding to each original sentence are determined, each replacement sentence may be processed with the initial text processing model to generate the original sentence corresponding to it. The parameters of the initial text processing model are then updated according to the differences between the original sentences generated by the model and the corresponding original sentences in the original sentence set, until the performance of the updated text processing model meets the requirements, which completes the training process of the text processing model.
According to the technical scheme of the application, synonym replacement is performed on some of the entries in each original sentence in the original sentence set to generate a plurality of replacement sentences corresponding to each original sentence; the initial text processing model then generates, from each replacement sentence, the original sentence corresponding to it, and the model is trained on this task. In this way, the initial text processing model learns to generate the high-quality original sentence corresponding to a low-quality replacement sentence, which realizes the training of the initial text processing model, so that the trained text processing model can directly polish input text without depending on a dictionary, the computation cost is low, and the text polishing effect of the text processing model is improved.
In a possible implementation form of the method, the replacement mode for an original sentence can be determined according to parameters such as the part of speech of each entry it contains, the number of entries it contains, and the number of synonyms corresponding to each entry to be replaced, so as to improve the richness of the replacement sentences and further improve the training effect of the text processing model.
The following further describes the training method of the text processing model provided in the embodiment of the present application with reference to fig. 2.
Fig. 2 is a flowchart illustrating another text processing model training method according to an embodiment of the present disclosure.
As shown in fig. 2, the method for training the text processing model includes the following steps:
Step 201, obtaining an original sentence set, wherein the original sentence set comprises a plurality of original sentences.
Step 202, performing word segmentation processing on each original sentence to determine each entry contained in each original sentence.
The detailed implementation process and principle of the steps 201-202 can refer to the detailed description of the above embodiments, and are not described herein again.
Step 203, obtaining the part of speech of each entry in each original sentence.
As a possible implementation manner, some entries in the original sentence may be replaced according to the part of speech of each entry it contains, so as to generate the replacement sentences corresponding to the original sentence. Therefore, after word segmentation is performed on each original sentence, part-of-speech recognition can be performed on each entry with any part-of-speech recognition tool to determine the part of speech of each entry in each original sentence.
Step 204, determining a plurality of candidate entries contained in each original sentence according to the part of speech of each entry.
A candidate entry is an entry in the original sentence whose part of speech meets a preset condition. For example, if the preset condition is that the part of speech is verb or adjective, the candidate entries of an original sentence are all the verbs and adjectives it contains.
As a possible implementation manner, when text is polished, words with parts of speech such as verbs and adjectives are generally polished with high frequency, while words with parts of speech such as nouns and pronouns generally do not need to be polished or are polished with low frequency. Therefore, when generating replacement sentences, the words polished with high frequency in the original sentence can be replaced, so that the trained text processing model pays more attention to the parts of speech polished with high frequency, further improving the text polishing effect of the text processing model.
Specifically, the parts of speech to be replaced may be preset; then, according to the part of speech of each entry in each original sentence, the entries whose part of speech is a part of speech to be replaced are determined as the candidate entries contained in that original sentence.
For example, if the parts of speech to be replaced are verbs and adjectives, the verbs and the adjectives included in each original sentence can be determined as candidate entries included in each original sentence.
It should be noted that, in actual use, the part of speech of the candidate entry may be preset according to actual needs and specific application scenarios, which is not limited in the embodiment of the present application.
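For illustration, the following is a minimal sketch of candidate-entry selection by part of speech. The tagged input is assumed to come from any part-of-speech recognition tool; the tag set and function name are assumptions, not prescribed by the patent.

```python
# Only verbs and adjectives are treated as replaceable here, matching the
# example above; the set of parts of speech to be replaced is configurable.
REPLACEABLE_POS = {"VERB", "ADJ"}

def candidate_indices(tagged):
    """tagged: list of (token, pos) pairs for one original sentence."""
    return [i for i, (_, pos) in enumerate(tagged) if pos in REPLACEABLE_POS]

tagged = [("the", "DET"), ("sun", "NOUN"), ("shines", "VERB"), ("bright", "ADJ")]
print(candidate_indices(tagged))  # -> [2, 3]
```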
Step 205, replacing at least one of the candidate entries contained in each original sentence with a synonym, so as to generate a plurality of replacement sentences corresponding to the plurality of original sentences.
As a possible implementation manner, synonyms corresponding to a plurality of candidate entries included in each original sentence are determined according to a preset synonym library, and then all the candidate entries are replaced by using the synonyms corresponding to each candidate entry, so as to generate a replacement sentence corresponding to each original sentence.
As another possible implementation manner, a preset number of candidate entries may be selected from a plurality of candidate entries included in each original sentence in a random manner, synonyms corresponding to the preset number of candidate entries are selected according to a preset synonym library, and then, the selected candidate entries are replaced by the synonyms corresponding to the selected preset number of candidate entries, so as to generate a replacement sentence corresponding to each original sentence.
Further, for original sentences with different lengths, different numbers of entries contained in the original sentences can be replaced. That is, in a possible implementation form of the embodiment of the present application, the method may further include:
determining the number N of terms to be replaced corresponding to each original sentence according to the number of terms contained in each original sentence, wherein N is a positive integer;
and replacing the N entries in each original sentence with corresponding synonyms respectively to generate a plurality of first replacement sentences corresponding to the original sentences.
As a possible implementation manner, the number N of the to-be-replaced entries corresponding to the original sentence may be positively correlated with the number of the entries included in the original sentence, that is, the greater the number of the entries included in the original sentence is, the greater the number N of the to-be-replaced entries corresponding to the original sentence is.
Optionally, in a possible implementation form of the embodiment of the present application, a mapping relationship between ranges of entry counts and the number N of entries to be replaced may be preset, so that the number N of entries to be replaced corresponding to each original sentence can be determined according to the range to which its entry count belongs. Then, the entries to be replaced are selected from each original sentence either randomly or by the part-of-speech-based selection described above; the synonyms corresponding to the entries to be replaced are determined according to a preset synonym library; and the entries to be replaced in each original sentence are replaced with their corresponding synonyms to generate the first replacement sentence corresponding to that original sentence.
For example, suppose the preset mapping between entry-count ranges and the number N of entries to be replaced is: when the entry count is at most 10, N is 1; when it is greater than 10 and at most 20, N is 2; when it is greater than 20 and at most 30, N is 3; and so on. If original sentence A contains 15 entries, the number N of entries to be replaced for sentence A is 2, so 2 entries can be selected from sentence A as entries to be replaced and replaced with their synonyms, generating the first replacement sentence corresponding to sentence A.
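A minimal sketch of such a length-based mapping, using the illustrative thresholds from the example above; the closed-form formula generalizing the "and so on" pattern is an assumption.

```python
def num_to_replace(num_entries: int) -> int:
    """Map sentence length to the number N of entries to replace:
    1..10 -> 1, 11..20 -> 2, 21..30 -> 3, and so on."""
    return (num_entries + 9) // 10

print(num_to_replace(15))  # -> 2, matching original sentence A in the example
```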
Furthermore, for the original sentence with a long length, the original sentence can be replaced by adopting a plurality of modes to generate a plurality of replacement sentences corresponding to the original sentence. That is, in a possible implementation form of the embodiment of the present application, the method may further include:
acquiring the number M of entries in each original sentence, wherein M is a positive integer;
if the number M of entries contained in any original sentence in the original sentence set is greater than a threshold, replacing i entries in that original sentence with synonyms to generate a second replacement sentence corresponding to that original sentence, and replacing j entries in that original sentence with synonyms to generate a third replacement sentence corresponding to that original sentence, wherein the i entries are different from the j entries.
In the embodiment of the application, for an original sentence with the number M of terms larger than the threshold, the number N of terms to be replaced included in the original sentence can be determined according to the above manner, and then partial terms of the original sentence are replaced by multiple manners according to the number N of terms to be replaced, so as to generate multiple replacement sentences corresponding to the original sentence.
Optionally, i entries may be selected each time from the N entries to be replaced of the original sentence and replaced, so as to generate a replacement sentence corresponding to the original sentence, until all N entries have been covered, where i is a positive integer less than or equal to N.
For example, if N is 3, then when i is 1, one of the 3 entries to be replaced can be replaced each time, generating three second replacement sentences corresponding to the original sentence; when i is 2, 2 of the entries to be replaced can be replaced each time, likewise generating three second replacement sentences; and when i is 3, all 3 entries to be replaced are replaced, generating one second replacement sentence.
Optionally, when the number M of entries contained in the original sentence is greater than the threshold, the determined number N of entries to be replaced may be large. Replacing too many entries at once easily makes the semantic information of the replacement sentence incomplete or too different from the original sentence, which in turn makes the training effect of the text processing model unsatisfactory. Therefore, a threshold j on the number of entries replaced at once can be preset, so that j entries are selected from the N entries to be replaced each time, generating a plurality of third replacement sentences corresponding to the original sentence until all N entries to be replaced have been covered.
For example, if N is 4 and the threshold of the number of terms to be replaced is 3, 3 terms to be replaced in the original sentence may be replaced each time, so as to generate 4 third replacement sentences corresponding to the original sentence.
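The enumeration of j-out-of-N replacement variants can be sketched with itertools.combinations; the index list standing in for the selected entries to be replaced is illustrative.

```python
from itertools import combinations

def replacement_variants(to_replace_indices, j):
    """Enumerate every way to replace j of the N selected entries.
    For N=4, j=3 this yields C(4,3)=4 variants, matching the example."""
    return list(combinations(to_replace_indices, j))

print(len(replacement_variants([2, 5, 7, 9], 3)))  # -> 4 third replacement sentences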
Further, for an entry to be replaced having multiple synonyms, the entry to be replaced may be replaced by different synonyms, respectively, so as to generate multiple different replacement statements. That is, in a possible implementation form of the embodiment of the present application, the method may further include:
and respectively replacing any entry to be replaced in any original sentence with one of Y synonyms to generate Y fourth replacement sentences corresponding to any original sentence.
In this embodiment of the present application, if the original sentence includes a to-be-replaced entry having a plurality of synonyms, each synonym corresponding to the to-be-replaced entry may be adopted to replace the to-be-replaced entry, so as to generate a plurality of fourth replacement sentences corresponding to the original sentence.
For example, if one entry B to be replaced in the original sentence a has 4 synonyms, the entry B to be replaced may be replaced by the 4 synonyms, so as to generate 4 fourth replacement sentences corresponding to the original sentence a.
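A sketch of the Y-synonym expansion follows; the tokens, position, and synonym list are illustrative placeholders.

```python
def expand_by_synonyms(tokens, idx, synonyms):
    """One fourth replacement sentence per synonym of the entry at `idx`."""
    variants = []
    for syn in synonyms:
        out = list(tokens)
        out[idx] = syn
        variants.append(out)
    return variants

sents = expand_by_synonyms(["this", "view", "is", "beautiful"], 3,
                           ["pretty", "lovely", "gorgeous", "stunning"])
print(len(sents))  # -> 4 fourth replacement sentences
```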
Step 206, training the initial text processing model by using the plurality of original sentences and the plurality of corresponding replacement sentences.
The detailed implementation process and principle of the step 206 may refer to the detailed description of the above embodiments, and are not described herein again.
According to the technical scheme of the embodiment of the application, the way replacement sentences are generated for each original sentence is determined according to parameters such as the parts of speech and number of the entries it contains and the number of synonyms corresponding to each entry to be replaced, improving the richness of the replacement sentences; the initial text processing model then generates, from each replacement sentence, the original sentence corresponding to it, and the model is trained on this task. In this way, low-quality replacement sentences corresponding to high-quality original sentences are generated in multiple ways, enriching the corpus for training the initial text processing model, so that the trained text processing model can directly polish input text without depending on a dictionary, the computation cost is low, and both the training effect and the text polishing effect of the text processing model are further improved.
In a possible implementation form of the present application, when the initial text processing model processes a replacement sentence, it may predict the replaced positions in the sentence and the replacement words for those positions, thereby polishing the replacement sentence. The initial text processing model can therefore be trained on two aspects, the accuracy of replaced-position prediction and the accuracy of replacement-word prediction, to improve the text polishing effect of the text processing model.
The following further describes the training method of the text processing model provided in the embodiment of the present application with reference to fig. 3.
Fig. 3 is a flowchart illustrating a training method of a text processing model according to another embodiment of the present disclosure.
As shown in fig. 3, the method for training the text processing model includes the following steps:
Step 301, obtaining an original sentence set, wherein the original sentence set comprises a plurality of original sentences.
Step 302, performing word segmentation on each original sentence to determine the entries contained in each original sentence.
Step 303, replacing at least one of the entries contained in each original sentence with a synonym, so as to generate a plurality of replacement sentences corresponding to the plurality of original sentences.
The detailed implementation process and principle of the steps 301-303 can refer to the detailed description of the above embodiments, and are not described herein again.
Step 304, processing each replacement sentence with the initial text processing model to generate the prediction category label and the predicted replacement word of each participle in each replacement sentence.
The prediction category label of a participle is a parameter indicating whether the participle needs to be replaced. For example, the prediction category label may take two values, 0 and 1: when the prediction category label of a participle is 0, the participle does not need to be replaced; when it is 1, the participle needs to be replaced. The following description takes this convention as an example.
The predicted replacement word of a participle is the replacement word predicted for the participle when the initial text processing model determines that the participle needs to be replaced.
In the embodiment of the application, after a replacement sentence is input into the initial text processing model, the model may segment the replacement sentence, determine whether each participle needs to be replaced, and generate the prediction category label of each participle according to the determination result. When the model determines that the prediction category label of a participle is 1, i.e., the participle needs to be replaced, the predicted replacement word for that participle can be selected from a preset word list.
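A hypothetical PyTorch sketch of such a model follows: a shared encoder standing in for the feature processing layer feeds a label prediction head and a replacement word prediction head over a preset word list. The architecture and sizes are assumptions for illustration, not the patent's prescribed network.

```python
import torch
import torch.nn as nn

class TextPolishingModel(nn.Module):
    """Minimal sketch: shared feature processing layer plus two heads."""
    def __init__(self, vocab_size=30000, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # feature processing layer
        self.label_head = nn.Linear(hidden, 2)          # label prediction layer: 0 = keep, 1 = replace
        self.word_head = nn.Linear(hidden, vocab_size)  # replacement word prediction layer

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))
        return self.label_head(h), self.word_head(h)

model = TextPolishingModel()
label_logits, word_logits = model(torch.randint(0, 30000, (2, 12)))  # batch of 2 sentences
```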
Step 305, determining the loss value of the initial text processing model according to the difference between each original sentence and its corresponding replacement sentence, together with the prediction category label and the predicted replacement word of each participle in that replacement sentence.
As a possible implementation manner, the accuracy of the prediction results of the initial text processing model is checked against the original sentence corresponding to each replacement sentence, and a loss value of the initial text processing model is generated to train the model. That is, in a possible implementation form of the embodiment of the present application, step 305 may include:
determining the actual category label and the target replacement word of each participle in the corresponding replacement sentence according to the difference between each original sentence and the corresponding replacement sentence;
determining a first loss value according to the difference between the actual category label and the predicted category label of each word segmentation;
determining a second loss value according to the difference between the target replacement word and the predicted replacement word;
and determining the loss value of the initial text processing model according to the first loss value and the second loss value.
The actual category label of a participle is a parameter indicating whether the participle was a replaced word when the replacement sentence was generated. For example, the actual category label may take two values, 0 and 1: when the actual category label of a participle is 0, the participle was not replaced when generating the replacement sentence; when it is 1, the participle was replaced. The following description takes this convention as an example.
The target replacement word of a participle is the participle at the corresponding position in the original sentence, i.e., the word before replacement. For example, if the original sentence is "the flower is really beautiful" and the replacement sentence is "the flower is really pretty", the target replacement word of the participle "pretty" is determined to be "beautiful".
In the embodiment of the application, whether each participle in a replacement sentence was replaced when generating that sentence can be judged from the difference between the original sentence and the corresponding replacement sentence: if so, the actual category label of the participle is determined to be 1; if not, it is determined to be 0. When the actual category label of a participle in the replacement sentence is 1, the participle at the corresponding position in the original sentence corresponding to the replacement sentence can further be determined as the target replacement word of that participle.
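A sketch of recovering the actual category labels and target replacement words by aligning a replacement sentence with its original; it assumes one token per entry (multi-token entries would need a real alignment step).

```python
def alignment_targets(original_tokens, replaced_tokens):
    """Compare an original sentence with its replacement sentence to
    recover the actual category label and target replacement word per token."""
    labels, targets = [], []
    for orig, repl in zip(original_tokens, replaced_tokens):
        if orig == repl:
            labels.append(0)
            targets.append(None)   # unchanged token: no target word
        else:
            labels.append(1)
            targets.append(orig)   # the original entry is the target word
    return labels, targets

labels, targets = alignment_targets(
    ["the", "flower", "is", "really", "beautiful"],
    ["the", "flower", "is", "really", "pretty"])
print(labels)   # -> [0, 0, 0, 0, 1]
print(targets)  # -> [None, None, None, None, 'beautiful']
```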
In the embodiment of the application, the prediction accuracy of the initial text processing model on the category labels of participles and its prediction accuracy on the replacement words of participles reflect the performance of the model from different aspects. The performance of the initial text processing model can therefore be measured from both aspects, label prediction accuracy and replacement-word prediction accuracy, in order to train the model.
As a possible implementation, the label confidence of a replacement sentence may be determined according to the differences between the predicted category labels and the actual category labels of its participles. For example, the ratio of the number of participles whose predicted category label matches the actual category label to the total number of participles in the replacement sentence may be taken as the label confidence of the replacement sentence, which is not limited in the embodiment of the present application. The label confidence of each replacement sentence is then substituted into a preset loss function (such as a cross-entropy loss function) to determine the first loss value of the initial text processing model.
For example, if replacement sentence A contains 10 participles and the predicted category label of every participle matches its actual category label, the label confidence of sentence A is determined to be 1; if the predicted category labels of 8 of the participles match the actual labels, the label confidence of sentence A is determined to be 0.8.
As a possible implementation manner, the difference between a target replacement word and the corresponding predicted replacement word can be measured by their semantic similarity, which in turn can be measured by parameters such as the distance or the cosine similarity between their word vectors. Therefore, in the embodiment of the present application, the replacement-word confidence of a replacement sentence may be determined according to the distance or cosine similarity between the word vector of the target replacement word and the word vector of the predicted replacement word for each participle. For example, the mean cosine similarity between the word vectors of the target replacement words and of the predicted replacement words over the participles of the replacement sentence may be taken as the replacement-word confidence of the sentence, which is not limited in the embodiment of the present application. The replacement-word confidence of each replacement sentence is substituted into a preset loss function (such as a cross-entropy loss function) to determine the second loss value of the initial text processing model.
For example, suppose replacement sentence A contains participles B and C whose prediction category labels are 1, the cosine similarity between the target replacement word and the predicted replacement word of participle B is 0.8, and that of participle C is 0.6; the replacement-word confidence of sentence A is then determined to be 0.7.
As a possible implementation manner, after determining the first loss value and the second loss value of the initial text processing model, the first loss value and the second loss value may be fused to generate the loss value of the initial text processing model. For example, the sum of the first loss value and the second loss value may be determined as the loss value of the initial text processing model; the average value of the first loss value and the second loss value may also be determined as the loss value of the initial text processing model, which is not limited in the embodiment of the present application.
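Putting the pieces together, the following is a hedged sketch of the two loss terms and their fusion by summation (one of the options named above), assuming word vectors for the target and predicted replacement words are available, e.g., from a pretrained embedding; all names are illustrative.

```python
import torch
import torch.nn.functional as F

def model_loss(label_logits, actual_labels, pred_word_vecs, target_word_vecs,
               replaced_mask):
    """replaced_mask: float tensor, 1.0 at replaced positions, else 0.0."""
    # First loss value: cross entropy between predicted and actual category labels.
    first = F.cross_entropy(label_logits.view(-1, 2), actual_labels.view(-1))
    # Replacement-word confidence from cosine similarity, turned into a
    # second loss value over the replaced positions only.
    cos = F.cosine_similarity(pred_word_vecs, target_word_vecs, dim=-1)
    second = ((1.0 - cos) * replaced_mask).sum() / replaced_mask.sum().clamp(min=1)
    # Fusion: the sum of the two loss values.
    return first + second
```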
Step 306, correcting the initial text processing model according to the loss value.
As a possible implementation manner, after the loss value of the initial text processing model is determined, it can be checked whether the loss value is within a preset range. If so, the performance of the initial text processing model meets the requirements, and the training process of the text processing model is complete. If the loss value is not in the preset range, the performance of the initial text processing model does not meet the requirements, and the loss value can be back-propagated to correct the parameters of the initial text processing model, generating a corrected text processing model. The corrected text processing model then repeats the processing of each replacement sentence until the loss value falls within the preset range, completing the training process of the text processing model.
It should be noted that, in actual use, the manner of modifying the text processing model according to the loss value may be determined according to actual needs and a specific application scenario, which is not limited in this embodiment of the present application. For example, the text processing model may be modified using a gradient descent method.
As another possible implementation manner, the initial text processing model may include a label prediction layer and a replacement word prediction layer that are respectively connected to the feature processing layer, so that the label prediction layer and the replacement word prediction layer may be alternately trained according to the first loss value and the second loss value to modify the initial text processing model. That is, in a possible implementation form of the embodiment of the present application, the step 306 may include:
according to the first loss value, correcting a label prediction layer and a feature processing layer of the initial text processing model;
and correcting the replacement word prediction layer and the characteristic processing layer of the initial text processing model according to the second loss value.
The feature processing layer is the layer that produces the feature representation of a replacement sentence input into the initial text processing model. For example, the feature processing layer may map the replacement sentence to vectors to generate the vector representation corresponding to the replacement sentence.
The label prediction layer is a layer for predicting a prediction type label of each participle in a replacement sentence according to the feature representation of the replacement sentence output by the feature processing layer.
The replacement word prediction layer is the layer that predicts, based on the feature representation of the replacement sentence output by the feature processing layer, the predicted replacement word of each participle whose prediction category label is 1.
In the embodiment of the application, the first loss value can reflect the accuracy of the initial text processing model in predicting the prediction category labels of the participles, so that the label prediction layer and the feature processing layer of the initial text processing model can be corrected by using the first loss value; and the second loss value can reflect the accuracy of the initial text processing model in predicting the predicted replacement words of the participles, so that the replacement word prediction layer and the characteristic processing layer of the initial text processing model can be corrected by using the second loss value until the first loss value and the second loss value of the corrected text processing model are both in the preset range, and the training process of the text processing model is completed.
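Finally, a sketch of the alternating correction of the two prediction layers, reusing the hypothetical TextPolishingModel from the sketch after step 304. The tensor arguments are assumptions about how the training corpus is batched, and cross entropy over the preset word list stands in here for the similarity-based second loss described above.

```python
import torch
import torch.nn.functional as F

model = TextPolishingModel()
shared = list(model.embed.parameters()) + list(model.encoder.parameters())
opt_label = torch.optim.Adam(shared + list(model.label_head.parameters()))
opt_word = torch.optim.Adam(shared + list(model.word_head.parameters()))

def train_step(replacement_ids, actual_labels, target_word_ids):
    # First loss value corrects the label prediction layer + feature processing layer.
    label_logits, _ = model(replacement_ids)
    first = F.cross_entropy(label_logits.view(-1, 2), actual_labels.view(-1))
    opt_label.zero_grad(); first.backward(); opt_label.step()

    # Second loss value corrects the replacement word prediction layer +
    # feature processing layer, computed at replaced positions only.
    _, word_logits = model(replacement_ids)
    mask = actual_labels.view(-1).bool()
    vocab = word_logits.size(-1)
    second = F.cross_entropy(word_logits.view(-1, vocab)[mask],
                             target_word_ids.view(-1)[mask])
    opt_word.zero_grad(); second.backward(); opt_word.step()
    return first.item(), second.item()
```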
According to the technical scheme of the embodiment of the application, synonym replacement is performed on some of the entries in each original sentence in the original sentence set to generate a plurality of replacement sentences corresponding to each original sentence. The initial text processing model predicts the prediction category label and the predicted replacement word of each participle in a replacement sentence, and the accuracy of its label prediction and replacement-word prediction is verified against the original sentence, generating the first loss value and the second loss value of the initial text processing model; the label prediction layer and the replacement word prediction layer of the initial text processing model are then corrected according to the first loss value and the second loss value, respectively. Training the text processing model on both label prediction and replacement-word prediction further improves the training effect of the model and, in turn, its text polishing effect.
In order to implement the above embodiments, the present application further provides a training apparatus for a text processing model.
Fig. 4 is a schematic structural diagram of a training apparatus for a text processing model according to an embodiment of the present application.
As shown in fig. 4, the training apparatus 40 for the text processing model includes:
a first obtaining module 41, configured to obtain an original sentence set, where the original sentence set includes a plurality of original sentences;
a determining module 42, configured to perform word segmentation processing on each original sentence to determine each entry included in each original sentence;
a replacing module 43, configured to replace at least one entry in the entries included in each original sentence with a synonym, so as to generate a plurality of replacement sentences corresponding to the plurality of original sentences, respectively; and
and a training module 44, configured to train the initial text processing model by using the multiple original sentences and the corresponding multiple replacement sentences.
In practical use, the training apparatus for text processing models provided in the embodiments of the present application may be configured in any electronic device to execute the aforementioned method for training text processing models.
According to the technical scheme of the embodiment of the application, synonym replacement is performed on some of the entries in each original sentence in the original sentence set to generate a plurality of replacement sentences corresponding to each original sentence; the initial text processing model then generates, from each replacement sentence, the original sentence corresponding to it, and the model is trained on this task. In this way, the initial text processing model learns to generate the high-quality original sentence corresponding to a low-quality replacement sentence, which realizes the training of the initial text processing model, so that the trained text processing model can directly polish input text without depending on a dictionary, the computation cost is low, and the text polishing effect of the text processing model is improved.
In a possible implementation form of the present application, the replacing module 43 includes:
the first acquisition unit is used for acquiring the part of speech of each entry in each original sentence;
the first determining unit is used for determining a plurality of candidate entries contained in each original sentence according to the part of speech of each entry;
a first replacement unit configured to replace at least one entry of the plurality of candidate entries included in each original sentence with a synonym to generate a plurality of replacement sentences corresponding to the plurality of original sentences.
Further, in another possible implementation form of the present application, the replacing module 43 includes:
the second determining unit is used for determining the number N of the terms to be replaced corresponding to each original sentence according to the number of the terms contained in each original sentence, wherein N is a positive integer;
and the second replacement unit is used for replacing the N entries in each original sentence with corresponding synonyms respectively so as to generate a plurality of first replacement sentences corresponding to the original sentences.
Further, in another possible implementation form of the present application, the replacing module 43 includes:
the second obtaining unit is used for obtaining the number M of the entries in each original sentence, wherein M is a positive integer;
and a third replacement unit, configured to replace i entries in any original sentence with synonyms respectively when the number M of entries included in any original sentence in the original sentence set is greater than a threshold value, so as to generate a second replacement sentence corresponding to any original sentence, and replace j entries in any original sentence with synonyms respectively, so as to generate a third replacement sentence corresponding to any original sentence, where i entries are different from j entries.
Further, in another possible implementation form of the present application, if any entry to be replaced of any original sentence in the original sentence set includes Y synonyms, the replacing module 43 includes:
and the fourth replacement unit is used for replacing any entry to be replaced in any original sentence with one of Y synonyms respectively so as to generate Y fourth replacement sentences corresponding to any original sentence.
Further, in another possible implementation form of the present application, the training module 44 includes:
the generating unit is used for processing each alternative sentence by using the initial text processing model so as to generate a prediction category label and a prediction alternative word of each participle in each alternative sentence;
a third determining unit, configured to determine a loss value of the initial text processing model according to a difference between each original sentence and the corresponding replacement sentence, and a prediction category label and a prediction replacement word of each participle in the corresponding replacement sentence; and
and the correcting unit is used for correcting the initial text processing model according to the loss value.
Further, in another possible implementation form of the present application, the third determining unit includes:
the first determining subunit is used for determining the actual category label and the target replacement word of each participle in the corresponding replacement sentence according to the difference between each original sentence and the corresponding replacement sentence;
the second determining subunit is used for determining a first loss value according to the difference between the actual category label and the predicted category label of each participle;
a third determining subunit, configured to determine a second loss value according to a difference between the target replacement word and the predicted replacement word;
and the fourth determining subunit is used for determining the loss value of the initial text processing model according to the first loss value and the second loss value.
Further, in another possible implementation form of the present application, the initial text processing model includes a label prediction layer and a replacement word prediction layer respectively connected to the feature processing layer; accordingly, the correction unit includes:
the first correction subunit is used for correcting the label prediction layer and the feature processing layer of the initial text processing model according to the first loss value;
and the second correction subunit is used for correcting the replacement word prediction layer and the characteristic processing layer of the initial text processing model according to the second loss value.
It should be noted that the foregoing explanation of the embodiment of the training method for the text processing model shown in fig. 1, fig. 2, and fig. 3 is also applicable to the training apparatus 40 for the text processing model of this embodiment, and details are not repeated here.
According to the technical scheme of the embodiment of the application, synonym replacement is performed on some of the entries in each original sentence in the original sentence set to generate a plurality of replacement sentences corresponding to each original sentence. The initial text processing model predicts the prediction category label and the predicted replacement word of each participle in a replacement sentence, and the accuracy of its label prediction and replacement-word prediction is verified against the original sentence, generating the first loss value and the second loss value of the initial text processing model; the label prediction layer and the replacement word prediction layer of the initial text processing model are then corrected according to the first loss value and the second loss value, respectively. Training the text processing model on both label prediction and replacement-word prediction further improves the training effect of the model and, in turn, its text polishing effect.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 5, the electronic apparatus includes: one or more processors 501, a memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each electronic device providing a portion of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 501 is taken as an example.
Memory 502 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the method for training a text processing model provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the training method of the text processing model provided herein.
The memory 502, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the training method of the text processing model in the embodiment of the present application (e.g., the first obtaining module 41, the determining module 42, the replacing module 43, and the training module 44 shown in fig. 4). The processor 501 executes the various functional applications and data processing of the server, i.e., implements the training method of the text processing model in the above method embodiments, by running the non-transitory software programs, instructions, and modules stored in the memory 502.
The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device of the training method of the text processing model, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 optionally includes memory located remotely from processor 501, and these remote memories may be connected over a network to an electronic device of a training method of a text processing model. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the training method of the text processing model may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus of the training method of the text processing model; examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 504 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical scheme of the embodiment of the application, synonym replacement is performed on some of the entries in each original sentence in the original sentence set, so as to generate a plurality of replacement sentences corresponding to each original sentence, and the initial text processing model is trained to generate, from each replacement sentence, the original sentence corresponding to it. In this way, the initial text processing model learns to generate the high-quality original sentence from the corresponding low-quality replacement sentence, so that the trained text processing model can retouch input text directly without depending on a dictionary, requires only a small amount of computation, and achieves an improved text retouching effect.
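As a rough illustration of the data construction described above, and again not as part of the embodiments, the replacement sentences could be generated along the following lines; the tokenized input and the synonym table here are hypothetical stand-ins for the word segmentation and synonym resources the application assumes.

import random

def make_replacement_sentences(entries, synonyms, num_sentences=3, replace_ratio=0.3):
    # entries: the segmented entries of one original sentence.
    # synonyms: a hypothetical table mapping an entry to its synonym list.
    replaceable = [i for i, word in enumerate(entries) if word in synonyms]
    # Number of entries to replace, scaled to sentence length (at least one).
    n = max(1, int(len(entries) * replace_ratio))
    results = []
    for _ in range(num_sentences):
        picked = random.sample(replaceable, min(n, len(replaceable)))
        replaced = list(entries)
        for i in picked:
            replaced[i] = random.choice(synonyms[replaced[i]])
        results.append(replaced)
    return results

# Toy usage with hypothetical entries and synonyms:
synonyms = {"quick": ["fast", "speedy"], "happy": ["glad", "joyful"]}
original = ["the", "quick", "dog", "is", "happy"]
for sentence in make_replacement_sentences(original, synonyms):
    print(" ".join(sentence))

Each generated sentence then serves as a low-quality input paired with its original sentence as the high-quality training target.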
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and this is not limited herein, as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (18)

1. A training method of a text processing model comprises the following steps:
acquiring an original sentence set, wherein the original sentence set comprises a plurality of original sentences;
performing word segmentation processing on each original sentence to determine each entry contained in each original sentence;
replacing at least one entry in the entries contained in each original sentence with a synonym to generate a plurality of replacement sentences corresponding to the original sentences respectively; and
training an initial text processing model by utilizing the plurality of original sentences and the corresponding plurality of replacement sentences.
2. The method of claim 1, wherein the replacing at least one of the entries contained in each of the original sentences with a synonym to generate a plurality of replaced sentences corresponding to the plurality of original sentences comprises:
acquiring the part of speech of each entry in each original sentence;
determining a plurality of candidate entries contained in each original sentence according to the part of speech of each entry; and
replacing at least one entry in the candidate entries contained in each original sentence with a synonym to generate the plurality of replacement sentences corresponding to the plurality of original sentences.
3. The method of claim 1, wherein the replacing at least one of the entries contained in each of the original sentences with a synonym comprises:
determining the number N of entries to be replaced corresponding to each original sentence according to the number of entries contained in each original sentence, wherein N is a positive integer; and
replacing the N entries in each original sentence with corresponding synonyms respectively to generate a plurality of first replacement sentences corresponding to the plurality of original sentences.
4. The method of claim 1, wherein the replacing at least one of the entries contained in each of the original sentences with a synonym comprises:
acquiring the number M of entries in each original sentence, wherein M is a positive integer; and
replacing, if the number M of entries contained in any original sentence in the original sentence set is larger than a threshold value, i entries in the original sentence with synonyms respectively to generate a second replacement sentence corresponding to the original sentence, and replacing j entries in the original sentence with synonyms respectively to generate a third replacement sentence corresponding to the original sentence, wherein the i entries are different from the j entries.
5. The method according to claim 1, wherein if any entry to be replaced of any original sentence in the original sentence set includes Y synonyms, replacing at least one entry in the entries included in each original sentence with a synonym to generate a plurality of replacement sentences corresponding to the original sentences, respectively, includes:
replacing the entry to be replaced in the original sentence with each of the Y synonyms respectively, so as to generate Y fourth replacement sentences corresponding to the original sentence.
6. The method of any of claims 1-5, wherein the training an initial text processing model by utilizing the plurality of original sentences and the corresponding plurality of replacement sentences comprises:
processing each replacement sentence by using the initial text processing model to generate a prediction category label and a prediction replacement word of each participle in each replacement sentence;
determining a loss value of the initial text processing model according to the difference between each original sentence and the corresponding replacement sentence, and the prediction category label and the prediction replacement word of each participle in the corresponding replacement sentence; and
correcting the initial text processing model according to the loss value.
7. The method of claim 6, wherein the determining a loss value of the initial text processing model according to the difference between each original sentence and the corresponding replacement sentence and the prediction category label and the prediction replacement word of each participle in the corresponding replacement sentence comprises:
determining the actual category label and the target replacement word of each participle in the corresponding replacement sentence according to the difference between each original sentence and the corresponding replacement sentence;
determining a first loss value according to the difference between the actual category label and the prediction category label of each participle;
determining a second loss value according to the difference between the target replacement word and the prediction replacement word; and
determining a loss value of the initial text processing model according to the first loss value and the second loss value.
8. The method of claim 7, wherein the initial text processing model includes a label prediction layer and a replacement word prediction layer respectively connected to a feature processing layer, and wherein the correcting the initial text processing model according to the loss value comprises:
correcting the label prediction layer and the feature processing layer of the initial text processing model according to the first loss value; and
correcting the replacement word prediction layer and the feature processing layer of the initial text processing model according to the second loss value.
9. A training apparatus for a text processing model, comprising:
a first acquisition module, configured to acquire an original sentence set, wherein the original sentence set comprises a plurality of original sentences;
a determining module, configured to perform word segmentation processing on each original sentence to determine each entry contained in each original sentence;
a replacing module, configured to replace at least one entry in the entries included in each original sentence with a synonym, so as to generate a plurality of replacement sentences corresponding to the plurality of original sentences, respectively; and
a training module, configured to train an initial text processing model by utilizing the plurality of original sentences and the corresponding plurality of replacement sentences.
10. The apparatus of claim 9, wherein the replacement module comprises:
a first obtaining unit, configured to obtain a part of speech of each entry in each original sentence;
a first determining unit, configured to determine, according to the part of speech of each entry, a plurality of candidate entries included in each original sentence; and
a first replacing unit, configured to replace at least one entry in the candidate entries included in each original sentence with a synonym, so as to generate the plurality of replacement sentences corresponding to the plurality of original sentences.
11. The apparatus of claim 9, wherein the replacement module comprises:
a second determining unit, configured to determine, according to the number of entries contained in each original sentence, the number N of entries to be replaced corresponding to each original sentence, wherein N is a positive integer; and
a second replacement unit, configured to replace the N entries in each original sentence with corresponding synonyms respectively, so as to generate a plurality of first replacement sentences corresponding to the plurality of original sentences.
12. The apparatus of claim 9, wherein the replacement module comprises:
a second obtaining unit, configured to obtain the number M of entries in each original sentence, wherein M is a positive integer; and
a third replacement unit, configured to replace, when the number M of entries contained in any original sentence in the original sentence set is greater than a threshold value, i entries in the original sentence with synonyms respectively so as to generate a second replacement sentence corresponding to the original sentence, and to replace j entries in the original sentence with synonyms respectively so as to generate a third replacement sentence corresponding to the original sentence, wherein the i entries are different from the j entries.
13. The apparatus of claim 9, wherein if any entry to be replaced of any original sentence in the original sentence set comprises Y synonyms, the replacing module comprises:
a fourth replacing unit, configured to replace the entry to be replaced in the original sentence with each of the Y synonyms respectively, so as to generate Y fourth replacement sentences corresponding to the original sentence.
14. The apparatus of any of claims 9-13, wherein the training module comprises:
a generating unit, configured to process each replacement sentence by using the initial text processing model, so as to generate a prediction category label and a prediction replacement word of each participle in each replacement sentence;
a third determining unit, configured to determine a loss value of the initial text processing model according to a difference between each original sentence and a corresponding replacement sentence, and a prediction category label and a prediction replacement word of each participle in the corresponding replacement sentence; and
a correcting unit, configured to correct the initial text processing model according to the loss value.
15. The apparatus of claim 14, wherein the third determining unit comprises:
a first determining subunit, configured to determine the actual category label and the target replacement word of each participle in the corresponding replacement sentence according to the difference between each original sentence and the corresponding replacement sentence;
a second determining subunit, configured to determine a first loss value according to the difference between the actual category label and the prediction category label of each participle;
a third determining subunit, configured to determine a second loss value according to the difference between the target replacement word and the prediction replacement word; and
a fourth determining subunit, configured to determine a loss value of the initial text processing model according to the first loss value and the second loss value.
16. The apparatus of claim 15, wherein the initial text processing model comprises a label prediction layer and a replacement word prediction layer respectively connected to a feature processing layer, and the correcting unit comprises:
a first correction subunit, configured to correct the label prediction layer and the feature processing layer of the initial text processing model according to the first loss value; and
a second correction subunit, configured to correct the replacement word prediction layer and the feature processing layer of the initial text processing model according to the second loss value.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-8.
CN202010465386.9A 2020-05-28 2020-05-28 Training method and device for text processing model and electronic equipment Active CN111709234B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010465386.9A CN111709234B (en) 2020-05-28 2020-05-28 Training method and device for text processing model and electronic equipment

Publications (2)

Publication Number Publication Date
CN111709234A (en) 2020-09-25
CN111709234B CN111709234B (en) 2023-07-25

Family

ID=72538159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010465386.9A Active CN111709234B (en) 2020-05-28 2020-05-28 Training method and device for text processing model and electronic equipment

Country Status (1)

Country Link
CN (1) CN111709234B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007068123A1 (en) * 2005-12-16 2007-06-21 National Research Council Of Canada Method and system for training and applying a distortion component to machine translation
US20180018322A1 (en) * 2016-07-15 2018-01-18 Intuit Inc. System and method for automatically understanding lines of compliance forms through natural language patterns
CN106372063A (en) * 2016-11-01 2017-02-01 上海智臻智能网络科技股份有限公司 Information processing method and device and terminal
CN108509409A (en) * 2017-02-27 2018-09-07 芋头科技(杭州)有限公司 A method of automatically generating semantic similarity sentence sample
CN108460015A (en) * 2018-02-08 2018-08-28 合肥工业大学 Text emotion grouped data enhances analysis method
CN110162767A (en) * 2018-02-12 2019-08-23 北京京东尚科信息技术有限公司 The method and apparatus of text error correction
CN109522547A (en) * 2018-10-23 2019-03-26 浙江大学 Chinese synonym iteration abstracting method based on pattern learning
CN110188351A (en) * 2019-05-23 2019-08-30 北京神州泰岳软件股份有限公司 The training method and device of sentence smoothness degree and syntactic score model
CN110188360A (en) * 2019-06-06 2019-08-30 北京百度网讯科技有限公司 Model training method and device
CN110532547A (en) * 2019-07-31 2019-12-03 厦门快商通科技股份有限公司 Building of corpus method, apparatus, electronic equipment and medium
CN110795934A (en) * 2019-10-31 2020-02-14 北京金山数字娱乐科技有限公司 Sentence analysis model training method and device and sentence analysis method and device
CN111128122A (en) * 2019-12-31 2020-05-08 苏州思必驰信息科技有限公司 Method and system for optimizing rhythm prediction model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
F. TOGAWA et al.: "Voice-activated word processor with automatic learning for dynamic optimization of syllable-templates" *
付常雷 et al.: "Research on an intelligent mining algorithm for innovation topics based on deep learning" *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364641A (en) * 2020-11-12 2021-02-12 北京中科闻歌科技股份有限公司 Chinese countermeasure sample generation method and device for text audit
CN112380860A (en) * 2020-11-13 2021-02-19 平安科技(深圳)有限公司 Sentence vector processing method, sentence matching method, device, equipment and medium
CN112380860B (en) * 2020-11-13 2023-12-29 平安科技(深圳)有限公司 Sentence vector processing method, sentence matching device, sentence vector processing equipment and sentence matching medium
CN112489740A (en) * 2020-12-17 2021-03-12 北京惠及智医科技有限公司 Medical record detection method, training method of related model, related equipment and device
CN113408280A (en) * 2021-06-30 2021-09-17 北京百度网讯科技有限公司 Negative example construction method, device, equipment and storage medium
CN113408280B (en) * 2021-06-30 2024-03-22 北京百度网讯科技有限公司 Negative example construction method, device, equipment and storage medium
CN113836297A (en) * 2021-07-23 2021-12-24 北京三快在线科技有限公司 Training method and device for text emotion analysis model
CN113554107A (en) * 2021-07-28 2021-10-26 工银科技有限公司 Corpus generating method, apparatus, device, storage medium and program product
CN113590761A (en) * 2021-08-13 2021-11-02 网易有道信息技术(北京)有限公司 Training method of text processing model, text processing method and related equipment
CN115713071A (en) * 2022-11-11 2023-02-24 北京百度网讯科技有限公司 Training method of neural network for processing text and method for processing text

Also Published As

Publication number Publication date
CN111709234B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
CN111709234B (en) Training method and device for text processing model and electronic equipment
KR102497945B1 (en) Text recognition method, electronic device, and storage medium
CN111709248B (en) Training method and device for text generation model and electronic equipment
US11403468B2 (en) Method and apparatus for generating vector representation of text, and related computer device
CN111859994B (en) Machine translation model acquisition and text translation method, device and storage medium
CN111078865B (en) Text title generation method and device
KR102541053B1 (en) Method, device, equipment and storage medium for acquiring word vector based on language model
CN110619053A (en) Training method of entity relation extraction model and method for extracting entity relation
JP7264866B2 (en) EVENT RELATION GENERATION METHOD, APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
KR102565673B1 (en) Method and apparatus for generating semantic representation model,and storage medium
US11537792B2 (en) Pre-training method for sentiment analysis model, and electronic device
CN111079945B (en) End-to-end model training method and device
KR102630243B1 (en) method and device for predicting punctuation
US11704326B2 (en) Generalization processing method, apparatus, device and computer storage medium
CN112507101A (en) Method and device for establishing pre-training language model
CN111859997A (en) Model training method and device in machine translation, electronic equipment and storage medium
CN111090991A (en) Scene error correction method and device, electronic equipment and storage medium
CN111950293A (en) Semantic representation model generation method and device, electronic equipment and storage medium
KR20210139152A (en) Training method, device, electronic equipment and storage medium of semantic similarity model
CN112528669A (en) Multi-language model training method and device, electronic equipment and readable storage medium
CN113850080A (en) Rhyme word recommendation method, device, equipment and storage medium
CN110990569B (en) Text clustering method and device and related equipment
CN111666417A (en) Method and device for generating synonyms, electronic equipment and readable storage medium
CN111859981B (en) Language model acquisition and Chinese semantic understanding method, device and storage medium
CN115952790A (en) Information extraction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant