CN116894092B - Text processing method, text processing device, electronic equipment and readable storage medium

Text processing method, text processing device, electronic equipment and readable storage medium

Info

Publication number
CN116894092B
CN116894092B (application number CN202311163073.8A)
Authority
CN
China
Prior art keywords
text
target
keyword
feature vector
word
Prior art date
Legal status: Active (the status listed is an assumption and is not a legal conclusion)
Application number
CN202311163073.8A
Other languages
Chinese (zh)
Other versions
CN116894092A (en)
Inventor
王奥迪 (Wang Aodi)
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Suzhou Software Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Suzhou Software Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202311163073.8A
Publication of CN116894092A
Application granted
Publication of CN116894092B

Classifications

    • G06F16/355: Information retrieval of unstructured textual data; clustering/classification; class or cluster creation or modification
    • G06F16/335: Information retrieval of unstructured textual data; querying; filtering based on additional data, e.g. user or group profiles
    • G06F40/205: Handling natural language data; natural language analysis; parsing
    • G06F40/279: Handling natural language data; natural language analysis; recognition of textual entities
    • G06F40/30: Handling natural language data; semantic analysis
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a text processing method, a text processing device, electronic equipment and a readable storage medium, which relate to the technical field of natural language processing and are used for solving the problem of poor accuracy of text classification. The method comprises the following steps: matching the text to be processed with a preset keyword library; inserting a first tag word and a second tag word before and after the target keyword when the target keyword is included in the text to be processed, so as to obtain a target text; if the text to be processed does not contain the target keyword, determining the text to be processed as a target text; inputting the target text into a feature extraction module of a pre-trained processing model to perform feature extraction to obtain a target feature vector, wherein the processing model further comprises a text classification module and a keyword correction module; and inputting the target feature vector into a text classification module for classification processing to obtain a text classification result, and inputting the target feature vector into a keyword correction module for text prediction to obtain a predicted text. The embodiment of the invention can improve the accuracy of text classification.

Description

Text processing method, text processing device, electronic equipment and readable storage medium
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a text processing method, a text processing device, an electronic device, and a readable storage medium.
Background
With the development of big data and artificial intelligence technology, the internet has become closely tied to people's life and work. In order to guide internet content platforms to strengthen and perfect internet content governance, a text needs to be identified; when a preset keyword is included in the text, the text is determined to be a text that actually requires recognition processing, and the text is deleted, recalled or shielded in time.
In the prior art, a keyword matching mode is generally adopted to process a text: when the text contains a preset keyword, the text is determined to be a text that actually requires recognition processing. Processing text in this way requires a sufficiently large preset keyword library to be built in advance, and many misjudgments occur, so the accuracy of text classification is poor.
Disclosure of Invention
The embodiment of the invention provides a text processing method, a text processing device, electronic equipment and a readable storage medium, which are used for solving the problem of poor accuracy of text classification.
In a first aspect, an embodiment of the present invention provides a text processing method, including:
Matching a text to be processed with a preset keyword library, wherein the preset keyword library comprises a plurality of keywords;
under the condition that the text to be processed comprises a target keyword, inserting a first tag word in front of the target keyword, and inserting a second tag word behind the target keyword to obtain a target text; under the condition that the text to be processed does not comprise a target keyword, determining the text to be processed as the target text, wherein the target keyword is one of the keywords;
inputting the target text into a feature extraction module of a pre-trained processing model to perform feature extraction to obtain a target feature vector, wherein the processing model further comprises a text classification module and a keyword correction module;
and inputting the target feature vector into the text classification module for classification processing to obtain a text classification result, and inputting the target feature vector into the keyword correction module for text prediction to obtain a predicted text.
Optionally, before the feature extraction module that inputs the target text into the pre-trained processing model performs feature extraction, the method further includes:
Performing iterative training on a processing model based on a training data set, and determining a loss value of the Nth iterative training, wherein the loss value comprises a first loss value and a second loss value, the first loss value is a loss value of the text classification module determined based on a first loss function, the second loss value is a loss value of the keyword correction module determined based on a second loss function, and N is a positive integer;
performing parameter adjustment on the processing model based on the loss value;
and determining the processing model trained in the nth iteration as the pre-trained processing model under the condition that the loss value meets a loss convergence condition.
Optionally, the feature extraction module for inputting the target text into the processing model performs feature extraction to obtain a target feature vector, including:
obtaining an intermediate feature vector of the target text, wherein the intermediate feature vector comprises at least one of a semantic feature vector, an auditory feature vector and a visual feature vector, the semantic feature vector is used for representing semantic features of the target text, the auditory feature vector is used for representing pronunciation features of the target text, and the visual feature vector is used for representing character features of the target text;
And inputting the intermediate feature vector into a bi-directional coding representation BERT model based on a transformer to perform feature extraction, so as to obtain a target feature vector.
Optionally, the obtaining the intermediate feature vector of the target text includes:
mapping the target text based on the vocabulary of the BERT model to obtain semantic feature vectors of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the semantic feature vector of each word in the target text.
Optionally, the obtaining the intermediate feature vector of the target text includes:
mapping the target text based on a pinyin word list to obtain an auditory feature vector of each word in the target text, wherein the pinyin word list is constructed based on pinyin letters and tones;
and acquiring an intermediate feature vector of the target text based on the hearing feature vector of each word in the target text.
Optionally, the obtaining the intermediate feature vector of the target text includes:
acquiring a character picture corresponding to each word in the target text;
inputting the character picture into a pre-trained character recognition model for character recognition to obtain a visual feature vector of each word in the target text;
And acquiring an intermediate feature vector of the target text based on the visual feature vector of each word in the target text.
Optionally, before the iterative training of the processing model based on the training data set, the method further comprises:
acquiring a plurality of sample texts;
matching the plurality of sample texts with a preset keyword library;
under the condition that the sample text comprises a first keyword, inserting the first tag word in front of the first keyword, and inserting the second tag word behind the first keyword to obtain a target sample text;
and constructing a training data set based on the sample text and the target sample text, wherein the first keyword is a keyword in the plurality of keywords.
Optionally, after the target feature vector is input into the keyword correction module to perform text prediction to obtain a predicted text, the method further includes:
determining a second keyword based on the first tag word and the second tag word in the case that the predicted text includes the first tag word and the second tag word;
and under the condition that the second keyword is not included in the preset keyword library, updating the preset keyword library based on the second keyword.
Optionally, in the case that the second keyword is not included in the preset keyword library, updating the preset keyword library based on the second keyword includes:
adding the second keyword to a candidate word stock under the condition that the confidence degrees of the first tag word and the second tag word are both larger than a first threshold value;
and adding third keywords in the candidate word stock to the preset keyword stock, wherein a third keyword is a second keyword whose occurrence count in the candidate word stock exceeds a second threshold.
In a second aspect, an embodiment of the present invention provides a text processing apparatus, including:
the first matching module is used for matching the text to be processed with a preset keyword library, wherein the preset keyword library comprises a plurality of keywords;
the first processing module is used for inserting a first tag word in front of the target keyword and inserting a second tag word behind the target keyword under the condition that the target keyword is included in the text to be processed, so as to obtain a target text; under the condition that the text to be processed does not comprise a target keyword, determining the text to be processed as the target text, wherein the target keyword is one of the keywords;
The feature extraction module is used for inputting the target text into a feature extraction module of a pre-trained processing model to perform feature extraction to obtain a target feature vector, and the processing model further comprises a text classification module and a keyword correction module;
the second processing module is used for inputting the target feature vector into the text classification module for classification processing to obtain a text classification result, and inputting the target feature vector into the keyword correction module for text prediction to obtain a predicted text.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a program stored on the memory and executable on the processor;
the processor is configured to read a program in a memory to implement the steps in the method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a readable storage medium storing a program, which when executed by a processor implements the steps of the method according to the first aspect.
In the embodiment of the application, the processing model comprises a feature extraction module, a text classification module and a keyword correction module, wherein the inputs of the text classification module and the keyword correction module are the target feature vectors output by the feature extraction module. When the processing model is trained, the output results of the text classification module and the keyword correction module jointly influence the parameters of the processing model. For the text classification task, keywords serve as explicit prompt information in the input of the model, and through the keyword correction task the text classification module can learn the interaction information between the keywords and their context, thereby improving the accuracy of the text classification result. For the keyword correction task, the text classification task causes the target feature vector to carry certain text classification result information, which helps improve the accuracy of the predicted text.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
FIG. 1 is one of the flowcharts of a text processing method provided by an embodiment of the present invention;
FIG. 2 is a block diagram of a process model provided by an embodiment of the present invention;
FIG. 3 is a block diagram of a visual model provided by an embodiment of the present invention;
FIG. 4 is a second flowchart of a text processing method according to an embodiment of the present invention;
fig. 5 is a block diagram of a text processing apparatus according to an embodiment of the present invention;
fig. 6 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic diagram of a text processing method according to an embodiment of the present invention, which can be applied to scenes such as classifying a text and predicting a text. The text processing method provided by the embodiment of the invention can be applied to a text recognition scene: by judging whether the text contains keywords, the method recognizes whether the text is actually a text that requires recognition processing.
As shown in fig. 1, the text processing method provided by the embodiment of the invention specifically includes the following steps:
step 101, matching a text to be processed with a preset keyword library, wherein the preset keyword library comprises a plurality of keywords.
The preset keyword library is pre-constructed, and can be managed and updated according to actual conditions in the application process. Under different application scenarios, the keywords included in the preset keyword library may be different. In the text recognition scene, the keywords in the preset keyword library are preset keywords in various different scenes.
The specific manner of matching the text to be processed with the preset keyword library is not limited herein. By way of example, by comparing the text to be processed with keywords in a preset keyword library one by one, it is determined whether the text to be processed contains at least one keyword in the preset keyword library.
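Purely as an illustration, the one-by-one comparison described above might be sketched as follows. The function name and the plain substring search are assumptions rather than part of the patent; a trie or an Aho-Corasick automaton over the keyword library would serve the same purpose at scale.

```python
def match_keywords(text: str, keyword_library: set) -> list:
    """Compare the text to be processed against every keyword in the preset library."""
    matches = []
    for keyword in keyword_library:
        start = text.find(keyword)
        while start != -1:                        # record every occurrence, not just the first
            matches.append((start, keyword))
            start = text.find(keyword, start + 1)
    return sorted(matches)                        # sorted by position for later tag insertion
```

For example, `match_keywords("hello, word A", {"word A"})` returns `[(7, "word A")]`.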
Step 102, under the condition that the text to be processed comprises a target keyword, inserting a first tag word in front of the target keyword, and inserting a second tag word behind the target keyword to obtain a target text; and under the condition that the text to be processed does not comprise the target keywords, determining the text to be processed as the target text, wherein the target keywords are keywords in the plurality of keywords.
The first tag word and the second tag word are inserted before and after the target keyword, so that the position of the target keyword in the text can be marked on one hand, and the target keyword can be processed as explicit prompt information on the other hand.
It should be noted that, the number of target keywords included in the text to be processed is not limited herein, and the text to be processed may include at least one target keyword. In the case where the number of target keywords is plural, it is necessary to insert the first tag word and the second tag word before and after each target keyword.
The specific content of the first tag word and the second tag word is not limited herein. As an alternative embodiment, START and END are used as tag words: for example, the first tag word is "[START]" and the second tag word is "[/END]", and the start and end positions of the target keyword are labeled by the first tag word and the second tag word. As another alternative embodiment, the keywords in the preset keyword library may be classified in advance, with a category (TYPE) determined for each keyword; the first tag word and the second tag word may then also be used to characterize the category of the target keyword. For example, the first tag word is "[TYPEA]" and the second tag word is "[/TYPEA]". It should be noted that the "/" in the tag words of the examples in this application is only an identifier that plays a segmentation and/or identification role in practical applications.
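Continuing the sketch from step 101, the tag insertion might look as follows; non-overlapping, position-sorted matches are assumed, and the [TYPEA]/[/TYPEA] defaults mirror the example above.

```python
def insert_tag_words(text: str, matches: list, first_tag: str = "[TYPEA]",
                     second_tag: str = "[/TYPEA]") -> str:
    """Insert the first tag word before and the second tag word after each target keyword."""
    pieces, cursor = [], 0
    for start, keyword in matches:                # one tag pair per matched keyword
        pieces.append(text[cursor:start])
        pieces.append(first_tag + keyword + second_tag)
        cursor = start + len(keyword)
    pieces.append(text[cursor:])                  # trailing text after the last keyword
    return "".join(pieces)
```

Here `insert_tag_words("hello, word A", [(7, "word A")])` yields the target text `"hello, [TYPEA]word A[/TYPEA]"`.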
And step 103, inputting the target text into a feature extraction module of a pre-trained processing model to perform feature extraction to obtain a target feature vector, wherein the processing model further comprises a text classification module and a keyword correction module.
The processing model comprises a feature extraction module, a text classification module and a keyword correction module. The processing model is a pre-trained model, and a specific training mode is not limited herein. Because the processing model comprises a feature extraction module, a text classification module and a keyword correction module, the modules are jointly trained in the training process and mutually influence.
In this embodiment, the feature extraction module is configured to perform feature extraction on words in the target text to obtain a word feature vector (embedding) corresponding to each word (token) in the target text; the specific manner is not limited herein.
Illustratively, as an alternative embodiment, the target text is input into a pre-trained transducer-based bi-directional coded representation (Bidirectional Encoder Representations from Transformers, BERT) model for feature extraction to obtain the target feature vector. The BERT model is a pre-trained language representation model (Language Representation Model), and the specific structure thereof can be seen in the description of the related art, which is not described herein.
Alternatively, as another alternative embodiment, the step 103 includes:
obtaining an intermediate feature vector of the target text, wherein the intermediate feature vector comprises at least one of a semantic feature vector, an auditory feature vector and a visual feature vector, the semantic feature vector is used for representing semantic features of the target text, the auditory feature vector is used for representing pronunciation features of the target text, and the visual feature vector is used for representing character features of the target text;
and inputting the intermediate feature vector into a BERT model for feature extraction to obtain a target feature vector.
In the actual application process, text transformation is often performed on keywords by means of near-form characters, near-sound characters, pinyin replacement, disassembly of components and radicals, and the like, which reduces the probability that matching the text to be processed with the preset keyword library in step 101 yields the target keyword.
In this embodiment, the intermediate feature vector of the target text includes at least one of a semantic feature vector, an auditory feature vector, and a visual feature vector, and the intermediate feature vector is input into the BERT model to perform feature extraction, so as to obtain the target feature vector. By the method, the target feature vector can represent at least one multi-modal feature such as semantics, hearing and vision of the text, and the accuracy of the target feature vector is improved.
Optionally, in some embodiments, the acquiring the intermediate feature vector of the target text includes:
mapping the target text based on the vocabulary of the BERT model to obtain semantic feature vectors of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the semantic feature vector of each word in the target text.
The BERT model is trained in advance; in this embodiment, only the vocabulary of the BERT model is extracted, and the target text is mapped through this vocabulary. Specifically, the vocabulary of the BERT model is queried according to the target text to obtain the semantic feature vector (also referred to as token embedding) of each word in the target text, where each word in the target text corresponds to one entry in the vocabulary.
In this embodiment, the intermediate feature vector includes a semantic feature vector, and the intermediate feature vector of the target text may be obtained based on the semantic feature vector of each word in the target text, and the specific manner is not limited herein. By the method provided by the embodiment, the intermediate feature vector of the target text can represent the semantic features of the target text. Meanwhile, as the BERT model has better performance, the target text is mapped through the vocabulary of the BERT model, so that the accuracy of the semantic feature vector can be improved.
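For concreteness, a minimal sketch of this vocabulary lookup, using the Hugging Face transformers API as an assumed stand-in (the patent names no library, and the `bert-base-chinese` checkpoint is a hypothetical choice):

```python
import torch
from transformers import BertModel, BertTokenizer  # assumed tooling, not named in the patent

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # hypothetical checkpoint
bert = BertModel.from_pretrained("bert-base-chinese")

def semantic_embeddings(target_text: str) -> torch.Tensor:
    """Map each token of the target text to its row in the BERT word-embedding table."""
    token_ids = tokenizer(target_text, return_tensors="pt")["input_ids"]
    # Only the (trainable) embedding table is queried here; no transformer layers run.
    return bert.get_input_embeddings()(token_ids)   # shape: [1, seq_len, hidden_dim]
```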
It should be appreciated that the feature extraction model includes a BERT model, and that during the training of the processing model, the BERT model is trained and its performance is improved, so that the vocabulary performance of the BERT model is correspondingly improved.
Optionally, in other embodiments, the acquiring the intermediate feature vector of the target text includes:
mapping the target text based on a pinyin word list to obtain an auditory feature vector of each word in the target text, wherein the pinyin word list is constructed based on pinyin letters and tones;
and acquiring an intermediate feature vector of the target text based on the hearing feature vector of each word in the target text.
The pinyin word list is constructed based on pinyin letters and tones, and its size is determined by the number of pinyin letters and tones. Taking the pinyin letter "a" as an example, its forms under the different tones (ā, á, ǎ, à, a) correspond to the tone marks "1, 2, 3, 4, 0" respectively. According to the target text, the pinyin word list is queried; each word in the target text is mapped to a tuple of word vectors, and the word vectors of all letters are added to obtain the auditory feature vector (also called hearing embedding) of the character.
For example, the character for "mom" (妈, pronounced mā) corresponds to "m, a, 1"; the pinyin word list (consisting of the pinyin letters and the digits 0-4) maps it to a tuple of word vectors, and the word vectors of the individual letters are added to obtain the hearing embedding of that character.
In this embodiment, the intermediate feature vector includes an auditory feature vector, and the intermediate feature vector of the target text is obtained based on the auditory feature vector of each word in the target text, and the specific manner is not limited herein. By the method provided by the embodiment, the intermediate feature vector of the target text can represent the pronunciation feature of the target text.
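A minimal sketch of this auditory mapping, assuming the pypinyin library for grapheme-to-pinyin conversion (the patent names no tool) and an assumed embedding dimension of 128:

```python
import torch
import torch.nn as nn
from pypinyin import Style, pinyin  # assumed grapheme-to-pinyin tool

PINYIN_SYMBOLS = list("abcdefghijklmnopqrstuvwxyz") + list("01234")  # letters plus tones 0-4
SYMBOL_TO_ID = {s: i for i, s in enumerate(PINYIN_SYMBOLS)}
pinyin_table = nn.Embedding(len(PINYIN_SYMBOLS), 128)  # trainable pinyin word list

def hearing_embedding(char: str) -> torch.Tensor:
    """Map one character to pinyin letters plus tone and sum their vectors, e.g. 妈 -> m, a, 1."""
    syllable = pinyin(char, style=Style.TONE3)[0][0]   # e.g. "ma1"
    if not syllable[-1].isdigit():
        syllable += "0"                                # neutral tone marked as 0, as in the text
    ids = torch.tensor([SYMBOL_TO_ID[s] for s in syllable if s in SYMBOL_TO_ID])
    return pinyin_table(ids).sum(dim=0)                # add the word vectors of all letters
```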
It should be understood that, in this embodiment, the feature extraction model includes a pinyin word list, and in the process of training the processing model, the pinyin word list is trained and the performance of the pinyin word list is improved, so that the obtained auditory feature vector can better represent the pronunciation feature of the target text.
Optionally, in other embodiments, the acquiring the intermediate feature vector of the target text includes:
acquiring a character picture corresponding to each word in the target text;
inputting the character picture into a pre-trained character recognition model for character recognition to obtain a visual feature vector of each word in the target text;
And acquiring an intermediate feature vector of the target text based on the visual feature vector of each word in the target text.
The specific manner of obtaining the character picture corresponding to each word in the target text is not limited herein. In the practical application process, the corresponding character pictures differ with parameters such as typeface and font size. Illustratively, each word in the target text is rendered into a character picture in a preset font style (such as Song Ti).
The specific structure of the pre-trained character recognition model is not limited herein; a character recognition model from the related art may be employed, that is, the visual information of a character is captured through an optical character recognition (Optical Character Recognition, OCR) task. Illustratively, a shallow Residual Network (ResNet) model is used to model the visual information of characters; the learning target of the ResNet model is to predict the character on the character picture, and the pre-trained ResNet model is used to obtain the visual feature vector (also referred to as visual embedding) of each word in the target text.
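As a sketch of this step, a torchvision ResNet-18 stands in for the patent's shallow ResNet, and PIL renders each character picture; the font path, canvas size and feature dimension are all hypothetical.

```python
import torch
from PIL import Image, ImageDraw, ImageFont
from torchvision import transforms
from torchvision.models import resnet18

visual_model = resnet18(num_classes=512)        # output reused as a 512-dim visual embedding

def render_char(char: str, font_path: str = "SimSun.ttf", size: int = 32) -> Image.Image:
    """Generate the character picture for one word in a preset font style (e.g. Song Ti)."""
    img = Image.new("L", (size, size), color=255)
    font = ImageFont.truetype(font_path, size - 4)            # hypothetical font file
    ImageDraw.Draw(img).text((2, 2), char, fill=0, font=font)
    return img

to_tensor = transforms.Compose([transforms.Grayscale(num_output_channels=3),
                                transforms.ToTensor()])

def visual_embedding(char: str) -> torch.Tensor:
    """Run the character picture through the recognition backbone to obtain visual features."""
    return visual_model(to_tensor(render_char(char)).unsqueeze(0)).squeeze(0)
```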
In this embodiment, the intermediate feature vector includes a visual feature vector, and the intermediate feature vector of the target text is obtained based on the visual feature vector of each word in the target text, and the specific manner is not limited herein. By the method provided by the embodiment, the intermediate feature vector of the target text can be used for representing the character features of the target text.
It should be understood that, in this embodiment, the feature extraction model includes a pre-trained character recognition model, and the pre-trained character recognition model is trained again in the process of training the processing model, so that the obtained visual feature vector can better represent the character features of the target text.
And 104, inputting the target feature vector into the text classification module for classification processing to obtain a text classification result, and inputting the target feature vector into the keyword correction module for text prediction to obtain a predicted text.
The text classification module is used for classifying the target text based on the target feature vector to obtain a text classification result. In the application scenario of text recognition, the text classification module may be considered as a classification model, where the classification result is used to represent whether the text to be processed passes recognition or not, or is used to represent whether the text to be processed is actually the text to be recognized.
The specific structure of the text classification module is not limited herein. Illustratively, in some embodiments, the text classification module includes a linear (linear) layer and a normalized (softmax) layer connected in sequence, and details are not described herein.
The keyword correction module is used for carrying out text prediction based on the target feature vector to obtain a predicted text, and the keyword correction and keyword discovery are realized by comparing the predicted text with a pre-labeled text. The specific structure of the keyword correction module is not limited herein. If the first tag word and the second tag word are included in the predicted text, the predicted text is indicated as the text containing the target keyword.
Illustratively, when the keyword "word a" is included in both text a and text B, it is assumed that, from the semantic level of the text, text a is actually text that does not need to be processed, and text B is actually text that needs to be processed. The keyword correction module can eliminate the matched keywords of the text A, and keep the matched keywords in the text B. Meanwhile, if the keyword "word a" is not matched in step 101, the "word a" can be predicted by the keyword correction module.
Illustratively, in some embodiments, the keyword correction module includes a Long Short-Term Memory (LSTM) model, a linear layer, and a softmax layer, which are sequentially connected, and detailed descriptions thereof are omitted herein.
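Given the structures named above (Linear plus softmax for classification; LSTM, Linear and softmax for correction), the two heads might be sketched in PyTorch as follows. The two-class setup and the first-token pooling are assumptions; the softmax outputs become predictions at inference, while training folds the softmax into the loss (see the training sketch below).

```python
import torch.nn as nn

class TextClassificationHead(nn.Module):
    """Linear layer followed by softmax, as described for the text classification module."""
    def __init__(self, hidden_dim: int, num_classes: int = 2):
        super().__init__()
        self.linear = nn.Linear(hidden_dim, num_classes)

    def forward(self, target_features):                  # [batch, seq_len, hidden_dim]
        # Assumption: classify from the first token's vector; the patent fixes no pooling.
        return self.linear(target_features[:, 0]).softmax(dim=-1)

class KeywordCorrectionHead(nn.Module):
    """LSTM, then Linear and softmax over the vocabulary, predicting the text token by token."""
    def __init__(self, hidden_dim: int, lstm_dim: int, vocab_size: int):
        super().__init__()
        self.lstm = nn.LSTM(hidden_dim, lstm_dim, batch_first=True)
        self.linear = nn.Linear(lstm_dim, vocab_size)

    def forward(self, target_features):
        hidden, _ = self.lstm(target_features)           # [batch, seq_len, lstm_dim]
        return self.linear(hidden).softmax(dim=-1)       # distribution over the vocabulary
```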
In the embodiment of the application, the processing model comprises a feature extraction module, a text classification module and a keyword correction module, wherein the inputs of the text classification module and the keyword correction module are the target feature vectors output by the feature extraction module. When the processing model is trained, the output results of the text classification module and the keyword correction module jointly influence the parameters of the processing model. For the text classification task, keywords serve as explicit prompt information in the input of the model, and through the keyword correction task the text classification module can learn the interaction information between the keywords and their context, thereby improving the accuracy of the text classification result. For the keyword correction task, the text classification task causes the target feature vector to carry certain text classification result information, which helps improve the accuracy of the predicted text.
Optionally, in some embodiments, before the step 103, the method further includes:
performing iterative training on a processing model based on a training data set, and determining a loss value of the Nth iterative training, wherein the loss value comprises a first loss value and a second loss value, the first loss value is a loss value of the text classification module determined based on a first loss function, the second loss value is a loss value of the keyword correction module determined based on a second loss function, and N is a positive integer;
Performing parameter adjustment on the processing model based on the loss value;
and determining the processing model trained in the nth iteration as the pre-trained processing model under the condition that the loss value meets a loss convergence condition.
The loss value of the processing model (denoted as L) includes a first loss value (denoted as L1) and a second loss value (denoted as L2). Illustratively, as an alternative embodiment, L satisfies:

L = L1 + L2

As another alternative embodiment, L satisfies:

L = λ1·L1 + λ2·L2

wherein λ1 and λ2 are predetermined weight values, and λ1 + λ2 = 1.
the first loss value is a loss value of the text classification module, the corresponding first loss function can be determined based on the structure of the text classification module, and the first loss value is determined based on a difference value between a text classification result output by the text classification module and a real label.
The second loss value is a loss value of the keyword correction module, a corresponding second loss function of the second loss value can be determined based on the structure of the keyword correction module, and the second loss value is determined based on a difference value between the predicted text and the real text output by the keyword correction module.
Parameters of various parts of the process model are adjusted based on the loss values. Specifically, after parameters of the feature extraction module, the text classification module and the keyword correction module are adjusted based on the loss value, the next iteration training is performed until the loss value meets the loss convergence condition.
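A minimal sketch of one joint training iteration under these equations; the attribute names (`feature_extractor`, `classification_head`, `correction_head`) and the batch keys are hypothetical, and the heads are taken to output softmax probabilities as sketched earlier, so cross-entropy is computed as a negative log-likelihood.

```python
import torch.nn.functional as F

LAMBDA1, LAMBDA2 = 0.5, 0.5   # assumed predetermined weight values with λ1 + λ2 = 1

def training_iteration(model, batch, optimizer):
    """One iteration: both losses jointly adjust all parts of the processing model."""
    target_features = model.feature_extractor(batch["target_text_ids"])
    cls_probs = model.classification_head(target_features)     # softmax over classes
    corr_probs = model.correction_head(target_features)        # softmax over vocabulary
    loss1 = F.nll_loss(cls_probs.log(), batch["class_label"])  # first loss value L1
    loss2 = F.nll_loss(corr_probs.log().flatten(0, 1),         # second loss value L2
                       batch["real_text_ids"].flatten())
    loss = LAMBDA1 * loss1 + LAMBDA2 * loss2                   # L = λ1·L1 + λ2·L2
    optimizer.zero_grad()
    loss.backward()                                            # parameter adjustment
    optimizer.step()
    return loss.item()
```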
In this embodiment, when training the processing model, the text classification module and the keyword correction module are jointly trained, so that the effect of model training is improved, and the accuracy of the output results of the text classification module and the keyword correction module is improved.
Optionally, in some embodiments, before the iterative training of the processing model based on the training data set, the method further comprises:
acquiring a plurality of sample texts;
matching the plurality of sample texts with a preset keyword library;
under the condition that the sample text comprises a first keyword, inserting the first tag word in front of the first keyword, and inserting the second tag word behind the first keyword to obtain a target sample text;
and constructing a training data set based on the sample text and the target sample text, wherein the first keyword is a keyword in the plurality of keywords.
For convenience of description, two specific texts in the text recognition scene will be described below as an example. The first tag word is [ TYPEA ], and the second tag word is [/TYPEA ], and is used for representing that the keyword is TYPEA keywords.
Sample text a (noted text a): "hello, word a", and the target sample text A1 (noted as text A1) corresponding to the sample text a is "hello, [ type ea ] word a [/type ea ]". Although the text A contains the keyword "word A", the word A is not required to be recognized according to the actual semantics, so that the text A and the real text corresponding to the text A1 (marked as the text A2) are both "hello, and the word A".
Sample text B (noted as text B): "here, word A". The target sample text B1 (noted as text B1) corresponding to sample text B is "here, [TYPEA] word A [/TYPEA]". Text B contains the keyword "word A", and according to the actual semantics the keyword is indeed a TYPEA keyword, so the real text (noted as text B2) corresponding to both text B and text B1 is "here, [TYPEA] word A [/TYPEA]".
The training data set comprises a text A1 and a text B1, the text A1 and the text B1 are input into a processing model to train the processing model, and the corresponding learning targets are the text A2 and the text B2, so that the keyword correction module has the keyword correction capability, namely, a first tag word and a second tag word do not exist in the text which does not need to be processed actually, and the first tag word and the second tag word are reserved in the text which does need to be processed actually.
The training data set comprises text A and text B. Text A and text B are input into the processing model to train it, with text A2 and text B2 as the corresponding learning targets, so that the keyword correction module has the capability of "keyword discovery": that is, the keyword correction module of the processing model can still recognize keywords that were not matched when the text to be processed was matched against the preset keyword library.
Optionally, after the target feature vector is input into the keyword correction module to perform text prediction to obtain a predicted text, the method further includes:
determining a second keyword based on the first tag word and the second tag word in the case that the predicted text includes the first tag word and the second tag word;
and under the condition that the second keyword is not included in the preset keyword library, updating the preset keyword library based on the second keyword.
And under the condition that the predicted text comprises the first tag word and the second tag word, the word positioned between the first tag word and the second tag word in the predicted text is the second keyword. Under the condition that the second keyword is not included in the preset keyword library, the preset keyword library can be updated based on the second keyword, so that the preset keyword library is updated in real time, and the integrity of the preset keyword library is improved.
It should be understood that the specific manner of updating the preset keyword library based on the second keyword is not limited herein. Optionally, in some embodiments, in a case where the second keyword is not included in the preset keyword library, updating the preset keyword library based on the second keyword includes:
Adding the second keyword to a candidate word stock under the condition that the confidence degrees of the first tag word and the second tag word are both larger than a first threshold value;
and adding third keywords in the candidate word stock to the preset keyword stock, wherein a third keyword is a second keyword whose occurrence count in the candidate word stock exceeds the second threshold.
In the text processing process using the method provided by the application, the keyword correction module of the processing model may continuously "find" the second keyword by continuously processing the text to be processed. In order to avoid the situations of misjudgment and the like, whether the second keyword is added to the candidate word stock is judged based on the confidence degrees of the first tag word and the second tag word.
The higher the confidence of the first tag word and the second tag word, the greater the probability that the second keyword is a keyword to be identified; the first threshold can be adjusted and set according to actual conditions. If, within a period of time, the occurrence frequency of a certain word in the candidate word stock exceeds the second threshold, the word is with high probability a frequently occurring keyword that requires recognition processing; the second threshold can likewise be adjusted and set according to actual conditions.
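A minimal sketch of this two-threshold update; the threshold values, names, and the use of a counter as the candidate word stock are assumptions, since the patent fixes neither threshold.

```python
from collections import Counter

FIRST_THRESHOLD = 0.9    # assumed confidence threshold for the two tag words
SECOND_THRESHOLD = 10    # assumed occurrence threshold within the observation period

candidate_counts = Counter()     # candidate word stock with occurrence counts

def maybe_update_library(second_keyword: str, tag_confidences, keyword_library: set):
    """Promote a discovered keyword to the preset library only after both thresholds clear."""
    if second_keyword in keyword_library:
        return
    if all(conf > FIRST_THRESHOLD for conf in tag_confidences):
        candidate_counts[second_keyword] += 1        # add to the candidate word stock
    if candidate_counts[second_keyword] > SECOND_THRESHOLD:
        keyword_library.add(second_keyword)          # a frequent candidate: "third keyword"
```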
In the present embodiment, in the case where the predicted text includes the first tag word and the second tag word, the second keyword is determined based on the first tag word and the second tag word; and updating the preset keyword library based on the second keyword under the condition that the second keyword is not included in the preset keyword library. By the method, the automatic updating of the preset keyword library can be realized, and meanwhile, the judgment accuracy of the newly added keywords is improved.
Referring to fig. 2-4, for ease of understanding, a specific embodiment will be used to illustrate the text processing method and processing model provided in the present invention.
First, referring to fig. 2, the processing model includes a feature extraction module, a text classification module, and a keyword correction module. Specifically, the feature extraction module comprises a BERT word list, a pinyin word list, a visual model, a Linear layer and a BERT model, wherein the structure of the visual model is shown in fig. 3.
Referring to fig. 4, first, the text to be processed "hello, word A" is matched with the preset keyword library, and the TYPEA keyword "word A" is obtained by matching. The corresponding target text "hello, [TYPEA] word A [/TYPEA]" is obtained by inserting the first tag word [TYPEA] and the second tag word [/TYPEA]. The process of inserting the first tag word and the second tag word may also be referred to as text linearization.
The target text "hello, [TYPEA] word A [/TYPEA]" comprises 7 tokens; the token embedding, hearing embedding and visual embedding corresponding to the 7 tokens are extracted through the BERT vocabulary, the pinyin vocabulary and the visual model respectively.
The token embedding, hearing embedding and visual embedding corresponding to the 7 tokens are input into the Linear layer for linear fusion, and the fused result is then sent into the BERT model for feature extraction to obtain the target feature vector (marked as embeddings).
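A minimal sketch of this fusion step; concatenating the three embeddings before the Linear layer is an assumption, since the text only states that the three embeddings are linearly fused before entering the BERT model.

```python
import torch
import torch.nn as nn

class MultimodalFusion(nn.Module):
    """Linearly fuse token, hearing and visual embeddings before the BERT encoder."""
    def __init__(self, dim: int):
        super().__init__()
        self.fuse = nn.Linear(3 * dim, dim)   # one Linear layer over the three modalities

    def forward(self, token_emb, hearing_emb, visual_emb):
        # Each input: [batch, seq_len, dim]; the output feeds the BERT model.
        return self.fuse(torch.cat([token_emb, hearing_emb, visual_emb], dim=-1))
```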
In this embodiment, the text classification module includes a Linear layer and a softmax layer. The text classification result is obtained by inputting the embeddings into the Linear layer and the softmax layer; in this example the result indicates that the text does not need to be processed.
In this embodiment, the keyword correction module includes an LSTM model, a Linear layer and a softmax layer. After the embeddings are input into the LSTM model, the final predicted text is output through the Linear layer and the softmax layer. The predicted text is "hello, word A": the first tag word and the second tag word have been deleted by the keyword correction module, thereby realizing keyword correction.
In the process of training the processing model, a joint training mode is adopted for the text classification module and the keyword correction module; that is, the loss function supervising model learning covers both the keyword correction task and the text classification task. The loss value L of the model satisfies:

L = λ1·L1 + λ2·L2
In the training process, the embeddings are sent to the linear layer and the softmax layer of the text classification module, and the cross-entropy loss between the predicted label (text classification result) output by the text classification module and the real label is calculated to obtain L1. Specifically, L1 satisfies:

L1 = −Σ_{c=1}^{C} y_c · log( softmax(h·W1)_c )

In the above formula, n is the number of tokens included in the target text, h ∈ R^d is the hidden-layer output of the text sequence through BERT, W1 ∈ R^{d×C} is the parameter matrix of the linear layer, d is the dimension of the BERT output, C is the number of text classification labels, and y is the true (one-hot) label of the text.
In the training process, the embeddings are sent to the LSTM model, the Linear layer and the softmax layer of the keyword correction module, and the cross-entropy loss between each predicted next token and the corresponding token in the target text is calculated to obtain L2:

L2 = −(1/n) Σ_{i=1}^{n} log( softmax(h_i·W2)_{t_i} )

In the above formula, n is the number of tokens included in the target text, h_i ∈ R^r is the hidden-layer output of the i-th token through the LSTM, W2 ∈ R^{r×V} is the parameter matrix of the linear layer, r is the dimension of the LSTM hidden-layer output, V is the size of the vocabulary, and t_i is the index of the i-th token of the target text in the vocabulary.
It should be understood that the feature extraction module, the text classification module, and the keyword correction module all include Linear layers, but parameters of the respective Linear layers are different, and functions to be implemented are also different. The text classification module and the keyword correction module each include a softmax layer, but the parameters of each softmax layer are different.
The method thus realizes multi-modal text recognition combining keyword correction and text classification. By combining the semantic, auditory and visual multi-modal features of the text, the accuracy of text classification is greatly improved, especially for texts containing various variant words. Meanwhile, the keyword correction module enables the preset keyword library to be updated automatically.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a text processing device according to an embodiment of the present invention, and as shown in fig. 5, a text processing device 500 includes:
a first matching module 501, configured to match a text to be processed with a preset keyword library, where the preset keyword library includes a plurality of keywords;
a first processing module 502, configured to insert a first tag word before a target keyword and insert a second tag word after the target keyword when the target keyword is included in the text to be processed, so as to obtain a target text; under the condition that the text to be processed does not comprise a target keyword, determining the text to be processed as the target text, wherein the target keyword is one of the keywords;
The feature extraction module 503 is configured to input the target text into a feature extraction module of a pre-trained processing model to perform feature extraction, so as to obtain a target feature vector, where the processing model further includes a text classification module and a keyword correction module;
the second processing module 504 is configured to input the target feature vector into the text classification module for classification processing to obtain a text classification result, and input the target feature vector into the keyword correction module for text prediction to obtain a predicted text.
Optionally, the text processing device 500 further includes:
the iterative training module is used for carrying out iterative training on the processing model based on the training data set, determining a loss value of the Nth iterative training, wherein the loss value comprises a first loss value and a second loss value, the first loss value is a loss value of the text classification module determined based on a first loss function, the second loss value is a loss value of the keyword correction module determined based on a second loss function, and N is a positive integer;
the parameter adjustment module is used for carrying out parameter adjustment on the processing model based on the loss value;
and the first determining module is used for determining the processing model trained in the nth iteration as the pre-trained processing model under the condition that the loss value meets a loss convergence condition.
Optionally, the feature extraction module 503 includes:
an obtaining unit, configured to obtain an intermediate feature vector of the target text, where the intermediate feature vector includes at least one of a semantic feature vector, an auditory feature vector, and a visual feature vector, the semantic feature vector is used to characterize semantic features of the target text, the auditory feature vector is used to characterize pronunciation features of the target text, and the visual feature vector is used to characterize character features of the target text;
and the input module is used for inputting the intermediate feature vector into a bi-directional coding representation BERT model based on a transformer to perform feature extraction so as to obtain a target feature vector.
Optionally, the acquiring unit is specifically configured to:
mapping the target text based on the vocabulary of the BERT model to obtain semantic feature vectors of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the semantic feature vector of each word in the target text.
Optionally, the acquiring unit is specifically configured to:
mapping the target text based on a pinyin word list to obtain an auditory feature vector of each word in the target text, wherein the pinyin word list is constructed based on pinyin letters and tones;
And acquiring an intermediate feature vector of the target text based on the hearing feature vector of each word in the target text.
Optionally, the acquiring unit is specifically configured to:
acquiring a character picture corresponding to each word in the target text;
inputting the character picture into a pre-trained character recognition model for character recognition to obtain a visual feature vector of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the visual feature vector of each word in the target text.
Optionally, the text processing device 500 further includes:
the acquisition module is used for acquiring a plurality of sample texts;
the second matching module is used for matching the plurality of sample texts with a preset keyword library;
the inserting module is used for inserting the first tag word before the first keyword and inserting the second tag word after the first keyword under the condition that the first keyword is included in the sample text, so as to obtain a target sample text;
and the construction module is used for constructing a training data set based on the sample text and the target sample text, and the first keywords are keywords in the plurality of keywords.
Optionally, the text processing device 500 further includes:
a second determining module, configured to determine a second keyword based on the first tag word and the second tag word, in a case where the predicted text includes the first tag word and the second tag word; and the updating module is used for updating the preset keyword library based on the second keyword under the condition that the second keyword is not included in the preset keyword library.
Optionally, the updating module is specifically configured to:
adding the second keyword to a candidate word stock under the condition that the confidence degrees of the first tag word and the second tag word are both larger than a first threshold value;
and adding third keywords in the candidate word stock to the preset keyword stock, wherein a third keyword is a second keyword whose occurrence count in the candidate word stock exceeds the second threshold.
The text processing device 500 provided in the embodiment of the present invention can implement each process implemented in the method embodiment shown in fig. 1, and in order to avoid repetition, a description is omitted here.
The embodiment of the invention also provides electronic equipment. Because the principle of solving the problem of the electronic device is similar to that of the text processing method in the embodiment of the invention, the implementation of the electronic device can refer to the implementation of the method, and the repetition is omitted. As shown in fig. 6, an electronic device according to an embodiment of the present invention includes: the processor 600, configured to read the program in the memory 620, performs the following procedures:
Matching a text to be processed with a preset keyword library, wherein the preset keyword library comprises a plurality of keywords;
under the condition that the text to be processed comprises a target keyword, inserting a first tag word in front of the target keyword, and inserting a second tag word behind the target keyword to obtain a target text; under the condition that the text to be processed does not comprise a target keyword, determining the text to be processed as the target text, wherein the target keyword is one of the keywords;
inputting the target text into a feature extraction module of a pre-trained processing model to perform feature extraction to obtain a target feature vector, wherein the processing model further comprises a text classification module and a keyword correction module;
and inputting the target feature vector into the text classification module for classification processing to obtain a text classification result, and inputting the target feature vector into the keyword correction module for text prediction to obtain a predicted text.
Wherein in fig. 6, a bus architecture may comprise any number of interconnected buses and bridges, and in particular one or more processors represented by processor 600 and various circuits of memory represented by memory 620, linked together. The bus architecture may also link together various other circuits such as peripheral devices, voltage regulators, power management circuits, etc., which are well known in the art and, therefore, will not be described further herein. The bus interface provides an interface. The processor 600 is responsible for managing the bus architecture and general processing, and the memory 620 may store data used by the processor 600 in performing operations.
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
performing iterative training on a processing model based on a training data set, and determining a loss value of the Nth iterative training, wherein the loss value comprises a first loss value and a second loss value, the first loss value is a loss value of the text classification module determined based on a first loss function, the second loss value is a loss value of the keyword correction module determined based on a second loss function, and N is a positive integer;
performing parameter adjustment on the processing model based on the loss value;
and determining the processing model trained in the nth iteration as the pre-trained processing model under the condition that the loss value meets a loss convergence condition.
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
obtaining an intermediate feature vector of the target text, wherein the intermediate feature vector comprises at least one of a semantic feature vector, an auditory feature vector and a visual feature vector, the semantic feature vector is used for representing semantic features of the target text, the auditory feature vector is used for representing pronunciation features of the target text, and the visual feature vector is used for representing character features of the target text;
and inputting the intermediate feature vector into a Bidirectional Encoder Representations from Transformers (BERT) model for feature extraction to obtain the target feature vector.
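One plausible fusion of the three intermediate vectors is element-wise summation before the BERT encoder, as in ChineseBERT-style models; the embodiment does not fix the fusion operator, so the sketch below is an assumption, including the glyph feature width of 512.

```python
import torch.nn as nn

# Sketch of fusing semantic, auditory and visual per-character vectors by
# summation into the intermediate feature vector fed to the BERT encoder.

class IntermediateEmbedding(nn.Module):
    def __init__(self, vocab_size=21128, pinyin_size=2000, hidden=768):
        super().__init__()
        self.semantic = nn.Embedding(vocab_size, hidden)   # BERT vocabulary lookup
        self.auditory = nn.Embedding(pinyin_size, hidden)  # pinyin vocabulary lookup
        self.visual_proj = nn.Linear(512, hidden)          # projects glyph features

    def forward(self, token_ids, pinyin_ids, glyph_feats):
        # element-wise sum yields the intermediate feature vector
        return (self.semantic(token_ids)
                + self.auditory(pinyin_ids)
                + self.visual_proj(glyph_feats))
```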
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
mapping the target text based on the vocabulary of the BERT model to obtain semantic feature vectors of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the semantic feature vector of each word in the target text.
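Assuming the HuggingFace transformers package and the public bert-base-chinese checkpoint (both assumptions, since the embodiment names neither), the vocabulary mapping could look like this:

```python
from transformers import BertTokenizer  # third-party package, an assumption

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
ids = tokenizer.convert_tokens_to_ids(list("查询流量"))  # one vocabulary ID per character
# each ID indexes the semantic embedding table, giving one vector per word
```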
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
mapping the target text based on a pinyin word list to obtain an auditory feature vector of each word in the target text, wherein the pinyin word list is constructed based on pinyin letters and tones;
and acquiring an intermediate feature vector of the target text based on the auditory feature vector of each word in the target text.
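A sketch of the auditory lookup, assuming the third-party pypinyin package for pinyin-plus-tone strings; the patent's pinyin word list is built ahead of time from pinyin letters and tones, whereas here the vocabulary is grown on the fly for brevity:

```python
from pypinyin import Style, lazy_pinyin  # third-party pypinyin package, an assumption

pinyin_vocab: dict[str, int] = {}

def auditory_ids(text: str) -> list[int]:
    """Map each character to the index of its pinyin-plus-tone string."""
    ids = []
    for syllable in lazy_pinyin(text, style=Style.TONE3):  # e.g. "liu2", "liang4"
        ids.append(pinyin_vocab.setdefault(syllable, len(pinyin_vocab)))
    return ids

print(auditory_ids("流量"))  # -> [0, 1]
```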
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
acquiring a character picture corresponding to each word in the target text;
inputting the character picture into a pre-trained character recognition model for character recognition to obtain a visual feature vector of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the visual feature vector of each word in the target text.
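The visual branch could be realised as below: each character picture (here assumed to be a 32x32 grayscale rendering) passes through a small CNN whose output serves as the visual feature vector. The architecture is an assumption; the embodiment requires only a pre-trained character recognition model.

```python
import torch.nn as nn

# Sketch of a glyph encoder producing one visual feature vector per character.
# feat_dim=512 matches the visual projection in the fusion sketch above.

class GlyphEncoder(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, glyph_images):            # (num_chars, 1, 32, 32) pictures
        x = self.conv(glyph_images).flatten(1)  # (num_chars, 64)
        return self.fc(x)                       # visual feature vector per word
```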
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
acquiring a plurality of sample texts;
matching the plurality of sample texts with a preset keyword library;
under the condition that the sample text comprises a first keyword, inserting the first tag word in front of the first keyword, and inserting the second tag word behind the first keyword to obtain a target sample text;
and constructing a training data set based on the sample text and the target sample text, wherein the first keyword is a keyword in the plurality of keywords.
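A sketch of the training-set construction, reusing tag_keywords() from the earlier sketch; classification labels are omitted for brevity, although each entry would carry one in a full implementation:

```python
def build_training_set(sample_texts, keywords):
    dataset = []
    for text in sample_texts:
        tagged = tag_keywords(text, keywords)
        dataset.append(text)        # the sample text itself
        if tagged != text:          # a first keyword was matched
            dataset.append(tagged)  # the target sample text with tag words
    return dataset
```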
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
determining a second keyword based on the first tag word and the second tag word in the case that the predicted text includes the first tag word and the second tag word;
and under the condition that the second keyword is not included in the preset keyword library, updating the preset keyword library based on the second keyword.
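Recovering the second keyword from the predicted text reduces to taking the span between the two tag words, for example (tag tokens as in the earlier sketch):

```python
def extract_second_keyword(predicted: str,
                           first_tag: str = "[KW]",
                           second_tag: str = "[/KW]"):
    """Return the span between the tag words, or None if either is absent."""
    i = predicted.find(first_tag)
    j = predicted.find(second_tag, i + len(first_tag))
    if i == -1 or j == -1:
        return None
    return predicted[i + len(first_tag):j]
```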
Optionally, the processor 600 is further configured to read the program in the memory 620, and perform the following steps:
adding the second keyword to a candidate word stock under the condition that the confidence degrees of the first tag word and the second tag word are both larger than a first threshold value;
and adding a third keyword in the candidate word stock to the preset keyword library, wherein the third keyword is a second keyword whose occurrence count in the candidate word stock exceeds a second threshold.
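A sketch of the two-stage library update follows. The first threshold gates on the confidences of both tag words and the second on how often a candidate has been seen; both numeric values are illustrative assumptions.

```python
from collections import Counter

FIRST_THRESHOLD = 0.9   # assumed confidence gate for both tag words
SECOND_THRESHOLD = 5    # assumed occurrence gate for promotion
candidate_word_stock: Counter = Counter()

def maybe_update_library(second_keyword: str, conf_first: float,
                         conf_second: float, preset_keywords: set) -> None:
    if second_keyword in preset_keywords:
        return  # already in the preset keyword library
    if conf_first > FIRST_THRESHOLD and conf_second > FIRST_THRESHOLD:
        candidate_word_stock[second_keyword] += 1   # add to candidate word stock
    if candidate_word_stock[second_keyword] > SECOND_THRESHOLD:
        preset_keywords.add(second_keyword)         # promote the third keyword
```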
The electronic device provided by this embodiment of the present invention may execute the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.
An embodiment of the present application further provides a readable storage medium storing a program which, when executed by a processor, implements each process of the above text processing method embodiments and achieves the same technical effects; to avoid repetition, the details are not described again here. The readable storage medium may be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic memories (e.g., floppy disks, hard disks, magnetic tapes, magneto-optical (MO) disks), optical memories (e.g., compact discs (CD), digital versatile discs (DVD), Blu-ray discs (BD), high-definition versatile discs (HVD)), and semiconductor memories (e.g., read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), non-volatile memory (NAND FLASH), solid state disks (SSD)).
In the several embodiments provided in this application, it should be understood that the disclosed methods and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices or units, and may be electrical, mechanical or in another form.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may exist physically as separate units, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. Such a software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium capable of storing program code.
While the foregoing is directed to preferred embodiments of the present invention, those skilled in the art will appreciate that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of the present invention.

Claims (11)

1. A text processing method, comprising:
matching a text to be processed with a preset keyword library, wherein the preset keyword library comprises a plurality of keywords;
under the condition that the text to be processed comprises a target keyword, inserting a first tag word in front of the target keyword, and inserting a second tag word behind the target keyword to obtain a target text; under the condition that the text to be processed does not comprise a target keyword, determining the text to be processed as the target text, wherein the target keyword is one of the keywords;
inputting the target text into a feature extraction module of a pre-trained processing model to perform feature extraction to obtain a target feature vector, wherein the processing model further comprises a text classification module and a keyword correction module;
inputting the target feature vector into the text classification module for classification processing to obtain a text classification result, and inputting the target feature vector into the keyword correction module for text prediction to obtain a predicted text;
wherein before the inputting the target text into the feature extraction module of the pre-trained processing model to perform feature extraction, the method further comprises:
performing iterative training on a processing model based on a training data set, and determining a loss value of the Nth iterative training, wherein the loss value comprises a first loss value and a second loss value, the first loss value is a loss value of the text classification module determined based on a first loss function, the second loss value is a loss value of the keyword correction module determined based on a second loss function, and N is a positive integer;
performing parameter adjustment on the processing model based on the loss value;
and determining the processing model trained in the Nth iteration as the pre-trained processing model under the condition that the loss value meets a loss convergence condition.
2. The method according to claim 1, wherein the inputting the target text into the feature extraction module of the processing model to perform feature extraction to obtain a target feature vector comprises:
obtaining an intermediate feature vector of the target text, wherein the intermediate feature vector comprises at least one of a semantic feature vector, an auditory feature vector and a visual feature vector, the semantic feature vector is used for representing semantic features of the target text, the auditory feature vector is used for representing pronunciation features of the target text, and the visual feature vector is used for representing character features of the target text;
and inputting the intermediate feature vector into a Bidirectional Encoder Representations from Transformers (BERT) model for feature extraction to obtain the target feature vector.
3. The method of claim 2, wherein the obtaining the intermediate feature vector of the target text comprises:
mapping the target text based on the vocabulary of the BERT model to obtain semantic feature vectors of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the semantic feature vector of each word in the target text.
4. The method of claim 2, wherein the obtaining the intermediate feature vector of the target text comprises:
mapping the target text based on a pinyin word list to obtain an auditory feature vector of each word in the target text, wherein the pinyin word list is constructed based on pinyin letters and tones;
and acquiring an intermediate feature vector of the target text based on the auditory feature vector of each word in the target text.
5. The method of claim 2, wherein the obtaining the intermediate feature vector of the target text comprises:
acquiring a character picture corresponding to each word in the target text;
inputting the character picture into a pre-trained character recognition model for character recognition to obtain a visual feature vector of each word in the target text;
and acquiring an intermediate feature vector of the target text based on the visual feature vector of each word in the target text.
6. The method of claim 1, wherein prior to iteratively training the process model based on the training dataset, the method further comprises:
acquiring a plurality of sample texts;
matching the plurality of sample texts with a preset keyword library;
under the condition that the sample text comprises a first keyword, inserting the first tag word in front of the first keyword, and inserting the second tag word behind the first keyword to obtain a target sample text;
and constructing a training data set based on the sample text and the target sample text, wherein the first keyword is a keyword in the plurality of keywords.
7. The method of claim 1, wherein after inputting the target feature vector into the keyword correction module for text prediction to obtain predicted text, the method further comprises:
determining a second keyword based on the first tag word and the second tag word in the case that the predicted text includes the first tag word and the second tag word;
and under the condition that the second keyword is not included in the preset keyword library, updating the preset keyword library based on the second keyword.
8. The method of claim 7, wherein updating the preset keyword library based on the second keyword if the second keyword is not included in the preset keyword library, comprises:
adding the second keyword to a candidate word stock under the condition that the confidence degrees of the first tag word and the second tag word are both larger than a first threshold value;
and adding a third keyword in the candidate word stock to the preset keyword library, wherein the third keyword is a second keyword whose occurrence count in the candidate word stock exceeds a second threshold.
9. A text processing apparatus, comprising:
the first matching module is used for matching the text to be processed with a preset keyword library, wherein the preset keyword library comprises a plurality of keywords;
the first processing module is used for inserting a first tag word in front of the target keyword and inserting a second tag word behind the target keyword under the condition that the target keyword is included in the text to be processed, so as to obtain a target text; under the condition that the text to be processed does not comprise a target keyword, determining the text to be processed as the target text, wherein the target keyword is one of the keywords;
The feature extraction module is used for inputting the target text into a feature extraction module of a pre-trained processing model to perform feature extraction to obtain a target feature vector, and the processing model further comprises a text classification module and a keyword correction module;
the second processing module is used for inputting the target feature vector into the text classification module for classification processing to obtain a text classification result, and inputting the target feature vector into the keyword correction module for text prediction to obtain a predicted text;
wherein the text processing device further comprises:
the iterative training module is used for carrying out iterative training on the processing model based on the training data set, determining a loss value of the Nth iterative training, wherein the loss value comprises a first loss value and a second loss value, the first loss value is a loss value of the text classification module determined based on a first loss function, the second loss value is a loss value of the keyword correction module determined based on a second loss function, and N is a positive integer;
the parameter adjustment module is used for carrying out parameter adjustment on the processing model based on the loss value;
and the first determining module is used for determining the processing model trained in the Nth iteration as the pre-trained processing model under the condition that the loss value meets a loss convergence condition.
10. An electronic device, comprising: a memory, a processor, and a program stored on the memory and executable on the processor; characterized in that the processor is configured to read the program in the memory to implement the steps of the method according to any one of claims 1 to 8.
11. A readable storage medium storing a program, wherein the program when executed by a processor implements the steps of the method according to any one of claims 1 to 8.
CN202311163073.8A 2023-09-11 2023-09-11 Text processing method, text processing device, electronic equipment and readable storage medium Active CN116894092B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311163073.8A CN116894092B (en) 2023-09-11 2023-09-11 Text processing method, text processing device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN116894092A CN116894092A (en) 2023-10-17
CN116894092B (en) 2024-01-26

Family

ID=88309800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311163073.8A Active CN116894092B (en) 2023-09-11 2023-09-11 Text processing method, text processing device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN116894092B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111198948A (en) * 2020-01-08 2020-05-26 深圳前海微众银行股份有限公司 Text classification correction method, device and equipment and computer readable storage medium
CN113673528A (en) * 2021-08-06 2021-11-19 Oppo广东移动通信有限公司 Text processing method and device, electronic equipment and readable storage medium
CN115827815A (en) * 2022-11-17 2023-03-21 西安电子科技大学广州研究院 Keyword extraction method and device based on small sample learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108073568B (en) * 2016-11-10 2020-09-11 腾讯科技(深圳)有限公司 Keyword extraction method and device

Also Published As

Publication number Publication date
CN116894092A (en) 2023-10-17

Similar Documents

Publication Publication Date Title
CN112084337B (en) Training method of text classification model, text classification method and equipment
CN111291566B (en) Event main body recognition method, device and storage medium
CN110232340B (en) Method and device for establishing video classification model and video classification
CN112528637B (en) Text processing model training method, device, computer equipment and storage medium
CN111079432B (en) Text detection method and device, electronic equipment and storage medium
CN112287680B (en) Entity extraction method, device and equipment of inquiry information and storage medium
CN110968725B (en) Image content description information generation method, electronic device and storage medium
CN116127953B (en) Chinese spelling error correction method, device and medium based on contrast learning
CN113948066B (en) Error correction method, system, storage medium and device for real-time translation text
US20220043982A1 (en) Toxic vector mapping across languages
CN112784581A (en) Text error correction method, device, medium and electronic equipment
CN112800239A (en) Intention recognition model training method, intention recognition method and device
CN114416979A (en) Text query method, text query equipment and storage medium
CN115544303A (en) Method, apparatus, device and medium for determining label of video
CN112101042A (en) Text emotion recognition method and device, terminal device and storage medium
CN114218945A (en) Entity identification method, device, server and storage medium
CN114461366A (en) Multi-task model training method, processing method, electronic device and storage medium
CN114416981A (en) Long text classification method, device, equipment and storage medium
CN112528653A (en) Short text entity identification method and system
CN116975292A (en) Information identification method, apparatus, electronic device, storage medium, and program product
CN115130437B (en) Intelligent document filling method and device and storage medium
CN116894092B (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN115861995A (en) Visual question-answering method and device, electronic equipment and storage medium
CN113704466B (en) Text multi-label classification method and device based on iterative network and electronic equipment
CN115858776A (en) Variant text classification recognition method, system, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant