Disclosure of Invention
In view of the above problems, embodiments of the present invention are proposed to provide a text processing method, a text processing apparatus, an electronic device, and a computer-readable storage medium that overcome or at least partially solve the above problems.
In order to solve the above problem, an embodiment of the present invention discloses a text processing method, where the method includes:
acquiring a text to be processed and a query statement;
matching the query statement with preset standard questions corresponding to a plurality of preset categories to obtain a preset category corresponding to the query statement;
classifying the texts to be processed by adopting a pre-trained text processing model to obtain answer position information corresponding to a preset category contained in the texts to be processed; determining answer position information matched with the query statement according to a preset category corresponding to the query statement and a preset category corresponding to the answer position information;
and determining a predicted answer text from the text to be processed according to the answer position information matched with the query statement.
Optionally, the text processing model includes a text extraction module and a full connection layer; the method for classifying the texts to be processed to obtain answer position information corresponding to preset categories contained in the texts to be processed by adopting a pre-trained text processing model comprises the following steps:
inputting the text to be processed into a text extraction module of a text processing model, and coding the text to be processed by the text extraction module to obtain sentence characteristics and character characteristics;
inputting the character features and sentence features into a full connection layer of the text processing model, classifying the text to be processed according to a plurality of preset categories by the full connection layer according to the sentence features, and determining the preset categories contained in the text to be processed; and determining a plurality of answer position information corresponding to preset categories contained in the text to be processed from the text to be processed by the full connection layer according to the character features.
Optionally, the text processing model comprises a first classification layer; the determining answer position information matched with the query statement according to the preset category corresponding to the query statement and the answer position information includes:
inputting the sentence characteristics into a first classification layer of the text processing model, and searching answer position information matched with the preset category corresponding to the query sentence from a plurality of answer position information determined by the full connection layer by the first classification layer according to the preset category corresponding to the query sentence so as to determine the answer position information matched with the query sentence.
Optionally, the text processing model is trained by:
acquiring training data, wherein the training data comprises a training text and a text label corresponding to the training text; the text label comprises a standard question corresponding to a preset category and real answer information matched with the standard question;
classifying answer position information corresponding to each preset category from the training text by adopting a text processing model; determining answer position information matched with the standard question according to a preset category corresponding to the standard question and a preset category corresponding to the answer position information; determining a predicted answer text according to the answer position information matched with the standard question; determining a real answer segment which accords with the real answer information in the training text;
determining a loss function value according to answer position information matched with the standard question, the predicted answer text, the real answer segment and the text label;
and adjusting the text processing model parameters according to the loss function values so as to train the text processing model.
Optionally, the real answer information includes real answer text; the text processing model comprises a text extraction module and a second classification layer; the determining a real answer segment in the training text that meets the real answer information includes:
inputting the training text into the text extraction module, and coding the training text by the text extraction module to obtain sentence characteristics;
inputting the sentence characteristics into the second classification layer, and judging whether each character in the training text appears in the real answer text by the second classification layer to obtain a judgment result of each character in the training text;
and determining a real answer segment consisting of characters appearing in the real answer text according to the judgment result.
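The character-level judgment described above can be illustrated with a minimal sketch. In the embodiment the judgment is made by the second classification layer from the sentence features; here a simple rule-based membership check stands in for that learned classifier, and the function names label_characters and real_answer_segment are hypothetical.

```python
from typing import List


def label_characters(training_text: str, real_answer_text: str) -> List[int]:
    """Mark each character of the training text with 1 if it appears in the
    real answer text, else 0 (a rule-based stand-in for the classifier)."""
    answer_chars = set(real_answer_text)
    return [1 if ch in answer_chars else 0 for ch in training_text]


def real_answer_segment(training_text: str, labels: List[int]) -> str:
    """Assemble the real answer segment from the characters labeled 1."""
    return "".join(ch for ch, label in zip(training_text, labels) if label == 1)


# Toy example with placeholder strings (not data from the embodiment).
text = "abcab"
labels = label_characters(text, "ab")
segment = real_answer_segment(text, labels)
```

In this sketch the judgment result is a 0/1 list aligned with the training text, and the real answer segment is formed from the characters marked 1.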
The embodiment of the invention also discloses a text processing device, which comprises:
the acquisition module is used for acquiring the text to be processed and the query sentence;
the query statement matching module is used for matching the query statement with preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query statement;
the model processing module is used for classifying the texts to be processed by adopting a pre-trained text processing model to obtain answer position information corresponding to a preset category contained in the texts to be processed; determining answer position information matched with the query statement according to a preset category corresponding to the query statement and a preset category corresponding to the answer position information;
and the answer determining module is used for determining a predicted answer text from the text to be processed according to the answer position information matched with the query statement.
Optionally, the text processing model includes a text extraction module and a full connection layer; the model processing module comprises:
the encoding submodule is used for inputting the text to be processed into a text extraction module of a text processing model, and the text extraction module encodes the text to be processed to obtain sentence characteristics and character characteristics;
the answer position information determining submodule is used for inputting the character features and the sentence features into a full connection layer of the text processing model, and the full connection layer classifies the text to be processed according to the sentence features and a plurality of preset categories and determines the preset categories contained in the text to be processed; and determining a plurality of answer position information corresponding to preset categories contained in the text to be processed from the text to be processed by the full connection layer according to the character features.
Optionally, the text processing model comprises a first classification layer; the model processing module comprises:
and the answer position information matching sub-module is used for inputting the sentence characteristics into a first classification layer of the text processing model, and the first classification layer searches answer position information matched with the preset category corresponding to the query sentence from a plurality of answer position information determined by the full connection layer according to the preset category corresponding to the query sentence so as to determine the answer position information matched with the query sentence.
Optionally, the text processing model is trained by:
the training data acquisition module is used for acquiring training data, and the training data comprises a training text and a text label corresponding to the training text; the text label comprises a standard question corresponding to a preset category and real answer information matched with the standard question;
the model training module is used for classifying answer position information corresponding to each preset type from the training text by adopting a text processing model; determining answer position information matched with the standard question according to a preset category corresponding to the standard question and a preset category corresponding to the answer position information; determining a predicted answer text according to the answer position information matched with the standard question; determining a real answer segment which accords with the real answer information in the training text;
the loss function determining module is used for determining a loss function value according to the answer position information matched with the standard question, the predicted answer text, the real answer segment and the text label;
and the model parameter adjusting module is used for adjusting the text processing model parameters according to the loss function values so as to train the text processing model.
Optionally, the real answer information includes real answer text; the model training module comprises:
the training text coding sub-module is used for inputting the training text into the text extraction module, and the text extraction module codes the training text to obtain sentence characteristics;
the classification submodule is used for inputting the sentence characteristics into the second classification layer, and the second classification layer judges whether each character in the training text appears in the real answer text or not to obtain the judgment result of each character in the training text;
and the answer segment determining submodule is used for determining a real answer segment consisting of characters appearing in the real answer text according to the judgment result.
Optionally, the real answer information further includes real answer position information and a preset category corresponding to the real answer position information; the loss function determination module includes:
the answer classification loss determining submodule is used for comparing a preset category corresponding to the answer position information matched with the standard question with a preset category corresponding to the real answer position information to determine the answer classification loss;
the position loss determining submodule is used for comparing answer position information matched with the standard question with the real answer position information to determine position loss;
the word classification loss determining submodule is used for judging, for each word in the predicted answer text, whether the word appears in the real answer segment, and determining the word classification loss;
the evaluation loss determining submodule is used for comparing the predicted answer text with the real answer text to determine the evaluation loss;
and the loss function value determining submodule is used for determining a loss function value according to the position loss, the answer classification loss, the word classification loss and the evaluation loss.
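How the four losses are combined is not specified above. The following minimal sketch assumes a weighted sum with equal weights, and assumes that the word classification loss is the fraction of characters of the predicted answer text that do not appear in the real answer segment. Both choices, and the function names, are illustrative assumptions rather than the embodiment's definition.

```python
from typing import Sequence


def word_classification_loss(predicted_answer: str, real_answer_segment: str) -> float:
    """Assumed definition: share of predicted characters absent from the
    real answer segment (0.0 means every predicted character is covered)."""
    if not predicted_answer:
        return 0.0
    real_chars = set(real_answer_segment)
    misses = sum(1 for ch in predicted_answer if ch not in real_chars)
    return misses / len(predicted_answer)


def total_loss(
    position_loss: float,
    answer_classification_loss: float,
    word_loss: float,
    evaluation_loss: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0),
) -> float:
    """Assumed combination: weighted sum of the four loss terms."""
    parts = (position_loss, answer_classification_loss, word_loss, evaluation_loss)
    return sum(w * p for w, p in zip(weights, parts))
```

The weights parameter is included only to show that the relative contribution of each term could be tuned; the source does not state whether such weights are used.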
The embodiment of the invention also discloses an electronic device, which comprises: a processor, a memory and a computer program stored on the memory and capable of running on the processor, wherein the computer program, when executed by the processor, carries out the steps of the text processing method as described above.
The embodiment of the invention also discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the text processing method are realized.
The embodiment of the invention has the following advantages:
in the embodiment of the application, the text to be processed and the query statement can be acquired, and the query statement is matched with the preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query statement; by adopting a pre-trained text processing model, the text to be processed can be classified to obtain answer position information corresponding to the preset categories contained in the text to be processed; and answer position information matched with the query statement is determined according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information, so as to determine a predicted answer text from the text to be processed. Because the answer position information is obtained from the text to be processed separately for each preset category, the extracted information is effectively classified, the information matching the preset category of the query statement can be determined, and the user's request is answered accurately.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Information extraction is one of the important directions of natural language processing applications, and mainly classifies and extracts specific events or entities from natural language texts. Currently, two modes, pointer and serialization, are mainly adopted for information extraction, wherein a pointer network based on BERT (Bidirectional Encoder Representations from Transformers) is a mainstream framework for pointer extraction, and an information extraction task can be treated as an MRC (Machine Reading Comprehension) task. The BERT-based MRC task can realize the extraction of information, but in a recruitment scene, high accuracy and coverage are required of the information extracted from recruitment posts; the BERT-based extraction model cannot effectively classify the extracted information, and its extraction accuracy needs to be further improved.
The core concept of the embodiment of the invention is that the text to be processed and the query sentence can be obtained, and the query sentence is matched with the preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query sentence; by adopting a pre-trained text processing model, the text to be processed can be classified to obtain answer position information corresponding to the preset categories contained in the text to be processed; and answer position information matched with the query statement is determined according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information, so as to determine a predicted answer text from the text to be processed. Compared with the BERT-based MRC task, the embodiment of the application obtains the answer position information from the text to be processed separately for each preset category, so that the extracted information is effectively classified, the information matching the preset category of the query statement can be determined, and the user's request is answered accurately.
Referring to fig. 1, a flowchart illustrating steps of a text processing method according to an embodiment of the present invention is shown, where the method specifically includes the following steps:
step 101, obtaining a text to be processed and a query sentence.
The text processing method can be applied to an intelligent dialogue scene in which a server processes a text to be processed based on a query sentence input by a user through a terminal, so that the user's query sentence is answered from the text to be processed.
Illustratively, in a recruitment service scene, a recruiting user can open a recruitment client through a terminal and input a recruitment text to the recruitment client, and a server can extract the recruitment text to obtain a text to be processed and store the text to be processed; a job-hunting user can open the recruitment client through a terminal and input a query sentence to the recruitment client. After the server obtains the query sentence input by the user, the text to be processed can be processed, and the answer text obtained based on the query sentence input by the user is returned.
Step 102, matching the query statement with preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query statement.
In the recruitment service scene, 6 categories can be preset: age requirements, job skills, post responsibilities, required major, academic requirements and work experience. The server may store a plurality of preset standard questions corresponding to the preset categories, and may match the query statement with the plurality of preset standard questions after obtaining the query statement input by the user. After a matching preset standard question is found, the preset category corresponding to that preset standard question can be used as the preset category corresponding to the query statement.
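The matching in step 102 can be sketched as follows. The matching algorithm is not specified above, so this minimal sketch assumes a simple string-similarity comparison using the Python standard library; the wording of the standard questions and the function name match_category are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical standard questions, one per preset category; the actual
# wording stored by the server is not given in the source.
STANDARD_QUESTIONS = {
    "age requirements": "What is the age requirement for this position?",
    "job skills": "What skills does this position require?",
    "post responsibilities": "What are the responsibilities of this position?",
    "required major": "What major does this position require?",
    "academic requirements": "What education level does this position require?",
    "work experience": "How much work experience does this position require?",
}


def match_category(query: str) -> str:
    """Return the preset category whose standard question is most similar
    to the query (similarity measure is an assumption of this sketch)."""
    def similarity(question: str) -> float:
        return SequenceMatcher(None, query.lower(), question.lower()).ratio()

    return max(STANDARD_QUESTIONS, key=lambda cat: similarity(STANDARD_QUESTIONS[cat]))
```

A production system would more likely use semantic matching rather than character-level similarity; the sketch only shows how the matched standard question yields the preset category of the query statement.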
Step 103, classifying the text to be processed by adopting a pre-trained text processing model to obtain answer position information corresponding to a preset category contained in the text to be processed; and determining answer position information matched with the query statement according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information.
The answer position information may be used to indicate a position range of the answer text in the text to be processed, and may include a start position output sequence and an end position output sequence. Specifically, the answer position information may be represented by span (start, end), where start may be used to represent a start position of the answer text in the start position output sequence corresponding to the text to be processed, and end may be used to represent an end position of the answer text in the end position output sequence corresponding to the text to be processed.
For example, when the preset categories are age requirements, job skills, post responsibilities, required major, academic requirements and work experience, the preset categories contained in the text to be processed may be required major and work experience.
By adopting a pre-trained text processing model, answer position information corresponding to a preset category contained in a text to be processed can be obtained by classifying the text to be processed, and answer position information matched with a query statement can be determined according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information.
Step 104, determining a predicted answer text from the text to be processed according to the answer position information matched with the query statement.
Specifically, according to the answer position information matched with the query statement, the corresponding index range in the text to be processed is looked up, the predicted answer text is determined according to the index range, and the predicted answer text is returned to the user.
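Determining the predicted answer text from span(start, end) can be sketched as below. The sketch assumes that start and end are 1-based character positions and that both are inclusive; the source does not state the indexing convention, and the function name extract_answer is hypothetical.

```python
from typing import Tuple


def extract_answer(text: str, span: Tuple[int, int]) -> str:
    """Return the substring of `text` covered by span(start, end), where both
    positions are assumed to be 1-based and inclusive."""
    start, end = span
    if not (1 <= start <= end <= len(text)):
        raise ValueError(f"span {span} is outside the text of length {len(text)}")
    return text[start - 1:end]


# Placeholder text, not taken from the embodiment.
answer = extract_answer("responsible for product verification", (1, 11))
```

The range check is included so that an answer position that does not fit the text to be processed is reported rather than silently truncated.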
In the embodiment of the application, the text to be processed and the query statement can be acquired, and the query statement is matched with the preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query statement; by adopting a pre-trained text processing model, the text to be processed can be classified to obtain answer position information corresponding to the preset categories contained in the text to be processed; and answer position information matched with the query statement is determined according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information, so as to determine a predicted answer text from the text to be processed. Compared with the BERT-based MRC task, the embodiment of the application obtains the answer position information from the text to be processed separately for each preset category, so that the extracted information is effectively classified, the information matching the preset category of the query statement can be determined, and the user's request is answered accurately.
Referring to fig. 2, a flowchart illustrating steps of an alternative embodiment of a text processing method provided in the embodiment of the present application is shown, which may specifically include the following steps:
step 201, obtaining a text to be processed and a query sentence.
In the recruitment service scenario, a user can browse a plurality of recruitment posts through a client. The page of the recruitment post may include information related to the recruitment position, and the user may enter a query statement while browsing the page of one of the recruitment posts. The server can extract the information related to the recruitment position to obtain the text to be processed, and after the server obtains the query sentence input by the user, the text processing can be carried out on the text to be processed.
Illustratively, the query statement entered by the user: What major does this position require?
Exemplarily, the text to be processed: The cashier intern is responsible for managing and checking daily receipts and payments and for checking the basic accounts of the office. Finance-related major; conscientious and hardworking graduates; mentoring provided. Salary during the trial period is 2000, and after becoming a regular employee 3000 + bonus (500-.
Step 202, matching the query statement with preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query statement.
Step 203, inputting the text to be processed into a text extraction module of the text processing model, and coding the text to be processed by the text extraction module to obtain sentence characteristics and character characteristics.
Illustratively, the text extraction module may use an ALBERT (lightweight BERT) model: the text to be processed is input into the ALBERT model, and the ALBERT model encodes the text to be processed to obtain the sentence features and the character features.
In the embodiment of the application, the ALBERT model is adopted for coding, so that the time consumption for calling the model can be shortened, and the online deployment is facilitated.
Sentence features can be used to classify the text to be processed. Specifically, the ALBERT model may insert a [CLS] flag bit before the input text to be processed; after encoding, the final vector representation corresponding to the [CLS] flag bit is obtained and may be used as the sentence feature, that is, a semantic representation of the whole text to be processed. The output of the ALBERT model may include pooled_out, which may be the final vector representation of the first token of the text sequence to be processed, that is, the final vector representation corresponding to the [CLS] flag bit.
Character features may be used to determine answer position information. Specifically, after the ALBERT model encodes the text to be processed, the final vector representation of each token can be obtained, and the final vector representation of each token can be used as a character feature. The output of the ALBERT model may include sequence_output, which may be the final vector representation of each token of the text sequence to be processed.
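The relation between the encoder output and the two kinds of features can be sketched without a real encoder. In the sketch below, toy_encode is a hypothetical stand-in for the ALBERT model that returns arbitrary values, one vector per token with a [CLS] token prepended; split_features then takes the first vector as the sentence feature. Treating the remaining vectors (without [CLS]) as the character features is an assumption of this sketch.

```python
from typing import List, Tuple

Vector = List[float]


def toy_encode(text: str, dim: int = 4) -> List[Vector]:
    """Hypothetical stand-in for the ALBERT encoder: one vector per token,
    with a [CLS] token prepended. The values carry no meaning."""
    tokens = ["[CLS]"] + list(text)
    return [
        [((sum(map(ord, token)) + i + j) % 10) / 10.0 for j in range(dim)]
        for i, token in enumerate(tokens)
    ]


def split_features(sequence_output: List[Vector]) -> Tuple[Vector, List[Vector]]:
    """Split the per-token output into the sentence feature (vector of the
    [CLS] token) and the character features (vectors of the other tokens)."""
    sentence_feature = sequence_output[0]
    character_features = sequence_output[1:]
    return sentence_feature, character_features


sequence_output = toy_encode("abc")
sentence_feature, character_features = split_features(sequence_output)
```

With a real encoder, only toy_encode would be replaced; the split into one text-level vector and one vector per character stays the same.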
Step 204, inputting the character features and sentence features into a full connection layer of the text processing model, classifying the text to be processed according to a plurality of preset categories by the full connection layer according to the sentence features, and determining the preset categories contained in the text to be processed; and determining a plurality of answer position information corresponding to preset categories contained in the text to be processed from the text to be processed by the full connection layer according to the character features.
The fully connected layer may include a plurality of fully connected layers, each of which may correspond to one preset category. Each piece of answer position information may include a start position output sequence and an end position output sequence of one answer. The character features and the sentence features are input into the plurality of fully connected layers, which classify the text to be processed according to the sentence features and a plurality of preset categories to determine the preset categories contained in the text to be processed, and determine, from the text to be processed according to the character features, a start position output sequence and an end position output sequence of the answers corresponding to the preset categories contained in the text to be processed.
Exemplarily, in a recruitment service scene, 6 fully connected layers corresponding to the 6 categories of age requirements, job skills, post responsibilities, required major, academic requirements and work experience can be preset. The 6 categories may be represented by labels 0-5, for example, an age requirements label of 0, a job skills label of 1, a post responsibilities label of 2, a required major label of 3, an academic requirements label of 4, and a work experience label of 5.
The text to be processed may be classified by the 6 fully connected layers. When the text to be processed contains all 6 preset categories, 6 pieces of answer position information corresponding to the 6 categories can be determined from the text to be processed, giving 12 output sequences. When the preset categories contained in the text to be processed are age requirements and post responsibilities, 2 pieces of answer position information respectively corresponding to these two categories can be determined from the text to be processed; for example, the first fully connected layer can determine a start position output sequence and an end position output sequence of the answer corresponding to the age requirements category, and the third fully connected layer can determine a start position output sequence and an end position output sequence of the answer corresponding to the post responsibilities category.
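The per-category arrangement described above can be sketched as one small head per preset category. The sketch below is a pure-Python illustration with random, untrained weights; the class name CategoryHead, the sigmoid activation and the 0.5 threshold are assumptions, not details given in the embodiment.

```python
import math
import random
from typing import Dict, List, Tuple

Vector = List[float]
CATEGORIES = ["age requirements", "job skills", "post responsibilities",
              "required major", "academic requirements", "work experience"]


def dot(w: Vector, x: Vector) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class CategoryHead:
    """One fully connected head for one preset category (untrained weights)."""

    def __init__(self, dim: int, rng: random.Random) -> None:
        self.presence_w = [rng.uniform(-1, 1) for _ in range(dim)]
        self.start_w = [rng.uniform(-1, 1) for _ in range(dim)]
        self.end_w = [rng.uniform(-1, 1) for _ in range(dim)]

    def is_present(self, sentence_feature: Vector) -> bool:
        # Sentence feature decides whether the text contains this category.
        return sigmoid(dot(self.presence_w, sentence_feature)) >= 0.5

    def position_sequences(self, char_features: List[Vector]) -> Tuple[List[int], List[int]]:
        # Character features yield 0/1 start and end position output sequences.
        start = [int(sigmoid(dot(self.start_w, c)) >= 0.5) for c in char_features]
        end = [int(sigmoid(dot(self.end_w, c)) >= 0.5) for c in char_features]
        return start, end


def run_heads(sentence_feature: Vector, char_features: List[Vector],
              heads: Dict[str, CategoryHead]) -> Dict[str, Tuple[List[int], List[int]]]:
    """Return start/end sequences only for categories judged to be present."""
    return {cat: head.position_sequences(char_features)
            for cat, head in heads.items() if head.is_present(sentence_feature)}


rng = random.Random(0)
heads = {category: CategoryHead(dim=4, rng=rng) for category in CATEGORIES}
```

Because each category has its own head, the result is already grouped by preset category, which is what allows the later lookup by the category of the query statement.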
It should be understood by those skilled in the art that the setting of the preset categories is only an example of the present invention, and those skilled in the art can set different numbers of different preset categories according to actual situations, and the present invention is not limited herein.
In the embodiment of the application, the position information of a plurality of answers determined from the text to be processed can be classified through the multi-layer full connection layer, so that the prediction capability of the text processing model is improved, and the text processing model has better discrimination on different types of queries.
Step 205, inputting the sentence features into a first classification layer of the text processing model, and searching, by the first classification layer according to the preset category corresponding to the query sentence, the plurality of answer position information determined by the fully connected layer for the answer position information matched with the preset category corresponding to the query sentence, so as to determine the answer position information matched with the query sentence.
For example, the first classification layer may adopt a softmax layer. The softmax layer may obtain the preset category corresponding to the query statement; the sentence features are input into the softmax layer, and the softmax layer searches the plurality of answer position information determined by the plurality of fully connected layers for the answer position information matched with the preset category corresponding to the query statement, thereby determining the answer position information matched with the query statement. For example, when the category corresponding to the query statement is determined to be post responsibilities, the answer position information corresponding to the post responsibilities category is looked up among the 2 pieces of answer position information corresponding to the 2 determined categories, and is determined as the answer position information matched with the query statement.
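Once the answer position information is grouped by preset category, the search described above reduces to a lookup by the category of the query statement. The sketch below shows only that lookup, not the softmax layer itself; the function name select_answer_position and the short sequences are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (start position output sequence, end position output sequence)
AnswerPosition = Tuple[str, str]


def select_answer_position(
    query_category: str,
    positions_by_category: Dict[str, AnswerPosition],
) -> Optional[AnswerPosition]:
    """Return the answer position information whose preset category equals
    the query's category, or None if the text contains no such category."""
    return positions_by_category.get(query_category)


# Placeholder sequences for the two categories found in a text.
positions = {
    "age requirements": ("100000", "001000"),
    "post responsibilities": ("000100", "000001"),
}
matched = select_answer_position("post responsibilities", positions)
```

Returning None for a category that the text does not contain lets the caller distinguish "no answer in this text" from an empty answer.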
Step 206, determining a predicted answer text from the text to be processed according to the answer position information matched with the query statement.
Fig. 3 is a schematic diagram of a text processing model determining a predicted answer text according to an embodiment of the present invention. The text processing model may include a text extraction module, a normalization layer, a fully connected layer, and a first classification layer. The input of the text extraction module is the text to be processed, and the module is responsible for encoding the text to be processed to obtain the final vector representation of each token. The normalization layer can perform a normalization operation to accelerate the convergence of the text processing model. The fully connected layer can classify the text to be processed and output a start position output sequence and an end position output sequence of the answers corresponding to a plurality of preset categories. The first classification layer may search for the answer position information matched with the preset category corresponding to the query statement to determine the answer position information matched with the query statement.
For example, the text to be processed is "the test engineer mainly performs comprehensive verification on the product". The text extraction module may insert a [CLS] flag bit before the text, which can be used for a text classification task, and insert a [SEP] flag bit between two sentences of the input text to be processed for segmentation, which can be used for a sentence classification task. The start position output sequence 000000000100000000 and the end position output sequence 000000000000000100 for answers to the post name category, and the start position output sequence 010000000000000000 and the end position output sequence 0000010000000000 for answers to the post responsibility category can be determined by the fully connected layer. A "1" in the start position output sequence may represent the position of the start character of the answer in the text to be processed, and a "1" in the end position output sequence may represent the position of the end character of the answer in the text to be processed. The first classification layer determines that the preset category corresponding to the query sentence is post responsibility, and the predicted answer text "comprehensive verification of products" is determined from the text to be processed according to the start character and the end character of the answer corresponding to the post responsibility category.
In one example, the start position output sequence of the answer corresponding to one preset category may include a plurality of start characters, and the end position output sequence may include a plurality of end characters. In this case, the start position and the end position of the answer corresponding to the preset category can be determined according to the position values of the characters. For example, the fully connected layer may determine that the start position output sequence of the answer corresponding to the job skill category is 000001000000001000010000 and that the end position output sequence is 000000010000000001000001. The positions of the start characters in the start position output sequence are 6, 15, and 20, so the start character with the smallest position is determined as the start position of the answer; the positions of the end characters in the end position output sequence are 8, 18, and 24, so the end character with the smallest position is determined as the end position of the answer.
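The selection rule in this example can be sketched as follows. The sketch assumes the output sequences are given as strings of "0" and "1" and that positions are counted from 1, as in the example; the function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def one_positions(sequence: str) -> List[int]:
    """Return the 1-based positions of every '1' in an output sequence."""
    return [i for i, flag in enumerate(sequence, start=1) if flag == "1"]


def answer_span(start_seq: str, end_seq: str) -> Optional[Tuple[int, int]]:
    """Pick the smallest start position and the smallest end position."""
    starts = one_positions(start_seq)
    ends = one_positions(end_seq)
    if not starts or not ends:
        return None  # this category has no answer in the text
    return min(starts), min(ends)


def extract_answer(tokens: Sequence[str], span: Tuple[int, int]) -> List[str]:
    """Slice the tokens between the 1-based, inclusive start and end positions."""
    start, end = span
    return list(tokens[start - 1:end])
```

Applied to the job skill sequences above, `answer_span` returns positions 6 and 8.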
In the embodiment of the application, the text to be processed and the query statement are acquired, and the query statement is matched with the preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query statement. A pre-trained text processing model classifies the text to be processed to obtain answer position information corresponding to the preset categories contained in the text. The answer position information matched with the query statement is determined according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information, so that a predicted answer text can be determined from the text to be processed. Compared with a BERT-based MRC task, the embodiment of the application obtains answer position information for a plurality of preset categories from the text to be processed by classification, so that the extracted information is organized by preset category and the information matching the preset category of the query statement can be identified, which meets the user's need more accurately. Using the ALBERT model for encoding can shorten the time consumed in calling the model, which facilitates online deployment.
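The overall flow summarized above can be sketched in a few lines. This is only an illustration under stated assumptions: the embodiment does not specify here how the query statement is matched against the standard questions, so string similarity from Python's `difflib` is used as a stand-in, and the standard questions, the function names, and the per-category positions are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical standard questions, one per preset category.
STANDARD_QUESTIONS: Dict[str, str] = {
    "post name": "What is the post name?",
    "post responsibility": "What are the post responsibilities?",
}


def match_category(query: str, standard_questions: Dict[str, str]) -> str:
    """Return the category whose standard question is most similar to the query."""
    return max(
        standard_questions,
        key=lambda c: SequenceMatcher(None, query, standard_questions[c]).ratio(),
    )


def predict_answer(
    tokens: Sequence[str],
    query: str,
    standard_questions: Dict[str, str],
    positions_by_category: Dict[str, Tuple[int, int]],
) -> Optional[List[str]]:
    """Select the answer span whose category matches the query's category."""
    category = match_category(query, standard_questions)
    if category not in positions_by_category:
        return None  # the text contains no answer of this category
    start, end = positions_by_category[category]  # 1-based, inclusive
    return list(tokens[start - 1:end])
```

Here `positions_by_category` stands for the answer position information that the model outputs for each preset category; the query only decides which entry is read.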
Referring to fig. 4, a flowchart of a training method of a text processing model in an embodiment of the present application is shown, where the training method of the text processing model includes:
Step 401, acquiring training data, wherein the training data comprises a training text and a text label corresponding to the training text; the text label comprises a standard question corresponding to a preset category and real answer information matched with the standard question.
The text labels corresponding to the training texts can be obtained by manually acquiring the texts to be processed, and the text labels can be used for assisting in training the model. The text labels may include standard questions corresponding to a plurality of preset categories and real answer information matched with the standard questions corresponding to the plurality of preset categories.
Step 402, using a text processing model to obtain, by classification, answer position information corresponding to each preset category from the training text; determining the answer position information matched with the standard question according to the preset category corresponding to the standard question and the preset category corresponding to the answer position information; determining a predicted answer text according to the answer position information matched with the standard question; and determining a real answer segment in the training text that conforms to the real answer information.
In an alternative embodiment, the real answer information includes real answer text.
In an alternative embodiment, the text processing model includes a text extraction module and a second classification layer.
In an alternative embodiment, the step 402 may comprise the following sub-steps S4021-S4023:
Sub-step S4021, inputting the training text into the text extraction module, and encoding the training text by the text extraction module to obtain sentence features.
Sub-step S4022, inputting the sentence features into the second classification layer, and judging, by the second classification layer, whether each word in the training text appears in the real answer text, to obtain a judgment result for each word in the training text.
Sub-step S4023, determining, according to the judgment result, a real answer segment composed of the words appearing in the real answer text.
Illustratively, the second classification layer may be a sigmoid layer: the sentence features may be input into the sigmoid layer, which performs binary classification on each word in the training text. Specifically, it may be determined whether each word in the training text appears in the real answer text; if it does, the label of the word is 1, and if it does not, the label of the word is 0. For example, the training text may include "graduates who need financial specialties" and the real answer text is "financial specialties"; after binary classification of each word, the label sequence "0011110000" is obtained, and it may be determined that the real answer segment in the training text is "financial specialties".
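The labeling in this example can be sketched as follows. The sketch assumes that the labels mark the contiguous span where the real answer text occurs in the training text, which is how the example label sequence reads, and it works on characters of a plain string; the function names and the sample strings in the usage note are hypothetical.

```python
def label_answer_span(text: str, answer: str) -> str:
    """Label each character 1 if it lies in the first occurrence of the answer, else 0."""
    start = text.find(answer)
    if start < 0 or not answer:
        return "0" * len(text)  # the answer does not occur in the text
    end = start + len(answer)
    return "".join("1" if start <= i < end else "0" for i in range(len(text)))


def segment_from_labels(text: str, labels: str) -> str:
    """Recover the answer segment made of the characters labeled 1."""
    return "".join(ch for ch, label in zip(text, labels) if label == "1")
```

For a ten-character text such as "abcdefghij" with the answer "cdef", `label_answer_span` yields "0011110000", the same pattern as in the example above, and `segment_from_labels` recovers "cdef" from it.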
In the embodiment of the application, binary classification is performed on each word in the training text when modeling the text, which helps improve the prediction capability of the text processing model.
Step 403, determining a loss function value according to the answer position information matched with the standard question, the predicted answer text, the real answer segment and the text label.
In an optional embodiment, the real answer information further includes real answer position information and a preset category corresponding to the real answer position information.
The real answer position information may be used to represent a position range of the real answer text in the training text, and may include a start position output sequence and an end position output sequence.
In an alternative embodiment, the step 403 may comprise the following sub-steps S4031-S4035:
Sub-step S4031, comparing the preset category corresponding to the answer position information matched with the standard question with the preset category corresponding to the real answer position information, and determining the answer classification loss.
By comparing the preset category corresponding to the answer position information matched with the standard question with the preset category corresponding to the real answer position information, the answer classification loss class_loss can be determined.
Sub-step S4032, comparing the answer position information matched with the standard question with the real answer position information, and determining the position loss.
By comparing the answer position information matched with the standard question with the real answer position information, the start position loss start_loss and the end position loss end_loss can be determined.
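The text does not state which loss form is used for the position comparison. As one possible illustration only, the sketch below assumes binary cross-entropy between the predicted per-position probabilities and the real 0/1 position sequence; the function name and the choice of loss are assumptions, not part of the embodiment.

```python
import math
from typing import Sequence


def binary_cross_entropy(probs: Sequence[float], targets: str, eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between predicted probabilities and a 0/1 string.

    Assumed loss form: the embodiment only says the sequences are compared.
    """
    if not probs or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= math.log(p) if t == "1" else math.log(1.0 - p)
    return total / len(probs)
```

Under this assumption, start_loss would be obtained by calling the function with the predicted start probabilities and the real start position sequence, and end_loss in the same way with the end sequences.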
Sub-step S4033, for each word in the predicted answer text, judging whether the word appears in the real answer segment, and determining the word classification loss.
By judging whether each word in the predicted answer text appears in the real answer segment, the word classification loss class_loss can be determined.
Sub-step S4034, comparing the predicted answer text with the real answer text, and determining the evaluation loss.
By comparing the predicted answer text with the real answer text, the evaluation loss loss_rouge can be determined. Specifically, after the predicted answer text and the real answer text are obtained, the correctness of the answer can be evaluated using ROUGE-L (Recall-Oriented Understudy for Gisting Evaluation, Longest Common Subsequence), and loss_rouge can be determined according to the ROUGE-L score. ROUGE-L is a metric used for evaluating machine translation and article summarization; the corresponding ROUGE-L score is calculated by comparing the predicted answer text with the real answer text and is used to measure the similarity between the predicted answer and the real answer.
In the embodiment of the application, by dynamically predicting the ROUGE-L score between the real answer and the predicted answer, the text processing model can acquire the capability of judging the quality of the extracted predicted answer. This helps improve the prediction capability of the text processing model and gives the model better discrimination between different types of queries in actual prediction.
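The ROUGE-L score mentioned above is based on the longest common subsequence (LCS) of the two texts. The following minimal sketch computes the usual ROUGE-L F-measure over two sequences of tokens or characters. How the score is turned into loss_rouge is not specified in the text; taking one minus the score is an assumption made here for illustration, and the function names are hypothetical.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    previous = [0] * (len(b) + 1)
    for item_a in a:
        current = [0]
        for j, item_b in enumerate(b, start=1):
            if item_a == item_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(predicted: Sequence[str], reference: Sequence[str], beta: float = 1.0) -> float:
    """ROUGE-L F-measure from LCS-based precision and recall."""
    lcs = lcs_length(predicted, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(predicted)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


def rouge_loss(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Assumed mapping from score to loss: a perfect match gives zero loss."""
    return 1.0 - rouge_l(predicted, reference)
```

With `beta` equal to 1 the score is the harmonic mean of precision and recall, so a predicted answer that is a strict prefix of the real answer is penalized through recall only.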
Sub-step S4035, determining a loss function value according to the position loss, the answer classification loss, the word classification loss, and the evaluation loss.
The overall loss function value may be obtained as a weighted sum of the individual losses; specifically, the overall loss function value is loss = start_loss + end_loss + 0.5 × class_loss + loss_rouge.
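The weighted sum can be written out as a short sketch. The parameter names follow the loss names used in the text, with `loss_rouge` standing for the evaluation loss; the 0.5 weight on the classification loss is the one stated above, and exposing it as a keyword argument is an illustrative choice.

```python
def total_loss(
    start_loss: float,
    end_loss: float,
    class_loss: float,
    loss_rouge: float,
    class_weight: float = 0.5,
) -> float:
    """Overall loss: position losses plus weighted classification loss plus evaluation loss."""
    return start_loss + end_loss + class_weight * class_loss + loss_rouge
```

For example, with start_loss 1.0, end_loss 2.0, class_loss 4.0 and an evaluation loss of 0.5, the overall value is 5.5.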
Step 404, adjusting the text processing model parameters according to the loss function values to train the text processing model.
According to the loss function value, parameters of the text processing model can be adjusted, and the text processing model is trained.
In the embodiment of the application, the model is trained in a multi-task manner: the output losses of the tasks are summed with certain weights to compute the overall loss function value, the overall loss is used as the optimization objective, and the parameters of the model are adjusted to train it. This can improve the text processing model's perception of the boundaries and specific content of an answer and give the text processing model better generalization capability.
Fig. 5 is a schematic diagram illustrating a training process of a text processing model according to an embodiment of the present invention. The text processing model may include a text processing module, a normalization layer, a fully connected layer, a first classification layer, and a second classification layer. The input of the text processing module is a training text; the module is responsible for encoding the training text to obtain the final vector representation of each token. The normalization layer may perform a normalization operation to accelerate the convergence of the text processing model. The fully connected layer may classify the training text and output a start position output sequence and an end position output sequence of the answers corresponding to a plurality of preset categories. The first classification layer may search for the answer position information matching the preset category corresponding to a preset standard question, so as to determine the answer position information matched with the standard question. The predicted answer text can then be determined according to the answer position information matched with the standard question. The second classification layer may judge whether each word in the training text appears in the real answer text, so as to determine the real answer segment in the training text that is composed of the words appearing in the real answer text.
And comparing answer position information corresponding to a preset category, answer position information matched with the standard question, a predicted answer text, a real answer segment and the text label to determine a loss function value so as to adjust parameters of the text processing model and train the text processing model.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 6, a block diagram of a text processing apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
An obtaining module 601, configured to obtain a text to be processed and a query statement.
A query statement matching module 602, configured to match the query statement with preset standard questions corresponding to multiple preset categories, so as to obtain a preset category corresponding to the query statement.
The model processing module 603 is configured to use a pre-trained text processing model to classify the text to be processed to obtain answer position information corresponding to a preset category included in the text to be processed; and determining answer position information matched with the query statement according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information.
And the answer determining module 604 is configured to determine a predicted answer text from the text to be processed according to the answer position information matched with the query statement.
In an alternative embodiment, the text processing model comprises a text extraction module and a full connection layer; the model processing module may include:
The encoding sub-module is used for inputting the text to be processed into a text extraction module of the text processing model, and the text extraction module encodes the text to be processed to obtain sentence features and character features.
The answer position information determining submodule is used for inputting the character features and the sentence features into a full connection layer of the text processing model, and the full connection layer classifies the text to be processed according to the sentence features and a plurality of preset categories and determines the preset categories contained in the text to be processed; and determining a plurality of answer position information corresponding to preset categories contained in the text to be processed from the text to be processed by the full connection layer according to the character features.
In an alternative embodiment, the text processing model includes a first classification layer; the model processing module may include:
The answer position information matching sub-module is used for inputting the sentence features into a first classification layer of the text processing model; according to the preset category corresponding to the query statement, the first classification layer searches, among the plurality of answer position information determined by the full connection layer, for the answer position information matching that preset category, and thereby determines the answer position information matched with the query statement.
In an alternative embodiment, the text processing model is trained by:
The training data acquisition module is used for acquiring training data, and the training data comprises a training text and a text label corresponding to the training text; the text label comprises a standard question corresponding to a preset category and real answer information matched with the standard question.
The model training module is used for obtaining, by classification with a text processing model, answer position information corresponding to each preset category from the training text; determining the answer position information matched with the standard question according to the preset category corresponding to the standard question and the preset category corresponding to the answer position information; determining a predicted answer text according to the answer position information matched with the standard question; and determining a real answer segment in the training text that conforms to the real answer information.
And the loss function determining module is used for determining a loss function value according to the answer position information matched with the standard question, the predicted answer text, the real answer segment and the text label.
And the model parameter adjusting module is used for adjusting the text processing model parameters according to the loss function values so as to train the text processing model.
In an alternative embodiment, the real answer information includes real answer text; the model training module may include:
The training text encoding sub-module is used for inputting the training text into the text extraction module, and the text extraction module encodes the training text to obtain sentence features.
And the classification submodule is used for inputting the sentence characteristics into the second classification layer, judging whether each character in the training text appears in the real answer text or not by the second classification layer, and obtaining the judgment result of each character in the training text.
And the answer segment determining submodule is used for determining a real answer segment consisting of characters appearing in the real answer text according to the judgment result.
In an optional embodiment, the real answer information further includes real answer position information and a preset category corresponding to the real answer position information; the loss function determination module may include:
The answer classification loss determining sub-module is used for comparing the preset category corresponding to the answer position information matched with the standard question with the preset category corresponding to the real answer position information to determine the answer classification loss.
And the position loss determining submodule is used for comparing the answer position information matched with the standard question with the real answer position information to determine the position loss.
And the word classification loss determining submodule is used for judging whether the word in the text of the predicted answer appears in the real answer segment or not for each word in the predicted answer, and determining the word classification loss.
And the evaluation loss determining submodule is used for comparing the predicted answer text with the real answer text to determine the evaluation loss.
And the loss function value determining submodule is used for determining a loss function value according to the position loss, the answer classification loss, the word classification loss and the evaluation loss.
In the embodiment of the application, the text to be processed and the query statement are acquired, and the query statement is matched with the preset standard questions corresponding to a plurality of preset categories to obtain the preset category corresponding to the query statement. A pre-trained text processing model classifies the text to be processed to obtain answer position information corresponding to the preset categories contained in the text. The answer position information matched with the query statement is determined according to the preset category corresponding to the query statement and the preset category corresponding to the answer position information, so that a predicted answer text can be determined from the text to be processed. Compared with a BERT-based MRC task, the embodiment of the application obtains answer position information for a plurality of preset categories from the text to be processed by classification, so that the extracted information is organized by preset category and the information matching the preset category of the query statement can be identified, which meets the user's need more accurately. Using the ALBERT model for encoding can shorten the time consumed in calling the model, which facilitates online deployment.
Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
An embodiment of the present invention further provides an electronic device, including: a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above text processing method embodiment and can achieve the same technical effect; to avoid repetition, details are not described here again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements each process of the text processing method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The text processing method, the text processing apparatus, the electronic device, and the storage medium according to the present invention are described in detail above, and a specific example is applied in the text to explain the principle and the implementation of the present invention, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.