CN112417105A - Question and answer processing method and device, storage medium and electronic equipment - Google Patents

Question and answer processing method and device, storage medium and electronic equipment Download PDF

Info

Publication number
CN112417105A
Authority
CN
China
Prior art keywords
answer
candidate
training
question
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011114806.5A
Other languages
Chinese (zh)
Other versions
CN112417105B (en)
Inventor
杨正良
刘设伟
陈利琴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd, Taikang Online Property Insurance Co Ltd filed Critical Taikang Insurance Group Co Ltd
Priority to CN202011114806.5A priority Critical patent/CN112417105B/en
Publication of CN112417105A publication Critical patent/CN112417105A/en
Application granted granted Critical
Publication of CN112417105B publication Critical patent/CN112417105B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/33: Querying
    • G06F 16/332: Query formulation
    • G06F 16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a question-answer processing method, which comprises the following steps: obtaining a user question; screening, based on a text screening model, candidate paragraphs relevant to the user question from a target text corresponding to the user question; generating, based on an answer generation model, a plurality of candidate answers matching the user question in at least one candidate paragraph; ranking the candidate answers based on an answer ranking model to obtain a ranking result; and selecting the target answer corresponding to the user question from the candidate answers according to the candidate paragraphs, the candidate answers and the ranking result. The input ends to output ends of the three parts of text screening, answer generation and answer ranking are directly connected and jointly trained in a unified manner, which avoids the problem that a system whose parts are trained with inconsistent training targets is difficult to bring to optimal performance. Through the three steps of text screening, answer generation and answer ranking, the finally obtained answer is more accurate and reasonable, and the efficiency of question-answering processing is improved.

Description

Question and answer processing method and device, storage medium and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a question and answer processing method, a question and answer processing apparatus, a storage medium, and an electronic device.
Background
Machine reading comprehension is a new QA (Question Answering) approach. Its task is as follows: sample data is given, where each item comprises a text material, a question, the answer corresponding to the question, and the position of the answer within the text material. The goal is to build a reading comprehension model on the given training set such that the model can answer questions about the text materials in the test set, the quality of the answers being judged by a given evaluation index: the higher the score on the evaluation index, the better the quality of the answers, and the final model is evaluated according to the score of the evaluation index.
Fig. 1 is a schematic diagram of a knowledge-base-based question-answer processing flow. A traditional customer service robot adopts a retrieve-match-rank process: first, an FAQ knowledge base system is constructed; then, FAQ question-answer pairs relevant to the user question are retrieved from the knowledge base by a retrieval algorithm; finally, candidate answers are found by a semantic matching algorithm, and the best answer is selected by recall ranking. This approach requires a great deal of manpower to build the knowledge base, and is mostly based on a single document, so the answer may be incomplete and it is difficult to ensure that it is the best answer.
Disclosure of Invention
In view of the above problems, a question and answer processing method, apparatus, storage medium and electronic device are proposed to solve the problems that a large amount of manpower is needed to construct a knowledge base and that answering is mostly based on a single document, so that answers may be incomplete and it is difficult to ensure that an answer is the best answer.
According to an aspect of the present invention, there is provided a question-answer processing method including:
acquiring a user question;
screening at least one candidate paragraph related to a user question from a target text corresponding to the user question based on a text screening model;
generating a plurality of candidate answers in the at least one candidate passage matching the user question based on an answer generation model;
based on an answer ranking model, ranking the candidate answers to obtain a ranking result of the candidate answers;
selecting a target answer corresponding to the user question from the candidate answers according to the at least one candidate paragraph, the candidate answers and the ranking result;
wherein the text screening model, the answer generation model and the answer ranking model are jointly trained.
Optionally, before the obtaining the user question, the method further includes:
screening at least one candidate paragraph related to a training question from a training text corresponding to the training question by using a text screening model to be trained;
generating a plurality of candidate answers matched with the training question in the at least one candidate paragraph by using an answer generation model to be trained;
ranking the candidate answers by using an answer ranking model to be trained to obtain a ranking result of the candidate answers;
and performing joint training on the text screening model to be trained, the answer generation model to be trained and the answer ranking model to be trained on the basis of the candidate answers and the ranking results.
Optionally, before the screening, by using the text screening model to be trained, of at least one candidate paragraph related to the training question from the training text corresponding to the training question, the method further includes:
and searching for a training text corresponding to the training question from a plurality of preprocessed documents.
Optionally, before the screening, by using the text screening model to be trained, of at least one candidate paragraph related to the training question from the training text corresponding to the training question, the method further includes:
segmenting the training text in a sliding window mode to obtain a plurality of sub-texts;
searching for a preset number of sub-texts related to the training question from the plurality of sub-texts;
and encoding the training question and the preset number of sub-texts to obtain the corresponding context vectors.
Optionally, the screening, by using the text screening model to be trained, at least one candidate paragraph related to the training question from the training text corresponding to the training question includes:
utilizing the text screening model to be trained to calculate correlation data between a plurality of paragraphs in the training text and the training question according to the context vector;
and screening at least one candidate paragraph related to the training question according to the correlation data.
Optionally, the generating, by using an answer generation model to be trained, a plurality of candidate answers in the at least one candidate passage that match the training question includes:
matching a plurality of answers in the at least one candidate paragraph with the training question according to the context vector by using the answer generation model to be trained to obtain a starting matching degree corresponding to a starting index and an ending matching degree corresponding to an ending index of the plurality of answers;
and generating a plurality of candidate answers matched with the training questions according to the starting matching degree and the ending matching degree.
Optionally, the ranking the multiple candidate answers by using the answer ranking model to be trained to obtain the ranking result of the multiple candidate answers includes:
removing candidate answers which are overlapped with other candidate answers from the multiple candidate answers to obtain target candidate answers;
calculating question-answer matching degree between the target candidate answer and the training question and answer matching degree between the target candidate answer and a marked answer according to the context vector by using the answer ranking model to be trained;
and ranking the candidate answers according to the question-answer matching degree and the answer matching degree to obtain a ranking result of the candidate answers.
According to another aspect of the present invention, there is provided a question and answer processing apparatus including:
the question acquisition module is used for acquiring user questions;
the screening module is used for screening at least one candidate paragraph related to the user question from a target text corresponding to the user question based on a text screening model;
an answer generation module, configured to generate a plurality of candidate answers in the at least one candidate passage that match the user question based on an answer generation model;
the ranking module is used for ranking the candidate answers based on an answer ranking model to obtain a ranking result of the candidate answers;
the answer selecting module is used for selecting a target answer corresponding to the user question from the candidate answers according to the at least one candidate paragraph, the candidate answers and the ranking result;
wherein the text screening model, the answer generation model and the answer ranking model are jointly trained.
Optionally, the apparatus further comprises:
the paragraph screening module is used for screening at least one candidate paragraph related to the training question from the training text corresponding to the training question by using a text screening model to be trained before the user question is obtained;
the answer generation module is used for generating a plurality of candidate answers matched with the training questions in the at least one candidate paragraph by using an answer generation model to be trained;
the ranking module is used for ranking the candidate answers by using an answer ranking model to be trained to obtain a ranking result of the candidate answers;
and the joint training module is used for performing joint training on the text screening model to be trained, the answer generation model to be trained and the answer ranking model to be trained on the basis of the candidate answers and the ranking results.
Optionally, the apparatus further comprises:
and the text searching module is used for searching for the training text corresponding to the training question from a plurality of preprocessed documents before at least one candidate paragraph related to the training question is screened from the training text corresponding to the training question by using the text screening model to be trained.
Optionally, the apparatus further comprises:
the segmentation module is used for segmenting the training text in a sliding window mode to obtain a plurality of sub-texts before screening at least one candidate paragraph related to the training question from the training text corresponding to the training question by using the text screening model to be trained;
the text searching module is used for searching for a preset number of sub-texts related to the training question from the plurality of sub-texts;
and the encoding module is used for encoding the training question and the preset number of sub-texts to obtain the corresponding context vectors.
Optionally, the paragraph screening module includes:
the data calculation submodule is used for calculating correlation data between a plurality of paragraphs in the training text and the training question according to the context vector by utilizing the text screening model to be trained;
and the paragraph screening submodule is used for screening at least one candidate paragraph related to the training question according to the correlation data.
Optionally, the answer generating module includes:
the matching sub-module is used for matching a plurality of answers in the at least one candidate paragraph with the training question according to the context vector by utilizing the answer generation model to be trained to obtain a starting matching degree corresponding to a starting index and an ending matching degree corresponding to an ending index of the plurality of answers;
and the answer generation submodule is used for generating a plurality of candidate answers matched with the training questions according to the starting matching degree and the ending matching degree.
Optionally, the sorting module includes:
the eliminating sub-module is used for eliminating candidate answers which are overlapped with other candidate answers in the plurality of candidate answers to obtain target candidate answers;
the calculation sub-module is used for calculating question and answer matching degrees between the target candidate answers and the training questions and answer matching degrees between the target candidate answers and the marked answers according to the context vectors by using the answer ranking model to be trained;
and the ranking submodule is used for ranking the candidate answers according to the question-answer matching degree and the answer matching degree to obtain a ranking result of the candidate answers.
According to another aspect of the present invention, there is provided a storage medium comprising a stored program, wherein the program, when executed, controls an apparatus on which the storage medium is located to perform one or more of the methods as described above.
In accordance with another aspect of the present invention, there is provided an electronic device including: a memory, a processor, and executable instructions stored in the memory and executable on the processor, wherein the processor implements one or more of the methods described above when executing the executable instructions.
According to the embodiment of the invention, at least one candidate paragraph related to a user question is screened from a target text corresponding to the user question based on a text screening model; a plurality of candidate answers matching the user question are generated in the at least one candidate paragraph based on an answer generation model; the plurality of candidate answers are ranked based on an answer ranking model to obtain a ranking result; and a target answer corresponding to the user question is selected from the plurality of candidate answers according to the at least one candidate paragraph, the plurality of candidate answers and the ranking result, where the text screening model, the answer generation model and the answer ranking model are jointly trained. The input ends to output ends of the three parts of text screening, answer generation and answer ranking are thus directly connected by a neural network and jointly trained in a unified manner, which avoids the situation where the trained system cannot reach optimal performance because the training targets of its parts are inconsistent. Through the three steps of text screening, answer generation and answer ranking, the finally obtained answers are more accurate and reasonable, and the efficiency of question-answering processing is improved.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic diagram of a knowledge-base based question-answering process flow;
fig. 2 is a flowchart of a question-answering processing method according to a first embodiment of the present invention;
fig. 3 is a flowchart of a question-answering processing method according to a second embodiment of the present invention;
fig. 4 is a block diagram of a question answering processing apparatus according to a third embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Example one
Referring to fig. 2, a flowchart of a question and answer processing method in the first embodiment of the present invention is shown, which may specifically include:
step 101, user questions are obtained.
In the embodiment of the invention, the user question is acquired in the application stage of the model. For example, in a customer service system for insurance services, a user submits a question about the insurance service.
Step 102, based on the text screening model, screening at least one candidate paragraph related to the user question from the target text corresponding to the user question.
In the embodiment of the present invention, the text screening model may be any type of network model that can implement text screening after being trained, the answer generation model may be any type of network model that can implement answer generation after being trained, and the answer ranking model may be any type of network model that can implement answer ranking after being trained.
In the embodiment of the invention, the text screening model, the answer generation model and the answer ranking model are jointly trained. An end-to-end training strategy is employed during the training phase, rather than training each component individually, so that downstream components can benefit from high quality upstream output (e.g., retrieved candidate paragraphs) during training.
In the embodiment of the invention, the candidate paragraphs are screened based on the trained text screening model. The user question and the corresponding target text are input into a text screening model, and the text screening model can screen at least one candidate paragraph related to the user question from the target text. For example, the target text is obtained by preprocessing the original text and then performing a multi-field cross search.
Step 103, generating a plurality of candidate answers matched with the user question in the at least one candidate paragraph based on an answer generation model.
In the embodiment of the invention, the candidate answers are generated based on the trained answer generation model. The candidate paragraphs obtained in the previous step and the user question are input into the answer generation model, which generates a plurality of candidate answers matching the user question in the at least one candidate paragraph.
Step 104, ranking the candidate answers based on an answer ranking model to obtain a ranking result of the candidate answers.
In the embodiment of the invention, the plurality of candidate answers are ranked based on the trained answer ranking model. The plurality of candidate answers are input into the answer ranking model, which ranks them to obtain a ranking result of the plurality of candidate answers.
Step 105, selecting a target answer corresponding to the user question from the candidate answers according to the at least one candidate paragraph, the candidate answers and the ranking result.
In the embodiment of the invention, the text screening model, the answer generation model and the answer ranking model are jointly trained. After the three models have processed the input, the final target answer is selected by comparing the scores each part assigns to the same input: for each candidate answer, the score of the candidate paragraph containing it in the text screening model, its score in the answer generation model, and its score in the answer ranking model are obtained; the three scores are combined by weighted addition, and the candidate answer with the highest total score is selected as the target answer.
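A minimal sketch of this selection step (the weight values and score field names are illustrative assumptions; the patent does not fix them):

```python
# Hypothetical field names; the patent only specifies a weighted sum of three scores.
def select_target_answer(candidates, w_retrieval=1.0, w_reader=1.0, w_ranker=1.0):
    """Pick the answer whose weighted total score is highest.

    `candidates` is a list of dicts holding the three per-module scores:
    the containing paragraph's score from the text screening model, the span
    score from the answer generation model, and the answer ranking model's score.
    """
    def total(c):
        return (w_retrieval * c["paragraph_score"]
                + w_reader * c["span_score"]
                + w_ranker * c["rank_score"])
    return max(candidates, key=total)["answer"]
```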
According to the embodiment of the invention, a user question is obtained; at least one candidate paragraph related to the user question is screened from a target text corresponding to the user question based on a text screening model; a plurality of candidate answers matching the user question are generated in the at least one candidate paragraph based on an answer generation model; the plurality of candidate answers are ranked based on an answer ranking model to obtain a ranking result; and a target answer corresponding to the user question is selected from the plurality of candidate answers according to the at least one candidate paragraph, the plurality of candidate answers and the ranking result. The input ends to output ends of the three parts of text screening, answer generation and answer ranking are directly connected by a neural network and jointly trained in a unified manner, which avoids the problem that the trained system is difficult to bring to optimal performance because the training targets of its parts are inconsistent. Through the three steps of text screening, answer generation and answer ranking, the finally obtained answers are more accurate and reasonable, and the efficiency of question-answering processing is improved.
Example two
Referring to fig. 3, a flowchart of a question and answer processing method in the second embodiment of the present invention is shown, which may specifically include:
step 201, using a text screening model to be trained, screening at least one candidate paragraph related to a training question from a training text corresponding to the training question.
In the embodiment of the invention, parameter setting is carried out on an initial text screening model to be used as the text screening model to be trained; setting parameters of an initial answer generation model to serve as the answer generation model to be trained; and setting parameters of an initial answer ranking model to serve as the answer ranking model to be trained. The text screening model can be any type of network model which can realize text screening after being trained, the answer generating model can be any type of network model which can realize answer generation after being trained, and the answer ranking model can be any type of network model which can realize answer ranking after being trained.
In an optional embodiment of the present invention, before the screening at least one candidate paragraph related to a training question from a training text corresponding to the training question by using a text screening model to be trained, the method may further include: and searching a training text corresponding to the training problem from a plurality of preprocessed documents.
A preprocessed document is a document obtained by preprocessing an original document. Original documents contain a large amount of noisy text, and the preprocessing consists of data cleaning: for example, stray Unicode characters, empty paragraphs and URL links are deleted from the original document to obtain the preprocessed document.
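A minimal sketch of the cleaning step, assuming the noise classes are URLs, control and zero-width characters, and empty paragraphs (the patent does not spell out the exact character classes):

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Assumed notion of "stray Unicode characters": control and zero-width code points.
CONTROL_RE = re.compile(r"[\u0000-\u0008\u000b-\u001f\u200b-\u200f\ufeff]")

def clean_document(text: str) -> str:
    text = URL_RE.sub("", text)        # drop URL links
    text = CONTROL_RE.sub("", text)    # drop stray Unicode characters
    paragraphs = [p.strip() for p in text.split("\n")]
    return "\n".join(p for p in paragraphs if p)  # drop empty paragraphs
```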
Then, the document segments are located: for the preprocessed documents, a set of candidate documents is recalled as the training text according to the training question. For example, the open-source ElasticSearch framework (a highly scalable open-source full-text search engine) is adopted, and a multi-field cross retrieval mode is used to obtain the training text corresponding to the training question from the preprocessed documents.
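A hedged sketch of multi-field cross retrieval with ElasticSearch (the index and field names are illustrative, and the 8.x Python client is assumed; the patent names only the engine and the multi-field cross retrieval mode):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def retrieve_training_texts(question: str, size: int = 10):
    # "preprocessed_docs", "title" and "body" are hypothetical names.
    resp = es.search(
        index="preprocessed_docs",
        query={
            "multi_match": {
                "query": question,
                "type": "cross_fields",      # cross-field matching
                "fields": ["title", "body"],
            }
        },
        size=size,
    )
    return [hit["_source"] for hit in resp["hits"]["hits"]]
```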
In an alternative embodiment of the present invention, the retrieved training texts may be pruned to reduce the search space. Specifically, this comprises two parts: the first part is a segment encoding (Segment Encoding) module, and the second part is a retrieval (retriever) module.
In one implementation, before at least one candidate paragraph related to the training question is screened from the training text corresponding to the training question by using the text screening model to be trained, the method may further include: segmenting the training text in a sliding-window manner to obtain a plurality of sub-texts; searching for a preset number of sub-texts related to the training question from the plurality of sub-texts; and encoding the training question and the preset number of sub-texts to obtain the corresponding context vectors.
The training text is divided into a plurality of overlapping sub-texts in a sliding-window manner, and the training question and the sub-texts are then encoded with a pre-trained Transformer model.
For the segmentation coding module, the specific steps are as follows:
Firstly, for each training text, the cosine distance between the training question and each sub-text is calculated based on TF-IDF (Term Frequency-Inverse Document Frequency), and the k sub-texts with the smallest cosine distance, namely the preset number of sub-texts most related to the training question, are selected. These sub-texts are sorted according to their positions in the training text and merged into a new pruned document. The segmentation is performed with a window size of 1 and a step size of r, and the document d is finally divided into a plurality of sub-texts C = {c_1, c_2, ...}.
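A sketch of the sliding-window segmentation and TF-IDF selection (the segmentation unit and the parameter values here are assumptions; k is the preset number of sub-texts):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances

def sliding_window_split(sentences, window=3, stride=1):
    """Split a document (given as a list of sentences) into overlapping sub-texts."""
    return [" ".join(sentences[i:i + window])
            for i in range(0, max(len(sentences) - window + 1, 1), stride)]

def select_subtexts(question, subtexts, k=5):
    """Keep the k sub-texts closest to the question by TF-IDF cosine distance,
    then restore their original document order before merging."""
    vec = TfidfVectorizer()
    mat = vec.fit_transform([question] + subtexts)
    dists = cosine_distances(mat[0], mat[1:])[0]
    top = sorted(range(len(subtexts)), key=lambda i: dists[i])[:k]
    return [subtexts[i] for i in sorted(top)]  # merge in document order
```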
Then, encoding is performed with the Transformer model to obtain a representation c of the training text; a representation q of the training question is obtained in the same way. The final paragraph representation is [[CLS]; q; [SEP]; c; [SEP]], where CLS denotes the classification flag and SEP is the flag separating sentences.
Given an input X = (x_1, x_2, ..., x_{L_x}) of length L_x, for each token in X the input representation is the sum of its word, type and position embeddings, resulting in an input embedding h_0 ∈ R^{L_x × D_h}, where D_h is the hidden layer size.
Finally, the input embeddings are projected through the series of Transformer blocks into a series of context vectors: h_i = TransformerBlock(h_{i-1}), i = 1, ..., I, where h_i represents the hidden state of the i-th pre-trained Transformer block.
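A sketch of the [[CLS]; q; [SEP]; c; [SEP]] encoding with a pre-trained Transformer (the bert-base-chinese checkpoint and the HuggingFace API are assumptions; the patent specifies only a pre-trained Transformer model):

```python
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")

def encode(question: str, subtext: str) -> torch.Tensor:
    # The tokenizer builds [CLS] question [SEP] subtext [SEP]; token-type
    # embeddings distinguish the two segments.
    inputs = tokenizer(question, subtext, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state  # context vectors, shape (1, L_x, D_h)
```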
In an optional embodiment of the present invention, an implementation of screening at least one candidate paragraph related to the training question from the training text corresponding to the training question by using the text screening model to be trained may include: calculating, by using the text screening model to be trained, correlation data between a plurality of paragraphs in the training text and the training question according to the context vectors; and screening at least one candidate paragraph related to the training question according to the correlation data. The correlation data is used to characterize the magnitude of the correlation between a paragraph and the training question.
The retrieval module comprises the following specific steps:
although pruning of a document has been done once before, there are still many paragraphs (i.e., sub-text) after the selected document is divided into paragraphs. To improve the efficiency of the coding and to make the downstream model focus more on important parts of the text, the invention employs a mechanism called Early-stopped. For J blockWherein J < I, a score (i.e., correlation data) scorer is calculated by a fixed size vector with a weighted self-aligned layer followed by a multi-layer perceptronr∈R2The concrete formula is as follows:
Figure BDA0002729228520000102
wherein, wμ,wr,WrIs the parameter to be learned, hJAnd
Figure BDA0002729228520000103
both represent the hidden state of the jth pre-trained Transformer block. After the scores of all paragraphs are obtained, the top N paragraphs in each instance are passed to subsequent blocks, and the remaining paragraphs are discarded. Here, N is relatively small, and therefore, the model may focus on reading the most relevant context. To train the search module, the scores are normalized and the objective function of the search model is defined as:
Figure BDA0002729228520000104
wherein,
Figure BDA0002729228520000105
is a one-hot (one-hot coded) tag that indicates whether the current paragraph contains at least one fully matched, substantially true answer text.
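A sketch of the early-stopped retriever head, assuming the weighted self-alignment plus MLP form described above (the layer sizes and the ReLU activation are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Retriever(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.w_mu = nn.Linear(hidden_size, 1)  # self-alignment weights w_mu
        self.mlp = nn.Sequential(nn.Linear(hidden_size, hidden_size),
                                 nn.ReLU(),
                                 nn.Linear(hidden_size, 2))  # score_r in R^2

    def forward(self, h_j: torch.Tensor) -> torch.Tensor:
        # h_j: (batch, seq_len, hidden), hidden states of the J-th block
        alpha = F.softmax(self.w_mu(h_j).squeeze(-1), dim=-1)  # (batch, seq_len)
        pooled = torch.einsum("bs,bsh->bh", alpha, h_j)        # weighted self-aligned vector
        return self.mlp(pooled)                                # (batch, 2)

def top_n_paragraphs(scores: torch.Tensor, n: int):
    """Pass only the top-N paragraphs to subsequent blocks, discard the rest."""
    relevance = F.softmax(scores, dim=-1)[:, 1]  # probability "contains an answer"
    return torch.topk(relevance, k=min(n, relevance.numel())).indices
```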
Step 202, generating a plurality of candidate answers matched with the training question in the at least one candidate paragraph by using an answer generation model to be trained.
In the embodiment of the invention, candidate paragraphs are screened from the training text by the text screening model, and the answer generation model needs to find the specific position of the answer in the at least one candidate paragraph to generate a plurality of candidate answers matching the training question. The answer generation model determines the candidate answers matching the training question in the candidate paragraphs; any suitable answer generation model may be adopted, which is not limited in this embodiment of the present invention.
In an optional embodiment of the present invention, in an implementation manner of generating, by using an answer generation model to be trained, a plurality of candidate answers in the at least one candidate passage that match the training question may include: matching a plurality of answers in the at least one candidate paragraph with the training question according to the context vector by using the answer generation model to be trained to obtain a starting matching degree corresponding to a starting index and an ending matching degree corresponding to an ending index of the plurality of answers; and generating a plurality of candidate answers matched with the training questions according to the starting matching degree and the ending matching degree.
A reading (reader) module is adopted to determine, from the candidate paragraphs containing the answer, the specific region where the answer is located, namely its start index and end index.
The previous module retrieved a number of important candidate paragraphs containing answers, together with the corresponding context vectors h_i; the reading module needs to find the specific position of the answer, i.e., the start and end indexes corresponding to the answer. The specific steps are as follows:
for a given screened candidate passage, the answer generation model aims to propose a plurality of candidate answers for each passage. For one input, the vector final state h of the hidden layer isIProjection into two sets of scores is achieved as follows: scores=wshI,scoree=wehI
Wherein,
Figure BDA0002729228520000113
and
Figure BDA0002729228520000114
are the fraction of the start and end positions, w, respectively, of the answer spans,weAre trainable parameter vectors.
Suppose alphaiAnd betaiRepresents the candidate answer aiThe start and end indices, a score is calculated: si=scoreαi+scoreβiThen, the candidate sets of the top M names are listed according to the descending order of the scores, and a group of preliminary candidate answers a ═ a is obtained1,...,aMTheir fraction is s1,...,sM}. All text spans in the paragraph that match the best answer are marked as correct, resulting in two mark vectors
Figure BDA0002729228520000111
And
Figure BDA0002729228520000112
if the paragraph does not contain any answer string, so will ysAnd yeThe first element in (1) is marked and the remaining elements are set to 0. Finally, the objective function of the answer generation model is defined as:
Figure BDA0002729228520000121
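A sketch of the span-scoring reader: the linear projections w_s, w_e and the score s_i = score_{α_i} + score_{β_i} follow the text above, while M, the maximum span length and the tensor shapes are assumptions:

```python
import torch
import torch.nn as nn

class Reader(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.w_s = nn.Linear(hidden_size, 1)  # start-position scores
        self.w_e = nn.Linear(hidden_size, 1)  # end-position scores

    def forward(self, h_last: torch.Tensor):
        # h_last: (batch, seq_len, hidden), final context vectors h_I
        score_s = self.w_s(h_last).squeeze(-1)  # (batch, seq_len)
        score_e = self.w_e(h_last).squeeze(-1)
        return score_s, score_e

def top_m_spans(score_s, score_e, m=10, max_len=30):
    """Enumerate spans for one passage (1-D score tensors),
    score each as s_i = score_s[alpha_i] + score_e[beta_i], keep the top M."""
    spans = []
    L = score_s.size(-1)
    for a in range(L):
        for b in range(a, min(a + max_len, L)):
            spans.append((float(score_s[a] + score_e[b]), a, b))
    spans.sort(reverse=True)
    return spans[:m]  # [(s_i, alpha_i, beta_i), ...]
```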
step 203, ranking the multiple candidate answers by using an answer ranking model to be trained to obtain a ranking result of the multiple candidate answers.
In the embodiment of the invention, the answer ranking model is adopted to rank the plurality of candidate answers generated by the answer generation model, so as to obtain a ranking result of the plurality of candidate answers. For example, the answer ranking model may recalculate the matching degree between each candidate answer and the training question, and then rank the candidate answers from high to low matching degree to obtain the ranking result; any applicable answer ranking model may be adopted, which is not limited in the embodiment of the present invention.
In an optional embodiment of the present invention, in an implementation manner of obtaining a ranking result of the plurality of candidate answers by ranking the plurality of candidate answers by using the answer ranking model to be trained, the ranking result may include: removing candidate answers which are overlapped with other candidate answers from the multiple candidate answers to obtain target candidate answers; calculating question-answer matching degree between the target candidate answer and the training question and answer matching degree between the target candidate answer and a marked answer according to the context vector by using the answer ranking model to be trained; and sequencing the candidate answers according to the question-answer matching degree and the answer matching degree to obtain a sequencing result of the candidate answers.
Multiple candidate answers may overlap, and in order to obtain more reasonable candidate answers, candidate answers with overlapping portions with other candidate answers may be removed first.
For example, there may still be redundant spans among the multiple candidate answers, so the present invention provides a pruning algorithm. Suppose the set of candidate answers obtained in the previous step is A and the corresponding scores are S. First, the answer with the highest score in set A is selected and placed into set B. Then, for each answer remaining in A, it is checked whether its start and end indexes overlap with those of any answer already placed in B; if so, the answer is deleted from A. The selection and deletion are repeated until A is empty or the number of answers in B reaches a preset threshold; the candidate answers finally obtained in B are the final answers.
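The pruning algorithm transcribes directly into code; the cap on the size of B and the variable names are the only assumptions here:

```python
def prune_overlapping(answers, scores, max_answers=5):
    """answers: list of (start, end) index pairs; scores: parallel list of floats.
    Greedily move the best-scored answer from A into B and drop any remaining
    answer whose span overlaps one already kept, until A is empty or B is full."""
    remaining = sorted(zip(scores, answers), reverse=True)  # set A, best first
    kept = []                                               # set B
    while remaining and len(kept) < max_answers:
        score, (s, e) = remaining.pop(0)        # highest-scored answer left in A
        kept.append((score, (s, e)))            # place it into B
        remaining = [(sc, (a, b)) for sc, (a, b) in remaining
                     if b < s or a > e]         # delete overlapping spans from A
    return kept
```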
Then, a reordering (reranker) module is adopted to rerank the generated candidate answers, so as to obtain the ranking result of the candidate answers.
For each candidate answer a_i in set B, a score based on its span representation is computed, where the representation is a weighted self-aligned vector defined by the answer span boundaries:
α̃_i = softmax(w_α^T h_{α_i:β_i}), c_i = Σ_t α̃_{i,t} · h_t,
where h_{α_i:β_i} is shorthand for the stacked vector of hidden states between the start index α_i and the end index β_i, c_i denotes the weighted self-alignment vector computed via softmax, and w_α is a weight vector.
Then, the answers in B need to be scored and ranked. To better train the reordering module, two labels, y_i^EM and y_i^F1, are designed for each candidate answer a_i. First, the label y_i^EM is defined as the maximum exact-match score between the candidate answer a_i and the ground-truth answer. Second, the label y_i^F1 is defined as the maximum F1 score between the candidate answer a_i and the manually labeled answer in the original training data, where F1 represents a fuzzy degree of match at the character level. If there is no correct prediction in B (all elements of y^EM are 0), then the candidate answer with the lowest confidence is replaced with the best answer.
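A sketch of the two labels; exact match follows the text, and the character-level F1 below is one plausible reading of "fuzzy degree of match at the character level":

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> int:
    return int(pred.strip() == gold.strip())

def char_f1(pred: str, gold: str) -> float:
    """Character-level fuzzy match: harmonic mean of precision and recall
    over the multiset of characters (the exact definition is an assumption)."""
    common = Counter(pred) & Counter(gold)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

def reranker_labels(candidate, gold_answers):
    # Maximum over all labeled answers, as described above.
    y_em = max(exact_match(candidate, g) for g in gold_answers)
    y_f1 = max(char_f1(candidate, g) for g in gold_answers)
    return y_em, y_f1
```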
Finally, the objective function of the answer ranking model is defined as:
l_III = −Σ_i (y_i^EM + y_i^F1) · log(softmax(score^rank)_i).
and 204, performing combined training on the text screening model to be trained, the answer generation model to be trained and the answer ranking model to be trained based on the candidate answers and the ranking results.
In an optional embodiment of the present invention, a multi-task learning method is adopted: the parameters of the modules are shared, and the joint objective function is defined as J = l_I + l_II + l_III. For each complete training pass, first, the scores of all paragraphs in the training set X are calculated. Then, N paragraphs are retrieved per instance and a new training set X′ is constructed, which contains only the retrieved paragraphs. For each instance, if all the top-ranked paragraphs are negative examples, the lowest-confidence paragraph is replaced with the best segmented paragraph. In each complete training pass, two mini-batches are drawn, one from the training set X and one from the training set X′: the first is used for calculating l_I, and the other is used for calculating l_II and l_III. The context vectors h_i are shared between the reading module and the reordering module to avoid duplicate calculation. As for the batch size setting, the batch size of X is dynamically determined so that both X and X′ can be trained with the same number of steps.
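A sketch of one joint-training pass under stated assumptions: the early-exit hook, the batch field names and `reranker.loss` are hypothetical, since the patent fixes only the objective J = l_I + l_II + l_III and the sharing of the context vectors:

```python
import torch
import torch.nn.functional as F

def train_step(encoder, retriever, reader, reranker, batch_X, batch_Xp, optimizer):
    # l_I: retrieval loss on a mini-batch from the full training set X.
    # `early_exit_block=True` is an assumed hook returning the J-th block's states.
    h_J = encoder(batch_X["inputs"], early_exit_block=True)
    l_I = F.cross_entropy(retriever(h_J), batch_X["para_labels"])

    # l_II and l_III: reader and reranker losses on a mini-batch from the
    # pruned set X' (top-N retrieved paragraphs only). The context vectors
    # h_I are computed once and shared between the two heads.
    h_I = encoder(batch_Xp["inputs"])
    score_s, score_e = reader(h_I)
    l_II = (F.cross_entropy(score_s, batch_Xp["start_labels"])
            + F.cross_entropy(score_e, batch_Xp["end_labels"]))
    l_III = reranker.loss(h_I, batch_Xp["spans"], batch_Xp["rank_labels"])  # assumed API

    loss = l_I + l_II + l_III  # joint objective J
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```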
Step 205, obtain user questions.
In the embodiment of the present invention, the specific implementation manner of this step may refer to the description in the foregoing embodiment, and details are not described herein.
Step 206, based on the text screening model, at least one candidate paragraph related to the user question is screened from the target text corresponding to the user question.
In the embodiment of the present invention, the specific implementation manner of this step may refer to the description in the foregoing embodiment, and details are not described herein.
Step 207, generating a plurality of candidate answers matching the user question in the at least one candidate passage based on an answer generation model.
In the embodiment of the present invention, the specific implementation manner of this step may refer to the description in the foregoing embodiment, and details are not described herein.
Step 208, ranking the candidate answers based on the answer ranking model to obtain a ranking result of the candidate answers.
In the embodiment of the present invention, the specific implementation manner of this step may refer to the description in the foregoing embodiment, and details are not described herein.
Step 209, selecting a target answer corresponding to the user question from the candidate answers according to the at least one candidate paragraph, the candidate answers and the ranking result.
In the embodiment of the present invention, the specific implementation manner of this step may refer to the description in the foregoing embodiment, and details are not described herein.
According to the embodiment of the invention, at least one candidate paragraph related to a training question is screened from the training text corresponding to the training question by using a text screening model to be trained; a plurality of candidate answers matching the training question are generated in the at least one candidate paragraph by using an answer generation model to be trained; the plurality of candidate answers are ranked by using an answer ranking model to be trained to obtain a ranking result of the plurality of candidate answers; and the text screening model to be trained, the answer generation model to be trained and the answer ranking model to be trained are jointly trained on the basis of the plurality of candidate answers and the ranking result. In this way, the input ends to output ends of the three parts of text screening, answer generation and answer ranking are directly connected by a neural network and jointly trained in a unified manner, which avoids the problem that the trained system cannot reach optimal performance because the training targets of its parts are inconsistent. Through the three steps of text screening, answer generation and answer ranking, the finally obtained answers are more accurate and reasonable, and the efficiency of question-answering processing is improved.
EXAMPLE III
Referring to fig. 4, a block diagram of a structure of a question answering processing apparatus in a third embodiment of the present invention is shown, which may specifically include:
a question acquisition module 301, configured to acquire a user question;
a screening module 302, configured to screen at least one candidate paragraph related to a user question from a target text corresponding to the user question based on a text screening model;
an answer generation module 303, configured to generate a plurality of candidate answers matching the user question in the at least one candidate passage based on an answer generation model;
a ranking module 304, configured to rank the multiple candidate answers based on an answer ranking model, so as to obtain a ranking result of the multiple candidate answers;
an answer selecting module 305, configured to select a target answer corresponding to the user question from the plurality of candidate answers according to the at least one candidate paragraph, the plurality of candidate answers, and the ranking result;
wherein the text screening model, the answer generation model and the answer ranking model are jointly trained.
Optionally, the apparatus further comprises:
the paragraph screening module is used for screening at least one candidate paragraph related to the training question from the training text corresponding to the training question by using a text screening model to be trained before the user question is obtained;
the answer generation module is used for generating a plurality of candidate answers matched with the training questions in the at least one candidate paragraph by using an answer generation model to be trained;
the ranking module is used for ranking the candidate answers by using an answer ranking model to be trained to obtain a ranking result of the candidate answers;
and the joint training module is used for performing joint training on the text screening model to be trained, the answer generation model to be trained and the answer ranking model to be trained on the basis of the candidate answers and the ranking results.
Optionally, the apparatus further comprises:
and the text searching module is used for searching for the training text corresponding to the training question from a plurality of preprocessed documents before at least one candidate paragraph related to the training question is screened from the training text corresponding to the training question by using the text screening model to be trained.
Optionally, the apparatus further comprises:
the segmentation module is used for segmenting the training text in a sliding window mode to obtain a plurality of sub-texts before screening at least one candidate paragraph related to the training question from the training text corresponding to the training question by using the text screening model to be trained;
the text searching module is used for searching for a preset number of sub-texts related to the training question from the plurality of sub-texts;
and the encoding module is used for encoding the training question and the preset number of sub-texts to obtain the corresponding context vectors.
Optionally, the paragraph screening module includes:
the data calculation submodule is used for calculating correlation data between a plurality of paragraphs in the training text and the training question according to the context vector by utilizing the text screening model to be trained;
and the paragraph screening submodule is used for screening at least one candidate paragraph related to the training question according to the correlation data.
Optionally, the answer generating module includes:
the matching sub-module is used for matching a plurality of answers in the at least one candidate paragraph with the training question according to the context vector by utilizing the answer generation model to be trained to obtain a starting matching degree corresponding to a starting index and an ending matching degree corresponding to an ending index of the plurality of answers;
and the answer generation submodule is used for generating a plurality of candidate answers matched with the training questions according to the starting matching degree and the ending matching degree.
Optionally, the sorting module includes:
the eliminating sub-module is used for eliminating candidate answers which are overlapped with other candidate answers in the plurality of candidate answers to obtain target candidate answers;
the calculation sub-module is used for calculating question and answer matching degrees between the target candidate answers and the training questions and answer matching degrees between the target candidate answers and the marked answers according to the context vectors by using the answer ranking model to be trained;
and the ranking submodule is used for ranking the candidate answers according to the question-answer matching degree and the answer matching degree to obtain a ranking result of the candidate answers.
According to the embodiment of the invention, a user question is obtained; at least one candidate paragraph related to the user question is screened from a target text corresponding to the user question based on a text screening model; a plurality of candidate answers matching the user question are generated in the at least one candidate paragraph based on an answer generation model; the plurality of candidate answers are ranked based on an answer ranking model to obtain a ranking result; and a target answer corresponding to the user question is selected from the plurality of candidate answers according to the at least one candidate paragraph, the plurality of candidate answers and the ranking result. The input ends to output ends of the three parts of text screening, answer generation and answer ranking are directly connected by a neural network and jointly trained in a unified manner, which avoids the problem that the trained system is difficult to bring to optimal performance because the training targets of its parts are inconsistent. Through the three steps of text screening, answer generation and answer ranking, the finally obtained answers are more accurate and reasonable, and the efficiency of question-answering processing is improved.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
In an embodiment of the present disclosure, the question-answering processing device includes a processor and a memory; the modules and sub-modules described above are stored in the memory as program units, and the processor executes the program units stored in the memory to implement the corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. A user question is obtained; at least one candidate paragraph related to the user question is screened from the target text corresponding to the user question based on a text screening model; a plurality of candidate answers matching the user question are generated in the at least one candidate paragraph based on an answer generation model; the plurality of candidate answers are ranked based on an answer ranking model to obtain a ranking result; and a target answer corresponding to the user question is selected from the plurality of candidate answers according to the at least one candidate paragraph, the plurality of candidate answers and the ranking result. The input ends to output ends of the three parts of text screening, answer generation and answer ranking are thus directly connected by a neural network and jointly trained in a unified manner, which avoids the problem that the trained system is difficult to bring to optimal performance because the training targets of its parts are inconsistent. Through the three steps of text screening, answer generation and answer ranking, the finally obtained answers are more accurate and reasonable, and the efficiency of question-answering processing is improved.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored, the program implementing the question-answering processing method when executed by a processor.
The embodiment of the invention provides a processor, which is used for running a program, wherein the question answering processing method is executed when the program runs.
The embodiment of the invention provides electronic equipment, which comprises a processor, a memory and a program which is stored on the memory and can run on the processor, wherein the processor executes the program and realizes the following steps:
acquiring a user question;
screening at least one candidate paragraph related to a user question from a target text corresponding to the user question based on a text screening model;
generating a plurality of candidate answers in the at least one candidate passage matching the user question based on an answer generation model;
based on an answer ranking model, ranking the candidate answers to obtain a ranking result of the candidate answers;
selecting a target answer corresponding to the user question from the candidate answers according to the at least one candidate paragraph, the candidate answers and the ranking result;
wherein the text screening model, the answer generation model and the answer ranking model are jointly trained.
Optionally, before the obtaining the user question, the method further includes:
screening at least one candidate paragraph related to a training question from a training text corresponding to the training question by using a text screening model to be trained;
generating a plurality of candidate answers matched with the training question in the at least one candidate paragraph by using an answer generation model to be trained;
ranking the candidate answers by using an answer ranking model to be trained to obtain a ranking result of the candidate answers;
and performing joint training on the text screening model to be trained, the answer generation model to be trained and the answer ranking model to be trained on the basis of the candidate answers and the ranking results.
Optionally, before the screening, by using the text screening model to be trained, of at least one candidate paragraph related to the training question from the training text corresponding to the training question, the method further includes:
and searching for a training text corresponding to the training question from a plurality of preprocessed documents.
Optionally, before the screening, by using the text screening model to be trained, of at least one candidate paragraph related to the training question from the training text corresponding to the training question, the method further includes:
segmenting the training text in a sliding window mode to obtain a plurality of sub-texts;
searching for a preset number of sub-texts related to the training question from the plurality of sub-texts;
and encoding the training question and the preset number of sub-texts to obtain the corresponding context vectors.
Optionally, the screening, by using the text screening model to be trained, at least one candidate paragraph related to the training question from the training text corresponding to the training question includes:
utilizing the text screening model to be trained to calculate correlation data between a plurality of paragraphs in the training text and the training question according to the context vector;
and screening at least one candidate paragraph related to the training question according to the correlation data.
Optionally, the generating, by using an answer generation model to be trained, a plurality of candidate answers in the at least one candidate passage that match the training question includes:
matching a plurality of answers in the at least one candidate paragraph with the training question according to the context vector by using the answer generation model to be trained to obtain a starting matching degree corresponding to a starting index and an ending matching degree corresponding to an ending index of the plurality of answers;
and generating a plurality of candidate answers matched with the training questions according to the starting matching degree and the ending matching degree.
Optionally, the ranking of the plurality of candidate answers by using the answer ranking model to be trained to obtain the ranking result of the plurality of candidate answers includes:
removing, from the plurality of candidate answers, candidate answers that overlap with other candidate answers to obtain target candidate answers;
calculating, by using the answer ranking model to be trained, a question-answer matching degree between each target candidate answer and the training question, and an answer matching degree between each target candidate answer and a labeled answer, according to the context vector;
and ranking the candidate answers according to the question-answer matching degree and the answer matching degree to obtain the ranking result of the candidate answers.
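The overlap removal and ranking could be realized as span-level non-maximum suppression followed by a weighted sort, as sketched below. The equal weighting of the question-answer matching degree and the answer matching degree is an illustrative assumption.

```python
# Sketch of the ranking step: keep only the highest-scored candidate
# among any set of mutually overlapping spans, then rank the survivors
# by a combined matching degree.

candidates = [
    # (start, end, question_answer_match, answer_match)
    (2, 5, 0.90, 0.80),
    (3, 6, 0.85, 0.75),   # overlaps the first span, so it is removed
    (8, 10, 0.60, 0.70),
]

def overlaps(a, b):
    # Two index ranges overlap if neither ends before the other starts.
    return a[0] <= b[1] and b[0] <= a[1]

def rank(cands, w_qa=0.5, w_ans=0.5):
    score = lambda c: w_qa * c[2] + w_ans * c[3]
    kept = []
    for c in sorted(cands, key=score, reverse=True):
        if not any(overlaps(c[:2], k[:2]) for k in kept):
            kept.append(c)
    return kept  # already in ranked order

for start, end, qa, ans in rank(candidates):
    print((start, end), round(0.5 * qa + 0.5 * ans, 3))
```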
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed by the present invention, and all such changes or substitutions shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A question-answer processing method, characterized by comprising:
acquiring a user question;
screening at least one candidate paragraph related to a user question from a target text corresponding to the user question based on a text screening model;
generating, based on an answer generation model, a plurality of candidate answers matching the user question in the at least one candidate paragraph;
based on an answer ranking model, ranking the candidate answers to obtain a ranking result of the candidate answers;
selecting a target answer corresponding to the user question from the candidate answers according to the at least one candidate paragraph, the candidate answers and the ranking result;
wherein the text screening model, the answer generation model and the answer ranking model are jointly trained.
2. The method of claim 1, wherein prior to said obtaining a user question, the method further comprises:
screening at least one candidate paragraph related to a training question from a training text corresponding to the training question by using a text screening model to be trained;
generating a plurality of candidate answers matching the training question in the at least one candidate paragraph by using an answer generation model to be trained;
ranking the plurality of candidate answers by using an answer ranking model to be trained to obtain a ranking result of the plurality of candidate answers;
and performing joint training on the text screening model to be trained, the answer generation model to be trained and the answer ranking model to be trained on the basis of the plurality of candidate answers and the ranking result.
3. The method of claim 2, wherein before the screening of at least one candidate paragraph related to the training question from the training text corresponding to the training question by using the text screening model to be trained, the method further comprises:
searching for the training text corresponding to the training question among a plurality of preprocessed documents.
4. The method of claim 2, wherein before the screening of at least one candidate paragraph related to the training question from the training text corresponding to the training question by using the text screening model to be trained, the method further comprises:
segmenting the training text in a sliding-window manner to obtain a plurality of sub-texts;
searching the plurality of sub-texts for a preset number of sub-texts related to the training question;
and encoding the training question together with the preset number of sub-texts to obtain the corresponding context vector.
5. The method of claim 4, wherein the screening of at least one candidate paragraph related to the training question from the training text corresponding to the training question by using the text screening model to be trained comprises:
calculating, by using the text screening model to be trained, correlation data between a plurality of paragraphs in the training text and the training question according to the context vector;
and screening at least one candidate paragraph related to the training question according to the correlation data.
6. The method of claim 4, wherein the generating, by using the answer generation model to be trained, of a plurality of candidate answers matching the training question in the at least one candidate paragraph comprises:
matching, by using the answer generation model to be trained, a plurality of answers in the at least one candidate paragraph against the training question according to the context vector, to obtain, for the plurality of answers, a starting matching degree corresponding to a starting index and an ending matching degree corresponding to an ending index;
and generating the plurality of candidate answers matching the training question according to the starting matching degree and the ending matching degree.
7. The method according to claim 4, wherein the ranking of the plurality of candidate answers by using the answer ranking model to be trained to obtain the ranking result of the plurality of candidate answers comprises:
removing, from the plurality of candidate answers, candidate answers that overlap with other candidate answers to obtain target candidate answers;
calculating, by using the answer ranking model to be trained, a question-answer matching degree between each target candidate answer and the training question, and an answer matching degree between each target candidate answer and a labeled answer, according to the context vector;
and ranking the candidate answers according to the question-answer matching degree and the answer matching degree to obtain the ranking result of the candidate answers.
8. A question-answering processing apparatus characterized by comprising:
the question acquisition module is used for acquiring a user question;
the screening module is used for screening at least one candidate paragraph related to the user question from a target text corresponding to the user question based on a text screening model;
an answer generation module, configured to generate, based on an answer generation model, a plurality of candidate answers matching the user question in the at least one candidate paragraph;
the sorting module is used for sorting the candidate answers based on an answer sorting model to obtain a sorting result of the candidate answers;
the answer selecting module is used for selecting a target answer corresponding to the user question from the candidate answers according to the at least one candidate paragraph, the candidate answers and the ranking result;
wherein the text screening model, the answer generation model and the answer ranking model are jointly trained.
9. A storage medium, characterized in that the storage medium comprises a stored program, wherein, when the program runs, a device on which the storage medium is located is controlled to perform the method according to any one of claims 1 to 7.
10. An electronic device, comprising: a memory, a processor, and executable instructions stored in the memory and executable on the processor, characterized in that the processor implements the method according to any one of claims 1 to 7 when executing the executable instructions.
CN202011114806.5A 2020-10-16 2020-10-16 Question-answering processing method and device, storage medium and electronic equipment Active CN112417105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011114806.5A CN112417105B (en) 2020-10-16 2020-10-16 Question-answering processing method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN112417105A (en) 2021-02-26
CN112417105B CN112417105B (en) 2024-03-19

Family

ID=74840226

Family Applications (1)

Application Number: CN202011114806.5A (Active; granted as CN112417105B)
Title: Question-answering processing method and device, storage medium and electronic equipment

Country Status (1)

Country: CN (CN112417105B)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9720981B1 (en) * 2016-02-25 2017-08-01 International Business Machines Corporation Multiple instance machine learning for question answering systems
CN110309305A (en) * 2019-06-14 2019-10-08 中国电子科技集团公司第二十八研究所 Machine based on multitask joint training reads understanding method and computer storage medium
CN111460089A (en) * 2020-02-18 2020-07-28 北京邮电大学 Multi-paragraph reading understanding candidate answer sorting method and device

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204976A (en) * 2021-04-19 2021-08-03 北京大学 Real-time question and answer method and system
CN113204976B (en) * 2021-04-19 2024-03-29 北京大学 Real-time question and answer method and system
CN113032531A (en) * 2021-05-21 2021-06-25 北京金山数字娱乐科技有限公司 Text processing method and device
CN113268571A (en) * 2021-07-21 2021-08-17 北京明略软件系统有限公司 Method, device, equipment and medium for determining correct answer position in paragraph
CN113836283A (en) * 2021-09-24 2021-12-24 上海金仕达软件科技有限公司 Answer generation method and device, electronic equipment and storage medium
CN113836283B (en) * 2021-09-24 2024-04-12 上海金仕达软件科技股份有限公司 Answer generation method and device, electronic equipment and storage medium
CN114003708A (en) * 2021-11-05 2022-02-01 中国平安人寿保险股份有限公司 Automatic question answering method and device based on artificial intelligence, storage medium and server
CN114741490A (en) * 2022-04-01 2022-07-12 腾讯科技(深圳)有限公司 Question answer selecting method and related device
CN115470332A (en) * 2022-09-02 2022-12-13 中国气象局机关服务中心 Intelligent question-answering system for content matching based on matching degree
CN116049376A (en) * 2023-03-31 2023-05-02 北京太极信息系统技术有限公司 Method, device and system for retrieving and replying information and creating knowledge

Also Published As

Publication number Publication date
CN112417105B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
CN112417105B (en) Question-answering processing method and device, storage medium and electronic equipment
Zhu et al. Seqtr: A simple yet universal network for visual grounding
CN109885672B (en) Question-answering type intelligent retrieval system and method for online education
CN109299245B (en) Method and device for recalling knowledge points
CN106407311A (en) Method and device for obtaining search result
CA3142615A1 (en) System and method for automated file reporting
CN112257422A (en) Named entity normalization processing method and device, electronic equipment and storage medium
CN110413775A (en) A kind of data label classification method, device, terminal and storage medium
CN112417126A (en) Question answering method, computing equipment and storage medium
CN112256845A (en) Intention recognition method, device, electronic equipment and computer readable storage medium
US20220366295A1 (en) Pre-search content recommendations
CN112163092A (en) Entity and relation extraction method, system, device and medium
CN112084435A (en) Search ranking model training method and device and search ranking method and device
CN114357120A (en) Non-supervision type retrieval method, system and medium based on FAQ
CN107515904A (en) A kind of position searching method and computing device
CN117520503A (en) Financial customer service dialogue generation method, device, equipment and medium based on LLM model
CN110909021A (en) Construction method and device of query rewriting model and application thereof
CN114722086A (en) Method and device for determining search rearrangement model
CN117056575B (en) Method for data acquisition based on intelligent book recommendation system
CN112035629B (en) Method for implementing question-answer model based on symbolized knowledge and neural network
CN115878094A (en) Code searching method, device, equipment and storage medium
CN115269797A (en) Knowledge community fuzzy question oriented answer recommendation method and system
Llopis et al. Matching user queries in natural language with Cyber-Physical Systems using deep learning through a Transformer approach
Kaur et al. Targeted style transfer using cycle consistent generative adversarial networks with quantitative analysis of different loss functions
CN113505602A (en) Intelligent marking method and device suitable for judicial examination subjective questions and electronic equipment

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant