CN112685548A

CN112685548A - Question answering method, electronic device and storage device

Info

Publication number: CN112685548A
Application number: CN202011627778.7A
Authority: CN
Inventors: 崔一鸣; 车万翔; 杨子清; 王士进; 胡国平; 秦兵; 刘挺
Original assignee: Hebei Xunfei Institute Of Artificial Intelligence; Zhongke Xunfei Internet Beijing Information Technology Co ltd; iFlytek Co Ltd
Current assignee: Hebei Xunfei Institute Of Artificial Intelligence; Zhongke Xunfei Internet Beijing Information Technology Co ltd; iFlytek Co Ltd
Priority date: 2020-12-31
Filing date: 2020-12-31
Publication date: 2021-04-20
Anticipated expiration: 2040-12-31
Also published as: CN112685548B

Abstract

The application discloses a question answering method, electronic equipment and a storage device, wherein the question answering method comprises the following steps: acquiring a problem text and a chapter text, and acquiring reference texts of a plurality of knowledge points; the question text and the chapter text contain a plurality of words, and the knowledge points are related to at least one of the question text and the chapter text; extracting individual semantic representations of a plurality of words and extracting original semantic representations of all reference texts; and predicting an answer text of the question text from the discourse text by utilizing the individual semantic representations of the words and the original semantic representation of each reference text. According to the scheme, the accuracy of question answering can be improved.

Description

Question answering method, electronic device and storage device

Technical Field

The present application relates to the field of natural language understanding, and in particular, to a question answering method, an electronic device, and a storage device.

Background

With the development of information technology, the chapter reading understanding correlation system has obtained great performance improvement, and on the basis, the correlation system can answer related questions based on chapters. However, in a real-world scenario, understanding only the chapters may still not be able to answer the question accurately. For example, the discourse is "banana in refrigerator and apple on table", and when the question is "where the yellow fruit is placed", it is not possible to answer the question exactly, and so on, since the machine cannot understand what the "yellow fruit" refers to according to the discourse. In view of this, how to improve the accuracy of question answering becomes a topic with great research value.

Disclosure of Invention

The technical problem text mainly solved by the application is to provide a question answering method, electronic equipment and a storage device, and the accuracy of question answering can be improved.

In order to solve the above problem text, a first aspect of the present application provides a question answering method, including: acquiring a problem text and a chapter text, and acquiring reference texts of a plurality of knowledge points; the question text and the chapter text contain a plurality of words, and the knowledge points are related to at least one of the question text and the chapter text; extracting individual semantic representations of a plurality of words and extracting original semantic representations of all reference texts; and predicting an answer text of the question text from the discourse text by utilizing the individual semantic representations of the words and the original semantic representation of each reference text.

In order to solve the above problem text, a second aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, wherein the memory stores program instructions, and the processor is configured to execute the program instructions to implement the problem answering method in the first aspect.

In order to solve the above problem text, a third aspect of the present application provides a storage device storing program instructions executable by a processor, the program instructions being for implementing the question answering method of the first aspect.

According to the scheme, the question text and the chapter text are obtained, the reference texts of the knowledge points are obtained, the question text and the chapter text contain a plurality of words, the knowledge points are related to at least one of the question text and the chapter text, on the basis, the individual semantic representations of the words are extracted, the original semantic representation of each reference text is extracted, and therefore the answer text of the question text is obtained through prediction from the chapter text by utilizing the individual semantic representations of the words and the original semantic representation of each reference text. Therefore, by acquiring the reference texts of the knowledge points, external knowledge can be introduced in the question answering process, the expansion of the background of the chapter text and the question text is facilitated, and the accuracy of question answering is further improved.

Drawings

FIG. 1 is a schematic flow chart diagram illustrating one embodiment of a question answering method of the present application;

FIG. 2 is a block diagram of an embodiment of an individual semantic extraction network;

FIG. 3 is a block diagram of an embodiment of an original semantic extraction network;

FIG. 4 is a diagram of one embodiment of answering question text based on chapter text and reference text;

FIG. 5 is a flowchart illustrating an embodiment of step S13 in FIG. 1;

FIG. 6 is a state diagram of an embodiment of a semantic association acquisition process;

FIG. 7 is a schematic flow chart diagram illustrating another embodiment of a question answering method of the present application;

FIG. 8 is a schematic diagram of another embodiment of answering question text based on chapter text and reference text;

FIG. 9 is a schematic flow chart diagram illustrating one embodiment of a method for training a question answering model;

FIG. 10 is a schematic diagram of an embodiment of a training sample acquisition process;

FIG. 11 is a block diagram of an embodiment of an electronic device of the present application;

FIG. 12 is a block diagram of an embodiment of a memory device according to the present application.

Detailed Description

The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.

In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.

The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two.

Referring to fig. 1, fig. 1 is a schematic flow chart of an embodiment of a question answering method according to the present application. Specifically, the method may include the steps of:

step S11: and acquiring a question text and a chapter text, and acquiring a reference text of a plurality of knowledge points.

In the embodiment of the disclosure, the question text and the chapter text contain a plurality of words, and the specific number of the words is not limited herein. In addition, the question text and the chapter text may be literally related, for example, the chapter text is "banana in refrigerator and apple in table", and the question text is "where banana is", that is, the chapter text and the question text are both literally related to "banana"; in addition, the question text and the chapter text may also be literally unrelated, for example, the chapter text is "banana in refrigerator and apple on table", and the question text is "where the yellow fruit is" i.e., the chapter text and the question text are literally unrelated. The above examples are only a few of the situations that may exist in the actual application process, and do not limit other situations that may exist in the actual application process, and they are not illustrated herein.

In one implementation scenario, in order to improve the accuracy of the subsequent predicted answer text, the question text and the chapter text may be preprocessed, so as to obtain a plurality of words contained in the question text and the chapter text.

In a specific implementation scenario, the chapter text and the question text may be data washed. For example, illegal characters such as html (Hyper-Text Markup Language) Markup characters (e.g., </body > and the like) in the Text of the chapters and the Text of the questions can be removed, messy codes caused by coding errors and the like in the Text of the chapters and the Text of the questions can be removed, and the probability of abnormal errors can be reduced in the subsequent application process through data cleaning, so that the robustness of answering Text prediction can be improved.

In another specific implementation scenario, the sections text and the question text can be participled. For example, Word segmentation can be performed on English by using Word segmentation tools such as Word Piece, or Word segmentation can be performed directly according to spaces between words; and for Chinese, the word segmentation can be directly carried out by taking the characters as granularity, and the word segmentation can be beneficial to capturing fine-grained semantic information.

In yet another specific implementation scenario, multiple spaces in both the text of the chapters and the text of the question may be replaced with one space, and the space at the end of the beginning sentence of the sentence may be removed.

In another specific implementation scenario, after the foregoing preprocessing is performed on the chapter text and the question text, the format of the chapter text and the question text after preprocessing can be converted according to the actual application requirement, for example, the format can be converted into:

[CLS]Q₁…Q_n[SEP]P₁…P_m[SEP]……(1)

in the above formula (1), [ CLS]And [ SEP ]]As delimiters, Q₁…Q_nRepresenting individual words, P, in the question text after preprocessing₁…P_mRepresenting each word in the text of the chapters after preprocessing. In addition, according to the practical application requirement, each word in the text of the chapter after preprocessing can be placed in front, and each word in the problem text after preprocessing can be placed in back, that is, converted into [ CLS]P₁…P_m[SEP]Q₁…Q_n[SEP]And is not limited herein.

In addition, in the embodiment of the disclosure, the knowledge points are related to at least one of the question text and the chapter text. For example, several knowledge points may be associated with the question text, and still taking the aforementioned question text "where yellow fruit is" as an example, several knowledge points may be associated with "yellow fruit", and may specifically relate to "banana", "orange", "lemon", "grapefruit", and so on; alternatively, the plurality of knowledge points may be related to chapter texts, and may specifically relate to "banana", "refrigerator", "apple", taking the chapter text "banana in refrigerator and apple in table" as an example; alternatively, several knowledge points may be related to both chapter text and question text, and still taking the aforementioned chapter text "banana in refrigerator, apple on table" and question text "where yellow fruit is" as an example, several knowledge points may be related to "yellow fruit", specifically may relate to "banana", "orange", "lemon", "grapefruit", and so on, and furthermore several knowledge points may specifically relate to "banana", "refrigerator", "apple". Other cases may be analogized, and no one example is given here.

Specifically, in order to improve the pertinence of the reference text, keyword recognition may be performed on at least one of the question text and the chapter text to obtain a plurality of keywords, and on this basis, the reference text of the knowledge points related to the keywords may be obtained from a preset knowledge dictionary. The preset knowledge dictionary may specifically include, but is not limited to: encyclopedia, Wikipedia, Chinese words, Oxford dictionary, and the like, without limitation. In the above manner, the keyword recognition is performed on at least one of the question text and the chapter text to obtain the plurality of keywords, so that the reference text of the knowledge points related to the keywords is obtained from the preset knowledge dictionary, and therefore, the relevance of the reference text, the chapter text and the question text can be improved, the reference value of external knowledge can be improved in the question answering process, and the accuracy of question answering can be improved.

In one implementation scenario, the preset knowledge dictionary may include a plurality of dictionary entries, each of which may include: entry, basic information, explanation, and illustrative sentences. The entry may represent a topic of a dictionary entry, the basic information may include information such as a part of speech, pinyin (or phonetic symbol) and the like of the topic, the explanation may represent a paraphrase of the topic, and the example sentence may be an exemplary sentence including the topic. Taking the dictionary item "banana" as an example, the term "banana", the basic information may include the part of speech (i.e., noun) of "banana" and the pinyin (i.e., xi ā ngji ā o) of banana, the explanation may include "monocotyledon, musaceae, oblong leaf, berry flesh, long column shape, yellow peel at maturity, distribution in tropical and subtropical regions, soft and sweet pulp", the example sentence may be "monkey in zoo loves banana", and the rest may be analogized, which is not exemplified herein.

In one implementation scenario, to improve the robustness of the original semantic representation of the subsequently extracted reference text, a pre-set knowledge dictionary may be pre-processed.

In a specific implementation scenario, a preset knowledge dictionary may be subjected to data cleansing. For example, illegal characters such as html (Hyper-Text Markup Language) Markup characters (e.g., </body > and the like) in the preset knowledge dictionary can be removed, messy codes caused by coding errors and the like in the preset knowledge dictionary can be removed, and the probability of abnormal errors can be reduced in the subsequent application process through data cleaning, so that the robustness of answer Text prediction can be improved.

In another specific implementation scenario, in order to reduce interference of irrelevant information, only interpretations corresponding to the vocabulary entries in the preset knowledge dictionary may be extracted by using a regular expression, without retaining basic information (such as part of speech, pinyin, and the like), example sentences, and the like.

In another specific implementation scenario, a word segmentation may be performed on a preset knowledge dictionary. For example, Word segmentation can be performed on English by using Word segmentation tools such as Word Piece, or Word segmentation can be performed directly according to spaces between words; and for Chinese, the word segmentation can be directly carried out by taking the characters as granularity, and the word segmentation can be beneficial to capturing fine-grained semantic information.

In another specific implementation scenario, multiple spaces in the predetermined knowledge dictionary may be replaced by one space, and the space at the beginning and end of the sentence may be removed.

In another specific implementation scenario, after the pre-processing is performed on the preset knowledge dictionary, the pre-processed dictionary entries of the preset knowledge dictionary may be further subjected to format conversion according to the actual application requirement, for example, for the ith dictionary entry DⁱCan be converted into:

[CLS]D₁…D_k[SEP]……(2)

in the above formula (2), [ CLS]And [ SEP ]]As separators, D₁…D_kRepresenting the individual words in the dictionary entry after preprocessing. Specifically, as previously described, dictionary entries retain only entries and their interpretations, and so can be written in colon': ' connective terms and their explanations. For example, taking the term "apple" as an example, D can be constructed₁…D_kIs' apple: dicotyledonous plant, Rosaceae.Deciduous arbor. Most of them are self-sterile and need cross-pollination. The pulp is crisp, fragrant and sweet, and can help digestion. "where blank space represents a participle.

In yet another specific implementation scenario, nouns and entities of names of the discourse text and the question text can be identified as keywords. Specifically, the NER (Named Entity Recognition) tool such as LTP, Stanza, etc. may be used to recognize nouns and Named entities of the chapter text and the question text, which is not limited herein.

In another specific implementation scenario, dictionary entries D matching the keywords may be directly extracted from the preset knowledge dictionary, respectively, to obtain the reference text. For example, when the keyword is "apple", the reference text "[ CLS ] may be extracted]Apple: dicotyledonous plant, Rosaceae. Deciduous arbor. Most of them are self-sterile and need cross-pollination. The pulp is crisp, fragrant and sweet, and can help digestion. [ SEP ]]". Other cases may be analogized, and no one example is given here. In addition, the occurrence times of each keyword in the discourse text and the question text can be respectively counted, the dictionary items matched with the keywords are sorted according to the sequence of the occurrence times of the corresponding keywords from high to low, and the dictionary item (for example, D) arranged at the top of the preset numerical digit (for example, the top 10 digits) is selected¹,D²,…,D¹⁰) And obtaining a reference text. It should be noted that, in the case that the dictionary entry matching the keyword is less than the preset value (e.g., 10), at least one empty dictionary entry may be added to complement the preset number of dictionary entries, so that the finally obtained reference text is the preset number.

Step S12: individual semantic representations of a number of words are extracted, and original semantic representations of respective reference texts are extracted.

In one implementation scenario, embedded representations of a plurality of words in a chapter text and a question text can be obtained, and semantic extraction is performed on the embedded representations of the plurality of words, so that individual semantic representations of the plurality of words are obtained.

In a specific implementation scenario, the words may be discretized to map the words to 0-1 vectors, and then the words are mapped based on word vector technologyIs converted into a continuous vector representation, i.e. an embedded representation of the word. Specifically, the word list size may be V (e.g. including 20000 to 30000 common words) based on the predetermined corpus, and for a word w_iThe 0-1 vector representation of (a) is a vector of length | V |, where the word w_iThe corresponding dimension in the predetermined corpus is 1, and all other elements are 0. For example, assuming that vocabulary V of the corpus is { a, b, c, d }, the 0-1 vector of the word "c" is denoted X_iThe other cases can be analogized, and no example is given here. Further, each word corresponds to a real-valued vector of d dimension, so that the predetermined corpus corresponds to form a word vector matrix E of | V | × d size, and the embedded expression of the word can be obtained by discretizing the product of the 0-1 vector expression and the word vector matrix E, which is specifically expressed as follows:

e_i＝E·X_i……(3)

in the above formula (3), e_iMeaning word w_iIs embedded in the representation, X_iMeaning word w_iDiscretizing 0-1 vector representation, and E represents a word vector matrix corresponding to a preset corpus.

In another specific implementation scenario, the embedded representations of several words may be input into a semantic extraction model, resulting in a semantic representation of the words. The semantic extraction model may specifically include, but is not limited to: a multi-tier semantic extraction network (e.g., a multi-tier transformer). Under the condition that the semantic extraction model is an L-layer transformer, context dynamic coding can be realized through the L-layer transformer, and finally, a semantic representation that the chapter and the problem are associated and unified is obtained, which can be specifically represented as follows:

H_i＝transformer(H_i-1)……(4)

in the above formula (4), i ranges from 1 to L, H_iAnd representing semantic representation obtained by extraction of the transform of the ith layer. Furthermore, H₀The expression (1) represents an embedded expression obtained by performing the discretization and word embedding processes.

In another specific implementation scenario, the text of the chapters and the text of the question may also be directly input into the individual semantic extraction network to obtain individual semantic representations of a plurality of words in both the text of the chapters and the text of the question. Furthermore, the individual semantic extraction network may include multiple layers of semantic extraction networks. The individual semantic extraction network may specifically include but is not limited to: ALBERT (ALITE BERT, lightweight BERT) model. Referring to fig. 2 in conjunction, fig. 2 is a schematic diagram of a framework of an embodiment of an individual semantic extraction network, as shown in fig. 2, the individual semantic extraction network may specifically include: discretization layer, word embedding layer, multi-layer transform representing layer. For specific data processing procedures of the discretization layer, the word embedding layer and the multi-layer transform representing layer, reference may be made to the foregoing description, and details are not described herein again.

In another implementation scenario, similar to the individual semantic representations described above, the embedded representations of the terms in the reference text may be obtained and semantically extracted to obtain the original semantic representation of the reference text. Specifically, reference may be made to the extraction process of the individual semantic representation, which is not described herein again.

In addition, for convenience of description, in the embodiments of the present disclosure and in the following embodiments, the tth word w in the question text and the chapter text is described_tIs denoted as H_L(t) and noting the original semantic representation of the ith reference text as

Step S13: and predicting an answer text of the question text from the discourse text by utilizing the individual semantic representations of the words and the original semantic representation of each reference text.

In one implementation scenario, the individual semantic representation H of each term may be utilized separately_L(t) and original semantic representation of respective reference texts

Obtaining semantic association degrees between the words and the reference texts respectively, and on the basis, aiming at each word, utilizing the semantic association degrees between the word and the reference texts to combine the semantic association degrees with each otherAnd fusing the original semantic representations to obtain a fused semantic representation of the word, so that the answer text can be predicted from the text of the chapters on the basis of the individual semantic representation and the fused semantic representation of the words. According to the method, the original semantic representations are fused through the semantic association degrees between the words and the reference texts respectively to obtain fused semantic representations, namely the fused semantic representations of the words depend on the original semantic representations of the reference texts with large semantic association degrees, on the basis, the answer texts are predicted by utilizing the individual semantic representations of the words and the fused semantic representations of the introduced external knowledge, and the accuracy of the answer texts can be improved.

In a specific implementation scenario, taking N reference texts as an example, the k-th word w in the question text and the chapter text is considered_kIn other words, H can be represented by its individual semantics_L(k) Original semantic representations with 1 st reference text respectively

Original semantic representation of through Nth reference text

Obtain the word w_kSemantic association degrees with the 1 st reference text to the Nth reference text respectively, and so on, the semantic association degrees between each word in the question text and the chapter text and the 1 st reference text to the Nth reference text can be obtained, and on the basis, the word w_kIn other words, the semantic association degrees between the original semantic representations of the 1 st reference text and the nth reference text can be used for weighting the original semantic representations of the 1 st reference text and the nth reference text respectively to obtain a word w_kThe fused semantic representation of the question text and each word in the text of the chapters can be obtained by analogy. The specific process of obtaining the semantic relevance and fusing the original semantic representation may refer to the related embodiments described later, which is not described herein again.

In another specific implementation scenario, to improve the efficiency of predicting answer text, a prediction network may be trained in advance, which may include but is not limited to: the method includes the steps that a full connection layer, a softmax layer and the like are connected, on the basis, individual semantic representations and fusion semantic representations of words can be spliced, the spliced semantic representations are sent to the prediction network, so that a first probability value that each word in a chapter text is a starting position of an answer text and a second probability value that each word in the chapter text is an ending position of the answer text can be predicted, finally, the starting position word and the ending position word of the answer text can be determined on the basis of the first probability value and the second probability value of each word, and the combination of the starting position word, the ending position word and the words among the words is used as the answer text.

In another implementation scenario, as previously described, individual semantic representations H of each term may be utilized separately_L(t) and original semantic representation of respective reference texts

And obtaining semantic association degrees between the words and the reference texts respectively. On the basis, the semantic association degree between the keywords in the problem text and each reference text can be obtained, so that for each reference text, the importance degree can be obtained by utilizing the semantic association degree between each keyword and each reference text, the reference texts are sequenced according to the descending order of the importance degree, the reference text in the front preset sequence (such as the first order) is selected, the entry in the selected reference text is extracted, and the keyword in the problem text is replaced by the entry, so that a new problem text is obtained. On the basis, the new question text and the individual semantic representations of a plurality of words in the text of the chapters are obtained, and the individual semantic representations of the words are utilized, so that the answer text of the question text is obtained through prediction in the text of the chapters.

In a specific implementation scenario, at least one of an adjective, a noun, a name entity, and the like of the question text may be identified as a keyword, and the specific process may refer to the foregoing related description, which is not described herein again. In addition, for the entries in the reference text, reference may be made to the foregoing related description, and details are not repeated here.

In another specific implementation scenario, as mentioned above, to improve the efficiency of predicting answer text, a prediction network may be trained in advance, and the prediction network may include but is not limited to: the method includes the steps of inputting a new question text and an individual semantic representation of a plurality of words in a chapter text into a prediction network, so that a first probability value that each word in the chapter text is a starting position of an answer text and a second probability value that each word in the chapter text is an ending position of the answer text can be predicted, finally determining the starting position word and the ending position word of the answer text based on the first probability value and the second probability value of each word, and taking the combination of the starting position word, the ending position word and the words between the starting position word and the ending position word as the answer text.

In another specific implementation scenario, please refer to fig. 4 in combination, and fig. 4 is a schematic diagram of an embodiment of answering a question text based on a chapter text and a reference text. As shown in fig. 4, semantic relevance between keywords (e.g., yellow, sweet, and fruit) in the question text and each reference text may be obtained, for convenience of description, a reference text with an entry of "banana" is referred to as reference text 1, a reference text with an entry of "apple" is referred to as reference text 2, and a reference text with an entry of "lemon" is referred to as reference text 3, for example, semantic relevance between the keywords "yellow" and the reference texts 1 to 3 is: 0.5, 0, 0.5, and the semantic association of the keyword "sweet" with reference text 1 to reference text 3 is: 0.5, 0.5 and 0, and moreover, the semantic association degrees of the keyword 'fruit' with the reference texts 1 to 3 are respectively as follows: 1/3, 1/3, 1/3, the semantic relevance of each reference text with the keywords can be fused (e.g., weighted sum, direct sum, etc.) as the importance of the reference text. Taking the example of fusing by direct summation, the importance of the reference text 1 is 4/3, the importance of the reference text 1 is 5/6, and the importance of the reference text 2 is 5/6, that is, the importance of the reference text 1 is the highest. On the basis, the keywords in the question text can be replaced by the entries of the reference text 1, i.e., the question text shown in fig. 4 can be updated to "where is the banana? ". Therefore, based on the description steps, the individual semantic representations of a plurality of words in the new question text and the text of the chapters can be obtained, and the answer text can be predicted according to the individual semantic representations.

In an implementation scenario, in order to improve the efficiency of question answering, a question answering model may be trained in advance using training samples, and the training samples may specifically include: the specific training process of the sample question text, the sample chapter text, and the sample answer text may refer to the following embodiments of the present disclosure, which are not repeated herein. After the question answering model is obtained through training, the answer text of the question text can be predicted and obtained through the question answering model. In addition, the question answering model may specifically include: the individual semantic extraction network, the original semantic extraction network, and the prediction network may refer to the foregoing related descriptions, and are not described herein again.

Referring to fig. 5, fig. 5 is a flowchart illustrating an embodiment of step S13 in fig. 1. The method specifically comprises the following steps:

step S51: and respectively utilizing the individual semantic representation of each word and the original semantic representation of each reference text to obtain the semantic association degree between each word and each reference text.

Specifically, one of the words may be used as the current word, and the individual semantic representation of the current word is subjected to point multiplication with the original semantic representation of each reference text to obtain an initial association degree between the current word and each reference text, so that the initial association degree between the current word and each reference text is normalized to obtain a semantic association degree between the current word and each reference text. The individual semantic representation and the original semantic representation can be specifically regarded as vectors with certain dimensionality, so that point multiplication operation can be completed by multiplying corresponding elements in the individual semantic representation and the original semantic representation and then summing the multiplied elements integrally, and details are not repeated herein. In the mode, one word in the plurality of words is used as the current word, the individual semantic representation of the current word is directly subjected to point multiplication with the original semantic representation of each reference text, the initial association degree between the current word and each reference text is obtained, and the semantic association degree can be obtained through normalization, so that the complexity of obtaining the semantic association degree can be reduced.

Referring to fig. 6, fig. 6 is a state diagram of an embodiment of the semantic association obtaining process, which can be used to combine the tth word w in the question text and the chapter text as described above_tIs denoted as H_L(t) and noting the original semantic representation of the ith reference text as

Taking the question text and the chapter text with M words in total and the reference text with N words in total as an example, the word w can be expressed_tAs a current word, so that its individual semantic representation can be denoted as H_L(t) original semantic representations with respective reference texts

(wherein i ∈ [1, N)]) Performing dot multiplication to obtain the current word w_tInitial association degree alpha between each reference text and each reference text_ti：

In the above equation (5), a dot product operation is expressed. On the basis, N initial relevance degrees can be spliced to form a vector

On the basis of the above-mentioned vector quantity

Normalization is performed, for example, using softmax on the vector

Normalization is carried out to obtain semantic association degree beta_t：

For other words, the analogy can be repeated, and no one example is given here.

Step S52: and fusing the original semantic representations by utilizing the semantic association degree between the words and the reference texts to obtain fused semantic representations of the words.

Specifically, the original semantic representations of the respective reference texts may be weighted by the semantic association between the words and the respective reference texts. Still in the above word w_tFor example, the word w may be used_tWith respective reference text Dⁱ(wherein i ∈ [1, N)]) Semantic relevance between, original semantic representation of each reference text

(wherein i ∈ [1, N)]) IntoLine weighting, resulting in a fused semantic representation of the word, k (t):

for other words, the same can be done, and finally the fused semantic representation K (1) of the 1 st word, the fused semantic representation K (2) of the 2 nd word to the fused semantic representation K (M) of the M-th word can be obtained. In order to facilitate subsequent processing of the fused semantic representations, the fused semantic representations of the M words may be spliced to obtain a final fused semantic representation K:

K＝[K(1),K(2),…,K(M)]……(9)

step S53: and predicting the answer text from the text of the chapters based on the individual semantic representation and the fusion semantic representation of the words.

In an implementation scenario, as described in the foregoing disclosure, in order to improve the efficiency of predicting the answer text, a prediction network may be trained in advance, so that individual semantic representations and fused semantic representations of a plurality of words may be directly sent to the prediction network, and finally the answer text is predicted. Reference may be made to the related description in the foregoing disclosed embodiments, and details are not repeated herein.

In another implementation scenario, to further improve the accuracy of predicting the answer text, a first probability value and a second probability value of each word in the text of the chapter may be predicted respectively by using an individual semantic representation and a fused semantic representation of the words and a global semantic representation used for representing the overall semantics of the words, so that a starting word and an ending word are determined in the text of the chapter based on the first probability value and the second probability value, and the answer text is obtained by using the starting word and the ending word, and the global semantic representation is obtained based on the individual semantic representations of all the words, the first probability value represents the possibility that a word is the starting position of the answer text, and the second probability value represents the possibility that a word is the ending position of the answer text. In the mode, by further introducing the global semantic representation representing the whole semantics of a plurality of words, the local semantic information (namely, the individual semantic representation), the external knowledge (namely, the fusion semantic representation) and the global semantic information (namely, the global semantic representation) can be integrated in the process of predicting the answer text, so that the accuracy of the predicted answer text can be improved.

In a specific implementation scenario, the individual semantic representation of the words at the preset positions in the question text and the chapter text can be directly used as the global semantic representation. For example, the individual semantic representation of the character [ CLS ] in expression (1) may be used as the global semantic representation, or the individual semantic representation of [ SEP ] located at the end in expression (1) may be used as the global semantic representation. By the method, the individual semantic representation of the words at the preset positions in the problem text and the chapter text is directly used as the global semantic representation, so that the complexity of obtaining the global semantic representation can be reduced.

In another specific implementation scenario, global average pooling may also be performed on the individual semantic representations of the plurality of terms to obtain a global semantic representation. Taking the individual semantic representation as an L-dimensional vector as an example, an average value of the 1 st element in the M individual semantic representations may be specifically calculated as the 1 st element in the global semantic representation, and so on, an average value of the kth element in the M individual semantic representations is calculated as the kth element in the global semantic representation, until an average value of the L th element in the M individual semantic representations is calculated as the L th element in the global semantic representation. By the mode, global average pooling is carried out on the individual semantic representations of the words to obtain global semantic representation, and improvement of accuracy of the global semantic representation can be facilitated.

In yet another specific implementation scenario, the individual semantic representations of the terms may be globally pooled to obtain a global semantic representation. Taking the individual semantic representation as an L-dimensional vector as an example, the maximum value of the 1 st element in the M individual semantic representations may be counted, and the maximum value of the kth element in the M individual semantic representations may be used as the 1 st element in the global semantic representation, and so on, and the maximum value of the kth element in the M individual semantic representations may be counted, and the maximum value of the kth element in the global semantic representation may be counted, until the maximum value of the L-th element in the M individual semantic representations is counted, and the maximum value of the L-th element in the global semantic representation may be used as the L-th. By the mode, the global semantic representation is obtained by performing global maximum pooling on the individual semantic representations of the words, and the accuracy of the global semantic representation can be improved.

In yet another specific implementation scenario, for convenience of description, the global semantic representation may be recorded as G, and as described above, the fused semantic representations of the words may be spliced to obtain the final fused semantic representation K, and similarly, the individual semantic representations of the words may be spliced to obtain the final individual semantic representation H_LOn the basis, the final individual semantic representation H can be expressed_LMultiplying the dimensionality corresponding to the global semantic representation G and multiplying the multiplication result with the final individual semantic representation H_LSending the final fusion semantic expression K into a first prediction network to obtain a first probability value p of each word_sAnd the multiplication result is compared with the final individual semantic representation H_LSending the final fusion semantic expression K into a second prediction network to obtain a second probability value p of each word_e：

p_s＝[H_L；K；G⊙H_L]·W_s……(10)

p_e＝[H_L；K；G⊙H_L]·W_e……(11)

In the above equations (10) and (11), as_sRepresenting a network parameter of the first predicted network, W_eA network parameter representing a second prediction network, [ 2 ]]A splice is indicated.

In another specific implementation scenario, in order to improve the prediction accuracy, the individual semantic representation and the fused semantic representation of the words and the global semantic representation used for representing the overall semantics of the words may be further used to predict a first probability value of each word in the text of the chapter, which may specifically refer to the foregoing description and formula (12), and on this basis, the individual semantic representation and the fused semantic representation of the words and the individual semantic representation of the word corresponding to the global semantic representation and the maximum first probability value may be further used to predict a second probability value of each word. According to the method, the first probability value of each word is obtained through prediction, and then the second probability value is predicted according to the individual semantic representation of the word corresponding to the maximum first probability value, the individual semantic representation, the fusion semantic representation and the global semantic representation of a plurality of words, so that the prediction result of the starting position can be fully considered in the prediction process of the ending position, and the accuracy of the prediction of the ending position can be improved.

Specifically, as mentioned above, the individual semantic representation is specifically an L-dimensional vector, so the final individual semantic representation is H_LThe vector is M × L, and for convenience of subsequent data processing, the individual semantics of the word corresponding to the largest first probability value may be copied M times to form a vector of M × L, which may be denoted as H for convenience of description_s. On the basis, the second probability value p of each word can be obtained through the following formula_e：

p_e＝[H_L；K；G⊙H_L；H_s]·W_e……(12)

In a further specific implementation scenario, the first probability values p of the individual words are obtained_sAnd a second probability value p_eThen, one word in the text of the chapters can be used as the current word, the product of the first probability value of the current word and the probability of the second probability value of each word behind the current word is counted, and finally the combination of the two words and the word between the two words corresponding to the maximum probability product can be used as the answer text. With continuing reference to fig. 4, the first word "water" in the chapter text may be taken as the current word, and the product of the first probability value of "water" and the probability of the second probability value of each word (i.e., "fruit", "shop", "back", … … "," most "," right ", and" side ") after" water "may be counted, and the similar processing may be performed on the second word" fruit ", the third word" shop "until the second last word" right ", and finally the product of the first probability value of" most left "inside" and the probability of the second probability value of "side" in the chapter text of fig. 4 may be counted to be the largest, so that "most" and "side" and the same may be counted to be the largestThe combination of "left" in the room, i.e., "leftmost" as the question text "where yellow sweet fruits are placed? "is used as the answer text. Other cases may be analogized, and no one example is given here.

Different from the embodiment, the original semantic representations are fused through the semantic association degrees between the words and the reference texts respectively to obtain fused semantic representations, namely the fused semantic representations of the words are more dependent on the original semantic representations of the reference texts with large semantic association degrees, on the basis, the answer texts are predicted by utilizing the individual semantic representations of the words and the fused semantic representations of the introduced external knowledge, and the accuracy of the answer texts can be improved.

Referring to fig. 7, fig. 7 is a flow chart illustrating a question answering method according to another embodiment of the present application. The method specifically comprises the following steps:

step S71: and acquiring a question text and a chapter text, and acquiring a reference text of a plurality of knowledge points.

In the embodiment of the disclosure, the question text and the chapter text contain words, and the knowledge points are related to at least one of the question text and the chapter text. Reference may be made to the related description in the foregoing disclosed embodiments, and details are not repeated herein.

Step S72: individual semantic representations of a number of words are extracted, and original semantic representations of respective reference texts are extracted.

Reference may be made to the related description in the foregoing disclosed embodiments, and details are not repeated herein.

Step S73: and predicting an answer text of the question text from the discourse text by utilizing the individual semantic representations of the words and the original semantic representation of each reference text.

In the embodiment of the present disclosure, the answer text is obtained based on a first probability value and a second probability value of each word in the chapter text, the first probability value represents a possibility that the word is a starting position of the answer text, the second probability value represents a possibility that the word is an ending position of the answer text, the first probability value and the second probability value are obtained by using an individual semantic representation and a fusion semantic representation of a plurality of words and a global semantic representation prediction for representing an overall semantic representation of the plurality of words, which may be specifically referred to the related description in the foregoing embodiment and are not described herein again,

step S74: among the individual semantic representations of the words, an individual semantic representation of a word at a start position of the answer text is selected, and an individual semantic representation of a word at an end position of the answer text is selected.

As with the previously disclosed embodiments, for ease of description, the individual semantic representations of the words of the beginning position of the answer text may be denoted as H_sAnd marking the individual semantic representation of the answer text ending position word as H_e。

Step S75: and predicting to obtain a third probability value by using the individual semantic representation of the initial position words, the individual semantic representation of the ending position words and the global semantic representation.

In the embodiment of the present disclosure, the third probability value represents a possibility that the answer text of the question text does not exist in the discourse text. Although the answer text of the question text can be predicted in the discourse text through the related steps of the answer text prediction, in a real scene, there may be a case that the answer text of the question text does not really exist in the discourse text. Referring to fig. 8, fig. 8 is a diagram of another embodiment of answering a question text based on a chapter text and a reference text. As shown in fig. 8, for the question text "where the blue fruit is placed? It is obvious that the text of the chapters does not actually refer to any blue fruit, so that the text of the chapters does not actually have the real answer text of the question text, and therefore, on the basis of the prediction of the answer text, the answer rejection prediction is further performed, that is, the possibility that the answer text of the question text does not exist in the text of the chapters is predicted, and the false answer rate of the question answer can be favorably reduced.

As previously described, a third predictive network may be trained in advance in order to improve prediction efficiency. The third prediction network may specifically include, but is not limited to: a fully connected layer, a normalization layer, and is not limited herein. On the basis, the individual semantic meaning of the initial position word can be expressed H_sIndividuals who end position wordsSemantic representation H_eAnd sending the global semantic representation G into a third prediction network to obtain a third probability value p_NA：

p_NA＝sigmoid([H_s；H_e；G]·W_NA)……(13)

In the above formula (13), W_NAA network parameter (e.g., a network parameter of a full link layer) representing a third predicted network]Representing the concatenation, a sigmoid function is used to normalize the output range to a range of 0 to 1.

Step S76: determining whether to output the answer text based on the first probability value of the start position word, the second probability value of the end position word, and the third probability value.

Specifically, the product of the probabilities of the first probability value of the start position word and the second probability value of the end position word may be obtained, which may be denoted as p for convenience of description_AAnd comparing the product p of the probabilities_AAnd a third probability value p_NAThe product p of the magnitude of (1) and the probability_AGreater than a third probability value p_NAIn the case of (1), the answer text can be determined to be output, that is, the question text can be considered to have the answer text, and the answer text of the question text is the answer text predicted from the chapter text; otherwise, the product p in probability_ANot greater than a third probability value p_NAIn the case of (3), it is determined not to output the answer text, that is, it may be considered that no answer text exists in the question text.

Referring to fig. 8, since the text of the chapters does not refer to any "blue fruit", and the reference text is introduced to explain the knowledge points in the text of the chapters in detail, but all the knowledge points do not match with the "blue fruit", the third probability value p finally obtained_NAGreater than product of probability p_AThat is, the answer text is more likely to be rejected than the predicted answer text, so that it can be considered that there is no corresponding answer text for the question text, and it can be determined not to output the answer text. Other cases may be analogized, and no one example is given here.

Different from the foregoing embodiment, by selecting the individual semantic representation of the word at the start position of the answer text and the individual semantic representation of the word at the end position of the answer text from the individual semantic representations of the words, a third probability value is obtained by prediction by using the individual semantic representation of the word at the start position, the individual semantic representation of the word at the end position and the global semantic representation, and the third probability value represents the possibility that the answer text of the question text does not exist in the text of the chapter, and whether the answer text is output or not is determined based on the first probability value of the word at the start position, the second probability value of the word at the end position and the third probability value, so that the robustness of the answer to the question can be further improved. In addition, because the answer text prediction and the answer rejection prediction both refer to the global semantic representation, an interaction mechanism can be favorably introduced into the answer text prediction process and the answer rejection prediction process, so that the answer rejection prediction process can take the information into account through the global semantic representation under the condition that the answer text prediction process gives the answer text with higher confidence coefficient, and the answer rejection prediction process can tend to output the prediction of 'not rejecting' more, otherwise, the answer rejection prediction process can take the information into account through the global semantic representation under the condition that the answer text prediction process gives the answer text with lower confidence coefficient, so that the answer rejection prediction process tends to output the prediction of 'rejecting', and the accuracy of the answer rejection prediction is favorably improved.

Referring to fig. 9, fig. 9 is a flowchart illustrating an embodiment of a method for training a question answering model. The method specifically comprises the following steps:

step S91: and acquiring a training sample and acquiring sample reference texts of a plurality of knowledge points.

In the embodiment of the present disclosure, the training samples may specifically include: the system may further include a sample question text, a sample chapter text, and a sample answer text, the sample question text and the sample chapter text may contain a number of sample words, and a number of knowledge points are associated with at least one of the sample question text and the sample chapter text. Reference may be made to the related description in the foregoing embodiments, which are not repeated herein.

In addition, for the convenience of subsequent processing, the sample answer text can be represented by 0-1 vectors, specifically, a starting position vector and an ending position vector. Taking the sample chapter text "banana in refrigerator and apple on table", and the sample question text "where yellow fruit is" as an example, the sample answer text is actually "in refrigerator", for convenience of subsequent processing, a 0-1 vector may be used to represent the starting position vector of the sample answer text as [ 0010000000 ], and a 0-1 vector may be used to represent the ending position vector of the sample answer text as [ 0000100000 ], that is, "0" represents that the sample word is not available for answering the sample question text, and "1" represents that the sample word is available for answering the sample question text. Other cases may be analogized, and no one example is given here.

In one implementation scenario, the sample answer text intercepted from the sample chapter text may be referred to as a first sample answer text, and the sample question text corresponding to the first sample answer text may be referred to as a first sample question text, in addition, in order to improve robustness of the question answer model, keyword recognition may be performed on the first sample question text, and the recognized keyword may be replaced by other different words to obtain a second question text, so that a combination of the sample chapter text, the first sample question text, and the first sample answer text, and a combination of the sample chapter text, the second sample question text, and a preset text may be used as a training sample, and the preset text is used to indicate that there is no answer text in the sample chapter text that answers to the second sample question text. Specifically, the preset text may be "no answer", or may be represented by an all-zero vector, which is not limited herein. In the above manner, the keywords are identified for the first sample question text, and the identified keywords are replaced with other different words, so that the second question text can be constructed, the answer text for answering the second question text does not exist in the sample chapter text, and the training sample is constructed based on the sample chapter text, the first sample question text, the first sample answer text, the second sample question text and the preset text, so that the training sample contains both the sample question text capable of being answered and the sample question text incapable of being answered, and further, the robustness of the question answer model can be improved in the training process by using the training sample.

In a specific implementation scenario, a combination of the sample chapter text, the first sample question text, and the first sample answer text may be used as a set of training samples, and a combination of the sample chapter text, the second sample question text, and the preset text may be used as a set of training samples.

In another specific implementation scenario, terms such as noun, name entity, adjective, time, etc. in the first sample question text may be identified as keywords. Specifically, the keywords may be identified by an NER (Named Entity identification) tool such as LTP, Stanza, and the like.

In another specific implementation scenario, only one keyword may be replaced at a time in order to ensure semantic consistency of the second sample question text obtained after replacing the keyword. In addition, a plurality of keywords can be replaced each time, and semantic consistency of the keywords is checked manually after replacement, so that a second sample question text is obtained.

Specifically, the name entity in the first sample question text may be replaced with another name entity in the sample chapter text. Referring to fig. 10, fig. 10 is a schematic diagram of an embodiment of a training sample obtaining process. As shown in fig. 10, for the first sample question text "where is the birth site of turing? ", one can identify the named entity" Turing ", so that the named entity can be replaced with another entity" Einstein "in the sample chapter text, get a second sample question text" where the birth location of Einstein? ", other cases may be analogized, and are not exemplified herein.

Specifically, a time in the first sample question text may also be replaced with another time in the sample chapter text. With continuing reference to fig. 10, for the first sample question text "which school was visited in figure 1926? ", a time" 1926 year "can be identified and replaced with another time" end of 1927 "in the sample chapter text, a second sample question text" school of which did pictures go at end of 1927? ", other cases may be analogized, and are not exemplified herein.

In particular, nouns in the first sample question text may also be replaced by their anti-sense words. With continuing reference to fig. 10, for the first sample question text, "which paper of turing received a reward? "can one identify a noun" reward "and can replace that noun with its anti-sense word" penalty ", get a second sample question text" which paper of Tuoling received the penalty? ", other cases may be analogized, and are not exemplified herein.

Specifically, the adjectives in the first sample question text may also be replaced with their anti-sense words. With continuing reference to fig. 10, for the first sample question text, "what was obtained because of excellent performance of Tuoling? "the adjective" excellent "can be recognized and obtained, so that the adjective can be replaced with its antisense word" poor ", and a second sample question text" what was obtained because of poor performance? ", other cases may be analogized, and are not exemplified herein.

Step S92: and extracting sample individual semantic representations of a plurality of sample words by using an individual semantic extraction network of the question answering model, and extracting sample original semantic representations of all sample reference texts by using an original semantic extraction network of the question answering model.

Step S93: and predicting to obtain a first prediction probability value and a second prediction probability value of each sample word in the sample chapter text by utilizing the sample individual semantic representation of a plurality of sample words and the sample original semantic representation of each sample reference text based on the prediction network of the question answer model.

In the embodiment of the disclosure, the first prediction probability value represents the possibility that the sample word is the starting position of the sample answer text, and the second prediction probability value represents the possibility that the sample word is the ending position of the sample answer text.

Specifically, the sample individual semantic representation of each sample word and the sample original semantic representation of each sample reference text may be used to obtain the sample semantic association degree between each sample word and each sample reference text, and the sample original semantic representations may be fused to obtain the sample fusion semantic representation of each sample word, on the basis, the sample individual semantic representation and the sample fusion semantic representation of the sample words may be sent to a prediction network to obtain the first prediction probability value and the second prediction probability value.

In an implementation scenario, one sample word of the sample words may be used as a current sample word, and the sample individual semantic representation of the current sample word is subjected to point multiplication with the sample original semantic representation of each sample reference text, so as to obtain a sample initial association degree between the sample current word and each sample reference text, and finally, the sample initial association degree between the sample current word and each sample reference text is normalized, so as to obtain a sample semantic association degree between the sample current word and each sample reference text. Specifically, reference may be made to the process for obtaining the semantic association degree in the foregoing disclosed embodiments, which is not described herein again.

In another implementation scenario, the prediction network may specifically include a first prediction network for predicting the starting position and a second prediction network for predicting the ending position, so that sample individual semantic representations and sample fusion semantic representations of a plurality of sample words and sample global semantic representations for representing the overall semantics of the plurality of sample words may be sent to the first prediction network to obtain first prediction probability values of the sample words in the sample chapter text, and the sample individual semantic representations and the sample fusion semantic representations of the sample words and the sample global semantic representations for representing the overall semantics of the plurality of sample words may be sent to the second prediction network to obtain second prediction probability values of the sample words in the sample chapter text. Specifically, the sample global semantic representation is obtained based on the individual semantic representations of all sample words, which may specifically refer to the process of obtaining the global semantic representation in the foregoing disclosed embodiment, and details are not described here.

In yet another implementation scenario, the sample individual semantic representation and the sample fusion semantic representation of the sample words, and the sample global semantic representation for representing the overall semantics of the sample words may also be sent to the first prediction network to obtain a first prediction probability value of each sample word in the sample chapter text, and on this basis, the sample individual semantic representation and the sample fusion semantic representation of the sample words, and the sample individual semantic representation of the sample words corresponding to the sample global semantic representation and the start position vector may be sent to the second prediction network to obtain a second prediction probability value of each sample word in the sample chapter text.

For convenience of description, similar to the foregoing disclosed embodiments, sample individual semantic representations of several sample terms may be spliced to obtain a final sample individual semantic representation H_LFurthermore, the starting position vector can be noted as Y_sThen sample individual semantic representation of the starting position sample word H_sCan be written as:

the specific process can refer to the related description in the foregoing disclosed embodiments, and is not repeated herein.

Step S94: and obtaining a loss value of the question answer model based on the sample answer text, the first prediction probability value and the second prediction probability value.

Specifically, for the starting position, a cross entropy loss function can be used to process the starting position vector and first prediction probability values of each sample word in the sample chapter text to obtain a first loss value; for the ending position, the second prediction probability values of the ending position vector and each sample word in the sample chapter text can be processed by using a cross entropy loss function to obtain a second loss value. On the basis, the first loss value and the second loss value can be weighted to obtain the loss value of the question answering model.

Answer text to this training phaseThe prediction is done. Further, similar to the previously described disclosed embodiments, the training phase may also perform a refusal answer prediction in order to reduce the rate of false answers to questions. In particular, a sample individual semantic representation of a sample answer text start position sample word may be selected and a sample individual semantic representation of a sample answer text end position sample word may be selected among sample individual semantic representations of several sample words. In particular, the sample individual semantic representation of the start position sample word may refer to the formula (16) correlation description. Similarly, the ending position vector may be written as Y_eSample individual semantic representation H of end position sample words_eCan be expressed as:

further, the sample individual semantic representation of the sample words at the start position, the sample individual semantic representation of the sample words at the end position and the sample global semantic representation can be sent to a third prediction network to obtain a third prediction probability value, and the third prediction probability value represents the possibility that the sample answer text of the sample question text does not exist in the sample chapter text. On the basis, a third loss value can be obtained based on whether the sample question text is marked with the sample answer text and the third prediction probability value, and further, the loss value of the question answer model can be obtained based on the first loss value, the second loss value and the third loss value. Therefore, the global semantic representation of the sample is referred to in the training answer text prediction link and the training answer rejection prediction link, so that an interaction mechanism can be introduced in the training answer text prediction link and the training answer rejection prediction link, and the training answer text prediction link and the training answer rejection prediction link are mutually promoted and supplemented.

As described above, in the training stage, the answer text prediction and the answer rejection prediction may be performed together, so that when the sample question text is marked with the corresponding sample answer text, the third loss value may be masked, only the first loss value and the second loss value are weighted to obtain the loss value of the question answer model, and the subsequent step of adjusting the network parameter of the question answer model is performed using the loss value; and under the condition that the sample question text is not marked with the corresponding sample answer text, shielding the first loss value and the second loss value, taking the third loss value as the loss value of the question answer model, and performing the subsequent step of adjusting the network parameters of the question answer model by using the loss value.

It should be noted that, in the training process, the answer rejection prediction does not depend on the first prediction probability value and the second prediction probability value obtained by the answer text prediction, so the answer rejection prediction and the answer text prediction can be executed in sequence, that is, the first prediction probability value and the second prediction probability value can be obtained by predicting first and then the third prediction probability value can be obtained by predicting, or the third prediction probability value can be obtained by predicting first and then the first prediction probability value and the second prediction probability value can be obtained by predicting; the prediction may also be performed simultaneously, that is, the first prediction probability value, the second prediction probability value, and the third prediction probability value are obtained through simultaneous prediction, which is not limited herein.

Step S95: and adjusting network parameters of the question answering model by using the loss value.

Specifically, the network parameters of the question answering model can be adjusted by using loss values in a random Gradient Descent (SGD), Batch Gradient Descent (BGD), small Batch Gradient Descent (Mini-Batch Gradient Descent, MBGD), and other manners, wherein the Batch Gradient Descent refers to updating the parameters by using all samples during each iteration; the random gradient descent means that one sample is used for parameter updating in each iteration; the small batch gradient descent means that a batch of samples is used for parameter updating at each iteration, and details are not repeated here.

Different from the embodiment, the method comprises the steps of obtaining training samples, obtaining sample reference texts of a plurality of knowledge points, extracting sample individual semantic representations of a plurality of sample words by using an individual semantic extraction network of a question answering model, extracting sample original semantic representations of each sample reference text by using an original semantic extraction network of the question answering model, predicting to obtain a first prediction probability value and a second prediction probability value of each sample word in a sample chapter text by using the sample individual semantic representations of the plurality of sample words and the sample original semantic representations of each sample reference text based on a prediction network of the question answering model, obtaining a loss value of the question answering model based on the sample answering text, the first prediction probability value and the second prediction probability value, and adjusting network parameters of the question answering model by using the loss value, and then, external knowledge can be introduced in the training process, so that the background of the sample chapter text and the sample question text can be expanded, and the accuracy of the question answering model can be improved.

Referring to fig. 11, fig. 11 is a schematic diagram of a frame of an electronic device 110 according to an embodiment of the present application. The electronic device 110 comprises a memory 111 and a processor 112 coupled to each other, wherein the memory 111 stores program instructions, and the processor 112 is configured to execute the program instructions to implement the steps in any of the above-mentioned embodiments of the question answering method, or to implement the steps in the above-mentioned embodiments of the training method for the question answering model. Specifically, electronic device 110 may include, but is not limited to: desktop computers, notebook computers, tablet computers, servers, and the like, without limitation.

In particular, the processor 112 is configured to control itself and the memory 102 to implement the steps in any of the above-described question answering method embodiments, or in the training method embodiments of the question answering model described above. Processor 112 may also be referred to as a CPU (Central Processing Unit). The processor 112 may be an integrated circuit chip having signal processing capabilities. The Processor 112 may also be a general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. In addition, the processor 112 may be commonly implemented by integrated circuit chips.

In the embodiment of the disclosure, the processor 112 is configured to obtain a question text and a chapter text, and obtain a reference text of a plurality of knowledge points; the question text and the chapter text contain a plurality of words, and the knowledge points are related to at least one of the question text and the chapter text; the processor 112 is configured to extract individual semantic representations of the plurality of words and extract an original semantic representation of each reference text; the processor 112 is configured to predict an answer text of the question text from the chapter text using the individual semantic representations of the words and the original semantic representations of the respective reference texts.

In some disclosed embodiments, the processor 112 is configured to obtain semantic association degrees between the respective words and the respective reference texts by using the individual semantic representation of each word and the original semantic representation of the respective reference texts; the processor 112 is configured to fuse the original semantic representations by using semantic association degrees between the words and the reference texts to obtain a fused semantic representation of the words; the processor 112 is configured to predict the answer text from the chapter text based on the individual semantic representations and the fused semantic representation of the words.

In some disclosed embodiments, the processor 112 is configured to use one of the plurality of words as the current word; the processor 112 is configured to perform dot multiplication on the individual semantic representations of the current word and the original semantic representations of the reference texts, so as to obtain an initial association degree between the current word and each reference text; the processor 112 is configured to normalize the initial association degrees between the current word and each reference text, and obtain semantic association degrees between the current word and each reference text.

Different from the embodiment, one of the words is taken as the current word, the individual semantic representation of the current word is directly multiplied by the original semantic representation of each reference text to obtain the initial association degree between the current word and each reference text, and the semantic association degree can be obtained by normalization, so that the complexity of obtaining the semantic association degree can be reduced.

In some disclosed embodiments, the processor 112 is configured to predict a first probability value and a second probability value of each term in the text of the chapters respectively by using the individual semantic representation and the fused semantic representation of the terms and the global semantic representation representing the overall semantics of the terms; the processor 112 is configured to determine a starting word and an ending word in the text of the chapters based on the first probability value and the second probability value, and obtain an answer text by using the starting word and the ending word; wherein the global semantic representation is derived based on individual semantic representations of all terms, the first probability value represents a likelihood that a term is a starting position of the answer text, and the second probability value represents a likelihood that a term is an ending position of the answer text.

Different from the embodiment, by further introducing global semantic representation representing the whole semantics of a plurality of words, local semantic information (namely, individual semantic representation), external knowledge (namely, fusion semantic representation) and global semantic information (namely, global semantic representation) can be integrated in the process of predicting the answer text, so that the accuracy of the predicted answer text can be improved.

In some disclosed embodiments, the processor 112 is configured to predict a first probability value of each term in the text of the chapters using the individual semantic representation and the fused semantic representation of the terms, and the global semantic representation representing the overall semantics of the terms; the processor 112 is configured to predict a second probability value of each word by using the individual semantic representation and the fused semantic representation of the words, and the global semantic representation, the individual semantic representation of the word corresponding to the maximum first probability value

Different from the embodiment, the first probability value of each word is obtained by prediction, and then the second probability value is predicted by depending on the individual semantic representation of the word corresponding to the maximum first probability value, the individual semantic representation of a plurality of words, the fusion semantic representation and the global semantic representation, so that the prediction result of the starting position can be fully considered in the prediction process of the ending position, and the accuracy of the prediction of the ending position can be improved.

In some disclosed embodiments, the global semantic representation and the individual semantic representations are the same size, and the processor 112 is configured to perform any of the following to obtain the global semantic representation: taking the individual semantic representation of the words at the preset positions in the question text and the text of the sections as global semantic representation; carrying out global average pooling on the individual semantic representations of the words to obtain global semantic representation; and carrying out global maximum pooling on the individual semantic representations of the words to obtain global semantic representation.

Different from the embodiment, the individual semantic representation of the words at the preset positions in the question text and the text of the chapters is directly used as the global semantic representation, which is beneficial to reducing the complexity of obtaining the global semantic representation; global average pooling is carried out on the individual semantic representations of the words to obtain global semantic representation, so that the accuracy of the global semantic representation can be improved; in addition, global semantic representation is obtained by performing global maximum pooling on the individual semantic representations of the words, and the accuracy of global semantic representation can be improved.

In some disclosed embodiments, the answer text is derived based on a first probability value and a second probability value for each term in the text of the chapter, the first probability value representing a likelihood that the term is a starting position of the answer text, the second probability value representing a likelihood that the term is an ending position of the answer text, the first probability value and the second probability value being predicted using individual semantic representations and fused semantic representations of the terms, and a global semantic representation representing a global semantic representation of the overall semantics of the terms, the processor 112 is configured to select, among the individual semantic representations of the terms, an individual semantic representation of the term at the starting position of the answer text and an individual semantic representation of the term at the ending position of the answer text; the processor 112 is configured to predict a third probability value by using the individual semantic representation of the start position word, the individual semantic representation of the end position word, and the global semantic representation; wherein the third probability value represents the possibility that the answer text of the question text does not exist in the text of the chapters; the processor 112 is configured to determine whether to output the answer text based on the first probability value of the start position word, the second probability value of the end position word, and the third probability value.

In some disclosed embodiments, the processor 112 is configured to obtain a product of probabilities of a first probability value of a start position term and a second probability value of an end position term; the processor 112 is configured to determine to output the answer text if the product of the probabilities is greater than a third probability value; the processor 112 is configured to determine not to output the answer text if the product of the probabilities is not greater than the third probability value.

Unlike the foregoing embodiment, by obtaining a product of probabilities of the first probability value of the start position word and the second probability value of the end position word and comparing a magnitude relationship between the product of the probabilities and the third probability value to determine whether to output the answer text, it is possible to advantageously reduce the complexity of determining whether to output the answer text.

In some disclosed embodiments, the processor 112 is configured to perform keyword recognition on at least one of the question text and the chapter text to obtain a plurality of keywords; the processor 112 is configured to obtain reference texts of knowledge points related to the keywords from a preset knowledge dictionary.

Different from the embodiment, the method obtains the keywords by performing keyword recognition on at least one of the question text and the chapter text, and thus obtains the reference text of the knowledge points related to the keywords from the preset knowledge dictionary, so that the method can improve the relevance of the reference text, the chapter text and the question text, and is further beneficial to improving the reference value of external knowledge in the question answering process and improving the accuracy of question answering.

In some disclosed embodiments, the answer text is predicted based on a question answer model, the question answer model is trained using training samples, and the training samples include: sample question text, sample chapter text, and sample answer text.

Different from the previous embodiment, the answer text is obtained through the question answer model prediction, which can be beneficial to improving the efficiency of question answering.

In some disclosed embodiments, the processor 112 is configured to obtain a sample chapter text, a first sample question text, and a first sample answer text of the first sample question text; wherein the first sample answer text is intercepted from the sample chapter text; the processor 112 is configured to perform keyword recognition on the first question text, and replace the recognized keywords with other different words to obtain a second question text; the processor 112 is configured to use a combination of the sample chapter text, the first sample question text, and the first sample answer text, and a combination of the sample chapter text, the second sample question text, and the preset text as training samples; and the preset text is used for indicating that the answer text which answers the second sample question text does not exist in the sample chapter text.

Different from the foregoing embodiment, the keyword recognition is performed on the first sample question text, and the recognized keyword is replaced with other different words, so that the second question text can be constructed and obtained, an answer text which answers the second question text does not exist in the sample chapter text, and a training sample is constructed based on the sample chapter text, the first sample question text, the first sample answer text, the second sample question text and the preset text, so that the training sample contains both the sample question text which can be answered and the sample question text which cannot be answered, and therefore, the robustness of the question answer model can be improved in the training process by using the training sample.

Referring to fig. 12, fig. 12 is a schematic diagram of a memory device 120 according to an embodiment of the present application. The memory device 120 stores program instructions 121 that can be executed by the processor, the program instructions 121 being configured to implement the steps in any of the above-described embodiments of the question answering method, or the steps in the above-described embodiments of the training method for the question answering model.

By the scheme, external knowledge can be introduced in the question answering process, the background of the chapter text and the question text can be expanded, and the accuracy of question answering can be improved.

In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.

The foregoing description of the various embodiments is intended to highlight various differences between the embodiments, and the same or similar parts may be referred to each other, and for brevity, will not be described again herein.

In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.

Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Claims

1. A method of answering questions, comprising:

acquiring a problem text and a chapter text, and acquiring reference texts of a plurality of knowledge points; wherein the question text and the chapter text contain words, and the knowledge points are related to at least one of the question text and the chapter text;

extracting individual semantic representations of the words and extracting original semantic representations of the reference texts;

and predicting the answer text of the question text from the discourse text by using the individual semantic representations of the words and the original semantic representation of each reference text.

2. The method of claim 1, wherein predicting the answer text of the question text from the discourse text using the individual semantic representations of the words and the original semantic representation of each of the reference texts comprises:

obtaining semantic association degrees between the words and the reference texts respectively by using the individual semantic representation of each word and the original semantic representation of each reference text respectively;

fusing the original semantic representations by utilizing the semantic association degree between the words and the reference texts to obtain fused semantic representations of the words;

predicting the answer text from the chapter text based on the individual semantic representations and the fused semantic representation of the words.

3. The method according to claim 2, wherein obtaining semantic association degrees between the respective words and the respective reference texts by using the respective individual semantic representations of the words and the respective original semantic representations of the respective reference texts comprises:

respectively taking one word in the plurality of words as a current word;

performing dot multiplication on the individual semantic representation of the current word and the original semantic representation of each reference text to obtain an initial association degree between the current word and each reference text;

and normalizing the initial association degree between the current word and each reference text to obtain the semantic association degree between the current word and each reference text.

4. The method of claim 2, wherein predicting the answer text from the discourse text based on the individual semantic representations and the fused semantic representation of the words comprises:

respectively predicting a first probability value and a second probability value of each word in the discourse text by using the individual semantic representation and the fused semantic representation of the words and a global semantic representation for representing the overall semantics of the words;

determining a starting word and an ending word in the discourse text based on the first probability value and the second probability value, and obtaining the answer text by using the starting word and the ending word;

wherein the global semantic representation is derived based on individual semantic representations of all of the words, the first probability value representing a likelihood that the word is the start position of the answer text, the second probability value representing a likelihood that the word is the end position of the answer text.

5. The method of claim 4, wherein the predicting a first probability value and a second probability value for each term in the text of the chapters using the individual semantic representation and the fused semantic representation of the terms and a global semantic representation for representing the overall semantics of the terms comprises:

predicting a first probability value of each word in the discourse text by using the individual semantic representation and the fused semantic representation of the words and a global semantic representation used for representing the overall semantics of the words;

and predicting a second probability value of each word by using the individual semantic representation and the fused semantic representation of the words, the global semantic representation and the individual semantic representation of the word corresponding to the maximum first probability value.

6. The method of claim 4, wherein the global semantic representation and the individual semantic representations are the same size; the global semantic representation is obtained by any one of the following methods:

taking the individual semantic representation of the words at the preset positions in the question text and the discourse text as the global semantic representation;

carrying out global average pooling on the individual semantic representations of the words to obtain the global semantic representation;

and carrying out global maximum pooling on the individual semantic representations of the words to obtain the global semantic representation.

7. The method of claim 1, wherein the answer text is derived based on a first probability value and a second probability value for each word in the chapter text, the first probability value representing a likelihood that the word is a beginning location of the answer text, the second probability value representing a likelihood that the word is an ending location of the answer text, the first probability value and the second probability value being predicted using the individual semantic representations and the fused semantic representation of the words and a global semantic representation for representing the overall semantics of the words;

after the predicting the answer text of the question text from the chapter text by using the individual semantic representations of the words and the original semantic representation of each of the reference texts, the method further comprises:

selecting an individual semantic representation of a word at a starting position of the answer text and an individual semantic representation of a word at an ending position of the answer text from the individual semantic representations of the words;

predicting to obtain a third probability value by using the individual semantic representation of the starting position word, the individual semantic representation of the ending position word and the global semantic representation; wherein the third probability value represents a likelihood that no answer text of the question text exists in the discourse text;

determining whether to output the answer text based on the first probability value of the start position word, the second probability value of the end position word, and the third probability value.

8. The method of claim 7, wherein the determining whether to output the answer text based on the first probability value for the start position word, the second probability value for the end position word, and the third probability value comprises:

obtaining the product of the probabilities of the first probability value of the initial position word and the second probability value of the ending position word;

determining to output the answer text if the product of the probabilities is greater than the third probability value;

determining not to output the answer text if the product of the probabilities is not greater than the third probability value.

9. The method of claim 1, wherein obtaining the reference text of the knowledge points comprises:

performing keyword identification on at least one of the question text and the chapter text to obtain a plurality of keywords;

and acquiring a reference text of the knowledge points related to the keywords from a preset knowledge dictionary.

10. The method of claim 1, wherein the answer text is predicted based on a question answer model, wherein the question answer model is trained using training samples, and wherein the training samples comprise: sample question text, sample chapter text, and sample answer text.

11. The method of claim 10, wherein the step of obtaining training samples comprises:

acquiring a sample chapter text, a first sample question text and a first sample answer text of the first sample question text; wherein the first sample answer text is intercepted from the sample chapter text;

performing keyword recognition on the first question text, and replacing recognized keywords with other different words to obtain a second question text;

taking the combination of the sample chapter text, the first sample question text and the first sample answer text, and the combination of the sample chapter text, the second sample question text and a preset text as training samples;

and the preset text is used for indicating that no answer text for answering the second sample question text exists in the sample chapter text.

12. An electronic device comprising a memory and a processor coupled to each other, the memory having stored therein program instructions, the processor being configured to execute the program instructions to implement the question answering method according to any one of claims 1 to 11.

13. A storage device storing program instructions executable by a processor to implement the question answering method of any one of claims 1 to 11.