CN112685548B - Question answering method, electronic device and storage device - Google Patents


Info

Publication number
CN112685548B
CN112685548B
Authority
CN
China
Prior art keywords
text
sample
word
question
answer
Prior art date
Legal status
Active
Application number
CN202011627778.7A
Other languages
Chinese (zh)
Other versions
CN112685548A (en)
Inventor
崔一鸣
车万翔
杨子清
王士进
胡国平
秦兵
刘挺
Current Assignee
Hebei Xunfei Institute Of Artificial Intelligence
Iflytek Beijing Co ltd
iFlytek Co Ltd
Original Assignee
Hebei Xunfei Institute Of Artificial Intelligence
Iflytek Beijing Co ltd
iFlytek Co Ltd
Priority date
Filing date
Publication date
Application filed by Hebei Xunfei Institute Of Artificial Intelligence, Iflytek Beijing Co ltd, iFlytek Co Ltd
Priority to CN202011627778.7A
Publication of CN112685548A
Application granted
Publication of CN112685548B
Status: Active

Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02D: Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a question answering method, an electronic device and a storage device. The question answering method includes: acquiring a question text and a chapter text, and acquiring reference texts of a plurality of knowledge points, wherein the question text and the chapter text contain a plurality of words and the knowledge points are related to at least one of the question text and the chapter text; extracting individual semantic representations of the words and an original semantic representation of each reference text; and predicting an answer text of the question text from the chapter text by using the individual semantic representations of the words and the original semantic representations of the reference texts. With this scheme, the accuracy of question answering can be improved.

Description

Question answering method, electronic device and storage device
Technical Field
The present application relates to the field of natural language understanding technologies, and in particular, to a question answering method, an electronic device, and a storage device.
Background
With the development of information technology, machine reading comprehension systems have achieved great performance improvements and, on that basis, can answer questions about a given chapter. However, in real-world scenarios, merely understanding the chapter may still be insufficient to answer a question accurately. For example, if the chapter is "the banana is in the refrigerator, the apple is on the table" and the question is "where is the yellow fruit put", the machine cannot answer accurately because it cannot infer from the chapter alone what "yellow fruit" refers to. In view of this, how to improve the accuracy of question answering is a topic of great value.
Disclosure of Invention
The present application mainly solves the above technical problem by providing a question answering method, an electronic device and a storage device, which can improve the accuracy of question answering.
In order to solve the above technical problem, a first aspect of the present application provides a question answering method, including: acquiring a question text and a chapter text, and acquiring reference texts of a plurality of knowledge points, wherein the question text and the chapter text contain a plurality of words and the knowledge points are related to at least one of the question text and the chapter text; extracting individual semantic representations of the words and an original semantic representation of each reference text; and predicting an answer text of the question text from the chapter text by using the individual semantic representations of the words and the original semantic representations of the reference texts.
In order to solve the above technical problem, a second aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, the memory storing program instructions and the processor being configured to execute the program instructions to implement the question answering method of the first aspect.
In order to solve the above technical problem, a third aspect of the present application provides a storage device storing program instructions executable by a processor, the program instructions being used to implement the question answering method of the first aspect.
According to the above scheme, a question text and a chapter text are acquired, and reference texts of a plurality of knowledge points are acquired, wherein the question text and the chapter text contain a plurality of words and the knowledge points are related to at least one of the question text and the chapter text. On this basis, the individual semantic representations of the words and the original semantic representation of each reference text are extracted, so that the answer text of the question text is predicted from the chapter text by using the individual semantic representations of the words and the original semantic representations of the reference texts. Therefore, by acquiring reference texts of a plurality of knowledge points, external knowledge can be introduced into the question answering process, which helps to expand the background of the chapter text and the question text and thus helps to improve the accuracy of question answering.
Drawings
FIG. 1 is a flow chart of an embodiment of a question answering method of the present application;
FIG. 2 is a schematic diagram of a framework of one embodiment of an individual semantic extraction network;
FIG. 3 is a schematic diagram of a framework of one embodiment of an original semantic extraction network;
FIG. 4 is a schematic diagram of one embodiment of answering question text based on chapter text and reference text;
FIG. 5 is a flowchart illustrating an embodiment of step S13 in FIG. 1;
FIG. 6 is a state diagram of one embodiment of a semantic association acquisition process;
FIG. 7 is a flow chart of another embodiment of the question answering method of the present application;
FIG. 8 is a schematic diagram of another embodiment of answering question text based on chapter text and reference text;
FIG. 9 is a flow chart of one embodiment of a method of training a question answering model;
FIG. 10 is a schematic diagram of an embodiment of a training sample acquisition process;
FIG. 11 is a schematic diagram of a framework of an embodiment of an electronic device of the present application;
FIG. 12 is a schematic diagram of a framework of an embodiment of a storage device of the present application.
Detailed Description
The following describes embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships may exist; for example, A and/or B may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship. Further, "a plurality" herein means two or more.
Referring to fig. 1, fig. 1 is a flowchart illustrating an embodiment of a question answering method according to the present application. Specifically, the method may include the steps of:
step S11: and acquiring the question text and the chapter text, and acquiring the reference text of a plurality of knowledge points.
In the embodiment of the present disclosure, the question text and the chapter text contain a plurality of words; the specific number of words is not limited here. In addition, the question text and the chapter text may be literally related. For example, if the chapter text is "the banana is in the refrigerator, the apple is on the table" and the question text is "where is the banana", the chapter text and the question text are both literally related to "banana". The question text may also be literally unrelated to the chapter text; for example, if the chapter text is "the banana is in the refrigerator, the apple is on the table" and the question text is "where is the yellow fruit", the chapter text and the question text have no literal relevance. The above examples are only a few of the cases that may occur in practice; other possible cases are not excluded and are not enumerated here.
In one implementation scenario, in order to improve accuracy of the subsequent predicted answer text, the question text and the chapter text may be preprocessed, so as to obtain a plurality of words contained in the question text and the chapter text.
In one specific implementation scenario, data cleansing may be performed on the chapter text and the question text. For example, illegal characters such as html (HyperText Markup Language) markup characters (e.g., </body>) in the chapter text and the question text can be removed, and garbled characters caused by encoding errors can be removed. Data cleansing reduces the probability of abnormal errors in the subsequent application process and thus improves the robustness of answer text prediction.
In another specific implementation scenario, the chapter text and the question text may also be segmented. For example, English may be segmented with a Word Piece tool, or directly by the spaces between words; Chinese may be segmented directly at character granularity. Segmentation helps capture fine-grained semantic information.
In yet another specific implementation scenario, it is also possible to replace multiple spaces in both the chapter text and the question text with one space and remove the spaces at the beginning and end of the sentence.
In another specific implementation scenario, after the preprocessing is performed on the chapter text and the question text, format conversion may be performed on the preprocessed chapter text and question text according to actual application needs, for example, may be converted into:
[CLS] Q_1 … Q_n [SEP] P_1 … P_m [SEP] …… (1)
In the above formula (1), [CLS] and [SEP] are separators, Q_1 … Q_n represent the individual words in the preprocessed question text, and P_1 … P_m represent the individual words in the preprocessed chapter text. In addition, according to practical application requirements, the words of the preprocessed chapter text may also be placed first and the words of the preprocessed question text after them, i.e., the input may be converted into [CLS] P_1 … P_m [SEP] Q_1 … Q_n [SEP]; this is not limited here.
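As an illustration of the preprocessing and format conversion described above, the following Python sketch removes HTML markup, normalizes spaces, and assembles the sequence of formula (1); the tag-stripping regular expression and the whitespace/character splitting are simplifying assumptions rather than part of the original scheme.

```python
import re

def preprocess(text):
    # Assumed cleaning steps: strip HTML markup such as </body>, collapse runs of
    # whitespace into a single space, and trim spaces at the beginning and end.
    text = re.sub(r"</?[A-Za-z][^>]*>", " ", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()

def build_input(question, chapter, cls="[CLS]", sep="[SEP]"):
    # Assemble [CLS] Q_1 ... Q_n [SEP] P_1 ... P_m [SEP] as in formula (1).
    # English is split on spaces here; for Chinese, list(text) would give the
    # character-granularity segmentation described above.
    q_tokens = preprocess(question).split()
    p_tokens = preprocess(chapter).split()
    return [cls] + q_tokens + [sep] + p_tokens + [sep]

print(build_input("where is the yellow fruit",
                  "the banana is in the refrigerator, the apple is on the table"))
```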
Further, in the disclosed embodiments, several knowledge points are related to at least one of question text, chapter text. For example, several knowledge points may be related to the question text, still taking the aforementioned question text "where the yellow fruit is", several knowledge points may be related to "yellow fruit", in particular "banana", "orange", "lemon", "grapefruit" etc.; alternatively, a plurality of knowledge points may be related to the chapter text, and still taking the aforementioned chapter text of "banana is in refrigerator and apple is on table" as an example, a plurality of knowledge points may specifically relate to "banana", "refrigerator", "apple"; alternatively, the knowledge points may be related to both the chapter text and the question text, and still taking the aforementioned chapter text "banana is in a refrigerator, apple is on a table" and the question text "yellow fruit is" as an example, the knowledge points may be related to "yellow fruit", and may specifically relate to "banana", "orange", "lemon", "grapefruit", etc., and the knowledge points may also specifically relate to "banana", "refrigerator", "apple". Other situations can be similar and are not exemplified here.
Specifically, in order to improve the pertinence of the reference texts, keyword identification may be performed on at least one of the question text and the chapter text to obtain a plurality of keywords, and on this basis the reference texts of the knowledge points related to the keywords may be obtained from a preset knowledge dictionary. The preset knowledge dictionary may specifically include, but is not limited to, Baidu Encyclopedia, Wikipedia, a Chinese dictionary, the Oxford Dictionary, and the like, which is not limited here. By identifying keywords from at least one of the question text and the chapter text and then obtaining from the preset knowledge dictionary the reference texts of the knowledge points related to those keywords, the correlation between the reference texts and the chapter text and question text can be improved, which improves the reference value of the external knowledge in the question answering process and thus the accuracy of question answering.
In one implementation scenario, the preset knowledge dictionary may include a plurality of dictionary items, and each dictionary item may include an entry, basic information, an explanation, and an example sentence. The entry represents the topic of the dictionary item, the basic information may include the part of speech, pinyin (or phonetic symbols), and similar information of the topic, the explanation represents the paraphrase of the topic, and the example sentence is a sentence containing the topic. Taking the dictionary item "banana" as an example, the entry is "banana", the basic information may include the part of speech of "banana" (i.e., a noun) and its pinyin (i.e., xiāngjiāo), the explanation may include "monocotyledonous plant, Musaceae, oblong leaves, berry pulp, oblong column, yellow peel when ripe, distributed in tropical and subtropical areas, soft and sweet pulp", and the example sentence may be "the monkeys in the zoo love bananas". Other cases can be deduced by analogy and are not enumerated here.
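For concreteness, one dictionary item with the four fields described above might be laid out as follows; the field and class names are illustrative assumptions only.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DictionaryItem:
    entry: str                                   # topic word, e.g. "banana"
    basic_info: str = ""                         # part of speech, pinyin, etc.
    explanation: str = ""                        # paraphrase of the topic
    example_sentences: List[str] = field(default_factory=list)

banana = DictionaryItem(
    entry="banana",
    basic_info="noun; xiāngjiāo",
    explanation="monocotyledonous plant, Musaceae; yellow peel when ripe; "
                "soft and sweet pulp; distributed in tropical and subtropical areas",
    example_sentences=["the monkeys in the zoo love bananas"],
)
```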
In one implementation scenario, the pre-set knowledge dictionary may be pre-processed in order to increase the robustness of the original semantic representation of the subsequently extracted reference text.
In one specific implementation scenario, data cleansing may be performed on the preset knowledge dictionary. For example, illegal characters such as html (HyperText Markup Language) markup characters (e.g., </body>) in the preset knowledge dictionary can be removed, and garbled characters caused by encoding errors can be removed. Data cleansing reduces the probability of abnormal errors in the subsequent application process and thus improves the robustness of answer text prediction.
In another specific implementation scenario, in order to reduce interference of irrelevant information, regular expressions may be used to extract only the explanation corresponding to the vocabulary entry in the preset knowledge dictionary, without retaining basic information (such as part of speech, pinyin, etc.), example sentences, and the like.
In yet another specific implementation scenario, the preset knowledge dictionary may also be segmented. For example, English may be segmented with a Word Piece tool, or directly by the spaces between words; Chinese may be segmented directly at character granularity. Segmentation helps capture fine-grained semantic information.
In yet another specific implementation scenario, a plurality of spaces in the preset knowledge dictionary may be replaced by one space, and the spaces at the beginning and end of the sentence may be removed.
In yet another specific implementation scenario, after the preprocessing is performed on the preset knowledge dictionary, format conversion may further be performed on each dictionary item of the preprocessed knowledge dictionary according to actual application requirements; for example, the i-th dictionary item D_i may be converted into:
[CLS] D_1 … D_k [SEP] …… (2)
In the above formula (2), [CLS] and [SEP] are separators, and D_1 … D_k represent the individual words in the preprocessed dictionary item. Specifically, as described above, a dictionary item only retains the entry and its explanation, so the entry and its explanation may be connected by a colon ':'. For example, taking the entry "apple" as an example, D_1 … D_k may be constructed as "apple: dicotyledonous plant, Rosaceae, deciduous tree; most flowers are self-sterile and need cross pollination; the pulp is crisp and sweet, and can help digestion.", where a space marks a word segment.
In yet another specific implementation scenario, the nouns and named entities of the chapter text and the question text may be identified as keywords. More specifically, the nouns and named entities of the chapter text and the question text may be identified with NER (Named Entity Recognition) tools such as LTP or Stanza, which are not limited here.
In another specific implementation scenario, the dictionary items D_i matching the keywords may be directly extracted from the preset knowledge dictionary to obtain the reference texts. For example, when the keyword is "apple", the reference text "[CLS] apple: dicotyledonous plant, Rosaceae, deciduous tree; most flowers are self-sterile and need cross pollination; the pulp is crisp and sweet, and can help digestion. [SEP]" can be extracted. Other cases can be deduced by analogy and are not enumerated here. In addition, the numbers of occurrences of the keywords in the chapter text and the question text can be counted, the dictionary items matching the keywords can be sorted in descending order of the occurrence counts of the corresponding keywords, and the dictionary items ranked within a preset number (e.g., the top 10, i.e., D_1, D_2, …, D_10) can be selected to obtain the reference texts. It should be noted that when the number of dictionary items matching the keywords is smaller than the preset number (e.g., 10), at least one empty dictionary item may be padded to reach the preset number, so that the number of finally obtained reference texts equals the preset number.
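A minimal sketch of the selection step just described, assuming the preset knowledge dictionary is available as a mapping from entry to explanation; the function and variable names are illustrative.

```python
from collections import Counter

def select_reference_texts(keywords, question, chapter, knowledge_dict, top_k=10):
    # Count how often each keyword occurs in the question text and chapter text,
    # keep only keywords that have a matching dictionary item, sort the matching
    # items by occurrence count (descending), take the top_k, and pad with empty
    # dictionary items so that exactly top_k reference texts are returned.
    text = question + chapter
    counts = Counter({kw: text.count(kw) for kw in keywords if kw in knowledge_dict})
    ranked = [kw for kw, _ in counts.most_common(top_k)]
    refs = [f"[CLS]{kw}: {knowledge_dict[kw]}[SEP]" for kw in ranked]
    refs += ["[CLS][SEP]"] * (top_k - len(refs))   # fill with empty dictionary items
    return refs
```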
Step S12: individual semantic representations of the several terms are extracted, and original semantic representations of the respective reference texts are extracted.
In one implementation scenario, embedded representations of a plurality of words in the chapter text and the question text may be obtained, and semantic extraction may be performed on the embedded representations of the plurality of words, thereby obtaining individual semantic representations of the plurality of words.
In one specific implementation scenario, a word may first be discretized, i.e., mapped to a 0-1 vector, and the discretized representation of the word is then converted into a continuous vector representation, i.e., the embedded representation of the word, based on a word vector technique. Specifically, the vocabulary size of a preset corpus may be |V| (e.g., including 20,000 to 30,000 common words); for a word w_i, its 0-1 vector representation is a vector of length |V| in which the dimension corresponding to w_i in the preset corpus is 1 and all other elements are 0. For example, assuming the corpus vocabulary V = {a, b, c, d}, the 0-1 vector representation of the word "c" is X_i = {0, 0, 1, 0}, and so on; other cases are not enumerated here. Further, each word corresponds to a d-dimensional real-valued vector, so the preset corpus correspondingly forms a word vector matrix E of size |V| × d, and the embedded representation of a word can then be obtained as the product of its discretized 0-1 vector representation and the word vector matrix E, specifically:
e_i = E · X_i …… (3)
In the above formula (3), e_i denotes the embedded representation of the word w_i, X_i denotes the discretized 0-1 vector representation of the word w_i, and E denotes the word vector matrix corresponding to the preset corpus.
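The discretization and word-embedding step of formula (3) can be pictured with the toy vocabulary {a, b, c, d} used above; in this sketch the |V| × d matrix is applied as X_i · E, which simply selects the row of E belonging to the word.

```python
import numpy as np

V = ["a", "b", "c", "d"]              # toy vocabulary, |V| = 4
d = 3                                 # embedding dimension (illustrative)
E = np.random.randn(len(V), d)        # word vector matrix E of size |V| x d

X_c = np.array([0, 0, 1, 0])          # 0-1 (one-hot) representation of the word "c"
e_c = X_c @ E                         # formula (3): picks out the row of E for "c"
assert np.allclose(e_c, E[2])
```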
In another specific implementation scenario, the embedded representations of the words may be input into a semantic extraction model to obtain the semantic representations of the words. The semantic extraction model may specifically include, but is not limited to, a multi-layer semantic extraction network (e.g., a multi-layer transformer). When the semantic extraction model is an L-layer transformer, dynamic context encoding can be realized through the L transformer layers, and finally a unified semantic representation relating the chapter and the question can be obtained, which can be expressed as:
H_i = transformer(H_{i-1}) …… (4)
In the above formula (4), i ranges from 1 to L, and H_i denotes the semantic representation extracted by the i-th transformer layer. In addition, H_0 is the embedded representation of formula (1) obtained by the discretization and word-embedding process described above.
In yet another specific implementation scenario, the chapter text and the question text may also be directly input into an individual semantic extraction network to obtain the individual semantic representations of the words in the chapter text and the question text. Furthermore, the individual semantic extraction network may include a multi-layer semantic extraction network, and may specifically include, but is not limited to, the ALBERT (A Lite BERT, a lightweight BERT) model. Referring to FIG. 2, FIG. 2 is a schematic diagram of a framework of an embodiment of the individual semantic extraction network. As shown in FIG. 2, the individual semantic extraction network may specifically include a discretization layer, a word embedding layer, and a multi-layer transformer representation layer. For the specific data processing of the discretization layer, the word embedding layer, and the multi-layer transformer representation layer, reference may be made to the foregoing description, which is not repeated here.
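A minimal PyTorch sketch of the structure in FIG. 2 (discretization layer producing token ids, word embedding layer, multi-layer transformer representation layer); the hyper-parameters and the use of torch.nn.TransformerEncoder are assumptions made for illustration and do not reproduce the ALBERT model itself.

```python
import torch
import torch.nn as nn

class IndividualSemanticExtractor(nn.Module):
    def __init__(self, vocab_size=30000, d_model=256, n_heads=4, n_layers=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, d_model)        # word embedding layer
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer ids from the discretization layer.
        h0 = self.embedding(token_ids)            # H_0, the embedded representation
        return self.encoder(h0)                   # H_L after the transformer layers, formula (4)

tokens = torch.randint(0, 30000, (1, 12))
H_L = IndividualSemanticExtractor()(tokens)       # (1, 12, 256): one vector per word
```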
In another implementation scenario, similar to the individual semantic representations described above, an embedded representation of the terms in the reference text may be obtained and semantic extraction performed on the embedded representation of the terms in the reference text, resulting in an original semantic representation of the reference text. The extracting process of the individual semantic representation can be referred to specifically, and will not be described in detail here.
In addition, for convenience of description, in the embodiments of the present disclosure and the following embodiments, the individual semantic representation of the t-th word w_t in the question text and the chapter text is denoted as H_L(t), and a corresponding notation is used for the original semantic representation of the i-th reference text.
Step S13: and predicting the answer text of the question text from the chapter text by using the individual semantic representations of the words and the original semantic representations of the reference texts.
In one implementation scenario, the semantic association degree between each word and each reference text may be obtained by using the individual semantic representation H_L(t) of the word and the original semantic representation of each reference text. On this basis, the original semantic representations can be fused for each word by using the semantic association degrees between that word and the reference texts, so as to obtain a fused semantic representation of the word, and the answer text can then be predicted from the chapter text based on the individual semantic representations and the fused semantic representations of the words. In this way, the original semantic representations are fused according to the semantic association degrees between a word and the respective reference texts, i.e., the fused semantic representation of the word depends more on the original semantic representations of the reference texts with high semantic association; on this basis, the answer text is predicted by using both the individual semantic representations of the words and the fused semantic representations that introduce external knowledge, so that the accuracy of the answer text can be improved.
In a specific implementation scenario, taking N reference texts in total as an example, for the k-th word w_k in the question text and the chapter text, its individual semantic representation H_L(k) can be combined with the original semantic representation of the 1st reference text through the original semantic representation of the N-th reference text to obtain the semantic association degrees between the word w_k and the 1st to N-th reference texts, and so on, so that the semantic association degrees between each word in the question text and the chapter text and the 1st to N-th reference texts can be obtained. On this basis, for the word w_k, the original semantic representations of the 1st to N-th reference texts can be weighted by the corresponding semantic association degrees to obtain the fused semantic representation of the word w_k, and so on, so that the fused semantic representation of each word in the question text and the chapter text can be obtained. The specific processes of obtaining the semantic association degrees and fusing the original semantic representations can refer to the related embodiments described later and are not detailed here.
In another specific implementation scenario, to improve the efficiency of predicting the answer text, a prediction network may be pre-trained, which may include, but is not limited to, a fully-connected layer, a softmax layer, and the like. On this basis, the individual semantic representation and the fused semantic representation of each word can be spliced, and the spliced semantic representations are fed into the prediction network, so that a first probability value that each word in the chapter text is the start position of the answer text and a second probability value that each word in the chapter text is the end position of the answer text can be predicted. Finally, the start-position word and the end-position word of the answer text can be determined based on the first and second probability values of each word, and the combination of the start-position word, the end-position word, and the words between them can be taken as the answer text.
In another implementation scenario, as described above, the semantic association degree between each word and each reference text can be obtained by using the individual semantic representation H_L(t) of the word and the original semantic representation of each reference text. On this basis, the semantic association degrees between the keywords in the question text and the respective reference texts can be obtained, so that, for each reference text, an importance degree can be obtained from the semantic association degrees between the keywords and that reference text. The reference texts are then sorted in descending order of importance, the reference text ranked within a preset order (e.g., first) is selected, the entry of the selected reference text is extracted, and the keyword in the question text is replaced with this entry, thereby obtaining a new question text. On this basis, the individual semantic representations of the words in the new question text and the chapter text are obtained, and the answer text of the question text is predicted from the chapter text by using these individual semantic representations.
In a specific implementation scenario, at least one of the adjectives, nouns, named entities, etc. of the question text may be identified as keywords; for the specific process, reference may be made to the foregoing description, which is not repeated here. In addition, for the entry of a reference text, reference may also be made to the foregoing description, which is not repeated here.
In another specific implementation scenario, as described above, to improve the efficiency of predicting the answer text, a prediction network may be pre-trained, which may include, but is not limited to, a fully-connected layer, a softmax layer, and the like. On this basis, the individual semantic representations of the words in the new question text and the chapter text can be fed into the prediction network, so that a first probability value that each word in the chapter text is the start position of the answer text and a second probability value that each word in the chapter text is the end position of the answer text can be predicted. Finally, the start-position word and the end-position word of the answer text can be determined based on the first and second probability values of each word, and the combination of the start-position word, the end-position word, and the words between them can be taken as the answer text.
In another specific implementation scenario, referring to FIG. 4, FIG. 4 is a schematic diagram of an embodiment of answering a question text based on the chapter text and the reference texts. As shown in FIG. 4, the semantic association degrees between the keywords in the question text (e.g., "yellow", "sweet", "fruit") and each reference text can be obtained. For convenience of description, the reference text whose entry is "banana" is denoted reference text 1, the reference text whose entry is "apple" is denoted reference text 2, and the reference text whose entry is "lemon" is denoted reference text 3. For example, the semantic association degrees of the keyword "yellow" with reference texts 1 to 3 are 0.5, 0, and 0.5, respectively; the semantic association degrees of the keyword "sweet" with reference texts 1 to 3 are 0.5, 0, and 0, respectively; and the semantic association degrees of the keyword "fruit" with reference texts 1 to 3 are 1/3, 1/3, and 1/3, respectively. Then, for each reference text, its semantic association degrees with the respective keywords can be fused (e.g., by weighted summation, direct summation, etc.) as the importance degree of that reference text. Taking direct summation as an example, the importance of reference text 1 is 4/3, that of reference text 2 is 1/3, and that of reference text 3 is 5/6, i.e., reference text 1 has the highest importance. On this basis, the keywords in the question text can be replaced with the entry of reference text 1, i.e., the question text shown in FIG. 4 can be updated to "where is the banana put?". Accordingly, based on the steps described above, the individual semantic representations of the words in the new question text and the chapter text can be obtained, and the answer text can be predicted from them.
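Using the FIG. 4 numbers, the importance computation by direct summation can be checked in a few lines; the list layout and function name below are illustrative assumptions.

```python
def most_important_reference(relatedness_per_text):
    # relatedness_per_text[j] holds the semantic association degrees of the
    # question keywords ("yellow", "sweet", "fruit") with reference text j;
    # importance is their direct sum, as in the example above.
    importance = [sum(scores) for scores in relatedness_per_text]
    return max(range(len(importance)), key=importance.__getitem__), importance

rel = [
    [0.5, 0.5, 1 / 3],   # reference text 1, entry "banana"
    [0.0, 0.0, 1 / 3],   # reference text 2, entry "apple"
    [0.5, 0.0, 1 / 3],   # reference text 3, entry "lemon"
]
best, imp = most_important_reference(rel)
print(best, imp)         # 0, [4/3, 1/3, 5/6] -> the keywords are replaced by "banana"
```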
In one implementation scenario, to improve the efficiency of question answering, a question answering model may be pre-trained with training samples; the training samples may specifically include a sample question text, a sample chapter text, and a sample answer text, and the specific training process may refer to the embodiments disclosed below, which is not detailed here. After the question answering model is trained, the answer text of the question text can be predicted with the question answering model. Furthermore, the question answering model may specifically include the individual semantic extraction network, the original semantic extraction network, and the prediction network, for which reference may be made to the foregoing description, which is not repeated here.
According to the above scheme, a question text and a chapter text are acquired, and reference texts of a plurality of knowledge points are acquired, wherein the question text and the chapter text contain a plurality of words and the knowledge points are related to at least one of the question text and the chapter text. On this basis, the individual semantic representations of the words and the original semantic representation of each reference text are extracted, so that the answer text of the question text is predicted from the chapter text by using the individual semantic representations of the words and the original semantic representations of the reference texts. Therefore, by acquiring reference texts of a plurality of knowledge points, external knowledge can be introduced into the question answering process, which helps to expand the background of the chapter text and the question text and thus helps to improve the accuracy of question answering.
Referring to fig. 5, fig. 5 is a flowchart illustrating an embodiment of step S13 in fig. 1. The method specifically comprises the following steps:
step S51: and obtaining the semantic association degree between the words and each reference text by using the individual semantic representation of each word and the original semantic representation of each reference text.
Specifically, one term of the terms can be used as a current term, and the individual semantic representation of the current term is respectively subjected to dot multiplication with the original semantic representation of each reference text to obtain the initial association degree between the current term and each reference text, so that the initial association degree between the current term and each reference text is normalized to obtain the semantic association degree between the current term and each reference text. The individual semantic representation and the original semantic representation can be specifically regarded as vectors with certain dimensions, so that the individual semantic representation and corresponding elements in the original semantic representation can be multiplied and then summed together to complete point multiplication operation, and detailed description is omitted here. According to the method, one word in the plurality of words is used as the current word, and the individual semantic representation of the current word is directly subjected to dot multiplication with the original semantic representation of each reference text to obtain the initial association degree between the current word and each reference text, and the semantic association degree can be obtained through normalization according to the initial association degree, so that the complexity of obtaining the semantic association degree can be reduced.
Referring to FIG. 6, FIG. 6 is a state diagram of an embodiment of the semantic association acquisition process. As described above, the individual semantic representation of the t-th word w_t in the question text and the chapter text is denoted as H_L(t), and a corresponding notation is used for the original semantic representation of the i-th reference text. Taking the case of M words in total in the question text and the chapter text and N reference texts in total as an example, the word w_t may be taken as the current word, and its individual semantic representation H_L(t) is dot-multiplied with the original semantic representation of each reference text (where i ∈ [1, N]) to obtain the initial association degree α_ti between the current word w_t and each reference text, as in formula (5), in which the dot product operation is used.
On this basis, the N initial association degrees can be spliced to form a vector, as in formula (6).
The vector can then be normalized; for example, it can be normalized with softmax to obtain the semantic association degrees β_t, as in formula (7).
The same applies to the other words, which are not enumerated here.
Step S52: and fusing each original semantic representation by utilizing the semantic association degree between the words and each reference text to obtain the fused semantic representation of the words.
Specifically, the original semantic representation of each reference text may be weighted by the semantic association degree between the word and that reference text. Still taking the word w_t as an example, the original semantic representation of each reference text D_i (where i ∈ [1, N]) is weighted by the semantic association degree between the word w_t and that reference text, so that the fused semantic representation K(t) of the word is obtained, as in formula (8).
The same applies to the other words, and finally the fused semantic representation K(1) of the 1st word, the fused semantic representation K(2) of the 2nd word, …, and the fused semantic representation K(M) of the M-th word can be obtained. To facilitate subsequent processing of the fused semantic representations, the fused semantic representations of the M words can be spliced to obtain the final fused semantic representation K:
K = [K(1), K(2), …, K(M)] …… (9)
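A numpy sketch of steps S51 and S52, under the assumption that the original semantic representations of the N reference texts are stacked into an (N, d) matrix refs and the M individual word representations into an (M, d) matrix H_L; the variable names are illustrative.

```python
import numpy as np

def fuse_reference_knowledge(H_L, refs):
    # Initial association degrees: dot product of every word representation with
    # every reference-text representation (formula (5)), giving an (M, N) matrix.
    alpha = H_L @ refs.T
    # Softmax over the N reference texts gives the semantic association degrees
    # beta_t of each word (formulas (6) and (7)).
    alpha = alpha - alpha.max(axis=1, keepdims=True)      # numerical stability
    beta = np.exp(alpha) / np.exp(alpha).sum(axis=1, keepdims=True)
    # Fused semantic representations K(t): weighted sum of the reference-text
    # representations (formula (8)); the rows are K(1)..K(M) as in formula (9).
    return beta @ refs

K = fuse_reference_knowledge(np.random.randn(6, 8), np.random.randn(3, 8))  # (6, 8)
```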
step S53: and predicting and obtaining the answer text from the chapter text based on individual semantic representations of the words and the fusion semantic representations.
In one implementation scenario, as described in the foregoing disclosed embodiments, in order to increase the efficiency of predicting the answer text, a prediction network may be pre-trained, so that individual semantic representations and fused semantic representations of several words may be directly fed into the prediction network, and finally predicted to obtain the answer text. Reference may be made specifically to the relevant descriptions in the foregoing disclosed embodiments, and details are not repeated here.
In another implementation scenario, in order to further improve the accuracy of predicting the answer text, the first probability value and the second probability value of each word in the chapter text may be predicted by using the individual semantic representations and the fused semantic representations of the words together with a global semantic representation that represents the overall semantics of the words. The start word and the end word are then determined in the chapter text based on the first and second probability values, and the answer text is obtained from the start word and the end word. The global semantic representation is obtained based on the individual semantic representations of all the words; the first probability value represents the likelihood that a word is the start position of the answer text, and the second probability value represents the likelihood that a word is the end position of the answer text. By further introducing the global semantic representation, local semantic information (i.e., the individual semantic representations), external knowledge (i.e., the fused semantic representations), and global semantic information (i.e., the global semantic representation) can all be taken into account when predicting the answer text, so that the accuracy of the prediction can be improved.
In a specific implementation scenario, the individual semantic representation of the word at a preset position in the question text and the chapter text can be directly used as the global semantic representation. For example, the individual semantic representation of [CLS] in formula (1) may be used as the global semantic representation, or the individual semantic representation of the last [SEP] in formula (1) may be used as the global semantic representation. Directly taking the individual semantic representation of the word at a preset position as the global semantic representation reduces the complexity of obtaining the global semantic representation.
In another specific implementation scenario, the individual semantic representations of the words may also be globally average-pooled to obtain the global semantic representation. Taking the individual semantic representation as an L-dimensional vector as an example, the average of the 1st elements of the M individual semantic representations can be computed as the 1st element of the global semantic representation, the average of the k-th elements as the k-th element, and so on, until the average of the L-th elements is computed as the L-th element of the global semantic representation. Obtaining the global semantic representation by global average pooling of the individual semantic representations of the words can improve the accuracy of the global semantic representation.
In yet another specific implementation scenario, the individual semantic representations of the words may also be globally max-pooled to obtain the global semantic representation. Taking the individual semantic representation as an L-dimensional vector as an example, the maximum of the 1st elements of the M individual semantic representations can be taken as the 1st element of the global semantic representation, the maximum of the k-th elements as the k-th element, and so on, until the maximum of the L-th elements is taken as the L-th element of the global semantic representation. Obtaining the global semantic representation by global max pooling of the individual semantic representations of the words can improve the accuracy of the global semantic representation.
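The three options for the global semantic representation G can be written compactly; H_L is assumed to be the (M, d) matrix of individual semantic representations with the [CLS] position in row 0, and the function name is illustrative.

```python
import numpy as np

def global_representation(H_L, mode="cls"):
    if mode == "cls":                 # representation of the word at the preset position
        return H_L[0]
    if mode == "mean":                # global average pooling over the M words
        return H_L.mean(axis=0)
    if mode == "max":                 # global max pooling over the M words
        return H_L.max(axis=0)
    raise ValueError(f"unknown mode: {mode}")
```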
In yet another specific implementation scenario, for ease of description, the global semantic representation may be denoted as G. Furthermore, as described above, the fused semantic representations of the words may be spliced to obtain the final fused semantic representation K, and similarly the individual semantic representations of the words may be spliced to obtain the final individual semantic representation H_L. On this basis, the final individual semantic representation H_L can be multiplied element-wise with the global semantic representation G along the corresponding dimensions, and the multiplication result together with the final individual semantic representation H_L and the final fused semantic representation K can be fed into a first prediction network to obtain the first probability value p_s of each word, and into a second prediction network to obtain the second probability value p_e of each word:
p_s = [H_L; K; G ⊙ H_L] · W_s …… (10)
p_e = [H_L; K; G ⊙ H_L] · W_e …… (11)
In the above formulas (10) and (11), ⊙ denotes element-wise multiplication along the corresponding dimensions, W_s denotes the network parameters of the first prediction network, W_e denotes the network parameters of the second prediction network, and [;] denotes splicing.
In still another specific implementation scenario, in order to improve prediction accuracy, the first probability value of each word in the chapter text may be predicted by using the individual semantic representations and the fused semantic representations of the words together with the global semantic representation that represents the overall semantics of the words; reference may be made to the foregoing description and formula (12). On this basis, the second probability value of each word may be further predicted by using the individual semantic representations and the fused semantic representations of the words, the global semantic representation, and the individual semantic representation of the word corresponding to the largest first probability value. In this way, the first probability value of each word is predicted first, and the second probability value is then predicted with the help of the individual semantic representation of the word corresponding to the largest first probability value, together with the individual semantic representations, the fused semantic representations, and the global semantic representation, so that the prediction result of the start position is fully taken into account when predicting the end position, which can improve the accuracy of the end-position prediction.
Specifically, as described above, an individual semantic representation is an L-dimensional vector, so the final individual semantic representation H_L is of size M × L. For convenience of subsequent data processing, the individual semantic representation of the word corresponding to the largest first probability value can be duplicated M times to form a matrix of size M × L, which, for convenience of description, is denoted H_s. On this basis, the second probability value p_e of each word can be obtained by the following formula:
p_e = [H_L; K; G ⊙ H_L; H_s] · W_e …… (12)
In yet another specific implementation scenario, after the first probability value p_s and the second probability value p_e of each word are obtained, each word in the chapter text is taken in turn as the current word, the product of the first probability value of the current word and the second probability value of each word located after it is computed, and finally the combination of the two words corresponding to the largest probability product, together with the words between them, is taken as the answer text. With continued reference to FIG. 4, the chapter text is segmented character by character: the first character may first be taken as the current word, and the product of its first probability value and the second probability value of each subsequent character is computed; the same is then done for the second character, the third character, and so on up to the last character. In the example of FIG. 4, the product of the first probability value of the first character of the bolded span and the second probability value of its last character is found to be the largest, so that bolded span, i.e., "the leftmost side", is taken as the answer text of the question text "where is the yellow sweet fruit put?". Other cases can be deduced by analogy and are not enumerated here.
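The start/end prediction of formulas (10) and (12) and the probability-product search can be sketched as follows; the weight shapes and the softmax normalization over positions are assumptions made only so that the example runs.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def predict_span(H_L, K, G, W_s, W_e):
    M, d = H_L.shape
    # First probability values, formula (10): [H_L; K; G ⊙ H_L] · W_s.
    p_s = softmax(np.concatenate([H_L, K, G * H_L], axis=1) @ W_s)
    # Copy the representation of the best start word M times to form H_s, then
    # second probability values, formula (12): [H_L; K; G ⊙ H_L; H_s] · W_e.
    H_s = np.tile(H_L[int(p_s.argmax())], (M, 1))
    p_e = softmax(np.concatenate([H_L, K, G * H_L, H_s], axis=1) @ W_e)
    # Keep the start/end pair (end not before start) with the largest product.
    start, end = max(((s, e) for s in range(M) for e in range(s, M)),
                     key=lambda se: p_s[se[0]] * p_e[se[1]])
    return start, end, float(p_s[start] * p_e[end])

M, d = 8, 16
span = predict_span(np.random.randn(M, d), np.random.randn(M, d),
                    np.random.randn(d), np.random.randn(3 * d), np.random.randn(4 * d))
```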
Different from the embodiment, the original semantic representations are fused through the semantic association degrees between the words and each reference text respectively to obtain the fused semantic representation, namely the fused semantic representation of the words is more dependent on the original semantic representation of the reference text with large semantic association degrees, on the basis, the answer text is predicted by utilizing the individual semantic representation of the words and the fused semantic representation of the introduced external knowledge, and the accuracy of the answer text can be improved.
Referring to fig. 7, fig. 7 is a flowchart illustrating another embodiment of the question answering method according to the present application. The method specifically comprises the following steps:
step S71: and acquiring the question text and the chapter text, and acquiring the reference text of a plurality of knowledge points.
In the embodiment of the disclosure, the question text and the chapter text comprise a plurality of words, and a plurality of knowledge points are related to at least one of the question text and the chapter text. Reference may be made specifically to the relevant descriptions in the foregoing disclosed embodiments, and details are not repeated here.
Step S72: individual semantic representations of the several terms are extracted, and original semantic representations of the respective reference texts are extracted.
Reference may be made specifically to the relevant descriptions in the foregoing disclosed embodiments, and details are not repeated here.
Step S73: and predicting the answer text of the question text from the chapter text by using the individual semantic representations of the words and the original semantic representations of the reference texts.
In the embodiment of the disclosure, the answer text is obtained based on the first probability value and the second probability value of each word in the chapter text, where the first probability value represents the likelihood that the word is the start position of the answer text and the second probability value represents the likelihood that the word is the end position of the answer text; the first and second probability values are predicted by using the individual semantic representations and the fused semantic representations of the words together with the global semantic representation that represents the overall semantics of the words. Reference may be made to the relevant description in the foregoing disclosed embodiments, which is not repeated here.
step S74: among the individual semantic representations of the several terms, an individual semantic representation of the starting location term of the answer text is selected, and an individual semantic representation of the ending location term of the answer text is selected.
As in the previously disclosed embodiments, for ease of description, the individual semantic representation of the start-position word of the answer text is denoted as H_s, and the individual semantic representation of the end-position word of the answer text is denoted as H_e.
Step S75: and predicting to obtain a third probability value by using the individual semantic representation of the initial position word, the individual semantic representation of the end position word and the global semantic representation.
In the embodiment of the present disclosure, the third probability value represents the likelihood that no answer text of the question text exists in the chapter text. Although the answer text of the question text can be predicted from the chapter text through the answer-text prediction steps described above, in a real scene there may be cases in which no true answer text of the question text exists in the chapter text. Referring to FIG. 8, FIG. 8 is a schematic diagram of another embodiment of answering a question text based on the chapter text and the reference texts. As shown in FIG. 8, for the question text "where is the blue fruit put?", the chapter text does not mention any blue fruit, so no true answer text of the question text exists in the chapter text. Therefore, rejection prediction is further performed on the basis of the answer text prediction, i.e., the likelihood that no answer text of the question text exists in the chapter text is predicted, so that the answer error rate of question answering can be reduced.
As described above, a third prediction network may be pre-trained in order to improve prediction efficiency. The third prediction network may specifically include, but is not limited to, a fully-connected layer and a normalization layer, which is not limited here. On this basis, the individual semantic representation H_s of the start-position word, the individual semantic representation H_e of the end-position word, and the global semantic representation G can be fed into the third prediction network to obtain the third probability value p_NA:
p_NA = sigmoid([H_s; H_e; G] · W_NA) …… (13)
In the above formula (13), W_NA denotes the network parameters of the third prediction network (e.g., the network parameters of the fully-connected layer), [;] denotes splicing, and the sigmoid function is used to normalize the output to the range 0 to 1.
Step S76: based on the first probability value of the start position word, the second probability value of the end position word, and the third probability value, it is determined whether to output the answer text.
Specifically, the product of the first probability value of the start-position word and the second probability value of the end-position word may be obtained and, for convenience of description, denoted as p_A, and the product p_A is compared with the third probability value p_NA. When the product p_A is greater than the third probability value p_NA, it is determined that the answer text is output, i.e., the question text is considered to have an answer text, and the answer text of the question text is the answer text predicted from the chapter text; conversely, when the product p_A is not greater than the third probability value p_NA, it is determined that no answer text is output, i.e., the question text is considered to have no answer text.
With continued reference to FIG. 8, since the chapter text does not mention any blue fruit, and the introduced reference texts explain each knowledge point in the chapter in detail but none of them matches a blue fruit, the finally obtained third probability value p_NA is greater than the probability product p_A, i.e., the model is more inclined to refuse to answer the question text than to output the predicted answer text. The question text can therefore be considered to have no corresponding answer text, and it can be determined that no answer text is output. Other cases can be deduced by analogy and are not enumerated here.
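The rejection decision of formula (13) and the comparison with the probability product p_A can be sketched in a few lines; the vector shapes are assumptions consistent with the description above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decide(H_s, H_e, G, W_NA, p_start, p_end):
    # Formula (13): p_NA = sigmoid([H_s; H_e; G] · W_NA), the probability that
    # no answer text of the question exists in the chapter text.
    p_NA = sigmoid(np.concatenate([H_s, H_e, G]) @ W_NA)
    p_A = p_start * p_end                     # probability product of the predicted span
    return ("output answer" if p_A > p_NA else "refuse to answer"), p_A, float(p_NA)

d = 16
print(decide(np.random.randn(d), np.random.randn(d), np.random.randn(d),
             np.random.randn(3 * d), 0.6, 0.7))
```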
Unlike the foregoing embodiments, the individual semantic representation of the start-position word and the individual semantic representation of the end-position word of the answer text are selected from the individual semantic representations of the words, and the third probability value is predicted by using the individual semantic representation of the start-position word, the individual semantic representation of the end-position word, and the global semantic representation, where the third probability value represents the likelihood that no answer text of the question text exists in the chapter text. Whether to output the answer text is then determined based on the first probability value of the start-position word, the second probability value of the end-position word, and the third probability value, so that the robustness of question answering can be further improved. In addition, since both the answer text prediction and the rejection prediction refer to the global semantic representation, an interaction mechanism is introduced between the two processes: when the answer text prediction gives an answer text with high confidence, the rejection prediction can take this information into account through the global semantic representation and thus tends to output a prediction of "do not refuse to answer"; conversely, when the answer text prediction gives an answer text with low confidence, the rejection prediction takes this into account through the global semantic representation and tends to output a prediction of "refuse to answer", so that the accuracy of the rejection prediction can be improved.
Referring to fig. 9, fig. 9 is a flowchart of an embodiment of a training method for a question answering model. The method specifically comprises the following steps:
Step S91: obtain training samples and obtain sample reference texts of a plurality of knowledge points.
In an embodiment of the present disclosure, the training sample may specifically include a sample question text, a sample chapter text, and a sample answer text; the sample question text and the sample chapter text contain a number of sample words, and a number of knowledge points are related to at least one of the sample question text and the sample chapter text. Reference may be made to the foregoing disclosed embodiments for details, which are not repeated herein.
Furthermore, to facilitate subsequent processing, the sample answer text may be represented by 0-1 vectors, specifically a start position vector and an end position vector. Take the sample chapter text "the banana is in the refrigerator, the apple is on the table" and the sample question text "where is the yellow fruit" as an example: the sample answer text is actually "in the refrigerator". For convenience of subsequent processing, the start position vector of the sample answer text may be represented by the 0-1 vector [0 0 1 0 0 0 0 0 0 0], and the end position vector of the sample answer text may be represented by the 0-1 vector [0 0 0 0 1 0 0 0 0 0]; that is, "0" indicates that the sample word cannot be used to answer the sample question text, and "1" indicates that the sample word can be used to answer it. Other situations can be deduced by analogy and are not exemplified here.
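As an illustrative sketch only, the construction of such 0-1 position vectors can be written as follows; the function name make_position_vectors and the word indices are assumptions introduced for this example.

# Build the 0-1 start/end position vectors for a sample answer span, assuming
# the sample chapter text has been split into num_words sample words and the
# answer span runs from word start_idx to word end_idx (0-based).
def make_position_vectors(num_words: int, start_idx: int, end_idx: int):
    start_vec = [0] * num_words
    end_vec = [0] * num_words
    start_vec[start_idx] = 1   # "1" marks the word that starts the sample answer text
    end_vec[end_idx] = 1       # "1" marks the word that ends the sample answer text
    return start_vec, end_vec

# For a 10-word chapter whose answer span covers words 2..4:
start_vec, end_vec = make_position_vectors(10, 2, 4)
# start_vec == [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
# end_vec   == [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]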
In one implementation scenario, the sample answer text intercepted from the sample chapter text may be referred to as a first sample answer text, and the sample question text corresponding to the first sample answer text may be referred to as a first sample question text. In addition, in order to improve the robustness of the question answering model, keyword recognition may be performed on the first sample question text, and a recognized keyword may be replaced by another, different word to obtain a second sample question text. In this way, the combination of the sample chapter text, the first sample question text, and the first sample answer text, and the combination of the sample chapter text, the second sample question text, and a preset text may both be used as training samples, where the preset text is used to indicate that no answer text for the second sample question text exists in the sample chapter text. Specifically, the preset text may be "no answer", or may be represented by an all-zero vector, which is not limited herein. By performing keyword recognition on the first sample question text and replacing the recognized keyword with another, different word, a second sample question text for which no answer exists in the sample chapter text can be constructed, and training samples can be built from the sample chapter text, the first sample question text, and the first sample answer text as well as from the second sample question text and the preset text. The training samples therefore contain both sample question texts that can be answered and sample question texts that cannot be answered, which helps improve the robustness of the question answering model during training.
In one specific implementation scenario, a combination of the sample chapter text, the first sample question text, and the first sample answer text may be used as a set of training samples, and a combination of the sample chapter text, the second sample question text, and the preset text may be used as a set of training samples.
In another specific implementation scenario, nouns, named entities, adjectives, times, and the like in the first sample question text may be identified as keywords. Keywords may be identified using NER (Named Entity Recognition) tools such as LTP or Stanza.
In another specific implementation scenario, in order to ensure the semantic consistency of the second sample question text obtained after replacing the keyword, only one keyword may be replaced at a time. Alternatively, a plurality of keywords may be replaced at a time, and semantic consistency may be checked manually after replacement to obtain the second sample question text.
In particular, a named entity in the first sample question text may be replaced with another named entity in the sample chapter text. Referring to fig. 10, fig. 10 is a schematic diagram of an embodiment of a training sample acquisition process. As shown in fig. 10, for the first sample question text "Where is the birthplace of Turing?", the named entity "Turing" can be identified and replaced with another named entity "Einstein" in the sample chapter text, giving the second sample question text "Where is the birthplace of Einstein?". Other cases can be deduced by analogy and are not exemplified here.
In particular, a time in the first sample question text may also be replaced with another time in the sample chapter text. With continued reference to fig. 10, for the first sample question text "Which school was Turing admitted to in 1926?", the time "1926" can be identified and replaced with another time in the sample chapter text, "the end of 1927", giving the second sample question text "Which school was Turing admitted to at the end of 1927?". Other cases can be deduced by analogy and are not exemplified here.
In particular, a noun in the first sample question text may also be replaced with its antonym. With continued reference to fig. 10, for the first sample question text "For which paper was Turing given a reward?", the noun "reward" can be identified and replaced with its antonym "punishment", giving the second sample question text "For which paper was Turing given a punishment?". Other cases can be deduced by analogy and are not exemplified here.
In particular, an adjective in the first sample question text may also be replaced with its antonym. With continued reference to fig. 10, for the first sample question text "What did Turing obtain for his excellent performance?", the adjective "excellent" can be identified and replaced with its antonym "poor", giving the second sample question text "What did Turing obtain for his poor performance?". Other cases can be deduced by analogy and are not exemplified here.
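A minimal sketch of this sample-construction idea follows; recognize_keywords and pick_replacement are hypothetical placeholders standing in for an NER tool (such as LTP or Stanza) and for the replacement strategies described above, and the preset text "no answer" is only one of the options mentioned.

NO_ANSWER = "no answer"  # preset text indicating the question cannot be answered

def build_training_samples(chapter, question1, answer1,
                           recognize_keywords, pick_replacement):
    samples = [(chapter, question1, answer1)]            # answerable training sample
    keywords = recognize_keywords(question1)             # e.g. named entities, times, nouns, adjectives
    if keywords:
        keyword = keywords[0]                            # replace only one keyword at a time
        question2 = question1.replace(keyword, pick_replacement(keyword, chapter))
        samples.append((chapter, question2, NO_ANSWER))  # unanswerable training sample
    return samples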
Step S92: extracting sample individual semantic representations of a plurality of sample words by using an individual semantic extraction network of the question answer model, and extracting sample original semantic representations of each sample reference text by using an original semantic extraction network of the question answer model.
Reference may be made specifically to the relevant descriptions in the foregoing disclosed embodiments, and details are not repeated here.
Step S93: based on a prediction network of the question answer model, a first prediction probability value and a second prediction probability value of each sample word in the sample chapter text are obtained through prediction by using sample individual semantic representations of a plurality of sample words and sample original semantic representations of each sample reference text.
In an embodiment of the disclosure, the first predicted probability value represents a likelihood that the sample word is a starting position of the sample answer text, and the second predicted probability value represents a likelihood that the sample word is an ending position of the sample answer text.
Specifically, the sample individual semantic representation of each sample word and the sample original semantic representation of each sample reference text can be used to obtain the sample semantic association degree between the sample word and each sample reference text, and the sample original semantic representations of the sample reference texts can then be fused using these sample semantic association degrees to obtain the sample fused semantic representation of the sample word. On this basis, the sample individual semantic representations and sample fused semantic representations of the several sample words can be fed into the prediction network to obtain the first prediction probability value and the second prediction probability value.
In one implementation scenario, one sample word of the several sample words can be taken as the current sample word, and the sample individual semantic representation of the current sample word can be dot-multiplied with the sample original semantic representation of each sample reference text to obtain the sample initial association degree between the current sample word and each sample reference text; finally, the sample initial association degrees between the current sample word and the sample reference texts are normalized to obtain the sample semantic association degrees between the current sample word and the sample reference texts. Reference may be made to the process of obtaining the semantic association degree in the foregoing disclosed embodiments, which is not repeated here.
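A minimal sketch of this computation follows, assuming H is an (n_words, d) array of sample individual semantic representations and R is an (n_refs, d) array of sample original semantic representations of the reference texts; the array names and the use of NumPy are illustrative assumptions.

import numpy as np

def fuse_reference_knowledge(H: np.ndarray, R: np.ndarray) -> np.ndarray:
    scores = H @ R.T                                        # dot products: initial association degrees, shape (n_words, n_refs)
    scores = scores - scores.max(axis=1, keepdims=True)     # shift for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax normalization: semantic association degrees
    return weights @ R                                      # weighted sum of reference representations: fused semantic representations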
In another implementation scenario, the prediction network may specifically include a first prediction network for predicting the start position and a second prediction network for predicting the end position. In this case, the sample individual semantic representations and sample fused semantic representations of the several sample words, together with a sample global semantic representation that represents the overall semantics of the several sample words, may be fed into the first prediction network to obtain the first prediction probability value of each sample word in the sample chapter text, and the same representations may be fed into the second prediction network to obtain the second prediction probability value of each sample word in the sample chapter text. Specifically, the sample global semantic representation is obtained based on the sample individual semantic representations of all the sample words; reference may be made to the process of obtaining the global semantic representation in the foregoing disclosed embodiments, which is not repeated here.
In still another implementation scenario, the sample individual semantic representations and sample fused semantic representations of the several sample words, together with the sample global semantic representation that represents the overall semantics of the several sample words, are fed into the first prediction network to obtain the first prediction probability value of each sample word in the sample chapter text. On this basis, the sample individual semantic representations and sample fused semantic representations of the several sample words, the sample global semantic representation, and the sample individual semantic representation of the sample word corresponding to the start position vector are fed into the second prediction network to obtain the second prediction probability value of each sample word in the sample chapter text.
For ease of description, similar to the previously disclosed embodiments, the sample individual semantic representations of the several sample words may be stitched to obtain the final sample individual semantic representation H_L. In addition, the start position vector can be denoted as Y_s, so that the sample individual semantic representation H_s of the start-position sample word can be written as:

H_s = Y_s · H_L (16)
specific processes may refer to the related descriptions in the foregoing disclosed embodiments, and are not repeated herein.
Step S94: obtain a loss value of the question answering model based on the sample answer text, the first prediction probability value, and the second prediction probability value.
Specifically, for the start position, the start position vector and the first prediction probability value of each sample word in the sample chapter text can be processed with a cross-entropy loss function to obtain a first loss value; for the end position, the end position vector and the second prediction probability value of each sample word in the sample chapter text can be processed with a cross-entropy loss function to obtain a second loss value. On this basis, the first loss value and the second loss value can be weighted to obtain the loss value of the question answering model.
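A hedged sketch of this loss computation in PyTorch follows; the logits names, the index-form targets, and the equal 0.5/0.5 weighting are assumptions made for the example rather than values fixed by the disclosure.

import torch
import torch.nn.functional as F

def span_loss(start_logits, end_logits, start_idx, end_idx, w1=0.5, w2=0.5):
    # start_logits/end_logits: (batch, n_words); start_idx/end_idx: (batch,) word indices
    loss_start = F.cross_entropy(start_logits, start_idx)  # first loss value (start position)
    loss_end = F.cross_entropy(end_logits, end_idx)        # second loss value (end position)
    return w1 * loss_start + w2 * loss_end                 # weighted loss value of the question answering model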
The above completes the answer text prediction in the training phase. Furthermore, similar to the previously disclosed embodiments, the training phase may also perform reject-answer prediction in order to reduce the rate of wrongly answered questions. Specifically, among the sample individual semantic representations of the several sample words, the sample individual semantic representation of the sample word at the start position of the sample answer text and the sample individual semantic representation of the sample word at the end position of the sample answer text may be selected. The sample individual semantic representation of the start-position sample word is given by formula (16) above. Similarly, the end position vector can be denoted as Y_e, and the sample individual semantic representation H_e of the end-position sample word can be expressed as:

H_e = Y_e · H_L
further, the sample individual semantic representation of the initial position sample word, the sample individual semantic representation of the end position sample word and the sample global semantic representation may be sent to a third prediction network to obtain a third prediction probability value, where the third prediction probability value represents a likelihood that a sample answer text of the sample question text does not exist in the sample chapter text. Based on the result, a third loss value can be obtained based on whether the sample answer text is marked with the sample answer text and the third prediction probability value, and further, the loss value of the question answer model can be obtained based on the first loss value, the second loss value and the third loss value. Therefore, the training answer text prediction link and the training answer rejection prediction link refer to the sample global semantic representation, so that an interaction mechanism can be introduced into the training answer text prediction link and the training answer rejection prediction link, and the training answer text prediction link and the training answer rejection prediction link can be mutually promoted and complemented.
As described above, answer text prediction and reject-answer prediction may be performed together in the training phase. Therefore, in the case where the sample question text is labeled with a corresponding sample answer text, the third loss value may be masked, only the first loss value and the second loss value may be weighted to obtain the loss value of the question answering model, and the subsequent step of adjusting the network parameters of the question answering model may be performed using this loss value. In the case where the sample question text is not labeled with a corresponding sample answer text, the first loss value and the second loss value may be masked, the third loss value may be used as the loss value of the question answering model, and the subsequent step of adjusting the network parameters of the question answering model may be performed using this loss value.
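The masking rule can be sketched as follows; the variable names are assumptions, and the weights mirror the illustrative span_loss above.

def total_loss(loss_start, loss_end, loss_na, has_answer: bool, w1=0.5, w2=0.5):
    # has_answer indicates whether the sample question text is labeled with a sample answer text
    if has_answer:
        return w1 * loss_start + w2 * loss_end   # the third loss value is masked
    return loss_na                               # the first and second loss values are masked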
In the training process, the reject-answer prediction does not depend on the first prediction probability value and the second prediction probability value obtained by the answer text prediction, so the reject-answer prediction and the answer text prediction can be performed in either order: the first and second prediction probability values may be predicted first and the third prediction probability value later, or the third prediction probability value may be predicted first and the first and second prediction probability values later; the predictions may also be performed simultaneously, which is not limited herein.
Step S95: adjust the network parameters of the question answering model using the loss value.
Specifically, the network parameters of the question answering model can be adjusted using the loss value by means of stochastic gradient descent (SGD), batch gradient descent (BGD), mini-batch gradient descent (MBGD), and the like. Batch gradient descent uses all samples for each parameter update; stochastic gradient descent uses one sample for each parameter update; mini-batch gradient descent uses a batch of samples for each parameter update, which is not described in detail herein.
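As a minimal sketch, assuming a PyTorch model and data loader, mini-batch updates of the network parameters could look like the following; the optimizer choice, learning rate, and the compute_loss helper are assumptions for illustration only.

import torch

def train(model, dataloader, compute_loss, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)  # stochastic / mini-batch gradient descent
    for batch in dataloader:               # one batch of training samples per iteration
        optimizer.zero_grad()
        loss = compute_loss(model, batch)  # loss value obtained as described above
        loss.backward()                    # back-propagate the loss
        optimizer.step()                   # adjust the network parameters of the question answering model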
Different from the foregoing embodiments, training samples are obtained and sample reference texts of a plurality of knowledge points are obtained; the individual semantic extraction network of the question answering model is used to extract the sample individual semantic representations of the several sample words, and the original semantic extraction network of the question answering model is used to extract the sample original semantic representation of each sample reference text; based on the prediction network of the question answering model, the sample individual semantic representations of the several sample words and the sample original semantic representations of the sample reference texts are used to predict the first prediction probability value and the second prediction probability value of each sample word in the sample chapter text; on this basis, the loss value of the question answering model is obtained based on the sample answer text, the first prediction probability value, and the second prediction probability value, and the network parameters of the question answering model are adjusted using the loss value. In this way, external knowledge can be introduced in the training process, which helps expand the background of the sample chapter text and the sample question text and thus helps improve the accuracy of the question answering model.
Referring to fig. 11, fig. 11 is a schematic diagram of a frame of an electronic device 110 according to an embodiment of the application. The electronic device 110 comprises a memory 111 and a processor 112 coupled to each other, the memory 111 having stored therein program instructions, the processor 112 being adapted to execute the program instructions to implement the steps of any of the above-described question answering method embodiments or to implement the steps of the above-described question answering model training method embodiments. In particular, electronic device 110 may include, but is not limited to: desktop computers, notebook computers, tablet computers, servers, etc., are not limited herein.
In particular, the processor 112 is configured to control itself and the memory 111 to implement the steps of any of the question answering method embodiments described above, or to implement the steps of the training method embodiments of the question answering model described above. The processor 112 may also be referred to as a CPU (Central Processing Unit). The processor 112 may be an integrated circuit chip with signal processing capabilities. The processor 112 may also be a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 112 may be implemented jointly by integrated circuit chips.
In the embodiment of the disclosure, the processor 112 is configured to obtain a question text and a chapter text, and obtain reference texts of a plurality of knowledge points; wherein the question text and the chapter text comprise a plurality of words, and a plurality of knowledge points are related to at least one of the question text and the chapter text; the processor 112 is configured to extract individual semantic representations of the plurality of terms and extract original semantic representations of the respective reference texts; the processor 112 is configured to predict answer text of the question text from the chapter text using the individual semantic representations of the words and the original semantic representations of the respective reference texts.
According to the scheme, the question text and the chapter text are obtained, the reference text of a plurality of knowledge points is obtained, the question text and the chapter text contain a plurality of words, the knowledge points are related to at least one of the question text and the chapter text, on the basis, individual semantic representations of the words are extracted, original semantic representations of the reference text are extracted, and therefore answer text of the question text is predicted from the chapter text by the individual semantic representations of the words and the original semantic representations of the reference text. Therefore, by acquiring the reference texts of a plurality of knowledge points, external knowledge can be introduced in the question answering process, which is beneficial to expanding the text of the chapter and the background of the question text and further beneficial to improving the accuracy of the question answering.
In some disclosed embodiments, the processor 112 is configured to obtain semantic relatedness between the word and each reference text by using the individual semantic representation of each word and the original semantic representation of each reference text, respectively; the processor 112 is configured to fuse each original semantic representation by using the semantic association degree between the term and each reference text, so as to obtain a fused semantic representation of the term; the processor 112 is configured to predict answer text from the chapter text based on individual semantic representations of the terms and the fused semantic representations.
Different from the embodiment, the original semantic representations are fused through the semantic association degrees between the words and each reference text respectively to obtain the fused semantic representation, namely the fused semantic representation of the words is more dependent on the original semantic representation of the reference text with large semantic association degrees, on the basis, the answer text is predicted by utilizing the individual semantic representation of the words and the fused semantic representation of the introduced external knowledge, and the accuracy of the answer text can be improved.
In some disclosed embodiments, the processor 112 is configured to treat one of the words as a current word; the processor 112 is configured to perform dot multiplication on the individual semantic representation of the current term and the original semantic representation of each reference text, so as to obtain an initial association degree between the current term and each reference text; the processor 112 is configured to normalize the initial association degrees between the current word and each reference text, so as to obtain semantic association degrees between the current word and each reference text.
Different from the foregoing embodiment, by respectively using one term of the plurality of terms as the current term, and directly performing dot multiplication on the individual semantic representation of the current term and the original semantic representation of each reference text, an initial association degree between the current term and each reference text is obtained, and the semantic association degree can be obtained by normalizing according to the initial association degree, so that the complexity of obtaining the semantic association degree can be reduced.
In some disclosed embodiments, the processor 112 is configured to predict a first probability value and a second probability value for each term in the chapter text using the individual semantic representations and the fused semantic representations of the terms and the global semantic representations of the global semantics of the terms, respectively; the processor 112 is configured to determine a start word and an end word in the chapter text based on the first probability value and the second probability value, and obtain an answer text using the start word and the end word; the global semantic representation is obtained based on individual semantic representations of all words, the first probability value represents the possibility that the words are the starting positions of the answer texts, and the second probability value represents the possibility that the words are the ending positions of the answer texts.
Different from the foregoing embodiment, by further introducing the global semantic representation representing the whole semantics of a plurality of terms, the local semantic information (i.e., the individual semantic representation), the external knowledge (i.e., the fusion semantic representation) and the global semantic information (i.e., the global semantic representation) can be synthesized in the process of predicting the answer text, so that the accuracy of predicting the answer text can be improved.
In some disclosed embodiments, the processor 112 is configured to predict a first probability value for each term in the chapter text using the individual semantic representations and the fused semantic representations of the several terms and a global semantic representation for representing the whole semantics of the several terms; the processor 112 is configured to predict the second probability value of each term using the individual semantic representations and the fused semantic representations of the several terms, the global semantic representation, and the individual semantic representation of the term corresponding to the maximum first probability value.
Different from the foregoing embodiment, the first probability value of each term is predicted, and then the second probability value is predicted depending on the individual semantic representation of the term corresponding to the maximum first probability value, the individual semantic representations of the terms, the fusion semantic representations and the global semantic representations, so that the prediction result of the start position is fully considered in the prediction process of the end position, thereby being beneficial to improving the accuracy of the prediction of the end position.
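A hedged sketch of this two-step prediction follows; first_net and second_net are hypothetical callables standing in for the first and second prediction networks, and the example ignores batching for brevity.

import torch

def predict_span(H, fused, global_rep, first_net, second_net):
    start_logits = first_net(H, fused, global_rep)            # first probability values (as logits) for each word
    best_start = torch.argmax(start_logits)                   # word with the maximum first probability value
    start_rep = H[best_start]                                  # its individual semantic representation
    end_logits = second_net(H, fused, global_rep, start_rep)  # second probability values (as logits) for each word
    return start_logits, end_logits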
In some disclosed embodiments, the global semantic representation is the same size as the individual semantic representations, and the processor 112 is configured to perform any of the following to obtain the global semantic representation: the individual semantic representation of the words at the preset positions in the question text and the chapter text is used as the global semantic representation; carrying out global average pooling on individual semantic representations of a plurality of words to obtain global semantic representations; and carrying out global maximum pooling on individual semantic representations of a plurality of words to obtain global semantic representations.
Different from the foregoing embodiment, directly using the individual semantic representation of the word at the preset position in the question text and the chapter text as the global semantic representation can reduce the complexity of acquiring the global semantic representation; performing global average pooling on the individual semantic representations of the several words to obtain the global semantic representation can improve the accuracy of the global semantic representation; and performing global max pooling on the individual semantic representations of the several words to obtain the global semantic representation can likewise improve its accuracy.
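The three options can be sketched as follows, assuming H is the (n_words, hidden_size) matrix of individual semantic representations and that the word at the preset position sits at index 0, which is an assumption made only for this example.

import numpy as np

def global_representation(H: np.ndarray, mode: str = "preset") -> np.ndarray:
    if mode == "preset":
        return H[0]            # individual representation of the word at the preset position
    if mode == "avg":
        return H.mean(axis=0)  # global average pooling
    if mode == "max":
        return H.max(axis=0)   # global max pooling
    raise ValueError(mode)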
In some disclosed embodiments, the answer text is obtained based on a first probability value and a second probability value of each word in the chapter text, the first probability value representing the likelihood that the word is the starting position of the answer text and the second probability value representing the likelihood that the word is the ending position of the answer text, and the first probability value and the second probability value are predicted using the individual semantic representations and the fused semantic representations of the several words and a global semantic representation for representing the whole semantics of the several words. The processor 112 is configured to select, from the individual semantic representations of the several words, the individual semantic representation of the word at the starting position of the answer text and the individual semantic representation of the word at the ending position of the answer text; the processor 112 is configured to predict a third probability value using the individual semantic representation of the starting position word, the individual semantic representation of the ending position word, and the global semantic representation, wherein the third probability value represents the likelihood that no answer text for the question text exists in the chapter text; the processor 112 is configured to determine whether to output the answer text based on the first probability value of the starting position word, the second probability value of the ending position word, and the third probability value.
Unlike the foregoing embodiments, the individual semantic representation of the word at the starting position of the answer text and the individual semantic representation of the word at the ending position of the answer text are selected from the individual semantic representations of the several words, and a third probability value is predicted using the individual semantic representation of the starting position word, the individual semantic representation of the ending position word, and the global semantic representation, where the third probability value represents the likelihood that no answer text for the question text exists in the chapter text. Whether to output the answer text is then determined based on the first probability value of the starting position word, the second probability value of the ending position word, and the third probability value, so the robustness of question answering can be further improved. In addition, since both the answer text prediction and the reject-answer prediction refer to the global semantic representation, an interaction mechanism is introduced between the two processes: when the answer text prediction process gives an answer text with higher confidence, the reject-answer prediction process can take this information into account through the global semantic representation and tends to output a "do not refuse to answer" prediction; conversely, when the answer text prediction process gives an answer text with lower confidence, the reject-answer prediction process can likewise take this information into account through the global semantic representation and tends to output a "refuse to answer" prediction, so the accuracy of the reject-answer prediction can be improved.
In some disclosed embodiments, the processor 112 is configured to obtain a product of probabilities of a first probability value for a start position word and a second probability value for an end position word; the processor 112 is configured to determine to output the answer text if the product of the probabilities is greater than the third probability value; the processor 112 is configured to determine not to output the answer text if the product of the probabilities is not greater than the third probability value.
Unlike the foregoing embodiments, it is possible to advantageously reduce the complexity of determining whether to output the answer text by obtaining the product of the probabilities of the first probability value of the start position word and the second probability value of the end position word and comparing the magnitude relation between the product of the probabilities and the third probability value to determine whether to output the answer text.
In some disclosed embodiments, the processor 112 is configured to identify keywords from at least one of the question text and the chapter text, and obtain a plurality of keywords; the processor 112 is configured to obtain, from a preset knowledge dictionary, a reference text of a knowledge point related to the keyword.
Different from the embodiment, the method and the device are characterized in that at least one of the question text and the chapter text is subjected to keyword recognition to obtain a plurality of keywords, so that the reference text of the knowledge points related to the keywords is obtained from the preset knowledge dictionary, the relevance between the reference text and the chapter text and the question text can be improved, the reference value of external knowledge in the process of answering the questions can be improved, and the accuracy of answering the questions can be improved.
In some disclosed embodiments, the answer text is predicted based on a question answer model, the question answer model is trained using training samples, and the training samples include: sample question text, sample chapter text, and sample answer text.
Unlike the previous embodiments, the answer text is predicted by the question answer model, which can be advantageous to improve the efficiency of question answer.
In some disclosed embodiments, the processor 112 is configured to obtain a sample chapter text, a first sample question text, and a first sample answer text of the first sample question text, wherein the first sample answer text is intercepted from the sample chapter text; the processor 112 is configured to perform keyword recognition on the first sample question text, and replace a recognized keyword with another, different word to obtain a second sample question text; the processor 112 is configured to use the combination of the sample chapter text, the first sample question text, and the first sample answer text, and the combination of the sample chapter text, the second sample question text, and a preset text as training samples; wherein the preset text is used to indicate that no answer text for answering the second sample question text exists in the sample chapter text.
Different from the foregoing embodiment, the keyword recognition is performed on the first sample question text, and the recognized keyword is replaced by other different words, so that a second question text can be constructed, no answer text for answering the second question text exists in the sample chapter text, and a training sample is constructed based on the sample chapter text, the first sample question text and the first sample answer text, and the second sample question text and the preset text, so that the training sample contains both the sample question text which can be answered and the sample question text which cannot be answered, and further, the robustness of the question answer model can be improved in the training process using the training sample.
Referring to fig. 12, fig. 12 is a schematic diagram illustrating a frame of a storage device 120 according to an embodiment of the application. The storage means 120 stores program instructions 121 that can be executed by the processor, the program instructions 121 being configured to implement the steps of any of the above-described embodiments of the question answering method, or to implement the steps of the above-described embodiments of the training method of the question answering model.
According to the scheme, external knowledge can be introduced in the question answering process, expansion of the text of the chapter and the background of the question text is facilitated, and accordingly accuracy of the question answering is improved.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The foregoing description of various embodiments is intended to highlight differences between the various embodiments, which may be the same or similar to each other by reference, and is not repeated herein for the sake of brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to execute all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.

Claims (11)

1. A method of question answering, comprising:
acquiring a question text and a chapter text, and acquiring reference texts of a plurality of knowledge points; wherein the question text and the chapter text include a plurality of words, and the plurality of knowledge points are related to at least one of the question text and the chapter text;
extracting individual semantic representations of the words and extracting original semantic representations of the reference texts;
obtaining semantic association degrees between the words and the reference texts by using the individual semantic representation of each word and the original semantic representation of each reference text respectively;
fusing each original semantic representation by utilizing the semantic association degree between the word and each reference text to obtain a fused semantic representation of the word;
respectively predicting a first probability value and a second probability value of each word in the chapter text by using the individual semantic representations and the fusion semantic representations of the words and a global semantic representation for representing the whole semantics of the words;
determining a start word and an end word in the chapter text based on the first probability value and the second probability value, and obtaining an answer text by using the start word and the end word;
The global semantic representation is obtained based on individual semantic representations of all the words, the first probability value represents the possibility that the words are the starting positions of the answer texts, and the second probability value represents the possibility that the words are the ending positions of the answer texts.
2. The method of claim 1, wherein said deriving semantic association between each of said terms and each of said reference texts using an individual semantic representation of each of said terms and an original semantic representation of each of said reference texts, respectively, comprises:
respectively taking one word in the plurality of words as a current word;
performing dot multiplication on the individual semantic representation of the current word and the original semantic representation of each reference text respectively to obtain the initial association degree between the current word and each reference text;
normalizing the initial association degree between the current word and each reference text to obtain the semantic association degree between the current word and each reference text.
3. The method of claim 1, wherein the predicting the first probability value and the second probability value for each term in the chapter text using the individual semantic representation and the fused semantic representation of the number of terms and a global semantic representation for representing the global semantics of the number of terms, respectively, comprises:
Predicting a first probability value of each term in the chapter text by using the individual semantic representation and the fusion semantic representation of the terms and a global semantic representation for representing the whole semantics of the terms;
and predicting the second probability value of each word by using the individual semantic representations and the fused semantic representations of the words, the global semantic representation, and the individual semantic representation of the word corresponding to the maximum first probability value.
4. The method of claim 1, wherein the global semantic representation and the individual semantic representation are the same size; the global semantic representation is obtained by any one of the following means:
taking the individual semantic representations of the words at preset positions in the question text and the chapter text as the global semantic representations;
carrying out global average pooling on individual semantic representations of the words to obtain the global semantic representation;
and carrying out global maximum pooling on individual semantic representations of the words to obtain the global semantic representation.
5. The method of claim 1, wherein the determining a start word and an end word in the chapter text based on the first probability value and the second probability value, and using the start word and the end word, the method further comprises, after obtaining an answer text:
selecting, from the individual semantic representations of the words, the individual semantic representation of the word at the starting position of the answer text, and selecting the individual semantic representation of the word at the ending position of the answer text;
predicting to obtain a third probability value by using the individual semantic representation of the initial position word, the individual semantic representation of the end position word and the global semantic representation; wherein the third probability value represents a likelihood that no answer text for the question text exists in the chapter text;
determining whether to output the answer text based on the first probability value of the starting position word, the second probability value of the ending position word, and the third probability value.
6. The method of claim 5, wherein the determining whether to output the answer text based on the first probability value for the starting location word, the second probability value for the ending location word, and the third probability value comprises:
obtaining a product of probabilities of a first probability value of the starting position word and a second probability value of the ending position word;
determining to output the answer text if the product of the probabilities is greater than the third probability value;
In the case where the product of the probabilities is not greater than the third probability value, it is determined that the answer text is not output.
7. The method of claim 1, wherein the obtaining the reference text for the plurality of knowledge points comprises:
carrying out keyword recognition on at least one of the question text and the chapter text to obtain a plurality of keywords;
and acquiring a reference text of a knowledge point related to the keyword from a preset knowledge dictionary.
8. The method of claim 1, wherein the answer text is predicted based on a question answer model, the question answer model is trained using training samples, and the training samples comprise: sample question text, sample chapter text, and sample answer text.
9. The method of claim 8, wherein the step of obtaining training samples comprises:
acquiring a sample chapter text, a first sample question text and a first sample answer text of the first sample question text; wherein the first sample answer text is truncated from the sample chapter text;
keyword recognition is carried out on the first sample question text, and keywords obtained through recognition are replaced by other different words, so that a second sample question text is obtained;
Taking the combination of the sample chapter text, the first sample question text and the first sample answer text and the combination of the sample chapter text, the second sample question text and a preset text as training samples;
and the preset text is used for indicating that no answer text for answering the second sample question text exists in the sample chapter text.
10. An electronic device comprising a memory and a processor coupled to each other, the memory having stored therein program instructions for executing the program instructions to implement the question answering method of any one of claims 1 to 9.
11. A storage device storing program instructions executable by a processor for implementing the question answering method according to any one of claims 1 to 9.
CN202011627778.7A 2020-12-31 2020-12-31 Question answering method, electronic device and storage device Active CN112685548B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011627778.7A CN112685548B (en) 2020-12-31 2020-12-31 Question answering method, electronic device and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011627778.7A CN112685548B (en) 2020-12-31 2020-12-31 Question answering method, electronic device and storage device

Publications (2)

Publication Number Publication Date
CN112685548A CN112685548A (en) 2021-04-20
CN112685548B true CN112685548B (en) 2023-09-08

Family

ID=75455876

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011627778.7A Active CN112685548B (en) 2020-12-31 2020-12-31 Question answering method, electronic device and storage device

Country Status (1)

Country Link
CN (1) CN112685548B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722436A (en) * 2021-08-30 2021-11-30 平安科技(深圳)有限公司 Text information extraction method and device, computer equipment and storage medium


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11586827B2 (en) * 2017-05-10 2023-02-21 Oracle International Corporation Generating desired discourse structure from an arbitrary text
CN108052577B (en) * 2017-12-08 2022-06-14 北京百度网讯科技有限公司 Universal text content mining method, device, server and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101662450B1 (en) * 2015-05-29 2016-10-05 포항공과대학교 산학협력단 Multi-source hybrid question answering method and system thereof
WO2018157805A1 (en) * 2017-03-03 2018-09-07 腾讯科技(深圳)有限公司 Automatic questioning and answering processing method and automatic questioning and answering system
CN108763535A (en) * 2018-05-31 2018-11-06 科大讯飞股份有限公司 Information acquisition method and device
CN108959396A (en) * 2018-06-04 2018-12-07 众安信息技术服务有限公司 Machine reading model training method and device, answering method and device
CN109558585A (en) * 2018-10-26 2019-04-02 深圳点猫科技有限公司 A kind of answer Automatic-searching method and electronic equipment based on educational system
CN109670029A (en) * 2018-12-28 2019-04-23 百度在线网络技术(北京)有限公司 For determining the method, apparatus, computer equipment and storage medium of problem answers
EP3709207A1 (en) * 2019-03-12 2020-09-16 Beijing Baidu Netcom Science and Technology Co., Ltd. Visual question answering model, electronic device and storage medium
CN110188362A (en) * 2019-06-10 2019-08-30 北京百度网讯科技有限公司 Text handling method and device
CN110674272A (en) * 2019-09-05 2020-01-10 科大讯飞股份有限公司 Question answer determining method and related device
CN110797010A (en) * 2019-10-31 2020-02-14 腾讯科技(深圳)有限公司 Question-answer scoring method, device, equipment and storage medium based on artificial intelligence

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Knowledge graph question answering method based on question generation; Qiao Zhenhao, Che Wanxiang, Liu Ting; Intelligent Computer and Applications; Vol. 10, No. 5; 1-5 *

Also Published As

Publication number Publication date
CN112685548A (en) 2021-04-20

Similar Documents

Publication Publication Date Title
CN108304468B (en) Text classification method and text classification device
WO2018028077A1 (en) Deep learning based method and device for chinese semantics analysis
US11232358B1 (en) Task specific processing of regulatory content
JP5544602B2 (en) Word semantic relationship extraction apparatus and word semantic relationship extraction method
CN110968725B (en) Image content description information generation method, electronic device and storage medium
Sartakhti et al. Persian language model based on BiLSTM model on COVID-19 corpus
Banik et al. Gru based named entity recognition system for bangla online newspapers
Singh et al. Named entity recognition for nepali language
CN115238697A (en) Judicial named entity recognition method based on natural language processing
CN112685548B (en) Question answering method, electronic device and storage device
Anjum et al. Exploring humor in natural language processing: a comprehensive review of JOKER tasks at CLEF symposium 2023
Aires et al. A deep learning approach to classify aspect-level sentiment using small datasets
Ojo et al. Transformer-based approaches to sentiment detection
CN110705290B (en) Webpage classification method and device
CN112528003B (en) Multi-item selection question-answering method based on semantic sorting and knowledge correction
Shah et al. A study of various word embeddings in deep learning
Saifullah et al. Cyberbullying Text Identification based on Deep Learning and Transformer-based Language Models
Kim et al. CNN based sentence classification with semantic features using word clustering
Mohammadi et al. Cooking up a neural-based model for recipe classification
CN114282542A (en) Network public opinion monitoring method and equipment
CN114676699A (en) Entity emotion analysis method and device, computer equipment and storage medium
Ananth et al. Grammatical tagging for the Kannada text documents using hybrid bidirectional long-short term memory model
CN114298048A (en) Named entity identification method and device
CN114692610A (en) Keyword determination method and device
CN110569331A (en) Context-based relevance prediction method and device and storage equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 311-2, 3 / F, building 5, East District, No.10 courtyard, northwest Wangdong Road, Haidian District, Beijing
Applicant after: iFLYTEK (Beijing) Co.,Ltd.
Applicant after: Hebei Xunfei Institute of Artificial Intelligence
Applicant after: IFLYTEK Co.,Ltd.
Address before: 311-2, 3 / F, building 5, East District, No.10 courtyard, northwest Wangdong Road, Haidian District, Beijing
Applicant before: Zhongke Xunfei Internet (Beijing) Information Technology Co.,Ltd.
Applicant before: Hebei Xunfei Institute of Artificial Intelligence
Applicant before: IFLYTEK Co.,Ltd.

GR01 Patent grant