CN114880485A - Reading comprehension answer generation method and device, computer equipment and storage medium - Google Patents

Reading comprehension answer generation method and device, computer equipment and storage medium

Info

Publication number
CN114880485A
CN114880485A (application CN202210512242.3A)
Authority
CN
China
Prior art keywords
character string
text
processed
reading comprehension
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210512242.3A
Other languages
Chinese (zh)
Inventor
喻祥
张智超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Pudu Technology Co Ltd
Original Assignee
Shenzhen Pudu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Pudu Technology Co Ltd
Priority to CN202210512242.3A
Publication of CN114880485A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates


Abstract

The application relates to a reading comprehension answer generation method and apparatus, a computer device, and a storage medium. The method comprises the following steps: acquiring a question to be processed; obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text; obtaining a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model; and searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as the answer to the question to be processed. Because a prefix tree is constructed from the reading comprehension text and the answer generated from the predicted text is constrained by the prefix tree, the answer is guaranteed to be drawn from the reading comprehension text, which improves the accuracy of reading comprehension answers.

Description

Reading comprehension answer generation method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a method and an apparatus for generating reading comprehension answers, a computer device, and a storage medium.
Background
Machine reading comprehension is a research hotspot in the field of natural language processing and a long-term goal of artificial intelligence: enabling machines to understand and process human language. In the typical setting, a machine reads a document and answers questions about it.
Early machine reading comprehension systems determined the answers to questions mainly with rule-based systems. Although such systems achieved an accuracy of 30%-40%, they depended heavily on grammar- and linguistics-based tools, their data sets were too small, and effective features were difficult to construct. With the development of deep learning, reading comprehension answers came to be generated with seq2seq models, which alleviates these problems but leaves others: a traditional seq2seq model generates reading comprehension answers with some randomness, the generated content may fall outside the source article, and the content generated for the same reading comprehension task may differ from run to run. The accuracy of reading comprehension answer generation in the prior art is therefore not high.
Disclosure of Invention
In view of the above, there is a need to provide a reading comprehension answer generation method, apparatus, computer device, and storage medium capable of improving the accuracy of answers.
In a first aspect, the present application provides a reading comprehension answer generation method. The method comprises the following steps:
acquiring a question to be processed;
obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text;
obtaining a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as the answer to the question to be processed.
In one embodiment, searching the prefix tree for the target character string corresponding to the predicted text comprises:
taking the predicted text as an initial character string;
obtaining N next character strings based on the initial character string, and obtaining the target character string from the last next character string, wherein N is an integer greater than or equal to zero;
wherein the initial character string is the previous character string of the first next character string; the next characters of a previous character string are searched for in the prefix tree, and the coverage gain of each found next character is determined according to the question to be processed and the reading comprehension text; the next character with the largest coverage gain is spliced onto the previous character string to obtain the next character string; and when the next character of a previous character string is the preset ending character, that previous character string is the last next character string and is taken as the target character string.
In one embodiment, determining the coverage gain comprises:
acquiring a first character string comprising the question to be processed and the reading comprehension text;
taking the previous character string corresponding to the next character as a second character string;
splicing the next character onto the second character string to obtain a third character string;
determining a first longest common subsequence between the first character string and the third character string;
determining a second longest common subsequence between the first character string and the second character string;
and determining the difference between the first longest common subsequence and the second longest common subsequence (i.e., the difference of their lengths) as the coverage gain.
In one embodiment, constructing the prefix tree based on the reading comprehension text comprises:
performing sentence segmentation on the reading comprehension text to obtain a plurality of sentences;
and generating the prefix tree from the plurality of sentences.
In one embodiment, obtaining the predicted text according to the reading comprehension text, the question to be processed, and the preset language generation model comprises:
splicing the reading comprehension text and the question to be processed to obtain a target sequence;
obtaining a target vector according to the word vector, the position vector and the segment vector of the target sequence;
and inputting the target vector into the BERT model to obtain a predicted text output by the BERT model.
In one embodiment, splicing the reading comprehension text and the question to be processed comprises:
adding a start mark before the start character of the reading comprehension text;
adding a splicing mark between the reading comprehension text and the question to be processed;
adding an end mark after the end character of the question to be processed;
and sequentially connecting the start mark, the reading comprehension text, the splicing mark, the question to be processed, and the end mark with a connection function.
In one embodiment, determining the position vector comprises:
acquiring preset absolute position encoding vectors and a preset hyperparameter;
determining a set of base vectors according to the absolute position encoding vectors and the hyperparameter;
determining the absolute positions corresponding to any two of the absolute position encoding vectors;
determining a target position according to the absolute positions;
generating the encoding vector of the target position according to the base vectors in the set corresponding to the absolute positions and the hyperparameter;
and determining the encoding vectors of all target positions as the position vector.
In a second aspect, the present application further provides a reading comprehension answer generation device, comprising:
a data acquisition module, configured to acquire a question to be processed;
a prefix tree generation module, configured to obtain a corresponding reading comprehension text according to the question to be processed and to construct a prefix tree based on the reading comprehension text;
a data prediction module, configured to obtain a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and an answer determination module, configured to search the prefix tree for a target character string corresponding to the predicted text and to determine the target character string as the answer to the question to be processed.
In a third aspect, the present application further provides a computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following steps:
acquiring a question to be processed;
obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text;
obtaining a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as the answer to the question to be processed.
In a fourth aspect, the present application further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
acquiring a question to be processed;
obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text;
obtaining a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as the answer to the question to be processed.
With the above reading comprehension answer generation method, device, computer equipment, storage medium, and computer program product, a question to be processed is acquired; a corresponding reading comprehension text is obtained according to the question to be processed, and a prefix tree is constructed based on the reading comprehension text; a predicted text is obtained according to the reading comprehension text, the question to be processed, and a preset language generation model; and a target character string corresponding to the predicted text is searched for in the prefix tree and determined as the answer to the question to be processed. Because the prefix tree is constructed from the reading comprehension text and the answer generated from the predicted text is constrained by the prefix tree, the answer is guaranteed to be drawn from the reading comprehension text, which improves the accuracy of the reading comprehension answer.
Drawings
FIG. 1 is a flowchart of a reading comprehension answer generation method in one embodiment;
FIG. 2 is a schematic structural diagram of a prefix tree in one embodiment;
FIG. 3 is a flowchart of step 106 in one embodiment;
FIG. 4 is a diagram illustrating the relationship between absolute positions and target positions in one embodiment;
FIG. 5 is a flowchart of a reading comprehension answer generation method in another embodiment;
FIG. 6 is a block diagram of a reading comprehension answer generation device in one embodiment;
FIG. 7 is a diagram of the internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in FIG. 1, a reading comprehension answer generation method is provided. This embodiment is illustrated by applying the method to a terminal; it is to be understood that the method may also be applied to a server, or to a system comprising a terminal and a server and implemented through interaction between the two. In this embodiment, the method includes the following steps:
Step 102: acquire a question to be processed.
Specifically, the terminal may obtain the question to be processed in a number of ways: for example, it may receive a question entered by a user through a data interface, or it may actively read the question from another device; this embodiment is not limited in this respect. The terminal may be a personal computer, a notebook computer, a smartphone, a robot, or another device.
Step 104: obtain a corresponding reading comprehension text according to the question to be processed, and construct a prefix tree based on the reading comprehension text.
FIG. 2 is a schematic structural diagram of the prefix tree in one embodiment. As shown in FIG. 2, apart from the root node and the leaf nodes, each node of the tree stores one character of a character string drawn from the reading comprehension text, and the nodes along each path combine to form a complete character string.
It should be understood that a character is a unit of text such as a letter, digit, operator, punctuation mark, or other symbol, including certain functional symbols. Characters are the collective name for the letters, digits, and symbols used in electronic computers and radio communication, and are the smallest units of data access in a data structure. A character string is a sequence of characters.
Specifically, after acquiring the question to be processed, the terminal may send a data acquisition instruction to a server, receive the reading comprehension text corresponding to the question returned by the server, process the reading comprehension text to obtain each of its characters, and finally store all the characters in the nodes of a tree to obtain the prefix tree.
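By way of illustration only, a minimal prefix tree along these lines might be built as follows in Python; the class and method names are hypothetical, and the marker EOS stands in for the preset ending character.

```python
class TrieNode:
    """One node of the prefix tree; each non-root node holds one character."""
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_end = False   # True if a complete character string ends here


class PrefixTree:
    EOS = "<EOS>"  # stands in for the preset ending character

    def __init__(self):
        self.root = TrieNode()

    def insert(self, sentence: str) -> None:
        """Store one sentence of the reading comprehension text, character by character."""
        node = self.root
        for ch in sentence:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def next_chars(self, prefix: str) -> list:
        """Return the possible next characters after `prefix` (EOS if a string ends there)."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []  # the prefix does not occur in the reading comprehension text
            node = node.children[ch]
        out = list(node.children)
        if node.is_end:
            out.append(self.EOS)
        return out


tree = PrefixTree()
for sentence in ["tomorrow will be better", "tomorrow it rains"]:
    tree.insert(sentence)
print(tree.next_chars("tomorrow "))  # ['w', 'i'] -> candidate continuations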
Step 106: obtain a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model.
The preset language generation model is a pre-built language generation model, such as a sequence-to-sequence (seq2seq) model or a BERT (Bidirectional Encoder Representations from Transformers) language representation model in UniLM (Unified Language Model) mode.
Specifically, the terminal may directly input the reading comprehension text and the question to be processed into the preset language generation model to obtain the predicted text output by the model, or it may first pre-process the reading comprehension text and the question and then input the pre-processed data into the model to obtain the predicted text; this embodiment is not limited in this respect.
Step 108: search the prefix tree for a target character string corresponding to the predicted text, and determine the target character string as the answer to the question to be processed.
Specifically, the predicted text may be looked up in the prefix tree, and a set of target character strings containing the predicted text determined there. Referring again to FIG. 2, assuming the predicted text is "tomorrow", the character strings in the prefix tree containing "tomorrow" are "tomorrow will be better", "tomorrow it rains", "tomorrow afternoon meeting", and "tomorrow afternoon vacation", and a target character string, for example "tomorrow will be better", is determined from these four character strings as the answer. Because the prefix tree is generated from the reading comprehension text, the target character string is necessarily part of that text, so the answer does not exceed the scope of the reading comprehension text, which improves the quality and reliability of the reading comprehension question answering.
It should be understood that this embodiment may be applied to a variety of reading comprehension question-answering scenarios. Taking the terminal being a robot and the reading comprehension text being a dish description as an example, a user may ask the robot a question about a dish and the robot outputs the corresponding answer, improving the human-computer interaction experience.
In this embodiment, a question to be processed is acquired; a corresponding reading comprehension text is obtained according to the question, and a prefix tree is constructed based on the reading comprehension text; a predicted text is obtained according to the reading comprehension text, the question to be processed, and a preset language generation model; and a target character string corresponding to the predicted text is searched for in the prefix tree and determined as the answer to the question. Because the prefix tree is constructed from the reading comprehension text and the answer generated from the predicted text is constrained by the prefix tree, the answer is guaranteed to be drawn from the reading comprehension text, which improves the accuracy of the reading comprehension answer.
In one embodiment, searching the prefix tree for the target character string corresponding to the predicted text comprises: taking the predicted text as an initial character string; and obtaining N next character strings based on the initial character string and obtaining the target character string from the last next character string, wherein N is an integer greater than or equal to zero. Here the initial character string is the previous character string of the first next character string; the next characters of a previous character string are searched for in the prefix tree, and the coverage gain of each found next character is determined according to the question to be processed and the reading comprehension text; the next character with the largest coverage gain is spliced onto the previous character string to obtain the next character string; and when the next character of a previous character string is the preset ending character, that previous character string is the last next character string and is taken as the target character string.
Specifically, after a next character string is obtained, its next character must again be looked up in the prefix tree; the iteration continues while the next character is not the preset ending character, and ends when it is, yielding the target character string.
Referring to FIG. 2 and taking the predicted text "tomorrow" as an example: with "tomorrow" as the initial character string, the next characters of "tomorrow" are looked up in the prefix tree. According to the prefix tree in FIG. 2, the next character may be "meeting" or "down"; the coverage gains of "meeting" and "down" are calculated, and if the coverage gain of "meeting" is larger, "meeting" is spliced onto "tomorrow" to obtain the next character string "tomorrow meeting". The following next characters are then looked up in the prefix tree in the same way until the next character is EOS, at which point the last next character string, for example "tomorrow will be better", is taken as the target character string.
In this embodiment, the next character with the largest coverage gain is spliced onto the previous character string to obtain the target character string, so that the target character string is maximally relevant to the question to be processed, which improves the quality of the answer.
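A sketch of this greedy, prefix-tree-constrained search might read as follows; it assumes the hypothetical PrefixTree above and a coverage_gain function, one possible form of which is sketched after the next embodiment.

```python
def constrained_search(tree, predicted_text, question, passage, coverage_gain,
                       max_steps=200):
    """Extend the predicted text character by character inside the prefix tree.

    At each step, the possible next characters are looked up in the tree and
    the one with the largest coverage gain is spliced on; the loop stops when
    the preset ending character (EOS) is reached, and the string built so far
    is returned as the target character string.
    """
    current = predicted_text
    for _ in range(max_steps):
        candidates = tree.next_chars(current)
        if not candidates:
            break                # predicted text does not occur in the tree
        if tree.EOS in candidates:
            break                # a complete string has been reached
        current += max(candidates,
                       key=lambda ch: coverage_gain(current, ch, question, passage))
    return current
```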
In one embodiment, determining the coverage gain comprises: acquiring a first character string comprising the question to be processed and the reading comprehension text; taking the previous character string corresponding to the next character as a second character string; splicing the next character onto the second character string to obtain a third character string; determining a first longest common subsequence between the first character string and the third character string; determining a second longest common subsequence between the first character string and the second character string; and determining the difference between the first longest common subsequence and the second longest common subsequence as the coverage gain.
It should be understood that the coverage gain may be calculated using the longest common subsequence (LCS), with reference to the following formula:
Δ(k) = LCS(Z(k), Q) - LCS(Z<t, Q)
where Δ(k) is the coverage gain of the next character k, Z(k) is the third character string, Q is the first character string, Z<t is the second character string, and LCS(·,·) denotes the length of the longest common subsequence.
Referring to FIG. 2 and taking the predicted text "tomorrow" as an example: to calculate the coverage gain of the next character "meeting", the longest common subsequence LCS1 of the third character string "tomorrow meeting" and the first character string is computed, together with the longest common subsequence LCS2 of the second character string "tomorrow" and the first character string; the coverage gain of "meeting" is then Δ1 = LCS1 - LCS2. To calculate the coverage gain of the next character "down", the longest common subsequence LCS3 of the third character string "tomorrow down" and the first character string is computed, together with LCS2 as above; the coverage gain of "down" is then Δ2 = LCS3 - LCS2.
In this embodiment, the coverage gain of the next character is determined from the longest common subsequence, which effectively increases the speed of the coverage gain calculation and thus the efficiency of answer generation.
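An illustrative implementation of this coverage-gain computation, using a standard dynamic-programming longest common subsequence, might look like this (the function names are assumptions):

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ca == cb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def coverage_gain(current: str, next_char: str, question: str, passage: str) -> int:
    """Coverage gain of a candidate next character k: LCS(Z(k), Q) - LCS(Z<t, Q)."""
    first = question + passage       # first character string Q
    second = current                 # second character string Z<t
    third = second + next_char       # third character string Z(k)
    return lcs_length(third, first) - lcs_length(second, first)
```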
In one embodiment, generating the prefix tree from the reading comprehension text comprises: performing sentence segmentation on the reading comprehension text to obtain a plurality of sentences; and generating the prefix tree from the plurality of sentences.
The reading comprehension text may be processed in a number of ways. In one example, specific separators such as periods, line breaks, or semicolons are located in the reading comprehension text, and the segmentation is performed with the Natural Language Toolkit (NLTK) framework.
In a specific implementation, after the plurality of sentences are obtained, each sentence may further be split into individual characters, which are stored in the nodes of the prefix tree to obtain the prefix tree.
In this embodiment, the reading comprehension text is segmented into sentences and the prefix tree is generated from the resulting sentences, which improves the speed and accuracy of prefix tree generation and thus the efficiency of answer generation.
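As one possible realization of this step, sketched with a regular expression (an NLTK sentence tokenizer could equally be used) and reusing the hypothetical PrefixTree from the earlier sketch:

```python
import re

def split_sentences(text: str) -> list:
    """Split the reading comprehension text on periods, semicolons and line breaks."""
    parts = re.split(r"[.;\n\u3002\uff1b]+", text)  # also matches Chinese 。 and ；
    return [p.strip() for p in parts if p.strip()]


def build_prefix_tree(text: str) -> "PrefixTree":
    """Build the prefix tree from the segmented sentences (PrefixTree sketched above)."""
    tree = PrefixTree()
    for sentence in split_sentences(text):
        tree.insert(sentence)
    return tree
```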
FIG. 3 is a flowchart of step 106 in one embodiment, in which the preset language generation model is a BERT model in UniLM mode. As shown in FIG. 3, obtaining the predicted text according to the reading comprehension text, the question to be processed, and the preset language generation model comprises the following steps:
Step 302: splice the reading comprehension text and the question to be processed to obtain a target sequence.
It should be understood that the reading comprehension text and the question to be processed may be spliced in a number of ways: for example, a "+" operator may be added between them, or a splicing function may be used; this embodiment is not limited in this respect.
In one example, splicing the reading comprehension text and the question to be processed comprises: adding a start mark before the start character of the reading comprehension text; adding a splicing mark between the reading comprehension text and the question to be processed; adding an end mark after the end character of the question to be processed; and sequentially connecting the start mark, the reading comprehension text, the splicing mark, the question to be processed, and the end mark with a connection function.
The start mark, splicing mark, end mark, and connection function may be chosen according to usage habits: for example, the start mark may be set to [CLS], the splicing mark to [SEP], the end mark to [EOS], and the connection function to the concat function.
Taking the reading comprehension text as S, the question as W, and the target sequence as M, the reading comprehension text and the question to be processed may be spliced with the following formula:
M = concat([CLS], S, [SEP], W, [EOS]).
Step 304: obtain a target vector according to the word vector, the position vector, and the segment vector of the target sequence.
Specifically, taking the target sequence M as an example, the word vector TokenEncoding(M), the position vector PositionEncoding(M), and the segment vector SegmentEncoding(M) of the target sequence are obtained, and the three vectors are summed to obtain the target vector.
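Illustratively, with pre-trained lookup tables for the three embeddings (the array names below are assumptions, not names from the disclosure), the target vector of the spliced sequence M can be formed by element-wise summation:

```python
import numpy as np

def target_vector(token_ids, segment_ids, tok_emb, pos_emb, seg_emb):
    """Sum the word, position and segment embeddings of the target sequence.

    tok_emb (vocab, d), pos_emb (max_len, d) and seg_emb (2, d) are assumed
    pre-trained lookup tables; token_ids and segment_ids are equal-length
    integer sequences for the spliced sequence M.
    """
    token_ids = np.asarray(token_ids)
    segment_ids = np.asarray(segment_ids)
    positions = np.arange(len(token_ids))
    return tok_emb[token_ids] + pos_emb[positions] + seg_emb[segment_ids]
```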
Step 306: input the target vector into the BERT model in UniLM mode to obtain the predicted text output by the BERT model.
UniLM can be fine-tuned for both natural language understanding tasks and natural language generation tasks, and a BERT model in UniLM mode can aggregate context with different self-attention masks, which makes it well suited to question-answer generation scenarios.
The BERT model in UniLM mode consists of a 24-layer Transformer network. The target vector {x_i} forms the input matrix H_0 = [x_1, ..., x_m], which is fed into the Transformer network, and the output of each layer is H_l = Transformer_l(H_{l-1}), where l is the layer index.
The self-attention output of the l-th Transformer layer can be computed with the following self-attention equation:
A_l = softmax(Q·K^T / √d_k + E)·V
where
Q = H_{l-1}·W_l^Q, K = H_{l-1}·W_l^K, V = H_{l-1}·W_l^V
Here A_l is the self-attention output of the l-th Transformer layer, softmax is the normalized exponential function, d_k is the number of columns of the matrices Q and K, and E is the mask matrix with which each layer controls the attention area of each word: an entry of 0 allows attention, and an entry of negative infinity masks it. For the first Transformer layer, Q, K, and V are obtained by linear transformations of the input matrix, with the W matrices as the linear-transformation parameters.
In this embodiment, the reading comprehension text and the question to be processed are processed and then input into a BERT model in UniLM mode to obtain the corresponding predicted text; compared with other models, this yields a more accurate predicted text.
In one embodiment, determining the position vector comprises: acquiring preset absolute position encoding vectors and a preset hyperparameter; determining a set of base vectors according to the absolute position encoding vectors and the hyperparameter; determining the absolute positions corresponding to any two of the absolute position encoding vectors; determining a target position according to the absolute positions; generating the encoding vector of the target position according to the base vectors in the set corresponding to the absolute positions and the hyperparameter; and determining the encoding vectors of all target positions as the position vector.
It should be understood that pre-training a BERT model requires a large corpus and considerable computing power; because of this cost, downstream tasks are usually developed on a publicly released pre-trained BERT model. The maximum input length supported by the public BERT model is 512, while reading comprehension texts often run to a thousand characters or more, so the usable length of the reading comprehension text is severely limited when the public BERT model is used for a downstream task.
The currently released BERT models use absolute position encoding with a maximum length of 512, and every position vector is learned by the model during training. Using a BERT model with a length greater than 512 would require retraining, which needs a large training corpus and consumes substantial computing resources, so the cost is very high.
To address this, the present embodiment hierarchically decomposes the already-trained absolute position encodings, taking the absolute position encoding vectors as base vectors, so that reading comprehension texts longer than 512 characters can be encoded without retraining the BERT model.
Specifically, assume that the absolute position encoding vectors already trained for the public BERT model are p_1, p_2, p_3, ..., p_n with n = 512. The BERT model in UniLM mode in this embodiment needs to construct, on top of that model, a new set of encoding vectors q_1, q_2, ..., q_m with m > n. The new encoding vectors may be obtained by:
q_{(i-1)·n+j} = α·u_i + (1 - α)·u_j
where α is a hyperparameter with 0 < α < 1 and α ≠ 1/2, and u_1, u_2, ..., u_n is the set of base vectors.
FIG. 4 is a diagram illustrating the relationship between absolute positions and target positions in one embodiment. Referring to FIG. 4, so that the new encoding vectors coincide with the absolute position encoding vectors whenever n is not exceeded, i.e., q_1 = p_1, q_2 = p_2, ..., q_n = p_n, which keeps the scheme compatible with the trained BERT model, the base vectors may be obtained by:
u_i = (p_i - α·p_1) / (1 - α)
Any two absolute positions i and j corresponding to vectors among the absolute position encodings can be expressed as a target position k = (i - 1)·n + j. The base vectors corresponding to the absolute positions i and j are u_i and u_j, their contributions at the target position are α·u_i and (1 - α)·u_j respectively, and the encoding vector of the target position is obtained by adding the two.
By hierarchically decomposing the position encoding, this embodiment expands the supported encoding length from n to n², breaking through the 512-character limit of the public BERT model: up to 512² = 262144 characters can be supported, so the model's input requirements are met even when the reading comprehension text is long.
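An illustrative sketch of this hierarchical decomposition follows; the value α = 0.4 is an assumption (any α in (0, 1) other than 1/2 satisfies the constraints above), and the 1-indexed formulas are written 0-indexed in the code.

```python
import numpy as np

def extend_position_embeddings(p: np.ndarray, alpha: float = 0.4) -> np.ndarray:
    """Expand trained position embeddings p (n, d) to q (n*n, d).

    Base vectors: u_i = (p_i - alpha * p_1) / (1 - alpha), which guarantees
    q_k = p_k for k <= n; target positions: q_{(i-1)*n+j} = alpha*u_i + (1-alpha)*u_j.
    """
    n, d = p.shape
    u = (p - alpha * p[0]) / (1.0 - alpha)          # base vector set
    q = np.empty((n * n, d), dtype=p.dtype)
    for i in range(n):                              # block sharing the same u_i
        q[i * n:(i + 1) * n] = alpha * u[i] + (1.0 - alpha) * u
    return q


p = np.random.randn(512, 768).astype(np.float32)    # stand-in for trained encodings
q = extend_position_embeddings(p)
assert np.allclose(q[:512], p, atol=1e-5)            # compatible with the trained model
print(q.shape)                                       # (262144, 768) = (512*512, 768)
```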
FIG. 5 is a flowchart of a reading comprehension answer generation method in another embodiment. As shown in FIG. 5, the method may include the following steps:
Step 502: acquire a question to be processed.
Step 504: obtain a corresponding reading comprehension text according to the question to be processed, and construct a prefix tree based on the reading comprehension text.
Step 506: splice the reading comprehension text and the question to be processed to obtain a target sequence.
Step 508: obtain a target vector according to the word vector, the position vector, and the segment vector of the target sequence.
Step 510: input the target vector into a BERT model in UniLM mode to obtain the predicted text output by the BERT model.
Step 512: search the prefix tree for a target character string corresponding to the predicted text, and determine the target character string as the answer to the question to be processed.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, there is no strict restriction on the order of these steps, and they may be performed in other orders. Moreover, at least some of the steps may comprise multiple sub-steps or stages; these are not necessarily performed at the same moment but may be performed at different moments, and not necessarily in sequence, and they may be performed in turn or in alternation with other steps or with sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the present application further provides a reading comprehension answer generation device for implementing the reading comprehension answer generation method described above. The solution provided by the device is similar to that described for the method, so for the specific limitations in the one or more device embodiments below, reference may be made to the limitations on the reading comprehension answer generation method above; details are not repeated here.
In one embodiment, as shown in FIG. 6, a reading comprehension answer generation device is provided, comprising a data acquisition module 602, a prefix tree generation module 604, a data prediction module 606, and an answer determination module 608, wherein:
the data acquisition module 602 is configured to acquire a question to be processed;
the prefix tree generation module 604 is configured to obtain a corresponding reading comprehension text according to the question to be processed and to construct a prefix tree based on the reading comprehension text;
the data prediction module 606 is configured to obtain a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and the answer determination module 608 is configured to search the prefix tree for a target character string corresponding to the predicted text and to determine the target character string as the answer to the question to be processed.
In one embodiment, the answer determination module 608 is further configured to take the predicted text as an initial character string; and to obtain N next character strings based on the initial character string and obtain the target character string from the last next character string, wherein N is an integer greater than or equal to zero. Here the initial character string is the previous character string of the first next character string; the next characters of a previous character string are searched for in the prefix tree, and the coverage gain of each found next character is determined according to the question to be processed and the reading comprehension text; the next character with the largest coverage gain is spliced onto the previous character string to obtain the next character string; and when the next character of a previous character string is the preset ending character, that previous character string is the last next character string and is taken as the target character string.
In one embodiment, the answer determination module 608 is further configured to acquire a first character string comprising the question to be processed and the reading comprehension text; take the previous character string corresponding to the next character as a second character string; splice the next character onto the second character string to obtain a third character string; determine a first longest common subsequence between the first character string and the third character string; determine a second longest common subsequence between the first character string and the second character string; and determine the difference between the first longest common subsequence and the second longest common subsequence as the coverage gain.
In one embodiment, the prefix tree generation module 604 is further configured to perform sentence segmentation on the reading comprehension text to obtain a plurality of sentences, and to generate the prefix tree from the plurality of sentences.
In one embodiment, the preset language generation model is a BERT model in UniLM mode, and the data prediction module 606 is further configured to splice the reading comprehension text and the question to be processed to obtain a target sequence; obtain a target vector according to the word vector, the position vector, and the segment vector of the target sequence; and input the target vector into the BERT model to obtain the predicted text output by the BERT model.
In one embodiment, the data prediction module 606 is further configured to add a start mark before the start character of the reading comprehension text; add a splicing mark between the reading comprehension text and the question to be processed; add an end mark after the end character of the question to be processed; and sequentially connect the start mark, the reading comprehension text, the splicing mark, the question to be processed, and the end mark with a connection function.
In one embodiment, the data prediction module 606 is further configured to acquire preset absolute position encoding vectors and a preset hyperparameter; determine a set of base vectors according to the absolute position encoding vectors and the hyperparameter; determine the absolute positions corresponding to any two of the absolute position encoding vectors; determine a target position according to the absolute positions; generate the encoding vector of the target position according to the base vectors in the set corresponding to the absolute positions and the hyperparameter; and determine the encoding vectors of all target positions as the position vector.
The modules of the above reading comprehension answer generation device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal whose internal structure is as shown in FIG. 7. The computer device comprises a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory; the nonvolatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the nonvolatile storage medium. The communication interface of the computer device is used for wired or wireless communication with external terminals; wireless communication may be realized through Wi-Fi, a mobile cellular network, NFC (near-field communication), or other technologies. The computer program, when executed by the processor, implements a reading comprehension answer generation method. The display screen of the computer device may be a liquid crystal display or an electronic-ink display, and the input device may be a touch layer covering the display screen, a key, a trackball, or a touchpad on the housing of the computer device, or an external keyboard, touchpad, or mouse.
Those skilled in the art will appreciate that the architecture shown in FIG. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following steps:
acquiring a question to be processed;
obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text;
obtaining a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as the answer to the question to be processed.
In one embodiment, the processor, when executing the computer program, further implements the following steps:
taking the predicted text as an initial character string; obtaining N next character strings based on the initial character string, and obtaining the target character string from the last next character string, wherein N is an integer greater than or equal to zero; wherein the initial character string is the previous character string of the first next character string; the next characters of a previous character string are searched for in the prefix tree, and the coverage gain of each found next character is determined according to the question to be processed and the reading comprehension text; the next character with the largest coverage gain is spliced onto the previous character string to obtain the next character string; and when the next character of a previous character string is the preset ending character, that previous character string is the last next character string and is taken as the target character string.
In one embodiment, the processor, when executing the computer program, further implements the following steps:
acquiring a first character string comprising the question to be processed and the reading comprehension text; taking the previous character string corresponding to the next character as a second character string; splicing the next character onto the second character string to obtain a third character string; determining a first longest common subsequence between the first character string and the third character string; determining a second longest common subsequence between the first character string and the second character string; and determining the difference between the first longest common subsequence and the second longest common subsequence as the coverage gain.
In one embodiment, the processor, when executing the computer program, further implements the following steps:
performing sentence segmentation on the reading comprehension text to obtain a plurality of sentences; and generating the prefix tree from the plurality of sentences.
In one embodiment, the processor, when executing the computer program, further implements the following steps:
the preset language generation model being a BERT model in UniLM mode, splicing the reading comprehension text and the question to be processed to obtain a target sequence; obtaining a target vector according to the word vector, the position vector, and the segment vector of the target sequence; and inputting the target vector into the BERT model to obtain the predicted text output by the BERT model.
In one embodiment, the processor, when executing the computer program, further implements the following steps:
adding a start mark before the start character of the reading comprehension text; adding a splicing mark between the reading comprehension text and the question to be processed; adding an end mark after the end character of the question to be processed; and sequentially connecting the start mark, the reading comprehension text, the splicing mark, the question to be processed, and the end mark with a connection function.
In one embodiment, the processor, when executing the computer program, further implements the following steps:
acquiring preset absolute position encoding vectors and a preset hyperparameter; determining a set of base vectors according to the absolute position encoding vectors and the hyperparameter; determining the absolute positions corresponding to any two of the absolute position encoding vectors; determining a target position according to the absolute positions; generating the encoding vector of the target position according to the base vectors in the set corresponding to the absolute positions and the hyperparameter; and determining the encoding vectors of all target positions as the position vector.
In one embodiment, a computer-readable storage medium is provided on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
acquiring a question to be processed;
obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text;
obtaining a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as the answer to the question to be processed.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
taking the predicted text as an initial character string; obtaining N next character strings based on the initial character string, and obtaining the target character string from the last next character string, wherein N is an integer greater than or equal to zero; wherein the initial character string is the previous character string of the first next character string; the next characters of a previous character string are searched for in the prefix tree, and the coverage gain of each found next character is determined according to the question to be processed and the reading comprehension text; the next character with the largest coverage gain is spliced onto the previous character string to obtain the next character string; and when the next character of a previous character string is the preset ending character, that previous character string is the last next character string and is taken as the target character string.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
acquiring a first character string comprising the question to be processed and the reading comprehension text; taking the previous character string corresponding to the next character as a second character string; splicing the next character onto the second character string to obtain a third character string; determining a first longest common subsequence between the first character string and the third character string; determining a second longest common subsequence between the first character string and the second character string; and determining the difference between the first longest common subsequence and the second longest common subsequence as the coverage gain.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
performing sentence segmentation on the reading comprehension text to obtain a plurality of sentences; and generating the prefix tree from the plurality of sentences.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
the preset language generation model being a BERT model in UniLM mode, splicing the reading comprehension text and the question to be processed to obtain a target sequence; obtaining a target vector according to the word vector, the position vector, and the segment vector of the target sequence; and inputting the target vector into the BERT model to obtain the predicted text output by the BERT model.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
adding a start mark before the start character of the reading comprehension text; adding a splicing mark between the reading comprehension text and the question to be processed; adding an end mark after the end character of the question to be processed; and sequentially connecting the start mark, the reading comprehension text, the splicing mark, the question to be processed, and the end mark with a connection function.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
acquiring preset absolute position encoding vectors and a preset hyperparameter; determining a set of base vectors according to the absolute position encoding vectors and the hyperparameter; determining the absolute positions corresponding to any two of the absolute position encoding vectors; determining a target position according to the absolute positions; generating the encoding vector of the target position according to the base vectors in the set corresponding to the absolute positions and the hyperparameter; and determining the encoding vectors of all target positions as the position vector.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the following steps:
acquiring a question to be processed;
obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text;
obtaining a predicted text according to the reading comprehension text, the question to be processed, and a preset language generation model;
and searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as the answer to the question to be processed.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
taking the predicted text as an initial character string; obtaining N next character strings based on the initial character string, and obtaining the target character string from the last next character string, wherein N is an integer greater than or equal to zero; wherein the initial character string is the previous character string of the first next character string; the next characters of a previous character string are searched for in the prefix tree, and the coverage gain of each found next character is determined according to the question to be processed and the reading comprehension text; the next character with the largest coverage gain is spliced onto the previous character string to obtain the next character string; and when the next character of a previous character string is the preset ending character, that previous character string is the last next character string and is taken as the target character string.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
acquiring a first character string comprising the question to be processed and the reading comprehension text; taking the previous character string corresponding to the next character as a second character string; splicing the next character onto the second character string to obtain a third character string; determining a first longest common subsequence between the first character string and the third character string; determining a second longest common subsequence between the first character string and the second character string; and determining the difference between the first longest common subsequence and the second longest common subsequence as the coverage gain.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
performing sentence segmentation on the reading comprehension text to obtain a plurality of sentences; and generating the prefix tree from the plurality of sentences.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
the preset language generation model being a BERT model in UniLM mode, splicing the reading comprehension text and the question to be processed to obtain a target sequence; obtaining a target vector according to the word vector, the position vector, and the segment vector of the target sequence; and inputting the target vector into the BERT model to obtain the predicted text output by the BERT model.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
adding a start mark before the start character of the reading comprehension text; adding a splicing mark between the reading comprehension text and the question to be processed; adding an end mark after the end character of the question to be processed; and sequentially connecting the start mark, the reading comprehension text, the splicing mark, the question to be processed, and the end mark with a connection function.
In one embodiment, the computer program, when executed by the processor, further implements the following steps:
acquiring preset absolute position encoding vectors and a preset hyperparameter; determining a set of base vectors according to the absolute position encoding vectors and the hyperparameter; determining the absolute positions corresponding to any two of the absolute position encoding vectors; determining a target position according to the absolute positions; generating the encoding vector of the target position according to the base vectors in the set corresponding to the absolute positions and the hyperparameter; and determining the encoding vectors of all target positions as the position vector.
It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, displayed data, etc.) referred to in the present application are information and data authorized by the user or fully authorized by all parties concerned.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, databases, or other media used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase-change memory (PCM), graphene memory, and the like. Volatile memory can include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases; non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like, without limitation.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, any such combination should be considered within the scope of this specification as long as it contains no contradiction.
The above examples express only several embodiments of the present application, and their description is specific and detailed, but they are not to be construed as limiting the scope of the application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A method of reading comprehension answer generation, the method comprising:
acquiring a question to be processed;
obtaining a corresponding reading comprehension text according to the question to be processed, and constructing a prefix tree based on the reading comprehension text;
obtaining a predicted text according to the reading comprehension text, the question to be processed and a preset language generation model; and
searching the prefix tree for a target character string corresponding to the predicted text, and determining the target character string as an answer corresponding to the question to be processed.
2. The method of claim 1, wherein searching the prefix tree for the target character string corresponding to the predicted text comprises:
taking the predicted text as an initial character string;
obtaining N next character strings based on the initial character string, and obtaining the target character string through the last next character string, wherein N is an integer greater than or equal to zero;
wherein the initial character string is the previous character string of the first next character string; the prefix tree is searched for the next characters of the previous character string, and the coverage gain of each found next character is determined according to the question to be processed and the reading comprehension text; the next character with the largest coverage gain is spliced onto the previous character string to obtain the next character string; and when the next character of the previous character string is a preset end character, the previous character string is the last next character string and is taken as the target character string.
3. The method of claim 2, wherein the determining of the coverage gain comprises:
acquiring a first character string comprising the question to be processed and the reading comprehension text;
taking the next character string corresponding to the next character as a second character string;
splicing the next character and the second character string to obtain a third character string;
determining a first longest common subsequence between the first string and the third string;
determining a second longest common subsequence between the first string and the second string;
determining the difference between the lengths of the first longest common subsequence and the second longest common subsequence as the coverage gain.
4. The method of claim 1, wherein the constructing a prefix tree based on the reading comprehension text comprises:
performing sentence segmentation on the reading comprehension text to obtain a plurality of sentences;
generating the prefix tree from the plurality of sentences.
5. The method according to any one of claims 1 to 4, wherein the preset language generation model is a BERT model adopting a UNILM mode, and the obtaining of the predicted text according to the reading comprehension text, the question to be processed and the preset language generation model comprises:
splicing the reading comprehension text and the question to be processed to obtain a target sequence;
obtaining a target vector according to the word vector, the position vector and the segment vector of the target sequence;
and inputting the target vector into the BERT model to obtain the predicted text output by the BERT model.
6. The method of claim 5, wherein the splicing of the reading comprehension text and the question to be processed comprises:
adding a start mark before a start character of the reading comprehension text;
adding a splicing mark between the reading comprehension text and the question to be processed;
adding an end mark after the end character of the question to be processed; and
connecting the start mark, the reading comprehension text, the splicing mark, the question to be processed and the end mark in sequence through a connecting function.
7. The method of claim 5, wherein the determining the position vector comprises:
acquiring a preset absolute position coding vector and a preset hyperparameter;
determining a set of base vectors according to the absolute position coding vector and the hyperparameter;
determining absolute positions corresponding to any two vectors in the absolute position encoding vectors;
determining a target position according to the absolute position;
generating a coding vector of the target position according to the base vector corresponding to the absolute position in the base vector set and the hyperparameter;
determining the code vectors of all target positions as the position vector.
8. An apparatus for reading comprehension answer generation, the apparatus comprising:
the data acquisition module is used for acquiring a question to be processed;
the prefix tree generation module is used for obtaining a corresponding reading comprehension text according to the question to be processed and constructing a prefix tree based on the reading comprehension text;
the data prediction module is used for obtaining a predicted text according to the reading comprehension text, the question to be processed and a preset language generation model; and
the answer determining module is used for searching the prefix tree for a target character string corresponding to the predicted text and determining the target character string as an answer corresponding to the question to be processed.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, carries out the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202210512242.3A 2022-05-12 2022-05-12 Reading comprehension answer generation method and device, computer equipment and storage medium Pending CN114880485A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210512242.3A CN114880485A (en) 2022-05-12 2022-05-12 Reading comprehension answer generation method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210512242.3A CN114880485A (en) 2022-05-12 2022-05-12 Reading comprehension answer generation method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114880485A true CN114880485A (en) 2022-08-09

Family

ID=82676142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210512242.3A Pending CN114880485A (en) 2022-05-12 2022-05-12 Reading comprehension answer generation method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114880485A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117573839A (en) * 2024-01-12 2024-02-20 阿里云计算有限公司 Document retrieval method, man-machine interaction method, electronic device and storage medium
CN117573839B (en) * 2024-01-12 2024-04-19 阿里云计算有限公司 Document retrieval method, man-machine interaction method, electronic device and storage medium

Similar Documents

Publication Publication Date Title
US20230048218A1 (en) On-Device Projection Neural Networks for Natural Language Understanding
US11816442B2 (en) Multi-turn dialogue response generation with autoregressive transformer models
CN110738026B (en) Method and device for generating description text
US20200356729A1 (en) Generation of text from structured data
JP7417679B2 (en) Information extraction methods, devices, electronic devices and storage media
JP6649536B1 (en) Dialogue processing device, learning device, dialogue processing method, learning method and program
US11694034B2 (en) Systems and methods for machine-learned prediction of semantic similarity between documents
CN111354333A (en) Chinese prosody hierarchy prediction method and system based on self-attention
CN116992008B (en) Knowledge graph multi-hop question-answer reasoning method, device and computer equipment
Thomas et al. Chatbot using gated end-to-end memory networks
Xu et al. Match-prompt: Improving multi-task generalization ability for neural text matching via prompt learning
CN114064852A (en) Method and device for extracting relation of natural language, electronic equipment and storage medium
CN115062134A (en) Knowledge question-answering model training and knowledge question-answering method, device and computer equipment
CN114880485A (en) Reading comprehension answer generation method and device, computer equipment and storage medium
CN113987162A (en) Text abstract generation method and device and computer equipment
CN109753563B (en) Tag extraction method, apparatus and computer readable storage medium based on big data
CN115952266A (en) Question generation method and device, computer equipment and storage medium
Liu et al. Named entity recognition using a semi-supervised model based on bert and bootstrapping
Jyothi et al. Abstractive text summarization on templatized data
CN113435183B (en) Text generation method, device and storage medium
Prakash et al. Alice: A natural language question answering system using dynamic attention and memory
Kamath et al. Attention and Memory Augmented Networks
CN117633147A (en) Reading and understanding method and device, terminal equipment and storage medium
CN117010334A (en) Text information generation method, device, computer equipment and storage medium
CN115688903A (en) Training method of text recognition model, text recognition method, medium, and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination