WO2021082953A1 - Machine reading comprehension method, equipment, storage medium and apparatus - Google Patents

Machine reading comprehension method, equipment, storage medium and apparatus

Info

Publication number
WO2021082953A1
WO2021082953A1 (PCT/CN2020/121518, CN2020121518W)
Authority
WO
WIPO (PCT)
Prior art keywords
paragraph
machine reading
reading comprehension
understood
sample
Prior art date
Application number
PCT/CN2020/121518
Other languages
English (en)
French (fr)
Inventor
郝正鸿
许开河
王少军
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021082953A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • This application relates to the technical field of artificial intelligence, and in particular to a machine reading comprehension method, equipment, storage medium and device.
  • Machine reading comprehension is one of the core tasks in the field of natural language processing (NLP): through algorithm design, machines must be taught to read and understand paragraph text and find the answers to questions.
  • existing machine reading comprehension data sets include multiple-choice questions, cloze questions, question-and-answer questions, and the like.
  • a machine reading comprehension method includes the following steps: obtaining a paragraph to be understood and the corresponding multiple target questions; performing multi-thread processing on them, passing sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question; passing the interactive information semantics through the screening layer of the model to obtain valuable sentence vectors strongly related to each target question;
  • the valuable sentence vectors pass through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
  • the predicted answer range is sent to the target terminal.
  • a machine reading comprehension apparatus includes:
  • an acquisition module, used to acquire the paragraph to be understood and the corresponding multiple target questions;
  • an interaction module, used to perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each of the target questions;
  • a screening module, configured to pass the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors that are strongly related to each of the target questions;
  • a prediction module, used to pass the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
  • a sending module, used to send the predicted answer range to the target terminal.
  • the machine reading comprehension equipment includes a memory, a processor, and a machine reading comprehension program stored on the memory and runnable on the processor, the machine reading comprehension program being configured to implement the following steps:
  • the valuable sentence vector passes through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
  • the predicted answer range is sent to the target terminal.
  • the valuable sentence vector passes through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
  • the predicted answer range is sent to the target terminal.
  • Figure 1 is a schematic structural diagram of a machine reading comprehension device in a hardware operating environment involved in a solution of an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a first embodiment of a machine reading comprehension method according to this application;
  • FIG. 3 is a schematic flowchart of a second embodiment of a machine reading comprehension method according to this application.
  • FIG. 4 is a schematic flowchart of a third embodiment of a machine reading comprehension method according to this application.
  • Fig. 5 is a structural block diagram of a first embodiment of a machine reading comprehension device according to the present application.
  • FIG. 1 is a schematic structural diagram of a machine reading comprehension device in a hardware operating environment involved in a solution of an embodiment of this application.
  • the machine reading comprehension device may include a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to implement connection and communication between these components.
  • the user interface 1003 may include a display screen (Display), and optionally may also include a standard wired interface and a wireless interface.
  • the wired interface of the user interface 1003 may be a USB interface in this application.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a wireless fidelity (Wireless-Fidelity, Wi-Fi) interface).
  • the memory 1005 may be a high-speed random access memory (RAM), or a stable non-volatile memory (NVM), such as a magnetic disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
  • those skilled in the art can understand that the structure shown in FIG. 1 does not constitute a limitation on the machine reading comprehension equipment, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
  • a memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a machine reading comprehension program.
  • the network interface 1004 is mainly used to connect to a back-end server to communicate data with the back-end server;
  • the user interface 1003 is mainly used to connect a user device;
  • the machine reading comprehension equipment calls, through the processor 1001, the machine reading comprehension program stored in the memory 1005, and executes the following steps:
  • the valuable sentence vector passes through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
  • the predicted answer range is sent to the target terminal.
  • FIG. 2 is a schematic flowchart of the first embodiment of the machine reading comprehension method of the present application, and the first embodiment of the machine reading comprehension method of the present application is proposed.
  • the machine reading comprehension method includes the following steps:
  • Step S10 Obtain the paragraph to be understood and the corresponding multiple target questions.
  • the execution subject of this embodiment is the machine reading comprehension device, where the machine reading comprehension device may be an electronic device such as a smart phone, a personal computer, or a server, which is not limited in this embodiment.
  • the paragraph to be understood is a paragraph that requires semantic understanding, and may be, for example, the instruction manual of a device; for the multiple target questions asked by the user, the corresponding answers are found in the instruction manual.
  • the target question is a question related to semantic understanding raised for the paragraph to be understood, and semantic analysis is performed on the paragraph to be understood through the preset reading comprehension model, so as to find an answer corresponding to the target question.
  • Step S20: Perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each of the target questions.
  • a multi-threaded processor can process multiple target questions in parallel at the same time, thereby improving processing efficiency.
  • the first layer of the preset machine reading comprehension model is an embedding layer.
  • the paragraph to be understood and the corresponding target questions are input into the preset machine reading comprehension model, and the embedding layer maps the paragraph to be understood and the corresponding target questions into vector representations.
  • the second layer of the preset machine reading comprehension model is an encoder layer, which encodes the vector representation of the paragraph to be understood and the vector representation of each target question to obtain semantic representations containing context, that is, the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question.
  • the third layer of the preset machine reading comprehension model is the interaction layer, which captures the interactive relationship between paragraph and question and outputs a semantic representation of the interactive information, similar to a human repeatedly reading the original text with the question in mind, thus obtaining the interactive information semantics between the paragraph to be understood and the target question.
  • Step S30 Pass the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors that are strongly related to each of the target questions.
  • the screening layer (a gated answer valuable sentences selection layer added after the interaction layer) is implemented in two parts: the first part is an information filtering gate (gated info filter), and the second part performs attention analysis with the target question.
  • the specific algorithm is described as follows: 1. compute the gate filtering probability g_i; 2. multiply each sentence vector representation h_i in the paragraph to be understood by the gate filtering probability, f_i = g_i ⊙ h_i; 3. perform attention interaction between f_i and the question vector h_q to obtain the filtered vector p_q; 4. use p_q to represent the valuable sentences strongly related to the target question and feed them to the answer layer for answer-range prediction.
  • Step S40 The valuable sentence vector passes through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question.
  • the answer layer of the preset machine reading comprehension model performs answer prediction based on the valuable sentences to obtain a predicted answer range. This can be used for machine reading comprehension over columns of tabular data to obtain the required content.
  • each column is a set of attributes of one type, such as name, certificate number, or address.
  • the system must identify which column is the address and which column is the ID number; for example, the address contains keywords such as province, city and county, and the ID number has its own rules. Based on these rules, the attribute of each column is identified, and the required content is recognized and uploaded to the system (a minimal sketch of such rules appears below).
  • a common application is agricultural insurance, where an entire village or town is insured and farmers' records need to be copied quickly from paper into the system. It can also be applied to an intelligent question-answering system: for example, when a user has a question about an electrical appliance's instructions while using it, the machine can read and understand the manual and predict the answer to the user's question.
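  • as an illustration only, the following minimal Python sketch shows rule-based column identification of the kind described above; the keyword markers and the 18-character ID-number pattern are assumptions for illustration, not rules taken from the patent.

```python
# Hypothetical rule-based column classifier: a column whose values contain
# province/city/county markers is treated as an address column; values that
# match an 18-character ID-number pattern (an assumed format) as an ID column.
import re

def classify_column(values):
    addr_markers = ('省', '市', '县', '区', '镇', '村')  # assumed address keywords
    id_pattern = re.compile(r'^\d{17}[\dXx]$')           # assumed ID-number rule
    if values and all(id_pattern.match(v) for v in values):
        return 'id_number'
    if any(marker in v for v in values for marker in addr_markers):
        return 'address'
    return 'other'

print(classify_column(['广东省深圳市福田区', '北京市朝阳区']))  # -> address
print(classify_column(['11010519491231002X']))                  # -> id_number
```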
  • Step S50 Send the predicted answer range to the target terminal.
  • the target terminal is a terminal device of the user, such as a smart phone or a personal computer, and the predicted answer range is viewed through the target terminal.
  • in this embodiment, the paragraph to be understood and the corresponding multiple target questions are obtained and processed with multiple threads, passing sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question; based on artificial intelligence, the interactive information semantics pass through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question; the valuable sentence vectors pass through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question. Predicting answers with the preset machine reading comprehension model improves the accuracy and efficiency of the predicted answers, and sending the predicted answer range to the target terminal improves the user experience.
  • FIG. 3 is a schematic flowchart of the second embodiment of the machine reading comprehension method of the present application. Based on the first embodiment shown in FIG. 2 above, a second embodiment of the machine reading comprehension method of the present application is proposed.
  • step S20 includes:
  • Step S201 Perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, and obtain the vector representation of the paragraph to be understood and the vector representation of each target question through the embedding layer of the preset machine reading comprehension model.
  • the first layer of the preset machine reading comprehension model is the embedding layer; the paragraph to be understood and the corresponding target questions are input into the preset machine reading comprehension model, and through the embedding layer they are mapped into vector representations.
  • the embedding layer implements the following logic: map the paragraph text and the question text to combinations of character identity numbers (ids) and combinations of position ids; concatenate the character id combinations of the paragraph text and the question text; concatenate the position id combinations of the paragraph text and the question text; map the character id combination to vector representations in the character table; map the position id combination to vector representations in the position table; add the character vector representations and the position vector representations, then apply layer normalization (LayerNormalization) and dropout to obtain the final vector representations, as sketched below.
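  • a minimal numpy sketch of this embedding-layer logic follows; the vocabulary size, maximum position, hidden dimension, and dropout rate are illustrative assumptions, not values taken from the patent.

```python
# Sketch: character-id + position-id embeddings, summed, then layer
# normalization and dropout, as described for the embedding layer.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, MAX_POS, DIM = 21128, 512, 768            # assumed sizes
char_table = rng.normal(0, 0.02, (VOCAB, DIM))   # character embedding table
pos_table = rng.normal(0, 0.02, (MAX_POS, DIM))  # position embedding table

def layer_norm(x, eps=1e-12):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def embed(char_ids, dropout_p=0.1, train=True):
    """Map the concatenated paragraph+question character ids to final vectors."""
    pos_ids = np.arange(len(char_ids))
    x = char_table[char_ids] + pos_table[pos_ids]  # add char and position vectors
    x = layer_norm(x)                              # LayerNormalization
    if train:                                      # dropout (random inactivation)
        mask = rng.random(x.shape) >= dropout_p
        x = x * mask / (1.0 - dropout_p)
    return x

vectors = embed(np.array([101, 2769, 102, 101, 872, 102]))  # toy id sequence
```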
  • Step S202: The vector representation of the paragraph to be understood and the vector representation of each target question pass through the encoding layer of the preset machine reading comprehension model to obtain the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question.
  • the second layer of the preset machine reading comprehension model is the encoder layer, which encodes the vector representation of the paragraph to be understood and the vector representation of each target question to obtain a semantic representation containing context.
  • the encoding layer may use a recurrent neural network (RNN) to encode the vector representation of the paragraph to be understood and the vector representation of each target question; the RNN encoding proceeds layer by layer along the time steps of the paragraph to be understood and the target question, and the last layer of the RNN can contain the features of the entire sentence, that is, the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to the target question.
  • Step S203 The paragraph semantics and each question semantics pass through the interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question.
  • the third layer of the preset machine reading comprehension model is the interaction layer, which captures the interactive relationship between paragraph and question and outputs a semantic representation of the interactive information, similar to a human repeatedly reading the original text with the question in mind, thus obtaining the interactive information semantics between the paragraph to be understood and the target question.
  • the second and third layers are implemented based on the BERT (Bidirectional Encoder Representations from Transformers) model, a 12-layer bidirectional self-attention model; the logic of each layer is as follows:
  • the hidden-layer vector representations output by the embedding layer serve as the self-attention query, key, and value.
  • the query and key compute the attention score; the normalized attention score is multiplied by the value vector representation to obtain a hidden-layer vector representation containing the paragraph self-attention representation, the question self-attention representation, the paragraph-question attention representation, and the question-paragraph attention representation; this hidden-layer vector representation passes through a fully connected layer and layer normalization to obtain, for every character of the paragraph and the question, a vector representation after contextual self-attention and paragraph-question interactive attention, as sketched below.
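  • a minimal numpy sketch of one such self-attention step follows; the projection matrices and dimensions are illustrative assumptions, and BERT's multi-head structure is omitted for brevity.

```python
# Sketch: attention scores from query/key, softmax normalization, weighting of
# the value vectors, then a fully connected layer and layer normalization.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_layer(H, Wq, Wk, Wv, Wo):
    """H: (seq_len, dim) hidden vectors of the joined paragraph+question."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # attention scores from query and key
    A = softmax(scores)                       # normalized attention scores
    out = (A @ V) @ Wo                        # weight values, fully connected layer
    mu = out.mean(-1, keepdims=True)
    var = out.var(-1, keepdims=True)
    return (out - mu) / np.sqrt(var + 1e-12)  # layer normalization

rng = np.random.default_rng(0)
D = 64
H = rng.normal(size=(10, D))
out = self_attention_layer(H, *(rng.normal(size=(D, D)) for _ in range(4)))
```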
  • step S30 includes:
  • through the screening layer of the preset machine reading comprehension model, the gate filtering probability is calculated from the vector representation of the paragraph to be understood using the gate filtering probability formula;
  • according to the interactive information semantics, the gate-filtered vector representation of each sentence in the paragraph to be understood and the vector representation of each target question undergo attention interaction through the preset interaction formula, yielding valuable sentence vectors strongly related to each target question.
  • the screening layer is implemented in two parts: the first part is the information filtering gate (gated info filter), and the second part performs attention analysis with the target question.
  • the specific algorithm is described as follows:
  • 1. compute the gate filtering probability; the gate filtering probability formula is g_i = σ(W_g h_i + U_g s + b_g), where:
  • g_i is the gate filtering probability of sentence i;
  • σ is the sigmoid function;
  • W_g and U_g are parameters to be learned, and h_i is the vector representation of sentence i in the paragraph to be understood;
  • s is the vector representation of the pooled resources of the paragraph to be understood;
  • b_g is the bias term.
  • 2. multiply each sentence vector representation in the paragraph to be understood by the gate filtering probability to obtain the filtered vector representation f_i, with the formula f_i = g_i ⊙ h_i, where h_i is the vector representation of sentence i in the paragraph to be understood and g_i is the gate filtering probability.
  • the dot product, also called the scalar product, gives the length of the projection of one vector in the direction of another vector, which is a scalar.
  • 3. perform attention interaction between f_i and h_q to obtain the filtered vector p_q; the preset interaction formula is φ(f_i, h_q) = v^T tanh(W_f f_i + W_h h_q + b), where:
  • b is the bias term;
  • φ(f_i, h_q) is the attention score of f_i and h_q;
  • f_i is the gate-filtered vector representation of each sentence in the paragraph to be understood;
  • h_q is the vector representation of the question;
  • v is a parameter to be learned;
  • T denotes matrix transposition;
  • W_f is a parameter to be learned;
  • W_h is a parameter to be learned;
  • the attention scores are normalized with the softmax function to obtain the attention weight of each sentence in the paragraph;
  • p_q is the attention-weighted sum of the vector representations of all sentences in the paragraph; 4. p_q represents the valuable sentences strongly related to the target question and is fed to the answer layer for answer-range prediction. The sketch below illustrates the whole screening computation.
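  • the following minimal numpy sketch puts the screening layer together: gate filtering, gating of the sentence vectors, and attention pooling against the question vector. It assumes a mean-pooled paragraph vector for s and element-wise gating; shapes and initialization are illustrative, not the patent's.

```python
# Sketch of the screening layer: g_i = sigmoid(W_g·h_i + U_g·s + b_g),
# f_i = g_i ⊙ h_i, attention scores φ(f_i, h_q), softmax weights, and the
# weighted sum p_q over all sentences of the paragraph.
import numpy as np

rng = np.random.default_rng(0)
D, n_sent = 64, 5
H = rng.normal(size=(n_sent, D))      # h_i: sentence vectors of the paragraph
h_q = rng.normal(size=D)              # question vector representation
s = H.mean(axis=0)                    # pooled paragraph vector (assumption)

W_g, U_g = rng.normal(size=(D, D)), rng.normal(size=(D, D))
W_f, W_h = rng.normal(size=(D, D)), rng.normal(size=(D, D))
v, b_g, b = rng.normal(size=D), np.zeros(D), np.zeros(D)

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

g = sigmoid(H @ W_g.T + s @ U_g.T + b_g)  # gate filtering probabilities g_i
F = g * H                                  # f_i = g_i ⊙ h_i (element-wise gate)

phi = np.tanh(F @ W_f.T + h_q @ W_h.T + b) @ v         # φ(f_i, h_q) scores
alpha = np.exp(phi - phi.max()); alpha /= alpha.sum()  # softmax attention weights
p_q = alpha @ F                 # p_q: weighted sum, the valuable sentence vector
```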
  • in this embodiment, the paragraph to be understood and the corresponding multiple target questions undergo attention interaction: the vector representation of each sentence in the paragraph to be understood is multiplied by the gate filtering probability to obtain the gate-filtered vector representation of each sentence in the paragraph to be understood;
  • according to the interactive information semantics, the gate-filtered vector representation of each sentence in the paragraph to be understood and the vector representation of each target question undergo attention interaction through the preset interaction formula, yielding valuable sentence vectors strongly related to each target question and improving the accuracy of the predicted answers.
  • FIG. 4 is a schematic flow chart of the third embodiment of the machine reading comprehension method according to the present application. Based on the above-mentioned first embodiment or the second embodiment, a third embodiment of the machine reading comprehension method according to the present application is proposed. In this embodiment, the description is based on the first embodiment.
  • before the step S10, the method further includes:
  • Step S01 Obtain open data from a preset database, perform data extraction on the open data, and obtain sample paragraphs.
  • the preset database may be a wiki database, and open wiki data, that is, the open data, is downloaded from the wiki database.
  • the open wiki data can be extracted through WikiCorpus, the wiki-data extraction and processing class in Gensim.
  • Gensim is a topic-modeling Python toolkit that provides the WikiCorpus extraction and processing class for wiki data. Since the open wiki data contains traditional characters and non-standard characters, it is necessary to convert traditional characters to simplified characters and to convert the character encoding; at the same time, for follow-up work, the corpus needs to be segmented into words. For converting Traditional Chinese to Simplified Chinese, the open-source conversion tool OpenCC can be used to convert the traditional characters in the open wiki data into simplified characters.
  • for character encoding conversion, the iconv command can be used to convert the file to UTF-8 encoding.
  • the default character-set encoding in the Linux shell configuration file is UTF-8.
  • UTF-8 is one expression of Unicode; gb2312 and Unicode are both character encoding schemes.
  • when converting encodings on Linux, the iconv command can be used; it operates on files, that is, it converts a specified file from one encoding to another, yielding the sample paragraphs. A sketch of this preparation step follows.
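  • a minimal sketch of this corpus preparation follows, assuming the gensim and opencc Python packages are installed; the file names are placeholders.

```python
# Sketch: extract plain text from an open wiki dump with Gensim's WikiCorpus,
# convert Traditional to Simplified Chinese with OpenCC, write out as UTF-8.
from gensim.corpora import WikiCorpus
from opencc import OpenCC

converter = OpenCC('t2s')  # traditional -> simplified conversion

wiki = WikiCorpus('zhwiki-latest-pages-articles.xml.bz2', dictionary={})
with open('wiki_simplified.txt', 'w', encoding='utf-8') as out:  # UTF-8 output
    for tokens in wiki.get_texts():     # one article at a time, already tokenized
        out.write(converter.convert(' '.join(tokens)) + '\n')
```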
  • Step S02 Perform keyword extraction on the sample paragraphs to obtain keywords corresponding to the sample paragraphs.
  • the sample paragraph is segmented into words (for example with the jieba word segmentation toolkit), and the term frequency-inverse document frequency (TF-IDF) value of every word is computed; a word that is frequent in the document but rare in general has a higher TF-IDF value. The words are sorted by TF-IDF value in descending order, and a preset number of top-ranked words are taken as the keywords corresponding to the sample paragraph; the preset number can be set empirically. A sketch follows.
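  • a minimal sketch of this keyword extraction follows; it uses jieba's built-in TF-IDF extractor, which is an assumption consistent with the description rather than the patent's exact procedure.

```python
# Sketch: segment the sample paragraph and keep the top-N words by TF-IDF.
import jieba.analyse

sample_paragraph = "机器阅读理解是自然语言处理领域的核心任务之一，需要教会机器阅读段落并找到问题答案。"
keywords = jieba.analyse.extract_tags(sample_paragraph, topK=5)  # preset number N=5
print(keywords)
```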
  • Step S03 Generate sample answers according to the keywords.
  • sentences containing the keywords are searched for in the sample paragraph, and a sentence containing a larger number of keywords is used as the sample answer. The sample answer can also be generated from the document by learning the keywords in the document, including key knowledge points, named entities or semantic concepts in the article that can serve as answers to common questions; since the answer is a fragment of the document, this is treated as a sequence labeling task.
  • an answer synthesis module handles the sequence labeling problem: an IOB tagger (with 4 kinds of tags: start, mid, end, none) is trained to predict whether each word in the paragraph is part of an answer.
  • Step S04 Generate a sample question according to the sample paragraph and the sample answer.
  • as a generation task, the question generation (QG) model can be an encoder-decoder + attention model: the input is the answer sentence, i.e. the sample answer, which is encoded with a bidirectional gated recurrent unit (BiGRU), and the last hidden states of the two directions are concatenated as the output of the encoder and the initial state of the decoder.
  • the attention layer is improved so that the question generation model can remember which content in the sample answer has already been used and does not reuse it when generating question keywords, thereby generating the sample question.
  • step S04 includes:
  • the sample paragraph and the sample answer are represented in vector form; the paragraph word vectors are concatenated with a preset two-dimensional feature indicating whether a document word appears in the answer; the answer word vectors are then encoded by finding the position vector corresponding to the position of each answer word in the sample paragraph and concatenating the position vector with the answer word vector.
  • the encoder in the encoder-decoder attention model uses a BiGRU to encode the input paragraph word vectors and the input answer word vectors, obtaining annotated paragraph word vectors and annotated answer word vectors; the decoder then decodes from its initial state to obtain the sample question.
  • in order to generate some phrases and entities from the document directly in the question sentence, a pointer-softmax is used when decoding, that is, two output layers to choose between at the end: a shortlist softmax and a location softmax.
  • the shortlist softmax is the traditional softmax, which generates over a predefined output vocabulary and corresponds to the generate-mode in the copy network CopyNet; the location softmax gives the position of a word at the input end and corresponds to the copy-mode in CopyNet; the two softmax outputs are weighted and concatenated to obtain the sample question, as sketched below.
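  • a minimal numpy sketch of this pointer-softmax output follows; the switch mechanism and all shapes are illustrative assumptions.

```python
# Sketch: a shortlist softmax over a predefined output vocabulary
# (generate-mode) and a location softmax over input positions (copy-mode),
# weighted by a learned switch and concatenated into one output distribution.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_softmax(dec_state, W_vocab, W_loc, w_switch):
    shortlist = softmax(dec_state @ W_vocab)  # P(word from output vocabulary)
    location = softmax(dec_state @ W_loc)     # P(copy this input position)
    s = 1.0 / (1.0 + np.exp(-(dec_state @ w_switch)))  # generate-vs-copy weight
    return np.concatenate([s * shortlist, (1.0 - s) * location])

rng = np.random.default_rng(0)
D, V, L = 32, 100, 20  # hidden size, vocabulary size, input length (assumed)
dist = pointer_softmax(rng.normal(size=D), rng.normal(size=(D, V)),
                       rng.normal(size=(D, L)), rng.normal(size=D))
assert abs(dist.sum() - 1.0) < 1e-9  # the combined distribution sums to 1
```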
  • Step S05 Establish a basic machine reading comprehension model.
  • the basic machine reading comprehension model may be a Match-LSTM (Match Long Short-Term Memory) model.
  • the Match-LSTM model includes an Embedding layer, an LSTM layer, and a Match-LSTM layer: the Embedding layer embeds the words of the paragraphs and questions; the LSTM layer feeds the paragraphs and questions into a BiLSTM layer and obtains all hidden states, so that both paragraphs and questions carry contextual information.
  • the main function of the Match-LSTM layer is to obtain the interaction information between paragraphs and questions; a structural sketch follows.
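  • a structural sketch of these three layers follows, written with PyTorch as an assumed framework; layer sizes are illustrative, and the attention-based mixing step stands in for the full Match-LSTM recurrence.

```python
# Sketch: Embedding -> shared BiLSTM encoder -> an attention step that aligns
# question states to each paragraph token, feeding a Match-LSTM-style layer.
import torch
import torch.nn as nn

class MatchLSTMSketch(nn.Module):
    def __init__(self, vocab=30000, dim=128, hid=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)                   # Embedding layer
        self.enc = nn.LSTM(dim, hid, bidirectional=True, batch_first=True)
        self.match = nn.LSTM(4 * hid, hid, batch_first=True)  # Match-LSTM layer

    def forward(self, para_ids, ques_ids):
        Hp, _ = self.enc(self.emb(para_ids))   # contextual paragraph states
        Hq, _ = self.enc(self.emb(ques_ids))   # contextual question states
        att = torch.softmax(Hp @ Hq.transpose(1, 2), dim=-1)  # paragraph-question attention
        Hq_aligned = att @ Hq                  # question summary per paragraph token
        out, _ = self.match(torch.cat([Hp, Hq_aligned], dim=-1))
        return out                             # interaction-aware paragraph states

model = MatchLSTMSketch()
states = model(torch.randint(0, 30000, (1, 50)), torch.randint(0, 30000, (1, 10)))
```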
  • Step S06 Train the basic machine reading comprehension model according to the sample paragraph, the sample answer, and the sample question to obtain a preset machine reading comprehension model.
  • the sample paragraphs, the sample answers and the sample questions are used as training sample data to train the Match-LSTM model, obtaining the preset machine reading comprehension model; the preset machine reading comprehension model obtained by training can predict answers according to the paragraph to be understood and the corresponding target questions.
  • after the step S50, the method further includes: obtaining multiple candidate sentence options, calculating the similarity between each candidate option and the predicted answer range, selecting the option with the highest similarity as the target option, and sending the target option to the target terminal.
  • the machine reading comprehension method of this solution can also be applied to an intelligent question-answering system; for example, when a user has a question about an electrical appliance's instructions while using the appliance, the manual can be read and understood by machine to predict the answer corresponding to the user's question.
  • an embodiment of the present application also proposes a storage medium, which may be volatile or non-volatile; a machine reading comprehension program is stored on the storage medium, and when the machine reading comprehension program is executed by a processor,
  • the following steps are implemented:
  • the valuable sentence vector passes through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
  • the predicted answer range is sent to the target terminal.
  • an embodiment of the present application also proposes a machine reading comprehension device, and the machine reading comprehension device includes:
  • the obtaining module 10 is used to obtain the paragraph to be understood and the corresponding multiple target questions.
  • the paragraph to be understood is a paragraph that requires semantic understanding, and may be, for example, the instruction manual of a device; for the multiple target questions asked by the user, the corresponding answers are found in the instruction manual.
  • the target question is a question related to semantic understanding raised for the paragraph to be understood, and semantic analysis is performed on the paragraph to be understood through the preset reading comprehension model, so as to find an answer corresponding to the target question.
  • the interaction module 20 is configured to perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each of the target questions.
  • a multi-threaded processor can process multiple target questions in parallel at the same time, thereby improving processing efficiency.
  • the first layer of the preset machine reading comprehension model is an embedding layer.
  • the paragraph to be understood and the corresponding target questions are input into the preset machine reading comprehension model, and the embedding layer maps them into vector representations.
  • the second layer of the preset machine reading comprehension model is an encoder layer, which encodes the vector representation of the paragraph to be understood and the vector representation of each target question to obtain semantic representations containing context, that is, the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question.
  • the third layer of the preset machine reading comprehension model is the interaction layer, which captures the interactive relationship between paragraph and question and outputs a semantic representation of the interactive information, similar to a human repeatedly reading the original text with the question in mind, thus obtaining the interactive information semantics between the paragraph to be understood and the target question.
  • the screening module 30 is configured to pass the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors that are strongly related to each of the target questions.
  • the screening layer is implemented in two parts: the first part is the information filtering gate (gated info filter), and the second part performs attention analysis with the target question.
  • the specific algorithm is described as follows: 1. compute the gate filtering probability g_i; 2. multiply each sentence vector representation h_i in the paragraph to be understood by the gate filtering probability, f_i = g_i ⊙ h_i; 3. perform attention interaction between f_i and the question vector h_q to obtain the filtered vector p_q; 4. use p_q to represent the valuable sentences strongly related to the target question and feed them to the answer layer for answer-range prediction.
  • the prediction module 40 is configured to pass the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question.
  • the answer layer of the preset machine reading comprehension model performs answer prediction based on the valuable sentences to obtain a predicted answer range; this can be used for machine reading comprehension over columns of tabular data to obtain the required content.
  • each column is a set of attributes of one type, such as name, certificate number, or address.
  • the system must identify which column is the address and which column is the ID number; for example, the address contains keywords such as province, city and county, and the ID number has its own rules. Based on these rules, the attribute of each column is identified, and the required content is recognized and uploaded to the system.
  • a common application is agricultural insurance, where an entire village or town is insured and farmers' records need to be copied quickly from paper into the system. It can also be applied to an intelligent question-answering system: for example, when a user has a question about an electrical appliance's instructions while using it, the machine can read and understand the manual and predict the answer to the user's question.
  • the sending module 50 is configured to send the predicted answer range to the target terminal.
  • the target terminal is a terminal device of the user, such as a smart phone or a personal computer, and the predicted answer range is viewed through the target terminal.
  • in this embodiment, the paragraph to be understood and the corresponding multiple target questions are obtained and processed with multiple threads, passing sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question; based on artificial intelligence, the interactive information semantics pass through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question; the valuable sentence vectors pass through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question. Predicting answers with the preset machine reading comprehension model improves the accuracy and efficiency of the predicted answers, and sending the predicted answer range to the target terminal improves the user experience.
  • the interaction module 20 is further configured to perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, obtaining the vector representation of the paragraph to be understood and the vector representation of each target question through the embedding layer of the preset machine reading comprehension model; the vector representation of the paragraph to be understood and the vector representation of each target question pass through the encoding layer of the preset machine reading comprehension model to obtain the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question; the paragraph semantics and each question semantics pass through the interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question.
  • the screening module 30 is further configured to: through the screening layer of the preset machine reading comprehension model, calculate the gate filtering probability from the vector representation of the paragraph to be understood using the gate filtering probability formula; multiply the vector representation of each sentence in the paragraph to be understood by the gate filtering probability to obtain the gate-filtered vector representation of each sentence; and, according to the interactive information semantics, perform attention interaction between the gate-filtered vector representation of each sentence and the vector representation of each target question through the preset interaction formula to obtain valuable sentence vectors strongly related to each target question.
  • the gate filtering probability formula is: g_i = σ(W_g h_i + U_g s + b_g), where g_i is the gate filtering probability of sentence i, σ is the sigmoid function, W_g and U_g are parameters to be learned, h_i is the vector representation of sentence i in the paragraph to be understood, s is the vector representation of the pooled resources of the paragraph to be understood, and b_g is the bias term.
  • the machine reading comprehension device further includes:
  • the data extraction module is used to obtain open data from a preset database, perform data extraction on the open data, and obtain sample paragraphs;
  • the keyword extraction module is used to extract keywords from the sample paragraphs to obtain keywords corresponding to the sample paragraphs;
  • a generating module, used to generate sample answers according to the keywords;
  • the generating module is further configured to generate a sample question according to the sample paragraph and the sample answer;
  • an establishing module, used to establish a basic machine reading comprehension model;
  • a training module, configured to train the basic machine reading comprehension model according to the sample paragraph, the sample answer, and the sample question to obtain the preset machine reading comprehension model.
  • the generating module is further configured to: represent the sample paragraph and the sample answer in vector form, obtaining the paragraph word vector corresponding to the sample paragraph and the answer word vector corresponding to the sample answer; concatenate the paragraph word vector with a preset two-dimensional feature to obtain an input paragraph word vector, the preset two-dimensional feature indicating whether a paragraph word appears in the sample answer; concatenate the answer word vector with a position vector to obtain an input answer word vector, the position vector representing the position of the sample answer in the sample paragraph; encode the input paragraph word vector and the input answer word vector with the encoder in the encoder-decoder attention model to obtain an annotated paragraph word vector and an annotated answer word vector; calculate the initial state of the decoder in the encoder-decoder attention model according to the annotated paragraph word vector and the annotated answer word vector; and, according to the initial state of the decoder, the annotated paragraph word vector and the annotated answer word vector, decode with the decoder to obtain the sample question.
  • the machine reading comprehension device further includes:
  • the calculation module is used to obtain a plurality of sentence options to be selected, calculate the similarity between each sentence option to be selected and the predicted answer range, and select the sentence option with the highest similarity as the target option;
  • the sending module 50 is further configured to send the target option to the target terminal.
  • to further ensure the privacy and security of all the data mentioned above, the machine reading comprehension method provided in this application may also store all of the above data in nodes of a blockchain; for example, the target questions, sentence vectors and so on can all be stored in blockchain nodes.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • the technical solution of this application can be embodied in the form of a software product stored in a storage medium (such as a read-only memory image (ROM)/random access memory (RAM), magnetic disk, or optical disc), including several instructions to enable a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to perform the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Machine Translation (AREA)

Abstract

This application discloses a machine reading comprehension method, equipment, storage medium and apparatus. The method obtains a paragraph to be understood and the corresponding multiple target questions, performs multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, and passes them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question. Based on artificial intelligence, the interactive information semantics pass through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question; the valuable sentence vectors pass through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question. Predicting answers with the preset machine reading comprehension model improves the accuracy and efficiency of the predicted answers, and sending the predicted answer range to the target terminal improves the user experience.

Description

Machine reading comprehension method, equipment, storage medium and apparatus
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on October 29, 2019, with application number CN201911058199.2 and invention title "Machine reading comprehension method, equipment, storage medium and apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of artificial intelligence, and in particular to a machine reading comprehension method, equipment, storage medium and apparatus.
Background
Machine reading comprehension is one of the core tasks in the field of natural language processing (NLP); through algorithm design, machines must be taught to read and comprehend paragraph text and find the answers to questions. Existing machine reading comprehension data sets include multiple-choice questions, cloze questions, question-and-answer questions, and the like.
According to human reading comprehension behavior, after reading through a paragraph one filters out the words that are related to the question and valuable for answering it, and then further understands the question to determine the answer span. For question-answering reading comprehension tasks, however, the inventors realized that most existing reading comprehension models look for the answer over the entire paragraph, so the answers found have low accuracy and the process is inefficient.
The above content is only used to assist in understanding the technical solution of this application and does not constitute an admission that it is prior art.
Summary of the Invention
A machine reading comprehension method, comprising the following steps:
obtaining a paragraph to be understood and the corresponding multiple target questions;
performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question;
passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question;
passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
sending the predicted answer range to a target terminal.
A machine reading comprehension apparatus, comprising:
an acquisition module for obtaining a paragraph to be understood and the corresponding multiple target questions;
an interaction module for performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question;
a screening module for passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question;
a prediction module for passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
a sending module for sending the predicted answer range to a target terminal.
A machine reading comprehension equipment, comprising a memory, a processor, and a machine reading comprehension program stored on the memory and runnable on the processor, the machine reading comprehension program being configured to implement the following steps:
obtaining a paragraph to be understood and the corresponding multiple target questions;
performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question;
passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question;
passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
sending the predicted answer range to a target terminal.
A storage medium, on which a machine reading comprehension program is stored; when the machine reading comprehension program is executed by a processor, the following steps are implemented:
obtaining a paragraph to be understood and the corresponding multiple target questions;
performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question;
passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question;
passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
sending the predicted answer range to a target terminal.
Brief Description of the Drawings
Fig. 1 is a schematic structural diagram of the machine reading comprehension equipment in the hardware operating environment involved in the solutions of the embodiments of this application;
Fig. 2 is a schematic flowchart of the first embodiment of the machine reading comprehension method of this application;
Fig. 3 is a schematic flowchart of the second embodiment of the machine reading comprehension method of this application;
Fig. 4 is a schematic flowchart of the third embodiment of the machine reading comprehension method of this application;
Fig. 5 is a structural block diagram of the first embodiment of the machine reading comprehension apparatus of this application.
The realization of the purpose, functional characteristics and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description of the Embodiments
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of the machine reading comprehension equipment in the hardware operating environment involved in the solutions of the embodiments of this application.
As shown in Fig. 1, the machine reading comprehension equipment may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display screen (Display), and optionally may also include a standard wired interface and a wireless interface; in this application, the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM), or a stable non-volatile memory (NVM), such as a magnetic disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art can understand that the structure shown in Fig. 1 does not constitute a limitation on the machine reading comprehension equipment, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
As shown in Fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a machine reading comprehension program.
In the machine reading comprehension equipment shown in Fig. 1, the network interface 1004 is mainly used to connect to a back-end server and communicate data with it; the user interface 1003 is mainly used to connect user equipment; the machine reading comprehension equipment calls, through the processor 1001, the machine reading comprehension program stored in the memory 1005, and executes the following steps:
obtaining a paragraph to be understood and the corresponding multiple target questions;
performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question;
passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question;
passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
sending the predicted answer range to a target terminal.
For other embodiments or specific implementations of the machine reading comprehension equipment of this application, refer to the following embodiments of the machine reading comprehension method.
Based on the above hardware structure, embodiments of the machine reading comprehension method of this application are proposed.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the first embodiment of the machine reading comprehension method of this application; the first embodiment of the machine reading comprehension method of this application is proposed.
In the first embodiment, the machine reading comprehension method includes the following steps:
Step S10: Obtain a paragraph to be understood and the corresponding multiple target questions.
It should be understood that the execution subject of this embodiment is the machine reading comprehension equipment, which may be an electronic device such as a smartphone, personal computer or server; this embodiment places no limitation on it. The paragraph to be understood is a paragraph that requires semantic understanding and may be, for example, the instruction manual of a device; for the multiple target questions asked by the user, the corresponding answers are found in the instruction manual. A target question is a question related to semantic understanding raised with respect to the paragraph to be understood; semantic analysis is performed on the paragraph to be understood through the preset reading comprehension model so as to find the answer corresponding to the target question.
Step S20: Perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question.
It can be understood that a multi-threaded processor can process multiple target questions in parallel at the same time, thereby improving processing efficiency. The first layer of the preset machine reading comprehension model is an embedding layer; the paragraph to be understood and the corresponding target questions are input into the preset machine reading comprehension model, and through the embedding layer they are mapped into vector representations.
In a specific implementation, the second layer of the preset machine reading comprehension model is an encoder layer, which encodes the vector representation of the paragraph to be understood and the vector representation of each target question to obtain semantic representations containing context, namely the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question.
It should be understood that the third layer of the preset machine reading comprehension model is an interaction layer, which captures the interactive relationship between paragraph and question and outputs a semantic representation of the interactive information, similar to a human repeatedly reading the original text with the question in mind, thus obtaining the interactive information semantics between the paragraph to be understood and each target question.
Step S30: Pass the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question.
It should be noted that after the interaction layer, a gated answer valuable sentences selection layer, i.e. the screening layer, is added. The screening layer is implemented in two parts: the first part is an information filtering gate (gated info filter), and the second part performs attention analysis with the target question. The specific algorithm is described as follows:
1. Compute the gate filtering probability g_i. 2. Multiply each sentence vector representation in the paragraph to be understood by the gate filtering probability to obtain the filtered vector representation f_i, with the formula f_i = g_i ⊙ h_i, where h_i is the vector representation of sentence i in the paragraph to be understood and g_i is the gate filtering probability. 3. Perform attention interaction between f_i and h_q to obtain the filtered vector p_q, where h_q is the vector representation of the question. 4. Use p_q to represent the valuable sentences strongly related to the target question, and use these valuable sentences as the input of the answer layer of the preset machine reading comprehension model to predict the answer range.
Step S40: Pass the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question.
It should be understood that the answer layer of the preset machine reading comprehension model performs answer prediction based on the valuable sentences to obtain a predicted answer range. This can be used to perform machine reading comprehension on columns of data to obtain the required content. Each column is a set of attributes of one type, such as name, certificate number, or address; the system must identify which column is the address and which is the ID number. For example, the address contains keywords such as province, city and county, and the ID number has its own rules; according to these rules, the attribute of each column is identified, and the required content is recognized and uploaded to the system. A common application is agricultural insurance, where an entire village or town is insured and farmers' records need to be copied quickly from paper into the system. It can also be applied to an intelligent question-answering system: for example, when a user has a question about an electrical appliance's instructions while using it, the manual can be read and understood by machine to predict the answer corresponding to the user's question.
Step S50: Send the predicted answer range to a target terminal.
It can be understood that the target terminal is the user's terminal device, such as a smartphone or personal computer, through which the predicted answer range is viewed.
In this embodiment, a paragraph to be understood and the corresponding multiple target questions are obtained and processed with multiple threads, passing sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question; based on artificial intelligence, the interactive information semantics pass through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question; the valuable sentence vectors pass through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question. Predicting answers with the preset machine reading comprehension model improves the accuracy and efficiency of the predicted answers, and sending the predicted answer range to the target terminal improves the user experience.
Referring to Fig. 3, Fig. 3 is a schematic flowchart of the second embodiment of the machine reading comprehension method of this application; based on the first embodiment shown in Fig. 2 above, the second embodiment of the machine reading comprehension method of this application is proposed.
In the second embodiment, the step S20 includes:
Step S201: Perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, and obtain the vector representation of the paragraph to be understood and the vector representation of each target question through the embedding layer of the preset machine reading comprehension model.
It should be understood that the first layer of the preset machine reading comprehension model is an embedding layer; the paragraph to be understood and the corresponding target questions are input into the preset machine reading comprehension model, and through the embedding layer they are mapped into vector representations.
It can be understood that the embedding layer implements the following logic: map the paragraph text and the question text to combinations of character identity numbers (ids) and combinations of position ids, respectively; concatenate the character id combinations of the paragraph text and the question text; concatenate the position id combinations of the paragraph text and the question text; map the character id combination to vector representations in the character table; map the position id combination to vector representations in the position table; add the character vector representations and the position vector representations, then apply layer normalization (LayerNormalization) and dropout to obtain the final vector representations.
Step S202: The vector representation of the paragraph to be understood and the vector representation of each target question pass through the encoding layer of the preset machine reading comprehension model to obtain the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question.
It should be noted that the second layer of the preset machine reading comprehension model is an encoder layer, which encodes the vector representation of the paragraph to be understood and the vector representation of each target question to obtain semantic representations containing context. The encoding layer may use a recurrent neural network (RNN) to encode the vector representation of the paragraph to be understood and the vector representation of each target question; the RNN encoding proceeds layer by layer along the time steps of the paragraph to be understood and the target question, and the last layer of the RNN can contain the features of the entire sentence, namely the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to the target question.
Step S203: The paragraph semantics and each question semantics pass through the interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question.
In a specific implementation, the third layer of the preset machine reading comprehension model is an interaction layer, which captures the interactive relationship between paragraph and question and outputs a semantic representation of the interactive information, similar to a human repeatedly reading the original text with the question in mind, thus obtaining the interactive information semantics between the paragraph to be understood and each target question. The second and third layers are implemented based on the BERT (Bidirectional Encoder Representations from Transformers) model, a 12-layer bidirectional self-attention model; the logic of each layer is as follows:
The hidden-layer vector representations output by the embedding layer serve as the self-attention query, key and value; the query and key compute the attention score, and the normalized attention score is multiplied by the value vector representation to obtain a hidden-layer vector representation containing the paragraph self-attention representation, the question self-attention representation, the paragraph-question attention representation and the question-paragraph attention representation; this hidden-layer vector representation passes through a fully connected layer and layer normalization (Layer Normalization) to obtain, for every character of the paragraph and the question, a vector representation after contextual self-attention and paragraph-question interactive attention.
Further, the step S30 includes:
through the screening layer of the preset machine reading comprehension model, calculating the gate filtering probability from the vector representation of the paragraph to be understood using the gate filtering probability formula;
multiplying the vector representation of each sentence in the paragraph to be understood by the gate filtering probability to obtain the gate-filtered vector representation of each sentence in the paragraph to be understood;
according to the interactive information semantics, performing attention interaction between the gate-filtered vector representation of each sentence in the paragraph to be understood and the vector representation of each target question through the preset interaction formula to obtain valuable sentence vectors strongly related to each target question.
It should be understood that after the interaction layer, a gated answer valuable sentences selection layer, i.e. the screening layer, is added. The screening layer is implemented in two parts: the first part is an information filtering gate (gated info filter), and the second part performs attention analysis with the target question. The specific algorithm is described as follows:
1. Compute the gate filtering probability; the gate filtering probability formula is:
g_i = σ(W_g h_i + U_g s + b_g)
where g_i is the gate filtering probability of sentence i, σ is the sigmoid function, W_g and U_g are parameters to be learned, h_i is the vector representation of sentence i in the paragraph to be understood, s is the vector representation of the pooled resources of the paragraph to be understood, and b_g is the bias term.
2. Multiply each sentence vector representation in the paragraph to be understood by the gate filtering probability to obtain the filtered vector representation f_i, with the formula:
f_i = g_i ⊙ h_i, where h_i is the vector representation of sentence i in the paragraph to be understood and g_i is the gate filtering probability. The dot product, also called the scalar product, gives the length of the projection of one vector in the direction of another vector, which is a scalar.
3. Perform attention interaction between f_i and h_q to obtain the filtered vector p_q, where h_q is the vector representation of the question. The preset interaction formulas are:
φ(f_i, h_q) = v^T tanh(W_f f_i + W_h h_q + b);
α_i = exp(φ(f_i, h_q)) / Σ_j exp(φ(f_j, h_q));
p_q = Σ_i α_i f_i;
where b is the bias term, φ(f_i, h_q) is the attention score of f_i and h_q, f_i is the gate-filtered vector representation of each sentence in the paragraph to be understood, h_q is the vector representation of the question, v is a parameter to be learned, T denotes matrix transposition, and W_f and W_h are parameters to be learned; the normalization function is the softmax function, which yields the attention weight α_i of each sentence in the paragraph, and p_q is the attention-weighted sum of the vector representations of all sentences in the paragraph.
4. Use p_q to represent the valuable sentences strongly related to the target question, and use these valuable sentences as the input of the answer layer of the preset machine reading comprehension model to predict the answer range.
In this embodiment, the paragraph to be understood and the corresponding multiple target questions undergo attention interaction: the vector representation of each sentence in the paragraph to be understood is multiplied by the gate filtering probability to obtain the gate-filtered vector representation of each sentence; according to the interactive information semantics, the gate-filtered vector representation of each sentence in the paragraph to be understood and the vector representation of each target question undergo attention interaction through the preset interaction formula, yielding valuable sentence vectors strongly related to each target question and improving the accuracy of the predicted answers.
Referring to Fig. 4, Fig. 4 is a schematic flowchart of the third embodiment of the machine reading comprehension method of this application; based on the first or second embodiment above, the third embodiment of the machine reading comprehension method of this application is proposed. In this embodiment, the description is based on the first embodiment.
In the third embodiment, before the step S10, the method further includes:
Step S01: Obtain open data from a preset database, perform data extraction on the open data, and obtain sample paragraphs.
It should be understood that the preset database may be a wiki database, from which open wiki data, i.e. the open data, is downloaded. The open wiki data can be extracted through WikiCorpus, the wiki-data extraction and processing class in Gensim; Gensim is a topic-modeling Python toolkit that provides the WikiCorpus extraction and processing class for wiki data. Since the open wiki data contains traditional characters and non-standard characters, it is necessary to convert traditional characters to simplified characters and to convert the character encoding; at the same time, for follow-up work, the corpus needs to be segmented into words. For converting Traditional Chinese to Simplified Chinese, the open-source conversion tool OpenCC can be used to convert the traditional characters in the open wiki data into simplified characters. For character encoding conversion, the iconv command can be used to convert a file to UTF-8 encoding; the default character-set encoding in the Linux shell configuration file is UTF-8; UTF-8 is one expression of Unicode, and gb2312 and Unicode are both character encoding schemes. When converting encodings on Linux, the iconv command can be used; it operates on files, converting a specified file from one encoding to another, yielding the sample paragraphs.
Step S02: Perform keyword extraction on the sample paragraphs to obtain the keywords corresponding to the sample paragraphs.
It can be understood that the sample paragraphs are segmented with the jieba word segmentation toolkit (command-line segmentation) to obtain all the words of the sample paragraph, and the term frequency-inverse document frequency (TF-IDF) value of every word is calculated; a word that appears frequently in the document but has low commonness in general has a higher TF-IDF value. The words can be sorted by TF-IDF value in descending order, and a preset number of top-ranked words are taken as the keywords corresponding to the sample paragraph; the preset number can be set according to empirical values.
Step S03: Generate sample answers according to the keywords.
It should be noted that sentences containing the keywords are searched for in the sample paragraph, and a sentence containing a larger number of keywords is used as the sample answer. The sample answer can also be generated from the document by learning the keywords in the document, including key knowledge points, named entities or semantic concepts in the article that can serve as answers to common questions; since the answer is a fragment of the document, this is treated as a sequence labeling task. The answer synthesis module treats it as a sequence labeling problem and trains an IOB tagger (with 4 kinds of tags: start, mid, end, none) to predict whether each word in the paragraph is an answer. A bidirectional long short-term memory (BiLSTM) layer encodes the word vectors of the keywords, then two fully connected (FC) layers and a softmax produce the tag likelihoods of each word; contiguous spans are selected as candidate answer chunks and input into the question generation module to generate the sample answer.
Step S04: Generate a sample question according to the sample paragraph and the sample answer.
In a specific implementation, based on the sample paragraph and the sample answer, a complete natural-language question is generated. As a generation task, the question generation (QG) model can be an encoder-decoder + attention model: the input is the answer sentence, i.e. the sample answer, which is encoded with a bidirectional gated recurrent unit (BiGRU), and the last hidden states of the two directions are concatenated as the output of the encoder and the initial state of the decoder. The attention layer is improved so that the question generation model can remember which content in the sample answer has already been used and does not reuse it when generating question keywords, thereby generating the sample question.
Further, the step S04 includes:
representing the sample paragraph and the sample answer in vector form to obtain the paragraph word vector corresponding to the sample paragraph and the answer word vector corresponding to the sample answer;
concatenating the paragraph word vector with a preset two-dimensional feature to obtain an input paragraph word vector, the preset two-dimensional feature indicating whether a paragraph word appears in the sample answer;
concatenating the answer word vector with a position vector to obtain an input answer word vector, the position vector representing the position of the sample answer in the sample paragraph;
encoding the input paragraph word vector and the input answer word vector with the encoder in the encoder-decoder attention model to obtain an annotated paragraph word vector and an annotated answer word vector;
calculating the initial state of the decoder in the encoder-decoder attention model according to the annotated paragraph word vector and the annotated answer word vector;
decoding with the decoder in the encoder-decoder attention model according to the initial state of the decoder, the annotated paragraph word vector and the annotated answer word vector to obtain the sample question.
It should be understood that the sample paragraph and the sample answer are represented in vector form, and the paragraph word vector is concatenated with a preset two-dimensional feature indicating whether a document word appears in the answer; the answer word vector is then encoded by finding the position vector corresponding to the position of the answer word in the sample paragraph and concatenating the position vector with the answer word vector. The encoder in the encoder-decoder attention model uses a BiGRU to encode the input paragraph word vector and the input answer word vector, obtaining an annotated paragraph word vector and an annotated answer word vector. In order to generate some phrases and entities from the document directly in the question sentence, a pointer-softmax is used when decoding, that is, two output layers for the final choice: a shortlist softmax and a location softmax. The shortlist softmax is the traditional softmax, which generates over a predefined output vocabulary and corresponds to the generate-mode in the copy network CopyNet; the location softmax gives the position of a word at the input end and corresponds to the copy-mode in CopyNet. The two softmax outputs are weighted and concatenated to obtain the sample question.
Step S05: Establish a basic machine reading comprehension model.
It should be understood that the basic machine reading comprehension model may be a Match-LSTM (Match Long Short-Term Memory) model; the Match-LSTM model includes an Embedding layer, an LSTM layer and a Match-LSTM layer. The Embedding layer embeds the words of the paragraphs and questions; the LSTM layer feeds the paragraphs and questions into a BiLSTM layer and obtains all hidden states, so that both paragraphs and questions carry contextual information; the main function of the Match-LSTM layer is to obtain the interaction information between paragraphs and questions.
Step S06: Train the basic machine reading comprehension model according to the sample paragraphs, the sample answers and the sample questions to obtain the preset machine reading comprehension model.
It can be understood that the sample paragraphs, the sample answers and the sample questions are used as training sample data to train the Match-LSTM model, obtaining the preset machine reading comprehension model; the preset machine reading comprehension model obtained by training can predict answers according to the paragraph to be understood and the corresponding target questions.
Further, after the step S50, the method further includes:
obtaining multiple candidate sentence options, calculating the similarity between each candidate sentence option and the predicted answer range, and selecting the candidate sentence option with the highest similarity as the target option;
sending the target option to the target terminal.
It should be noted that when practicing reading comprehension exercises, if a question has four candidate sentence options A, B, C and D, each candidate sentence option can be compared with the predicted answer range: the sentences are segmented into words, all words are listed, the segmented words are encoded and the term frequencies are calculated; after obtaining the term-frequency vectors corresponding to each candidate sentence option and to the predicted answer range, the cosine of the angle between the vector of each candidate option and the vector of the predicted answer range is calculated; the larger the value, the higher the similarity. A sketch of this matching step follows.
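As an illustration only, the following minimal Python sketch implements this term-frequency cosine matching; jieba segmentation and the helper names are assumptions for illustration.

```python
# Sketch: segment each candidate option and the predicted answer span, build
# term-frequency vectors, and pick the candidate with the highest cosine.
import math
from collections import Counter
import jieba

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(c * c for c in a.values()))
    nb = math.sqrt(sum(c * c for c in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_option(options, predicted_answer):
    ans_tf = Counter(jieba.lcut(predicted_answer))   # term frequencies of answer
    scored = [(cosine(Counter(jieba.lcut(o)), ans_tf), o) for o in options]
    return max(scored)[1]   # option with the highest cosine similarity
```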
In a specific implementation, the machine reading comprehension method of this solution can also be applied to an intelligent question-answering system: for example, when a user has a question about an electrical appliance's instructions while using the appliance, the manual can be read and understood by machine to predict the answer corresponding to the user's question.
In this embodiment, open data is obtained from a database, and sample paragraphs and sample answers are generated from the open data, which enlarges the training sample set; the basic machine reading comprehension model is trained with a large number of sample answers and sample questions, thereby obtaining a preset machine reading comprehension model with higher prediction accuracy.
In addition, an embodiment of this application also proposes a storage medium, which may be volatile or non-volatile; a machine reading comprehension program is stored on the storage medium, and when the machine reading comprehension program is executed by a processor, the following steps are implemented:
obtaining a paragraph to be understood and the corresponding multiple target questions;
performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question;
passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question;
passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question;
sending the predicted answer range to a target terminal.
For other embodiments or specific implementations of the storage medium of this application, refer to the method embodiments above; details are not repeated here.
In addition, referring to Fig. 5, an embodiment of this application also proposes a machine reading comprehension apparatus, the machine reading comprehension apparatus comprising:
an acquisition module 10 for obtaining a paragraph to be understood and the corresponding multiple target questions.
It should be understood that the paragraph to be understood is a paragraph that requires semantic understanding and may be, for example, the instruction manual of a device; for the multiple target questions asked by the user, the corresponding answers are found in the instruction manual. A target question is a question related to semantic understanding raised with respect to the paragraph to be understood; semantic analysis is performed on the paragraph to be understood through the preset reading comprehension model so as to find the answer corresponding to the target question.
an interaction module 20 for performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question.
It can be understood that a multi-threaded processor can process multiple target questions in parallel at the same time, thereby improving processing efficiency. The first layer of the preset machine reading comprehension model is an embedding layer; the paragraph to be understood and the corresponding target questions are input into the preset machine reading comprehension model, and through the embedding layer they are mapped into vector representations.
In a specific implementation, the second layer of the preset machine reading comprehension model is an encoder layer, which encodes the vector representation of the paragraph to be understood and the vector representation of each target question to obtain semantic representations containing context, namely the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question.
It should be understood that the third layer of the preset machine reading comprehension model is an interaction layer, which captures the interactive relationship between paragraph and question and outputs a semantic representation of the interactive information, similar to a human repeatedly reading the original text with the question in mind, thus obtaining the interactive information semantics between the paragraph to be understood and each target question.
a screening module 30 for passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question.
It should be noted that after the interaction layer, a gated answer valuable sentences selection layer, i.e. the screening layer, is added. The screening layer is implemented in two parts: the first part is an information filtering gate (gated info filter), and the second part performs attention analysis with the target question. The specific algorithm is described as follows:
1. Compute the gate filtering probability g_i. 2. Multiply each sentence vector representation in the paragraph to be understood by the gate filtering probability to obtain the filtered vector representation f_i, with the formula f_i = g_i ⊙ h_i, where h_i is the vector representation of sentence i in the paragraph to be understood and g_i is the gate filtering probability. 3. Perform attention interaction between f_i and h_q to obtain the filtered vector p_q, where h_q is the vector representation of the question. 4. Use p_q to represent the valuable sentences strongly related to the target question, and use these valuable sentences as the input of the answer layer of the preset machine reading comprehension model to predict the answer range.
a prediction module 40 for passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question.
It should be understood that the answer layer of the preset machine reading comprehension model performs answer prediction based on the valuable sentences to obtain a predicted answer range. This can be used to perform machine reading comprehension on columns of data to obtain the required content. Each column is a set of attributes of one type, such as name, certificate number, or address; the system must identify which column is the address and which is the ID number. For example, the address contains keywords such as province, city and county, and the ID number has its own rules; according to these rules, the attribute of each column is identified, and the required content is recognized and uploaded to the system. A common application is agricultural insurance, where an entire village or town is insured and farmers' records need to be copied quickly from paper into the system. It can also be applied to an intelligent question-answering system: for example, when a user has a question about an electrical appliance's instructions while using it, the manual can be read and understood by machine to predict the answer corresponding to the user's question.
a sending module 50 for sending the predicted answer range to a target terminal.
It can be understood that the target terminal is the user's terminal device, such as a smartphone or personal computer, through which the predicted answer range is viewed.
In this embodiment, a paragraph to be understood and the corresponding multiple target questions are obtained and processed with multiple threads, passing sequentially through the embedding layer, encoding layer and interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question; based on artificial intelligence, the interactive information semantics pass through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each target question; the valuable sentence vectors pass through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each target question. Predicting answers with the preset machine reading comprehension model improves the accuracy and efficiency of the predicted answers, and sending the predicted answer range to the target terminal improves the user experience.
In an embodiment, the interaction module 20 is further configured to perform multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, obtaining the vector representation of the paragraph to be understood and the vector representation of each target question through the embedding layer of the preset machine reading comprehension model; the vector representation of the paragraph to be understood and the vector representation of each target question pass through the encoding layer of the preset machine reading comprehension model to obtain the paragraph semantics corresponding to the paragraph to be understood and the question semantics corresponding to each target question; the paragraph semantics and each question semantics pass through the interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each target question.
In an embodiment, the screening module 30 is further configured to: through the screening layer of the preset machine reading comprehension model, calculate the gate filtering probability from the vector representation of the paragraph to be understood using the gate filtering probability formula; multiply the vector representation of each sentence in the paragraph to be understood by the gate filtering probability to obtain the gate-filtered vector representation of each sentence in the paragraph to be understood; and, according to the interactive information semantics, perform attention interaction between the gate-filtered vector representation of each sentence in the paragraph to be understood and the vector representation of each target question through the preset interaction formula to obtain valuable sentence vectors strongly related to each target question.
In an embodiment, the gate filtering probability formula is:
g_i = σ(W_g h_i + U_g s + b_g)
where g_i is the gate filtering probability of sentence i, σ is the sigmoid function, W_g and U_g are parameters to be learned, h_i is the vector representation of sentence i in the paragraph to be understood, s is the vector representation of the pooled resources of the paragraph to be understood, and b_g is the bias term.
In an embodiment, the machine reading comprehension apparatus further includes:
a data extraction module for obtaining open data from a preset database, performing data extraction on the open data, and obtaining sample paragraphs;
a keyword extraction module for performing keyword extraction on the sample paragraphs to obtain the keywords corresponding to the sample paragraphs;
a generating module for generating sample answers according to the keywords;
the generating module is further configured to generate a sample question according to the sample paragraph and the sample answer;
an establishing module for establishing a basic machine reading comprehension model;
a training module for training the basic machine reading comprehension model according to the sample paragraphs, the sample answers and the sample questions to obtain the preset machine reading comprehension model.
In an embodiment, the generating module is further configured to: represent the sample paragraph and the sample answer in vector form to obtain the paragraph word vector corresponding to the sample paragraph and the answer word vector corresponding to the sample answer; concatenate the paragraph word vector with a preset two-dimensional feature to obtain an input paragraph word vector, the preset two-dimensional feature indicating whether a paragraph word appears in the sample answer; concatenate the answer word vector with a position vector to obtain an input answer word vector, the position vector representing the position of the sample answer in the sample paragraph; encode the input paragraph word vector and the input answer word vector with the encoder in the encoder-decoder attention model to obtain an annotated paragraph word vector and an annotated answer word vector; calculate the initial state of the decoder in the encoder-decoder attention model according to the annotated paragraph word vector and the annotated answer word vector; and decode with the decoder in the encoder-decoder attention model according to the initial state of the decoder, the annotated paragraph word vector and the annotated answer word vector to obtain the sample question.
In an embodiment, the machine reading comprehension apparatus further includes:
a calculation module for obtaining multiple candidate sentence options, calculating the similarity between each candidate sentence option and the predicted answer range, and selecting the candidate sentence option with the highest similarity as the target option;
the sending module 50 is further configured to send the target option to the target terminal.
For other embodiments or specific implementations of the machine reading comprehension apparatus of this application, refer to the method embodiments above; details are not repeated here.
In another embodiment, to further ensure the privacy and security of all the data mentioned above, the machine reading comprehension method provided by this application may also store all of the above data in nodes of a blockchain; for example, the target questions, sentence vectors and so on can all be stored in blockchain nodes.
It should be noted that the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms.
It should be noted that in this document, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or system that includes the element.
The serial numbers of the above embodiments of this application are for description only and do not represent the superiority or inferiority of the embodiments. In a unit claim enumerating several apparatuses, several of these apparatuses may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as identifiers.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as a read-only memory image (ROM)/random access memory (RAM), magnetic disk, or optical disc), including several instructions to enable a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to perform the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not therefore limit the patent scope of this application; any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (20)

  1. A machine reading comprehension method, wherein the machine reading comprehension method comprises the following steps:
    obtaining a paragraph to be understood and the corresponding multiple target questions;
    performing multi-thread processing on the paragraph to be understood and the corresponding multiple target questions, passing them sequentially through the embedding layer, encoding layer and interaction layer of a preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each of the target questions;
    passing the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors strongly related to each of the target questions;
    passing the valuable sentence vectors through the answer layer of the preset machine reading comprehension model to obtain the predicted answer range of each of the target questions;
    sending the predicted answer range to a target terminal.
  2. 如权利要求1所述的机器阅读理解方法,其中,所述将所述待理解段落及对应的多个所述目标问题进行多线程处理,依次经过预设机器阅读理解模型的嵌入层、编码层和交互层,获得所述待理解段落与各所述目标问题之间的交互信息语义,包括:
    将所述待理解段落及对应的多个所述目标问题进行多线程处理,经过预设机器阅读理解模型的嵌入层,获得待理解段落的向量表示及各目标问题的向量表示;
    所述待理解段落的向量表示及各所述目标问题的向量表示经过所述预设机器阅读理解模型的编码层,获得所述待理解段落对应的段落语义及各所述目标问题对应的问题语义;
    所述段落语义及各所述问题语义经过所述预设机器阅读理解模型的交互层,获得所述待理解段落与各所述目标问题之间的交互信息语义。
  3. 如权利要求2所述的机器阅读理解方法,其中,所述将所述交互信息语义经过所述预设机器阅读理解模型的筛选层,获得与各所述目标问题关联性较强的有价值句子向量,包括:
    经过所述预设机器阅读理解模型的筛选层,根据所述待理解段落的向量表示,通过门过滤概率公式计算门过滤概率;
    将所述待理解段落中每个句子的向量表示与所述门过滤概率点乘,获得所述待理解段落中每个句子门过滤后的向量表示;
    根据所述交互信息语义,将所述待理解段落中每个句子门过滤后的向量表示与各所述目标问题的向量表示通过预设交互公式进行注意力交互,获得与各所述目标问题关联性较强的有价值句子向量。
  4. The machine reading comprehension method according to claim 3, wherein the gate filter probability formula is:
    g_i = σ(W_g·h_i + U_g·h̄ + b_g)
    where g_i is the gate filter probability of sentence i, σ is the sigmoid function, W_g and U_g are parameters to be learned, h_i is the vector representation of sentence i in the paragraph to be understood, h̄ is the pooled vector representation concentrating the resources of the paragraph to be understood, and b_g is the bias term.
  5. The machine reading comprehension method according to claim 1, wherein before the acquiring of the paragraph to be understood and the plurality of corresponding target questions, the machine reading comprehension method further comprises:
    acquiring open data from a preset database and performing data extraction on the open data to obtain sample paragraphs;
    performing keyword extraction on the sample paragraphs to obtain keywords corresponding to the sample paragraphs;
    generating sample answers according to the keywords;
    generating sample questions according to the sample paragraphs and the sample answers;
    establishing a basic machine reading comprehension model; and
    training the basic machine reading comprehension model according to the sample paragraphs, the sample answers and the sample questions to obtain the preset machine reading comprehension model.
  6. The machine reading comprehension method according to claim 5, wherein the generating of the sample questions according to the sample paragraphs and the sample answers comprises:
    representing the sample paragraph and the sample answer in vector form to obtain paragraph word vectors corresponding to the sample paragraph and answer word vectors corresponding to the sample answer;
    concatenating the paragraph word vectors with a preset two-dimensional feature to obtain input paragraph word vectors, the preset two-dimensional feature vector indicating whether a paragraph word appears in the sample answer;
    concatenating the answer word vectors with a position vector to obtain input answer word vectors, the position vector indicating the position of the sample answer in the sample paragraph;
    encoding the input paragraph word vectors and the input answer word vectors by an encoder of an encoder-decoder attention model to obtain annotated paragraph word vectors and annotated answer word vectors;
    computing an initial state of a decoder of the encoder-decoder attention model from the annotated paragraph word vectors and the annotated answer word vectors; and
    decoding, by the decoder of the encoder-decoder attention model, based on the initial state of the decoder, the annotated paragraph word vectors and the annotated answer word vectors, to obtain the sample questions.
  7. The machine reading comprehension method according to any one of claims 1 to 6, wherein after the sending of the predicted answer range to the target terminal, the machine reading comprehension method further comprises:
    acquiring a plurality of candidate sentence options, computing a similarity between each candidate sentence option and the predicted answer range, and selecting the candidate sentence option with the highest similarity as a target option; and
    sending the target option to the target terminal.
  8. A machine reading comprehension apparatus, wherein the machine reading comprehension apparatus comprises:
    an acquisition module, configured to acquire a paragraph to be understood and a plurality of corresponding target questions;
    an interaction module, configured to process the paragraph to be understood and the plurality of corresponding target questions in multiple threads, passing them in sequence through an embedding layer, a coding layer and an interaction layer of a preset machine reading comprehension model, to obtain interactive information semantics between the paragraph to be understood and each of the target questions;
    a screening module, configured to pass the interactive information semantics through a screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors that are strongly related to each of the target questions;
    a prediction module, configured to pass the valuable sentence vectors through an answer layer of the preset machine reading comprehension model to obtain a predicted answer range of each of the target questions; and
    a sending module, configured to send the predicted answer range to a target terminal.
  9. A machine reading comprehension device, wherein the machine reading comprehension device comprises: a memory, a processor, and a machine reading comprehension program stored on the memory and executable on the processor, wherein the machine reading comprehension program, when executed by the processor, implements the following steps:
    acquiring a paragraph to be understood and a plurality of corresponding target questions;
    processing the paragraph to be understood and the plurality of corresponding target questions in multiple threads, passing them in sequence through an embedding layer, a coding layer and an interaction layer of a preset machine reading comprehension model, to obtain interactive information semantics between the paragraph to be understood and each of the target questions;
    passing the interactive information semantics through a screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors that are strongly related to each of the target questions;
    passing the valuable sentence vectors through an answer layer of the preset machine reading comprehension model to obtain a predicted answer range of each of the target questions; and
    sending the predicted answer range to a target terminal.
  10. The machine reading comprehension device according to claim 9, wherein the processing of the paragraph to be understood and the plurality of corresponding target questions in multiple threads, passing them in sequence through the embedding layer, the coding layer and the interaction layer of the preset machine reading comprehension model, to obtain the interactive information semantics between the paragraph to be understood and each of the target questions comprises:
    processing the paragraph to be understood and the plurality of corresponding target questions in multiple threads through the embedding layer of the preset machine reading comprehension model to obtain a vector representation of the paragraph to be understood and a vector representation of each target question;
    passing the vector representation of the paragraph to be understood and the vector representation of each of the target questions through the coding layer of the preset machine reading comprehension model to obtain paragraph semantics corresponding to the paragraph to be understood and question semantics corresponding to each of the target questions; and
    passing the paragraph semantics and each of the question semantics through the interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each of the target questions.
  11. The machine reading comprehension device according to claim 10, wherein the passing of the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain the valuable sentence vectors that are strongly related to each of the target questions comprises:
    computing, in the screening layer of the preset machine reading comprehension model, a gate filter probability from the vector representation of the paragraph to be understood by means of a gate filter probability formula;
    multiplying the vector representation of each sentence in the paragraph to be understood element-wise by the gate filter probability to obtain a gate-filtered vector representation of each sentence in the paragraph to be understood; and
    performing, based on the interactive information semantics, attention interaction between the gate-filtered vector representation of each sentence in the paragraph to be understood and the vector representation of each of the target questions by means of a preset interaction formula, to obtain the valuable sentence vectors that are strongly related to each of the target questions.
  12. The machine reading comprehension device according to claim 11, wherein the gate filter probability formula is:
    g_i = σ(W_g·h_i + U_g·h̄ + b_g)
    where g_i is the gate filter probability of sentence i, σ is the sigmoid function, W_g and U_g are parameters to be learned, h_i is the vector representation of sentence i in the paragraph to be understood, h̄ is the pooled vector representation concentrating the resources of the paragraph to be understood, and b_g is the bias term.
  13. The machine reading comprehension device according to claim 9, wherein before the acquiring of the paragraph to be understood and the plurality of corresponding target questions, the machine reading comprehension program, when executed by the processor, further implements the following steps:
    acquiring open data from a preset database and performing data extraction on the open data to obtain sample paragraphs;
    performing keyword extraction on the sample paragraphs to obtain keywords corresponding to the sample paragraphs;
    generating sample answers according to the keywords;
    generating sample questions according to the sample paragraphs and the sample answers;
    establishing a basic machine reading comprehension model; and
    training the basic machine reading comprehension model according to the sample paragraphs, the sample answers and the sample questions to obtain the preset machine reading comprehension model.
  14. The machine reading comprehension device according to claim 13, wherein the generating of the sample questions according to the sample paragraphs and the sample answers comprises:
    representing the sample paragraph and the sample answer in vector form to obtain paragraph word vectors corresponding to the sample paragraph and answer word vectors corresponding to the sample answer;
    concatenating the paragraph word vectors with a preset two-dimensional feature to obtain input paragraph word vectors, the preset two-dimensional feature vector indicating whether a paragraph word appears in the sample answer;
    concatenating the answer word vectors with a position vector to obtain input answer word vectors, the position vector indicating the position of the sample answer in the sample paragraph;
    encoding the input paragraph word vectors and the input answer word vectors by an encoder of an encoder-decoder attention model to obtain annotated paragraph word vectors and annotated answer word vectors;
    computing an initial state of a decoder of the encoder-decoder attention model from the annotated paragraph word vectors and the annotated answer word vectors; and
    decoding, by the decoder of the encoder-decoder attention model, based on the initial state of the decoder, the annotated paragraph word vectors and the annotated answer word vectors, to obtain the sample questions.
  15. The machine reading comprehension device according to any one of claims 9 to 14, wherein after the sending of the predicted answer range to the target terminal, the machine reading comprehension program, when executed by the processor, further implements the following steps:
    acquiring a plurality of candidate sentence options, computing a similarity between each candidate sentence option and the predicted answer range, and selecting the candidate sentence option with the highest similarity as a target option; and
    sending the target option to the target terminal.
  16. A storage medium, wherein a machine reading comprehension program is stored on the storage medium, and the machine reading comprehension program, when executed by a processor, implements the following steps:
    acquiring a paragraph to be understood and a plurality of corresponding target questions;
    processing the paragraph to be understood and the plurality of corresponding target questions in multiple threads, passing them in sequence through an embedding layer, a coding layer and an interaction layer of a preset machine reading comprehension model, to obtain interactive information semantics between the paragraph to be understood and each of the target questions;
    passing the interactive information semantics through a screening layer of the preset machine reading comprehension model to obtain valuable sentence vectors that are strongly related to each of the target questions;
    passing the valuable sentence vectors through an answer layer of the preset machine reading comprehension model to obtain a predicted answer range of each of the target questions; and
    sending the predicted answer range to a target terminal.
  17. The storage medium according to claim 16, wherein the processing of the paragraph to be understood and the plurality of corresponding target questions in multiple threads, passing them in sequence through the embedding layer, the coding layer and the interaction layer of the preset machine reading comprehension model, to obtain the interactive information semantics between the paragraph to be understood and each of the target questions comprises:
    processing the paragraph to be understood and the plurality of corresponding target questions in multiple threads through the embedding layer of the preset machine reading comprehension model to obtain a vector representation of the paragraph to be understood and a vector representation of each target question;
    passing the vector representation of the paragraph to be understood and the vector representation of each of the target questions through the coding layer of the preset machine reading comprehension model to obtain paragraph semantics corresponding to the paragraph to be understood and question semantics corresponding to each of the target questions; and
    passing the paragraph semantics and each of the question semantics through the interaction layer of the preset machine reading comprehension model to obtain the interactive information semantics between the paragraph to be understood and each of the target questions.
  18. The storage medium according to claim 17, wherein the passing of the interactive information semantics through the screening layer of the preset machine reading comprehension model to obtain the valuable sentence vectors that are strongly related to each of the target questions comprises:
    computing, in the screening layer of the preset machine reading comprehension model, a gate filter probability from the vector representation of the paragraph to be understood by means of a gate filter probability formula;
    multiplying the vector representation of each sentence in the paragraph to be understood element-wise by the gate filter probability to obtain a gate-filtered vector representation of each sentence in the paragraph to be understood; and
    performing, based on the interactive information semantics, attention interaction between the gate-filtered vector representation of each sentence in the paragraph to be understood and the vector representation of each of the target questions by means of a preset interaction formula, to obtain the valuable sentence vectors that are strongly related to each of the target questions.
  19. The storage medium according to claim 18, wherein the gate filter probability formula is:
    g_i = σ(W_g·h_i + U_g·h̄ + b_g)
    where g_i is the gate filter probability of sentence i, σ is the sigmoid function, W_g and U_g are parameters to be learned, h_i is the vector representation of sentence i in the paragraph to be understood, h̄ is the pooled vector representation concentrating the resources of the paragraph to be understood, and b_g is the bias term.
  20. The storage medium according to claim 16, wherein before the acquiring of the paragraph to be understood and the plurality of corresponding target questions, the machine reading comprehension program, when executed by the processor, further implements the following steps:
    acquiring open data from a preset database and performing data extraction on the open data to obtain sample paragraphs;
    performing keyword extraction on the sample paragraphs to obtain keywords corresponding to the sample paragraphs;
    generating sample answers according to the keywords;
    generating sample questions according to the sample paragraphs and the sample answers;
    establishing a basic machine reading comprehension model; and
    training the basic machine reading comprehension model according to the sample paragraphs, the sample answers and the sample questions to obtain the preset machine reading comprehension model.
PCT/CN2020/121518 2019-10-29 2020-10-16 Machine reading comprehension method, device, storage medium and apparatus WO2021082953A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911058199.2 2019-10-29
CN201911058199.2A CN111027327B (zh) 2019-10-29 2019-10-29 Machine reading comprehension method, device, storage medium and apparatus

Publications (1)

Publication Number Publication Date
WO2021082953A1 true WO2021082953A1 (zh) 2021-05-06

Family

ID=70204835

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/121518 WO2021082953A1 (zh) 2019-10-29 2020-10-16 Machine reading comprehension method, device, storage medium and apparatus

Country Status (2)

Country Link
CN (1) CN111027327B (zh)
WO (1) WO2021082953A1 (zh)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027327B (zh) 2019-10-29 2022-09-06 平安科技(深圳)有限公司 Machine reading comprehension method, device, storage medium and apparatus
CN111552781B (zh) 2020-04-29 2021-03-02 焦点科技股份有限公司 Method combining machine retrieval and machine reading
CN111460176B (zh) 2020-05-11 2023-11-07 南京大学 Multi-document machine reading comprehension method based on hash learning
CN111428021B (zh) 2020-06-05 2023-05-30 平安国际智慧城市科技股份有限公司 Machine-learning-based text processing method and apparatus, computer device and medium
CN111858879B (zh) 2020-06-18 2024-04-05 达观数据有限公司 Question answering method and system based on machine reading comprehension, storage medium and computer device
CN112084299B (zh) 2020-08-05 2022-05-31 山西大学 Automatic question answering method for reading comprehension based on BERT semantic representation
CN112347756B (zh) 2020-09-29 2023-12-22 中国科学院信息工程研究所 Inferential reading comprehension method and system based on serialized evidence extraction
CN112163079B (zh) 2020-09-30 2024-02-20 民生科技有限责任公司 Intelligent dialogue method and system based on a reading comprehension model
CN112380326B (zh) 2020-10-10 2022-07-08 中国科学院信息工程研究所 Question answer extraction method based on multilayer perception, and electronic apparatus
CN112416931A (zh) 2020-11-18 2021-02-26 脸萌有限公司 Information generation method, apparatus and electronic device
CN112613322B (zh) 2020-12-17 2023-10-24 平安科技(深圳)有限公司 Text processing method, apparatus, device and storage medium
CN112612892B (zh) 2020-12-29 2022-11-01 达而观数据(成都)有限公司 Method for constructing a domain-specific corpus model, computer device and storage medium
CN112784579B (zh) 2020-12-31 2022-05-27 山西大学 Data-augmentation-based method for answering reading comprehension multiple-choice questions
CN112860863A (zh) 2021-01-30 2021-05-28 云知声智能科技股份有限公司 Machine reading comprehension method and apparatus
CN113268601B (zh) 2021-03-02 2024-05-14 安徽淘云科技股份有限公司 Information extraction method, reading comprehension model training method and related apparatus
CN113076431B (zh) 2021-04-28 2022-09-02 平安科技(深圳)有限公司 Question answering method and apparatus for machine reading comprehension, computer device and storage medium
CN113191159B (zh) 2021-05-25 2023-01-20 广东电网有限责任公司广州供电局 Machine reading comprehension method, apparatus, device and storage medium
CN113420134B (zh) 2021-06-22 2022-10-14 康键信息技术(深圳)有限公司 Machine reading comprehension method and apparatus, computer device and storage medium
CN113312912B (zh) 2021-06-25 2023-03-31 重庆交通大学 Machine reading comprehension method for traffic infrastructure inspection texts
CN113627152B (zh) 2021-07-16 2023-05-16 中国科学院软件研究所 Unsupervised machine reading comprehension training method based on self-supervised learning
CN115048907B (zh) 2022-05-31 2024-02-27 北京深言科技有限责任公司 Method and apparatus for determining text data quality

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2531720A (en) * 2014-10-27 2016-05-04 Ibm Automatic question generation from natural text
CN108959396A (zh) * 2018-06-04 2018-12-07 众安信息技术服务有限公司 Machine reading model training method and apparatus, and question answering method and apparatus
CN110096698A (zh) * 2019-03-20 2019-08-06 中国地质大学(武汉) Topic-aware machine reading comprehension model generation method and system
CN110309305A (zh) * 2019-06-14 2019-10-08 中国电子科技集团公司第二十八研究所 Machine reading comprehension method based on multi-task joint training, and computer storage medium
CN111027327A (zh) * 2019-10-29 2020-04-17 平安科技(深圳)有限公司 Machine reading comprehension method, device, storage medium and apparatus

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190156220A1 (en) * 2017-11-22 2019-05-23 Microsoft Technology Licensing, Llc Using machine comprehension to answer a question
US10997221B2 (en) * 2018-04-07 2021-05-04 Microsoft Technology Licensing, Llc Intelligent question answering using machine reading comprehension
CN109033068B (zh) * 2018-06-14 2022-07-12 北京慧闻科技(集团)有限公司 Attention-mechanism-based method, apparatus and electronic device for reading comprehension
CN109033229B (zh) * 2018-06-29 2021-06-11 北京百度网讯科技有限公司 Question answering processing method and apparatus
CN109460553B (zh) * 2018-11-05 2023-05-16 中山大学 Machine reading comprehension method based on gated convolutional neural networks
CN109670029B (zh) * 2018-12-28 2021-09-07 百度在线网络技术(北京)有限公司 Method, apparatus, computer device and storage medium for determining answers to questions
CN109829055B (zh) * 2019-02-22 2021-03-12 苏州大学 Method for predicting applicable legal provisions for users based on a filter gate mechanism

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113268576A (zh) * 2021-06-02 2021-08-17 北京汇声汇语科技有限公司 Method and device for extracting departmental semantic information based on deep learning
CN113268576B (zh) * 2021-06-02 2024-03-08 北京汇声汇语科技有限公司 Method and device for extracting departmental semantic information based on deep learning
CN113420160A (zh) * 2021-06-24 2021-09-21 竹间智能科技(上海)有限公司 Data processing method and device
CN113553410A (zh) * 2021-06-30 2021-10-26 北京百度网讯科技有限公司 Long document processing method, processing apparatus, electronic device and storage medium
CN113553410B (zh) * 2021-06-30 2023-09-22 北京百度网讯科技有限公司 Long document processing method, processing apparatus, electronic device and storage medium
CN113505207A (zh) * 2021-07-02 2021-10-15 中科苏州智能计算技术研究院 Machine reading comprehension method and system for financial public-opinion research reports
CN113505207B (zh) * 2021-07-02 2024-02-20 中科苏州智能计算技术研究院 Machine reading comprehension method and system for financial public-opinion research reports
CN113822040A (zh) * 2021-08-06 2021-12-21 深圳市卓帆技术有限公司 Subjective-question marking and scoring method, apparatus, computer device and storage medium
CN113742468B (zh) * 2021-09-03 2024-04-12 上海欧冶金诚信息服务股份有限公司 Intelligent question answering method and system based on reading comprehension
CN113742468A (zh) * 2021-09-03 2021-12-03 上海欧冶金融信息服务股份有限公司 Intelligent question answering method and system based on reading comprehension
CN113836893A (zh) * 2021-09-14 2021-12-24 北京理工大学 Extractive machine reading comprehension method incorporating information from multiple paragraphs
CN113836941A (zh) * 2021-09-27 2021-12-24 上海合合信息科技股份有限公司 Contract navigation method and apparatus
CN113836941B (zh) * 2021-09-27 2023-11-14 上海合合信息科技股份有限公司 Contract navigation method and apparatus
CN115033702A (zh) * 2022-03-04 2022-09-09 贵州电网有限责任公司 Substation site-selection knowledge extraction method based on ensemble learning
CN115033702B (zh) * 2022-03-04 2024-06-04 贵州电网有限责任公司 Substation site-selection knowledge extraction method based on ensemble learning
CN115910345A (zh) * 2022-12-22 2023-04-04 广东数业智能科技有限公司 Intelligent early-warning method for mental health assessment, and storage medium
CN117290483A (zh) * 2023-10-09 2023-12-26 成都明途科技有限公司 Answer determination method, model training method, apparatus and electronic device

Also Published As

Publication number Publication date
CN111027327A (zh) 2020-04-17
CN111027327B (zh) 2022-09-06

Similar Documents

Publication Publication Date Title
WO2021082953A1 (zh) Machine reading comprehension method, device, storage medium and apparatus
CN108959246B (zh) Answer selection method and apparatus based on an improved attention mechanism, and electronic device
US20200356729A1 (en) Generation of text from structured data
CN109101537A (zh) Multi-turn dialogue data classification method and apparatus based on deep learning, and electronic device
CN112287069B (zh) Information retrieval method and apparatus based on speech semantics, and computer device
CN111666416B (zh) Method and apparatus for generating a semantic matching model
CN110795552A (zh) Training sample generation method and apparatus, electronic device and storage medium
JP7290861B2 (ja) Answer classifier and representation generator for a question answering system, and computer program for training the representation generator
CN111738016A (zh) Multi-intent recognition method and related device
CN113704460B (zh) Text classification method and apparatus, electronic device and storage medium
CN114676234A (zh) Model training method and related device
US11645526B2 (en) Learning neuro-symbolic multi-hop reasoning rules over text
CN111985243B (zh) Emotion model training method, sentiment analysis method, apparatus and storage medium
KR20200087977A (ko) Multimodal document summarization system and method
CN110866098A (zh) Machine reading method and apparatus based on Transformer and LSTM, and readable storage medium
CN110678882A (zh) Selecting answer spans from electronic documents using machine learning
CN111274822A (zh) Semantic matching method, apparatus, device and storage medium
CN112528654A (zh) Natural language processing method and apparatus, and electronic device
CN114416995A (zh) Information recommendation method, apparatus and device
JP2022145623A (ja) Method and apparatus for presenting hint information, and computer program
CN114282528A (zh) Keyword extraction method, apparatus, device and storage medium
CN113609873A (zh) Translation model training method, apparatus and medium
CN112328655A (zh) Text label mining method, apparatus, device and storage medium
CN113704466B (zh) Text multi-label classification method and apparatus based on iterative networks, and electronic device
CN114936564A (zh) Multilingual semantic matching method and system based on aligned variational autoencoding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20880603

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20880603

Country of ref document: EP

Kind code of ref document: A1