CN111125295B

CN111125295B - LSTM-based method and system for obtaining answers to food safety questions

Info

Publication number: CN111125295B
Application number: CN201911113762.1A
Authority: CN
Inventors: 陈瑛; 赵筱钰; 陈昂轩; 董玉博; 侯文俊
Original assignee: China Agricultural University
Current assignee: China Agricultural University
Priority date: 2019-11-14
Filing date: 2019-11-14
Publication date: 2023-11-24
Anticipated expiration: 2039-11-14
Also published as: CN111125295A

Abstract

Embodiments of the invention provide an LSTM-based method and system for obtaining answers to food safety questions. The method comprises: acquiring a question to be answered; and inputting the question to be answered into a pre-trained question-answer judging model to obtain a question matching result output by the model. The question-answer judging model is obtained by training an LSTM model on a food safety question answer sample set, using manually labeled question answer pairs and their corresponding matching labels. By training an LSTM model on food safety question answer pairs, the embodiments address the low retrieval accuracy of search engines for food safety events, so that answers to food safety questions can be effectively screened from a news corpus.

Description

LSTM-based method and system for obtaining answers to food safety questions

Technical Field

The invention relates to the technical field of natural language processing, in particular to a method and a system for obtaining answers to food safety questions based on LSTM.

Background

In recent years, food safety incidents have occurred frequently, the food safety problem has become more serious, and public concern about food safety has grown. Because of information asymmetry, data on food safety is often difficult to obtain.

At present, the corpus of food safety events comes mainly from news reports. However, news texts vary in length and the position of an answer within a report is uncertain, so manually searching food safety news reports for the answer to a food safety question is inefficient.

Most automatic question-answering systems build a structured food safety database, select the data that matches the question, and return it to the consumer as the answer. This database approach depends heavily on the quality of manual labeling: if data is labeled incorrectly, or the answer data for a question has not been labeled, no reasonable feedback can be given to the consumer. It is therefore desirable to build a food safety question-answering system that finds answers in news reports and produces answers itself rather than relying on an existing database.

Therefore, a method for automatically obtaining answers to food safety questions is needed, one that retrieves related food safety events and returns the sentences containing the answers, so that consumers can conveniently learn about food safety events.

Disclosure of Invention

Embodiments of the invention provide an LSTM (Long Short-Term Memory)-based method and system for obtaining answers to food safety questions, which address the low efficiency of manually searching for answers to food safety questions in the prior art.

In a first aspect, an embodiment of the present invention provides a method for obtaining answers to food safety questions based on LSTM, including:

acquiring a question to be answered;

inputting the questions to be answered into a pre-trained question-answer judging model, and obtaining a question matching result output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set.

Preferably, the question-answer judgment model is obtained by the following steps:

acquiring a sample set of answers to the food safety questions;

classifying according to the question answer pairs, defining sentences in which the target answers are contained in the question answer pairs as first labels, and defining sentences in which the target answers are not contained in the question answer pairs as second labels;

dividing the food safety question answer sample set into a training set, a cross-validation set and a test set based on the first label and the second label;

acquiring an LSTM model network structure as an initial model;

and inputting the training set, the cross validation set and the test set into the initial model for training to obtain the question-answer judging model.

Preferably, the obtaining a sample set of answers to the food safety questions specifically includes:

acquiring a food safety original document;

manually asking questions based on a preset food safety event, marking in the food safety original document the sentences that contain answers matching the manual questions, and forming matching question answer pairs from those sentences and the manual questions;

marking in the food safety original document sentences having a preset degree of non-association with the manual questions, and forming non-matching question answer pairs from those sentences and the manual questions;

generating word vectors for the food safety domain with a word2vec model, and mapping the word-segmented questions and answers in the question answer pairs into the word vector space to generate vector representations of the question answer pairs;

and constructing the food safety question answer sample set from the word vectors.

Preferably, the step of manually asking questions based on a preset food safety event, marking sentences containing matching answers in the food safety original document, and forming matching question answer pairs further comprises:

searching the food safety original documents, using a Lucene inverted index system, for the article with the highest matching degree with the manual question, and searching that article for sentences containing information related to the manual question.

Preferably, the inputting the questions to be answered to a pre-trained question-answer judging model, and obtaining the question matching result output by the question-answer judging model specifically includes:

searching an article with the highest matching degree with the questions to be answered in the food safety original document by adopting a Lucene inverted index system based on the questions to be answered, and taking the article as a candidate article;

all sentences in the candidate articles and the questions to be answered form candidate question answer pairs;

and inputting the vector representations of the candidate question answer pairs into the question-answer judging model, calculating the matching degree with the question-answer judging model, and taking the sentence with the highest matching degree as the question matching result.

Preferably, the matching degree calculation performed by the question-answer judging model specifically includes:

scoring, by the question-answer judging model, the matching degree between each sentence and the question to be answered to obtain matching degree scores;

and extracting sentences in descending order of matching degree score to match the question to be answered.

Preferably, the question answer pair is in the format of a question-answer-tag value.

In a second aspect, an embodiment of the present invention provides a system for obtaining answers to food safety questions based on LSTM, including:

the acquisition module is used for acquiring the questions to be answered;

the processing module is used for inputting the questions to be answered into a pre-trained question-answer judging model and obtaining the question matching results output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set.

In a third aspect, an embodiment of the present invention provides an electronic device, including:

a memory, a processor, and a computer program stored on the memory and executable on the processor, which when executed implements the steps of any of the LSTM based methods of obtaining answers to food safety questions.

In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the LSTM-based methods of obtaining answers to food safety questions.

According to the LSTM-based method and system for obtaining answers to food safety questions provided by the embodiments of the invention, an LSTM model is trained on food safety question answer pairs, which addresses the low efficiency of search engines in retrieving food safety events, so that answers to food safety questions can be effectively screened from a news corpus.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a method for obtaining answers to food safety questions based on LSTM according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for training the question-answer judging model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the overall process of training the question-answer judging model according to an embodiment of the present invention;

FIG. 4 is a schematic workflow diagram of an automatic food safety questioning and answering system according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of the relationship between the average accuracy and the number of articles selected according to the embodiment of the present invention;

FIG. 6 is a system architecture diagram for obtaining answers to food safety questions based on LSTM according to an embodiment of the present invention;

FIG. 7 is a block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Fig. 1 is a flowchart of a method for obtaining answers to food safety questions based on LSTM according to an embodiment of the present invention, as shown in fig. 1, including:

S1, acquiring a question to be answered;

S2, inputting the question to be answered into a pre-trained question-answer judging model, and obtaining a question matching result output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set.

Specifically, a food safety question to be answered is first prepared according to the food safety hot topic the user is concerned with. The question is input into the pre-trained question-answer judging model, which outputs the answer matching the question. The question-answer judging model is obtained by training an LSTM (Long Short-Term Memory) model on a food safety question answer sample set of a certain size together with the classification labels.

According to the embodiment of the invention, an LSTM model is trained on the obtained food safety question answer pairs, which addresses the low efficiency of search engines in retrieving food safety events, so that answers to food safety questions can be effectively screened from the news corpus.

Based on the above embodiments, fig. 2 is a flowchart of a training method for a question judgment model according to an embodiment of the present invention, and as shown in fig. 2, the question judgment model is obtained by:

101, acquiring a sample set of answers to food safety questions;

102, classifying according to the question answer pairs, defining sentences in which the question answer pairs contain target answers as first labels, and defining sentences in which the question answer pairs do not contain target answers as second labels;

103, dividing the food safety question answer sample set into a training set, a cross-validation set and a test set based on the first label and the second label;

104, acquiring an LSTM model network structure as an initial model;

and 105, inputting the training set, the cross validation set and the test set into the initial model for training to obtain the question-answer judging model.

Specifically, in step 101, the food safety question answer sample set to be used for training is obtained;

in step 102, the obtained question answer pair data is used to train an LSTM model that can judge the matching degree between a question and a sentence. To ensure the accuracy and generalization ability of the neural network, matched question answer pairs make up 50% of the training data. Every question answer pair whose sentence contains the answer is given label=1, the first label, and every pair whose sentence does not contain the answer is given label=0, the second label. The trained model outputs the probability that a given answer candidate sentence correctly answers the question, that is, the matching degree of the question answer pair;

in step 103, after the classification labels have been assigned, all cases are randomly shuffled and divided into a training set, a cross-validation set and a test set in the ratio 7:2:1 for subsequent model training;

in step 104, an LSTM model is obtained as the initial model structure. LSTM is a recurrent neural network designed specifically to solve the long-term dependency problem of the ordinary RNN (recurrent neural network); like all RNNs, it has the form of a chain of repeating neural network modules;

in step 105, the sample set is input into the initial model and training starts. The input layer uniformly uses the word vectors given by the previously generated word vector model. Each training case comprises the vector representation of the question text, the vector representation of the candidate answer sentence text, and a 0-1 label indicating whether the two match. The word2vec word vector dimension is set to 20 and the number of word units per sentence is fixed at 80: sentences with more word units are truncated, and questions or answer sentences with fewer word units are padded with all-zero word vectors up to 80, so that every question or answer sentence sample received by the LSTM input layer contains the same number of word vectors. A vocabulary of the training corpus is then built so that the words in the text correspond one to one with the pre-trained vectors, and the pre-trained word vectors are specified. During training, the LSTM performs gradient descent on the training set using the Adam optimization method; attributes such as the number of layers and the learning rate are adjustable training parameters. The training parameters are tuned using the cross-validation set, and the results are verified using the test set. The LSTM output layer computes the probability values with a Softmax function. From the word vector representations of a given question and a candidate answer sentence, the trained model calculates their matching degree; the closer the value is to 1, the more likely the candidate sentence correctly answers the question. The overall training process is shown schematically in fig. 3.
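The truncation and zero-padding described in step 105 can be sketched as follows. This is a minimal illustration using plain Python lists; the function name `pad_or_truncate` and the constants are hypothetical and not taken from the embodiment, which only states the dimension (20) and the sentence length (80).

```python
from typing import List

EMBED_DIM = 20   # word vector dimension stated in step 105
MAX_UNITS = 80   # uniform number of word units per sentence

def pad_or_truncate(vectors: List[List[float]],
                    max_units: int = MAX_UNITS,
                    dim: int = EMBED_DIM) -> List[List[float]]:
    """Return exactly `max_units` word vectors of length `dim`."""
    # Sentences with too many word units are truncated.
    clipped = [list(v) for v in vectors[:max_units]]
    # Sentences with too few word units are filled with all-zero vectors.
    padding = [[0.0] * dim for _ in range(max_units - len(clipped))]
    return clipped + padding

# Hypothetical inputs: a 3-word sentence and a 100-word sentence.
short = pad_or_truncate([[0.1] * EMBED_DIM for _ in range(3)])
long_ = pad_or_truncate([[0.2] * EMBED_DIM for _ in range(100)])
```

After this step both the question and the candidate sentence have the same shape (80 by 20), which is what lets a fixed-size input layer accept them.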

According to the embodiment of the invention, the LSTM model is introduced, and the model is trained and learned based on label classification, so that the question-answer judgment model obtained through training has strong learning ability, and the matching answers of the questions can be effectively identified and matched.
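The random 7:2:1 division of step 103 might look like the sketch below. The helper `split_7_2_1`, the fixed seed, and the toy cases are assumptions made for illustration; the embodiment does not specify how the shuffle is seeded or how remainders are handled.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

def split_7_2_1(cases: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the cases and split them 7:2:1 into train / validation / test."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)      # random arrangement of all cases
    n_train = len(shuffled) * 7 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # the remainder goes to the test set
    return train, val, test

# Hypothetical cases in (question, sentence, label) form, half labeled 1.
cases = [(f"q{i}", f"s{i}", i % 2) for i in range(100)]
train, val, test = split_7_2_1(cases)
```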

Based on any one of the above embodiments, the obtaining a sample set of answers to the food safety questions specifically includes:

acquiring a food safety original document;

Specifically, a certain number of food safety original documents are obtained, questions are asked manually about the related food safety events, and the sentences containing the answers are marked manually in the documents. Irrelevant sentences from the documents are also selected. These sentences form matching and non-matching question answer pairs, respectively, with the questions asked. The text in the question answer pairs is segmented into words, word vectors are generated with a food safety domain word vector model based on word2vec, and the word vectors form the food safety question answer sample set.

Based on any of the above embodiments, the step of manually asking questions based on a preset food safety event, marking sentences containing matching answers in the food safety original document, and forming matching question answer pairs further includes:

Specifically, for the obtained food safety corpus, questions related to the content of an article are posed manually, the related answers are selected from the article, and the answer candidate sentences are combined with the manually posed questions into question answer pairs. The number of question answer pairs can be expanded as follows: the Lucene inverted index system is used to find the article in the document corpus with the highest matching degree with the question, and all sentences in that article that contain the required answer are found to generate question answer pairs. Irrelevant sentences in the documents are selected at random and form non-matching question answer pairs with the manually posed questions.
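As a rough sketch of how matching and non-matching pairs could be assembled, the code below pairs a question with its marked answer sentences (label 1) and with an equal number of randomly chosen unrelated sentences (label 0), which keeps the matched share at 50%. The function `build_pairs`, its arguments, and the English example sentences are hypothetical.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str, int]   # (question, sentence, label)

def build_pairs(question: str, sentences: List[str],
                answer_idx: List[int], seed: int = 0) -> List[Pair]:
    """Form label-1 pairs from marked sentences and as many label-0 pairs."""
    answer_set = set(answer_idx)
    positives = [(question, sentences[i], 1) for i in sorted(answer_set)]
    unrelated = [s for i, s in enumerate(sentences) if i not in answer_set]
    # Draw as many unrelated sentences as there are positives, if available.
    k = min(len(positives), len(unrelated))
    negatives = [(question, s, 0) for s in random.Random(seed).sample(unrelated, k)]
    return positives + negatives

question = "What was found in the milk powder?"
sentences = [
    "Melamine was found in the milk powder.",
    "The company was founded in 1990.",
    "The weather was sunny on the day of the inspection.",
]
pairs = build_pairs(question, sentences, answer_idx=[0])
```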

For example, the Lucene inverted index system is used to index all 7053 food safety articles for subsequent retrieval. To expand the number of question answer pairs, the Lucene inverted index system is used to find the N articles in the document corpus with the highest matching degree with the question, and all sentences in those N articles that contain the required answer are taken as answer candidate sentences for the given question and paired with it; in addition, irrelevant sentences in the documents are selected at random and form non-matching question answer pairs with the manually posed questions. Starting from a corpus of 899 manually labeled question answer pairs, 9850 question answer pairs are obtained with Lucene from the corpus of 7053 food safety articles. Next, the jieba word segmentation model is used to segment the sentences of 170,000 food safety news reports, which serve as the basic data for training the word vector model; this mines the semantics of food safety vocabulary as far as possible and improves the system's adaptability to the food safety domain. The Skip-Gram model in word2vec is used with a word vector dimension of 20, a window value of 5, and an alpha value of 0.01. Training on the segmented news documents yields a vector representation of each word, with words of similar meaning mapped to nearby positions in the vector space. For the question text and answer candidate sentence text in all question answer pairs, the segmented text is converted into word vectors with this word2vec-based food safety word vector model for subsequent use.
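To make the role of the window value concrete, the sketch below generates the (center word, context word) pairs that a Skip-Gram model is trained on. It is a simplified, fixed-window illustration only: it does not train vectors, real word2vec implementations add sampling details omitted here, and the function name `skipgram_pairs` and the English tokens are hypothetical stand-ins for jieba-segmented text.

```python
from typing import List, Tuple

def skipgram_pairs(tokens: List[str], window: int = 5) -> List[Tuple[str, str]]:
    """List every (center, context) pair within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                      # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical segmented sentence; window=1 keeps the output small.
tokens = ["milk", "powder", "melamine", "recall"]
pairs = skipgram_pairs(tokens, window=1)
```

With the window value of 5 given in the text, each word would be paired with up to five neighbors on each side.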

Here, Lucene is an open-source full-text search engine toolkit. It is not a complete full-text search engine but a full-text search engine architecture that provides a complete query engine and index engine and a partial text analysis engine. The purpose of Lucene is to give software developers a simple, easy-to-use toolkit with which to implement full-text retrieval in a target system, or to build a complete full-text search engine on top of it.

According to the embodiment of the invention, the Lucene search engine with stronger universality is adopted, the original document is initially searched, and the article with the highest matching degree is locked, so that the accuracy of problem search is effectively improved.
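The idea of inverted-index retrieval can be illustrated with a toy example. This is not Lucene's API or its scoring formula: the sketch simply maps each term to the articles containing it and picks the article that shares the most distinct query terms, and all names (`build_index`, `best_article`, the article ids) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set

def build_index(docs: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each term to the set of article ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, tokens in docs.items():
        for tok in tokens:
            index[tok].add(doc_id)
    return index

def best_article(index: Dict[str, Set[str]], query_tokens: List[str]) -> Optional[str]:
    """Return the article id sharing the most distinct query terms."""
    counts: Dict[str, int] = defaultdict(int)
    for tok in set(query_tokens):
        for doc_id in index.get(tok, ()):
            counts[doc_id] += 1
    if not counts:
        return None
    # Sorting the ids first makes ties resolve deterministically.
    return max(sorted(counts), key=lambda d: counts[d])

docs = {
    "a1": ["milk", "powder", "melamine"],
    "a2": ["pork", "clenbuterol"],
    "a3": ["milk", "recall"],
}
index = build_index(docs)
candidate = best_article(index, ["melamine", "milk"])
```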

Based on any one of the foregoing embodiments, the inputting the to-be-answered question into a pre-trained question-answering judgment model, and obtaining a question matching result output by the question-answering judgment model specifically includes:

The matching degree calculation performed by the problem judgment model specifically includes:

Specifically, for a new question posed by a user, all food safety documents are searched using the Lucene inverted index, and the article with the highest matching degree with the question text is taken as the candidate article. All sentences in the candidate article are traversed as candidate sentences and paired with the user's question to form question answer pairs. The LSTM model trained in the above embodiment calculates the matching degree, and sentences are extracted in descending order of the probability that they answer the question and given as the answers of the LSTM model. In connection with the foregoing embodiments, the workflow of the automatic food safety question-answering system is shown in fig. 4.
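The ranking step above can be sketched as follows. The scoring function is passed in as a parameter; in the embodiment it would be the trained LSTM model, whereas here a simple word-overlap score stands in purely so the example runs. The names `rank_sentences` and `overlap_score` and the example sentences are hypothetical.

```python
from typing import Callable, List, Tuple

def rank_sentences(question: str, sentences: List[str],
                   score: Callable[[str, str], float],
                   top_s: int = 1) -> List[Tuple[float, str]]:
    """Score every candidate sentence and return the top_s, best first."""
    scored = [(score(question, s), s) for s in sentences]
    scored.sort(key=lambda item: item[0], reverse=True)   # high to low
    return scored[:top_s]

def overlap_score(question: str, sentence: str) -> float:
    """Stand-in for the LSTM matching degree: share of question words found."""
    q = set(question.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) / len(q) if q else 0.0

question = "which product was recalled"
candidates = [
    "the company issued a statement",
    "the milk product was recalled on monday",
]
top = rank_sentences(question, candidates, overlap_score, top_s=1)
```

Keeping the scorer as a parameter means the same ranking code applies whether the matching degree comes from the LSTM model or from a baseline.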

The number of candidate articles selected affects the overall accuracy of answer sentence extraction. The experimental result is shown in fig. 5: the highest accuracy is obtained by selecting the single candidate article with the highest matching degree.

In addition, the LSTM-based question-answering system is compared in accuracy with a traditional question-answering system based on document retrieval (such as the Lucene system). From the 899 manually labeled question answer pairs, 80 are taken at random as a test set and expanded into 33355 question answer pairs by extracting different numbers of articles. The LSTM-based and Lucene-based question-answering systems are then compared in terms of the number of articles selected (N) and the number of candidate sentences selected per question (S). The results are shown in table 1:

TABLE 1

As the two comparisons show, the average accuracy of the LSTM model matching method combined with Lucene is 7 to 8 percent higher than that of retrieval with Lucene alone (the traditional retrieval method based on string matching).
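The text does not define how accuracy is computed for a given S. One plausible reading, assumed here and not confirmed by the source, is that a question counts as answered when at least one of its top S candidate sentences is a correct answer. The sketch below implements that assumed metric; `top_s_accuracy` and the sentence ids are hypothetical.

```python
from typing import List, Set

def top_s_accuracy(ranked: List[List[str]], gold: List[Set[str]], s: int) -> float:
    """Fraction of questions with a correct sentence among the top s candidates.

    Assumed definition; the source does not state the exact metric.
    """
    hits = sum(1 for r, g in zip(ranked, gold) if any(x in g for x in r[:s]))
    return hits / len(ranked) if ranked else 0.0

# Hypothetical ranked candidates and correct answers for two questions.
ranked = [["s1", "s2"], ["s3", "s4"]]
gold = [{"s2"}, {"s3"}]
acc_at_1 = top_s_accuracy(ranked, gold, s=1)
acc_at_2 = top_s_accuracy(ranked, gold, s=2)
```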

The embodiment of the invention adopts the LSTM model matching method combined with Lucene, and the LSTM model shows a relatively strong ability to find the correct answer to a given question. This demonstrates that the LSTM-based implementation can realize a relatively complete automatic food safety question-answering system, and that it effectively improves the efficiency and accuracy of obtaining answers from massive food safety information.

Based on any of the above embodiments, the question answer pair is in the format of a question-answer-tag value.

Here, the formats of the question answer pairs all adopt the question-answer-label values uniformly, so that the matching relation between the questions and the answers and the corresponding label values is intuitively reflected.
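A question-answer-label record could be represented as in the sketch below. The tab-separated serialization and the names `QAPair`, `to_line` and `from_line` are assumptions for illustration; the embodiment only states the three-field order, not a file format.

```python
from typing import NamedTuple

class QAPair(NamedTuple):
    question: str
    answer: str
    label: int      # 1 = sentence contains the answer, 0 = it does not

def to_line(pair: QAPair) -> str:
    """Serialize as question<TAB>answer<TAB>label (assumed delimiter)."""
    return "\t".join([pair.question, pair.answer, str(pair.label)])

def from_line(line: str) -> QAPair:
    """Parse one serialized line back into a QAPair."""
    question, answer, label = line.rstrip("\n").split("\t")
    return QAPair(question, answer, int(label))

pair = QAPair("What was found in the milk powder?",
              "Melamine was found in the milk powder.", 1)
line = to_line(pair)
restored = from_line(line)
```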

Fig. 6 is a system structure diagram for obtaining answers to food safety questions based on LSTM according to an embodiment of the present invention, as shown in fig. 6, including: an acquisition module 61 and a processing module 62; wherein:

the obtaining module 61 is configured to obtain a question to be answered; the processing module 62 is configured to input the question to be answered to a pre-trained question-answer judging model, and obtain a question matching result output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set.

The system provided by the embodiment of the present invention is used for executing the corresponding method, and the specific implementation manner of the system is consistent with the implementation manner of the method, and the related algorithm flow is the same as the algorithm flow of the corresponding method, which is not repeated here.
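The two-module structure can be sketched as a pair of small classes. The class and method names are hypothetical, and the judging model is represented by any callable that maps a question to a matching result; a stub is used here so the example runs without a trained model.

```python
from typing import Callable

class AcquisitionModule:
    """Acquires the question to be answered (counterpart of module 61)."""
    def acquire(self, raw_input: str) -> str:
        return raw_input.strip()

class ProcessingModule:
    """Feeds the question to the judging model (counterpart of module 62)."""
    def __init__(self, judging_model: Callable[[str], str]):
        self.judging_model = judging_model

    def process(self, question: str) -> str:
        return self.judging_model(question)

# Stub standing in for the pre-trained question-answer judging model.
def stub_model(question: str) -> str:
    return "matched sentence for: " + question

acquisition = AcquisitionModule()
processing = ProcessingModule(stub_model)
result = processing.process(acquisition.acquire("  which product was recalled  "))
```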

According to the embodiment of the invention, an LSTM model is trained on food safety question answer pairs, which addresses the low retrieval accuracy of search engines for food safety events, so that answers to food safety questions can be effectively screened from a news corpus.

Fig. 7 illustrates a physical schematic diagram of an electronic device, as shown in fig. 7, which may include: processor 710, communication interface (Communications Interface) 720, memory 730, and communication bus 740, wherein processor 710, communication interface 720, memory 730 communicate with each other via communication bus 740. Processor 710 may call logic instructions in memory 730 to perform the following method: acquiring a question to be answered; inputting the questions to be answered into a pre-trained question-answer judging model, and obtaining a question matching result output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set.

Further, the logic instructions in the memory 730 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.

In another aspect, embodiments of the present invention further provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method provided in the above embodiments, for example including: acquiring a question to be answered; inputting the question to be answered into a pre-trained question-answer judging model, and obtaining a question matching result output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set.

The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without inventive effort.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for obtaining answers to food safety questions based on LSTM, comprising:

acquiring a question to be answered;

inputting the questions to be answered into a pre-trained question-answer judging model, and obtaining a question matching result output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set;

the question-answer judging model is obtained through the following steps:

acquiring a sample set of answers to the food safety questions;

acquiring an LSTM model network structure as an initial model;

inputting the training set, the cross validation set and the test set into the initial model for training to obtain the question-answer judging model;

the method for obtaining the answer sample set of the food safety question specifically comprises the following steps:

acquiring a food safety original document;

constructing the word vector into the answer sample set of the food safety questions;

inputting the questions to be answered into a pre-trained question-answer judging model, and obtaining the question matching results output by the question-answer judging model, wherein the method specifically comprises the following steps of:

inputting the candidate question answer pair vector representation to the question-answer judging model, calculating the matching degree by the question judging model, and taking a sentence with the highest matching degree as the question matching result;

2. The LSTM based method of claim 1, wherein the manually asking questions based on the preset food safety event marks sentences with matching answers to the manually asking questions in the food safety original document, and the manually asking sentences with matching answers to the manually asking questions and the manually asking questions form matching question answer pairs, further comprising:

3. The method of claim 1 or 2, wherein the question-answer pair is in the form of a question-answer-label value.

4. An LSTM-based system for obtaining answers to food safety questions, comprising:

the acquisition module is used for acquiring the questions to be answered;

the processing module is used for inputting the questions to be answered into a pre-trained question-answer judging model and obtaining the question matching results output by the question-answer judging model; the question-answer judging model is obtained by training an LSTM model by using manually marked question answer pairs and corresponding matching labels based on a food safety question answer sample set;

the question-answer judging model is obtained through the following steps:

acquiring a sample set of answers to the food safety questions;

acquiring an LSTM model network structure as an initial model;

acquiring a food safety original document;

5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor performs the steps of the LSTM based method of obtaining answers to food safety questions as claimed in any one of claims 1 to 3.

6. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the LSTM based method of obtaining answers to food safety questions as claimed in any one of claims 1 to 3.