CN106126596A - A question-answering method based on a hierarchical memory network - Google Patents

A question-answering method based on a hierarchical memory network

Info

Publication number
CN106126596A
CN106126596A (application CN201610447676.4A)
Authority
CN
China
Prior art keywords
word
sentence
granularity
memory unit
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610447676.4A
Other languages
Chinese (zh)
Other versions
CN106126596B (en)
Inventor
许家铭
石晶
姚轶群
徐波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN201610447676.4A priority Critical patent/CN106126596B/en
Publication of CN106126596A publication Critical patent/CN106126596A/en
Application granted granted Critical
Publication of CN106126596B publication Critical patent/CN106126596B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/332 - Query formulation
    • G06F16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a question-answering method based on a hierarchical memory network. Sentence-granularity memory coding is performed first and, under the stimulus of the question semantic code, information reasoning over the sentence-granularity memory unit is completed through a multi-round iterative attention mechanism. Sentences are then screened by k-max sampling, and word-granularity memory coding is performed on top of the sentence-granularity memory coding, i.e. memory coding is carried out at two levels to form a hierarchical memory code. The output word probability distribution is jointly predicted from the sentence-granularity and word-granularity memory units, which improves the accuracy of automatic question answering and effectively solves the answer selection problem for low-frequency words and unregistered words.

Description

Question-answering method based on hierarchical memory network
Technical Field
The invention relates to the technical field of automatic question-answering system construction, in particular to an end-to-end question-answering method based on a hierarchical memory network.
Background
Automatic question answering has long been one of the most challenging tasks in natural language processing; it requires a deep understanding of the text and the screening of candidate answers as system responses. The conventional methods currently available include: independently training each module of the text processing pipeline and then fusing their outputs; and constructing a large-scale structured knowledge base and performing information reasoning and answer prediction on top of it. In recent years, end-to-end systems based on deep learning methods have been widely used to solve various tasks without manually constructing features and without tuning individual modules separately.
A question-answering system can be roughly divided into two steps: first, the relevant semantic information is located (the "activation phase"), and then a response is generated based on that information (the "generation phase"). Recently, neural memory network models have achieved good results on question-answering tasks. However, the biggest shortcoming of these models is that they adopt a memory unit with a single level of sentence granularity and therefore cannot handle low-frequency words or unknown words well. Moreover, to reduce the time complexity of the model, the dictionary size usually has to be reduced. In that case, existing end-to-end neural network models cannot reliably select low-frequency or unknown words as answer output. That is, when the target answer word lies outside the training dictionary, existing methods cannot select the accurate answer as the model output in the online testing stage. Take the following dialogue text as an example:
1. May I ask your name, sir?
2. Uh, my name is Williamson.
3. Please tell me your passport number.
4. It is 577838771.
5. And your phone number?
6. The number is 0016178290851.
Assuming that "williamson", "577838771" and "0016178290851" are low frequency words or unknown words, none of these methods can select accurate user information from the dialog text if the conventional methods discard these words or collectively replace them with "unk" symbols. However, in practical applications, most answer information comes from low-frequency words or long-tail words, and how to design an answer selection method capable of effectively solving the problem of unknown words is a task urgently needed in the field of the automatic question and answer system at present.
Disclosure of Invention
Technical problem to be solved
In order to solve the problems in the prior art, the invention provides a question-answering method based on a hierarchical memory network.
(II) technical scheme
The invention provides a question-answering method based on a hierarchical memory network, which comprises the following steps: step S101: integrating the position of a word and the time sequence information of a sentence, and performing sentence granularity memory coding on the sentence in the sentence set to obtain a double-channel memory coding of a sentence granularity memory unit; step S102: under the stimulation of problem semantic coding, completing information reasoning of the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit; step S103: k maximum sampling is carried out on the information inference result of the sentence granularity memory unit, and a k maximum sampling important sentence set is screened out from the sentence set; step S104: performing word granularity memory coding on the sentence set by using a bidirectional cyclic neural network model to obtain memory coding of a word granularity memory unit; step S105: obtaining word granularity output word probability distribution through an attention mechanism based on the problem semantic code, the memory code of the word granularity memory unit and the k maximum sampling important sentence set; and step S106: and jointly predicting the probability distribution of output words from the sentence granularity and word granularity memory units, and performing supervision training by using the cross entropy.
(III) advantageous effects
According to the technical scheme, the question-answering method based on the hierarchical memory network has the following beneficial effects:
(1) the method firstly carries out sentence granularity memory coding, completes information reasoning of a sentence granularity memory unit through a multi-round iterative attention mechanism under the stimulation of question semantic coding, can improve the accuracy and timeliness of automatic question answering, and is favorable for answer selection of low-frequency words and unknown words;
(2) sentences are screened through k maximum sampling, so that the automatic question answering efficiency can be improved, and the calculation complexity is reduced;
(3) word granularity memory coding is carried out on the basis of sentence granularity memory coding, namely memory coding is carried out on two layers to form hierarchical memory coding, so that the accuracy of automatic question answering can be further improved;
(4) when the cyclic neural network is used for word granularity memory coding, the operation is carried out on the full sentence set X, the method can introduce context environment semantic information of words in the full sentence set in the word granularity memory coding process, and can improve the accuracy and timeliness of automatic question answering;
(5) the attention mechanism on the word granularity memory unit is operated on the word granularity memory unit subset after k sampling, so that interference information in memory coding is avoided, and the calculated amount of the word granularity attention mechanism is reduced;
(6) the sentence granularity and word granularity memory unit is used for jointly predicting the probability distribution of output words, so that the accuracy of automatic question answering can be further improved, and the answer selection problem of low-frequency words and unknown words is effectively solved.
Drawings
FIG. 1 is a flowchart of a question answering method based on a hierarchical memory network according to an embodiment of the present invention;
FIG. 2 is a block diagram of a hierarchical memory network-based question-answering method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating sentence-granularity memory coding and information inference based on sentence-granularity memory coding according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating word granularity memory coding and attention activation based on the word granularity memory coding according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the performance of the question-answering method based on the hierarchical memory network according to an embodiment of the present invention;
FIG. 6 is another schematic diagram of the performance of the question-answering method based on the hierarchical memory network according to an embodiment of the present invention.
Detailed Description
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
The invention discloses a question-answering method based on a hierarchical memory network, which is based on an end-to-end model of a whole neural network structure, can realize information reasoning, screening and word granularity selection in a sentence set, and effectively solves the problem of answer selection of a question-answering system under big data on low-frequency words or unknown words. The question-answering method of the invention respectively carries out two hierarchical memory codes on a sentence set with time sequence information, which respectively are as follows: sentence granularity memory coding and word granularity memory coding. And then carrying out information reasoning, screening and activation based on the hierarchical memory network, and jointly predicting the probability distribution of the candidate answer words.
The question-answering method firstly carries out sentence-vectorization memory coding on a sentence set through a hierarchical memory network, considers the position information of words in sentences and the sequence time information of the sentences in the sentence set, then completes the information reasoning of a sentence granularity memory unit through a multi-round iterative attention mechanism, carries out k maximum sampling based on the reasoning result and screens out important sentence information. And then, carrying out word granularity sequence coding on the sentence set by using a bidirectional circulation network model, carrying out information activation of a word granularity memory unit from the screened information through an attention mechanism, finally predicting output word probability distribution from the sentence granularity memory unit and the word granularity memory unit respectively, carrying out joint supervision training through Softmax, and learning an end-to-end automatic question-answering model.
The question-answering method based on the hierarchical memory network as an embodiment of the invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a question-answering method based on a hierarchical memory network according to an embodiment of the present invention, and referring to fig. 1, the question-answering method includes:
step S101: and fusing the position of the word and the time sequence information of the sentence, and performing sentence granularity memory coding on the sentence in the sentence set to obtain the double-channel memory coding of the sentence granularity memory unit.
Referring to fig. 3, step S101 includes:
sub-step S101 a: and carrying out double-channel word vector mapping on the sentences with the time sequence information in the sentence set to obtain double-channel word vectorization codes of the sentences.
The sub-step S101a includes: given a sentence set with time-sequence information X = {x_i}, i = 1, 2, ..., n, where i is the current time index of a sentence and n is the maximum time-sequence length of the sentence set, randomly initialize two word vector matrices A ∈ R^(|V|×d) and C ∈ R^(|V|×d), where |V| is the dictionary dimension and d is the dimension of the word vector; A and C are randomly initialized from a normal distribution with standard deviation 0.1 and mean 0. Two-channel word vector mapping is performed on each sentence x_i in the sentence set X, so that the word x_ij in sentence x_i obtains the two-channel vectorized codes A x_ij and C x_ij, where j is the position information of the word in sentence x_i.
Sub-step S101 b: and updating the two-channel word vectorization codes according to the position information of the words in the sentences.
The sub-step S101b includes: generate an update matrix l from the position information j of the words in the sentence and the word-vector dimension d, and update the two-channel word vectorized codes to l_gj · (A x_ij) and l_gj · (C x_ij), where:

l_gj = (1 - j/J_i) - (g/d)(1 - 2j/J_i)    (1)

where J_i is the number of words in sentence x_i, g is the current dimension index in the d-dimensional word vector, 1 ≤ j ≤ J_i, and 1 ≤ g ≤ d.
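For illustration, the position-weight matrix of formula (1) could be computed as in the following NumPy sketch; the function name and the array layout (rows indexed by word position, columns by vector dimension) are assumptions made for this example.

```python
import numpy as np

def position_weights(J, d):
    """Position-weight matrix l of formula (1) for a sentence of J words and
    d-dimensional word vectors; entry [j-1, g-1] holds l_gj."""
    j = np.arange(1, J + 1)[:, None]   # word positions 1..J
    g = np.arange(1, d + 1)[None, :]   # vector dimensions 1..d
    return (1 - j / J) - (g / d) * (1 - 2 * j / J)
```

For instance, position_weights(3, 4) yields the 3x4 weight matrix that rescales each dimension of each word vector of a three-word sentence.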
Sub-step S101 c: and merging the time sequence information of the sentences to perform sentence granularity memory coding on the sentences to obtain the double-channel memory coding of the sentence granularity memory unit.
The sub-step S101c includes: randomly initialize two sentence time-sequence vectorization matrices T_A ∈ R^(n×d) and T_C ∈ R^(n×d), where n is the maximum time-sequence length of the sentence set and d is the time-vector dimension, the same as the word-vector dimension; T_A and T_C are randomly initialized from a normal distribution with standard deviation 0.1 and mean 0. The two-channel memory code of the sentence-granularity memory unit is then M^(S) = {{a_i}, {c_i}}, where:

a_i = Σ_j l_j · (A x_ij) + T_A(i)    (2)

c_i = Σ_j l_j · (C x_ij) + T_C(i)    (3)

where l_j is the update vector of the j-th word in the update matrix l of sentence x_i, and the operator · denotes element-wise multiplication between vectors; for example, l_j · (A x_ij) in formula (2) is the element-wise product of the vector l_j and the vector A x_ij.
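As a worked illustration of formulas (2) and (3), the NumPy sketch below encodes one sentence into its two-channel memory code; the dictionary size, vector dimension, initialization and helper names are assumptions for the example, not the patented implementation itself.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, n = 1000, 100, 16             # assumed dictionary size, vector dim, max sentences
A  = rng.normal(0.0, 0.1, (V, d))   # word-embedding channel A
C  = rng.normal(0.0, 0.1, (V, d))   # word-embedding channel C
TA = rng.normal(0.0, 0.1, (n, d))   # temporal embedding T_A
TC = rng.normal(0.0, 0.1, (n, d))   # temporal embedding T_C

def encode_sentence(word_ids, i):
    """Two-channel memory code (a_i, c_i) of the i-th sentence, formulas (2)-(3)."""
    J = len(word_ids)
    j = np.arange(1, J + 1)[:, None]
    g = np.arange(1, d + 1)[None, :]
    l = (1 - j / J) - (g / d) * (1 - 2 * j / J)   # position weights, formula (1)
    a_i = (l * A[word_ids]).sum(axis=0) + TA[i]   # formula (2)
    c_i = (l * C[word_ids]).sum(axis=0) + TC[i]   # formula (3)
    return a_i, c_i

# Example: sentence i=0 containing the words with ids 5, 17 and 42
a0, c0 = encode_sentence([5, 17, 42], 0)
```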
Step S102: under the stimulus of the question semantic code, information reasoning of the sentence-granularity memory unit is completed through a multi-round iterative attention mechanism, obtaining the output word probability distribution in dictionary dimensions on the sentence-granularity memory unit.
Step S102 includes:
sub-step S102 a: and vectorizing and expressing the question text to obtain the semantic code of the question.
The sub-step S102a includes: using the word vector matrixFor the jth word q in the question text qjPerforming vectorized representationAnd updating the vectorized representation based on the position j of the word in the question text to obtain the semantic code of the question:
u 1 ( S ) = Σ j l j · ( Aq j ) - - - ( 4 )
the same as the formulas (2) and (3), ljTo update the matrix l in sentence xiThe update vector of the jth word in (b).
Sub-step S102 b: under the stimulation of problem semantic coding, information activation is carried out in the double-channel memory coding of the sentence granularity memory unit by using an attention mechanism;
the sub-step S102b includes: calculating attention weight of problem semantic codes in a sentence granularity memory unit by adopting a dot product mode:
α i ( S ) = s o f t max ( a i T u 1 ( S ) ) - - - ( 5 )
and under the stimulation of problem semantic coding, the activation information of the double-channel memory coding of the sentence-granularity memory unit is as follows:
sub-step S102 c: and finishing information reasoning on the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit.
The sub-step S102c includes: perform R rounds of information activation on the sentence-granularity memory unit, find the candidate sentence set, and obtain the activation information o_R of the R-th round, where in the (r+1)-th round of information activation:

u_(r+1)^(S) = o_r + u_r^(S)    (6)

α_i^(S) = softmax(a_i^T u_(r+1)^(S))    (7)

a_i = Σ_j l_j · (A_(r+1) x_ij) + T_A^(r+1)(i)    (8)

o_(r+1) = Σ_i α_i^(S) c_i    (9)

c_i = Σ_j l_j · (C_(r+1) x_ij) + T_C^(r+1)(i)    (10)

where 1 ≤ r ≤ (R-1). The (r+1)-th round of information activation uses independent word vector matrices A_(r+1) and C_(r+1) and time vector matrices T_A^(r+1) and T_C^(r+1) to vectorize the sentence set, with adjacent rounds sharing parameters as A_(r+1) = C_r; all of these matrices are randomly initialized from a normal distribution with standard deviation 0.1 and mean 0.
Information reasoning on the sentence-granularity memory unit is completed through the R-round iterative attention mechanism, and the output word probability distribution in dictionary dimensions on the sentence-granularity memory unit is obtained as:

p^(S)(w) = softmax((C_R)^T (o_R + u_R^(S)))    (11)

where w = {w_t}, t = 1, 2, ..., |V|, is the dictionary-dimension word set, C_R is the word vector matrix of the R-th round of information activation, and T denotes the transpose operator.
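The multi-round reasoning of formulas (5)-(11) could look roughly like the following NumPy sketch; for brevity it reuses a single pair of memory codes (a, c) across rounds instead of the per-round matrices A_(r+1) and C_(r+1), so it is a simplified approximation of the described procedure rather than the exact method.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sentence_reasoning(a, c, u1, C_R, R=3):
    """Multi-round attention over the sentence-granularity memory.

    a, c : (n, d) channel-A / channel-C sentence memory codes
    u1   : (d,)   question semantic code, formula (4)
    C_R  : (V, d) word-embedding matrix of the last round
    """
    u = u1
    for _ in range(R):
        alpha = softmax(a @ u)      # formulas (5)/(7): attention over sentences
        o = alpha @ c               # formula (9): activated information
        u = o + u                   # formula (6): query update
    p_sentence = softmax(C_R @ u)   # formula (11): dictionary-level distribution
    return p_sentence, alpha        # alpha is reused for k-max sampling (step S103)
```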
The invention firstly carries out sentence granularity memory coding, completes the information reasoning of the sentence granularity memory unit through a multi-round iterative attention mechanism under the stimulation of question semantic coding, can improve the accuracy and timeliness of automatic question answering, and is beneficial to the answer selection of low-frequency words and unknown words.
Step S103: k maximum sampling is carried out on the information reasoning result of the sentence granularity memory unit, and an important sentence set with k maximum sampling is screened out from the sentence set.
Step S103 includes:
sub-step S103 a: attention weight vector activated for R-th round information on sentence-granularity memory unitSelecting k largest attention weight subsets
Sub-step S103 b: selecting k largest attention weight subsetsCorresponding sentence set as k maximum sampling important sentence setSentences in the important sentence setIs an important sentence.
According to the invention, the sentences are screened by k-max sampling, which improves the efficiency of automatic question answering, reduces the computational complexity, and facilitates answer selection for low-frequency words and unknown words.
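A minimal sketch of the k-max sampling of step S103, assuming the final-round attention weights produced by the previous sketch; the function name is illustrative only.

```python
import numpy as np

def k_max_sampling(alpha, k=1):
    """Indices of the k sentences with the largest attention weights (step S103)."""
    return np.argsort(alpha)[::-1][:k]

# Example: with alpha = [0.1, 0.7, 0.2] and k = 1, only sentence index 1 is kept
important = k_max_sampling(np.array([0.1, 0.7, 0.2]), k=1)
```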
Step S104: and performing word granularity memory coding on the sentence set by using the bidirectional cyclic neural network model to obtain the memory coding of the word granularity memory unit.
Referring to fig. 4, step S104 includes:
sub-step S104 a: and (4) encoding words in the important sentence set according to the time sequence by using the bidirectional circulation network model to obtain the hidden state of the bidirectional circulation network model. There are many existing models of the bidirectional loop network model, and this embodiment adopts one of them: gate cycle network model (GRU).
The sub-step S104a includes: use the gated recurrent unit (GRU) model to encode all the words {w_t}, t = 1, 2, ..., |t|, in the sentence set X forward and backward in time order. For the word feature C_R w_t at time t, the forward GRU encoding produces the hidden state h_t^(fw) and the backward GRU encoding produces the hidden state h_t^(bw), where |t| is the maximum sequence length after arranging all words in the sentence set X in time order, the dimensions of h_t^(fw) and h_t^(bw) are the same as the word-vector dimension d, and C_R is the word vector matrix of the R-th round of information activation in the sentence-granularity memory unit.
Sub-step S104 b: and fusing the hidden states of the bidirectional circulation network model to obtain the memory code of the word granularity memory unit.
The sub-step S104b includes: directly add the hidden states of the bidirectional recurrent network model to obtain the memory code of the word-granularity memory unit, M^(W) = {m_t}, t = 1, 2, ..., |t|, where m_t = h_t^(fw) + h_t^(bw).
The invention uses the recurrent neural network to carry out word granularity memory coding, which is operated on the whole sentence set X, and the method can introduce the context environment semantic information of the words in the whole sentence set in the word granularity memory coding process, can improve the accuracy and timeliness of automatic question answering, and is beneficial to the answer selection of low-frequency words and unknown words.
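The word-granularity memory coding of step S104 could be illustrated as below with a minimal NumPy GRU; the class layout and the random weights are stand-ins for parameters that would be learned during training, so this is an assumed sketch rather than the exact model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MiniGRU:
    """Minimal single-layer GRU used only to illustrate step S104."""
    def __init__(self, d, seed=0):
        rng = np.random.default_rng(seed)
        self.Wz, self.Uz = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
        self.Wr, self.Ur = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
        self.Wh, self.Uh = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))

    def run(self, xs):
        h, states = np.zeros(xs.shape[1]), []
        for x in xs:                                   # xs: (|t|, d) word features
            z = sigmoid(self.Wz @ x + self.Uz @ h)     # update gate
            r = sigmoid(self.Wr @ x + self.Ur @ h)     # reset gate
            h_new = np.tanh(self.Wh @ x + self.Uh @ (r * h))
            h = (1 - z) * h + z * h_new
            states.append(h)
        return np.stack(states)

def word_memory(word_vecs):
    """M^(W): forward and backward hidden states added element-wise (sub-step S104b)."""
    d = word_vecs.shape[1]
    forward  = MiniGRU(d, seed=1).run(word_vecs)
    backward = MiniGRU(d, seed=2).run(word_vecs[::-1])[::-1]
    return forward + backward
```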
Step S105: based on problem semantic coding, memory coding of a word granularity memory unit and k maximum sampling important sentence sets, word granularity output word probability distribution is obtained through an attention mechanism.
Step S105 includes:
sub-step S105 a: calculating the attention weight of the word granularity memory unit according to the problem semantic code and the memory code of the word granularity memory unit;
the sub-step S105a includes: semantic coding based on problem in R-th round information activation process on sentence granularity memory unitMemory code M of word granularity memory unit(W)={mt}t=1,2,..,|t|)And k max sampling important sentence setObtaining the attention weight vector of the normalized word granularity memory unitWherein:
α t ( w ) = s o f t m a x ( v T tanh ( Wu R ( S ) + U m ^ t ) ) - - - ( 12 )
wherein,is k max sample important sentence setSet of words in (1)Corresponding word granularity memory code M(W)={mt}t=(1,2,...,|t|)A subset ofAttention weight vector α(W)Dimension of (2) and collection of important sentences in time sequenceAll the words inThe maximum sequence length of the arranged words is consistent, namely the maximum sequence length is Andall learning parameters are learning parameters, v, W and U are initialized randomly by adopting normal distribution with standard deviation of 0.1 and mean value of 0, and are updated in a training stage.
Sub-step S105b: obtain the word-granularity output word probability distribution from the attention weight of the word-granularity memory unit. In this embodiment of the invention, the normalized attention weight α^(W) of the word-granularity memory unit is directly used as the word-granularity output word probability distribution:

p^(W)(ŵ) = α^(W)    (13)

Here the word-granularity output word probability distribution has the same dimension as the attention weight, i.e. p^(W)(ŵ) is defined over ŵ = {ŵ_t}, t = 1, 2, ..., |t̂|, the set of all words in the important sentence set X̂.
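A minimal sketch of the word-granularity attention of formulas (12) and (13); v, W and U are drawn randomly here purely for illustration, whereas in the method they are learned parameters updated during training.

```python
import numpy as np

def word_attention(u_R, m_hat, seed=0):
    """Normalised word-granularity attention used directly as p^(W), formulas (12)-(13).

    u_R   : (d,)      question semantic code of the last reasoning round
    m_hat : (|t^|, d) word memory codes of the k-max sampled sentences
    """
    d = u_R.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.normal(0, 0.1, d)
    W = rng.normal(0, 0.1, (d, d))
    U = rng.normal(0, 0.1, (d, d))
    scores = np.tanh(m_hat @ U.T + W @ u_R) @ v   # v^T tanh(W u_R + U m_t) for every t
    e = np.exp(scores - scores.max())
    return e / e.sum()                            # p^(W)(w^) = alpha^(W)
```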
The invention also carries out word granularity memory coding on the basis of sentence granularity memory coding, namely, memory coding is carried out on two layers to form hierarchical memory coding, thereby further improving the accuracy of automatic question answering and being more beneficial to answer selection of low-frequency words and unknown words. Meanwhile, the attention mechanism on the word granularity memory unit is operated on the word granularity memory unit subset after k sampling, so that interference information in memory coding is avoided, and the calculation amount of the word granularity attention mechanism is reduced.
Step S106: and jointly predicting the probability distribution of output words from the sentence granularity and word granularity memory units, and performing supervision training by using the cross entropy.
Step S106 includes:
sub-step S106 a: performing output word joint prediction based on the output word probability distribution and the word granularity output word probability distribution of the dictionary dimension on the sentence granularity memory unit, wherein the joint prediction output word distribution p (w) has the expression:
p ( w ) = p ( S ) ( w ) + p ( W ) ( w ) = p ( S ) ( w ) + t r a n s ( p ( W ) ( w ^ ) ) - - - ( 14 )
where trans (·) denotes the word granularity of the subset to output the word probability distributionWord granularity output word probability distribution mapped to dictionary dimension corpusThe mapping operation particularly refers to a probability distribution of the output wordsMiddle probability value according to its corresponding word subsetDictionary dimensional word corpus of words in (1)The position in the full set is mapped with probability value, if some words in the full set do not appear in the sub-set, the output probability is set as 0, and the output probability distribution of the mapped words is obtained
Sub-step S106 b: and performing cross entropy supervision training on the distribution of the joint prediction output words by using the distribution of the target answer words. And given that the target answer word distribution of the training set is y, performing joint optimization based on a cross entropy function of the target answer word distribution y and the joint prediction output word distribution p (w).
In an exemplary embodiment of the invention, the objective function of the joint optimization is optimized by error back-propagation with stochastic gradient descent. The optimized parameters include the word vector matrices {A_r} and {C_r}, r = 1, 2, ..., R, and the time vector matrices T_A^r and T_C^r of the sentence-granularity memory unit, the full parameter set {θ_GRU} of the bidirectional GRU model used in the word-granularity memory coding process, and v, W and U in the attention weight of the word-granularity memory unit (formula (12)).
The method jointly predicts the probability distribution of output words in the sentence granularity and word granularity memory unit, can further improve the accuracy of automatic question answering, and is more favorable for answer selection of low-frequency words and unknown words.
Fig. 2 is a schematic diagram of a framework of a question-answering method based on a hierarchical memory network according to an embodiment of the present invention. Referring to fig. 2, the question-answering method based on the hierarchical memory network has two layers of memory network units, which are respectively:
Memory unit 1: the sentence set is encoded and memorized at sentence granularity in time order;
Memory unit 2: all words in the sentence set are encoded and memorized at word granularity in time order.
Between the different memory-unit layers, k-max sampling is used to screen and filter the important information.
The information processing stage of the model has two information activation mechanisms:
Activation mechanism 1: a reasoning mechanism is used for information activation on the sentence-granularity memory unit;
Activation mechanism 2: an attention mechanism is used for word selection on the word-granularity memory unit.
The whole model training stage is guided by two kinds of supervision information:
Supervision signal 1: the sentence-granularity memory unit decodes the output vector after information reasoning and fits the target word information through a Softmax output;
Supervision signal 2: the word-granularity memory unit fits the target word information through attention-mechanism activation followed by a Softmax output.
In order to accurately evaluate the response performance of the method for automatic question answering, performance is compared by counting error samples, i.e. samples for which the answer word selected and output by the model differs from the ground-truth answer word in the data.
TABLE 1
Data field | Training/testing question-answer pairs | Dictionary size (whole/training/testing) | Unregistered target words (percentage)
Airline ticket booking | 7,000/7,000 | 10,682/5,612/5,618 | 5,070 (72.43%)
The experiments of the invention use a Chinese air-ticket-booking text data set containing 2,000 complete dialogue histories and 14,000 question-answer pairs, split into a training set and a test set at a ratio of 5:5. The invention performs no preprocessing (including stop-word removal and stemming) on these text data. Specific statistics of the data set are shown in Table 1; unregistered target words account for 72.43% of the test set, which has a considerable impact on conventional model training.
The following comparative methods were used in the experiments of the invention:
the first comparison method comprises the following steps: the method is based on a pointer network model of an attention mechanism, all words in a sentence set are regarded as a long sentence according to a time sequence for coding, and answers are generated by directly utilizing the attention mechanism of question and word coding;
and a second comparison method comprises the following steps: the neural memory network model is used for carrying out sentence granularity coding on a sentence set, and carrying out answer matching on a full dictionary space directly by using information obtained after semantic activation is carried out on a coding vector of a question.
The parameters used in the experiments of the present invention are set as shown in table 2:
TABLE 2
n | d | R | k | lr | bs
16 | 100 | 3 | 1 | 0.01 | 10
In Table 2, the parameter n is the maximum sentence time-sequence length of the experimental data's sentence set, d is the word vector dimension and the hidden-layer coding dimension, R is the number of iterations of the reasoning mechanism on the sentence-granularity memory unit, k is the maximum sampling number between the different memory layers, lr is the learning rate used for model parameter optimization with stochastic gradient descent, and bs is the number of samples per batch during model training.
In the experiment of the invention, 15 rounds of iterative training are carried out, all the methods are converged as shown in fig. 5, and the final converged experiment result is shown in table 3:
TABLE 3
Method | Number of wrong samples
Comparison method 1 | 109
Comparison method 2 | 56
The method of the invention | 0
FIG. 5 and Table 3 show the evaluation results for the number of wrong samples on the data set for the method of the invention, comparison method 1 and comparison method 2. The experimental results show that the convergence speed of the method of the invention is clearly superior to the other methods, and the final convergence results in Table 3 show that it is significantly better than the other methods, completely solving the answer selection problem on the unregistered-word set and reaching 100% accuracy.
Meanwhile, the invention verifies how the maximum sampling number k used for information screening between the hierarchical memory units affects the number of wrong samples in the answer selection problem; the experimental results are shown in FIG. 6 and Table 4. It can be seen that when the maximum sampling number is 1, both the convergence speed and the final convergence result of the method are optimal, which further illustrates the importance of information selection between the hierarchical memory units.
TABLE 4
Maximum number of samples | Number of wrong samples
3 | 5
2 | 4
1 | 0
So far, the embodiments of the present invention have been described in detail with reference to the accompanying drawings. From the above description, those skilled in the art should clearly recognize that the present invention is a question-answering method based on a hierarchical memory network.
The invention relates to a question-answering method based on a hierarchical memory network, which comprises the steps of firstly carrying out sentence granularity memory coding, finishing information reasoning of a sentence granularity memory unit through a multi-round iterative attention mechanism under the stimulation of question semantic coding, improving the accuracy and timeliness of automatic question-answering, and facilitating the answer selection of low-frequency words and unknown words; the sentences are screened through the k maximum sampling, so that the efficiency of automatic question answering can be improved, the calculation complexity is reduced, word granularity memory coding is performed on the basis of sentence granularity memory coding, namely, memory coding is performed on two levels to form hierarchical memory coding, and the accuracy of automatic question answering can be further improved; when the cyclic neural network is used for word granularity memory coding, the operation is carried out on the full sentence set X, the method can introduce context environment semantic information of words in the full sentence set in the word granularity memory coding process, and can improve the accuracy and timeliness of automatic question answering; the attention mechanism on the word granularity memory unit is operated on the word granularity memory unit subset after k sampling, so that interference information in memory coding is avoided, and the calculated amount of the word granularity attention mechanism is reduced; the sentence granularity and word granularity memory unit is used for jointly predicting the probability distribution of output words, so that the accuracy of automatic question answering can be further improved, and the answer selection problem of low-frequency words and unknown words is effectively solved.
It is to be noted that, in the attached drawings or in the description, the implementation modes not shown or described are all the modes known by the ordinary skilled person in the field of technology, and are not described in detail. In addition, the above definitions of the respective elements are not limited to the various manners mentioned in the embodiments, and those skilled in the art may easily modify or replace them, for example:
(1) directional phrases used in the embodiments, such as "upper", "lower", "front", "rear", "left", "right", etc., refer only to the orientation of the attached drawings and are not intended to limit the scope of the present invention;
(2) the embodiments described above may be mixed and matched with each other or with other embodiments based on design and reliability considerations, i.e. technical features in different embodiments may be freely combined to form further embodiments.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A question-answering method based on a hierarchical memory network is characterized by comprising the following steps:
step S101: integrating the position of a word and the time sequence information of a sentence, and performing sentence granularity memory coding on the sentence in the sentence set to obtain a double-channel memory coding of a sentence granularity memory unit;
step S102: under the stimulation of problem semantic coding, completing information reasoning of the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit;
step S103: k maximum sampling is carried out on the information inference result of the sentence granularity memory unit, and a k maximum sampling important sentence set is screened out from the sentence set;
step S104: performing word granularity memory coding on the sentence set by using a bidirectional cyclic neural network model to obtain memory coding of a word granularity memory unit;
step S105: obtaining word granularity output word probability distribution through an attention mechanism based on the problem semantic code, the memory code of the word granularity memory unit and the k maximum sampling important sentence set; and
step S106: and jointly predicting the probability distribution of output words from the sentence granularity and word granularity memory units, and performing supervision training by using the cross entropy.
2. The question-answering method according to claim 1, characterized in that said step S101 comprises:
sub-step S101 a: given a set of sentences with time series information X ═ { X ═ Xi}i=(1,2,...,n)Randomly initializing a word vector matrixAndsentence xiWord x inijIs a two-channel vectorized code ofAnd
wherein i is the current time series of sentences; n is the maximum time series length of the sentence set; | V | is a dictionary dimension; d is the dimension of the word vector; j is a word in the sentence xiThe location information in (1);
sub-step S101 b: updating the two-channel word vectorization codes according to the position information of the words in the sentences; and
sub-step S101 c: and merging the time sequence information of the sentences to perform sentence granularity memory coding on the sentences to obtain the double-channel memory coding of the sentence granularity memory unit.
3. The question-answering method according to claim 2, characterized in that said sub-step S101b comprises:
the updated two-channel word vectorized codes are l_gj · (A x_ij) and l_gj · (C x_ij), wherein

l_gj = (1 - j/J_i) - (g/d)(1 - 2j/J_i)    (1)

wherein J_i is the number of words in sentence x_i, g is the current dimension index in the d-dimensional word vector, 1 ≤ j ≤ J_i, and 1 ≤ g ≤ d.
4. The question-answering method according to claim 3, characterized in that said sub-step S101c comprises:
randomly initialize the sentence time vector matrices T_A ∈ R^(n×d) and T_C ∈ R^(n×d); the two-channel memory code of the sentence-granularity memory unit is M^(S) = {{a_i}, {c_i}}, wherein

a_i = Σ_j l_j · (A x_ij) + T_A(i)    (2)

c_i = Σ_j l_j · (C x_ij) + T_C(i)    (3)

wherein l_j is the update vector of the j-th word in the update matrix l of sentence x_i; the operator · denotes element-wise multiplication between vectors; n is the maximum time-sequence length of the sentence set; d is the time vector dimension, the same as the dimension of the word vector.
5. The question-answering method according to claim 4, characterized in that said step S102 comprises:
sub-step S102 a: using word vector matricesFor the jth word q in the question text qjPerforming vectorized representationObtaining problem semantic codes:
u 1 ( S ) = Σ j l j · ( Aq j ) - - - ( 4 )
wherein ljTo update the matrix l in sentence xiAn update vector of the jth word;
sub-step S102 b: calculating attention weight of problem semantic code in sentence-granularity memory unit
α i ( S ) = s o f t m a x ( a i T u 1 ( S ) ) - - - ( 5 )
Under the stimulation of problem semantic coding, the activation information of the double-channel memory coding of the sentence granularity memory unit is as follows:and
sub-step S102 c: and finishing information reasoning on the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit.
6. The question-answering method according to claim 5, characterized in that said sub-step S102c comprises:
performing R round information activation on the sentence granularity memory unit to obtain the activation information O of the R roundRWherein, in the r +1 th round of information activation,
u r + 1 ( S ) = o r + u r ( S ) - - - ( 6 )
α i ( S ) = s o f t max ( a i T u r + 1 ( S ) ) - - - ( 7 )
a i = Σ j l j · ( A r + 1 x i j ) + T A r + 1 ( i ) - - - ( 8 )
o r + 1 = Σ i α i ( S ) c i - - - ( 9 )
c i = Σ j l j · ( C r + 1 x i j ) + T C r + 1 ( i ) - - - ( 10 )
wherein R is more than or equal to 1 and less than or equal to (R-1); a. ther+1=Cr
The probability distribution of output words in dictionary dimensions on the sentence granularity memory unit is as follows:
p ( S ) ( w ) = s o f t m a x ( ( C R ) r ( o R + u R ( S ) ) ) - - - ( 11 )
wherein w ═ { w ═ wt}t=(1,2,...,|V|)A dictionary dimension word set is obtained;a word vector matrix activated for the R-th round of information; t is a transpose operator.
7. The question-answering method according to claim 6, characterized in that said step S103 comprises:
sub-step S103 a: attention weight vector activated for R-th round information on sentence-granularity memory unitSelecting k largest attention weight subsetsAnd
sub-step S103 b: selecting k largest attention weight subsetsCorresponding sentence set as k maximum sampling important sentence set
8. The question-answering method according to claim 7, characterized in that said step S104 comprises:
sub-step S104 a: respectively aligning all words in sentence set X by using gate cycle network modelForward and backward encoding is carried out according to time sequence, and the hidden state of forward GRU encoding is that for the word characteristics at the time tThe hidden state of backward GRU coding is
Wherein, | t | is the maximum sequence length of words after arranging all words in the sentence set X according to the time sequence;andis the same as the dimension d of the word vector;
sub-step S104 b: obtaining the memory code M of the word granularity memory unit(W)={mt}t=(1,2,...,|t|)Wherein
9. The question-answering method according to claim 8, characterized in that said step S105 comprises:
sub-step S105 a: computing an attention weight vector for the normalized word granularity memory unitWherein:
α t ( W ) = s o f t m a x ( v T tanh ( Wu R ( S ) + U m ^ t ) ) - - - ( 12 )
wherein,is k max sample important sentence setSet of words in (1)Corresponding word granularity memory code M(W)={mt}t=(1,2,...,|t|)A subset ofAttention weight vector α(W)Has the dimension of Andis a learning parameter;
sub-step S105 b: word granularity output word probability distributionComprises the following steps:
p ( W ) ( w ^ ) = α ( W ) - - - ( 13 )
wherein, for the set of all words in the set of important sentences
10. The question-answering method according to claim 9, characterized in that said step S106 comprises:
sub-step S106 a: performing output word joint prediction based on the output word probability distribution and the word granularity output word probability distribution of the dictionary dimension on the sentence granularity memory unit, wherein the joint prediction output word distribution p (w) has the expression:
p ( w ) = p ( S ) ( w ) + p ( W ) ( w ) = p ( S ) ( w ) + t r a n s ( p ( W ) ( w ^ ) ) - - - ( 14 )
where trans (·) denotes the word granularity of the subset to output the word probability distributionWord granularity output word probability distribution mapped to dictionary dimension corpus
Sub-step S106 b: and performing cross entropy supervision training on the distribution of the joint prediction output words by using the distribution of the target answer words.
CN201610447676.4A 2016-06-20 2016-06-20 A question-answering method based on a hierarchical memory network Active CN106126596B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610447676.4A CN106126596B (en) 2016-06-20 2016-06-20 A question-answering method based on a hierarchical memory network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610447676.4A CN106126596B (en) 2016-06-20 2016-06-20 A question-answering method based on a hierarchical memory network

Publications (2)

Publication Number Publication Date
CN106126596A true CN106126596A (en) 2016-11-16
CN106126596B CN106126596B (en) 2019-08-23

Family

ID=57470348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610447676.4A Active CN106126596B (en) 2016-06-20 2016-06-20 A question-answering method based on a hierarchical memory network

Country Status (1)

Country Link
CN (1) CN106126596B (en)

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776578A (en) * 2017-01-03 2017-05-31 竹间智能科技(上海)有限公司 Talk with the method and device of performance for lifting conversational system
CN106778014A (en) * 2016-12-29 2017-05-31 浙江大学 A kind of risk Forecasting Methodology based on Recognition with Recurrent Neural Network
CN107273487A (en) * 2017-06-13 2017-10-20 北京百度网讯科技有限公司 Generation method, device and the computer equipment of chat data based on artificial intelligence
CN107491541A (en) * 2017-08-24 2017-12-19 北京丁牛科技有限公司 File classification method and device
CN107766506A (en) * 2017-10-20 2018-03-06 哈尔滨工业大学 A kind of more wheel dialog model construction methods based on stratification notice mechanism
CN107818306A (en) * 2017-10-31 2018-03-20 天津大学 A kind of video answering method based on attention model
CN107844533A (en) * 2017-10-19 2018-03-27 云南大学 A kind of intelligent Answer System and analysis method
CN108108428A (en) * 2017-12-18 2018-06-01 苏州思必驰信息科技有限公司 A kind of method, input method and system for building language model
CN108388561A (en) * 2017-02-03 2018-08-10 百度在线网络技术(北京)有限公司 Neural network machine interpretation method and device
CN108417210A (en) * 2018-01-10 2018-08-17 苏州思必驰信息科技有限公司 A kind of word insertion language model training method, words recognition method and system
CN108549850A (en) * 2018-03-27 2018-09-18 联想(北京)有限公司 A kind of image-recognizing method and electronic equipment
CN108628935A (en) * 2018-03-19 2018-10-09 中国科学院大学 A kind of answering method based on end-to-end memory network
CN108959246A (en) * 2018-06-12 2018-12-07 北京慧闻科技发展有限公司 Answer selection method, device and electronic equipment based on improved attention mechanism
CN109033463A (en) * 2018-08-28 2018-12-18 广东工业大学 A kind of community's question and answer content recommendation method based on end-to-end memory network
CN109388706A (en) * 2017-08-10 2019-02-26 华东师范大学 A kind of problem fine grit classification method, system and device
CN109558487A (en) * 2018-11-06 2019-04-02 华南师范大学 Document Classification Method based on the more attention networks of hierarchy
CN109597884A (en) * 2018-12-28 2019-04-09 北京百度网讯科技有限公司 Talk with method, apparatus, storage medium and the terminal device generated
CN109614473A (en) * 2018-06-05 2019-04-12 安徽省泰岳祥升软件有限公司 Knowledge reasoning method and device applied to intelligent interaction
CN109658270A (en) * 2018-12-19 2019-04-19 前海企保科技(深圳)有限公司 It is a kind of to read the core compensation system and method understood based on insurance products
CN109829631A (en) * 2019-01-14 2019-05-31 北京中兴通网络科技股份有限公司 A kind of business risk early warning analysis method and system based on memory network
CN109840322A (en) * 2018-11-08 2019-06-04 中山大学 It is a kind of based on intensified learning cloze test type reading understand analysis model and method
CN109977428A (en) * 2019-03-29 2019-07-05 北京金山数字娱乐科技有限公司 A kind of method and device that answer obtains
CN109992657A (en) * 2019-04-03 2019-07-09 浙江大学 A kind of interactive problem generation method based on reinforcing Dynamic Inference
CN110019719A (en) * 2017-12-15 2019-07-16 微软技术许可有限责任公司 Based on the question and answer asserted
CN110046244A (en) * 2019-04-24 2019-07-23 中国人民解放军国防科技大学 Answer selection method for question-answering system
CN110134771A (en) * 2019-04-09 2019-08-16 广东工业大学 A kind of implementation method based on more attention mechanism converged network question answering systems
CN110147532A (en) * 2019-01-24 2019-08-20 腾讯科技(深圳)有限公司 Coding method, device, equipment and storage medium
CN110334195A (en) * 2019-06-26 2019-10-15 北京科技大学 A kind of answering method and system based on local attention mechanism memory network
CN110348462A (en) * 2019-07-09 2019-10-18 北京金山数字娱乐科技有限公司 A kind of characteristics of image determination, vision answering method, device, equipment and medium
CN110389996A (en) * 2018-04-16 2019-10-29 国际商业机器公司 Realize the full sentence recurrent neural network language model for being used for natural language processing
CN110555097A (en) * 2018-05-31 2019-12-10 罗伯特·博世有限公司 Slot filling with joint pointer and attention in spoken language understanding
CN110866403A (en) * 2018-08-13 2020-03-06 中国科学院声学研究所 End-to-end conversation state tracking method and system based on convolution cycle entity network
CN111047482A (en) * 2019-11-14 2020-04-21 华中师范大学 Knowledge tracking system and method based on hierarchical memory network
CN111291803A (en) * 2020-01-21 2020-06-16 中国科学技术大学 Image grading granularity migration method, system, equipment and medium
CN111310848A (en) * 2020-02-28 2020-06-19 支付宝(杭州)信息技术有限公司 Training method and device of multi-task model
CN112732879A (en) * 2020-12-23 2021-04-30 重庆理工大学 Downstream task processing method and model of question-answering task
CN113704437A (en) * 2021-09-03 2021-11-26 重庆邮电大学 Knowledge base question-answering method integrating multi-head attention mechanism and relative position coding

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Short text classification method based on convolution neutral network
CN105159890A (en) * 2014-06-06 2015-12-16 谷歌公司 Generating representations of input sequences using neural networks

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105159890A (en) * 2014-06-06 2015-12-16 谷歌公司 Generating representations of input sequences using neural networks
CN104834747A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Short text classification method based on convolution neutral network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SAINBAYAR SUKHBAATAR ET AL.: "End-To-End Memory Networks", 《ARXIV:1503.08895V5》 *
SARATH CHANDAR ET AL.: "Hierarchical Memory Networks", 《ARXIV:1605.07427V1》 *

Cited By (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778014A (en) * 2016-12-29 2017-05-31 浙江大学 A kind of risk Forecasting Methodology based on Recognition with Recurrent Neural Network
CN106778014B (en) * 2016-12-29 2020-06-16 浙江大学 Disease risk prediction modeling method based on recurrent neural network
CN106776578A (en) * 2017-01-03 2017-05-31 竹间智能科技(上海)有限公司 Talk with the method and device of performance for lifting conversational system
US11403520B2 (en) 2017-02-03 2022-08-02 Baidu Online Network Technology (Beijing) Co., Ltd. Neural network machine translation method and apparatus
CN108388561A (en) * 2017-02-03 2018-08-10 百度在线网络技术(北京)有限公司 Neural network machine interpretation method and device
CN108388561B (en) * 2017-02-03 2022-02-25 百度在线网络技术(北京)有限公司 Neural network machine translation method and device
CN107273487A (en) * 2017-06-13 2017-10-20 北京百度网讯科技有限公司 Generation method, device and the computer equipment of chat data based on artificial intelligence
US10762305B2 (en) 2017-06-13 2020-09-01 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for generating chatting data based on artificial intelligence, computer device and computer-readable storage medium
CN109388706A (en) * 2017-08-10 2019-02-26 华东师范大学 A kind of problem fine grit classification method, system and device
CN107491541A (en) * 2017-08-24 2017-12-19 北京丁牛科技有限公司 File classification method and device
CN107491541B (en) * 2017-08-24 2021-03-02 北京丁牛科技有限公司 Text classification method and device
CN107844533A (en) * 2017-10-19 2018-03-27 云南大学 A kind of intelligent Answer System and analysis method
CN107766506A (en) * 2017-10-20 2018-03-06 哈尔滨工业大学 A kind of more wheel dialog model construction methods based on stratification notice mechanism
CN107818306B (en) * 2017-10-31 2020-08-07 天津大学 Video question-answering method based on attention model
CN107818306A (en) * 2017-10-31 2018-03-20 天津大学 A kind of video answering method based on attention model
CN110019719B (en) * 2017-12-15 2023-04-25 微软技术许可有限责任公司 Assertion-based question and answer
CN110019719A (en) * 2017-12-15 2019-07-16 微软技术许可有限责任公司 Based on the question and answer asserted
CN108108428B (en) * 2017-12-18 2020-05-12 苏州思必驰信息科技有限公司 Method, input method and system for constructing language model
CN108108428A (en) * 2017-12-18 2018-06-01 苏州思必驰信息科技有限公司 A kind of method, input method and system for building language model
CN108417210A (en) * 2018-01-10 2018-08-17 苏州思必驰信息科技有限公司 A kind of word insertion language model training method, words recognition method and system
CN108417210B (en) * 2018-01-10 2020-06-26 苏州思必驰信息科技有限公司 Word embedding language model training method, word recognition method and system
CN108628935B (en) * 2018-03-19 2021-10-15 中国科学院大学 Question-answering method based on end-to-end memory network
CN108628935A (en) * 2018-03-19 2018-10-09 中国科学院大学 A kind of answering method based on end-to-end memory network
CN108549850B (en) * 2018-03-27 2021-07-16 联想(北京)有限公司 Image identification method and electronic equipment
CN108549850A (en) * 2018-03-27 2018-09-18 联想(北京)有限公司 A kind of image-recognizing method and electronic equipment
CN110389996A (en) * 2018-04-16 2019-10-29 国际商业机器公司 Realize the full sentence recurrent neural network language model for being used for natural language processing
CN110555097A (en) * 2018-05-31 2019-12-10 罗伯特·博世有限公司 Slot filling with joint pointer and attention in spoken language understanding
CN109614473A (en) * 2018-06-05 2019-04-12 安徽省泰岳祥升软件有限公司 Knowledge reasoning method and device applied to intelligent interaction
CN108959246B (en) * 2018-06-12 2022-07-12 北京慧闻科技(集团)有限公司 Answer selection method and device based on improved attention mechanism and electronic equipment
CN108959246A (en) * 2018-06-12 2018-12-07 北京慧闻科技发展有限公司 Answer selection method, device and electronic equipment based on improved attention mechanism
CN110866403A (en) * 2018-08-13 2020-03-06 中国科学院声学研究所 End-to-end conversation state tracking method and system based on convolutional recurrent entity network
CN110866403B (en) * 2018-08-13 2021-06-08 中国科学院声学研究所 End-to-end conversation state tracking method and system based on convolutional recurrent entity network
CN109033463B (en) * 2018-08-28 2021-11-26 广东工业大学 Community question-answer content recommendation method based on end-to-end memory network
CN109033463A (en) * 2018-08-28 2018-12-18 广东工业大学 Community question-and-answer content recommendation method based on end-to-end memory network
CN109558487A (en) * 2018-11-06 2019-04-02 华南师范大学 Document classification method based on hierarchical multi-attention networks
CN109840322A (en) * 2018-11-08 2019-06-04 中山大学 Cloze-style reading comprehension analysis model and method based on reinforcement learning
CN109658270A (en) * 2018-12-19 2019-04-19 前海企保科技(深圳)有限公司 Claim settlement system and method based on reading comprehension of insurance products
CN109597884A (en) * 2018-12-28 2019-04-09 北京百度网讯科技有限公司 Dialogue generation method, apparatus, storage medium and terminal device
CN109829631A (en) * 2019-01-14 2019-05-31 北京中兴通网络科技股份有限公司 Business risk early-warning analysis method and system based on memory network
US11995406B2 (en) 2019-01-24 2024-05-28 Tencent Technology (Shenzhen) Company Limited Encoding method, apparatus, and device, and storage medium
CN110147532A (en) * 2019-01-24 2019-08-20 腾讯科技(深圳)有限公司 Coding method, device, equipment and storage medium
CN110147532B (en) * 2019-01-24 2023-08-25 腾讯科技(深圳)有限公司 Encoding method, apparatus, device and storage medium
CN109977428A (en) * 2019-03-29 2019-07-05 北京金山数字娱乐科技有限公司 Answer obtaining method and device
CN109977428B (en) * 2019-03-29 2024-04-02 北京金山数字娱乐科技有限公司 Answer obtaining method and device
CN109992657A (en) * 2019-04-03 2019-07-09 浙江大学 Interactive question generation method based on reinforced dynamic reasoning
CN110134771A (en) * 2019-04-09 2019-08-16 广东工业大学 Implementation method of a question-answering system based on a multi-attention fusion network
CN110134771B (en) * 2019-04-09 2022-03-04 广东工业大学 Implementation method of multi-attention-machine-based fusion network question-answering system
CN110046244A (en) * 2019-04-24 2019-07-23 中国人民解放军国防科技大学 Answer selection method for question-answering system
CN110046244B (en) * 2019-04-24 2021-06-08 中国人民解放军国防科技大学 Answer selection method for question-answering system
CN110334195A (en) * 2019-06-26 2019-10-15 北京科技大学 Question-answering method and system based on local attention mechanism memory network
CN110348462A (en) * 2019-07-09 2019-10-18 北京金山数字娱乐科技有限公司 Image feature determination and visual question-answering method, device, equipment and medium
CN110348462B (en) * 2019-07-09 2022-03-04 北京金山数字娱乐科技有限公司 Image feature determination and visual question and answer method, device, equipment and medium
CN111047482A (en) * 2019-11-14 2020-04-21 华中师范大学 Knowledge tracking system and method based on hierarchical memory network
CN111291803B (en) * 2020-01-21 2022-07-29 中国科学技术大学 Image grading granularity migration method, system, equipment and medium
CN111291803A (en) * 2020-01-21 2020-06-16 中国科学技术大学 Image grading granularity migration method, system, equipment and medium
CN111310848A (en) * 2020-02-28 2020-06-19 支付宝(杭州)信息技术有限公司 Training method and device of multi-task model
CN111310848B (en) * 2020-02-28 2022-06-28 支付宝(杭州)信息技术有限公司 Training method and device for multi-task model
CN112732879A (en) * 2020-12-23 2021-04-30 重庆理工大学 Downstream task processing method and model of question-answering task
CN113704437B (en) * 2021-09-03 2023-08-11 重庆邮电大学 Knowledge base question-answering method integrating multi-head attention mechanism and relative position coding
CN113704437A (en) * 2021-09-03 2021-11-26 重庆邮电大学 Knowledge base question-answering method integrating multi-head attention mechanism and relative position coding

Also Published As

Publication number Publication date
CN106126596B (en) 2019-08-23

Similar Documents

Publication Publication Date Title
CN106126596B (en) A kind of answering method based on stratification memory network
CN113544703B (en) Efficient off-policy credit allocation
US20220067278A1 (en) System for entity and evidence-guided relation prediction and method of using the same
CN108681610B (en) Generative multi-turn chat dialogue method, system and computer-readable storage medium
Chen et al. Strategies for training large vocabulary neural language models
CN110969020B (en) CNN and attention mechanism-based Chinese named entity identification method, system and medium
CN113190688B (en) Complex network link prediction method and system based on logical reasoning and graph convolution
Xia et al. Fully dynamic inference with deep neural networks
CN110019843A (en) Knowledge graph processing method and processing device
CN111782961B (en) Answer recommendation method oriented to machine reading understanding
CN114912419B (en) Unified machine reading understanding method based on recombination countermeasure
CN115510814B (en) Chapter-level complex problem generation method based on dual planning
CN113362963A (en) Method and system for predicting side effects among medicines based on multi-source heterogeneous network
Moriya et al. Evolution-strategy-based automation of system development for high-performance speech recognition
CN114880428B (en) Method for recognizing speech part components based on graph neural network
Serras et al. User-aware dialogue management policies over attributed bi-automata
CN116502648A (en) Machine reading understanding semantic reasoning method based on multi-hop reasoning
CN114036938B (en) News classification method for extracting text features by combining topic information and word vectors
Cífka et al. Black-box language model explanation by context length probing
Eyraud et al. TAYSIR Competition: Transformer+RNN: Algorithms to Yield Simple and Interpretable Representations
CN117744760A (en) Text information identification method and device, storage medium and electronic equipment
Cho et al. Parallel parsing in a Gradient Symbolic Computation parser
CN111723186A (en) Knowledge graph generation method based on artificial intelligence for dialog system and electronic equipment
KR20210002027A (en) Apparatus and method for evaluating self-introduction based on natural language processing
Mo et al. Fine grained knowledge transfer for personalized task-oriented dialogue systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant