CN106126596A - A kind of answering method based on stratification memory network - Google Patents
- Publication number
- CN106126596A (application CN201610447676.4A)
- Authority
- CN
- China
- Prior art keywords
- word
- sentence
- granularity
- memory unit
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a question-answering method based on a hierarchical memory network. Sentence granularity memory coding is first carried out, and under the stimulation of the question semantic coding, information reasoning over the sentence granularity memory unit is completed through a multi-round iterative attention mechanism; sentences are then screened by k maximum sampling. Word granularity memory coding is also carried out on the basis of the sentence granularity memory coding, i.e. memory coding is performed at two levels to form a hierarchical memory coding, and the output word probability distribution is jointly predicted from the sentence granularity and word granularity memory units. This improves the accuracy of automatic question answering and effectively solves the answer selection problem for low-frequency words and unregistered words.
Description
Technical Field
The invention relates to the technical field of automatic question-answering system construction, in particular to an end-to-end question-answering method based on a hierarchical memory network.
Background
Automatic question answering has long been one of the most challenging tasks in natural language processing: it requires a deep understanding of the text and screening out candidate answers as system responses. The conventional methods currently available include: training each module of the text processing pipeline independently and then fusing their outputs; and constructing a large-scale structured knowledge base and performing information reasoning and answer prediction based on that knowledge base. In recent years, end-to-end systems based on deep learning methods have been widely used to solve various tasks without manually constructing features and without tuning individual modules separately.
The question-answering process can be roughly divided into two steps: first, the relevant semantic information is located (the "activation phase"); then a response is generated based on that information (the "generation phase"). Recently, neural memory network models have achieved good results on question-answering tasks. However, the biggest drawback of these models is that they adopt a memory unit with a single level of sentence granularity and cannot handle low-frequency or unregistered words well. In addition, to reduce the time complexity of the model, it is usually necessary to reduce the dictionary size. In that case, the existing end-to-end neural network models cannot reliably select low-frequency or unregistered words as the answer output; that is, when the target answer word is outside the training dictionary, the existing methods cannot select the accurate answer as the model output in the online testing stage. The following dialog text is taken as an example:
1. Hello sir, may I ask your name?
2. Uh, my name is Williamson.
3. Please tell me your passport number.
4. It is 577838771.
5. And what is your phone number?
6. The number is 0016178290851.
Assuming that "Williamson", "577838771" and "0016178290851" are low-frequency or unregistered words, if the conventional methods discard these words or uniformly replace them with an "unk" symbol, none of them can select the accurate user information from the dialog text. However, in practical applications, most answer information comes from low-frequency or long-tail words, and how to design an answer selection method that can effectively handle unregistered words is an urgent task in the field of automatic question answering.
Disclosure of Invention
Technical problem to be solved
In order to solve the problems in the prior art, the invention provides a question-answering method based on a hierarchical memory network.
(II) technical scheme
The invention provides a question-answering method based on a hierarchical memory network, which comprises the following steps: step S101: integrating the position of a word and the time sequence information of a sentence, and performing sentence granularity memory coding on the sentences in the sentence set to obtain a double-channel memory coding of a sentence granularity memory unit; step S102: under the stimulation of the question semantic coding, completing information reasoning of the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit; step S103: k maximum sampling is carried out on the information inference result of the sentence granularity memory unit, and a k maximum sampling important sentence set is screened out from the sentence set; step S104: performing word granularity memory coding on the sentence set by using a bidirectional recurrent neural network model to obtain memory coding of a word granularity memory unit; step S105: obtaining the word granularity output word probability distribution through an attention mechanism based on the question semantic code, the memory code of the word granularity memory unit and the k maximum sampling important sentence set; and step S106: jointly predicting the probability distribution of output words from the sentence granularity and word granularity memory units, and performing supervision training by using the cross entropy.
(III) advantageous effects
According to the technical scheme, the question-answering method based on the hierarchical memory network has the following beneficial effects:
(1) the method firstly carries out sentence granularity memory coding, completes information reasoning of a sentence granularity memory unit through a multi-round iterative attention mechanism under the stimulation of question semantic coding, can improve the accuracy and timeliness of automatic question answering, and is favorable for answer selection of low-frequency words and unknown words;
(2) sentences are screened through k maximum sampling, so that the automatic question answering efficiency can be improved, and the calculation complexity is reduced;
(3) word granularity memory coding is carried out on the basis of sentence granularity memory coding, namely memory coding is carried out on two layers to form hierarchical memory coding, so that the accuracy of automatic question answering can be further improved;
(4) when the recurrent neural network is used for word granularity memory coding, the operation is carried out on the full sentence set X; the method can thus introduce the contextual semantic information of the words in the full sentence set during word granularity memory coding, and can improve the accuracy and timeliness of automatic question answering;
(5) the attention mechanism on the word granularity memory unit is operated on the word granularity memory unit subset after k sampling, so that interference information in memory coding is avoided, and the calculated amount of the word granularity attention mechanism is reduced;
(6) the sentence granularity and word granularity memory unit is used for jointly predicting the probability distribution of output words, so that the accuracy of automatic question answering can be further improved, and the answer selection problem of low-frequency words and unknown words is effectively solved.
Drawings
FIG. 1 is a flowchart of a question answering method based on a hierarchical memory network according to an embodiment of the present invention;
FIG. 2 is a block diagram of a hierarchical memory network-based question-answering method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating sentence-granularity memory coding and information inference based on sentence-granularity memory coding according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating word granularity memory coding and attention activation based on the word granularity memory coding according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the performance of the hierarchical memory network-based question-answering method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another performance aspect of the question-answering method based on the hierarchical memory network according to the embodiment of the present invention.
Detailed Description
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
The invention discloses a question-answering method based on a hierarchical memory network, which is based on an end-to-end model of a whole neural network structure, can realize information reasoning, screening and word granularity selection in a sentence set, and effectively solves the problem of answer selection of a question-answering system under big data on low-frequency words or unknown words. The question-answering method of the invention respectively carries out two hierarchical memory codes on a sentence set with time sequence information, which respectively are as follows: sentence granularity memory coding and word granularity memory coding. And then carrying out information reasoning, screening and activation based on the hierarchical memory network, and jointly predicting the probability distribution of the candidate answer words.
The question-answering method firstly carries out sentence-vectorization memory coding on the sentence set through the hierarchical memory network, considering the position information of words in sentences and the sequential time information of sentences in the sentence set, then completes the information reasoning of the sentence granularity memory unit through a multi-round iterative attention mechanism, carries out k maximum sampling based on the reasoning result and screens out the important sentence information. Then, word granularity sequence coding is carried out on the sentence set by using a bidirectional recurrent network model, information activation of the word granularity memory unit is carried out from the screened information through an attention mechanism, and finally the output word probability distribution is predicted from the sentence granularity memory unit and the word granularity memory unit respectively; joint supervised training is carried out through Softmax, and an end-to-end automatic question-answering model is learned.
The question-answering method based on the hierarchical memory network as an embodiment of the invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a question-answering method based on a hierarchical memory network according to an embodiment of the present invention, and referring to fig. 1, the question-answering method includes:
step S101: and fusing the position of the word and the time sequence information of the sentence, and performing sentence granularity memory coding on the sentence in the sentence set to obtain the double-channel memory coding of the sentence granularity memory unit.
Referring to fig. 3, step S101 includes:
sub-step S101 a: and carrying out double-channel word vector mapping on the sentences with the time sequence information in the sentence set to obtain double-channel word vectorization codes of the sentences.
The sub-step S101a includes: given a sentence set with time series information X = {x_i}_{i=(1,2,...,n)}, wherein i is the current time series index of a sentence and n is the maximum time series length of the sentence set, two word vector matrices A and C are randomly initialized, wherein |V| is the dictionary dimension and d is the dimension of the word vector; A and C respectively adopt a normal distribution with standard deviation of 0.1 and mean value of 0 as random initialization parameters. Two-channel word vector mapping is performed on the sentence x_i in the sentence set X, so that the word x_{ij} in the sentence x_i obtains the two-channel vectorized codes A x_{ij} and C x_{ij}, where j is the position information of the word in the sentence x_i.
Sub-step S101 b: and updating the two-channel word vectorization codes according to the position information of the words in the sentences.
The sub-step S101b includes: generating an update matrix l according to the position information j of the words in the sentence and the dimension d of the word vectors, the updated two-channel word vectorization codes being l_j·(A x_{ij}) and l_j·(C x_{ij}), wherein the g-th component of the update vector l_j is:

l_{gj} = (1 - j/J_i) - (g/d)(1 - 2j/J_i)   (1)

wherein J_i is the number of words in the sentence x_i, g is the current dimension index in the word vector of dimension d, 1 ≤ j ≤ J_i and 1 ≤ g ≤ d.
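For illustration, a minimal numpy sketch of the position update vectors of formula (1) and the resulting update of one embedding channel; the function name, the random word ids and the matrix shapes are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def position_update_matrix(J_i: int, d: int) -> np.ndarray:
    """Position update vectors l_j of formula (1): one d-dimensional vector per word position j."""
    j = np.arange(1, J_i + 1)[:, None]   # word positions 1..J_i
    g = np.arange(1, d + 1)[None, :]     # vector dimensions 1..d
    return (1.0 - j / J_i) - (g / d) * (1.0 - 2.0 * j / J_i)  # shape (J_i, d)

# Example: updated channel-A codes l_j * (A x_ij) for a 4-word sentence with d = 6
d, J_i, V = 6, 4, 20
rng = np.random.default_rng(0)
A = rng.normal(0.0, 0.1, size=(d, V))    # word vector matrix, N(0, 0.1^2)
word_ids = [3, 7, 1, 15]                 # the 4 words of sentence x_i
l = position_update_matrix(J_i, d)       # (J_i, d)
updated = l * A[:, word_ids].T           # element-wise position update of channel A
print(updated.shape)                     # (4, 6)
```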
Sub-step S101 c: and merging the time sequence information of the sentences to perform sentence granularity memory coding on the sentences to obtain the double-channel memory coding of the sentence granularity memory unit.
The sub-step S101c includes: two sentence time-sequence vectorization matrices T_A and T_C are randomly initialized, where n is the maximum time series length of the sentence set and d is the time vector dimension, which is the same as the word vector dimension; T_A and T_C respectively adopt a normal distribution with standard deviation of 0.1 and mean value of 0 as random initialization parameters. The dual-channel memory code of the sentence granularity memory unit is then M^{(S)} = {{a_i}, {c_i}}, wherein:

a_i = Σ_j l_j·(A x_{ij}) + T_A(i)   (2)

c_i = Σ_j l_j·(C x_{ij}) + T_C(i)   (3)

wherein l_j is the update vector of the j-th word in the update matrix l of the sentence x_i, and the operator · is an element-wise multiplication between vectors; for example, l_j·(A x_{ij}) in formula (2) denotes the element-wise multiplication of the vector l_j and the vector A x_{ij}.
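A minimal sketch, assuming numpy, of the two-channel sentence-granularity memory coding of formulas (2) and (3); the matrix names A, C, T_A and T_C follow the description above, while the sizes and the toy sentence set are illustrative.

```python
import numpy as np

def encode_sentences(sentences, A, C, T_A, T_C):
    """Sentence-granularity memories a_i, c_i per formulas (2) and (3)."""
    d = A.shape[0]
    a, c = [], []
    for i, word_ids in enumerate(sentences):
        J_i = len(word_ids)
        j = np.arange(1, J_i + 1)[:, None]
        g = np.arange(1, d + 1)[None, :]
        l = (1 - j / J_i) - (g / d) * (1 - 2 * j / J_i)        # formula (1)
        a.append((l * A[:, word_ids].T).sum(axis=0) + T_A[i])  # formula (2)
        c.append((l * C[:, word_ids].T).sum(axis=0) + T_C[i])  # formula (3)
    return np.stack(a), np.stack(c)

rng = np.random.default_rng(0)
d, V, n = 8, 30, 3
A, C = rng.normal(0, 0.1, (d, V)), rng.normal(0, 0.1, (d, V))
T_A, T_C = rng.normal(0, 0.1, (n, d)), rng.normal(0, 0.1, (n, d))
a, c = encode_sentences([[1, 4, 7], [2, 9, 9, 11], [5]], A, C, T_A, T_C)
print(a.shape, c.shape)   # (3, 8) (3, 8)
```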
Step S102: under the stimulation of problem semantic coding, information reasoning of the sentence granularity memory unit is completed through a multi-round iterative attention mechanism, and output word probability distribution of the sentence granularity memory unit in dictionary dimensions is obtained.
Step S102 includes:
sub-step S102 a: and vectorizing and expressing the question text to obtain the semantic code of the question.
The sub-step S102a includes: using a word vector matrix B, the j-th word q_j in the question text q is vectorized as B q_j, and the vectorized representation is updated based on the position j of the word in the question text to obtain the semantic code of the question:

u = Σ_j l_j·(B q_j)   (4)

As in formulas (2) and (3), l_j is the position-based update vector of the j-th word.
Sub-step S102 b: under the stimulation of the question semantic coding, information activation is carried out in the double-channel memory coding of the sentence granularity memory unit by using an attention mechanism;
the sub-step S102b includes: calculating attention weight of problem semantic codes in a sentence granularity memory unit by adopting a dot product mode:
and under the stimulation of problem semantic coding, the activation information of the double-channel memory coding of the sentence-granularity memory unit is as follows:
sub-step S102 c: and finishing information reasoning on the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit.
The sub-step S102c includes: performing R rounds of information activation on the sentence granularity memory unit to find the candidate sentence set and obtain the activation information o^R of the R-th round, wherein in the (r+1)-th round of information activation,

u^{r+1} = u^r + o^r   (7)

wherein 1 ≤ r ≤ (R-1), and in the (r+1)-th round of information activation independent word vector matrices A^{r+1} and C^{r+1} and time vector matrices T_A^{r+1} and T_C^{r+1} are used to vectorize the sentence set, with A^{r+1} = C^r and T_A^{r+1} = T_C^r, while C^{r+1} and T_C^{r+1} respectively adopt a normal distribution with standard deviation of 0.1 and mean value of 0 as random initialization parameters.

Information reasoning in the sentence granularity memory unit is completed through the attention mechanism of R rounds of iteration, and the probability distribution of output words in the dictionary dimension on the sentence granularity memory unit is obtained as:

p^{(S)}(w) = Softmax((C^R)^T (o^R + u^R))   (8)

wherein w = {w_t}_{t=(1,2,...,|V|)} is the dictionary dimension word set, C^R is the word vector matrix of the R-th round of information activation, and T is the transpose operator.
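To make the R-round reasoning concrete, here is a small numpy sketch of formulas (5)-(8) under the reconstruction above: dot-product attention, the update u^{r+1} = u^r + o^r, and a softmax readout against the last-round word vector matrix. The readout matrix and the per-round memory lists are assumptions consistent with the surrounding text, not a verbatim transcription of the patent.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sentence_level_reasoning(a_rounds, c_rounds, u, C_R):
    """Multi-round attention over sentence memories.
    a_rounds, c_rounds: per-round (n, d) memory matrices; u: (d,) question code;
    C_R: (d, |V|) word vector matrix of the last round."""
    alpha = None
    for a, c in zip(a_rounds, c_rounds):
        alpha = softmax(a @ u)           # formula (5): dot-product attention weights
        o = alpha @ c                    # formula (6): activation information
        u = u + o                        # formula (7): update the question code
    p_sentence = softmax(C_R.T @ u)      # formula (8): dictionary-dimension distribution
    return p_sentence, alpha             # last-round alpha feeds the k-max sampling of step S103

rng = np.random.default_rng(0)
n, d, V, R = 4, 8, 30, 3
a_rounds = [rng.normal(size=(n, d)) for _ in range(R)]
c_rounds = [rng.normal(size=(n, d)) for _ in range(R)]
p_s, alpha_R = sentence_level_reasoning(a_rounds, c_rounds,
                                        rng.normal(size=d), rng.normal(0, 0.1, (d, V)))
print(p_s.shape, alpha_R.shape)          # (30,) (4,)
```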
The invention firstly carries out sentence granularity memory coding, completes the information reasoning of the sentence granularity memory unit through a multi-round iterative attention mechanism under the stimulation of question semantic coding, can improve the accuracy and timeliness of automatic question answering, and is beneficial to the answer selection of low-frequency words and unknown words.
Step S103: k maximum sampling is carried out on the information reasoning result of the sentence granularity memory unit, and an important sentence set with k maximum sampling is screened out from the sentence set.
Step S103 includes:
Sub-step S103 a: from the attention weight vector α^{(S)} obtained in the R-th round of information activation on the sentence granularity memory unit, the k largest attention weights are selected as a subset.
Sub-step S103 b: the sentence set corresponding to the k largest attention weight subset is selected as the k maximum sampling important sentence set X̂ = {x̂_i}; the sentences x̂_i in the important sentence set are the important sentences.
According to the invention, sentences are screened through the k maximum sampling, so that the efficiency of automatic question answering can be improved, the computational complexity is reduced, and answer selection for low-frequency words and unknown words is facilitated.
Step S104: and performing word granularity memory coding on the sentence set by using the bidirectional cyclic neural network model to obtain the memory coding of the word granularity memory unit.
Referring to fig. 4, step S104 includes:
Sub-step S104 a: encoding the words in the sentence set in time order by using the bidirectional recurrent network model to obtain the hidden states of the bidirectional recurrent network model. There are many existing bidirectional recurrent network models, and this embodiment adopts one of them: the gated recurrent unit (GRU) model.
The sub-step S104a includes: using the gated recurrent unit (GRU) model, all the words x_t in the sentence set X are encoded forward and backward according to the time sequence; for the word feature C^R x_t at time t, the hidden state of the forward GRU encoding is

h_t^{→} = GRU(C^R x_t, h_{t-1}^{→})   (9)

and the hidden state of the backward GRU encoding is

h_t^{←} = GRU(C^R x_t, h_{t+1}^{←})   (10)

wherein |t| is the maximum sequence length after arranging all the words in the sentence set X according to the time sequence, the dimensions of h_t^{→} and h_t^{←} are the same as the word vector dimension d, and C^R is the word vector matrix in the process of the R-th round of information activation in the sentence granularity memory unit.
Sub-step S104 b: and fusing the hidden states of the bidirectional circulation network model to obtain the memory code of the word granularity memory unit.
The sub-step S104b includes: directly adding the hidden states of the bidirectional recurrent network model to obtain the memory code of the word granularity memory unit M^{(W)} = {m_t}_{t=(1,2,...,|t|)}, wherein

m_t = h_t^{→} + h_t^{←}   (11)
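A small PyTorch sketch of formulas (9)-(11), assuming the torch library is available; the embedding layer standing in for the word features C^R x_t and all sizes are illustrative. The two directional hidden states are summed to form the word-granularity memory.

```python
import torch
import torch.nn as nn

d, V, T = 8, 30, 11                     # word vector dimension, dictionary size, |t|

# Word features standing in for C^R x_t (an embedding lookup used for illustration only)
C_R = nn.Embedding(V, d)
word_ids = torch.randint(0, V, (T,))    # all words of the sentence set in time order
features = C_R(word_ids).unsqueeze(1)   # (T, 1, d): sequence length T, batch size 1

# Bidirectional GRU: forward and backward encodings, formulas (9) and (10)
gru = nn.GRU(input_size=d, hidden_size=d, bidirectional=True)
hidden, _ = gru(features)               # (T, 1, 2d): forward and backward states concatenated

# Word-granularity memory m_t = h_t(forward) + h_t(backward), formula (11)
m = hidden[:, 0, :d] + hidden[:, 0, d:]
print(m.shape)                          # torch.Size([11, 8])
```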
The invention uses the recurrent neural network to carry out word granularity memory coding, which is operated on the whole sentence set X, and the method can introduce the context environment semantic information of the words in the whole sentence set in the word granularity memory coding process, can improve the accuracy and timeliness of automatic question answering, and is beneficial to the answer selection of low-frequency words and unknown words.
Step S105: based on problem semantic coding, memory coding of a word granularity memory unit and k maximum sampling important sentence sets, word granularity output word probability distribution is obtained through an attention mechanism.
Step S105 includes:
Sub-step S105 a: calculating the attention weight of the word granularity memory unit according to the question semantic code and the memory code of the word granularity memory unit;
the sub-step S105a includes: semantic coding based on problem in R-th round information activation process on sentence granularity memory unitMemory code M of word granularity memory unit(W)={mt}t=1,2,..,|t|)And k max sampling important sentence setObtaining the attention weight vector of the normalized word granularity memory unitWherein:
wherein,is k max sample important sentence setSet of words in (1)Corresponding word granularity memory code M(W)={mt}t=(1,2,...,|t|)A subset ofAttention weight vector α(W)Dimension of (2) and collection of important sentences in time sequenceAll the words inThe maximum sequence length of the arranged words is consistent, namely the maximum sequence length is Andall learning parameters are learning parameters, v, W and U are initialized randomly by adopting normal distribution with standard deviation of 0.1 and mean value of 0, and are updated in a training stage.
Substep S105b obtaining word granularity output word probability distribution according to attention weight of word granularity memory unit in the embodiment of the present invention, attention weight α of normalized word granularity memory unit is directly adopted(W)Outputting a word probability distribution as a word granularity:
at this time, the word granularity outputs the word probability distribution in accordance with the dimension of the attention weight, i.e., the word granularity outputs the word probability distribution For the set of all words in the set of important sentences
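A numpy sketch of the word-granularity attention of formulas (12) and (13) under the additive-attention reconstruction above; the scoring function e_t = v^T tanh(W m̂_t + U u^R) is an assumption motivated by the learnable parameters v, W and U named in the text.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def word_attention(m_hat, u_R, v, W, U):
    """alpha^(W) over the memory subset m_hat selected by k-max sampling (formula (12))."""
    scores = np.array([v @ np.tanh(W @ m_t + U @ u_R) for m_t in m_hat])
    return softmax(scores)               # used directly as the word distribution, formula (13)

rng = np.random.default_rng(0)
d, T_hat = 8, 5                          # |t_hat|: number of words in the important sentences
m_hat = rng.normal(size=(T_hat, d))      # subset of word-granularity memories
u_R = rng.normal(size=d)                 # question code of the R-th round
v = rng.normal(0, 0.1, d)
W, U = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
p_word = word_attention(m_hat, u_R, v, W, U)
print(round(p_word.sum(), 6))            # 1.0
```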
The invention also carries out word granularity memory coding on the basis of sentence granularity memory coding, namely, memory coding is carried out on two layers to form hierarchical memory coding, thereby further improving the accuracy of automatic question answering and being more beneficial to answer selection of low-frequency words and unknown words. Meanwhile, the attention mechanism on the word granularity memory unit is operated on the word granularity memory unit subset after k sampling, so that interference information in memory coding is avoided, and the calculation amount of the word granularity attention mechanism is reduced.
Step S106: and jointly predicting the probability distribution of output words from the sentence granularity and word granularity memory units, and performing supervision training by using the cross entropy.
Step S106 includes:
Sub-step S106 a: output word joint prediction is performed based on the output word probability distribution of the dictionary dimension on the sentence granularity memory unit and the word granularity output word probability distribution; the joint prediction output word distribution p(w) is obtained by combining p^{(S)}(w) with the mapped word granularity distribution trans(p̂^{(W)}(ŵ)).
Here trans(·) denotes the mapping operation that maps the word granularity output word probability distribution p̂^{(W)}(ŵ) on the word subset ŵ to a word granularity output word probability distribution on the dictionary dimension word corpus. Specifically, each probability value in p̂^{(W)}(ŵ) is mapped according to the position of its corresponding word from the subset ŵ in the dictionary dimension word corpus w; if a word in the corpus does not appear in the subset, its output probability is set to 0, giving the mapped output word probability distribution.
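A sketch of the trans(·) mapping and the joint prediction of sub-step S106a, assuming numpy; summing the two distributions is one plausible reading of "joint prediction" and is an assumption, not confirmed by the text.

```python
import numpy as np

def trans(p_word_subset, subset_word_ids, dict_size):
    """Scatter the subset distribution into the dictionary dimension;
    dictionary words absent from the subset keep probability 0."""
    p_full = np.zeros(dict_size)
    for prob, wid in zip(p_word_subset, subset_word_ids):
        p_full[wid] += prob              # accumulate if a word occurs more than once
    return p_full

V = 10
p_sentence = np.full(V, 1.0 / V)                   # p^(S)(w) from the sentence level
p_word_subset = np.array([0.7, 0.2, 0.1])          # p^(W)(w_hat) over 3 subset words
subset_word_ids = [4, 8, 4]                        # their positions in the dictionary
p_joint = p_sentence + trans(p_word_subset, subset_word_ids, V)   # assumed combination by sum
print(p_joint.argmax())                            # 4
```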
Sub-step S106 b: and performing cross entropy supervision training on the distribution of the joint prediction output words by using the distribution of the target answer words. And given that the target answer word distribution of the training set is y, performing joint optimization based on a cross entropy function of the target answer word distribution y and the joint prediction output word distribution p (w).
In an exemplary embodiment of the invention, the objective function in the joint optimization is optimized by error back-propagation using the stochastic gradient descent method, and the optimized parameters include the word vector matrices {A^r}_{r=(1,2,...,R)} and {C^r}_{r=(1,2,...,R)} and the time vector matrices {T_A^r}_{r=(1,2,...,R)} and {T_C^r}_{r=(1,2,...,R)}, all the parameters {θ_GRU} of the bidirectional GRU model adopted in the word granularity memory coding process, and v, W and U in the attention weights of the word granularity memory unit (formula (12)).
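For completeness, a toy numpy sketch of cross-entropy supervision with a stochastic gradient descent update; the single scalar parameter, the learning rate and the toy scores are illustrative only, whereas the actual model optimizes the parameter sets listed above.

```python
import numpy as np

def cross_entropy(p_pred, target_id, eps=1e-12):
    """Cross-entropy between a predicted distribution and a one-hot target word."""
    return -np.log(p_pred[target_id] + eps)

# Toy model: p(w) = softmax(theta * s) with a single scalar parameter theta
s = np.array([0.1, 0.9, 0.3])
theta, lr, target = 1.0, 0.1, 1
for _ in range(100):
    logits = theta * s
    p = np.exp(logits - logits.max()); p /= p.sum()
    grad = (p - np.eye(len(s))[target]) @ s   # d(cross-entropy)/d(theta)
    theta -= lr * grad                        # stochastic gradient descent update
print(cross_entropy(p, target) < np.log(3))   # True: better than a uniform guess
```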
The method jointly predicts the probability distribution of output words in the sentence granularity and word granularity memory unit, can further improve the accuracy of automatic question answering, and is more favorable for answer selection of low-frequency words and unknown words.
Fig. 2 is a schematic diagram of a framework of a question-answering method based on a hierarchical memory network according to an embodiment of the present invention. Referring to fig. 2, the question-answering method based on the hierarchical memory network has two layers of memory network units, which are respectively:
a memory unit one: the sentence set carries out coding memory of sentence granularity in a time sequence;
a second memory cell: and all words in the sentence set are subjected to word granularity coding memory according to a time sequence.
Between different memory unit layers, the important information is screened and filtered in the k maximum mode.
The model information processing stage has two information activation mechanisms, which are respectively:
the first activation mechanism is as follows: adopting an inference mechanism to activate information on the sentence granularity memory unit;
and (2) an activation mechanism II: and performing word selection on the word granularity memory unit by adopting an attention mechanism.
The whole model training stage has two supervision information for guidance, which are respectively:
monitoring information I: the sentence granularity memory unit decodes the output vector after information reasoning and outputs Softmax to fit information of the target word;
and (5) monitoring information II: and the word granularity memory unit performs attention mechanism activation and Softmax output and then performs fitting information on the target word.
In order to accurately evaluate the automatic question-answering performance of the method, the methods are compared by counting the number of wrong samples, i.e. the cases where the answer word selected and output by the model differs from the answer word in the actual data.
TABLE 1
Data field | Training/testing question-answer pairs | Dictionary size (whole/training/testing) | Unregistered target words (percentage) |
---|---|---|---|
Airline ticket booking | 7,000/7,000 | 10,682/5,612/5,618 | 5,070 (72.43%) |
The experiment of the invention uses a Chinese air ticket booking domain text data set, which contains 2,000 complete dialog histories and 14,000 question-answer pairs and is divided into a training set and a testing set at a ratio of 5:5. The invention does not perform any processing (including stop-word removal and stemming) on these text data. Specific statistics of the data set are shown in Table 1; it can be seen that unregistered target words account for 72.43% of the test set, which has a relatively large influence on conventional model training.
The following comparative methods were used in the experiments of the invention:
the first comparison method comprises the following steps: the method is based on a pointer network model of an attention mechanism, all words in a sentence set are regarded as a long sentence according to a time sequence for coding, and answers are generated by directly utilizing the attention mechanism of question and word coding;
and a second comparison method comprises the following steps: the neural memory network model is used for carrying out sentence granularity coding on a sentence set, and carrying out answer matching on a full dictionary space directly by using information obtained after semantic activation is carried out on a coding vector of a question.
The parameters used in the experiments of the present invention are set as shown in table 2:
TABLE 2
n | d | R | k | lr | bs |
---|---|---|---|---|---|
16 | 100 | 3 | 1 | 0.01 | 10 |
In Table 2, the parameter n is the maximum sentence time sequence length of the experimental sentence set, d is the word vector dimension and the hidden layer coding dimension, R is the number of iterations of the reasoning mechanism on the sentence granularity memory unit, k is the maximum sampling number between the different memory layers, lr is the learning rate when the stochastic gradient descent method is used for model parameter optimization, and bs is the number of samples in each batch during model training.
In the experiment of the invention, 15 rounds of iterative training are carried out and all the methods converge, as shown in FIG. 5; the final converged experimental results are shown in Table 3:
TABLE 3
Method | Number of wrong samples |
---|---|
Comparison method one | 109 |
Comparison method two | 56 |
The method of the invention | 0 |
FIG. 5 and Table 3 show the evaluation results, in terms of the number of wrong samples on the data set, of the method of the present invention, comparison method one and comparison method two. The experimental results show that the convergence rate of the method is obviously superior to that of the other methods. From the final convergence results in Table 3, it can also be seen that the method of the present invention is significantly superior to the other methods and can completely solve the answer selection problem on the unregistered word set, reaching 100% accuracy.
Meanwhile, the invention verifies the influence of the maximum sampling number k used for information screening between the hierarchical memory units on the number of wrong samples in the answer selection problem; the experimental results are shown in FIG. 6 and Table 4. It can be seen that when the maximum sampling number is 1, both the convergence rate and the final convergence result of the method of the present invention are best, further illustrating the importance of information selection between the hierarchical memory units.
TABLE 4
Maximum number of samples | Number of wrong samples |
---|---|
3 | 5 |
2 | 4 |
1 | 0 |
So far, the embodiments of the present invention have been described in detail with reference to the accompanying drawings. From the above description, those skilled in the art should clearly recognize that the present invention is a question-answering method based on a hierarchical memory network.
The invention relates to a question-answering method based on a hierarchical memory network, which comprises the steps of firstly carrying out sentence granularity memory coding, finishing information reasoning of a sentence granularity memory unit through a multi-round iterative attention mechanism under the stimulation of question semantic coding, improving the accuracy and timeliness of automatic question-answering, and facilitating the answer selection of low-frequency words and unknown words; the sentences are screened through the k maximum sampling, so that the efficiency of automatic question answering can be improved, the calculation complexity is reduced, word granularity memory coding is performed on the basis of sentence granularity memory coding, namely, memory coding is performed on two levels to form hierarchical memory coding, and the accuracy of automatic question answering can be further improved; when the cyclic neural network is used for word granularity memory coding, the operation is carried out on the full sentence set X, the method can introduce context environment semantic information of words in the full sentence set in the word granularity memory coding process, and can improve the accuracy and timeliness of automatic question answering; the attention mechanism on the word granularity memory unit is operated on the word granularity memory unit subset after k sampling, so that interference information in memory coding is avoided, and the calculated amount of the word granularity attention mechanism is reduced; the sentence granularity and word granularity memory unit is used for jointly predicting the probability distribution of output words, so that the accuracy of automatic question answering can be further improved, and the answer selection problem of low-frequency words and unknown words is effectively solved.
It is to be noted that, in the attached drawings or in the description, the implementation modes not shown or described are all the modes known by the ordinary skilled person in the field of technology, and are not described in detail. In addition, the above definitions of the respective elements are not limited to the various manners mentioned in the embodiments, and those skilled in the art may easily modify or replace them, for example:
(1) directional phrases used in the embodiments, such as "upper", "lower", "front", "rear", "left", "right", etc., refer only to the orientation of the attached drawings and are not intended to limit the scope of the present invention;
(2) the embodiments described above may be mixed and matched with each other or with other embodiments based on design and reliability considerations, i.e. technical features in different embodiments may be freely combined to form further embodiments.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A question-answering method based on a hierarchical memory network is characterized by comprising the following steps:
step S101: integrating the position of a word and the time sequence information of a sentence, and performing sentence granularity memory coding on the sentence in the sentence set to obtain a double-channel memory coding of a sentence granularity memory unit;
step S102: under the stimulation of the question semantic coding, completing information reasoning of the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit;
step S103: k maximum sampling is carried out on the information inference result of the sentence granularity memory unit, and a k maximum sampling important sentence set is screened out from the sentence set;
step S104: performing word granularity memory coding on the sentence set by using a bidirectional recurrent neural network model to obtain memory coding of a word granularity memory unit;
step S105: obtaining word granularity output word probability distribution through an attention mechanism based on the question semantic code, the memory code of the word granularity memory unit and the k maximum sampling important sentence set; and
step S106: and jointly predicting the probability distribution of output words from the sentence granularity and word granularity memory units, and performing supervision training by using the cross entropy.
2. The question-answering method according to claim 1, characterized in that said step S101 comprises:
sub-step S101 a: given a sentence set with time series information X = {x_i}_{i=(1,2,...,n)}, randomly initializing word vector matrices A and C; the two-channel vectorized codes of the word x_{ij} in the sentence x_i are A x_{ij} and C x_{ij};
wherein i is the current time series index of the sentence; n is the maximum time series length of the sentence set; |V| is the dictionary dimension; d is the dimension of the word vector; j is the position information of the word in the sentence x_i;
sub-step S101 b: updating the two-channel word vectorization codes according to the position information of the words in the sentences; and
sub-step S101 c: and merging the time sequence information of the sentences to perform sentence granularity memory coding on the sentences to obtain the double-channel memory coding of the sentence granularity memory unit.
3. The question-answering method according to claim 2, characterized in that said sub-step S101b comprises:
the updated two-channel word vectorization codes are l_j·(A x_{ij}) and l_j·(C x_{ij}), wherein the g-th component of the update vector l_j is
l_{gj} = (1 - j/J_i) - (g/d)(1 - 2j/J_i)   (1)
wherein J_i is the number of words in the sentence x_i, g is the current dimension index in the word vector of dimension d, 1 ≤ j ≤ J_i and 1 ≤ g ≤ d.
4. The question-answering method according to claim 3, characterized in that said sub-step S101c comprises:
randomly initializing the sentence time sequence vectorization matrices T_A and T_C; the two-channel memory code of the sentence granularity memory unit is M^{(S)} = {{a_i}, {c_i}}, wherein,
a_i = Σ_j l_j·(A x_{ij}) + T_A(i)   (2)
c_i = Σ_j l_j·(C x_{ij}) + T_C(i)   (3)
wherein l_j is the update vector of the j-th word in the update matrix l of the sentence x_i; the operator · is an element-wise multiplication between vectors; n is the maximum time series length of the sentence set; d is the time vector dimension, the same as the dimension of the word vector.
5. The question-answering method according to claim 4, characterized in that said step S102 comprises:
sub-step S102 a: using a word vector matrix B, the j-th word q_j in the question text q is vectorized as B q_j, and the semantic code of the question is obtained:
u = Σ_j l_j·(B q_j)   (4)
wherein l_j is the position-based update vector of the j-th word;
sub-step S102 b: calculating the attention weight of the question semantic code on the sentence granularity memory unit:
α_i^{(S)} = Softmax(u^T a_i)   (5)
under the stimulation of the question semantic coding, the activation information of the two-channel memory coding of the sentence granularity memory unit is:
o = Σ_i α_i^{(S)} c_i   (6); and
sub-step S102 c: and finishing information reasoning on the sentence granularity memory unit through a multi-round iterative attention mechanism to obtain the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit.
6. The question-answering method according to claim 5, characterized in that said sub-step S102c comprises:
performing R rounds of information activation on the sentence granularity memory unit to obtain the activation information o^R of the R-th round, wherein in the (r+1)-th round of information activation,
u^{r+1} = u^r + o^r   (7)
wherein 1 ≤ r ≤ (R-1); A^{r+1} = C^r and T_A^{r+1} = T_C^r;
the probability distribution of output words in dictionary dimensions on the sentence granularity memory unit is:
p^{(S)}(w) = Softmax((C^R)^T (o^R + u^R))   (8)
wherein w = {w_t}_{t=(1,2,...,|V|)} is the dictionary dimension word set; C^R is the word vector matrix activated in the R-th round of information activation; T is the transpose operator.
7. The question-answering method according to claim 6, characterized in that said step S103 comprises:
sub-step S103 a: from the attention weight vector α^{(S)} obtained in the R-th round of information activation on the sentence granularity memory unit, selecting the k largest attention weights as a subset; and
sub-step S103 b: selecting the sentence set corresponding to the k largest attention weight subset as the k maximum sampling important sentence set X̂.
8. The question-answering method according to claim 7, characterized in that said step S104 comprises:
sub-step S104 a: using the gated recurrent unit (GRU) model, encoding all the words x_t in the sentence set X forward and backward according to the time sequence; for the word feature C^R x_t at time t, the hidden state of the forward GRU encoding is h_t^{→} = GRU(C^R x_t, h_{t-1}^{→}) (9), and the hidden state of the backward GRU encoding is h_t^{←} = GRU(C^R x_t, h_{t+1}^{←}) (10);
wherein |t| is the maximum sequence length of the words after arranging all the words in the sentence set X according to the time sequence; the dimensions of h_t^{→} and h_t^{←} are the same as the word vector dimension d;
sub-step S104 b: obtaining the memory code of the word granularity memory unit M^{(W)} = {m_t}_{t=(1,2,...,|t|)}, wherein m_t = h_t^{→} + h_t^{←}   (11).
9. The question-answering method according to claim 8, characterized in that said step S105 comprises:
sub-step S105 a: computing the normalized attention weight vector α^{(W)} of the word granularity memory unit, wherein:
α^{(W)} = Softmax(e),  e_t = v^T tanh(W m̂_t + U u^R)   (12)
wherein M̂^{(W)} = {m̂_t} is the subset of the word granularity memory code M^{(W)} = {m_t}_{t=(1,2,...,|t|)} corresponding to the words in the k maximum sampling important sentence set X̂; the dimension of the attention weight vector α^{(W)} is |t̂|; v, W and U are learning parameters;
sub-step S105 b: the word granularity output word probability distribution p̂^{(W)}(ŵ) is:
p̂^{(W)}(ŵ) = α^{(W)}   (13)
wherein ŵ = {ŵ_t} is the set of all the words in the important sentence set X̂.
10. The question-answering method according to claim 9, characterized in that said step S106 comprises:
sub-step S106 a: performing output word joint prediction based on the output word probability distribution of the dictionary dimension on the sentence granularity memory unit and the word granularity output word probability distribution, the joint prediction output word distribution p(w) being obtained by combining p^{(S)}(w) with the mapped word granularity distribution trans(p̂^{(W)}(ŵ));
wherein trans(·) denotes the mapping operation that maps the word granularity output word probability distribution p̂^{(W)}(ŵ) on the word subset ŵ to the word granularity output word probability distribution on the dictionary dimension word corpus; and
Sub-step S106 b: and performing cross entropy supervision training on the distribution of the joint prediction output words by using the distribution of the target answer words.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610447676.4A CN106126596B (en) | 2016-06-20 | 2016-06-20 | A kind of answering method based on stratification memory network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610447676.4A CN106126596B (en) | 2016-06-20 | 2016-06-20 | A kind of answering method based on stratification memory network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106126596A true CN106126596A (en) | 2016-11-16 |
CN106126596B CN106126596B (en) | 2019-08-23 |
Family
ID=57470348
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610447676.4A Active CN106126596B (en) | 2016-06-20 | 2016-06-20 | A kind of answering method based on stratification memory network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106126596B (en) |
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106776578A (en) * | 2017-01-03 | 2017-05-31 | 竹间智能科技(上海)有限公司 | Talk with the method and device of performance for lifting conversational system |
CN106778014A (en) * | 2016-12-29 | 2017-05-31 | 浙江大学 | A kind of risk Forecasting Methodology based on Recognition with Recurrent Neural Network |
CN107273487A (en) * | 2017-06-13 | 2017-10-20 | 北京百度网讯科技有限公司 | Generation method, device and the computer equipment of chat data based on artificial intelligence |
CN107491541A (en) * | 2017-08-24 | 2017-12-19 | 北京丁牛科技有限公司 | File classification method and device |
CN107766506A (en) * | 2017-10-20 | 2018-03-06 | 哈尔滨工业大学 | A kind of more wheel dialog model construction methods based on stratification notice mechanism |
CN107818306A (en) * | 2017-10-31 | 2018-03-20 | 天津大学 | A kind of video answering method based on attention model |
CN107844533A (en) * | 2017-10-19 | 2018-03-27 | 云南大学 | A kind of intelligent Answer System and analysis method |
CN108108428A (en) * | 2017-12-18 | 2018-06-01 | 苏州思必驰信息科技有限公司 | A kind of method, input method and system for building language model |
CN108388561A (en) * | 2017-02-03 | 2018-08-10 | 百度在线网络技术(北京)有限公司 | Neural network machine interpretation method and device |
CN108417210A (en) * | 2018-01-10 | 2018-08-17 | 苏州思必驰信息科技有限公司 | A kind of word insertion language model training method, words recognition method and system |
CN108549850A (en) * | 2018-03-27 | 2018-09-18 | 联想(北京)有限公司 | A kind of image-recognizing method and electronic equipment |
CN108628935A (en) * | 2018-03-19 | 2018-10-09 | 中国科学院大学 | A kind of answering method based on end-to-end memory network |
CN108959246A (en) * | 2018-06-12 | 2018-12-07 | 北京慧闻科技发展有限公司 | Answer selection method, device and electronic equipment based on improved attention mechanism |
CN109033463A (en) * | 2018-08-28 | 2018-12-18 | 广东工业大学 | A kind of community's question and answer content recommendation method based on end-to-end memory network |
CN109388706A (en) * | 2017-08-10 | 2019-02-26 | 华东师范大学 | A kind of problem fine grit classification method, system and device |
CN109558487A (en) * | 2018-11-06 | 2019-04-02 | 华南师范大学 | Document Classification Method based on the more attention networks of hierarchy |
CN109597884A (en) * | 2018-12-28 | 2019-04-09 | 北京百度网讯科技有限公司 | Talk with method, apparatus, storage medium and the terminal device generated |
CN109614473A (en) * | 2018-06-05 | 2019-04-12 | 安徽省泰岳祥升软件有限公司 | Knowledge reasoning method and device applied to intelligent interaction |
CN109658270A (en) * | 2018-12-19 | 2019-04-19 | 前海企保科技(深圳)有限公司 | It is a kind of to read the core compensation system and method understood based on insurance products |
CN109829631A (en) * | 2019-01-14 | 2019-05-31 | 北京中兴通网络科技股份有限公司 | A kind of business risk early warning analysis method and system based on memory network |
CN109840322A (en) * | 2018-11-08 | 2019-06-04 | 中山大学 | It is a kind of based on intensified learning cloze test type reading understand analysis model and method |
CN109977428A (en) * | 2019-03-29 | 2019-07-05 | 北京金山数字娱乐科技有限公司 | A kind of method and device that answer obtains |
CN109992657A (en) * | 2019-04-03 | 2019-07-09 | 浙江大学 | A kind of interactive problem generation method based on reinforcing Dynamic Inference |
CN110019719A (en) * | 2017-12-15 | 2019-07-16 | 微软技术许可有限责任公司 | Based on the question and answer asserted |
CN110046244A (en) * | 2019-04-24 | 2019-07-23 | 中国人民解放军国防科技大学 | Answer selection method for question-answering system |
CN110134771A (en) * | 2019-04-09 | 2019-08-16 | 广东工业大学 | A kind of implementation method based on more attention mechanism converged network question answering systems |
CN110147532A (en) * | 2019-01-24 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Coding method, device, equipment and storage medium |
CN110334195A (en) * | 2019-06-26 | 2019-10-15 | 北京科技大学 | A kind of answering method and system based on local attention mechanism memory network |
CN110348462A (en) * | 2019-07-09 | 2019-10-18 | 北京金山数字娱乐科技有限公司 | A kind of characteristics of image determination, vision answering method, device, equipment and medium |
CN110389996A (en) * | 2018-04-16 | 2019-10-29 | 国际商业机器公司 | Realize the full sentence recurrent neural network language model for being used for natural language processing |
CN110555097A (en) * | 2018-05-31 | 2019-12-10 | 罗伯特·博世有限公司 | Slot filling with joint pointer and attention in spoken language understanding |
CN110866403A (en) * | 2018-08-13 | 2020-03-06 | 中国科学院声学研究所 | End-to-end conversation state tracking method and system based on convolution cycle entity network |
CN111047482A (en) * | 2019-11-14 | 2020-04-21 | 华中师范大学 | Knowledge tracking system and method based on hierarchical memory network |
CN111291803A (en) * | 2020-01-21 | 2020-06-16 | 中国科学技术大学 | Image grading granularity migration method, system, equipment and medium |
CN111310848A (en) * | 2020-02-28 | 2020-06-19 | 支付宝(杭州)信息技术有限公司 | Training method and device of multi-task model |
CN112732879A (en) * | 2020-12-23 | 2021-04-30 | 重庆理工大学 | Downstream task processing method and model of question-answering task |
CN113704437A (en) * | 2021-09-03 | 2021-11-26 | 重庆邮电大学 | Knowledge base question-answering method integrating multi-head attention mechanism and relative position coding |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104834747A (en) * | 2015-05-25 | 2015-08-12 | 中国科学院自动化研究所 | Short text classification method based on convolution neutral network |
CN105159890A (en) * | 2014-06-06 | 2015-12-16 | 谷歌公司 | Generating representations of input sequences using neural networks |
-
2016
- 2016-06-20 CN CN201610447676.4A patent/CN106126596B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105159890A (en) * | 2014-06-06 | 2015-12-16 | 谷歌公司 | Generating representations of input sequences using neural networks |
CN104834747A (en) * | 2015-05-25 | 2015-08-12 | 中国科学院自动化研究所 | Short text classification method based on convolution neutral network |
Non-Patent Citations (2)
Title |
---|
SAINBAYAR SUKHBAATAR ET AL.: "End-To-End Memory Networks", 《ARXIV:1503.08895V5》 * |
SARATH CHANDAR ET AL.: "Hierarchical Memory Networks", 《ARXIV:1605.07427V1》 * |
Cited By (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106778014A (en) * | 2016-12-29 | 2017-05-31 | 浙江大学 | A kind of risk Forecasting Methodology based on Recognition with Recurrent Neural Network |
CN106778014B (en) * | 2016-12-29 | 2020-06-16 | 浙江大学 | Disease risk prediction modeling method based on recurrent neural network |
CN106776578A (en) * | 2017-01-03 | 2017-05-31 | 竹间智能科技(上海)有限公司 | Talk with the method and device of performance for lifting conversational system |
US11403520B2 (en) | 2017-02-03 | 2022-08-02 | Baidu Online Network Technology (Beijing) Co., Ltd. | Neural network machine translation method and apparatus |
CN108388561A (en) * | 2017-02-03 | 2018-08-10 | 百度在线网络技术(北京)有限公司 | Neural network machine interpretation method and device |
CN108388561B (en) * | 2017-02-03 | 2022-02-25 | 百度在线网络技术(北京)有限公司 | Neural network machine translation method and device |
CN107273487A (en) * | 2017-06-13 | 2017-10-20 | 北京百度网讯科技有限公司 | Generation method, device and the computer equipment of chat data based on artificial intelligence |
US10762305B2 (en) | 2017-06-13 | 2020-09-01 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method for generating chatting data based on artificial intelligence, computer device and computer-readable storage medium |
CN109388706A (en) * | 2017-08-10 | 2019-02-26 | 华东师范大学 | A kind of problem fine grit classification method, system and device |
CN107491541A (en) * | 2017-08-24 | 2017-12-19 | 北京丁牛科技有限公司 | File classification method and device |
CN107491541B (en) * | 2017-08-24 | 2021-03-02 | 北京丁牛科技有限公司 | Text classification method and device |
CN107844533A (en) * | 2017-10-19 | 2018-03-27 | 云南大学 | A kind of intelligent Answer System and analysis method |
CN107766506A (en) * | 2017-10-20 | 2018-03-06 | 哈尔滨工业大学 | A kind of more wheel dialog model construction methods based on stratification notice mechanism |
CN107818306B (en) * | 2017-10-31 | 2020-08-07 | 天津大学 | Video question-answering method based on attention model |
CN107818306A (en) * | 2017-10-31 | 2018-03-20 | 天津大学 | A kind of video answering method based on attention model |
CN110019719B (en) * | 2017-12-15 | 2023-04-25 | 微软技术许可有限责任公司 | Assertion-based question and answer |
CN110019719A (en) * | 2017-12-15 | 2019-07-16 | 微软技术许可有限责任公司 | Based on the question and answer asserted |
CN108108428B (en) * | 2017-12-18 | 2020-05-12 | 苏州思必驰信息科技有限公司 | Method, input method and system for constructing language model |
CN108108428A (en) * | 2017-12-18 | 2018-06-01 | 苏州思必驰信息科技有限公司 | A kind of method, input method and system for building language model |
CN108417210A (en) * | 2018-01-10 | 2018-08-17 | 苏州思必驰信息科技有限公司 | A kind of word insertion language model training method, words recognition method and system |
CN108417210B (en) * | 2018-01-10 | 2020-06-26 | 苏州思必驰信息科技有限公司 | Word embedding language model training method, word recognition method and system |
CN108628935B (en) * | 2018-03-19 | 2021-10-15 | 中国科学院大学 | Question-answering method based on end-to-end memory network |
CN108628935A (en) * | 2018-03-19 | 2018-10-09 | 中国科学院大学 | A kind of answering method based on end-to-end memory network |
CN108549850B (en) * | 2018-03-27 | 2021-07-16 | 联想(北京)有限公司 | Image identification method and electronic equipment |
CN108549850A (en) * | 2018-03-27 | 2018-09-18 | 联想(北京)有限公司 | A kind of image-recognizing method and electronic equipment |
CN110389996A (en) * | 2018-04-16 | 2019-10-29 | 国际商业机器公司 | Realize the full sentence recurrent neural network language model for being used for natural language processing |
CN110555097A (en) * | 2018-05-31 | 2019-12-10 | 罗伯特·博世有限公司 | Slot filling with joint pointer and attention in spoken language understanding |
CN109614473A (en) * | 2018-06-05 | 2019-04-12 | 安徽省泰岳祥升软件有限公司 | Knowledge reasoning method and device applied to intelligent interaction |
CN108959246B (en) * | 2018-06-12 | 2022-07-12 | 北京慧闻科技(集团)有限公司 | Answer selection method and device based on improved attention mechanism and electronic equipment |
CN108959246A (en) * | 2018-06-12 | 2018-12-07 | 北京慧闻科技发展有限公司 | Answer selection method, device and electronic equipment based on improved attention mechanism |
CN110866403A (en) * | 2018-08-13 | 2020-03-06 | 中国科学院声学研究所 | End-to-end conversation state tracking method and system based on convolution cycle entity network |
CN110866403B (en) * | 2018-08-13 | 2021-06-08 | 中国科学院声学研究所 | End-to-end conversation state tracking method and system based on convolution cycle entity network |
CN109033463B (en) * | 2018-08-28 | 2021-11-26 | 广东工业大学 | Community question-answer content recommendation method based on end-to-end memory network |
CN109033463A (en) * | 2018-08-28 | 2018-12-18 | 广东工业大学 | A kind of community's question and answer content recommendation method based on end-to-end memory network |
CN109558487A (en) * | 2018-11-06 | 2019-04-02 | 华南师范大学 | Document Classification Method based on the more attention networks of hierarchy |
CN109840322A (en) * | 2018-11-08 | 2019-06-04 | 中山大学 | It is a kind of based on intensified learning cloze test type reading understand analysis model and method |
CN109658270A (en) * | 2018-12-19 | 2019-04-19 | 前海企保科技(深圳)有限公司 | It is a kind of to read the core compensation system and method understood based on insurance products |
CN109597884A (en) * | 2018-12-28 | 2019-04-09 | 北京百度网讯科技有限公司 | Talk with method, apparatus, storage medium and the terminal device generated |
CN109829631A (en) * | 2019-01-14 | 2019-05-31 | 北京中兴通网络科技股份有限公司 | A kind of business risk early warning analysis method and system based on memory network |
US11995406B2 (en) | 2019-01-24 | 2024-05-28 | Tencent Technology (Shenzhen) Company Limited | Encoding method, apparatus, and device, and storage medium |
CN110147532A (en) * | 2019-01-24 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Coding method, device, equipment and storage medium |
CN110147532B (en) * | 2019-01-24 | 2023-08-25 | 腾讯科技(深圳)有限公司 | Encoding method, apparatus, device and storage medium |
CN109977428A (en) * | 2019-03-29 | 2019-07-05 | 北京金山数字娱乐科技有限公司 | A kind of method and device that answer obtains |
CN109977428B (en) * | 2019-03-29 | 2024-04-02 | 北京金山数字娱乐科技有限公司 | Answer obtaining method and device |
CN109992657A (en) * | 2019-04-03 | 2019-07-09 | 浙江大学 | A kind of interactive problem generation method based on reinforcing Dynamic Inference |
CN110134771A (en) * | 2019-04-09 | 2019-08-16 | 广东工业大学 | A kind of implementation method based on more attention mechanism converged network question answering systems |
CN110134771B (en) * | 2019-04-09 | 2022-03-04 | 广东工业大学 | Implementation method of multi-attention-machine-based fusion network question-answering system |
CN110046244A (en) * | 2019-04-24 | 2019-07-23 | 中国人民解放军国防科技大学 | Answer selection method for question-answering system |
CN110046244B (en) * | 2019-04-24 | 2021-06-08 | 中国人民解放军国防科技大学 | Answer selection method for question-answering system |
CN110334195A (en) * | 2019-06-26 | 2019-10-15 | 北京科技大学 | A kind of answering method and system based on local attention mechanism memory network |
CN110348462A (en) * | 2019-07-09 | 2019-10-18 | 北京金山数字娱乐科技有限公司 | A kind of characteristics of image determination, vision answering method, device, equipment and medium |
CN110348462B (en) * | 2019-07-09 | 2022-03-04 | 北京金山数字娱乐科技有限公司 | Image feature determination and visual question and answer method, device, equipment and medium |
CN111047482A (en) * | 2019-11-14 | 2020-04-21 | 华中师范大学 | Knowledge tracking system and method based on hierarchical memory network |
CN111291803B (en) * | 2020-01-21 | 2022-07-29 | 中国科学技术大学 | Image grading granularity migration method, system, equipment and medium |
CN111291803A (en) * | 2020-01-21 | 2020-06-16 | 中国科学技术大学 | Image grading granularity migration method, system, equipment and medium |
CN111310848A (en) * | 2020-02-28 | 2020-06-19 | 支付宝(杭州)信息技术有限公司 | Training method and device of multi-task model |
CN111310848B (en) * | 2020-02-28 | 2022-06-28 | 支付宝(杭州)信息技术有限公司 | Training method and device for multi-task model |
CN112732879A (en) * | 2020-12-23 | 2021-04-30 | 重庆理工大学 | Downstream task processing method and model of question-answering task |
CN113704437B (en) * | 2021-09-03 | 2023-08-11 | 重庆邮电大学 | Knowledge base question-answering method integrating multi-head attention mechanism and relative position coding |
CN113704437A (en) * | 2021-09-03 | 2021-11-26 | 重庆邮电大学 | Knowledge base question-answering method integrating multi-head attention mechanism and relative position coding |
Also Published As
Publication number | Publication date |
---|---|
CN106126596B (en) | 2019-08-23 |
Similar Documents
Publication | Title |
---|---|
CN106126596B (en) | A kind of answering method based on stratification memory network | |
CN113544703B (en) | Efficient off-policy credit allocation | |
US20220067278A1 (en) | System for entity and evidence-guided relation prediction and method of using the same | |
CN108681610B (en) | generating type multi-turn chatting dialogue method, system and computer readable storage medium | |
Chen et al. | Strategies for training large vocabulary neural language models | |
CN110969020B (en) | CNN and attention mechanism-based Chinese named entity identification method, system and medium | |
CN113190688B (en) | Complex network link prediction method and system based on logical reasoning and graph convolution | |
Xia et al. | Fully dynamic inference with deep neural networks | |
CN110019843A (en) | The processing method and processing device of knowledge mapping | |
CN111782961B (en) | Answer recommendation method oriented to machine reading understanding | |
CN114912419B (en) | Unified machine reading understanding method based on recombination countermeasure | |
CN115510814B (en) | Chapter-level complex problem generation method based on dual planning | |
CN113362963A (en) | Method and system for predicting side effects among medicines based on multi-source heterogeneous network | |
Moriya et al. | Evolution-strategy-based automation of system development for high-performance speech recognition | |
CN114880428B (en) | Method for recognizing speech part components based on graph neural network | |
Serras et al. | User-aware dialogue management policies over attributed bi-automata | |
CN116502648A (en) | Machine reading understanding semantic reasoning method based on multi-hop reasoning | |
CN114036938B (en) | News classification method for extracting text features by combining topic information and word vectors | |
Cífka et al. | Black-box language model explanation by context length probing | |
Eyraud et al. | TAYSIR Competition: Transformer+RNN: Algorithms to Yield Simple and Interpretable Representations | |
CN117744760A (en) | Text information identification method and device, storage medium and electronic equipment | |
Cho et al. | Parallel parsing in a Gradient Symbolic Computation parser | |
CN111723186A (en) | Knowledge graph generation method based on artificial intelligence for dialog system and electronic equipment | |
KR20210002027A (en) | Apparatus and method for evaluating self-introduction based on natural language processing | |
Mo et al. | Fine grained knowledge transfer for personalized task-oriented dialogue systems |
Legal Events
Code | Title |
---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |