Embodiment
To make the purpose, technical solution, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
Fig. 1 shows a flow chart of a method for establishing a semantic key index provided by one embodiment of the present invention, including the following steps:
S11: In a dialogue corpus, analyze the part of speech of each word in the input sentence and the reply sentence of each sentence pair, and extract from each sentence the semantic keys that meet a preset part-of-speech requirement, where the preset part of speech includes at least one of: noun, adjective, verb;
S12: Cluster all sentences in the dialogue corpus according to the semantic keys, so that the sentences of each category correspond to the same semantic key;
S13: Train a sequence-to-sequence model using the sentences of all categories in the dialogue corpus to obtain a coding network that maps a sentence into a real-valued vector;
S14: Encode the sentences belonging to the same category using the coding network to obtain a real-valued vector set corresponding to each semantic key;
S15: Select multiple real-valued vectors from the real-valued vector set corresponding to each semantic key to form a memory matrix, and establish a semantic key index between each semantic key and each memory matrix.
In the present embodiment, since every operation is based on the input sentences and reply sentences in the dialogue corpus, the collection of sentences for the dialogue corpus is particularly important. When gathering dialogue data, STC (Short Text Conversation) data can be taken from social network platforms such as Weibo, Twitter, and forums (for example, a post published by a blogger on Weibo describing what he has seen, heard, or felt, together with the replies of other users to that post: the blogger's post can be understood as the input sentence in the STC data, and the information replied by other users can be understood as the reply sentence in the STC data). The STC data collected from the major social network platforms is stored in the dialogue corpus. The data can also be obtained from instant messaging software; since this kind of software carries real-time communication between two parties, two adjacent sentences generally also have correlation and meet the standard of STC data.
For step S11, in the dialogue corpus, the part of speech of each word in the input sentence and the reply sentence of each sentence pair is analyzed. For ease of description, only a few sentence pairs are chosen as examples, such as the following dialogue sentences:
1. "Which constellation do you like most?" "I like Capricorn most";
2. "Which constellation do you like most?" "Because I am a Sagittarius, I like Sagittarius";
3. "Zhou Jielun's movies are very good to watch" "Zhou Jielun's songs are also good to listen to".
First, these sentences are subjected to linguistic processing such as word segmentation and normalization; then, for each sentence, the semantic keys in it are found, where a semantic key can be understood as the head word of a sentence. A simple way of finding it is, for example, to take the nouns, verbs, and adjectives that make up the sentence; of course, a more complicated and accurate method can also be set. Then, according to the semantic keys, the sentences in the corpus are aggregated into different categories.
For example, the above sentences are part-of-speech tagged with a tool such as LTP (Language Technology Platform), obtaining:
"Which constellation do you like most?"
Noun: constellation
Verb: like
"I like Capricorn most"
Noun: Capricorn
Verb: like
"Because I am a Sagittarius, I like Sagittarius"
Noun: Sagittarius
Verb: like
"Zhou Jielun's movies are very good to watch"
Nouns: Zhou Jielun, movie
Adjective: good-looking
"Zhou Jielun's songs are also good to listen to"
Nouns: Zhou Jielun, song
Adjective: pleasing to the ear
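The part-of-speech-based extraction described above can be sketched as follows. This is a minimal illustration that assumes the sentences have already been tagged (a real system would call a tagger such as LTP); the tag names and the `KEEP_TAGS` set are hypothetical placeholders.

```python
# Minimal sketch of semantic-key extraction from POS-tagged sentences.
# The tagged input below stands in for the output of a real POS tagger
# such as LTP; the tag names ("n", "v", "a", ...) are illustrative only.

KEEP_TAGS = {"n", "v", "a"}  # noun, verb, adjective

def extract_semantic_keys(tagged_sentence, keep_tags=KEEP_TAGS, nouns_only=False):
    """Return the words whose POS tag meets the preset part-of-speech requirement."""
    tags = {"n"} if nouns_only else keep_tags
    return [word for word, tag in tagged_sentence if tag in tags]

tagged = [("I", "r"), ("most", "d"), ("like", "v"), ("Capricorn", "n")]
print(extract_semantic_keys(tagged))                   # ['like', 'Capricorn']
print(extract_semantic_keys(tagged, nouns_only=True))  # ['Capricorn']
```

Restricting the tag set (for example to nouns only, as in the example below) is how the "preset part-of-speech requirement" of step S11 would be tuned.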
Each satisfactory semantic key in each sentence is extracted. Taking nouns as an example, the semantic key of "Which constellation do you like most?" is "constellation", the semantic key of "I like Capricorn most" is "Capricorn", the semantic key of "Because I am a Sagittarius, I like Sagittarius" is "Sagittarius", the semantic key of "Zhou Jielun's movies are very good to watch" is "Zhou Jielun", and the semantic key of "Zhou Jielun's songs are also good to listen to" is "Zhou Jielun".
In the linguistic processing of a sentence, such as word segmentation and normalization, the topic meaning of the input sentence can also be extracted. Using the extracted topic meaning as the semantic key in the subsequent process generally achieves a better effect.
For example: "I do not believe in constellations, and I do not even know which constellation I am".
If the semantic key is extracted from the nouns, verbs, and adjectives in the sentence, the output sentence guided by the semantic key is probably also related to "constellation". If instead the topic meaning is used as the semantic key, for example, the topic "does not like constellations" can be extracted; then, in the subsequent process, a reply sentence unrelated to "constellation", such as "Do you believe in blood types?", is likely to be obtained.
In the word segmentation, normalization, and other linguistic processing of these sentences, the different reply sentences corresponding to the same input sentence can also be grouped. For example:
1. "Which constellation do you like most?" "I like Capricorn most";
2. "Which constellation do you like most?" "Because I am a Sagittarius, I like Sagittarius";
3. "Which constellation do you like most?" "What is a constellation?";
4. "Which constellation do you like most?" "I don't know";
5. "Which constellation do you like most?" "I am very fond of both Cancer and Gemini; if forced to say which I like most, I don't know".
After grouping, the pairs can be classified according to the reply sentence. For example, sentence pairs 1, 2, and 5, which give specific replies about constellations, form one group, and sentence pairs 3 and 4, with meaningless replies, form another group. The meaningless reply sentences 3 and 4 can then be deleted, further simplifying the sentence pairs in the dialogue corpus and improving the efficiency of extracting semantic keys.
At the same time, since the corresponding sentence pairs are grouped in advance, semantic key indexes of different categories can also be created for special demands. When there is a demand for a targeted group of users, fewer dialogue sentences than in other categories are often involved, and the dialogue sentence pairs can be filtered by category. For example, when creating a conversational device for a children's early learning machine, only the sentences about poems, nursery rhymes, and animation may be obtained, and popular or modern sentences unsuitable for children can be screened out in advance. This not only significantly improves the efficiency of creating the semantic key index, but also allows the semantic key index to be managed in a modular way, with a separate index created for each category, thereby facilitating later maintenance.
For step S12, the sentences in the dialogue corpus are clustered according to the semantic keys determined in step S11, and the sentences are classified.
For example, "Capricorn" and "Sagittarius" belong to the category of constellations and are uniformly clustered as "constellation". We obtain:
Clustered under the semantic key "constellation": "Which constellation do you like most?", "I like Capricorn most", and "Because I am a Sagittarius, I like Sagittarius" form one class;
Clustered under the semantic key "Zhou Jielun": "Zhou Jielun's movies are very good to watch" and "Zhou Jielun's songs are also good to listen to" form one class.
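The clustering of step S12 can be sketched as a simple grouping operation. The `CATEGORY_MAP` normalization table below is a hypothetical stand-in for whatever mechanism unifies keys such as "Capricorn" and "Sagittarius" under "constellation".

```python
from collections import defaultdict

# Sketch of step S12: group sentences that share a semantic key.
# CATEGORY_MAP is a hypothetical normalization table unifying related
# keys under one category key.
CATEGORY_MAP = {"Capricorn": "constellation", "Sagittarius": "constellation"}

def cluster_by_semantic_key(sentences_with_keys):
    clusters = defaultdict(list)
    for sentence, key in sentences_with_keys:
        clusters[CATEGORY_MAP.get(key, key)].append(sentence)
    return dict(clusters)

data = [
    ("Which constellation do you like most?", "constellation"),
    ("I like Capricorn most", "Capricorn"),
    ("Because I am a Sagittarius, I like Sagittarius", "Sagittarius"),
    ("Zhou Jielun's movies are very good to watch", "Zhou Jielun"),
]
clusters = cluster_by_semantic_key(data)
print(sorted(clusters))                 # ['Zhou Jielun', 'constellation']
print(len(clusters["constellation"]))   # 3
```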
For step S13, a model is trained on the sentences of all the categories classified in step S12, and a coding network that maps a sentence into a real-valued vector can be obtained.
For step S14, according to the category of each semantic key, the sentences belonging to the same category are encoded with the coding network determined in step S13, and the real-valued vector set corresponding to each semantic key can be obtained.
For example, the sentences under the semantic key "constellation" are encoded with the coding network to obtain the corresponding real-valued vectors, and the sentences under the semantic key "Zhou Jielun" are encoded with the coding network to obtain the corresponding real-valued vectors.
Since the dialogue corpus illustrated in step S11 contains only a few sentence pairs, while in practical applications the number of sentence pairs in the dialogue corpus can be very large, the real-valued vectors in a real-valued vector set can also be very numerous.
For step S15, multiple vectors are selected from the real-valued vector set corresponding to each semantic key to form a memory matrix. For example, multiple real-valued vectors are selected from the real-valued vector set corresponding to the semantic key "constellation" to form a memory matrix S, and a semantic key index is established between the semantic key "constellation" and the matrix S. Through the semantic key, the corresponding matrix can be found. Finally, a multitude of semantic keys and their corresponding memory matrices are obtained, which can be stored in a storage medium and called by other devices.
In the present embodiment, as one implementation, training a sequence-to-sequence model using the sentences of all categories in the dialogue corpus includes: training a sequence-to-sequence model and its variants using the sentences of all categories in the dialogue corpus, where the sequence-to-sequence model and its variants include: an attention model and an auto-encoding model.
In the present embodiment, the main feature of these models is that a coding network encodes a sentence into real-valued vectors, and a decoding network then decodes the real-valued vectors into a sentence. When a plain sequence-to-sequence model is trained, the coding network encodes the input sentence into a single sentence vector; when an attention model is trained, the coding network encodes the input sentence into a set of vectors; when an auto-encoding model is trained, the input sentence is reproduced as the reply.
It can be seen from this implementation that, by extracting the semantic key of each sentence in the dialogue corpus, classifying the sentences, determining the real-valued vectors of the sentences in each category's set, forming the real-valued vectors of each category into a corresponding memory matrix, and establishing an index with the corresponding semantic key, the relevant semantic information is stored in the form of real-valued vectors. Establishing a semantic key index is equivalent to turning the corpus into a readable and writable storage device, so that the semantic information contained in it can be applied conveniently, and semantic expansion and association become easier.
Another embodiment of the present invention provides a flow chart of a method for establishing a semantic key index, including the following steps:
S151: Determine the center point of a real-valued vector set by a means algorithm;
S152: Determine the distance between each real-valued vector in the real-valued vector set and the center point;
S153: Select a preset number of real-valued vectors in ascending order of distance to form a memory matrix.
In the present embodiment, in practical applications the dialogue corpus can collect a large number of sentence dialogues, so that the number of sentence pairs in the corpus is very large, the real-valued vectors in each real-valued vector set are also very numerous, and the memory matrix can become very large. It is therefore necessary to filter out the sentences with stronger relevance by performing real-valued vector selection.
For step S151, the center point of each real-valued vector set is determined by using the K-means algorithm.
For step S152, according to the center point of each real-valued vector set determined in step S151, the distance between the center point and each real-valued vector in the set is determined, where common distance algorithms include the Euclidean distance and the Manhattan distance.
For step S153, a preset number of real-valued vectors are selected in ascending order of their distance to the center point to form the memory matrix, where the smaller the distance, the stronger the relevance, and the larger the distance, the weaker the relevance.
It can be seen from this implementation that, when there are too many real-valued vectors in the real-valued vector set corresponding to a semantic key, the real-valued vectors with stronger relevance are chosen according to the distance of each real-valued vector to the center point, which reduces the size of the memory matrix while keeping the precision of the real-valued vectors corresponding to the semantic key.
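Steps S151 to S153 can be sketched as follows, using the mean of the set as the center point and Euclidean distance for ranking; the sample vectors are purely illustrative.

```python
import math

# Sketch of steps S151-S153: keep the preset number of real-valued
# vectors closest to the mean (center point) of the set. Euclidean
# distance is used here; Manhattan distance would work the same way.

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_memory_matrix(vectors, preset_count):
    dim = len(vectors[0])
    # S151: the center point is the mean of the vector set
    center = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    # S152/S153: rank by distance to the center, ascending, keep the top ones
    ranked = sorted(vectors, key=lambda v: euclidean(v, center))
    return ranked[:preset_count]

vecs = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.1], [5.0, 5.0]]
print(build_memory_matrix(vecs, 2))  # [[1.0, 1.0], [0.1, 0.1]]
```

The outlier [5.0, 5.0] pulls the center point toward itself but is still ranked last, so it is excluded from the memory matrix.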
One embodiment of the present invention provides a flow chart of a method for generating a reply sentence for a dialogue, including the following steps:
S21: Extract at least one semantic key from an input sentence;
S22: Expand the at least one semantic key into multiple semantic key sets using a preset semantic mapping relation;
S23: According to each semantic key in the multiple semantic key sets, index the memory matrix matched with each semantic key;
S24: Input at least the memory matrix into the coding network to be decoded, and select one of the decoded sentences as the reply sentence.
In the present embodiment, for step S21, first, the input sentence is received, where the sentence can be one input by a user or one generated adaptively by a machine. Through linguistic operations such as word segmentation and normalization of the input sentence, at least one semantic key in the sentence is extracted. For example, the input sentence is: "Van Gogh's paintings are very beautiful". At least one semantic key is extracted from it: "Van Gogh's paintings".
For step S22, using the preset semantic mapping relation, the semantic key "Van Gogh's paintings" is expanded into the semantic key set {"sunflower", "The Starry Night"}.
For step S23, according to each semantic key in the semantic key set, the memory matrix matched with each semantic key is indexed. For example, the semantic key "sunflower" indexes the corresponding memory matrix in the external memory, and the other semantic keys likewise index their corresponding memory matrices in the external memory.
For step S24, the memory matrices indexed in step S23 are input into the coding network to be decoded, so that multiple sentences are decoded, and one sentence is chosen from them as the reply sentence.
For example: input "Van Gogh's paintings are very beautiful".
The semantic key "Van Gogh's paintings" is expanded into: "sunflower", "The Starry Night".
When the expanded semantic key is "sunflower", the output is "Really very beautiful, especially Van Gogh's sunflowers; the flowers in the painting are like burning flames, showing the passion for life of a fanatical artist".
When the expanded semantic key is "The Starry Night", the output is "Really very beautiful; among them is Van Gogh's The Starry Night, which signifies the striving and struggling spirit of mankind".
When mapping a semantic key, the semantic keys it can be expanded into can be controlled. For example, no matter what the extracted semantic key is (such as "beautiful"), it can be forced to expand into "Leonardo da Vinci", so that the output sentence may become "Leonardo da Vinci, like Van Gogh, was an outstanding artist".
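The controllable expansion of step S22 can be sketched as a small mapper; the mapping table and the "forced" key are hypothetical, and a real system could draw the expansions from synonym lists or a knowledge graph instead.

```python
# Sketch of the preset semantic mapping relation of step S22.
# MAPPING is a hypothetical expansion table; forced_key demonstrates
# the controllability described above (the input key is ignored).

MAPPING = {
    "Van Gogh's paintings": ["sunflower", "The Starry Night"],
}

def expand(input_key, forced_key=None):
    """Map an input semantic key to one or more output semantic keys."""
    if forced_key is not None:
        return [forced_key]
    # fall back to identity mapping when no expansion is registered
    return MAPPING.get(input_key, [input_key])

print(expand("Van Gogh's paintings"))                       # ['sunflower', 'The Starry Night']
print(expand("beautiful", forced_key="Leonardo da Vinci"))  # ['Leonardo da Vinci']
```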
It can be seen from this implementation that the semantics of the generated reply can be controlled by specifying different semantic keys and corresponding mapping methods. Different semantic keys or different semantic mapping methods can be selected, so as to improve the diversity of the replies the model generates for the same sentence and to control the direction of the dialogue semantics, thereby improving the user experience.
In the present embodiment, as one implementation, inputting at least the memory matrix into the coding network to be decoded includes:
inputting the memory matrix and the sentence vector of the input sentence together into the coding network to be decoded, where the sentence vector of the input sentence is pre-generated by importing the input sentence into the coding network.
In the present embodiment, the input sentence is also entered into the coding network to determine its sentence vector, and then the memory matrix and the sentence vector of the input sentence are input into the coding network at the same time to be decoded. For example, for "Van Gogh's paintings are very beautiful", the extracted semantic key to be expanded is "beautiful", and the expanded semantic key is "connotation". Through the expanded semantic key "connotation", the corresponding memory matrix is indexed in the external memory; at the same time, the input sentence "Van Gogh's paintings are very beautiful" is input into the coding network to determine the sentence vector of the input. The memory matrix and the sentence vector of the input sentence are then decoded together, so that the corresponding sentence "Van Gogh's paintings also have great depth" is decoded.
It can be seen from this implementation that decoding the sentence vector of the input sentence together with the memory matrix can make the decoded sentence closer to an answer to the input sentence and make the generated sentence contextually related to the input sentence, improving the intelligence of the reply while improving the user experience and increasing user satisfaction.
As one implementation, in the present embodiment, the at least one semantic key is expanded into multiple semantic key sets, the expansion including: identity (same-word) mapping, synonym mapping, antonym mapping, and knowledge-graph mapping.
In the present embodiment, for example, the input sentence is: "Zhou Jielun's movies are very good to watch", and the extracted semantic keys are "Zhou Jielun" and "movie". A variety of different semantic keys can be mapped out by different expansion methods: "Zhou Jielun", "Kun Ling", "movie", "song", "album", and so on.
With the different semantic keys mapped out, sentences on various aspects can also be obtained in decoding, such as:
"Zhou Jielun's artistic skills are excellent"
"During the shooting of the movie, Kun Ling also showed up"
"This movie of Zhou Jielun's was released on January 19th, 2012"
"Zhou Jielun's songs are also good to listen to"
"The theme song of this movie was sung by Zhou Jielun, and an album of the same name was also released"
It can be seen from this implementation that, by mapping out different semantic keys, decoding yields sentences on various aspects, improving the diversity of replies and thus improving the user experience.
As one implementation, in the present embodiment, after a round of dialogue is completed, the input sentence and the reply sentence of the round are added to the dialogue corpus as a dialogue pair, so as to expand the semantic key index.
In the present embodiment, since the input sentence comes from outside and the reply sentence is generated by decoding, the pair may not yet exist in the dialogue corpus; therefore, the input sentence and the reply sentence of the round are added to the corpus as a dialogue sentence pair, so as to expand the semantic key index.
It can be seen from this implementation that the semantic key index is updated by collecting the completed dialogues of each round, and the updated semantic key index becomes richer and more complete.
The above method for establishing a semantic key index and the method for generating a reply sentence for a dialogue can run and be optimized independently of each other, or can be combined into an integral system. As shown in Fig. 4, the system can be applied to the dialogue chat of a robot, where the first sentence of a sentence pair is called the post, and the dialogue sentence that responds to it is called the comment.
The external semantic memory of the system is a tensor constructed in the form of matrix tables, where each matrix represents the possible output sentences corresponding to a certain semantic key. Each row vector of a matrix is formed from one sentence, and all the row vectors in one matrix table correspond to the same specific semantic key. In generating an output dialogue sentence, a semantic key is extracted from the input sentence, and the final output dialogue sentence is generated using the data stored in the external memory together with the sequence-to-sequence model. By manipulating the semantic key, the theme or semantics of the output sentence can be guided.
The external-semantic-memory-guided sequence-to-sequence learning framework consists of three parts: an encoder E, a decoder D, and an external semantic memory M, where Fig. 5 describes its overall framework and data flow.
The external semantic memory is incorporated into the encoder-decoder framework; the memory can be considered to be built with massive data (possibly unsupervised) outside the training dataset. Therefore, the external semantic memory M can be regarded as an explicit store of external knowledge, which enables the encoder-decoder model to form the comment using knowledge outside the post. The use of external semantic memory is similar to the way we converse. In our dialogues, we usually pay attention to some specific semantic key words of a sentence (such as keywords or topics), then think of relevant reply keywords or a reply topic direction according to our goals or background knowledge, and finally form an output sentence from the combined semantics to make the reply.
We simulate this dialogue process with the external semantic memory module. First, an index is constructed from the semantic keys and the external semantic memory. When an input sentence arrives and a response is to be made, the semantic key of the input sentence is extracted, the output semantic key is then obtained according to the mapping relation, the memory is addressed according to the obtained output semantic key, and finally the decoder model generates the output sentence based on the input sentence and the sentences extracted from the external memory.
The whole framework of the encoder-decoder with external memory is formulated as follows.
Assume x_{1:T} = {x_1, x_2, ..., x_T} represents the words in the post, where T represents the length of the post. In the equation (p, c_{1:T}) = E(x_{1:T}), the encoder module E receives the word embeddings and produces a dense representation p of the input sentence together with a set of context vectors c_{1:T} = {c_1, c_2, ..., c_T} (abbreviated c). Here, we simply use a GRU (gated recurrent unit) recurrent neural network as the encoder.
We extract the semantic keys from the input sentence. A typical semantic key is a word or phrase; when extracting semantic keys from the input sentence, the LTP tool with POS (part-of-speech) labels can be used. Multiple semantic keys can usually be extracted from the input sentence, and the extracted semantic keys are called input semantic keys. We can select one input semantic key; different selection methods can be used, for example iterative selection or random selection. After one input semantic key is selected, by choosing different input semantic keys we can generate output sentences with different semantic preferences. That is, once the input semantic key k^{(i)} is found, it is converted by the semantic key mapper S into the output semantic key k^{(o)}, with the formula: k^{(o)} = S(k^{(i)}).
The semantic key mapper S can be a very simple mapping, for example, directly using the input semantic key as the output semantic key, i.e. k^{(o)} = k^{(i)}. Alternatively, a very complicated mapping can be performed, for example a word derived from synonyms or from a knowledge graph. It is even possible, no matter what the word of the input semantic key is, to directly use a specified word as the output semantic key, forcing the specified output semantic key to be used when the output sentence is generated. This makes the semantic key mapper controllable.
After the output semantic key is determined, we need to use it to establish an index into the external semantic memory. As depicted in Fig. 5, the memory module M consists of k storage units, where k is the number of all semantic keys. Each semantic key corresponds to the address or index of a corresponding storage unit, and each storage unit is an L*D matrix associated with a certain semantic key, where L represents the number of sentence rows in the storage unit and D represents the dimension of a sentence. A reader R reads the selected storage unit m to produce the external memory context vector r, where r = R(m, p).
Inspired by the Neural Turing Machine, we apply content-based addressing to the matrix of the selected storage unit. Given a representation p, the reader R returns a weighted sum over the matrix m, r = Σ_{l=1}^{L} w_l m(l), where m(l) denotes the l-th row of m and the weighting vector is given by w_l = exp(β cos(p, m(l))) / Σ_{l'=1}^{L} exp(β cos(p, m(l'))). Here β is a coefficient controlling the sharpness of the weight vector; to keep the weights sharp, the value of β is set to 100 herein.
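The content-based read can be sketched in a few lines; this is a minimal illustration of a softmax over cosine similarities, sharpened by β, under the assumption (carried over from the Neural Turing Machine) that cosine similarity is the content-matching function.

```python
import math

# Sketch of the content-based read: a beta-sharpened softmax over the
# cosine similarity between the input representation p and each row of
# the selected memory matrix m, producing the context vector r.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def read_memory(m, p, beta=100.0):
    scores = [beta * cosine(row, p) for row in m]
    mx = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(m[0])
    # r is the weight-blended combination of the memory rows
    return [sum(w * row[i] for w, row in zip(weights, m)) for i in range(dim)]

m = [[1.0, 0.0], [0.0, 1.0]]
p = [1.0, 0.1]
r = read_memory(m, p)
print([round(x, 3) for x in r])  # the weight concentrates on the first row
```

With β = 100 the softmax is nearly one-hot, so r is effectively the single most similar memory row; a smaller β would blend several rows.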
After the memory context vector r is obtained, it is appended to the original context vectors c_{1:T}, forming a new joint context vector set c' = {c_1, ..., c_T, r}. During decoding, the decoder D uses the context vectors from the post and the external semantic memory to produce the response output (denoted y), i.e. y = D(p, c').
The overall process of external-semantic-memory-guided sequence-to-sequence learning is summarized in the following algorithm. It should be noted that the construction of the external memory is independent of the training of the encoder-decoder; given the external memory, the encoder-decoder model can be trained with a normal gradient descent algorithm. The generation algorithm of the external-semantic-memory-guided sequence-to-sequence model is as follows:
1. Receive the input dialogue sentence;
2. Convert the dialogue sentence into a word embedding sequence x_{1:T};
3. Extract the input semantic key k^{(i)};
4. Map the input semantic key to obtain the output semantic key k^{(o)} = S(k^{(i)});
5. Find the associated memory block matrix m through the output semantic key;
6. Encode the input dialogue sentence with the encoder: (p, c_{1:T}) = E(x_{1:T});
7. Read the memory context vector r = R(m, p) from the external semantic memory;
8. Append r to the original encoder context vectors: c' = {c_1, ..., c_T, r};
9. Decode to generate the output reply sentence y = D(p, c').
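The nine steps above can be sketched as an end-to-end skeleton. Every component here (tagger, mapper S, encoder E, reader R, decoder D, and the toy memory contents) is a stub invented for illustration; only the order of the calls and the data handed between them follows the algorithm.

```python
# End-to-end skeleton of the nine-step generation algorithm.
# All components are stubs standing in for the trained models.

def extract_key(sentence):          # step 3 (stub: last word as the key)
    return sentence.split()[-1]

def map_key(k_in):                  # step 4 (stub mapper S, identity here)
    return {"constellation": "constellation"}.get(k_in, k_in)

MEMORY = {"constellation": [[0.1, 0.2], [0.3, 0.4]]}  # step 5 (toy memory M)

def encode(sentence):               # step 6 (stub encoder E -> p, c_{1:T})
    p = [float(len(sentence)), 1.0]
    contexts = [[1.0, 0.0] for _ in sentence.split()]
    return p, contexts

def read(m, p):                     # step 7 (stub reader R: mean of rows)
    dim = len(m[0])
    return [sum(row[i] for row in m) / len(m) for i in range(dim)]

def decode(p, contexts):            # step 9 (stub decoder D)
    return f"reply built from {len(contexts)} context vectors"

def generate(sentence):
    k_out = map_key(extract_key(sentence))     # steps 3-4
    m = MEMORY[k_out]                          # step 5
    p, contexts = encode(sentence)             # step 6
    contexts.append(read(m, p))                # steps 7-8: append r to c_{1:T}
    return decode(p, contexts)                 # step 9

print(generate("you like constellation"))  # reply built from 4 context vectors
```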
Compared with other generative models, the overall system can generate richer, more specific, and more knowledgeable replies, and can control the semantic direction of the dialogue to a certain extent. Meanwhile, the overall system can be used in robot systems with a chat-dialogue function, improving user satisfaction, thereby increasing the number of interactions between the user and the robot and improving user approval. Applying the overall system to other similar tasks involving dialogue interaction can also improve the overall intelligence of the system and user satisfaction.
The overall framework of the method has been described above; the structure of the external semantic memory is described in detail below.
The content of each storage unit is a matrix containing the embeddings of the K representative comment sentences corresponding to a certain semantic key. A "semantic key" can be represented by a keyword or by a vector of topic meaning. In the case of the keyword-based representation, it is an index from a word to a vector. As shown in the above formula k^{(o)} = S(k^{(i)}), the output semantic key can be obtained by mapping the input semantic key through the semantic key mapper S.
The external semantic memory is built independently of the training process of the encoder-decoder. This separation allows us to add information outside the encoder-decoder training data. Depending on the data from which the external memory is constructed, the encoder can be determined in two different ways.
1. Encoder-decoder: Given parallel STC data, a standard sequence-to-sequence model can be trained in advance. Here, the data can be different from the data used for the external semantic memory, incorporating external knowledge. Once training is complete, the decoder is discarded, leaving only the encoder for converting the dialogue data.
2. Auto-encoder: When the given data are not sentence pairs (for example news, novels, or other text materials), an auto-encoder can be used to perform the conversion training on them.
To build the external semantic memory, we obtain the semantic key of each dialogue sentence and then aggregate these dialogue sentences into classes according to the semantic keys. Once the grouping is complete, we need to build a fixed-size storage unit for each semantic key, i.e., select the k most representative sentences from all the sentences in each group. To select the most representative sentences, semantic analysis must be carried out on all the sentences in the same storage unit to obtain their projection vectors in the space, and the projection vectors are then clustered using the K-means algorithm. Finally, for each cluster, the sentence whose latent projection vector is nearest to the cluster center is selected. When the number of sentences in a storage unit is less than k, zero vectors are used for padding.
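The construction of one fixed-size storage unit can be sketched as follows: a plain Lloyd's K-means over the projection vectors, keeping the vector nearest each cluster center and zero-padding short groups. The data and hyperparameters are illustrative.

```python
import math, random

# Sketch of building one storage unit: cluster the group's projection
# vectors with K-means, keep the vector nearest each cluster center,
# and pad with zero vectors when the group has fewer than k rows.

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    dim = len(vectors[0])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:                       # assign to nearest center
            groups[min(range(k), key=lambda i: dist(v, centers[i]))].append(v)
        centers = [                             # recompute centers
            [sum(v[d] for v in g) / len(g) for d in range(dim)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def build_storage_unit(vectors, k):
    dim = len(vectors[0])
    if len(vectors) < k:                        # pad short groups with zeros
        return vectors + [[0.0] * dim] * (k - len(vectors))
    centers = kmeans(vectors, k)
    return [min(vectors, key=lambda v: dist(v, c)) for c in centers]

vecs = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
print(sorted(build_storage_unit(vecs, 2)))  # one representative per clump
print(build_storage_unit([[1.0, 1.0]], 3))  # [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
```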
It is trained by the external semantic memory of structure with coder-decoder, it is pre- that training process can be divided into data
Processing and end to end coder-decoder train two parts.
1. Data preprocessing: As mentioned above, a sentence may be placed into multiple semantic key groups in the memory structure, and the training data are processed by the same rule. Since the training corpus consists of post-comment sentence pairs, we filter the training data according to semantic keys. This has two benefits: (1) it is more convenient for mini-batch training; (2) some generic responses can be filtered out. The semantic keys of the input sentences are extracted first, then mapped to the semantic keys of the corresponding outputs by a semantic key mapper; the training pairs are then divided into different key groups according to the output semantic keys. Note that one input sentence may correspond to multiple output sentences, so by applying this grouping we classify pairs according to their output sentences. Conversely, if one output sentence is related to many input sentences, this indicates that the output is very generic and carries no real meaning; such meaningless sentences are removed.
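The generic-reply filtering step can be sketched as below. The function name and the threshold of three inputs per output are illustrative assumptions; the original does not specify a cutoff.

```python
from collections import defaultdict

def filter_generic_replies(pairs, max_inputs_per_output=3):
    """Drop training pairs whose output sentence is linked to too many
    distinct input sentences -- such outputs are generic and uninformative."""
    inputs_per_output = defaultdict(set)
    for inp, out in pairs:
        inputs_per_output[out].add(inp)
    return [(inp, out) for inp, out in pairs
            if len(inputs_per_output[out]) <= max_inputs_per_output]
```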
2. End-to-end training and generation: As described above, the external semantic memory is fixed during the training process, so it is essentially an additional input to the decoder. The remainder of the model is a standard encoder-decoder structure, as shown in Figure 5. We randomly select one answer semantic key from the training data and then construct the output sentence. The objective function of the encoder-decoder is defined as follows:
Here, the objective function is fully differentiable. During generation, semantic keys are first extracted from the input sentence; if there are multiple input semantic keys, they are used in turn to generate multiple output sentence candidates. The candidates are then ranked by their likelihood to produce the final top-1 or top-n outputs.
To verify the method, we used about 4.2 million STC (short-text conversation) pairs from Weibo for encoder-decoder training. The LTP toolkit was used for word segmentation and POS tagging. The data set was filtered by the preprocessing method described above, reducing its size to 1.5 million. All the data were used to construct the external memory, while only 1.2 million pairs were used in training, to demonstrate the potential of exploiting an external corpus. 1,000 pairs were extracted from the data set as the test set for objective evaluation.
We built three sequence generation systems: NRM (neural responding machine, standard sequence-to-sequence training), MrRNN (multiresolution recurrent neural network), and the proposed ESED (external semantic memory guided encoder-decoder model). For our model, we used three methods to build the external memory: S2S (sequence-to-sequence model), Atten (neural attention model), and AutoED (auto-encoder), where AutoED is trained on comments only (it can be regarded as using an unsupervised, non-parallel big-data corpus).
The encoder and decoder of ESED have identical structures: a single-layer GRU with 400-dimensional word embeddings and 800-dimensional hidden state vectors. Nouns and adjectives are used as semantic keys. All parameters are initialized uniformly between -0.05 and 0.05. The initial learning rate of the Adam optimizer (a method for stochastic optimization) is 0.0004, and the mini-batch size is set to 64. For NRM, we use the same word embedding and hidden state dimensions as in our model. For MrRNN, we use an LSTM (long short-term memory) in place of the HRED (hierarchical recurrent encoder-decoder) model, and likewise use nouns and adjectives as semantic keys. All these models were trained for multiple epochs to best solve the complex problem at hand. Note that for ESED, when multiple input semantic keys generate multiple output sentence candidates in turn, we rank the candidates in descending order of semantic key length, because we believe that longer keys may carry more information.
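The length-based candidate ranking can be sketched in a few lines; the representation of candidates as (key, sentence) pairs is an assumption for illustration.

```python
def rank_candidates(candidates, n=None):
    """candidates: list of (semantic_key, output_sentence) pairs.
    Rank in descending order of semantic-key length, on the assumption
    that longer keys carry more information; return the top-n."""
    ranked = sorted(candidates, key=lambda c: len(c[0]), reverse=True)
    return ranked if n is None else ranked[:n]
```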
1. Diversity and substance analysis
Here we propose two analysis metrics for this purpose: the diversity and substance of STC. They can be computed directly and easily from the generated sentences, without reference sentences. This avoids the problem that reference coverage in the comparison may be very low.
Diversity reflects the richness of the words in the generated sentences. It is computed over all test sentences as the ratio of the number of distinct words to the total number of words. The underlying assumption is that the richer the generated vocabulary, the more diverse the generated comments. It is easy to see from this metric that frequently used generic answers reduce the diversity measure.
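A direct rendering of this metric, assuming sentences arrive already tokenized into word lists:

```python
def diversity(sentences):
    """Ratio of distinct words to total words over all generated sentences.
    `sentences` is a list of tokenized sentences (lists of words)."""
    words = [w for s in sentences for w in s]
    if not words:
        return 0.0
    return len(set(words)) / len(words)
```

Repeating the same generic reply many times drives the ratio toward zero, which is exactly the penalty the metric intends.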
Substance reflects the substantive information contained in a generated answer, defined as the number of meaningful entities it contains. We do not extract entities with NER (named entity recognition) tools, because such tools usually only identify person names, place names, and organization names, which is relatively limited. Instead, we use a maximum string matching algorithm based on a Wikipedia entity table to identify meaningful entities. We define substance as the average number of entity words per sentence on the test set.
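A sketch of the metric under a greedy reading of "maximum string matching": at each position we take the longest entity-table entry that matches. The exact matching strategy and the word-level representation of entities are assumptions, not spelled out in the original.

```python
def count_entities(sentence, entity_table):
    """Greedy maximum string matching against an entity table.
    `sentence` is a list of words; entities are sequences of words.
    Returns the number of matched entity *words* in the sentence."""
    by_first = {}
    for ent in entity_table:
        by_first.setdefault(ent[0], []).append(tuple(ent))
    count, i = 0, 0
    while i < len(sentence):
        best = 0
        for ent in by_first.get(sentence[i], []):
            if tuple(sentence[i:i + len(ent)]) == ent and len(ent) > best:
                best = len(ent)
        if best:
            count += best  # count entity words, per the definition above
            i += best
        else:
            i += 1
    return count

def substance(sentences, entity_table):
    """Average number of entity words per generated sentence."""
    return sum(count_entities(s, entity_table) for s in sentences) / len(sentences)
```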
For each sentence in test set, we select top-1 and top-5 to calculate the two measurements.As a result such as Fig. 6
Shown in middle table 1.It is observed that ESED is better than other two models all the time in terms of diversity and substance, its difference for
The result of top-1 has statistical meaning.For top-5's as a result, the diversity fall of ESED is less than NRM models.
This shows that ESED can produce more diversified and abundant output statement.The reduction of result of the entity from top-1 to top-5 may
It is due to that the information included in semantic key reduces (longer semantic key row name higher).
We believe the above performance gains of ESED are mainly due to the use of semantic keys. In particular, when the training data of ESED are divided into groups of post-comment pairs sharing the same semantic key, we effectively remove generic reply sentences. To illustrate this concretely, Table 2 in Figure 7 gives some examples. From these examples we can see that NRM and MrRNN tend to produce reasonable, safe, but meaningless answers, while ESED can use different semantic keys to produce more specific and diverse comments. Table 1 in Figure 6 shows the ESED model using single words (nouns, adjectives, idioms) as semantic keys. It would also be meaningful to compare these word-level semantic keys with topic-level semantic keys.
To consider other types of semantic keys, we experimented with topics as semantic keys. The topic model used in our experiments is intended to be suitable for short texts. Compared with LDA (latent Dirichlet allocation), BTM (biterm topic model, a topic model for short texts) is more effective at short-text modeling. In addition, the model should cluster the text at an appropriate granularity: neither too general (uninformative) nor too specific (overfitting). To investigate the influence of the number of topics, we chose two topic counts (200 and 500) in the experiments; the detailed training procedure and settings are otherwise identical to those described above. We first train the topic model as described, then assign each sentence a unique topic by selecting the topic with the maximum probability. We assume that a common semantic mapper is used for topics, so only one topic model needs to be estimated, and the semantic key is simply the topic index. Table 3 in Figure 8 shows the diversity and substance performance at the topic level; for comparison, the corresponding word-level results are also listed.
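The topic assignment step reduces to an argmax over the topic distribution; this one-liner is an illustrative sketch, with the distribution assumed to come from the trained BTM.

```python
def assign_topic(topic_probs):
    """Assign a sentence the single topic with maximum probability;
    the topic index itself then serves as the semantic key.
    `topic_probs` is the per-topic probability list for one sentence."""
    return max(range(len(topic_probs)), key=lambda t: topic_probs[t])
```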
We observe that the topic-level external memory performs far worse than the word-level memory. We believe this is largely because the semantics of a topic are so vague that we cannot control well the semantics of the generated output sentences. In addition, there is no explicit relation between two topics, so it is also difficult to define a meaningful semantic mapper. Moreover, since each sentence can have only one topic, it is harder for comments to share the same topic during ESED training. In fact, we observe that the amount of data available for topic-level ESED training is much smaller. All of these factors lead to the degradation seen in Table 3 in Figure 8. In the remainder of this paper, we focus only on the word-level external memory.
2. Objective evaluation of reply quality
We use the BLEU score (a method for automatic evaluation of machine translation) as a reference-based objective metric for assessing reply comment quality. The BLEU score was first proposed to assess the quality of machine translation; recently it has also been used as an objective indicator of answer quality in dialogue systems. The results are shown in Table 4 in Figure 9.
We observe that for both the top-1 and top-5 generated comments, ESED obtains better BLEU scores. Although some argue that the BLEU score correlates poorly with human judgment when used to evaluate dialogue systems, the consistent performance gains still demonstrate the validity of the proposed ESED method.
3. Human evaluation of reply quality
We also conducted a human evaluation, comparing the reply comment quality of the different models under the top-1 setting. We follow the evaluation criteria of the STC-2 challenge. The appropriateness of an answer is judged by the following four standards:
(1) Fluency: the comment is acceptable as natural language text;
(2) Coherence: the comment should be logically connected and relevant to the original post;
(3) Self-sufficiency: an evaluator can judge whether the comment is suitable by reading the comment alone, without any additional content;
(4) Substance: the comment provides new, eye-catching information.
If standard (1) or (2) is not met, the comment is labeled "L0"; if standard (3) or (4) is not met, the comment is labeled "L1"; otherwise it is labeled "L2". To emphasize the problem of generic comments, we add a special label "LD" to mark comments that are reasonable but generic. When computing the average score, "L2" counts as 2 points, "L1" as 1 point, and "L0" as 0 points. Notably, we also count "LD" as 0 points, to motivate systems to produce diverse, informative replies rather than generic comments. The final average score is the mean of all annotator scores over all test sentences.
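The scoring rule above can be written down directly; note that "LD" deliberately maps to 0, the same as "L0".

```python
SCORE = {"L2": 2, "L1": 1, "L0": 0, "LD": 0}  # "LD" (reasonable but generic) scores 0

def average_score(labels):
    """Mean score over all annotator labels for all test sentences."""
    return sum(SCORE[l] for l in labels) / len(labels)
```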
Our test set consists of 100 sentences from the training set. Six annotators scored the generated answers according to the above standards. The results are shown in Table 5 in Figure 10. For LD, L0, L1, and L2, the numbers are percentages of generated sentences, and Ave is the average score. Clearly, the ESED model produces far fewer generic replies than the baseline NRM model. Although ESED has a larger LD percentage than MrRNN, this is probably because MrRNN tends to produce many incomprehensible answers. In general, the ESED model generates more coherent and more informative comments, which people appreciate. This is also consistent with the conclusions of Table 1 in Figure 6.
4. Analysis of semantic key guidance
The above has demonstrated the effectiveness of the ESED method on STC. These experiments all used a common identity semantic key mapping function; the richness and diversity of the generated comments come from the multiple semantic keys extracted from the same post. However, as described above, a major advantage of ESED is that the semantics of generation can be controlled by manipulating the semantic mapper.
(1) Controllable semantic key mapping methods: Besides using the identity function as the mapper, there are many other automatic methods for mapping input semantic keys. The idea is to find words or phrases related to the input semantic key. Semantic relation mappings (such as synonyms or antonyms) or data-driven mappings (such as word embedding neighborhoods) can be used. In addition to automatic methods, the mapping can also be controlled manually by directly setting the output semantic key. Table 6 in Figure 11 shows the output sentences generated for the same input sentence using different semantic mapping functions. It can be seen that with synonym or antonym mapping functions, the generated sentences are semantically close to the input but differ in content. This semantic richness indicates the advantage of using external memory. We also use randomly selected semantic keys to show the possibilities of guiding the generated comments. As shown in Table 6 in Figure 11, most of these comments maintain semantic relevance and topic consistency. It should be noted, however, that not all semantic mapping functions yield obvious effects.
(2) Knowledge import through external memory: Besides feeding back diverse information, our model can also use the external memory to introduce new knowledge into the generated comments. Considering that, in the external semantic memory building process, the contents of the memory storage units are matrices containing the embeddings of the top-K comment sentences, these matrices may contain knowledge related to the corresponding semantic keys. The embedded knowledge can be associated with words that co-occur in the training data of the external memory, or with related entities. Table 7 in Figure 12 gives some examples.
As can be seen from the table, external information can be introduced by manually specifying a particular semantic key, with essentially no impact on the fluency and coherence of the comment. For example, given the semantic key "Song", the model shows its associative ability and generates the comment "Zhou Jielun's songs are also good", using the knowledge that Zhou Jielun is a film star and singer.
With the method proposed in this paper, by incorporating external semantic memory into the encoder-decoder framework, the ubiquitous generic-answer problem is greatly alleviated, and more diverse and specific responses are generated. Both objective evaluation and human evaluation show its advantages. The separation of external memory construction from neural network training also makes it possible to use non-parallel corpora. Furthermore, the semantics of generation can be controlled by manipulating the semantic mapper, which means richer new sentences can be produced.
Figure 13 shows a structural diagram of a system for creating a semantic key index provided by one embodiment of the present invention. The technical solution of this embodiment is applicable to devices performing the method of creating a semantic key index; the system can execute the method of creating a semantic key index described in any of the above embodiments, and is configured in a terminal.
The system for creating a semantic key index provided in this embodiment includes: a semantic key extraction program module 11, a sentence clustering program module 12, a coding network mapping program module 13, a sentence encoding program module 14, and a semantic key index establishing program module 15.
The semantic key extraction program module 11 is used, in a dialogue corpus, to analyze the part of speech of each word in the input sentence and reply sentence of each sentence pair, and to extract each semantic key in each sentence that meets a preset part-of-speech requirement, where the preset part of speech includes at least one of: noun, adjective, verb. The sentence clustering program module 12 is used to cluster all sentences in the dialogue corpus according to semantic keys, the sentences of each class corresponding to the same semantic key. The coding network mapping program module 13 is used to train a sequence-to-sequence model with the sentences of all classes in the dialogue corpus, obtaining a coding network that can map a sentence to a real-valued vector. The sentence encoding program module 14 is used to encode the sentences belonging to the same class with the coding network, obtaining the real-valued vector set corresponding to each semantic key. The semantic key index establishing program module 15 is used to select multiple real-valued vectors from the real-valued vector set corresponding to each semantic key to form a memory matrix, and to establish the semantic key index over each semantic key and each memory matrix.
Further, the semantic key index establishing program module is used to:
determine the center point of the real-valued vector set by a mean algorithm;
determine the distance between each real-valued vector in the set and the center point;
select a preset number of real-valued vectors in ascending order of distance to form the memory matrix.
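These three steps can be sketched directly; the function name is an illustrative assumption, and plain lists stand in for the real-valued vectors.

```python
import math

def build_memory_matrix(vectors, k):
    """Per the module description: compute the set's center point as the
    mean, then take the k vectors nearest to it (ascending distance)."""
    dim = len(vectors[0])
    center = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return sorted(vectors, key=lambda v: math.dist(v, center))[:k]
```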
Figure 14 shows a structural diagram of a reply sentence generation system for dialogue provided by one embodiment of the present invention. The technical solution of this embodiment is applicable to devices performing the reply sentence generation method for dialogue; the system can execute the reply sentence generation method for dialogue described in any of the above embodiments, and is configured in a terminal.
The reply sentence generation system for dialogue provided in this embodiment includes: a semantic key extraction program module 21, a semantic key expansion program module 22, a memory matrix indexing program module 23, and a sentence reply program module 24.
The semantic key extraction program module 21 is used to extract at least one semantic key from the input sentence. The semantic key expansion program module 22 is used to expand the at least one semantic key into multiple semantic key sets using preset semantic mapping relations. The memory matrix indexing program module 23 is used to index the memory matrix matching each semantic key in the multiple semantic key sets. The sentence reply program module 24 is used to input at least the memory matrices into the coding network for decoding, and to select one of the decoded sentences as the reply sentence.
Further, inputting at least the memory matrices into the coding network for decoding includes: inputting the memory matrices together with the sentence vector of the input sentence into the coding network for decoding, where the sentence vector of the input sentence is pre-generated by importing the input sentence into the coding network.
Further, after one round of dialogue is completed, the input sentence and reply sentence of the current round are added to the corpus as a dialogue pair, so as to expand the semantic key index.
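The flow through modules 21-24 can be sketched as a pipeline. Everything here is a hypothetical stand-in for the trained components: the four callables, their names, and their signatures are assumptions made so the flow can be shown end to end.

```python
def generate_reply(input_sentence, extract_keys, key_mapper, memory_index, decode):
    """End-to-end flow of modules 21-24, with the stages passed in as callables:
      extract_keys(sentence) -> list of semantic keys          (module 21)
      key_mapper(key)        -> expanded list of keys          (module 22)
      memory_index[key]      -> memory matrix for that key     (module 23)
      decode(matrices, sent) -> candidate reply sentences      (module 24)
    """
    keys = extract_keys(input_sentence)
    expanded = [m for k in keys for m in key_mapper(k)]
    matrices = [memory_index[k] for k in expanded if k in memory_index]
    candidates = decode(matrices, input_sentence)
    return candidates[0] if candidates else ""
```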
Herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, without necessarily requiring or implying any actual relation or order between these entities or operations. Moreover, the terms "comprising" and "including" cover not only the listed elements, but also other elements not explicitly listed, as well as elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes that element.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units: they may be located in one place, or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment's solution. Those of ordinary skill in the art can understand and implement them without creative effort.
Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be realized by software plus a necessary general hardware platform, and of course also by hardware. Based on this understanding, the above technical solution, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the method described in each embodiment or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.