CN111046134A - Dialog generation method based on replying person personal feature enhancement - Google Patents
Dialog generation method based on replying person personal feature enhancement
- Publication number
- CN111046134A (application number CN201911062516.8A)
- Authority
- CN
- China
- Prior art keywords
- distribution
- context
- response
- sentence
- dialog
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Abstract
The invention discloses a dialog generation method based on enhancing the personal features of the replier, which comprises the following steps: 1) constructing 2 encoder-decoder base frameworks; 2) using the vMF distribution as a personal feature extractor, constructing a vMF-based VAE model on one encoder-decoder framework to obtain a context-dependent latent variable representing the replier's personal features; 3) using the personal-feature latent variable and the vMF distribution as an information enhancement generator, constructing a vMF-based CVAE generative model on the other encoder-decoder framework to obtain a response that fuses the replier's personal-feature latent variable with the context. By modeling both the replier's personal features and the context, the dialog generation method produces responses that effectively reflect the replier's personal features and achieves better results on the relevant evaluation metrics.
Description
Technical Field
The invention relates to the technical fields of natural language processing and dialogue systems, and in particular to a dialog generation method based on enhancing the personal features of the replier.
Background
With the continuous rise of artificial intelligence in the past two years, more and more AI products have gradually appeared in industrial services across many fields, and dialogue systems, as a new field, have attracted increasing attention. Open-domain dialogue systems [1] are an important direction in human-machine conversation; their goal is to make generated responses as natural, fluent and diverse as possible.
In recent years, continuous progress in deep learning has greatly advanced research on dialogue generation, freeing it from reliance on template matching, retrieval and similar techniques. Current dialogue system methods mainly comprise: (1) generation-based methods, chiefly Seq2Seq models adopting the encoder-decoder framework [2] and generative models based on neural variational encoders [3]; (2) retrieval-based methods, which select the response from a set of candidate responses; here the key is message-response matching, and the matching algorithm must overcome the semantic gap between message and response; (3) hybrid methods, which combine neural generative models with retrieval-based models, inheriting the advantages of both and achieving attractive performance.
These methods mainly consider the diversity of the response and rarely consider the consistency of the responses generated for a given replier; moreover, neural variational encoder models suffer from KL divergence vanishing, so the latent space cannot be used effectively [4], even though that space contains much of the replier's personal-feature information.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art and provides a dialog generation method based on enhancing the replier's personal features. The method introduces the vMF distribution into an encoder-decoder framework to construct a VAE and a CVAE, fusing the replier's personal features in the dialogue context with the context information; compared with existing models, the resulting dialog generation achieves the best results on 5 metrics: Average, Greedy, Extreme, Distinct-1 and Distinct-2.
The purpose of the invention is realized by the following technical scheme:
a dialog generation method based on replying person personal feature enhancement comprises the following steps:
(1) 2 encoder-decoder basic frameworks are constructed:
2 encoder-decoder basic frames are respectively used for reconstructing sentences of a relevant replying person in each section of dialogue context of the training corpus and responses in each section of dialogue;
(2) constructing a personal feature extractor:
The personal feature extractor is used to extract a latent variable z_r representing the personal features of the replier; the vMF distribution is introduced into the encoder-decoder base framework that reconstructs the replier-related sentences of each dialogue context in the training corpus, so as to construct a vMF-based VAE generative model from which the personal-feature latent variable z_r is obtained;
(3) Constructing an information enhancement generator:
The information enhancement generator is used to obtain a response y that fuses the replier's personal-feature latent variable z_r with the context x; it introduces the vMF distribution and the replier's personal-feature latent variable z_r into the encoder-decoder base framework that reconstructs the response of each dialogue in the corpus, constructing a vMF-based CVAE generative model from which the response y is obtained.
Further, in step (1), each dialogue in the training corpus consists of context sentences and a response. The context of a dialogue is denoted x = (x_1, x_2, …, x_n), where x_i is the i-th sentence of the context and n is the number of sentences contained in the dialogue context; each sentence has the form x_i = (w_{i,1}, w_{i,2}, …, w_{i,j}, …, w_{i,N_i}), where w_{i,j} is the j-th word of the i-th sentence and N_i is the number of words in sentence x_i. The response of a dialogue is denoted y = (w_{y,1}, w_{y,2}, …, w_{y,j}, …, w_{y,N_y}), where w_{y,j} is the j-th word of the response and N_y is the number of words contained in response y. The sentences related to the replier's information extracted from each dialogue context are denoted x_r = (x_{r,1}, x_{r,2}, …, x_{r,l}), where l denotes the number of replier-related sentences in the dialogue context.
Further, the following processing is required to obtain the corpus:
(101) deleting from the original corpus the dialogues whose length is less than 3 or greater than 10 sentences, so as to normalize the dialogue length;
(102) the last sentence of each dialog in the corpus is considered as a response, and the rest sentences are considered as contexts.
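As an illustration (not the claimed method's actual code; the function and field names are our own), the corpus preprocessing of steps (101)-(102) can be sketched in Python as:

```python
def preprocess_corpus(dialogues):
    """Sketch of steps (101)-(102): keep dialogues of 3 to 10 sentences,
    then split each into context x (all but last sentence) and response y."""
    examples = []
    for d in dialogues:
        if len(d) < 3 or len(d) > 10:        # (101) normalize dialogue length
            continue
        context, response = d[:-1], d[-1]    # (102) last sentence is the response
        examples.append({"x": context, "y": response})
    return examples
```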
Further, in steps (2) and (3), the vMF distribution, i.e. the von Mises-Fisher distribution, is used to represent a probability distribution on the unit sphere. Its probability density function is as follows:

f_d(x; μ, κ) = C_d(κ) · exp(κ·μ^T x),  with  C_d(κ) = κ^{d/2−1} / ( (2π)^{d/2} · I_{d/2−1}(κ) ),

where d denotes the dimension of the space and x a d-dimensional unit random vector; μ denotes a direction vector on the unit sphere, with ‖μ‖ = 1; κ ≥ 0 denotes the concentration parameter; and I_ρ denotes the modified Bessel function of the first kind of order ρ, where ρ = d/2 − 1. The distribution describes how unit vectors are distributed over the spherical surface.
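For reference, the vMF log-density above can be evaluated numerically as follows (an illustrative sketch using SciPy's exponentially scaled modified Bessel function for stability; not code from the invention):

```python
import numpy as np
from scipy.special import ive  # ive(v, k) = I_v(k) * exp(-k)

def vmf_log_density(x, mu, kappa):
    """Log-density of vMF(mu, kappa) on the unit sphere S^{d-1}."""
    d = mu.shape[0]
    # log C_d(kappa) = (d/2-1) log k - (d/2) log(2 pi) - log I_{d/2-1}(k)
    log_c = ((d / 2 - 1) * np.log(kappa)
             - (d / 2) * np.log(2 * np.pi)
             - (np.log(ive(d / 2 - 1, kappa)) + kappa))
    return log_c + kappa * (mu @ x)
```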
further, in the step (2), the specific steps are as follows:
the personal feature extractor consists of a sentence encoder, a local context encoder, an vMF distribution and reply decoder.
First, the sentence encoder, a hierarchical encoder built on a bidirectional RNN, encodes each sentence x_{r,i} of the replier-information sentences x_r as a vector h_{r,i}; all encoded vectors h_{r,1}, …, h_{r,l} are then taken as input of the local context encoder, finally yielding the latent vector h_{x_r} of the replier-information sentences x_r.
Next, the vMF distribution is used to learn, from the hidden state h_{x_r} of the replier-information sentences x_r, a distribution representing the replier's personal features; the latent variable z_r is then obtained from this distribution by rejection sampling. The sampling formula is

z_r = ω·μ + v·√(1 − ω²),

where ω ∈ [−1, 1] and v is a unit vector tangent to the sphere at the direction parameter μ. Taking z_r as input of the reply decoder, the replier-information sentences x_r are reconstructed; the calculation formula is

p(x_r | z_r) = ∏_{i=1}^{l} ∏_{j=1}^{N_i} p(w_{i,j} | z_r, w_{i,<j}),

where l denotes the number of replier-information sentences x_r in the context, N_i is the length of the i-th sentence of x_r, and w_{i,j} is the representation of the j-th word of the i-th sentence of x_r.

Finally, the model is optimized with the ELBO:

L_VAE = E_{q(z_r|x_r)}[ log p(x_r | z_r) ] − KL( q(z_r|x_r) ‖ p(z_r) ),

where the first term denotes the reconstruction error and the second term the KL divergence between the posterior and the prior; the prior obeys p(z_r) = vMF(·, κ = 0), i.e. the uniform distribution on the sphere, and the posterior obeys q(z_r|x_r) = vMF(μ_r, κ), where the posterior direction parameter is computed as μ_r = h_{x_r} / ‖h_{x_r}‖ and the concentration κ is set to a constant. The KL divergence is then calculated as

KL = κ·I_{d/2}(κ)/I_{d/2−1}(κ) + (d/2 − 1)·log κ − (d/2)·log(2π) − log I_{d/2−1}(κ) + (d/2)·log π + log 2 − log Γ(d/2),

where Γ(·) denotes the Gamma function.
Further, in step (3), the information enhancement generator generates the final response y by combining the personal-feature latent variable z_r with the dialogue context x; the specific steps are as follows:
the information enhancement generator includes a sentence encoder, a global context encoder, an vMF distribution and response decoder.
First, the sentence encoder encodes all context sentences x_1, x_2, …, x_n as vectors h_1, h_2, …, h_n and encodes the response y as a vector h_y; the vectors h_1, …, h_n are taken as input of the global context encoder to obtain the context latent vector h_x.
Second, the context latent vector h_x and the response vector h_y are combined as input to the vMF distribution to obtain a distribution representation, from which the context latent variable z is sampled as follows:

z = ω·μ + v·√(1 − ω²),

where ω ∈ [−1, 1] and v is a unit vector tangent to the sphere at the direction parameter μ.
finally, the context x, the context latent variable z and the replying person characteristic latent variable zrGenerating a response y as a response decoder input;
the generation process is represented as follows:
wherein σ represents a sigmoid function;is a word-embedded representation of the ith word in response y;representing the hidden state of the t step; v and b are parameters to be learned; p is a radical ofvocabThe generation probability of the word list is shown; p is a radical ofvocab(wy,i) Means to generate a word wy,iThe probability of (d); n is a radical ofyRepresents the length of response y; equation (11) represents the generation probability of the response y.
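The per-step decoding described above can be sketched in NumPy as follows (an illustrative sketch with hypothetical weight names in `p`; a softmax is used for the vocabulary distribution, the usual normalizing choice, while the gates use the sigmoid):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def decoder_step(h_prev, w_prev_embed, z, z_r, p):
    """One response-decoder step: a GRU cell over [embedding; z; z_r],
    followed by a vocabulary projection p_vocab = softmax(V h_t + b)."""
    x = np.concatenate([w_prev_embed, z, z_r])
    u = sigmoid(p["Wu"] @ x + p["Uu"] @ h_prev)            # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev)            # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev))  # candidate state
    h = (1.0 - u) * h_prev + u * h_cand                    # new hidden state h_t
    p_vocab = softmax(p["V"] @ h + p["b"])                 # generation probability
    return h, p_vocab
```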
The optimization procedure using the vMF-based CVAE is expressed as follows:

L_CVAE = E_{q(z|x,y)}[ log p(y | x, z, z_r) ] − KL( q(z|x,y) ‖ p(z|x) ),

where p(y | x, z, z_r) represents the generation process, the expectation term represents the reconstruction error, and the KL term is the divergence between the posterior distribution q(z|x,y) = vMF(μ_post, κ) and the prior distribution p(z|x) = vMF(μ_prior, κ). With the vMF concentration parameter κ set to a constant, the posterior parameter μ_post is computed as a linear function of the concatenation [h_x; h_y] normalized onto the unit sphere, and the prior parameter μ_prior as a linear function of h_x normalized onto the unit sphere; thus the posterior of the CVAE depends on both x and y, while the prior is based on x alone. From the prior and the posterior, the KL divergence is obtained as

KL = κ · ( I_{d/2}(κ) / I_{d/2−1}(κ) ) · ( 1 − μ_prior^T μ_post ).
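Because the posterior and the prior share the same constant κ, their normalizing terms cancel and the KL divergence reduces to a simple closed form depending only on the angle between the two direction parameters; a sketch (our own helper, not the patent's code):

```python
import numpy as np
from scipy.special import ive  # ive(v, k) = I_v(k) * exp(-k); the scaling cancels in the ratio

def kl_vmf_shared_kappa(mu_post, mu_prior, kappa):
    """KL( vMF(mu_post, kappa) || vMF(mu_prior, kappa) ) on S^{d-1}.
    Uses E[x] = A_d(kappa) * mu_post, with A_d = I_{d/2} / I_{d/2-1}."""
    d = mu_post.shape[0]
    a_d = ive(d / 2, kappa) / ive(d / 2 - 1, kappa)
    return kappa * a_d * (1.0 - mu_prior @ mu_post)
```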
compared with the prior art, the technical scheme of the invention has the following beneficial effects:
1. To solve the problem that the latent space cannot be used effectively because the KL divergence vanishes in VAE and CVAE, steps (2) and (3) of the invention replace the Gaussian distribution with the vMF distribution in the encoder-decoder-based VAE and CVAE models [5]. In a model using a Gaussian distribution, the KL divergence is computed from the mean and variance of the Gaussian; since the mean and variance change continuously during training, the KL divergence can vanish. With the vMF distribution, the KL divergence is determined by the parameter κ, which is a constant that does not change during training, so the KL divergence cannot vanish and the latent space can be fully used. Experiments show that introducing the vMF distribution solves the KL-vanishing problem.
2. To improve replier consistency in responses, step (2) of the invention uses a vMF-based VAE model to represent the replier information in the context by a vMF distribution and obtains the latent variable z_r of the replier's personal features by sampling; z_r is then applied to the final response generation so that the final response contains information relevant to the replier in the context. Experiments show that extracting the replier's personal features from the context markedly improves replier consistency in responses.
3. To increase the amount of information in the response, step (3) of the invention extracts global context information with a vMF-based CVAE model and combines it with the replier information from the context in the generation process; feeding the global context information into the generation process effectively enriches the information contained in the response. Experiments show that this markedly improves the Distinct-1 and Distinct-2 metrics, so introducing this term helps increase the amount of information in the response.
Drawings
FIG. 1 is a framework diagram of the dialog generation method based on replier personal feature enhancement provided by the invention;
FIG. 2 shows the KL divergence computed during training for the SSVN_Gau, SSVN_Gau-E and SSVN_Gau-G models.
FIG. 3 is a graph showing the results of the corresponding performance of the present invention at different λ values;
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The two dialogue datasets Cornell Movie Dialogs Corpus and Ubuntu Dialogue Corpus [6] are used as an example to illustrate the implementation of the invention. The overall framework of the method is shown in FIG. 1. The algorithm flow of the whole system comprises 3 steps: constructing the encoder-decoder models, extracting personal features with the VAE, and generating the response with the CVAE.
The method comprises the following specific steps:
(1) constructing the input of an encoder-decoder model:
The Cornell Movie Dialogs Corpus contains over 80000 conversations extracted from movies; the Ubuntu Dialogue Corpus contains approximately 500000 multi-turn conversations collected from Ubuntu Internet Relay Chat, each starting with an unresolved technical problem followed by answers working toward a solution. The invention takes these two dialogue datasets as the original corpus for the encoder-decoder models and processes them as follows: (1) dialogues with fewer than 3 or more than 10 turns are deleted from the dataset; (2) the last sentence of each dialogue is taken as the response and the preceding sentences as the dialogue context. Table 1 gives detailed statistics of the two datasets. The Cornell Movie Dialogs Corpus provides 91271 dialogues for training, 871 for validation and 702 for testing, with an average of 5.04 sentences and 16.91 words per dialogue and a vocabulary size of 10000; the Ubuntu Dialogue Corpus provides 448833 dialogues for training, 19584 for validation and 18920 for testing, with an average of 4.94 sentences and 23.67 words per dialogue and a vocabulary size of 20000.
TABLE 1 Dialogue dataset statistics

Dataset                        Train    Valid   Test    Avg. sentences  Avg. words  Vocab size
Cornell Movie Dialogs Corpus   91271    871     702     5.04            16.91       10000
Ubuntu Dialogue Corpus         448833   19584   18920   4.94            23.67       20000
(2) Personal feature extraction using VAE
To obtain the personal features z_r of the replier in each dialogue, the vMF distribution is added to one of the encoder-decoder models to construct a VAE model, trained with the following objective function:

L_VAE = E_{q(z_r|x_r)}[ log p(x_r | z_r) ] − KL( q(z_r|x_r) ‖ p(z_r) ).

The symbols have the meanings given above. The prior distribution is p(z_r) = vMF(·, κ = 0), i.e. the uniform distribution on the sphere, and the posterior distribution is q(z_r|x_r) = vMF( h_{x_r}/‖h_{x_r}‖, κ ). Finally, the replier's personal features z_r are obtained.
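As a numerical check of the KL term (our own illustrative helper, not part of the invention): with the prior uniform on the sphere, the KL divergence depends only on κ and the dimension d, not on the posterior direction μ, which is why fixing κ prevents KL vanishing:

```python
import numpy as np
from scipy.special import ive, gammaln

def kl_vmf_uniform(kappa, d):
    """KL( vMF(mu, kappa) || uniform on S^{d-1} ): a function of kappa and d only."""
    a_d = ive(d / 2, kappa) / ive(d / 2 - 1, kappa)          # mean resultant length
    log_c = ((d / 2 - 1) * np.log(kappa)
             - (d / 2) * np.log(2 * np.pi)
             - (np.log(ive(d / 2 - 1, kappa)) + kappa))      # log C_d(kappa)
    log_uniform = gammaln(d / 2) - np.log(2) - (d / 2) * np.log(np.pi)
    return kappa * a_d + log_c - log_uniform
```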
(3) Generating responses using CVAE
To obtain the final output, a vMF-based CVAE model is used, with the context x as input and the replier personal-feature latent variable z_r as a condition variable; the generation process is trained with the following objective function:

L_CVAE = E_{q(z|x,y)}[ log p(y | x, z, z_r) ] − λ·KL( q(z|x,y) ‖ p(z|x) ),

where λ weights the KL term. This equation represents the training target of the entire model; the symbols have the meanings given above.
In a specific implementation, taking the Cornell Movie Dialogs Corpus dataset as an example, the parameters are set in advance as follows: the word vectors have dimension 200 and are initialized randomly; the sentence encoder adopts a 2-layer bidirectional GRU structure in which each layer contains 600 hidden units; the dimension of z and z_r is set to 50; parameters are updated with the Adam algorithm at an initial learning rate of 0.001; and during training an early-stopping strategy is used, selecting the best model by the variational lower bound on the validation set.
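The parameter settings above can be collected into a configuration sketch (the field names are hypothetical; the values are those stated in the text):

```python
# Hypothetical configuration object for the settings described above.
config = {
    "word_embedding_dim": 200,      # randomly initialized word vectors
    "sentence_encoder": "2-layer bidirectional GRU",
    "hidden_units_per_layer": 600,
    "latent_dim": 50,               # dimension of z and z_r
    "optimizer": "Adam",
    "initial_learning_rate": 0.001,
    "model_selection": "variational lower bound on the validation set (early stopping)",
}
```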
Table 2 shows the results of the present model (SSVN), simplified versions of the model (SVN, SSVN_Gau, SSVN_Gau-E, SSVN_Gau-G) and other models (S2SA, HRED, VHRED, HVMN) on the two datasets under five evaluation metrics (Average, Greedy, Extreme, Distinct-1, Distinct-2).
TABLE 2-1 Automatic evaluation results on the Cornell Movie Dialogs Corpus dialogue dataset
TABLE 2-2 Automatic evaluation results on the Ubuntu Dialogue Corpus dialogue dataset
TABLE 2-3 Ablation performance of the model on the Cornell Movie Dialogs Corpus dialogue dataset
The comparison algorithms in the tables are described below:
S2SA: a standard seq2seq model with an attention mechanism;
HRED: a hierarchical encoding framework for multi-turn dialogue models;
VHRED: a hierarchical encoder-decoder with latent stochastic variables;
HVMN: an encoder-decoder network with a hierarchical structure and variational memory;
SSVN_Gau, SSVN_Gau-E, SSVN_Gau-G: the three degraded models we propose.
Note: SSVN denotes the method proposed by the invention; Gau denotes the Gaussian distribution and vMF the vMF distribution, each serving as the distribution representation in the latent space; substituting them accordingly yields the series of degraded models of SSVN.
FIG. 2 shows the results of SSVN_Gau, SSVN_Gau-E and SSVN_Gau-G in resolving KL divergence vanishing.
FIG. 3 is a graph showing the results of the corresponding performance of the present invention at different λ values;
table 3 shows an example of the above method:
TABLE 3 Examples generated on the Cornell Movie Dialogs Corpus dialogue dataset
As the experimental results in Table 2 show, extracting the replier's personal features and fusing them with the context text considerably improves the automatic evaluation metrics of the dialog generation method. As the results of Table 3 in the specific examples show, the responses generated by the invention are closer to the replier's personal features, and are more diverse and natural than those of previously proposed dialog generation methods.
The present invention is not limited to the above-described embodiments. The foregoing description of the specific embodiments is intended to describe and illustrate the technical solutions of the present invention, and the above specific embodiments are merely illustrative and not restrictive. Those skilled in the art can make many changes and modifications to the invention without departing from the spirit and scope of the invention as defined in the appended claims.
Reference documents:
[1] Lifeng Shang, Zhengdong Lu, and Hang Li. 2015. Neural responding machine for short-text conversation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 1577–1586.
[2] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 27 (NIPS), pages 3104–3112.
[3] D. P. Kingma, D. J. Rezende, S. Mohamed, and M. Welling. 2014. Semi-supervised learning with deep generative models. In Advances in Neural Information Processing Systems 27 (NIPS), pages 3581–3589.
[4] Tiancheng Zhao, Ran Zhao, and Maxine Eskenazi. 2017. Learning discourse-level diversity for neural dialog models using conditional variational autoencoders. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), pages 654–664.
[5] Jiacheng Xu and Greg Durrett. 2018. Spherical latent spaces for stable variational autoencoders. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4503–4513.
[6] Hongshen Chen, Zhaochun Ren, Jiliang Tang, Yihong Eric Zhao, and Dawei Yin. 2018. Hierarchical variational memory network for dialogue generation. In Proceedings of the 2018 World Wide Web Conference (WWW '18), pages 1653–1662.
Claims (6)
1. A dialog generation method based on enhancement of the replier's personal features, characterized by comprising the following steps:
(1) 2 encoder-decoder basic frameworks are constructed:
2 encoder-decoder basic frames are respectively used for reconstructing sentences of a relevant replying person in each section of dialogue context of the training corpus and responses in each section of dialogue;
(2) constructing a personal feature extractor:
The personal feature extractor is used to extract a latent variable z_r representing the personal features of the replier; the vMF distribution is introduced into the encoder-decoder base framework that reconstructs the replier-related sentences of each dialogue context in the training corpus, so as to construct a vMF-based VAE generative model from which the personal-feature latent variable z_r is obtained;
(3) Constructing an information enhancement generator:
The information enhancement generator is used to obtain a response y that fuses the replier's personal-feature latent variable z_r with the context x; it introduces the vMF distribution and the replier's personal-feature latent variable z_r into the encoder-decoder base framework that reconstructs the response of each dialogue in the corpus, constructing a vMF-based CVAE generative model from which the response y is obtained.
2. The dialog generation method based on replier personal feature enhancement of claim 1, characterized in that in step (1), each dialogue in the training corpus consists of context sentences and a response; the context of a dialogue is denoted x = (x_1, x_2, …, x_n), where x_i is the i-th sentence of the context and n is the number of sentences contained in the dialogue context; each sentence has the form x_i = (w_{i,1}, w_{i,2}, …, w_{i,j}, …, w_{i,N_i}), where w_{i,j} is the j-th word of the i-th sentence and N_i is the number of words in sentence x_i; the response of a dialogue is denoted y = (w_{y,1}, w_{y,2}, …, w_{y,j}, …, w_{y,N_y}), where w_{y,j} is the j-th word of the response and N_y is the number of words contained in response y; the sentences related to the replier's information extracted from each dialogue context are denoted x_r = (x_{r,1}, x_{r,2}, …, x_{r,l}), where l denotes the number of replier-related sentences in the dialogue context.
3. The dialog generation method based on the replying person personal feature enhancement as claimed in claim 1 or 2, characterized in that the following processing is required to obtain the corpus:
(101) deleting from the original corpus the dialogues whose length is less than 3 or greater than 10 sentences, so as to normalize the dialogue length;
(102) the last sentence of each dialog in the corpus is considered as a response, and the rest sentences are considered as contexts.
4. The method according to claim 1, wherein in steps (2) and (3), the vMF distribution, i.e. the von Mises-Fisher distribution, is used to represent a probability distribution on the unit sphere, with the probability density function

f_d(x; μ, κ) = C_d(κ) · exp(κ·μ^T x),  with  C_d(κ) = κ^{d/2−1} / ( (2π)^{d/2} · I_{d/2−1}(κ) ),

where d denotes the dimension of the space and x a d-dimensional unit random vector; μ denotes a direction vector on the unit sphere, with ‖μ‖ = 1; κ ≥ 0 denotes the concentration parameter; and I_ρ denotes the modified Bessel function of the first kind of order ρ, where ρ = d/2 − 1; the distribution describes how unit vectors are distributed over the spherical surface.
5. The dialog generation method based on the replying person personal feature enhancement as claimed in claim 1, wherein the specific steps in the step (2) are as follows:
the personal feature extractor consists of a sentence encoder, a local context encoder, an vMF distribution and reply decoder.
First, the sentence encoder, a hierarchical encoder built on a bidirectional RNN, encodes each sentence x_{r,i} of the replier-information sentences x_r as a vector h_{r,i}; all encoded vectors h_{r,1}, …, h_{r,l} are then taken as input of the local context encoder, finally yielding the latent vector h_{x_r} of the replier-information sentences x_r.
Next, the vMF distribution is used to learn, from the hidden state h_{x_r} of the replier-information sentences x_r, a distribution representing the replier's personal features; the latent variable z_r is then obtained from this distribution by rejection sampling. The sampling formula is

z_r = ω·μ + v·√(1 − ω²),

where ω ∈ [−1, 1] and v is a unit vector tangent to the sphere at the direction parameter μ. Taking z_r as input of the reply decoder, the replier-information sentences x_r are reconstructed; the calculation formula is

p(x_r | z_r) = ∏_{i=1}^{l} ∏_{j=1}^{N_i} p(w_{i,j} | z_r, w_{i,<j}),

where l denotes the number of replier-information sentences x_r in the context, N_i is the length of the i-th sentence of x_r, and w_{i,j} is the representation of the j-th word of the i-th sentence of x_r.

Finally, the model is optimized with the ELBO:

L_VAE = E_{q(z_r|x_r)}[ log p(x_r | z_r) ] − KL( q(z_r|x_r) ‖ p(z_r) ),

where the first term denotes the reconstruction error and the second term the KL divergence between the posterior and the prior; the prior obeys p(z_r) = vMF(·, κ = 0), i.e. the uniform distribution on the sphere, and the posterior obeys q(z_r|x_r) = vMF(μ_r, κ), where the posterior direction parameter is computed as μ_r = h_{x_r} / ‖h_{x_r}‖ and the concentration κ is set to a constant. The KL divergence is then calculated as

KL = κ·I_{d/2}(κ)/I_{d/2−1}(κ) + (d/2 − 1)·log κ − (d/2)·log(2π) − log I_{d/2−1}(κ) + (d/2)·log π + log 2 − log Γ(d/2),

where Γ(·) denotes the Gamma function.
6. The dialog generation method based on replier personal feature enhancement of claim 1, characterized in that in step (3), the information enhancement generator generates the final response y by combining the personal-feature latent variable z_r with the dialogue context x; specifically:
the information enhancement generator includes a sentence encoder, a global context encoder, an vMF distribution and response decoder.
First, the sentence encoder encodes all context sentences x_1, x_2, ..., x_n as vectors h_1, h_2, ..., h_n, and encodes the response y as a vector h_y. The vectors h_1, ..., h_n are then used as the input of the global context encoder to obtain the context latent vector h^c.
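A drastically simplified sketch of this encoder stack, with mean-pooling standing in for the patent's bidirectional RNN sentence encoder and a plain tanh RNN standing in for the context encoder; all dimensions, names and weights are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
emb_dim, hid_dim = 16, 16

def encode_sentence(word_embs):
    # Stand-in sentence encoder: mean-pool the word embeddings.
    # (The patent uses a bidirectional RNN sentence encoder here.)
    return word_embs.mean(axis=0)

def encode_context(sent_vecs, W_h, W_x):
    # Stand-in context encoder: a plain tanh RNN over the
    # per-sentence vectors; its final hidden state is the context vector.
    h = np.zeros(hid_dim)
    for v in sent_vecs:
        h = np.tanh(W_h @ h + W_x @ v)
    return h

# Toy context of three sentences of random "word embeddings".
sentences = [rng.normal(size=(n, emb_dim)) for n in (4, 6, 5)]
W_h = 0.1 * rng.normal(size=(hid_dim, hid_dim))
W_x = 0.1 * rng.normal(size=(hid_dim, emb_dim))
h_c = encode_context([encode_sentence(s) for s in sentences], W_h, W_x)
mu = h_c / np.linalg.norm(h_c)  # mean direction for the vMF distribution
```

Normalizing the context vector yields the unit-norm mean direction that parameterizes the vMF distribution in the next step.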
Second, the context latent vector h^c and the response vector h_y are combined as the input of the vMF distribution to obtain a distributional representation, and the context latent variable z is sampled from it as follows:

z = ω·μ + v·√(1 − ω²)
where ω ∈ [−1, 1];
finally, the context x, the context latent variable z and the replying person characteristic latent variable zrGenerating a response y as a response decoder input;
the generation process is represented as follows:
wherein σ represents a sigmoid function;is a word-embedded representation of the ith word in response y;representing the hidden state of the t step; v and b are parameters to be learned; p is a radical ofvocabThe generation probability of the word list is shown; p is a radical ofvocab(wy,i) Means to generate a word wy,iThe probability of (d); n is a radical ofyRepresents the length of response y; equation (11) represents the generation probability of the response y.
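A minimal numeric sketch of the decoding probability in equation (11), assuming a standard softmax output layer; the decoder RNN itself (and its sigmoid gating) is not reproduced from the patent, so precomputed hidden states stand in for s_t:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def response_log_prob(hidden_states, token_ids, V, b):
    """log p(y | x, z, z_r) as in Eq. (11): the sum over decoding steps of
    log p_vocab(w_{y,i}), with p_vocab = softmax(V s_t + b).
    `hidden_states` stands in for the decoder states s_t, which in the
    patent are conditioned on the context x and latent variables z, z_r."""
    logp = 0.0
    for s_t, w in zip(hidden_states, token_ids):
        p_vocab = softmax(V @ s_t + b)  # generation probability over vocab
        logp += np.log(p_vocab[w])      # probability of the target word
    return logp
```

Working in log-probabilities avoids underflow when the product over N_y steps in equation (11) is long.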
The optimization procedure of the CVAE based on the vMF distribution is expressed as follows:

L(θ, φ; x, y, z_r) = E_{q_φ(z | x, y)}[log p_θ(y | x, z, z_r)] − KL(q_φ(z | x, y) ‖ p_θ(z | x))

where p_θ(y | x, z, z_r) represents the generation process, the expectation term represents the reconstruction error, and the KL term is the divergence between the posterior distribution and the prior distribution. The posterior follows vMF(μ_post, κ) and the prior follows vMF(μ_prior, κ), where the concentration parameter κ of both vMF distributions is set to a constant. The posterior parameter μ_post and the prior parameter μ_prior are calculated as follows:

μ_post = W_post [h^c; h_y] / ‖W_post [h^c; h_y]‖
μ_prior = W_prior h^c / ‖W_prior h^c‖

so that the posterior of the CVAE depends on both x and y, while the prior is based on x only. From the prior and the posterior, the KL divergence is obtained as follows:

KL(vMF(μ_post, κ) ‖ vMF(μ_prior, κ)) = κ · (I_{d/2}(κ) / I_{d/2−1}(κ)) · (1 − μ_prior^T μ_post)
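The closed form above follows because the normalizing constants cancel when both vMF distributions share the same κ. It can be checked numerically; `bessel_ratio` computes A_d(κ) = I_{d/2}(κ)/I_{d/2−1}(κ) by series summation and is an illustrative helper, not part of the patent:

```python
import math
import numpy as np

def bessel_ratio(d, kappa, terms=200):
    # A_d(kappa) = I_{d/2}(kappa) / I_{d/2-1}(kappa), via the series for I_nu.
    def log_iv(nu):
        logs = [(2 * m + nu) * math.log(kappa / 2.0)
                - math.lgamma(m + 1) - math.lgamma(m + nu + 1)
                for m in range(terms)]
        mx = max(logs)
        return mx + math.log(sum(math.exp(t - mx) for t in logs))
    return math.exp(log_iv(d / 2) - log_iv(d / 2 - 1))

def kl_vmf_shared_kappa(mu_post, mu_prior, kappa):
    """KL( vMF(mu_post, kappa) || vMF(mu_prior, kappa) ).
    With a shared kappa the normalizers cancel, leaving
    kappa * A_d(kappa) * (1 - mu_prior . mu_post)."""
    d = mu_post.shape[0]
    return kappa * bessel_ratio(d, kappa) * (1.0 - float(mu_prior @ mu_post))
```

The KL is zero when prior and posterior mean directions coincide and grows as they diverge, while staying bounded by 2κA_d(κ), which keeps the training signal stable.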
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911062516.8A CN111046134B (en) | 2019-11-03 | 2019-11-03 | Dialog generation method based on replier personal characteristic enhancement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111046134A true CN111046134A (en) | 2020-04-21 |
CN111046134B CN111046134B (en) | 2023-06-30 |
Family
ID=70232833
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911062516.8A Active CN111046134B (en) | 2019-11-03 | 2019-11-03 | Dialog generation method based on replier personal characteristic enhancement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111046134B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113254597A (en) * | 2021-06-23 | 2021-08-13 | 腾讯科技(深圳)有限公司 | Model training method, query processing method and related equipment |
CN114398904A (en) * | 2021-11-22 | 2022-04-26 | 重庆邮电大学 | Open field conversation generation method based on multi-granularity feature decoupling |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102609926A (en) * | 2011-07-05 | 2012-07-25 | 天津大学 | ROAD-detection-based index similarity sorting filtering method |
US20170083238A1 (en) * | 2015-09-23 | 2017-03-23 | Hanan Potash | Processor that uses plural form information |
CN106841085A (en) * | 2016-06-05 | 2017-06-13 | 乌鲁木齐职业大学 | Gas measuring method based on KPCA |
WO2018016581A1 (en) * | 2016-07-22 | 2018-01-25 | ヤマハ株式会社 | Music piece data processing method and program |
CN109033069A (en) * | 2018-06-16 | 2018-12-18 | 天津大学 | A kind of microblogging Topics Crawling method based on Social Media user's dynamic behaviour |
CN110032642A (en) * | 2019-03-26 | 2019-07-19 | 广东工业大学 | The modeling method of the manifold topic model of word-based insertion |
Non-Patent Citations (2)
Title |
---|
BO XU; DAOZHI LIN; LONGBIAO WANG; HONGYANG CHAO; WEIFENG LI; QINMIN LIAO: "Performance comparison of local directional pattern to local bin", 《2014 7TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING》 * |
HUANG JIAJIA: "Research on topic models based on deep learning", 《Chinese Journal of Computers》 * |
Also Published As
Publication number | Publication date |
---|---|
CN111046134B (en) | 2023-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | Structure-aware abstractive conversation summarization via discourse and action graphs | |
CN111143509B (en) | Dialogue generation method based on static-dynamic attention variation network | |
CN111159368B (en) | Reply generation method of personalized dialogue | |
CN112115687B (en) | Method for generating problem by combining triplet and entity type in knowledge base | |
CN113158665A (en) | Method for generating text abstract and generating bidirectional corpus-based improved dialog text | |
Xie et al. | Attention-based dense LSTM for speech emotion recognition | |
CN113435211B (en) | Text implicit emotion analysis method combined with external knowledge | |
CN109033069B (en) | Microblog theme mining method based on social media user dynamic behaviors | |
CN112364161B (en) | Microblog theme mining method based on dynamic behaviors of heterogeneous social media users | |
CN110991290A (en) | Video description method based on semantic guidance and memory mechanism | |
CN113407663B (en) | Image-text content quality identification method and device based on artificial intelligence | |
CN112597769B (en) | Short text topic identification method based on Dirichlet variational self-encoder | |
CN110069611B (en) | Topic-enhanced chat robot reply generation method and device | |
CN110991190A (en) | Document theme enhanced self-attention network, text emotion prediction system and method | |
CN111914553B (en) | Financial information negative main body judging method based on machine learning | |
CN111046134A (en) | Dialog generation method based on replying person personal feature enhancement | |
CN111046157B (en) | Universal English man-machine conversation generation method and system based on balanced distribution | |
Mathur et al. | A scaled‐down neural conversational model for chatbots | |
Ashfaque et al. | Design and Implementation: Deep Learning-based Intelligent Chatbot | |
CN113947074A (en) | Deep collaborative interaction emotion reason joint extraction method | |
Chang et al. | A semi-supervised stable variational network for promoting replier-consistency in dialogue generation | |
CN117150320A (en) | Dialog digital human emotion style similarity evaluation method and system | |
Sun et al. | Convntm: conversational neural topic model | |
CN116629272A (en) | Text generation method and system controlled by natural language | |
CN116628203A (en) | Dialogue emotion recognition method and system based on dynamic complementary graph convolution network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||