CN115495566A - Dialog generation method and system for enhancing text features


Info

Publication number
CN115495566A
CN115495566A
Authority
CN
China
Prior art keywords
keyword
vector
text
semantic
encoder
Prior art date
Legal status
Pending
Application number
CN202211238085.8A
Other languages
Chinese (zh)
Inventor
王烨
廖靖波
于洪
雷大江
黄昌豪
杨峻杰
卞政轩
Current Assignee
Chongqing University of Posts and Telecommunications
Original Assignee
Chongqing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Posts and Telecommunications
Priority to CN202211238085.8A
Publication of CN115495566A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the field of human-machine dialog, and in particular to a dialog generation method and system for enhancing text features. The method comprises: obtaining a question text and a reply text, and extracting the keywords of the question text through the TextRank algorithm to obtain a keyword sequence; introducing a keyword encoder, which encodes each keyword through an attention mechanism to obtain the corresponding keyword vector; concatenating the keyword vector with the semantic vector and feeding the result into a first multilayer perceptron to obtain a keyword semantic vector containing rich semantics; concatenating the keyword semantic vector with the question text vector and passing the result through a second multilayer perceptron to obtain an input vector; and training a dialog generation model according to the input vector, calculating a loss value with a loss function, back-propagating, and adjusting the parameters of the dialog generation model. The invention strengthens the weight of keywords and enhances the feature representation of the text, so as to generate higher-quality dialog text.

Description

Dialog generation method and system for enhancing text features
Technical Field
The invention relates to the field of human-machine dialog, in particular to an open-domain generative model that enhances the feature representation of dialog, and specifically to a dialog generation method and system for enhancing text features.
Background
Human-machine dialog systems are mainly divided into task-oriented and non-task-oriented (open-domain) applications. Compared with a task-oriented dialog system, an open-domain dialog system does not need to execute a specific task, and its replies are more open-ended. Chatbots can currently be classified into three types: retrieval-based, generative, and knowledge-graph-based. A retrieval-based chatbot extracts the most suitable reply from an existing dialog corpus using ranking and matching techniques, but it can only return text already in the corpus and therefore cannot produce diverse dialog; moreover, as the corpus grows, reply retrieval slows down and degrades the chat experience.
With the development of end-to-end deep learning models, open-domain dialog models have solved part of these problems, making generated replies richer. The end-to-end encoder-decoder model used by generative chatbots encodes the dialog into a feature vector, and the decoder samples each word of the generated reply from the vocabulary. Such a model can therefore produce dialog that does not appear in the training corpus, overcoming the limitation that retrieval-based dialog can only follow corpus templates, and yielding richer replies. However, because the generative model samples words from the vocabulary and assembles them into a reply in sampling order, its representation of dialog features is incomplete, so it easily produces low-quality or semantically irrelevant replies.
This problem can be illustrated with the Seq2Seq model, the earliest end-to-end generation model, which made a major contribution to the field of text generation; subsequent chatbots are essentially based on the Seq2Seq paradigm. It contains two recurrent neural networks (RNNs): an encoder and a decoder. The encoder encodes the input sequence into a semantic vector, and the decoder decodes the semantic vector into an output sequence. However, RNN encoding and decoding must proceed autoregressively from left to right, which makes parallel computation difficult and leads to high time complexity on long sequences. At the same time, an RNN struggles to model long-range context dependencies, so the extracted features lack important information.
Existing deep models attend equally to words at different positions in a text sequence. Although attention mechanisms address word weighting within a sequence, current dialog models still optimize only the maximum-likelihood objective between the generated text and the reference text, with no dedicated objective for learning attention weights, so the importance weights the model learns for different words are inaccurate. Without an attention mechanism, all words carry equal weight in the text feature, which does not match the open-domain conversation setting.
Disclosure of Invention
The invention provides a dialog generation method and system for enhancing text features, aiming to solve the problems that existing generative models cannot fully characterize the features of dialog text, that recurrent neural networks lose features, and that all words in a text sequence receive the same weight in the text feature, which easily leads to generic or inaccurate dialog replies.
In a first aspect, the present invention provides a dialog generation method for enhancing text features, comprising the steps of:
S1, obtaining a question text and a reply text, and extracting the keywords of the question text through the TextRank algorithm to obtain a keyword sequence (an illustrative sketch of this extraction step follows the list below); acquiring a question text vector of the question text through an input encoder;
S2, introducing a keyword encoder, the keyword encoder encoding the keyword sequence through an attention mechanism to obtain a keyword vector;
S3, concatenating the keyword vector and the semantic vector and feeding the result into a first multilayer perceptron to obtain a keyword semantic vector containing rich semantics;
S4, concatenating the keyword semantic vector and the question text vector and passing the result through a second multilayer perceptron to obtain an input vector;
S5, training a dialog generation model according to the input vector and the reply text, calculating a loss value with a loss function, back-propagating, and adjusting the parameters of the dialog generation model;
and S6, inputting the text to be replied to into the trained dialog generation model to generate the dialog.
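A minimal Python sketch of the TextRank extraction in step S1 follows; the tokenization, window size, and top-k cutoff are illustrative assumptions, since the disclosure does not fix them. TextRank amounts to running PageRank over a word co-occurrence graph:

```python
# Minimal TextRank keyword extraction (illustrative sketch; window size,
# tokenization, and top-k are assumptions, not fixed by this disclosure).
import networkx as nx

def textrank_keywords(words, window=4, top_k=5):
    # Build an undirected co-occurrence graph over the words.
    graph = nx.Graph()
    for i, w in enumerate(words):
        for u in words[i + 1 : i + window]:  # neighbors inside a sliding window
            if u != w:
                graph.add_edge(w, u)
    scores = nx.pagerank(graph)              # TextRank = PageRank on this graph
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

question = "how do neural dialogue models represent the meaning of a question".split()
print(textrank_keywords(question))           # prints the top-ranked keywords
```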
Further, step S4 further comprises:
acquiring a reply text vector of the reply text with an output encoder;
concatenating the keyword semantic vector and the question text vector, passing the result through the second multilayer perceptron to obtain a first fused feature, and inputting the first fused feature into a prior network to obtain prior distribution parameters;
concatenating the keyword semantic vector, the question text vector, and the reply text vector, passing the result through a third multilayer perceptron to obtain a second fused feature, and inputting the second fused feature into a recognition network to obtain approximate posterior distribution parameters;
and reparameterizing the approximate posterior distribution parameters to obtain a latent variable, and initializing the latent variable through a linear transformation to obtain the input vector (a sketch of this reparameterization follows below).
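A hedged sketch of this recognition branch, assuming diagonal Gaussian distributions and illustrative layer sizes (the disclosure fixes neither):

```python
import torch
import torch.nn as nn

class RecognitionHead(nn.Module):
    """Maps the second fused feature to approximate posterior parameters,
    reparameterizes a latent z, and linearly transforms z into the decoder's
    initial input vector. All sizes are illustrative assumptions."""
    def __init__(self, feat_dim=512, latent_dim=64, dec_dim=512):
        super().__init__()
        self.to_params = nn.Linear(feat_dim, 2 * latent_dim)  # -> (mu, log_var)
        self.init_proj = nn.Linear(latent_dim, dec_dim)       # z -> input vector

    def forward(self, fused_feature):
        mu, log_var = self.to_params(fused_feature).chunk(2, dim=-1)
        eps = torch.randn_like(mu)               # noise sampled once per forward pass
        z = mu + torch.exp(0.5 * log_var) * eps  # reparameterization: z = mu + sigma * eps
        return self.init_proj(z), mu, log_var
```

The reparameterization keeps sampling differentiable, so the loss gradient can flow back through mu and log_var during training.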
Further, the keyword encoder encodes the keyword sequence through an attention mechanism to obtain the keyword vector, as follows:
h_t = Enc_key(e(K))
Enc_key(e(k_i)) = LSTM(input_i)
[Two equation images in the original define the attention weight α(e(k_i), e(k_j)) and the weighted input input_i as an attention-weighted combination of the embeddings of the preceding keywords.]
wherein Enc_key() denotes the keyword encoder, K = k_1, k_2, ..., k_t denotes the keyword sequence, k_i denotes the i-th keyword, h_t denotes the keyword vector, e() denotes the word-embedding function, input_i denotes the weighted input vector of keyword k_i, LSTM() denotes the long short-term memory network LSTM, and α(e(k_i), e(k_j)) denotes the attention between keyword k_i and keyword k_j, 1 ≤ j < i.
Further, obtaining the semantic vector with a semantic encoder comprises:
h_s = Enc_sem(e(S), (h_0, c_0))
wherein Enc_sem() denotes the semantic encoder, h_s denotes the semantic vector, h_0 denotes the initial hidden state of the semantic encoder, c_0 denotes the initial cell state of the semantic encoder, and S denotes the semantic text corresponding to the question text.
In a second aspect, based on the method proposed in the first aspect, the invention provides a dialog generation system for enhancing text features, comprising a sample module, a keyword extraction module, an encoding module, a fusion module, and a training module, wherein:
the sample module is configured to acquire multiple groups of dialog samples, each dialog sample comprising a question text and a reply text;
the keyword extraction module is configured to extract keywords from the question text to form a keyword sequence;
the encoding module comprises an input encoder, an output encoder, a keyword encoder, and a semantic encoder, and is configured to encode the data of the sample module and the keyword extraction module;
the fusion module comprises a first fusion module and a second fusion module:
the first fusion module is configured to fuse the keyword vector and the semantic vector to obtain a keyword semantic vector containing rich semantics;
the second fusion module is configured to fuse the keyword semantic vector and the question text vector;
and the training module is configured to train the dialog generation model, calculate the loss with a loss function, back-propagate, and adjust the parameters of the dialog generation model.
Further, the system also comprises a third fusion module, a prior network, and a recognition network:
the third fusion module is configured to fuse the keyword semantic vector, the question text vector, and the reply text vector;
the prior network is configured to receive the output of the second fusion module and obtain the prior distribution parameters;
and the recognition network is configured to receive the output of the third fusion module and obtain the approximate posterior distribution parameters.
The invention has the beneficial effects that:
the invention provides an open type dialog generating method for enhancing dialog text feature expression. When the ordinary RNN is used for coding the text, the front part of the text is easy to ignore, and the weight of the content of the text which is more backward is larger, so the ordinary RNN cannot code the long text. The long text dependency problem can be solved by using LSTM, but with the same weight between all words of the text; if only the attention mechanism is used to solve the problem, since the optimization function for specifically optimizing the attention is not used, but the maximum likelihood optimization function generated by the dialogue is used, it is very difficult to learn the correct weight, which may result in the low weight of the key text and the high weight of the non-key text, thereby misleading the model optimization. The present invention chooses to explicitly extract keywords from text using a keyword extraction method and encode the keywords by an attention mechanism rather than the entire text. Specifically, attention calculation is carried out on the keyword features and the semantic category features which are coded by using an attention mechanism, and then the result is used as one part of text features and is fused into the text features. Therefore, the weight of the words which are more key in the text sequence in the text features is increased. The method not only can correctly enhance the weight of the key text, but also can avoid the situation that the model learns the wrong weight to cause the situation that the non-key vocabulary occupies higher weight.
Drawings
FIG. 1 is a flow diagram of the dialog generation method for enhancing text features according to the present invention;
FIG. 2 is a schematic diagram of the Seq2Seq model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the CVAE model according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a dialog generation method and system for enhancing text features, which fuse keywords and category semantics to extract feature representations with richer semantics and more accurate weights, fuse these with the original text features to obtain higher-quality input features, and train a dialog generation model to obtain higher-quality dialog.
In one embodiment, the Seq2Seq model shown in FIG. 2 is used to describe the training method of the dialog generation model for enhancing text features, as shown in FIG. 1, comprising:
S11, acquiring a question text C = c_1, c_2, ..., c_m and a reply text X = x_1, x_2, ..., x_n, where c_u denotes the u-th (u ∈ {1, 2, ..., m}) word of the question text, x_v denotes the v-th (v ∈ {1, 2, ..., n}) word of the reply text, m denotes the number of words in the question text, and n denotes the number of words in the reply text; extracting the keywords of the question text through the TextRank algorithm to obtain a keyword sequence K = k_1, k_2, ..., k_t, where k_i denotes the i-th (i ∈ {1, 2, ..., t}) keyword;
S12, acquiring the question text vector of the question text through the input encoder and the semantic vector through the semantic encoder, expressed as:
h_{c,i}, c_{c,i} = Enc_in(e(c_i), (h_{c,i-1}, c_{c,i-1}))
h_s = Enc_sem(e(S), (h_0, c_0))
where h_{c,i} and c_{c,i} denote the hidden state and cell state of the input encoder at step i respectively, Enc_in() denotes the input encoder, Enc_sem() denotes the semantic encoder, h_s denotes the semantic vector, h_0 denotes the initial hidden state of the semantic encoder, c_0 denotes the initial cell state of the semantic encoder, and S denotes the semantic category of the question text (an illustrative encoder sketch follows);
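Both encoders are LSTMs in this formulation. As an illustrative sketch (vocabulary and layer sizes are assumptions), the input encoder Enc_in and the semantic encoder Enc_sem might be instantiated as:

```python
import torch
import torch.nn as nn

class LSTMEncoder(nn.Module):
    """LSTM text encoder, used here to illustrate both the input (question)
    encoder Enc_in and the semantic encoder Enc_sem; sizes are assumptions."""
    def __init__(self, vocab_size=10000, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)   # e(): word embeddings
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

    def forward(self, token_ids, state=None):
        emb = self.embed(token_ids)              # (batch, seq_len, emb_dim)
        outputs, (h, c) = self.lstm(emb, state)  # h, c: final hidden/cell states
        return outputs, (h, c)

enc_in, enc_sem = LSTMEncoder(), LSTMEncoder()
question = torch.randint(0, 10000, (1, 12))   # toy question token ids
_, (h_m, _) = enc_in(question)                # question text vector h_m
category = torch.randint(0, 10000, (1, 3))    # toy semantic-category token ids
_, (h_s, _) = enc_sem(category)               # semantic vector h_s
```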
S13, introducing a keyword encoder, the keyword encoder encoding each keyword through an attention mechanism to obtain the corresponding keyword vector;
S14, concatenating the keyword vector and the semantic vector and feeding the result into the first multilayer perceptron to obtain the keyword semantic vector containing rich semantics, one question text corresponding to one semantic vector, expressed as:
h' = MLP([h_t : h_s])
where h' denotes the keyword semantic vector, h_t denotes the keyword vector, h_s denotes the semantic vector, and MLP() denotes the first multilayer perceptron;
S15, concatenating the keyword semantic vector and the question text vector and passing the result through the second multilayer perceptron to obtain the input vector, expressed as:
s_0 = MLP'([h' : h_m])
where s_0 denotes the input vector of the decoder, i.e., the initial hidden state of the decoder, h_m denotes the question text vector, and MLP'() denotes the second multilayer perceptron (an illustrative fusion sketch follows);
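A minimal sketch of the two fusion steps S14 and S15; the single-layer Tanh MLPs and the 512-dimensional size are assumptions, since the disclosure only specifies concatenation followed by a multilayer perceptron:

```python
import torch
import torch.nn as nn

hid = 512  # illustrative dimensionality
mlp1 = nn.Sequential(nn.Linear(2 * hid, hid), nn.Tanh())  # first multilayer perceptron
mlp2 = nn.Sequential(nn.Linear(2 * hid, hid), nn.Tanh())  # second multilayer perceptron

h_t = torch.randn(1, hid)  # keyword vector (from the keyword encoder)
h_s = torch.randn(1, hid)  # semantic vector (from the semantic encoder)
h_m = torch.randn(1, hid)  # question text vector (from the input encoder)

h_prime = mlp1(torch.cat([h_t, h_s], dim=-1))  # keyword semantic vector h'
s_0 = mlp2(torch.cat([h_prime, h_m], dim=-1))  # decoder input vector / initial state
```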
and S16, training the dialog generation model according to the input vector and the reply text, calculating the loss value with the loss function, back-propagating, and adjusting the parameters of the dialog generation model.
Specifically, as shown in FIG. 2, the first input to the decoder is the <SOS> (start of sentence) tag together with the initial hidden state, i.e., the input vector s_0. The input at each subsequent step is the word x_t of the reply sequence and the hidden state s_{t-1} output at the previous step. Finally, SoftMax maps the output of the decoder into the vocabulary space, and the word with the highest probability in the vocabulary is taken as the generated result, expressed as:
s_t = Dec(s_{t-1}, e(x_t))
x̂_t = argmax(SoftMax(MLP(s_t)))
where s_t denotes the hidden state of the decoder at step t, Dec() denotes the decoder, x̂_t denotes the word generated by the decoder at step t, and MLP() denotes the multilayer perceptron that maps the output into the vocabulary space;
specifically, the Seq2Seq-based dialog generation model uses cross entropy as the loss function, expressed as:
L = -Σ_{t=1}^{n} x_t log x̂_t
where X denotes the reference reply text and X̂ denotes the text generated by the decoder, with x_t and x̂_t their respective words at step t.
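An illustrative PyTorch sketch of this training loop, assuming teacher forcing, an LSTMCell decoder, toy token ids, and an assumed <SOS> id of 1 (none of which the disclosure fixes):

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """LSTM decoder step: consumes the previous word and the previous hidden
    state; a linear layer maps s_t into the vocabulary space before SoftMax.
    Sizes are illustrative assumptions."""
    def __init__(self, vocab_size=10000, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.LSTMCell(emb_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)   # maps s_t to vocab space

    def forward(self, prev_word, s_prev, c_prev):
        s_t, c_t = self.cell(self.embed(prev_word), (s_prev, c_prev))
        return self.out(s_t), s_t, c_t              # logits before SoftMax

dec = Decoder()
loss_fn = nn.CrossEntropyLoss()                     # cross entropy over the vocab
s, c = torch.zeros(1, 512), torch.zeros(1, 512)     # in the method, s would be s_0
reply = torch.tensor([[2, 7, 9]])                   # toy reference reply X
word = torch.tensor([1])                            # <SOS> token id (assumed)
loss = 0.0
for t in range(reply.size(1)):                      # teacher forcing during training
    logits, s, c = dec(word, s, c)
    loss = loss + loss_fn(logits, reply[:, t])
    word = reply[:, t]                              # feed the reference word back in
```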
In one embodiment, the CVAE model shown in FIG. 3 is used to describe the training method of the dialog generation model for enhancing text features, comprising:
S21, acquiring a question text C = c_1, c_2, ..., c_m and a reply text X = x_1, x_2, ..., x_n, where c_u denotes the u-th (u ∈ {1, 2, ..., m}) word of the question text, x_v denotes the v-th (v ∈ {1, 2, ..., n}) word of the reply text, m denotes the number of words in the question text, and n denotes the number of words in the reply text; extracting the keywords of the question text through the TextRank algorithm to obtain a keyword sequence K = k_1, k_2, ..., k_t, where k_i denotes the i-th (i ∈ {1, 2, ..., t}) keyword;
S22, acquiring the question text vector of the question text through the input encoder, the semantic vector through the semantic encoder, and the reply text vector of the reply text through the output encoder, expressed as:
h_{c,i}, c_{c,i} = Enc_in(e(c_i), (h_{c,i-1}, c_{c,i-1}))
h_{x,i}, c_{x,i} = Enc_out(e(x_i), (h_{x,i-1}, c_{x,i-1}))
h_s = Enc_sem(e(S), (h_0, c_0))
where h_{c,i} and c_{c,i} denote the hidden state and cell state of the input encoder at step i, h_{x,i} and c_{x,i} denote the hidden state and cell state of the output encoder at step i, Enc_out() denotes the output encoder, Enc_in() denotes the input encoder, Enc_sem() denotes the semantic encoder, h_s denotes the semantic vector, h_0 denotes the initial hidden state of the semantic encoder, c_0 denotes the initial cell state of the semantic encoder, and S denotes the semantic category of the question text;
S23, introducing a keyword encoder, the keyword encoder encoding each keyword through an attention mechanism to obtain the corresponding keyword vector;
S24, concatenating the keyword vector and the semantic vector and feeding the result into the first multilayer perceptron to obtain the keyword semantic vector containing rich semantics, expressed as:
h' = MLP([h_t : h_s])
where h' denotes the keyword semantic vector, h_t denotes the keyword vector, h_s denotes the semantic vector, and MLP() denotes the first multilayer perceptron;
S25, concatenating the keyword semantic vector and the question text vector, passing the result through the second multilayer perceptron to obtain the first fused feature, and inputting the first fused feature into the prior network to obtain the prior distribution parameters;
S26, concatenating the keyword semantic vector, the question text vector, and the reply text vector, passing the result through the third multilayer perceptron to obtain the second fused feature, and inputting the second fused feature into the recognition network to obtain the approximate posterior distribution parameters;
S27, reparameterizing the approximate posterior distribution parameters to obtain the latent variable, and initializing the latent variable through a linear transformation to obtain the input vector;
and S28, training the dialog generation model according to the input vector and the reply text, calculating the loss value with the loss function, back-propagating, and adjusting the parameters of the dialog generation model.
Specifically, the CVAE-based dialog generation model uses the reconstruction loss and the KL divergence as the loss function, expressed as:
L(θ, φ) = E_{q_φ(z|X,C,K,S)}[log p(X|z,C,K,S)] - KL(q_φ(z|X,C,K,S) || p_θ(z|C,K,S))
where q_φ(z|X,C,K,S) denotes the approximate posterior distribution, p_θ(z|C,K,S) denotes the prior distribution, the expectation term denotes the expectation of reconstructing the reply text X under the approximate posterior distribution, KL denotes the KL divergence between the two distributions, X denotes the reply text, C denotes the question text, K denotes the keyword text, S denotes the semantic text, and z denotes the latent variable.
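A hedged sketch of this objective, assuming both the prior and the approximate posterior are diagonal Gaussians (an assumption; the disclosure does not spell out the parameterization):

```python
import torch
import torch.nn.functional as F

def cvae_loss(recon_logits, target_ids, mu_q, logvar_q, mu_p, logvar_p):
    """Reconstruction NLL plus KL(q_phi(z|X,C,K,S) || p_theta(z|C,K,S)) for
    diagonal Gaussians; the Gaussian form is an illustrative assumption."""
    # Token-level reconstruction loss over the reply text X.
    recon = F.cross_entropy(
        recon_logits.flatten(0, 1), target_ids.flatten(), reduction="sum")
    # Closed-form KL divergence between two diagonal Gaussians.
    kl = 0.5 * torch.sum(
        logvar_p - logvar_q
        + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
        - 1.0)
    return recon + kl
```

Here mu_q and logvar_q come from the recognition network (step S26) and mu_p and logvar_p from the prior network (step S25); recon_logits has shape (batch, seq_len, vocab_size).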
In one embodiment, the keyword encoder encodes each keyword through an attention mechanism to obtain the corresponding keyword vector, as follows:
h_t = Enc_key(e(K))
Enc_key(e(k_i)) = LSTM(input_i)
[Two equation images in the original define the attention weight α(e(k_i), e(k_j)) and the weighted input input_i as an attention-weighted combination of the embeddings of the preceding keywords.]
wherein Enc_key() denotes the keyword encoder, K = k_1, k_2, ..., k_t denotes the keyword sequence, k_i denotes the i-th keyword, h_t denotes the hidden state at step t of the keyword encoder, which in this embodiment is also the keyword vector, e() denotes the word-embedding function, input_i denotes the weighted input vector of keyword k_i, LSTM() denotes the long short-term memory network LSTM, and α(e(k_i), e(k_j)) denotes the attention between keyword k_i and keyword k_j, 1 ≤ j < i.
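An illustrative sketch of such a keyword encoder; the dot-product attention scoring and the residual mixing of e(k_i) with the attended context are assumptions, since the exact forms of α and input_i are given only in the original equation images:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeywordEncoder(nn.Module):
    """Encodes a keyword sequence: each step's LSTM input mixes the current
    keyword embedding with the embeddings of earlier keywords via attention.
    Scoring and mixing choices here are illustrative assumptions."""
    def __init__(self, vocab_size=10000, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.LSTMCell(emb_dim, hid_dim)

    def forward(self, keyword_ids):                       # (1, t) keyword ids
        emb = self.embed(keyword_ids)[0]                  # (t, emb_dim)
        h = torch.zeros(1, self.cell.hidden_size)
        c = torch.zeros(1, self.cell.hidden_size)
        for i in range(emb.size(0)):
            if i == 0:
                input_i = emb[i]                          # no earlier keywords yet
            else:
                scores = emb[:i] @ emb[i]                 # alpha(e(k_i), e(k_j)), j < i
                alpha = F.softmax(scores, dim=0)
                input_i = emb[i] + alpha @ emb[:i]        # attention-weighted input
            h, c = self.cell(input_i.unsqueeze(0), (h, c))
        return h                                          # keyword vector h_t
```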
In an embodiment, the present invention provides a dialog generation system for enhancing text features, comprising a sample module, a keyword extraction module, an encoding module, a fusion module, and a training module, wherein:
the sample module is configured to acquire multiple groups of dialog samples, each dialog sample comprising a question text and a reply text;
the keyword extraction module is configured to extract keywords from the question text to form a keyword sequence;
the encoding module comprises an input encoder, an output encoder, a keyword encoder, and a semantic encoder, and is configured to encode the data of the sample module and the keyword extraction module;
the fusion module comprises a first fusion module and a second fusion module:
the first fusion module is configured to fuse the keyword vector and the semantic vector to obtain a keyword semantic vector containing rich semantics;
the second fusion module is configured to fuse the keyword semantic vector and the question text vector;
and the training module is configured to train the dialog generation model, calculate the loss with a loss function, back-propagate, and adjust the parameters of the dialog generation model.
Specifically, the system further comprises a third fusion module, a prior network, and a recognition network:
the third fusion module is configured to fuse the keyword semantic vector, the question text vector, and the reply text vector;
the prior network is configured to receive the output of the second fusion module and obtain the prior distribution parameters;
and the recognition network is configured to receive the output of the third fusion module and obtain the approximate posterior distribution parameters.
Table 1: DailyDialog dataset evaluation results with act labels
[Table rendered as an image in the original.]
Table 2: DailyDialog dataset evaluation results with emotion labels
[Table rendered as an image in the original.]
Table 3: EmpatheticDialogues dataset evaluation results with emotion labels
[Table rendered as an image in the original.]
Tables 1, 2, and 3 show the automatic evaluation results of our model and the comparison models. Table 1 shows the results of using act semantics on the DailyDialog dataset. Table 2 shows the results of using emotion semantics on the DailyDialog dataset. Table 3 shows the results of using emotion semantics on the EmpatheticDialogues dataset.
As can be seen, our model achieves good results on the 3 different datasets. Our model is superior to the other comparison models on both BLEU and METEOR, and is lower than the Transformer only on ROUGE. This shows that our model can generate higher-quality dialog text across different datasets, so our method has good generalization ability. However, after adding our method, the error bars of the model become larger, indicating that although the method improves the generation quality of the dialog model, the model becomes more complex and its fluctuation increases.
In the present invention, unless otherwise explicitly specified or limited, the terms "mounted," "disposed," "connected," "fixed," "rotated," and the like are to be construed broadly, e.g., as being fixedly connected, detachably connected, or integrated; can be mechanically or electrically connected; the terms may be directly connected or indirectly connected through an intermediate agent, and may be used for communicating the inside of two elements or interacting relation of two elements, unless otherwise specifically defined, and the specific meaning of the terms in the present invention can be understood by those skilled in the art according to specific situations.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. A dialog generation method for enhancing text features, comprising the steps of:
S1, obtaining a question text and a reply text, and extracting the keywords of the question text through the TextRank algorithm to obtain a keyword sequence; obtaining a question text vector of the question text through an input encoder;
S2, introducing a keyword encoder, the keyword encoder encoding the keyword sequence through an attention mechanism to obtain a keyword vector;
S3, concatenating the keyword vector and the semantic vector and feeding the result into a first multilayer perceptron to obtain a keyword semantic vector containing rich semantics;
S4, concatenating the keyword semantic vector and the question text vector and passing the result through a second multilayer perceptron to obtain an input vector;
S5, training a dialog generation model according to the input vector and the reply text, calculating a loss value with a loss function, back-propagating, and adjusting the parameters of the dialog generation model;
and S6, inputting the text to be replied to into the trained dialog generation model to generate the dialog.
2. The method of claim 1, wherein step S4 further comprises:
acquiring a reply text vector of the reply text with an output encoder;
concatenating the keyword semantic vector and the question text vector, passing the result through the second multilayer perceptron to obtain a first fused feature, and inputting the first fused feature into a prior network to obtain prior distribution parameters;
concatenating the keyword semantic vector, the question text vector, and the reply text vector, passing the result through a third multilayer perceptron to obtain a second fused feature, and inputting the second fused feature into a recognition network to obtain approximate posterior distribution parameters;
and reparameterizing the approximate posterior distribution parameters to obtain a latent variable, and initializing the latent variable through a linear transformation to obtain the input vector.
3. The method of claim 1, wherein the keyword encoder encodes the keyword sequence through an attention mechanism to obtain the keyword vector, comprising:
h_t = Enc_key(e(K))
Enc_key(e(k_i)) = LSTM(input_i)
[Two equation images in the original define the attention weight α(e(k_i), e(k_j)) and the weighted input input_i as an attention-weighted combination of the embeddings of the preceding keywords.]
wherein Enc_key() denotes the keyword encoder, K = k_1, k_2, ..., k_t denotes the keyword sequence, k_i denotes the i-th keyword, h_t denotes the keyword vector, e() denotes the word-embedding function, input_i denotes the weighted input vector of keyword k_i, LSTM() denotes the long short-term memory network LSTM, and α(e(k_i), e(k_j)) denotes the attention between keyword k_i and keyword k_j, 1 ≤ j < i.
4. The dialog generation method for enhancing text features according to claim 1, wherein obtaining the semantic vector with a semantic encoder comprises:
h_s = Enc_sem(e(S), (h_0, c_0))
wherein Enc_sem() denotes the semantic encoder, h_s denotes the semantic vector, h_0 denotes the initial hidden state of the semantic encoder, c_0 denotes the initial cell state of the semantic encoder, and S denotes the semantic text corresponding to the question text.
5. A dialog generation system for enhancing text features, characterized by comprising a sample module, a keyword extraction module, an encoding module, a fusion module, and a training module, wherein:
the sample module is configured to acquire multiple groups of dialog samples, each dialog sample comprising a question text and a reply text;
the keyword extraction module is configured to extract keywords from the question text to form a keyword sequence;
the encoding module comprises an input encoder, an output encoder, a keyword encoder, and a semantic encoder, and is configured to encode the data of the sample module and the keyword extraction module;
the fusion module comprises a first fusion module and a second fusion module:
the first fusion module is configured to fuse the keyword vector and the semantic vector to obtain a keyword semantic vector containing rich semantics;
the second fusion module is configured to fuse the keyword semantic vector and the question text vector;
and the training module is configured to train the dialog generation model, calculate the loss with a loss function, back-propagate, and adjust the parameters of the dialog generation model.
6. The dialog generation system according to claim 5, further comprising a third fusion module, a prior network, and a recognition network:
the third fusion module is configured to fuse the keyword semantic vector, the question text vector, and the reply text vector;
the prior network is configured to receive the output of the second fusion module and obtain the prior distribution parameters;
and the recognition network is configured to receive the output of the third fusion module and obtain the approximate posterior distribution parameters.
CN202211238085.8A 2022-10-11 2022-10-11 Dialog generation method and system for enhancing text features Pending CN115495566A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211238085.8A CN115495566A (en) 2022-10-11 2022-10-11 Dialog generation method and system for enhancing text features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211238085.8A CN115495566A (en) 2022-10-11 2022-10-11 Dialog generation method and system for enhancing text features

Publications (1)

Publication Number Publication Date
CN115495566A true CN115495566A (en) 2022-12-20

Family

ID=84473646

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211238085.8A Pending CN115495566A (en) 2022-10-11 2022-10-11 Dialog generation method and system for enhancing text features

Country Status (1)

Country Link
CN (1) CN115495566A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932726A (en) * 2023-08-04 2023-10-24 重庆邮电大学 Open domain dialogue generation method based on controllable multi-space feature decoupling
CN116932726B (en) * 2023-08-04 2024-05-10 重庆邮电大学 Open domain dialogue generation method based on controllable multi-space feature decoupling


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination