WO2018014835A1 - Dialog generation method and apparatus, device, and storage medium - Google Patents

Dialog generation method and apparatus, device, and storage medium

Info

Publication number
WO2018014835A1
WO2018014835A1 (PCT application PCT/CN2017/093417)
Authority
WO
WIPO (PCT)
Prior art keywords
word
hidden layer
vector
round
layer vector
Prior art date
Application number
PCT/CN2017/093417
Other languages
English (en)
French (fr)
Inventor
舒悦
路彦雄
林芬
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Publication of WO2018014835A1 publication Critical patent/WO2018014835A1/zh
Priority to US15/997,912 priority Critical patent/US10740564B2/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3347Query execution using vector based model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G06F40/35Discourse or dialogue representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/55Rule-based translation
    • G06F40/56Natural language generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Definitions

  • The present invention relates to the field of speech processing, and in particular to a dialog generation method and apparatus, device, and storage medium.
  • A dialog system may be a rule-based dialog system, a search-based dialog system, or a generative dialog system.
  • The rule-based dialogue system has a simple structure and high accuracy, but its generalization ability is poor.
  • The search-based dialogue system requires a corpus of relatively high quality and quantity, otherwise problems such as low recall easily arise; the generative dialogue system can build a relatively good language model and generate a corresponding answer sentence for any input sentence.
  • The modeling of generative dialogue systems can be divided into two types: single-round modeling and multi-round modeling.
  • The single-round generative dialogue model only models question-answer pairs and, when handling multi-round dialogue, directly concatenates the context into one long question; but when the number of dialogue rounds is large and the context is long, information compression easily becomes disordered, resulting in low-quality answer sentences. The multi-round generative dialogue model models the multi-round question-answer transfer process, but it easily generates high-frequency answers and its accuracy is low.
  • The embodiments of the present invention provide a dialog generation method, apparatus, device, and storage medium, which can solve the technical problem of low accuracy of the generated dialogue.
  • An embodiment of the present invention provides a dialog generation method, including:
  • determining an initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence;
  • An embodiment of the present invention provides a dialog generating apparatus, including:
  • a hidden layer calculation part, configured to convert each word in the K-th round query sentence into a first word vector, and calculate the forward hidden layer vector and the reverse hidden layer vector of each word according to the first word vector, K being a positive integer greater than or equal to 2;
  • a topic determination part, configured to acquire the content topic of the K-th round query sentence and convert the content topic into a second word vector;
  • a vector calculation part, configured to determine the initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence;
  • a reply output part, configured to generate a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.
  • An embodiment of the present invention provides a dialog generation device, where the device includes an interface circuit, a memory, and a processor; the memory stores a set of program codes, and the processor is configured to call the program codes stored in the memory and to perform the following steps:
  • determining an initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence;
  • An embodiment of the present invention provides a computer storage medium, where the computer storage medium stores computer-executable instructions, and the computer-executable instructions are used to execute the dialog generation method provided by the embodiments of the present invention.
  • In the embodiments of the present invention, each word in the K-th round query sentence is first converted into a first word vector, and the forward hidden layer vector and the reverse hidden layer vector of each word are calculated according to the first word vector; the content topic of the K-th round query sentence is then acquired and converted into a second word vector;
  • a reply sentence for the K-th round query sentence is generated according to the forward hidden layer vector, the reverse hidden layer vector, and the initial hidden layer vector output for the K-th round query sentence; by adding the topic content during dialogue generation, the generation of cross-topic generic high-frequency reply sentences is effectively suppressed.
  • FIG. 1 is a schematic flowchart of a dialog generation method according to an embodiment of the present invention.
  • 2A is a schematic structural diagram of a dialog generation system according to an embodiment of the present invention.
  • 2B is a schematic flowchart of a dialog generation method according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a dialog generating apparatus according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of another dialog generating apparatus according to an embodiment of the present invention.
  • RNN (Recurrent Neural Network): a recurrent neural network, which can be used to model sequential behavior.
  • LSTM (Long Short-Term Memory): a recurrent neural network over time, which can be understood as a cell structure of a recurrent neural network containing an input gate, an output gate, and a forget gate; it is suited to processing and predicting important events separated by very long intervals and delays in a time series.
  • GRU (Gated Recurrent Unit): a variant of the RNN. The GRU merges the forget gate and the input gate into a single update gate, and also merges the cell state and the hidden state, i.e., the separate cell state is removed and information is stored directly in the output, so this structure is simpler than the LSTM. Like the LSTM, the GRU is suited to long-term dependencies, and its cell structure is simpler.
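  • For illustration only, one step of a GRU cell of the kind referred to above could be sketched in Python/NumPy as follows; the weight names (`Wz`, `Uz`, `Wr`, `Ur`, `Wh`, `Uh`) are hypothetical placeholders, biases are omitted, and this is not the exact parameterization used in the embodiments.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: a single update gate z replaces the LSTM's input/forget
    gates, and a reset gate r controls how much past state enters the candidate."""
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate hidden state
    return (1.0 - z) * h_prev + z * h_tilde        # new hidden state
```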
  • One-hot: a vector whose dimension is the dictionary size; each dimension corresponds to one word in the dictionary, only the corresponding bit is 1, and all others are 0.
  • Word vector: a fixed-length, low-dimensional (usually 200- to 300-dimensional) vector used to represent a word; highly correlated words have word vectors that are close to each other.
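  • As a minimal sketch of how a one-hot vector and an embedding space matrix yield a word vector (the dictionary size and the word index below are hypothetical; the 200-dimensional size follows the description in this document):

```python
import numpy as np

vocab_size, embed_dim = 10000, 200                 # assumed dictionary size, word-vector dimension
E = 0.01 * np.random.randn(vocab_size, embed_dim)  # embedding space matrix (learned in practice)

def one_hot(index, size):
    v = np.zeros(size)
    v[index] = 1.0                                 # only the bit of this word is 1
    return v

word_index = 42                                    # hypothetical dictionary index of some word
x = one_hot(word_index, vocab_size) @ E            # its 200-dimensional word vector
# Multiplying a one-hot vector by E simply selects row `word_index` of E.
```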
  • Softmax: the generalization of the logistic regression model to multi-class classification problems.
  • BTM (Biterm Topic Model): a Biterm topic model whose main idea is to count the co-occurring word pairs (i.e., word co-occurrence patterns) formed by any two words in the corpus, and to model these co-occurring word pairs as the modeling unit, so as to solve the problem of sparse corpus features.
  • FIG. 1 is a schematic flowchart of a dialog generation method according to an embodiment of the present invention. As shown in the figure, the method in the embodiment of the present invention includes:
  • In implementation, a multi-round dialogue model can be constructed. As shown in FIG. 2A, each round's query sentence and the corresponding reply sentence can be embedded into a single-round dialogue model, and the multi-round dialogue model can be regarded as an unrolling of the single-round dialogue model; the single-round dialogue model can be divided into an encoder layer, an intention layer, and a decoder layer.
  • The K-th round query sentence input by the user may first be obtained and segmented word by word; one-hot encoding is used to represent the word vector of each word in the query, and the word vector of each word is then converted, through an Embedding Space Matrix (ESM), into a vector of a predetermined dimension.
  • The dimension of the one-hot encoding is the size of the preset dictionary, each dimension corresponding to one word in the dictionary, with only the corresponding bit being 1 and all others 0. The K-th round query sentence is then scanned from front to back, the word vector of each word is input one by one into the forward gated recurrent unit, and the forward hidden layer vector of each word is recorded; the K-th round query sentence is also scanned from back to front, the word vector of each word is input one by one into the reverse gated recurrent unit, and the reverse hidden layer vector obtained after inputting each word is recorded.
  • The forward hidden layer vector of a target word may be calculated according to the first word vector of the target word in the K-th round query sentence and the forward hidden layer vector of the previous word of the target word, which can be expressed as $\overrightarrow{h}_j^{(k)}=\mathrm{GRU}(\overrightarrow{h}_{j-1}^{(k)},x_j^{(k)})$; the reverse hidden layer vector of the target word may be calculated according to the first word vector of the target word in the K-th round query sentence and the reverse hidden layer vector of the next word of the target word, which can be expressed as $\overleftarrow{h}_j^{(k)}=\mathrm{GRU}(\overleftarrow{h}_{j+1}^{(k)},x_j^{(k)})$.
  • For example, if the K-th round query sentence is "你看过电影吗?" ("Have you seen a movie?"), the forward hidden layer vector of the first word "你" is determined from its word vector; the forward hidden layer vector of the second word "看" is determined from the word vector of "看" and the forward hidden layer vector of the first word "你"; the forward hidden layer vector of the third word "过" is determined from the word vector of "过" and the forward hidden layer vector of the second word "看"; and by analogy, the forward hidden layer vectors of the fourth word "电", the fifth word "影", and the sixth word "吗" are calculated in turn.
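  • A minimal sketch of the bidirectional scan described above is given below; `gru_fwd` and `gru_bwd` stand for the forward and reverse gated-recurrent-unit step functions (for example, partial applications of a GRU cell such as the one sketched earlier) and are assumptions, not the exact implementation of the embodiments.

```python
import numpy as np

def encode_query(word_vectors, gru_fwd, gru_bwd, hidden_dim):
    """Scan the query front-to-back and back-to-front, record a forward and a
    reverse hidden layer vector for every word, then concatenate the two."""
    h_fwd, h = [], np.zeros(hidden_dim)
    for x in word_vectors:                      # front-to-back scan
        h = gru_fwd(x, h)
        h_fwd.append(h)
    h_bwd, h = [], np.zeros(hidden_dim)
    for x in reversed(word_vectors):            # back-to-front scan
        h = gru_bwd(x, h)
        h_bwd.append(h)
    h_bwd.reverse()                             # align with the original word order
    # hidden layer vector of word j = [forward ; reverse]
    return [np.concatenate([f, b]) for f, b in zip(h_fwd, h_bwd)]
```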
  • The BTM algorithm can be used to train on each of a plurality of words to determine the probability distribution of each word as a content topic; the K-th round query sentence is then matched against the plurality of words to determine the content topic with the highest probability for the K-th round query sentence. This content topic can be represented by one-hot encoding, and an embedding space matrix for the content topic is constructed, thereby obtaining the word vector $E^{(k)}$ of the content topic.
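  • The following is a rough sketch, under stated assumptions, of how the highest-probability content topic could be selected and embedded; the document only says the query is matched against the word-topic distributions produced by BTM training, so the simple score accumulation used here (with `word_topic_dist` as a per-word topic distribution table) is an assumption for illustration.

```python
import numpy as np

def query_topic_vector(query_words, word_topic_dist, topic_embedding):
    """Pick the most probable content topic of the query from per-word topic
    distributions (obtained offline, e.g. by BTM), then look up its embedding E(k)."""
    n_topics = topic_embedding.shape[0]
    scores = np.zeros(n_topics)
    for w in query_words:
        scores += word_topic_dist.get(w, np.zeros(n_topics))  # unseen words add nothing
    topic_id = int(np.argmax(scores))           # index of the winning topic (its one-hot position)
    return topic_id, topic_embedding[topic_id]  # word vector E(k) of the content topic
```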
  • An initial hidden layer vector output for the K-th round query sentence is determined according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence.
  • In implementation, at the intention layer, the forward hidden layer vector of the last word of the query sentence output by the encoder layer 21 in the K-th round, the word vector $E^{(k)}$ of the content topic, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence are input into a simple recurrent neural network (simple-RNN), and the initial hidden layer vector output for the K-th round query sentence is calculated. It can be expressed as:
  • $h^{(in,k)}=\sigma\big(W^{(in,in)}h^{(in,k-1)}+W^{(in,de)}h_{T}^{(de,k-1)}+W^{(in,en)}\overrightarrow{h}_{T_k}^{(k)}+W^{(in,e)}E^{(k)}\big)$
  • where $W^{(in,in)}$, $W^{(in,de)}$, $W^{(in,en)}$, and $W^{(in,e)}$ are parameters of the simple-RNN neural network, and σ is used to compress the initial hidden layer vector $h^{(in,k)}$ into the interval [0,1], thereby increasing the nonlinear representation capability of the model.
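  • A small sketch of the intention-layer update is shown below; the inputs and parameter names follow the description above, while the additive combination inside the sigmoid is an assumption about the exact simple-RNN form.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def intention_update(h_in_prev, h_de_prev_last, h_en_last, E_k, W):
    """h(in,k) = sigma(W(in,in)·h(in,k-1) + W(in,de)·h(de,k-1) + W(in,en)·h(en,k) + W(in,e)·E(k)),
    squashed into [0,1] to increase the nonlinear representation capability."""
    return sigmoid(W['in_in'] @ h_in_prev +       # previous round's intention-layer output
                   W['in_de'] @ h_de_prev_last +  # last hidden state of the previous round's reply
                   W['in_en'] @ h_en_last +       # forward hidden vector of the query's last word
                   W['in_e']  @ E_k)              # word vector of the content topic
```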
  • S104: generating a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.
  • In some embodiments, the weight of each word in the K-th round query sentence for generating the second reply word may be calculated according to the second hidden layer vector and the hidden layer vector of each word in the K-th round query sentence; according to these weights, a weighted sum of the hidden layer vectors of the words in the K-th round query sentence is calculated, and the weighted sum is used as the contribution of each word in the K-th round query sentence to generating the second reply word.
  • In some embodiments, the probability distribution over each word in the preset dictionary may be calculated according to the third hidden layer vector; the word with the highest probability in the preset dictionary is selected as the second reply word for output, and the third, fourth, fifth reply words and so on are then output in turn; at each step, 50 words can be selected to generate reply sentences word by word, and the 5 sentences with the highest probability are selected.
  • For example, the importance $g_{jt}$ of each word in the query sentence for generating a given reply word can be calculated from the hidden layer vector $h_{t-1}^{(de,k)}$ of the previous word in the reply sentence and the hidden layer vector $h_j^{(k)}$ of each word in the query sentence, where $W^{(de,de)}$ and $W^{(de,en)}$ are parameters of the neural network; the importance $g_{jt}$ is then normalized to obtain the weight $\alpha_{jt}$ of the hidden layer vector of each word in the K-th round query sentence, and finally the weighted sum $c_t=\sum_j\alpha_{jt}h_j^{(k)}$ of the hidden layer vectors of the words in the K-th round query sentence is calculated.
  • The hidden layer vector of the reply word is then generated word by word from this weighted sum, the word vector of the previous reply word, and the hidden layer vector of the previous reply word.
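  • A sketch of the attention step described above follows; `W_de_de` and `W_de_en` correspond to the parameters named in the text, while the additive scoring form with the extra projection vector `v` is an assumption for illustration.

```python
import numpy as np

def attention_context(h_de_prev, enc_hiddens, W_de_de, W_de_en, v):
    """Score every query word against the previous reply-word hidden state,
    normalize the scores, and return the weighted sum of the query hidden vectors."""
    scores = np.array([v @ np.tanh(W_de_de @ h_de_prev + W_de_en @ h_j)
                       for h_j in enc_hiddens])                # importance g_jt of each query word
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                   # normalized weights alpha_jt
    return sum(a * h_j for a, h_j in zip(weights, enc_hiddens))  # contribution (weighted sum) c_t
```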
  • At the intention layer, the content topic of the query sentence "你看过电影吗" ("Have you seen a movie?") 25 is first calculated to be "电影" ("movie"), and the content topic "电影" is encoded to obtain a topic vector; then the output vector of the previous round's intention layer, the output vector of the previous round's decoder layer, the output vector of the current encoder layer 21, and the topic vector are input together into the intention layer, and the initial hidden layer vector is output through the neural network operation. This initial hidden layer vector can be used by the decoder layer to determine the first word of the answer.
  • The decoder layer 23 can be regarded as the inverse process of the encoder layer 21: word vectors and hidden layer vectors can be decoded into natural language. According to the initial hidden layer vector output by the intention layer and the word vector of each word of the query sentence in the attention layer, the answer "我喜欢欧美电影" ("I like European and American movies") is generated. Assuming a dictionary of 10,000 words, the decoder layer 23 generates a probability distribution over these 10,000 words at each decoding step and then selects the most probable word for output each time. The process is as follows:
  • The intention layer 22 outputs the initial hidden layer vector, and this initial hidden layer vector together with the word vector whose first character is the identifier character "_EOS_" is input to the decoder layer 23; the hidden layer vector is updated through the neural network to obtain the second hidden layer vector. The second hidden layer vector generates a probability distribution over the 10,000 words through the softmax regression algorithm, in which the word "我" ("I") has the highest probability, so the reply word "我" is output; then the second hidden layer vector and the word vector of the reply word "我" are used as input to generate the third hidden layer vector, the probability distribution of the next word is calculated from the third hidden layer vector, and the word "喜" with the highest probability is taken as output. The above process is repeated until the special symbol _EOS_ is output, ending the whole process, at which point the reply sentence "我喜欢欧美电影_EOS_" 26 can be generated.
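  • The word-by-word decoding loop above can be sketched as follows; `embed`, `gru_step`, `output_probs`, and `word_for` are assumed helper functions (embedding lookup, decoder GRU step, softmax over the dictionary returning a NumPy probability vector, and index-to-word lookup) rather than the exact implementation.

```python
def greedy_decode(h0, embed, gru_step, output_probs, word_for, eos="_EOS_", max_len=30):
    """Start from the intention-layer vector and the _EOS_ start token, update the
    hidden layer vector step by step, output the most probable dictionary word each
    time, and stop when _EOS_ is produced again."""
    h, prev_word, reply = h0, eos, []
    for _ in range(max_len):
        h = gru_step(embed(prev_word), h)          # update the hidden layer vector
        probs = output_probs(h)                    # softmax distribution over the dictionary
        prev_word = word_for(int(probs.argmax()))  # e.g. "我", then "喜", ...
        if prev_word == eos:
            break
        reply.append(prev_word)
    return "".join(reply)                          # e.g. "我喜欢欧美电影"
```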
  • In the embodiment of the present invention, each word in the K-th round query sentence is first converted into a first word vector, and the forward hidden layer vector and the reverse hidden layer vector of each word are calculated according to the first word vector; then the content topic of the K-th round query sentence is acquired and converted into a second word vector; next, the initial hidden layer vector output for the K-th round query sentence is determined according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; finally, a reply sentence for the K-th round query sentence is generated according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence. By adding the topic content during dialogue generation, the generation of cross-topic generic high-frequency reply sentences is effectively suppressed, and the accuracy of the generated dialogue is improved.
  • The invention relates to the fields of computer technology and machine learning, and uses deep learning technology to enable a robot to understand the semantics of human natural language through multi-round dialogue and to generate corresponding reply sentences.
  • The technical solution provided by this embodiment can not only avoid the low generalization ability of rule-based dialogue systems and the low recall ability of search-based dialogue systems, but also effectively reduce the problem that mainstream statistics-based generative dialogue systems generate high-frequency answer sentences with high probability, thereby taking the practicality of generative dialogue algorithms a step further.
  • Based on the multi-round dialogue model, GRU units are used at the decoder layer 23 to encode single sentences, preventing gradient vanishing; and dialogue topic information based on the BTM algorithm is creatively added at the intention layer 22 as supervisory information for dialogue generation,
  • which can reduce the generation probability of high-frequency answers to a certain extent; at the same time, at the decoder layer 23, a bidirectional attention mechanism (attention layer 24) is used to capture the key information in the context, so that the generated dialogue has better relevance.
  • This embodiment is a dialogue generation method based on a multi-round generative dialogue model. The method includes two processes, training and prediction, in which the input of the multi-round generative dialogue model is the question-answer pairs of the previous four rounds of dialogue together with the question of the current round,
  • and the output of the multi-round generative dialogue model is the answer of the current round generated by the algorithm on the basis of the preceding information.
  • The processing flow is divided into three parts: the processing flows of the coding layer, the intention layer, and the decoding layer.
  • Coding layer: this layer is used to map the input natural language into a vector of a fixed dimension. Its input is therefore a sentence expressed in natural language, and its output is a vector of fixed length. In particular:
  • I> segment the sentence at the word level, and then convert the one-hot representation of each word into a 200-dimensional word vector through the embedding space matrix; II> scan the sentence from front to back, inputting the word vector of each word into the forward GRU network one by one and recording the hidden state after each word; III> scan the sentence in reverse, inputting the word vector of each word into the reverse GRU network one by one and recording the hidden state after each word; IV> take the final state of II> as the fixed-length vector expression of the whole sentence, i.e., the sentence embedding, which serves as an input to the intention layer.
  • V> concatenate in series the forward and reverse hidden layer vectors obtained in II> and III>, i.e., $h_j^{(k)}=[\overrightarrow{h}_j^{(k)};\overleftarrow{h}_j^{(k)}]$, as the expression of that word within the sentence, which serves as an input to the decoding layer.
  • Using the hidden states of the bidirectional structure as the input of the intention layer can describe the key information in the context more accurately and effectively mitigate the problem, in a unidirectional structure, of key information being biased toward the end, because the hidden states of the bidirectional structure give each word a degree of global information, avoiding the problem that the later a word appears in a unidirectional structure, the more information it carries, and making the generated sentence more relevant.
  • Intention layer: the role of this layer is to encode the topic transfer process of the multi-round dialogue.
  • The inputs of the intention layer are the sentence embedding from 1), the final hidden state of the previous round's question-answer decoding layer, the output $h^{(in,k-1)}$ of the previous round's question-answer intention layer, and the topic $E^{(k)}$ of the current round's question; the output is the vector $h^{(in,k)}$ obtained by jointly encoding the current topic and the context information, computed through the simple-RNN as $h^{(in,k)}=\sigma\big(W^{(in,in)}h^{(in,k-1)}+W^{(in,de)}h_{T}^{(de,k-1)}+W^{(in,en)}\overrightarrow{h}_{T_k}^{(k)}+W^{(in,e)}E^{(k)}\big)$.
  • Here $W^{(in,in)}$, $W^{(in,de)}$, $W^{(in,en)}$, and $W^{(in,e)}$ are parameters of the simple-RNN neural network, and σ is used to compress the initial hidden layer vector $h^{(in,k)}$ into the interval [0,1], thereby increasing the nonlinear representation capability of the model; the result serves as an input to the decoding layer.
  • In this process, this embodiment explicitly brings the topic of the current query into the calculation, which is equivalent to adding supervisory information to the computation, so that the generation of the next answer sentence is constrained to that topic, thereby reducing the generation probability of some generic high-frequency answer sentences.
  • Decoding layer: the function of this layer is to output the probability distribution of the next word over the dictionary by analyzing the output vectors of the coding layer and the intention layer.
  • The inputs are the output $h^{(in,k)}$ of the intention layer and the outputs $h_j^{(k)}$ of the coding layer; the output is the probability distribution of the next word over the dictionary. In particular:
  • The next hidden layer state is generated word by word; each hidden layer state is then fed into a fully connected layer, and the probability distribution of the next word over the dictionary is computed by softmax.
  • During training, the loss is calculated as the negative log-likelihood of the corresponding words of the standard answer under the probability distribution of the predicted answer; the total loss over the standard answer is taken as the loss of the current round, and errors are back-propagated using the back-propagation through time (BPTT) algorithm for recurrent neural networks.
  • During prediction, the beam search algorithm from machine learning selects the top 50 most probable words at each step, generates the answer word by word, and outputs the 5 sentences with the highest probability.
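  • A rough beam-search sketch in the spirit of the prediction step above is given below (beam width 50, 5 best sentences); `step` is an assumed function that returns the next-word probability distribution and the updated decoder state, and the log-probability bookkeeping is an illustrative choice, not the exact implementation.

```python
import heapq
import numpy as np

def beam_search(h0, step, eos_id, beam_width=50, n_best=5, max_len=30):
    """Keep the `beam_width` most probable partial answers at every step and
    finally return the `n_best` finished answers with the highest probability."""
    beams = [(0.0, [eos_id], h0)]                       # (log-prob, word ids, decoder state)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, words, h in beams:
            probs, h_next = step(words[-1], h)           # distribution over the dictionary
            for w in np.argsort(probs)[-beam_width:]:    # expand only the top words
                cand = (logp + float(np.log(probs[w] + 1e-12)), words + [int(w)], h_next)
                (finished if int(w) == eos_id else candidates).append(cand)
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        if not beams:
            break
    return heapq.nlargest(n_best, finished or beams, key=lambda c: c[0])
```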
  • The technical solution provided by this embodiment grew out of the translation model. It is well known that the translation model constructs an equivalent space transformation from one language to another, so the semantic space is relatively fixed. In the dialogue model, what must be accomplished is a mapping across multiple semantic spaces, because, for the same question, different people will give different answers. However, in the face of massive data, some generic yet mediocre replies such as "oh, okay" become mainstream in the corpus, so that the trained robots tend to use these high-frequency answers.
  • The technical solution provided by the invention uses the topic information of the semantic segment to reduce the semantic space of the generated sentence, thereby suppressing the generation of high-frequency meaningless sentences to a certain extent. At the same time, the bidirectional attention model extracts the key semantic information more precisely and better guarantees the relevance of the generated sentences.
  • In implementation, the technical solution provided by this embodiment can use the deep learning framework MXNET 0.5.0 to perform training and prediction on a Tesla K40.
  • The technical solution provided by this embodiment can be applied to business scenarios such as chat robots, automatic mail replies, and the automatic generation of candidate answer sentences in social software, and can automatically generate the most suitable answer sentences in real time according to the previous rounds of dialogue, with the generation process fully controlled by the algorithm and requiring no user intervention.
  • For example, in a chat robot, an automatic reply is made directly according to the user's input, achieving the effect of emotional companionship; and in the service of automatically generating candidate answer sentences, several candidates are generated for the user according to the previous rounds of chat, helping the user reply quickly when it is inconvenient to type.
  • FIG. 3 is a schematic structural diagram of a dialog generating apparatus according to an embodiment of the present invention.
  • The parts included in the apparatus can be implemented by a processor in a terminal such as a mobile phone, a tablet computer, or a personal computer; the functions implemented by the processor can of course also be implemented by logic circuits.
  • In implementation, the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA).
  • the device in the embodiment of the present invention includes:
  • The hidden layer calculation part 301 is configured to convert each word in the K-th round query sentence into a first word vector, and calculate the forward hidden layer vector and the reverse hidden layer vector of each word according to the first word vector, K being a positive integer greater than or equal to 2.
  • In implementation, a multi-round dialogue model can be constructed. As shown in FIG. 2A, each round's query sentence and the corresponding reply sentence can be embedded into a single-round dialogue model, and the multi-round dialogue model can be regarded as an unrolling of the single-round dialogue model; the single-round dialogue model can be divided into an encoding layer, an intention layer, and a decoding layer.
  • At the encoding layer, the K-th round query sentence input by the user may first be obtained and segmented word by word; one-hot encoding is used to represent the word vector of each word in the query, and the word vector of each word is then converted, through the embedding space matrix, into a vector of a predetermined dimension.
  • The dimension of the one-hot encoding is the size of the preset dictionary, each dimension corresponding to one word in the dictionary, with only the corresponding bit being 1 and all others 0. The K-th round query sentence is then scanned from front to back, the word vector of each word is input one by one into the forward gated recurrent unit, and the forward hidden layer vector of each word is recorded; the K-th round query sentence is also scanned from back to front, the word vector of each word is input one by one into the reverse gated recurrent unit, and the reverse hidden layer vector obtained after inputting each word is recorded.
  • The forward hidden layer vector of a target word may be calculated according to the first word vector of the target word in the K-th round query sentence and the forward hidden layer vector of the previous word of the target word; the reverse hidden layer vector of the target word may be calculated according to the first word vector of the target word in the K-th round query sentence and the reverse hidden layer vector of the next word of the target word.
  • For example, if the K-th round query sentence is "你看过电影吗?" ("Have you seen a movie?"), the forward hidden layer vector of the first word "你" is determined from its word vector; the forward hidden layer vector of the second word "看" is determined from the word vector of "看" and the forward hidden layer vector of the first word "你"; the forward hidden layer vector of the third word "过" is determined from the word vector of "过" and the forward hidden layer vector of the second word "看"; and by analogy, the forward hidden layer vectors of the fourth word "电", the fifth word "影", and the sixth word "吗" are calculated in turn.
  • The topic determination part 302 is configured to acquire the content topic of the K-th round query sentence and convert the content topic into a second word vector.
  • In implementation, the BTM algorithm can be used to train on each of a plurality of words to determine the probability distribution of each word as a content topic; the K-th round query sentence is then matched against the plurality of words to determine the content topic with the highest probability for the K-th round query sentence. The content topic with the highest probability can be represented by one-hot encoding, and the embedding space matrix of the content topic is constructed, thereby obtaining the word vector $E^{(k)}$ of the content topic.
  • The vector calculation part 303 is configured to determine the initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence.
  • In implementation, the forward hidden layer vector of the last word of the query sentence output by the encoding layer in the K-th round, the word vector $E^{(k)}$ of the content topic, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence are input into the simple-RNN neural network, and the initial hidden layer vector output for the K-th round query sentence is calculated. It can be expressed as $h^{(in,k)}=\sigma\big(W^{(in,in)}h^{(in,k-1)}+W^{(in,de)}h_{T}^{(de,k-1)}+W^{(in,en)}\overrightarrow{h}_{T_k}^{(k)}+W^{(in,e)}E^{(k)}\big)$, where $W^{(in,in)}$, $W^{(in,de)}$, $W^{(in,en)}$, and $W^{(in,e)}$ are parameters of the simple-RNN neural network, and σ is used to compress the initial hidden layer vector $h^{(in,k)}$ into the interval [0,1], thereby increasing the nonlinear representation capability of the model.
  • The reply output part 304 is configured to generate a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.
  • It should be noted that using the hidden states of the bidirectional structure as the input of the attention layer can describe the key information in the context more precisely and effectively mitigate the problem, in a unidirectional structure, of key information being biased toward the end: the hidden layer states of the bidirectional structure give each word a degree of global information, avoiding the problem that later words in a unidirectional structure carry more information, and making the generated replies more relevant.
  • In other embodiments of the present invention, the weight of each word in the K-th round query sentence for generating the second reply word may be calculated according to the second hidden layer vector and the hidden layer vector of each word in the K-th round query sentence; according to these weights, a weighted sum of the hidden layer vectors of the words in the K-th round query sentence is calculated, and the weighted sum is used as the contribution of each word in the K-th round query sentence to generating the second reply word.
  • In other embodiments of the present invention, the probability distribution over each word in the preset dictionary may be calculated according to the third hidden layer vector; the word with the highest probability in the preset dictionary is selected as the second reply word for output, and the third, fourth, fifth reply words and so on are then output in turn; at each step, 50 words can be selected to generate reply sentences word by word, and the 5 sentences with the highest probability are selected.
  • For example, the importance $g_{jt}$ of each word in the query sentence for generating a given reply word can be calculated from the hidden layer vector of the previous word in the reply sentence and the hidden layer vector of each word in the query sentence, where $W^{(de,de)}$ and $W^{(de,en)}$ are parameters of the neural network; $g_{jt}$ is then normalized to obtain the weight of the hidden layer vector of each word in the K-th round query sentence, and finally the weighted sum of the hidden layer vectors of the words in the K-th round query sentence is calculated, from which, together with the word vector and the hidden layer vector of the previous reply word, the hidden layer vector of the reply word is generated word by word.
  • At the intention layer, the content topic of the query sentence "你看过电影吗" ("Have you seen a movie?") is first calculated to be "电影" ("movie"), and the content topic "电影" is encoded to obtain a topic vector; then the output vector of the previous round's intention layer, the output vector of the previous round's decoding layer, the output vector of the current round's encoding layer, and the topic vector are input together into the intention layer, and the initial hidden layer vector is output through the neural network operation. This initial hidden layer vector can be used by the decoding layer to determine the first word of the answer sentence.
  • The decoding layer can be regarded as the inverse process of the encoding layer: word vectors and hidden layer vectors can be decoded into natural language. According to the initial hidden layer vector output by the intention layer and the word vector of each word of the query sentence in the attention layer, the answer "我喜欢欧美电影" ("I like European and American movies") is generated. Assuming a dictionary of 10,000 words, each decoding step generates a probability distribution over these 10,000 words, and the word with the highest probability is then selected for output each time.
  • The process is as follows: first, the initial hidden layer vector is output, and the initial hidden layer vector together with the word vector whose first character is the identifier character "_EOS_" is input to the decoding layer; the hidden layer vector is updated through the neural network to obtain the second hidden layer vector. The second hidden layer vector generates a probability distribution over the 10,000 words through the softmax regression algorithm, in which the word "我" ("I") has the highest probability, so the reply word "我" is output; then the second hidden layer vector and the word vector of the reply word "我" are used as input to generate the third hidden layer vector, the probability distribution of the next word is calculated from the third hidden layer vector, and the word "喜" with the highest probability is taken as output. The above process is repeated until the special symbol _EOS_ is output, ending the whole process, at which point the reply sentence "我喜欢欧美电影_EOS_" can be generated.
  • In the embodiment of the present invention, each word in the K-th round query sentence is first converted into a first word vector, and the forward hidden layer vector and the reverse hidden layer vector of each word are calculated according to the first word vector; then the content topic of the K-th round query sentence is acquired and converted into a second word vector; next, the initial hidden layer vector output for the K-th round query sentence is determined according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; finally, a reply sentence for the K-th round query sentence is generated according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence. By adding the topic content during dialogue generation, the generation of cross-topic generic high-frequency reply sentences is effectively suppressed, and the accuracy of the generated dialogue is improved.
  • FIG. 4 is a schematic structural diagram of a dialog generating device according to an embodiment of the present invention.
  • the device can include at least one processor 401, such as a CPU, at least one interface circuit 402, at least one memory 403, and at least one bus 404.
  • the communication bus 404 is configured to implement connection communication between these components.
  • the interface circuit 402 in the embodiment of the present invention may be a wired transmission port, or may be a wireless device, for example, including an antenna device, configured to perform signaling or data communication with other node devices.
  • the memory 403 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory.
  • the memory 403 can also optionally be at least one storage device located remotely from the aforementioned processor 401.
  • a set of program codes is stored in the memory 403, and the processor 401 is configured to call the program code stored in the memory, configured to perform the following steps:
  • determining an initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence;
  • the processor 401 is configured to perform the following operations:
  • a word having the highest probability in the preset dictionary is selected as the second reply word for output.
  • If the above-described dialog generation method is implemented in the form of a software functional part and sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium, so that a computer device (which may be a personal computer, a server, or a network device, etc.) can execute it.
  • The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.
  • Thus, the embodiments of the invention are not limited to any specific combination of hardware and software.
  • the embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute a dialog generation method in the embodiment of the present invention.
  • The program may be stored in a computer-readable storage medium, and the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • A reply sentence for the K-th round query sentence is generated according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence; by adding the topic content, the generation of cross-topic generic high-frequency reply sentences is effectively suppressed, which improves the accuracy of the generated dialogue.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention disclose a dialog generation method, apparatus, device, and storage medium, including: converting each word in a K-th round query sentence into a first word vector, and calculating a forward hidden layer vector and a reverse hidden layer vector of each word according to the first word vector; acquiring a content topic of the K-th round query sentence, and converting the content topic into a second word vector; determining an initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in a (K-1)-th round reply sentence output for a (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; and generating a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.

Description

Dialog generation method and apparatus, device, and storage medium
Cross-reference to related applications
This application is filed on the basis of, and claims priority to, Chinese Patent Application No. 2016105675040, filed on July 19, 2016, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of speech processing, and in particular to a dialog generation method and apparatus, device, and storage medium.
Background
In recent years, the way humans and machines interact has been changing rapidly, and dialogue robots, as a new interaction mode, are attracting more and more attention. However, how to improve the relevance of automatically generated reply sentences in multi-round dialogue, how to reduce the generation probability of high-frequency answers, and how to generate high-quality dialogue have always been focal points of research in natural language processing, in which the dialogue system is an important application direction.
In existing technical solutions, a dialogue system may be a rule-based dialogue system, a search-based dialogue system, or a generative dialogue system. The rule-based dialogue system has a simple structure and high accuracy, but poor generalization ability; the search-based dialogue system requires a corpus of relatively high quality and quantity, otherwise problems such as low recall easily arise; the generative dialogue system can build a relatively good language model and generate a corresponding answer sentence for any input sentence. The modeling of generative dialogue systems can be divided into single-round modeling and multi-round modeling. The single-round generative dialogue model only models question-answer pairs and, when handling multi-round dialogue, directly concatenates the context into one long question; but when the number of dialogue rounds is large and the context is long, information compression easily becomes disordered, resulting in problems such as low-quality answer sentences. The multi-round generative dialogue model models the multi-round question-answer transfer process, but it easily generates high-frequency answers and its accuracy is low.
Summary
Embodiments of the present invention provide a dialog generation method and apparatus, device, and storage medium, which can solve the technical problem of low accuracy of generated dialogue.
An embodiment of the present invention provides a dialog generation method, including:
converting each word in a K-th round query sentence into a first word vector, and calculating a forward hidden layer vector and a reverse hidden layer vector of each word according to the first word vector, K being a positive integer greater than or equal to 2;
acquiring a content topic of the K-th round query sentence, and converting the content topic into a second word vector;
determining an initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in a (K-1)-th round reply sentence output for a (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; and
generating a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.
An embodiment of the present invention provides a dialog generation apparatus, including:
a hidden layer calculation part, configured to convert each word in a K-th round query sentence into a first word vector, and calculate a forward hidden layer vector and a reverse hidden layer vector of each word according to the first word vector, K being a positive integer greater than or equal to 2;
a topic determination part, configured to acquire a content topic of the K-th round query sentence and convert the content topic into a second word vector;
a vector calculation part, configured to determine an initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in a (K-1)-th round reply sentence output for a (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; and
a reply output part, configured to generate a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.
An embodiment of the present invention provides a dialog generation device, the device including an interface circuit, a memory, and a processor, wherein the memory stores a set of program codes, and the processor is configured to call the program codes stored in the memory and to perform the following steps:
converting each word in a K-th round query sentence into a first word vector, and calculating a forward hidden layer vector and a reverse hidden layer vector of each word according to the first word vector, K being a positive integer greater than or equal to 2;
acquiring a content topic of the K-th round query sentence, and converting the content topic into a second word vector;
determining an initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in a (K-1)-th round reply sentence output for a (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; and
generating a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.
An embodiment of the present invention provides a computer storage medium, the computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the dialog generation method provided by the embodiments of the present invention.
By implementing the embodiments of the present invention, each word in the K-th round query sentence is first converted into a first word vector, and the forward hidden layer vector and the reverse hidden layer vector of each word are calculated according to the first word vector; then the content topic of the K-th round query sentence is acquired and converted into a second word vector; next, the initial hidden layer vector output for the K-th round query sentence is determined according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; finally, a reply sentence for the K-th round query sentence is generated according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence. By adding the topic content during dialogue generation, the generation of cross-topic generic high-frequency reply sentences is effectively suppressed, and the accuracy of the generated dialogue is improved.
Brief description of the drawings
In order to describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can still obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a dialog generation method according to an embodiment of the present invention;
FIG. 2A is a schematic architecture diagram of a dialog generation system according to an embodiment of the present invention;
FIG. 2B is a schematic flowchart of a dialog generation method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a dialog generation apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of another dialog generation apparatus according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
For a better understanding of the embodiments of the present invention, the meanings of some technical terms are given below:
RNN (Recurrent Neural Network): a recurrent neural network, which can be used to model sequential behavior.
LSTM (Long Short-Term Memory): a recurrent neural network over time, which can be understood as a cell structure of a recurrent neural network containing an input gate, an output gate, and a forget gate; it is suited to processing and predicting important events separated by very long intervals and delays in a time series.
GRU (Gated Recurrent Unit): a gated recurrent unit; as a variant of the RNN, the GRU merges the forget gate and the input gate into a single update gate, and also merges the cell state and the hidden state, i.e., the separate cell state is removed and information is stored directly in the output, so this structure is simpler than the LSTM. The GRU is similar to the LSTM, suited to long-term dependencies, and has a simpler cell structure.
One-hot: a vector whose dimension is the dictionary size; each dimension corresponds to one word in the dictionary, only the corresponding bit is 1, and all others are 0.
Word vector: a fixed-length, low-dimensional (usually 200- to 300-dimensional) vector used to represent a word; highly correlated words have word vectors that are close to each other.
Softmax: the generalization of the logistic regression model to multi-class classification problems.
BTM (Biterm Topic Model): a Biterm topic model whose main idea is to count the co-occurring word pairs (i.e., word co-occurrence patterns) formed by any two words in the corpus, and to model these co-occurring word pairs as the modeling unit, so as to solve the problem of sparse corpus features.
Please refer to FIG. 1, which is a schematic flowchart of a dialog generation method according to an embodiment of the present invention. As shown in the figure, the method in the embodiment of the present invention includes:
S101: converting each word in the K-th round query sentence into a first word vector, and calculating the forward hidden layer vector and the reverse hidden layer vector of each word according to the first word vector, K being a positive integer greater than or equal to 2.
In implementation, a multi-round dialogue model can be constructed. As shown in FIG. 2A, each round's query sentence and the corresponding reply sentence can be embedded into a single-round dialogue model, and the multi-round dialogue model can be regarded as an unrolling of the single-round dialogue model; the single-round dialogue model can be divided into an encoder layer, an intention layer, and a decoder layer.
At the encoder layer, the K-th round query sentence input by the user may first be obtained and segmented word by word; one-hot encoding is used to represent the word vector of each word in the query, and the word vector of each word is then converted, through an Embedding Space Matrix (ESM), into a vector $x_j^{(k)}$ of a predetermined dimension. The dimension of the one-hot encoding is the size of the preset dictionary, each dimension corresponding to one word in the dictionary, with only the corresponding bit being 1 and all others 0. The K-th round query sentence is then scanned from front to back, the word vector of each word is input one by one into the forward gated recurrent unit, and the forward hidden layer vector $\overrightarrow{h}_j^{(k)}$ obtained after inputting each word is recorded; the K-th round query sentence is also scanned from back to front, the word vector of each word is input one by one into the reverse gated recurrent unit, and the reverse hidden layer vector $\overleftarrow{h}_j^{(k)}$ obtained after inputting each word is recorded.
The forward hidden layer vector of a target word may be calculated according to the first word vector of the target word in the K-th round query sentence and the forward hidden layer vector of the previous word of the target word, and can be expressed as $\overrightarrow{h}_j^{(k)}=\mathrm{GRU}(\overrightarrow{h}_{j-1}^{(k)},x_j^{(k)})$; the reverse hidden layer vector of the target word may be calculated according to the first word vector of the target word in the K-th round query sentence and the reverse hidden layer vector of the next word of the target word, and can be expressed as $\overleftarrow{h}_j^{(k)}=\mathrm{GRU}(\overleftarrow{h}_{j+1}^{(k)},x_j^{(k)})$.
For example, if the K-th round query sentence is "你看过电影吗?" ("Have you seen a movie?"), forward encoding may first be performed on "你看过电影吗": each word in the query is converted into a word vector, denoted $x_1^{(k)},\ldots,x_6^{(k)}$. The forward hidden layer vector $\overrightarrow{h}_1^{(k)}$ of the first word "你" is determined from its word vector $x_1^{(k)}$; the forward hidden layer vector $\overrightarrow{h}_2^{(k)}$ of the second word "看" is determined from the word vector $x_2^{(k)}$ of "看" and the forward hidden layer vector $\overrightarrow{h}_1^{(k)}$ of the first word "你"; the forward hidden layer vector $\overrightarrow{h}_3^{(k)}$ of the third word "过" is determined from the word vector $x_3^{(k)}$ of "过" and the forward hidden layer vector $\overrightarrow{h}_2^{(k)}$ of the second word "看"; and by analogy, the forward hidden layer vectors $\overrightarrow{h}_4^{(k)}$, $\overrightarrow{h}_5^{(k)}$, and $\overrightarrow{h}_6^{(k)}$ of the fourth word "电", the fifth word "影", and the sixth word "吗" are calculated in turn.
In addition, reverse encoding may first be performed on "你看过电影吗": each word in the query is converted into a word vector. The reverse hidden layer vector $\overleftarrow{h}_6^{(k)}$ of the sixth word "吗" is determined from its word vector $x_6^{(k)}$; the reverse hidden layer vector $\overleftarrow{h}_5^{(k)}$ of the fifth word "影" is determined from the word vector $x_5^{(k)}$ of "影" and the reverse hidden layer vector $\overleftarrow{h}_6^{(k)}$ of the sixth word "吗"; the reverse hidden layer vector $\overleftarrow{h}_4^{(k)}$ of the fourth word "电" is determined from the word vector $x_4^{(k)}$ of "电" and the reverse hidden layer vector $\overleftarrow{h}_5^{(k)}$ of the fifth word "影"; and by analogy, the reverse hidden layer vector of the third word "过", the reverse hidden layer vector of the second word "看", and the reverse hidden layer vector of the first word "你" are calculated in turn.
S102: acquiring the content topic of the K-th round query sentence, and converting the content topic into a second word vector.
In implementation, the BTM algorithm can be used to train on each of a plurality of words to determine the probability distribution of each word as a content topic; the K-th round query sentence is then matched against the plurality of words to determine the content topic with the highest probability for the K-th round query sentence. The content topic with the highest probability can be represented by one-hot encoding, and an embedding space matrix for the content topic is constructed, thereby obtaining the word vector $E^{(k)}$ of the content topic.
S103: determining the initial hidden layer vector output for the K-th round query sentence according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence.
In implementation, as shown in FIG. 2A, at the intention layer, the forward hidden layer vector of the last word of the query sentence output by the encoder layer 21 in the K-th round, the word vector $E^{(k)}$ of the content topic, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence are input into a simple recurrent neural network (simple-RNN), and the initial hidden layer vector output for the K-th round query sentence is calculated. The initial hidden layer vector can be expressed as:
$h^{(in,k)}=\sigma\big(W^{(in,in)}h^{(in,k-1)}+W^{(in,de)}h_{T}^{(de,k-1)}+W^{(in,en)}\overrightarrow{h}_{T_k}^{(k)}+W^{(in,e)}E^{(k)}\big)$
where $W^{(in,in)}$, $W^{(in,de)}$, $W^{(in,en)}$, and $W^{(in,e)}$ are parameters of the simple-RNN neural network, and σ is used to compress the initial hidden layer vector $h^{(in,k)}$ into the interval [0,1], thereby increasing the nonlinear representation capability of the model.
It should be noted that, in the process of calculating the initial hidden layer vector, since the content topic of the K-th round query sentence is added into the intention layer calculation, which is equivalent to adding supervisory information to the computation, the generated reply sentence can be restricted to the scope of that content topic, thereby reducing the generation probability of some generic high-frequency reply sentences.
S104: generating a reply sentence for the K-th round query sentence according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence.
In implementation, the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence are first concatenated to obtain the hidden layer vector of each word in the K-th round query sentence, i.e., $h_j^{(k)}=[\overrightarrow{h}_j^{(k)};\overleftarrow{h}_j^{(k)}]$. Then, according to the initial hidden layer vector output for the K-th round query sentence and the word vector of a preset identifier character, a second hidden layer vector output for the K-th round query sentence is determined, and the first reply word output for the K-th round query sentence is determined according to the second hidden layer vector. According to the second hidden layer vector and the hidden layer vector of each word in the K-th round query sentence, the contribution of each word in the K-th round query sentence to generating the second reply word is calculated. According to the contribution of each word in the K-th round query sentence to generating the second reply word, the second hidden layer vector, and the word vector of the first reply word, the third hidden layer vector is calculated. According to the third hidden layer vector, the second reply word for the K-th round query sentence is generated, and by analogy the reply sentence for the K-th round query sentence is generated.
It should be noted that using the hidden states of the bidirectional structure as the input of the attention layer can describe the key information in the context more accurately and effectively mitigate the problem, in a unidirectional structure, of key information being biased toward the end: since the hidden layer states of the bidirectional structure give each word a degree of global information, the problem that the later a word appears in a unidirectional structure, the more information it carries is avoided, making the generated reply sentence more relevant.
In other embodiments of the present invention, the weight of each word in the K-th round query sentence for generating the second reply word may be calculated according to the second hidden layer vector and the hidden layer vector of each word in the K-th round query sentence; according to the weight of each word in the K-th round query sentence for generating the second reply word, a weighted sum of the hidden layer vectors of the words in the K-th round query sentence is calculated, and the weighted sum is used as the contribution of each word in the K-th round query sentence to generating the second reply word.
In other embodiments of the present invention, the probability distribution over each word in the preset dictionary may be calculated according to the third hidden layer vector; the word with the highest probability in the preset dictionary is selected as the second reply word for output, and the third, fourth, fifth reply words and so on are then output in turn; at each step, 50 words can be selected to generate reply sentences word by word, and the 5 sentences with the highest probability are selected.
For example, the importance $g_{jt}$ of each word in the query sentence for generating a given reply word can be calculated from the hidden layer vector $h_{t-1}^{(de,k)}$ of the previous word in the reply sentence and the hidden layer vector $h_j^{(k)}$ of each word in the query sentence, where $W^{(de,de)}$ and $W^{(de,en)}$ are parameters of the neural network; the importance $g_{jt}$ is then normalized to obtain the weight $\alpha_{jt}$ of the hidden layer vector of each word in the K-th round query sentence, and finally the weighted sum $c_t=\sum_j\alpha_{jt}h_j^{(k)}$ of the hidden layer vectors of the words in the K-th round query sentence is calculated, from which, together with the word vector of the previous reply word and the hidden layer vector of the previous reply word, the hidden layer vector of the reply word is generated word by word.
With respect to the dialog generation method proposed in the above embodiment of the invention, and referring to FIG. 2B, the implementation steps of the method are illustrated below with a detailed example:
At the encoder layer 21, the query sentence "你看过电影吗" 25 is first segmented word by word into "你", "看", "过", "电", "影", "吗". Forward encoding is performed from "你" to "吗", forming the forward hidden layer vectors of the six words, i.e., the left-to-right vectors in the attention layer 24; reverse encoding is then performed from "吗" to "你", forming the reverse hidden layer vectors of the six words, i.e., the right-to-left vectors in the attention layer. Finally, the forward hidden layer vector and the reverse hidden layer vector are concatenated in series to form the hidden layer vector of each word; for example, the hidden layer vector of "你" in the query sentence is $h_1^{(k)}=[\overrightarrow{h}_1^{(k)};\overleftarrow{h}_1^{(k)}]$, where $\overrightarrow{h}_1^{(k)}$ is the forward hidden layer vector of "你" and $\overleftarrow{h}_1^{(k)}$ is the reverse hidden layer vector of "你". In addition, the forward hidden layer vector of the last word "吗" of the query sentence is input to the intention layer.
At the intention layer 22, the content topic of the query sentence "你看过电影吗" 25 is first calculated to be "电影" ("movie"), and the content topic "电影" is encoded to obtain a topic vector; then the output vector of the previous round's intention layer, the output vector of the previous round's decoder layer, the output vector of the current round's encoder layer 21, and the topic vector are input together into the intention layer, and the initial hidden layer vector is output through the neural network operation. This initial hidden layer vector can be used by the decoder layer to determine the first word of the answer sentence.
The decoder layer 23 can be regarded as the inverse process of the encoder layer 21: word vectors and hidden layer vectors can be decoded into natural language. According to the initial hidden layer vector output by the intention layer and the word vector of each word of the query sentence in the attention layer, the answer sentence "我喜欢欧美电影" ("I like European and American movies") is generated. Assuming a dictionary of 10,000 words, the decoder layer 23 generates a probability distribution over these 10,000 words at each decoding step, and the word with the highest probability is then selected for output each time. The process is as follows:
First, the intention layer 22 outputs the initial hidden layer vector, and this initial hidden layer vector together with the word vector whose first character is the identifier character "_EOS_" is input to the decoder layer 23; the hidden layer vector is updated through the neural network to obtain the second hidden layer vector. The second hidden layer vector generates a probability distribution over the 10,000 words through the softmax regression algorithm, in which the word "我" ("I") has the highest probability, so the reply word "我" is output; then the second hidden layer vector and the word vector of the reply word "我" are used as input to generate the third hidden layer vector, the probability distribution of the next word is calculated from the third hidden layer vector, and the word "喜" with the highest probability is taken as output. The above process is repeated until the special symbol _EOS_ is output, ending the whole process, at which point the reply sentence "我喜欢欧美电影_EOS_" 26 can be generated.
In the embodiment of the present invention, each word in the K-th round query sentence is first converted into a first word vector, and the forward hidden layer vector and the reverse hidden layer vector of each word are calculated according to the first word vector; then the content topic of the K-th round query sentence is acquired and converted into a second word vector; next, the initial hidden layer vector output for the K-th round query sentence is determined according to the second word vector, the forward hidden layer vector of the last word in the K-th round query sentence, the hidden layer vector of the last word in the (K-1)-th round reply sentence output for the (K-1)-th round query sentence, and the initial hidden layer vector of the (K-1)-th round reply sentence output for the (K-1)-th round query sentence; finally, a reply sentence for the K-th round query sentence is generated according to the forward hidden layer vector and the reverse hidden layer vector of each word in the K-th round query sentence and the initial hidden layer vector output for the K-th round query sentence. By adding the topic content during dialogue generation, the generation of cross-topic generic high-frequency reply sentences is effectively suppressed, and the accuracy of the generated dialogue is improved.
In recent years, the way humans and machines interact has been changing rapidly, and dialogue robots, as a new interaction mode, are attracting more and more attention. The present invention relates to the fields of computer technology and machine learning, and uses deep learning technology to enable a robot to understand the semantics of human natural language through multi-round dialogue and to generate corresponding reply sentences. However, how to improve the relevance of automatically generated reply sentences in multi-round dialogue and how to reduce the generation probability of high-frequency answers, so as to generate high-quality dialogue, have long troubled researchers. The technical solution provided by this embodiment can not only avoid the low generalization ability of rule-based dialogue systems and the low recall ability of search-based dialogue systems, but also effectively reduce the problem that mainstream statistical-learning-based generative dialogue systems generate high-frequency answer sentences with high probability, thereby taking the practicality of generative dialogue algorithms a step further.
As shown in FIG. 2A, based on the multi-round dialogue model, GRU units are used at the decoder layer 23 to encode single sentences so as to prevent gradient vanishing; dialogue topic information based on the BTM algorithm is creatively added at the intention layer 22 as supervisory information for dialogue generation, which can reduce the generation probability of high-frequency answers to a certain extent; at the same time, at the decoder layer 23, a bidirectional attention mechanism (attention layer 24) is used to capture the key information in the context, so that the generated dialogue has better relevance.
This embodiment is a dialog generation method based on a multi-round generative dialogue model. The method includes two processes, training and prediction, in which the input of the multi-round generative dialogue model is the question-answer pairs of the previous four rounds of dialogue together with the question of the current round, and the output of the multi-round generative dialogue model is the answer of the current round generated by the algorithm on the basis of the preceding information.
During training, the real five rounds of question-answer pairs are available, so the real answer of the last round is selected as the supervisory information of the training algorithm, the loss function is computed for the generated answer, and the neural network is trained until convergence. Each round's question and answer is embedded into a single-round generative dialogue model, so the multi-round generative dialogue can be regarded as the temporal unrolling of the single-round generative dialogue. In the single-round generative model, the processing flow is divided into three parts: the processing flows of the encoder layer, the intention layer, and the decoder layer.
1) Encoder layer. The role of this layer is to map the input natural language into a vector of a fixed dimension. Its input is therefore a sentence expressed in natural language, and its output is a vector of fixed length. Specifically:
I> segment the sentence at the word level, and then convert the one-hot representation of each word into a 200-dimensional word vector $x_j^{(k)}$ through the embedding space matrix;
II> scan the sentence from front to back, inputting the word vector of each word one by one into the forward GRU network and recording the hidden state $\overrightarrow{h}_j^{(k)}$ after inputting each word;
III> scan the sentence in reverse, inputting the word vector of each word one by one into the reverse GRU network and recording the hidden state $\overleftarrow{h}_j^{(k)}$ after inputting each word;
IV> take the final state $\overrightarrow{h}_{T_k}^{(k)}$ of II> as the fixed-length vector expression of the whole sentence, i.e., the sentence embedding, which serves as an input to the intention layer;
V> concatenate in series the forward and reverse hidden layer vectors obtained in II> and III>, i.e., $h_j^{(k)}=[\overrightarrow{h}_j^{(k)};\overleftarrow{h}_j^{(k)}]$, as the expression of that word within the sentence, which serves as an input to the decoder layer. Compared with a unidirectional structure, using the hidden states of the bidirectional structure as the input of the intention layer can describe the key information in the context more accurately and effectively mitigate the problem of key information being biased toward the end in a unidirectional structure, because the hidden states of the bidirectional structure give every word a degree of global information, avoiding the problem that the later a word appears in a unidirectional structure, the more information it carries, and making the generated answer sentence more relevant.
2) Intention layer. The role of this layer is to encode the topic transfer process of the multi-round dialogue. The inputs of the intention layer are the sentence embedding $\overrightarrow{h}_{T_k}^{(k)}$ from 1), the final hidden state $h_{T}^{(de,k-1)}$ of the previous round's question-answer decoder layer, the output $h^{(in,k-1)}$ of the previous round's question-answer intention layer, and the topic $E^{(k)}$ of the current round's question; the output is the vector $h^{(in,k)}$ obtained by jointly encoding the current topic and the context information. Specifically:
I> calculate the topic of the current round's question: using the BTM algorithm, offline training is performed first to obtain the topic distribution of each word, and the topic index with the highest probability for the current question is then computed online; this index can be regarded as the one-hot representation of the topic, and a topic embedding matrix is then constructed to obtain the word vector $E^{(k)}$ of that topic;
II> compute the transfer of the topic through the simple-RNN network:
$h^{(in,k)}=\sigma\big(W^{(in,in)}h^{(in,k-1)}+W^{(in,de)}h_{T}^{(de,k-1)}+W^{(in,en)}\overrightarrow{h}_{T_k}^{(k)}+W^{(in,e)}E^{(k)}\big)$
where $W^{(in,in)}$, $W^{(in,de)}$, $W^{(in,en)}$, and $W^{(in,e)}$ are parameters of the simple-RNN neural network, and σ compresses the initial hidden layer vector $h^{(in,k)}$ into the interval [0,1], thereby increasing the nonlinear representation capability of the model; the result serves as an input to the decoder layer. In this process, this embodiment explicitly brings the topic of the current query into the calculation, which is equivalent to adding supervisory information to the computation, so that the generation of the next answer sentence is constrained to that topic, thereby reducing the generation probability of some generic high-frequency answer sentences.
3)解码层，该层的作用是通过分析编码层和意图层的输出向量，输出下一个字在字典中的概率分布。其输入是意图层的输出h(in,k)以及编码层输出的每个字的隐层向量 $h^{(en)}_{k,j}$，输出是下一个字在字典中的概率分布。具体而言：
I>计算attention：首先通过答句上一个字的隐层与问句每个字的隐层，计算问句中该字的重要度 $g_{jt}=\tanh\!\left(W^{(de,de)}s_{t-1}+W^{(de,en)}h^{(en)}_{k,j}\right)$，其中，$s_{t-1}$ 为该字的上一个字的隐层向量，$h^{(en)}_{k,j}$ 为询问句中的每个字的隐层向量，W(de,de)、W(de,en)分别为神经网络中的参数；进而使用softmax概率进行归一化，得到注意力层的权重 $\alpha_{jt}=\frac{\exp(g_{jt})}{\sum_{j'}\exp(g_{j't})}$，即计算问句中哪些成分对生成该字贡献最大；然后计算问句中每个字的隐层向量的加权和，即 $c_t=\sum_{j}\alpha_{jt}\,h^{(en)}_{k,j}$。
II>使用GRU单元，根据 $s_t=\mathrm{GRU}(y_{t-1},\,s_{t-1},\,c_t)$ 逐字生成下一个隐层状态，然后每个隐层状态接入全连接层，并通过softmax计算下一个字在字典中的概率分布。在训练时，通过计算标准答句中的对应字在预测答句概率分布下的负log似然来计算损失，将标准答句的总体损失和作为本轮的损失，使用循环神经网络的随时间反向传播算法（BPTT，Back Propagation Through Time）进行误差回传。在预测时，使用机器学习中的集束搜索（Beam Search）算法每次选取概率最大的前50个字，逐字生成答句，并输出概率最高的前5句话。
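下面用一段示意性的Python（numpy）代码说明训练损失与集束搜索的计算方式：训练时对标准答句逐字累加负log似然作为本轮损失；预测时每步只保留得分最高的若干候选（对应上文的前50个字），最终输出得分最高的前几句。示例中各步的概率分布用随机数代替，并未体现“概率依赖于已生成前缀”这一点，仅作说明。

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(5)
V, steps = 1000, 6                                  # 示例字典大小与解码步数

# 1) 训练：对标准答句逐字计算负log似然并求和，作为本轮损失（误差再经BPTT回传）
probs = softmax(rng.normal(size=(steps, V)))        # 每步在字典上的概率分布（示例随机数）
gold = rng.integers(0, V, size=steps)               # 标准答句中各字的字典下标（示例）
loss = -np.log(probs[np.arange(steps), gold]).sum()

# 2) 预测：集束搜索，每步扩展候选并只保留得分最高的beam个部分答句
def beam_search(step_probs, beam=50, top=5):
    hyps = [((), 0.0)]                              # (已生成的字序列, 累计log概率)
    for p in step_probs:
        logp = np.log(p)
        cand = [(seq + (w,), score + logp[w])
                for seq, score in hyps
                for w in np.argsort(-logp)[:beam]]
        hyps = sorted(cand, key=lambda x: -x[1])[:beam]
    return hyps[:top]

best = beam_search(probs)
print(round(loss, 3), [round(s, 3) for _, s in best])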
本实施例提供的技术方案脱胎于翻译模型，众所周知，翻译模型构建的是一种语言到另一种语言的等价空间变换，故语义空间相对固定。而在对话模型中，要完成的是多语义空间的映射，因为同一句问句，不同的人会有不同的回答。然而，在海量数据面前，一些如“哦，好的”之类的通用却中庸的回复方式，在语料中成为主流，使得训练出来的机器人倾向于使用这些高频回答。本发明提供的技术方案通过使用语义段的主题信息，减小生成语句的语义空间，从而在一定程度上抑制了高频无意义答句的生成。同时通过双向注意力模型，更精确地抽取出重点语义信息，更好地保证了生成语句的相关性。
在实现的过程中,本实施例提供的技术方案可以使用深度学习框架使用MXNET 0.5.0,在Tesla K40上进行训练和预测。本实施例提供的技术方案可应用于聊天机器人、邮件自动回复、社交软件中自动生成候选答句等业务场景中,能够根据前几轮的对话实时自动生成最合适的几种答句,其生成过程由算法完全控制,无需用户干涉。例如在聊天机器人中,直接根据用户与输入进行自动回复,从而达到情感陪伴的作用;再如在自动生成候选答句业务中,根据前几轮的聊天情况为用户生成若干候选,当用户不方便输入时,可帮助用户进行快速回复。
请参考图3，图3是本发明实施例提供的一种对话生成装置的结构示意图。该装置所包括的各部分都可以通过对话生成设备例如手机、平板电脑、个人电脑等终端中的处理器来实现；其中处理器所实现的功能当然还可以通过逻辑电路来实现，在实施的过程中，处理器可以为中央处理器(CPU)、微处理器(MPU)、数字信号处理器(DSP)或现场可编程门阵列(FPGA)等。如图3所示，本发明实施例中的装置包括：
隐层计算部分301,配置为将第K轮询问句中的每个字转化为第一词向量,并根据所述第一词向量计算所述每个字的正向隐层向量和反向隐层向量,K为大于等于2的正整数。
在实现中,可以构建多轮对话模型,如图2A所示,可以将每一轮的询问句以及相应的答复句嵌入到单轮对话模型中,多轮对话模型可以看作单轮对话模型的展开,在单轮对话模型中,可以分为编码层、意图层以及解码层。
在编码层，可以首先获取用户输入的第K轮询问句，并将第K轮询问句以字进行分词，使用独热编码表示询问句中的每个字，然后通过嵌入空间矩阵将每个字的独热编码转化成一个预定维数的词向量 $x_{k,j}$。其中，独热编码的维数为预设字典的大小，每一维对应字典中的一个字，仅在对应位为1，其他都为0。然后从前向后扫描第K轮询问句，逐次将每个字的词向量输入到正向门控循环单元，记录输入每个字后的正向隐层向量 $\overrightarrow{h}^{(en)}_{k,j}$；并且从后向前扫描第K轮询问句，逐次将每个字的词向量输入到反向门控循环单元，记录输入每个字后的反向隐层向量 $\overleftarrow{h}^{(en)}_{k,j}$。其中，可以根据所述第K轮询问句中目标字的第一词向量和所述目标字的上一个字的正向隐层向量，计算所述目标字的正向隐层向量，目标字的正向隐层向量可以表示为 $\overrightarrow{h}^{(en)}_{k,j}=\mathrm{GRU}\!\left(x_{k,j},\,\overrightarrow{h}^{(en)}_{k,j-1}\right)$；根据所述第K轮询问句中目标字的第一词向量和所述目标字的下一个字的反向隐层向量，计算所述目标字的反向隐层向量，目标字的反向隐层向量可以表示为 $\overleftarrow{h}^{(en)}_{k,j}=\mathrm{GRU}\!\left(x_{k,j},\,\overleftarrow{h}^{(en)}_{k,j+1}\right)$。
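下面用一小段示意性的Python（numpy）代码说明“独热编码经嵌入空间矩阵转为预定维数词向量”的等价计算：独热向量与嵌入矩阵相乘，等同于按该字的下标直接取出嵌入矩阵中的对应行。示例中的字典与维度均为假设。

import numpy as np

rng = np.random.default_rng(6)
vocab = ["你", "看", "过", "电", "影", "吗"]       # 示例小字典，实际为预设的大字典
emb_dim = 200
embed = rng.normal(size=(len(vocab), emb_dim))     # 嵌入空间矩阵（示例）

idx = vocab.index("电")
one_hot = np.zeros(len(vocab))
one_hot[idx] = 1.0                                 # 独热编码：仅在对应位为1，其他都为0

x1 = one_hot @ embed                               # 独热向量乘以嵌入空间矩阵
x2 = embed[idx]                                    # 等价于直接取出嵌入矩阵的第idx行
print(np.allclose(x1, x2), x1.shape)               # True (200,)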
例如，第K轮询问句为“你看过电影吗？”，首先可以对“你看过电影吗”进行正向编码，将询问句中的每个字转化为一个词向量，分别记为 $x_1,x_2,\dots,x_6$；然后根据第一个字“你”的词向量 $x_1$ 确定第一个字“你”的正向隐层向量 $\overrightarrow{h}_1$；根据第二个字“看”的词向量 $x_2$ 和第一个字“你”的正向隐层向量 $\overrightarrow{h}_1$，确定第二个字“看”的正向隐层向量 $\overrightarrow{h}_2$；根据第三个字“过”的词向量 $x_3$ 和第二个字“看”的正向隐层向量 $\overrightarrow{h}_2$，确定第三个字“过”的正向隐层向量 $\overrightarrow{h}_3$；依次类推，分别计算得到第四个字“电”的正向隐层向量 $\overrightarrow{h}_4$、第五个字“影”的正向隐层向量 $\overrightarrow{h}_5$、第六个字“吗”的正向隐层向量 $\overrightarrow{h}_6$。
另外，可以对“你看过电影吗”进行反向编码：根据第六个字“吗”的词向量 $x_6$ 确定第六个字“吗”的反向隐层向量 $\overleftarrow{h}_6$；根据第五个字“影”的词向量 $x_5$ 和第六个字“吗”的反向隐层向量 $\overleftarrow{h}_6$，确定第五个字“影”的反向隐层向量 $\overleftarrow{h}_5$；根据第四个字“电”的词向量 $x_4$ 和第五个字“影”的反向隐层向量 $\overleftarrow{h}_5$，确定第四个字“电”的反向隐层向量 $\overleftarrow{h}_4$；依次类推，分别计算得到第三个字“过”的反向隐层向量 $\overleftarrow{h}_3$、第二个字“看”的反向隐层向量 $\overleftarrow{h}_2$、第一个字“你”的反向隐层向量 $\overleftarrow{h}_1$。
主题确定部分302,配置为获取所述第K轮询问句的内容主题,并将所述内容主题转化为第二词向量。
在实现中,可以使用BTM算法对多个词中的每个词进行训练,确定每个词作为内容主题的概率分布,然后将第K轮询问句与所述多个词进行匹配,确定第K轮询问句中概率最大的内容主题,该概率最大的内容主题可以使用独热编码表示,并构建该内容主题的嵌入空间矩阵,从而得到该内容主题的词向量E(k)
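下面用一段示意性的Python（numpy）代码说明“利用离线得到的词的主题分布，在线为当前询问句选出概率最大的内容主题，再经主题embedding矩阵得到主题向量E(k)”的过程。其中的词的主题分布用随机数代替、主题打分取各字得分之和，均为说明用的假设，并非BTM算法本身的实现。

import numpy as np

rng = np.random.default_rng(3)
topics = ["电影", "美食", "旅游"]                   # 示例主题集合
vocab = ["你", "看", "过", "电", "影", "吗"]

# 线下训练产物：每个词在各主题上的分布（示例用随机数代替BTM的训练结果）
word_topic = rng.random(size=(len(vocab), len(topics)))
word_topic /= word_topic.sum(axis=1, keepdims=True)

# 主题embedding矩阵，每行对应一个主题的向量表示（示例）
topic_emb = rng.normal(size=(len(topics), 32))

def query_topic(query_chars):
    idx = [vocab.index(c) for c in query_chars if c in vocab]
    scores = word_topic[idx].sum(axis=0)            # 按字累加各主题的得分（示例打分方式）
    k = int(np.argmax(scores))                      # 概率最大的主题序号，可视为one-hot的位置
    return topics[k], topic_emb[k]                  # 返回主题及其词向量E(k)

topic, E_k = query_topic(list("你看过电影吗"))
print(topic, E_k.shape)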
向量计算部分303,配置为根据所述第二词向量、所述第K轮询问句中最后一个字的正向隐层向量、针对第K-1轮询问句输出的第K-1轮答复句中最后一个字的隐层向量、以及针对所述第K-1轮询问句输出的第K-1轮答复句的初始隐层向量,确定针对所述第K轮询问句输出的初始隐层向量。
在实现中，如图2A所示，在意图层，可以将第K轮中编码层中输出的询问句中最后一个字的正向隐层向量、内容主题的词向量E(k)、针对第K-1轮询问句输出的第K-1轮答复句中最后一个字的隐层向量、以及针对第K-1轮询问句输出的第K-1轮答复句的初始隐层向量输入到simple-RNN神经网络中，计算得到针对第K轮询问句输出的初始隐层向量，该初始隐层向量可以表示为：$h^{(in,k)}=\sigma\!\left(W^{(in,in)}h^{(in,k-1)}+W^{(in,de)}h^{(de,k-1)}+W^{(in,en)}\overrightarrow{h}^{(en)}_{k,N}+W^{(in,e)}E^{(k)}\right)$
其中,W(in,in)、W(in,de)、W(in,en)以及W(in,e)分别为simple-RNN神经网络中的参数,σ用于将h(in,k)初始隐层向量压缩在[0,1]区间,从而增加模型的非线性表征能力。
需要说明的是,在计算初始隐层向量的过程中,由于将第K轮询问句中的内容主题加入到意图层进行计算,相当于在运算过程中加入了监督信息,从而在生成的答复句可以被限制在该内容主题的范围内,进而减少部分通用高频答复句的生成概率。
答复输出部分304,配置为根据所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量、以及所述针对所述第K轮询问句输出的初始隐层向量,生成针对所述第K轮询问句的答复句。
在实现中，首先对所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量进行拼接，得到所述第K轮询问句中每个字的隐层向量，其中，每个字的隐层向量为 $h^{(en)}_{k,j}=\left[\overrightarrow{h}^{(en)}_{k,j};\,\overleftarrow{h}^{(en)}_{k,j}\right]$。
然后根据所述针对所述第K轮询问句输出的初始隐层向量以及预设的标识字符的词向量,确定针对所述第K轮询问句输出的第二隐层向量,进而根据所述第二隐层向量确定所述针对所述第K轮询问句输出的第一个答复字;根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成第二个答复字的贡献度;根据所述第K轮询问句中每个字对生成第二个答复字的贡献度、所述第二隐层向量以及所述第一个答复字的词向量,计算所述第三隐层向量;根据所述第三隐层向量,生成针对所述第K轮询问句的第二个答复字,依次类推生成针对所述第K轮询问句的答复句。
需要说明的是，使用双向结构的隐状态作为注意力层的输入，可以更加精确地描述上下文中的重点信息，有效降低单向结构重点信息靠后的问题，由于双向结构的隐层状态在一定程度上可以增加每个字的全局信息，因此避免了单向结构越靠后的字所携带的信息越多的问题，使得生成的答复句相关性更强。
在本发明的其他实施例中,可以根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成所述第二个答复字的权重;根据所述第K轮询问句中每个字对生成所述第二个答复字的权重,计算所述第K轮询问句中每个字的隐层向量的加权和,并将所述加权和作为所述第K轮询问句中每个字对生成所述第二个答复字的贡献度。
在本发明的其他实施例中,可以根据所述第三隐层向量,计算在预设字典中的每个字的概率分布;选择在所述预设字典中概率最大的字作为所述第二个答复字进行输出,进而依次输出第三个答复字、第四答复字、第五答复字等等,每次可以选择50个字逐字生成答复句,并选择概率最高的前5句话。
例如：可以通过答复句中某字的上一个字的隐层向量和询问句中每个字的隐层向量，计算询问句中每个字对生成该字的重要度 $g_{jt}=\tanh\!\left(W^{(de,de)}s_{t-1}+W^{(de,en)}h^{(en)}_{k,j}\right)$，其中，$s_{t-1}$ 为该字的上一个字的隐层向量，$h^{(en)}_{k,j}$ 为询问句中的每个字的隐层向量，W(de,de)、W(de,en)分别为神经网络中的参数；然后对重要度 $g_{jt}$ 进行归一化处理，计算得到第K轮询问句中每个字的隐层向量的权重 $\alpha_{jt}=\frac{\exp(g_{jt})}{\sum_{j'}\exp(g_{j't})}$；最后计算第K轮询问句中每个字的隐层向量的加权和 $c_t=\sum_{j}\alpha_{jt}\,h^{(en)}_{k,j}$，从而根据 $s_t=\mathrm{GRU}(y_{t-1},\,s_{t-1},\,c_t)$ 逐字生成该答复字的隐层向量，其中，$y_{t-1}$ 为该字的上一个字的词向量，$s_{t-1}$ 为该字的上一个字的隐层向量。
针对上述发明实施例提出的一种对话生成装置，以下通过详细的例子说明该装置的实施步骤：
在编码层，首先将询问句“你看过电影吗”按字分词为“你”、“看”、“过”、“电”、“影”、“吗”，从“你”到“吗”进行正向编码，形成6个字的正向隐层向量 $\overrightarrow{h}_1,\overrightarrow{h}_2,\dots,\overrightarrow{h}_6$，即注意力层中从左向右的向量；然后再从“吗”到“你”进行反向编码，形成6个字的反向隐层向量 $\overleftarrow{h}_6,\overleftarrow{h}_5,\dots,\overleftarrow{h}_1$，即注意力层中从右向左的向量；最后将正向隐层向量和反向隐层向量进行串联拼接，形成某个字的隐层向量，例如，询问句中“你”的隐层向量为 $h_1=[\overrightarrow{h}_1;\overleftarrow{h}_1]$，其中 $\overrightarrow{h}_1$ 为“你”的正向隐层向量，$\overleftarrow{h}_1$ 为“你”的反向隐层向量。并且，将询问句中的最后一个字“吗”的正向隐层向量 $\overrightarrow{h}_6$ 输入到意图层。
在意图层,首先通过计算得到询问句“你看过电影吗”的内容主题为“电影”,并对内容主题“电影”进行编码得到主题向量,然后将上一轮的意图层的输出向量,上一轮的解码层的输出向量,本轮编码层的输出向量以及主题向量一并输入到意图层,通过神经网络运算输出初始隐层向量,该初始隐层向量可以用于解码层确定答句的第一个字。
在解码层，可看作编码层的逆向过程，可以将词向量和隐层向量解码为自然语言，可以根据意图层输出的初始隐层向量和注意力层中询问句中每个字的隐层向量，生成答句“我喜欢欧美电影”。假设一个10000个字的字典，每次解码会生成该1万个字的概率分布，然后每次选取概率最大的一个字进行输出。过程如下：首先，意图层输出初始隐层向量，并将该初始隐层向量以及作为第一个输入字符的标识字符“_EOS_”的词向量输入到解码层，通过神经网络更新隐层向量得到第二隐层向量，第二隐层向量通过softmax回归算法生成1万个字的概率分布，其中“我”字的概率最大，因此输出答复字“我”；然后将第二隐层向量和答复字“我”的词向量作为输入，生成第三隐层向量，根据第三隐层向量计算下一个字的概率分布，取概率最大的“喜”字作为输出。重复以上过程，直到输出特殊符号_EOS_时结束全部过程，即可生成答复句“我喜欢欧美电影_EOS_”。
在本发明实施例中,首先将第K轮询问句中的每个字转化为第一词向量,并根据第一词向量计算每个字的正向隐层向量和反向隐层向量;然后获取第K轮询问句的内容主题,并将内容主题转化为第二词向量;其次根据第二词向量、第K轮询问句中最后一个字的正向隐层向量、针对第K-1轮询问句输出的第K-1轮答复句中最后一个字的隐层向量、以及针对第K-1轮询问句输出的第K-1轮答复句的初始隐层向量,确定针对第K轮询问句输出的初始隐层向量;最后根据第K轮询问句中每个字的正向隐层向量和反向隐层向量、以及针对第K轮询问句输出的初始隐层向量,生成针对第K轮询问句的答复句,通过在生成对话过程中加入主题内容,有效的抑制了跨主题通用高频答复句的生成,提高生成对话的精确性。
请参考图4,图4是本发明实施例提供的一种对话生成设备的结构示意图。如图所示,该设备可以包括:至少一个处理器401,例如CPU,至少一个接口电路402,至少一个存储器403,至少一个总线404。
其中,通信总线404配置为实现这些组件之间的连接通信。
其中,本发明实施例中的接口电路402可以是有线发送端口,也可以为无线设备,例如包括天线装置,配置为与其他节点设备进行信令或数据的通信。
存储器403可以是高速RAM存储器，也可以是非易失性存储器(non-volatile memory)，例如至少一个磁盘存储器。可选地，存储器403还可以是至少一个位于远离前述处理器401的存储装置。存储器403中存储一组程序代码，且处理器401配置为调用存储器中存储的程序代码，配置为执行以下步骤：
将第K轮询问句中的每个字转化为第一词向量,并根据所述第一词向量计算所述每个字的正向隐层向量和反向隐层向量,K为大于等于2的正整数;
获取所述第K轮询问句的内容主题,并将所述内容主题转化为第二词向量;
根据所述第二词向量、所述第K轮询问句中最后一个字的正向隐层向量、针对第K-1轮询问句输出的第K-1轮答复句中最后一个字的隐层向量、以及针对所述第K-1轮询问句输出的第K-1轮答复句的初始隐层向量,确定针对所述第K轮询问句输出的初始隐层向量;
根据所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量、以及所述针对所述第K轮询问句输出的初始隐层向量,生成针对所述第K轮询问句的答复句。
其中,处理器401配置为执行如下操作步骤:
根据所述第K轮询问句中目标字的第一词向量和所述目标字的上一个字的正向隐层向量,计算所述目标字的正向隐层向量;或
根据所述第K轮询问句中目标字的第一词向量和所述目标字的下一个字的反向隐层向量,计算所述目标字的反向隐层向量。
其中,处理器401配置为执行如下操作步骤:
对所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量进行拼接得到所述第K轮询问句中每个字的隐层向量;
根据所述针对所述第K轮询问句输出的初始隐层向量以及所述第K轮询问句中每个字的隐层向量,生成针对所述第K轮询问句的答复句。
其中,处理器401配置为执行如下操作步骤:
根据所述针对所述第K轮询问句输出的初始隐层向量以及预设的标识字符的词向量，确定针对所述第K轮询问句输出的第二隐层向量，进而根据所述第二隐层向量确定所述针对所述第K轮询问句输出的第一个答复字；
根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成第二个答复字的贡献度;
根据所述第K轮询问句中每个字对生成第二个答复字的贡献度、所述第二隐层向量以及所述第一个答复字的词向量,计算所述第三隐层向量;
根据所述第三隐层向量,生成针对所述第K轮询问句的第二个答复字,依次类推生成针对所述第K轮询问句的答复句。
其中,处理器401配置为执行如下操作步骤:
根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成所述第二个答复字的权重;
根据所述第K轮询问句中每个字对生成所述第二个答复字的权重,计算所述第K轮询问句中每个字的隐层向量的加权和,并将所述加权和作为所述第K轮询问句中每个字对生成所述第二个答复字的贡献度。
其中,处理器401配置为执行如下操作步骤:
根据所述第三隐层向量,计算在预设字典中的每个字的概率分布;
选择在所述预设字典中概率最大的字作为所述第二个答复字进行输出。
本发明实施例中,如果以软件功能部分的形式实现上述的对话生成方法,并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本发明各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、移动硬盘、只读存储器 (Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本发明实施例不限制于任何特定的硬件和软件结合。
相应地,本发明实施例再提供一种计算机存储介质,所述计算机存储介质中存储有计算机可执行指令,该计算机可执行指令用于执行本发明实施例中对话生成方法。
需要说明的是,对于前述的各个方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本发明并不受所描述的动作顺序的限制,因为依据本发明,某一些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本发明所必须的。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详细描述的部分,可以参见其他实施例的相关描述。
本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令相关的硬件来完成，该程序可以存储于一计算机可读存储介质中，存储介质可以包括：闪存盘、只读存储器(Read-Only Memory，ROM)、随机存取存储器(Random Access Memory，RAM)、磁盘或光盘等。
以上对本发明实施例所提供的对话生成方法及相关装置、设备、存储介质进行了详细介绍，本文中应用了具体个例对本发明的原理及实施方式进行了阐述，以上实施例的说明只是用于帮助理解本发明的方法及其核心思想；同时，对于本领域的一般技术人员，依据本发明的思想，在具体实施方式及应用范围上均会有改变之处，综上所述，本说明书内容不应理解为对本发明的限制。
工业实用性
在本发明实施例中，根据第K轮询问句中每个字的正向隐层向量和反向隐层向量、以及针对第K轮询问句输出的初始隐层向量，生成针对第K轮询问句的答复句，通过在生成对话过程中加入主题内容，有效地抑制了跨主题通用高频答复句的生成，提高生成对话的精确性。

Claims (16)

  1. 一种对话生成方法,所述方法包括:
    将第K轮询问句中的每个字转化为第一词向量,并根据所述第一词向量计算所述每个字的正向隐层向量和反向隐层向量,K为大于等于2的正整数;
    获取所述第K轮询问句的内容主题,并将所述内容主题转化为第二词向量;
    根据所述第二词向量、所述第K轮询问句中最后一个字的正向隐层向量、针对第K-1轮询问句输出的第K-1轮答复句中最后一个字的隐层向量、以及针对所述第K-1轮询问句输出的第K-1轮答复句的初始隐层向量,确定针对所述第K轮询问句输出的初始隐层向量;
    根据所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量、以及所述针对所述第K轮询问句输出的初始隐层向量,生成针对所述第K轮询问句的答复句。
  2. 如权利要求1所述的方法,其中,所述根据所述第一词向量计算所述每个字的正向隐层向量和反向隐层向量包括:
    根据所述第K轮询问句中目标字的第一词向量和所述目标字的上一个字的正向隐层向量,计算所述目标字的正向隐层向量。
  3. 如权利要求1所述的方法,其中,所述根据所述第一词向量计算所述每个字的正向隐层向量和反向隐层向量包括:
    根据所述第K轮询问句中目标字的第一词向量和所述目标字的下一个字的反向隐层向量,计算所述目标字的反向隐层向量。
  4. 如权利要求1所述的方法，其中，所述根据所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量、以及所述针对所述第K轮询问句输出的初始隐层向量，生成针对所述第K轮询问句的答复句包括：
    对所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量进行拼接得到所述第K轮询问句中每个字的隐层向量;
    根据所述针对所述第K轮询问句输出的初始隐层向量以及所述第K轮询问句中每个字的隐层向量,生成针对所述第K轮询问句的答复句。
  5. 如权利要求4所述的方法,其中,所述根据所述针对所述第K轮询问句输出的初始隐层向量以及所述第K轮询问句中每个字的隐层向量,生成针对所述第K轮询问句的答复句包括:
    根据所述针对所述第K轮询问句输出的初始隐层向量以及预设的标识字符的词向量,确定针对所述第K轮询问句输出的第二隐层向量,进而根据所述第二隐层向量确定所述针对所述第K轮询问句输出的第一个答复字;
    根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成第二个答复字的贡献度;
    根据所述第K轮询问句中每个字对生成第二个答复字的贡献度、所述第二隐层向量以及所述第一个答复字的词向量,计算所述第三隐层向量;
    根据所述第三隐层向量,生成针对所述第K轮询问句的第二个答复字,依次类推生成针对所述第K轮询问句的答复句。
  6. 如权利要求5所述的方法,其中,所述根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成第二个答复字的贡献度包括:
    根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成所述第二个答复字的权重;
    根据所述第K轮询问句中每个字对生成所述第二个答复字的权重, 计算所述第K轮询问句中每个字的隐层向量的加权和,并将所述加权和作为所述第K轮询问句中每个字对生成所述第二个答复字的贡献度。
  7. 如权利要求5或6所述的方法,其中,所述根据所述第三隐层向量,生成针对所述第K轮询问句的第二个答复字包括:
    根据所述第三隐层向量,计算在预设字典中的每个字的概率分布;
    选择在所述预设字典中概率最大的字作为所述第二个答复字进行输出。
  8. 一种对话生成装置,所述装置包括:
    隐层计算部分,配置为将第K轮询问句中的每个字转化为第一词向量,并根据所述第一词向量计算所述每个字的正向隐层向量和反向隐层向量,K为大于等于2的正整数;
    主题确定部分,配置为获取所述第K轮询问句的内容主题,并将所述内容主题转化为第二词向量;
    向量计算部分,配置为根据所述第二词向量、所述第K轮询问句中最后一个字的正向隐层向量、针对第K-1轮询问句输出的第K-1轮答复句中最后一个字的隐层向量、以及针对所述第K-1轮询问句输出的第K-1轮答复句的初始隐层向量,确定针对所述第K轮询问句输出的初始隐层向量;
    答复输出部分,配置为根据所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量、以及所述针对所述第K轮询问句输出的初始隐层向量,生成针对所述第K轮询问句的答复句。
  9. 如权利要求8所述的装置,其中,所述隐层计算部分配置为:
    根据所述第K轮询问句中目标字的第一词向量和所述目标字的上一个字的正向隐层向量,计算所述目标字的正向隐层向量。
  10. 如权利要求8所述的装置,其中,所述隐层计算部分配置为:
    根据所述第K轮询问句中目标字的第一词向量和所述目标字的下一个字的反向隐层向量,计算所述目标字的反向隐层向量。
  11. 如权利要求8所述的装置,其中,所述答复输出部分配置为:
    对所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量进行拼接得到所述第K轮询问句中每个字的隐层向量;
    根据所述针对所述第K轮询问句输出的初始隐层向量以及所述第K轮询问句中每个字的隐层向量,生成针对所述第K轮询问句的答复句。
  12. 如权利要求11所述的装置，其中，所述答复输出部分配置为：
    根据所述针对所述第K轮询问句输出的初始隐层向量以及预设的标识字符的词向量,确定针对所述第K轮询问句输出的第二隐层向量,进而根据所述第二隐层向量确定所述针对所述第K轮询问句输出的第一个答复字;
    根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成第二个答复字的贡献度;
    根据所述第K轮询问句中每个字对生成第二个答复字的贡献度、所述第二隐层向量以及所述第一个答复字的词向量,计算所述第三隐层向量;
    根据所述第三隐层向量,生成针对所述第K轮询问句的第二个答复字,依次类推生成针对所述第K轮询问句的答复句。
  13. 如权利要求12所述的装置，其中，所述答复输出部分配置为：
    根据所述第二隐层向量以及所述第K轮询问句中每个字的隐层向量,计算所述第K轮询问句中每个字对生成所述第二个答复字的权重;
    根据所述第K轮询问句中每个字对生成所述第二个答复字的权重,计算所述第K轮询问句中每个字的隐层向量的加权和,并将所述加权和作为所述第K轮询问句中每个字对生成所述第二个答复字的贡献度。
  14. 如权利要求12或13所述的装置，其中，所述答复输出部分配置为：
    根据所述第三隐层向量,计算在预设字典中的每个字的概率分布;
    选择在所述预设字典中概率最大的字作为所述第二个答复字进行输出。
  15. 一种对话生成设备,其中,所述设备包括接口电路、存储器以及处理器,其中,存储器中存储一组程序代码,且处理器配置为调用存储器中存储的程序代码,配置为执行以下步骤:
    将第K轮询问句中的每个字转化为第一词向量,并根据所述第一词向量计算所述每个字的正向隐层向量和反向隐层向量,K为大于等于2的正整数;
    获取所述第K轮询问句的内容主题,并将所述内容主题转化为第二词向量;
    根据所述第二词向量、所述第K轮询问句中最后一个字的正向隐层向量、针对第K-1轮询问句输出的第K-1轮答复句中最后一个字的隐层向量、以及针对所述第K-1轮询问句输出的第K-1轮答复句的初始隐层向量,确定针对所述第K轮询问句输出的初始隐层向量;
    根据所述第K轮询问句中每个字的所述正向隐层向量和所述反向隐层向量、以及所述针对所述第K轮询问句输出的初始隐层向量,生成针对所述第K轮询问句的答复句。
  16. 一种计算机存储介质,所述计算机存储介质中存储有计算机可执行指令,该计算机可执行指令用于执行权利要求1至7任一项提供的对话生成方法。
PCT/CN2017/093417 2016-07-19 2017-07-18 一种对话生成方法及装置、设备、存储介质 WO2018014835A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/997,912 US10740564B2 (en) 2016-07-19 2018-06-05 Dialog generation method, apparatus, and device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610567504.0 2016-07-19
CN201610567504.0A CN107632987B (zh) 2016-07-19 2016-07-19 一种对话生成方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/997,912 Continuation US10740564B2 (en) 2016-07-19 2018-06-05 Dialog generation method, apparatus, and device, and storage medium

Publications (1)

Publication Number Publication Date
WO2018014835A1 true WO2018014835A1 (zh) 2018-01-25

Family

ID=60991987

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/093417 WO2018014835A1 (zh) 2016-07-19 2017-07-18 一种对话生成方法及装置、设备、存储介质

Country Status (3)

Country Link
US (1) US10740564B2 (zh)
CN (1) CN107632987B (zh)
WO (1) WO2018014835A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134790A (zh) * 2019-05-17 2019-08-16 中国科学技术大学 一种语境集合与回复集合的匹配方法及装置
CN111597339A (zh) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 文档级多轮对话意图分类方法、装置、设备及存储介质
CN115017314A (zh) * 2022-06-02 2022-09-06 电子科技大学 一种基于注意力机制的文本分类方法

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2568233A (en) * 2017-10-27 2019-05-15 Babylon Partners Ltd A computer implemented determination method and system
CN110309275B (zh) * 2018-03-15 2024-06-14 北京京东尚科信息技术有限公司 一种对话生成的方法和装置
CN108491514B (zh) * 2018-03-26 2020-12-01 清华大学 对话系统中提问的方法及装置、电子设备、计算机可读介质
CN109241262B (zh) * 2018-08-31 2021-01-05 出门问问信息科技有限公司 基于关键词生成回复语句的方法及装置
CN109241265B (zh) * 2018-09-17 2022-06-03 四川长虹电器股份有限公司 一种面向多轮查询的领域识别方法及系统
CN109376222B (zh) * 2018-09-27 2021-05-25 国信优易数据股份有限公司 问答匹配度计算方法、问答自动匹配方法及装置
US11270084B2 (en) * 2018-10-12 2022-03-08 Johnson Controls Tyco IP Holdings LLP Systems and methods for using trigger words to generate human-like responses in virtual assistants
CN109558585A (zh) * 2018-10-26 2019-04-02 深圳点猫科技有限公司 一种基于教育系统的答案自动寻找方法及电子设备
CN109635093B (zh) * 2018-12-17 2022-05-27 北京百度网讯科技有限公司 用于生成回复语句的方法和装置
CN109726394A (zh) * 2018-12-18 2019-05-07 电子科技大学 基于融合btm模型的短文本主题聚类方法
CN109597884B (zh) * 2018-12-28 2021-07-20 北京百度网讯科技有限公司 对话生成的方法、装置、存储介质和终端设备
IT201900000526A1 (it) * 2019-01-11 2020-07-11 Userbot S R L Sistema di intelligenza artificiale per processi aziendali
CN110147435B (zh) * 2019-01-24 2023-08-22 腾讯科技(深圳)有限公司 对话生成方法、装置、设备及存储介质
CN109933809B (zh) * 2019-03-15 2023-09-15 北京金山数字娱乐科技有限公司 一种翻译方法及装置、翻译模型的训练方法及装置
CN109992785B (zh) * 2019-04-09 2023-07-25 腾讯科技(深圳)有限公司 基于机器学习的内容计算方法、装置及设备
US11604962B2 (en) 2019-05-09 2023-03-14 Genpact Luxembourg S.à r.l. II Method and system for training a machine learning system using context injection
CN110413729B (zh) * 2019-06-25 2023-04-07 江南大学 基于尾句-上下文双重注意力模型的多轮对话生成方法
US11176330B2 (en) * 2019-07-22 2021-11-16 Advanced New Technologies Co., Ltd. Generating recommendation information
CN110598206B (zh) * 2019-08-13 2023-04-07 平安国际智慧城市科技股份有限公司 文本语义识别方法、装置、计算机设备和存储介质
CN110473540B (zh) * 2019-08-29 2022-05-31 京东方科技集团股份有限公司 语音交互方法及系统、终端设备、计算机设备及介质
CN111091011B (zh) * 2019-12-20 2023-07-28 科大讯飞股份有限公司 领域预测方法、领域预测装置及电子设备
CN111428014A (zh) * 2020-03-17 2020-07-17 北京香侬慧语科技有限责任公司 一种基于最大互信息的非自回归对话说生成方法及模型
US11494564B2 (en) * 2020-03-27 2022-11-08 Naver Corporation Unsupervised aspect-based multi-document abstractive summarization
CN112270167B (zh) * 2020-10-14 2022-02-08 北京百度网讯科技有限公司 角色标注方法、装置、电子设备和存储介质
CN113076408B (zh) * 2021-03-19 2024-09-17 联想(北京)有限公司 一种会话信息的处理方法及装置
CN112925896A (zh) * 2021-04-04 2021-06-08 河南工业大学 一种基于联合解码的话题扩展情感对话生成方法
US20230169271A1 (en) * 2021-11-30 2023-06-01 Adobe Inc. System and methods for neural topic modeling using topic attention networks
CN114238549A (zh) * 2021-12-15 2022-03-25 平安科技(深圳)有限公司 文本生成模型的训练方法、装置、存储介质及计算机设备
CN115293132B (zh) * 2022-09-30 2022-12-30 腾讯科技(深圳)有限公司 虚拟场景的对话处理方法、装置、电子设备及存储介质
CN116226356B (zh) * 2023-05-08 2023-07-04 深圳市拓保软件有限公司 一种基于nlp的智能客服交互方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8566102B1 (en) * 2002-03-28 2013-10-22 At&T Intellectual Property Ii, L.P. System and method of automating a spoken dialogue service
US20140236577A1 (en) * 2013-02-15 2014-08-21 Nec Laboratories America, Inc. Semantic Representations of Rare Words in a Neural Probabilistic Language Model
CN104462064A (zh) * 2014-12-15 2015-03-25 陈包容 一种移动终端信息通讯提示输入内容的方法和系统
CN104615646A (zh) * 2014-12-25 2015-05-13 上海科阅信息技术有限公司 智能聊天机器人系统

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5974412A (en) * 1997-09-24 1999-10-26 Sapient Health Network Intelligent query system for automatically indexing information in a database and automatically categorizing users
US6654735B1 (en) * 1999-01-08 2003-11-25 International Business Machines Corporation Outbound information analysis for generating user interest profiles and improving user productivity
US6347313B1 (en) * 1999-03-01 2002-02-12 Hewlett-Packard Company Information embedding based on user relevance feedback for object retrieval
US7567958B1 (en) * 2000-04-04 2009-07-28 Aol, Llc Filtering system for providing personalized information in the absence of negative data
DE60111329T2 (de) * 2000-11-14 2006-03-16 International Business Machines Corp. Anpassung des phonetischen Kontextes zur Verbesserung der Spracherkennung
US7590603B2 (en) * 2004-10-01 2009-09-15 Microsoft Corporation Method and system for classifying and identifying messages as question or not a question within a discussion thread
JP4476786B2 (ja) * 2004-11-10 2010-06-09 株式会社東芝 検索装置
CN1952928A (zh) * 2005-10-20 2007-04-25 梁威 建立自然语言知识库及其自动问答检索的计算机系统
US9129300B2 (en) * 2010-04-21 2015-09-08 Yahoo! Inc. Using external sources for sponsored search AD selection
US10331785B2 (en) * 2012-02-17 2019-06-25 Tivo Solutions Inc. Identifying multimedia asset similarity using blended semantic and latent feature analysis
US9465833B2 (en) * 2012-07-31 2016-10-11 Veveo, Inc. Disambiguating user intent in conversational interaction system for large corpus information retrieval
US9298757B1 (en) * 2013-03-13 2016-03-29 International Business Machines Corporation Determining similarity of linguistic objects
US20140280088A1 (en) * 2013-03-15 2014-09-18 Luminoso Technologies, Inc. Combined term and vector proximity text search
US9514753B2 (en) * 2013-11-04 2016-12-06 Google Inc. Speaker identification using hash-based indexing
US20150169772A1 (en) * 2013-12-12 2015-06-18 Microsoft Corporation Personalizing Search Results Based on User-Generated Content
CN104050256B (zh) * 2014-06-13 2017-05-24 西安蒜泥电子科技有限责任公司 基于主动学习的问答方法及采用该方法的问答系统
US20160189556A1 (en) * 2014-12-29 2016-06-30 International Business Machines Corporation Evaluating presentation data
CN105095444A (zh) * 2015-07-24 2015-11-25 百度在线网络技术(北京)有限公司 信息获取方法和装置
US10102206B2 (en) * 2016-03-31 2018-10-16 Dropbox, Inc. Intelligently identifying and presenting digital documents
WO2018014018A1 (en) * 2016-07-15 2018-01-18 University Of Central Florida Research Foundation, Inc. Synthetic data generation of time series data
US11113732B2 (en) * 2016-09-26 2021-09-07 Microsoft Technology Licensing, Llc Controlling use of negative features in a matching operation
CN107885756B (zh) * 2016-09-30 2020-05-08 华为技术有限公司 基于深度学习的对话方法、装置及设备
US10540967B2 (en) * 2016-11-14 2020-01-21 Xerox Corporation Machine reading method for dialog state tracking
US10635733B2 (en) * 2017-05-05 2020-04-28 Microsoft Technology Licensing, Llc Personalized user-categorized recommendations

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8566102B1 (en) * 2002-03-28 2013-10-22 At&T Intellectual Property Ii, L.P. System and method of automating a spoken dialogue service
US20140236577A1 (en) * 2013-02-15 2014-08-21 Nec Laboratories America, Inc. Semantic Representations of Rare Words in a Neural Probabilistic Language Model
CN104462064A (zh) * 2014-12-15 2015-03-25 陈包容 一种移动终端信息通讯提示输入内容的方法和系统
CN104615646A (zh) * 2014-12-25 2015-05-13 上海科阅信息技术有限公司 智能聊天机器人系统

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SERBAN, I.V. ET AL.: "Building End-to-End Dialogue Systems Using Generative Hierarchical Neural Network Models", PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 17 February 2015 (2015-02-17), pages 3776 - 3783, XP055454537 *
SHANG, LIFENG ET AL.: "Neural Responding Machine for Short-Text Conversation", PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 7 TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, 31 July 2015 (2015-07-31), pages 1577 - 1586, XP055295743 *
SORDONI, A. ET AL.: "A Neural Network Approach to Context-Sensitive Generation of Conversational Responses", HUMAN LANGUAGE TECHNOLOGIES: THE 2015 ANNUAL CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ACL, 5 June 2015 (2015-06-05), pages 196 - 205, XP055295652 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110134790A (zh) * 2019-05-17 2019-08-16 中国科学技术大学 一种语境集合与回复集合的匹配方法及装置
CN110134790B (zh) * 2019-05-17 2022-09-30 中国科学技术大学 一种语境集合与回复集合的匹配方法及装置
CN111597339A (zh) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 文档级多轮对话意图分类方法、装置、设备及存储介质
CN115017314A (zh) * 2022-06-02 2022-09-06 电子科技大学 一种基于注意力机制的文本分类方法

Also Published As

Publication number Publication date
US20180285348A1 (en) 2018-10-04
CN107632987B (zh) 2018-12-07
US10740564B2 (en) 2020-08-11
CN107632987A (zh) 2018-01-26

Similar Documents

Publication Publication Date Title
WO2018014835A1 (zh) 一种对话生成方法及装置、设备、存储介质
US12067981B2 (en) Adversarial learning and generation of dialogue responses
US10515155B2 (en) Conversational agent
US10255275B2 (en) Method and system for generation of candidate translations
CN109582767B (zh) 对话系统处理方法、装置、设备及可读存储介质
US11934454B2 (en) Video processing method and apparatus, video retrieval method and apparatus, storage medium, and server
CN108153913B (zh) 回复信息生成模型的训练方法、回复信息生成方法及装置
US20180329884A1 (en) Neural contextual conversation learning
WO2019174450A1 (zh) 一种对话生成的方法和装置
US11593571B2 (en) Machine translation method, device, and computer-readable storage medium
US20230394245A1 (en) Adversarial Bootstrapping for Multi-Turn Dialogue Model Training
CN110069612B (zh) 一种回复生成方法及装置
WO2022052744A1 (zh) 会话信息处理方法、装置、计算机可读存储介质及设备
CN115309877A (zh) 对话生成方法、对话模型训练方法及装置
US20240177506A1 (en) Method and Apparatus for Generating Captioning Device, and Method and Apparatus for Outputting Caption
Zhang et al. Nlp-qa framework based on lstm-rnn
WO2020155769A1 (zh) 关键词生成模型的建模方法和装置
WO2023137922A1 (zh) 语音消息生成方法和装置、计算机设备、存储介质
CN111506717B (zh) 问题答复方法、装置、设备及存储介质
WO2023231513A1 (zh) 对话内容的生成方法及装置、存储介质、终端
EP3525107A1 (en) Conversational agent
CN111104806B (zh) 神经机器翻译模型的构建方法及装置、翻译方法及装置
CN111797220A (zh) 对话生成方法、装置、计算机设备和存储介质
CN115879480A (zh) 语义约束机器翻译方法、装置、电子设备及存储介质
US20230168989A1 (en) BUSINESS LANGUAGE PROCESSING USING LoQoS AND rb-LSTM

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17830469

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17830469

Country of ref document: EP

Kind code of ref document: A1