US20210303606A1 - Dialog generation method and apparatus, device, and storage medium - Google Patents

Dialog generation method and apparatus, device, and storage medium

Info

Publication number
US20210303606A1
Authority
US
United States
Prior art keywords
word
input
encoding vector
attention score
dialog
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US17/346,197
Other languages
English (en)
Other versions
US12056167B2 (en)
Inventor
Yizhang TAN
Jiachen DING
Changyu MIAO
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED reassignment TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: TAN, Yizhang, DING, Jiachen, MIAO, Changyu
Publication of US20210303606A1
Application granted
Publication of US12056167B2
Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • Embodiments of this application relate to the field of artificial intelligence, and in particular, to a dialog generation method and apparatus, a device, and a storage medium.
  • Natural language dialog is one of the greatest challenges in artificial intelligence.
  • Dialog systems, such as Siri provided by Apple Inc., can make simple dialogs with humans and complete simple tasks, such as inquiring about the weather and checking stocks.
  • A dialog generation method and apparatus, a device, and a storage medium are provided.
  • Specific technical solutions are as follows.
  • One aspect of the present disclosure provides a dialog generation method, performed by a human-machine dialog system.
  • the method includes obtaining an input dialog sequence from a dialog client; obtaining associated information related to the input dialog sequence; encoding, by an encoder, the input dialog sequence to obtain an input encoding vector; encoding, by the encoder, the associated information to obtain an associated encoding vector; decoding, by a decoder, the input encoding vector and the associated encoding vector to obtain an output dialog sequence, the output dialog sequence comprising an out-of-vocabulary word corresponding to the associated information; and transmitting the output dialog sequence to the dialog client.
  • the apparatus includes an interface module, an obtaining module, an encoding module, and a decoding module.
  • the interface module is configured to obtain an input dialog sequence from a dialog client.
  • the obtaining module is configured to obtain associated information related to the input dialog sequence.
  • the encoding module is configured to encode the input dialog sequence to obtain an input encoding vector, and to encode the associated information to obtain an associated encoding vector.
  • the decoding module is configured to decode the input encoding vector and the associated encoding vector to obtain an output dialog sequence, the output dialog sequence comprising an out-of-vocabulary word corresponding to the associated information.
  • the interface module is configured to transmit the output dialog sequence to the dialog client.
  • a computer device including a memory and one or more processors.
  • One or more memories store at least one computer-readable instruction, and the at least one computer-readable instruction is loaded and executed by the one or more processors to implement a plurality of operations.
  • the operations include: obtaining an input dialog sequence from a dialog client; obtaining associated information related to the input dialog sequence; encoding, by an encoder, the input dialog sequence to obtain an input encoding vector; encoding, by the encoder, the associated information to obtain an associated encoding vector; decoding, by a decoder, the input encoding vector and the associated encoding vector to obtain an output dialog sequence, the output dialog sequence comprising an out-of-vocabulary word corresponding to the associated information; and transmitting the output dialog sequence to the dialog client.
  • Another aspect of this disclosure provides a non-transitory computer-readable storage medium storing computer-readable instructions. When executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the dialog generation method described above.
  • FIG. 1 is a block diagram of a human-machine dialog system according to one embodiment of this application.
  • FIG. 2 is a flowchart of a dialog generation method according to one embodiment of this application.
  • FIG. 3 is a diagram of a principle of a dialog generation method according to one embodiment of this application.
  • FIG. 4 is a flowchart of a dialog generation method according to one embodiment of this application.
  • FIG. 5 is a diagram of a principle of a dialog generation method according to one embodiment of this application.
  • FIG. 6 is a block diagram of a dialog generation apparatus according to one embodiment of this application.
  • FIG. 7 is a block diagram of a computer device according to one embodiment of this application.
  • A first extended dictionary may be referred to as a second extended dictionary; and similarly, a second extended dictionary may be referred to as a first extended dictionary.
  • Both the first extended dictionary and the second extended dictionary may be extended dictionaries, and in some cases, may be separate and different extended dictionaries.
  • F: factual information (such as news reports) related to a current chat topic.
  • H: opinion information (such as news comments or discussion history) related to a current chat topic.
  • Encoding: expressing a dialog sequence as one or more encoding vectors, where the "dialog sequence" is generally a dialog sequence of variable length.
  • Decoding: outputting a corresponding output dialog sequence according to an encoding vector corresponding to an input dialog sequence.
  • Attention mechanism: calculating a weight of one vector relative to a plurality of vectors, and obtaining a weighted average according to the weight.
  • Copy generation network: a new text generation system that can automatically copy text fragments from an input text to a generated text, or generate new text fragments.
  • Recurrent neural network (RNN) cell: a component of a recurrent neural network, where for an input vector, an output vector is obtained through linear mapping and nonlinear transformation of the neural network.
  • LSTM: long short-term memory.
  • a dialog system may select words and phrases from a preset dictionary to form a dialog sequence, i.e., an answer. All words and phrases in the answer come from the preset dictionary. Because all words and phrases in answers outputted by the dialog system are derived from a dictionary while words and phrases in the dictionary are preset and fixed, content of the answers outputted by the dialog system may be limited.
  • An embodiment of this application provides a dialog generation method applicable to a human-machine dialog system.
  • the system can combine hot events and/or different opinions and automatically generate sentences with facts and/or opinions to reply to the user.
  • FIG. 1 is a schematic structural diagram of a human-machine dialog system according to an embodiment of this application.
  • the human-machine dialog system includes: a dialog client 100 , a dialog server 200 , and an information resource server 300 .
  • the dialog client 100 can be implemented as any device such as a smart speaker, a smart robot, a smart vanity mirror, a smart phone, an application client, or a web client.
  • the dialog client 100 is provided with a microphone and a speaker, or the dialog client 100 is provided with peripheral components for inputting and displaying a text.
  • the dialog server 200 is a server for providing a backend dialog service for the dialog client 100 .
  • the dialog server 200 may be one server or a plurality of servers.
  • a neural network model based on sequence to sequence (seq2seq) is provided in the dialog server 200 .
  • the neural network model is used to generate an output dialog sequence based on the input dialog sequence.
  • Dialog services provided by the dialog server 200 may include, but are not limited to, weather query, business consulting, and smart customer service (for example, for air ticket service or restaurant service).
  • the dialog server 200 is also connected to the information resource server 300 .
  • the information resource server 300 stores factual information (Facts) and opinion information (History).
  • the dialog server 200 can obtain, from the information resource server 300 , factual information and/or opinion information related to the input dialog sequence.
  • the dialog server 200 includes: an interface module 220 , an obtaining module 240 , an encoder 260 , and a decoder 280 .
  • the interface module 220 is an interaction module or a communication module between the dialog server 200 and the dialog client 100 .
  • the interface module 220 is configured to obtain the input dialog sequence of the user from the dialog client 100 and transmit the sequence to the obtaining module 240 and the encoder 260 .
  • the interface module 220 is further configured to transmit the output dialog sequence generated by the dialog server 200 to the dialog client 100 .
  • The obtaining module 240 is configured to obtain factual information and/or opinion information corresponding to the input dialog sequence in the information resource server 300 .
  • FIG. 2 is a flowchart of a dialog generation method according to one embodiment of this application.
  • The following description takes as an example the method applied to the dialog server 200 in FIG. 1.
  • the method includes the following steps.
  • Step 201 Obtain an input dialog sequence from a dialog client.
  • the input dialog sequence is an input sequence, that is, a to-be-processed dialog sequence.
  • the dialog client collects the input dialog sequence in text form and/or speech form from a user, and transmits the input dialog sequence to the dialog server.
  • the dialog server obtains the input dialog sequence from the dialog client.
  • the input dialog sequence in speech form may be converted into the input dialog sequence in text form by the dialog client or the dialog server.
  • Step 202 Obtain associated information related to the input dialog sequence.
  • The dialog server may retrieve the associated information related to the input dialog sequence in an information resource server according to the input dialog sequence.
  • the associated information includes: factual information (Facts) and/or opinion information (History).
  • the factual information includes at least one of news reports, encyclopedia knowledge, or common knowledge.
  • the opinion information includes at least one of forum discussion history or a thread.
  • Step 203 Call an encoder to encode the input dialog sequence to obtain an input encoding vector.
  • the dialog server may convert the input dialog sequence into an input word vector, and then call the encoder to encode the input word vector to obtain an input encoding vector.
  • the input encoding vector is a feature vector used to represent an input dialog sequence.
  • Step 204 Call the encoder to encode the associated information to obtain an associated encoding vector.
  • the dialog server may convert the associated information into an associated word vector, and then call the encoder to encode the associated word vector to obtain the associated encoding vector.
  • the associated encoding vector is a feature vector used to represent associated information.
  • Step 205 Call a decoder to decode the input encoding vector and the associated encoding vector to obtain an output dialog sequence, the output dialog sequence including an out-of-vocabulary word belonging to the associated information.
  • An out-of-vocabulary (OOV) word is a word that is not in the dictionary.
  • a dictionary is equivalent to a set of words. If a word is in the dictionary, it is called an in-vocabulary word. Otherwise, it is called an out-of-vocabulary word.
  • a decoder is called to dynamically decode the input encoding vector and the associated encoding vector to obtain an output dialog sequence, the output dialog sequence including an out-of-vocabulary word belonging to the associated information.
  • the dynamic decoding includes: generating an output word from a preset dictionary according to the input encoding vector, and/or copying an output word from an extended dictionary according to the associated encoding vector.
  • the extended dictionary is a dictionary constructed based on words in the associated information.
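The relationship between the preset dictionary, the extended dictionary, and out-of-vocabulary words can be illustrated with a minimal sketch. The function names, the whitespace tokenization, and the sample sentence are assumptions made for illustration only; the source does not specify how the extended dictionary is stored.

```python
def build_extended_dictionary(associated_words):
    """Collect the distinct words of the associated information, in first-seen order."""
    extended = []
    seen = set()
    for word in associated_words:
        if word not in seen:
            seen.add(word)
            extended.append(word)
    return extended


def out_of_vocabulary(words, preset_dictionary):
    """Return the words that do not appear in the preset dictionary."""
    return [word for word in words if word not in preset_dictionary]


# Hypothetical preset dictionary and factual information.
preset_dictionary = {"the", "team", "won", "game", "last", "night"}
facts = "the Raptors won the game last night".split()

extended_dictionary = build_extended_dictionary(facts)
oov_words = out_of_vocabulary(extended_dictionary, preset_dictionary)
```

In this toy case, `oov_words` holds the only word that can reach the reply by copying rather than by generation from the preset dictionary.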
  • Step 206 Transmit the output dialog sequence to the dialog client.
  • the dialog server transmits the output dialog sequence in text form to the dialog client, or transmits the output dialog sequence in speech form to the dialog client after converting the output dialog sequence in text form into the output dialog sequence in speech form.
  • an encoder is called to encode the input dialog sequence to obtain an input encoding vector; the encoder is called to encode the associated information to obtain an associated encoding vector; and a decoder is called to decode the input encoding vector and the associated encoding vector to obtain an output dialog sequence.
  • Because the output dialog sequence is dynamically generated based on the preset dictionary and the extended dictionary, the output dialog sequence includes an out-of-vocabulary word belonging to the associated information.
  • When the associated information includes factual information and/or opinion information related to the input dialog sequence, the dialog system can automatically generate an answer with factual information and/or opinion information, thereby achieving a good dialog effect.
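The order of steps 202 to 205 can be sketched as a single function that receives the retrieval, encoding, and decoding stages as callables. All names below are hypothetical, and the three toy stages are placeholders that only make the data flow visible; they are not the neural encoder and decoder described in this application.

```python
def generate_reply(input_sequence, retrieve, encode, decode):
    """Run the dialog generation steps in order on one input dialog sequence."""
    associated = retrieve(input_sequence)      # step 202: associated information
    input_encoding = encode(input_sequence)    # step 203: input encoding vector
    associated_encoding = encode(associated)   # step 204: associated encoding vector
    return decode(input_encoding, associated_encoding)  # step 205: output sequence


# Toy placeholder stages (assumptions for illustration only).
def toy_retrieve(text):
    return "Raptors won 112-104"


def toy_encode(text):
    return text.lower().split()


def toy_decode(input_encoding, associated_encoding):
    return " ".join(["reply:"] + associated_encoding)


reply = generate_reply("Who won last night?", toy_retrieve, toy_encode, toy_decode)
```

Passing the stages as arguments keeps the sketch independent of any particular encoder or decoder type.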
  • FIG. 3 is a schematic diagram of a principle of a dialog system according to one embodiment of this application.
  • the associated information includes: opinion information (History) and factual information (Facts), and the dialog generation model includes an encoder 320 and a decoder 340 .
  • the dialog system obtains a current user input (Input), opinion information (History) related to the current user input, and factual information (Facts) related to the current user input.
  • the encoder 320 is configured to encode the current user input (Input) to obtain an input encoding vector I; encode the opinion information (History) to obtain an opinion encoding vector H; and encode the factual information (Facts) to obtain a fact encoding vector F.
  • the input encoding vector I, the opinion encoding vector H, and the fact encoding vector F are all encoding vectors.
  • the encoder 320 is configured to input the input encoding vector I, the opinion encoding vector H, and the fact encoding vector F to the decoder 340 .
  • the decoder 340 is provided with a copy generation network 50 .
  • the copy generation network 50 decodes the input encoding vector I, the opinion encoding vector H, and the fact encoding vector F, to obtain the output dialog sequence.
  • the output dialog sequence is a reply to the current user input (Input).
  • The fact encoding vector F and/or the opinion encoding vector H obtained by encoding the associated information can be collectively referred to as the associated encoding vector.
  • the dialog system inputs the input encoding vector, the associated encoding vector, and decoding information of a previous instant to the copy generation network 50 for decoding, to obtain the output dialog sequence.
  • FIG. 4 is a flowchart of a dialog generation method according to one embodiment of this application.
  • The following description takes as an example the method applied to the dialog server 200 in the human-machine dialog system in FIG. 1.
  • the method includes the following steps.
  • Step 401 Obtain an input dialog sequence from a dialog client.
  • the input dialog sequence is an input sequence, that is, a to-be-processed dialog sequence.
  • the dialog client collects the input dialog sequence in text form and/or speech form from a user, and transmits the input dialog sequence to the dialog server.
  • the dialog server obtains the input dialog sequence from the dialog client.
  • the input dialog sequence in speech form may be converted into the input dialog sequence in text form by the dialog client or the dialog server.
  • the input dialog sequence is a text sequence of a variable length.
  • Step 402 Obtain associated information related to the input dialog sequence.
  • The dialog server retrieves the associated information related to the input dialog sequence in an information resource server according to the input dialog sequence.
  • the associated information includes: factual information (Facts) and/or opinion information (History).
  • the factual information includes at least one of news reports, encyclopedia knowledge, or common knowledge.
  • the opinion information includes at least one of forum discussion history or a thread.
  • Step 403 Call an encoder to encode the input dialog sequence to obtain an input encoding vector.
  • a correspondence between words and word vectors is preset in the human-machine dialog system.
  • a word vector is a vector that represents a word by using a mathematical method.
  • the human-machine dialog system performs word segmentation on the input dialog sequence and obtains a plurality of words arranged in order. A word vector corresponding to each word is queried, and the word vector corresponding to each word is arranged to obtain a word vector of the input dialog sequence.
  • the encoder encodes the word vector of the input dialog sequence to obtain the input encoding vector I.
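The word segmentation and word vector lookup described above can be sketched as follows. The lookup table, the zero vector used for unknown words, and the whitespace-based segmentation are assumptions for illustration; languages without whitespace word boundaries would need a dedicated segmenter, and the source does not say how unknown words are handled.

```python
# Hypothetical preset correspondence between words and word vectors.
WORD_VECTORS = {
    "how": [0.1, 0.3],
    "is": [0.0, 0.2],
    "weather": [0.7, 0.5],
}
UNKNOWN_VECTOR = [0.0, 0.0]  # assumed fallback for words without a preset vector


def segment(text):
    """Split the text into an ordered list of words (whitespace-based assumption)."""
    return text.lower().split()


def to_word_vectors(text, table=WORD_VECTORS, unknown=UNKNOWN_VECTOR):
    """Query the word vector of each word and arrange the vectors in word order."""
    return [table.get(word, unknown) for word in segment(text)]


vectors = to_word_vectors("How is weather")
```

The resulting list of vectors is what an encoder such as a Bi-LSTM would consume, one vector per word.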
  • the encoder is constructed based on Bi-LSTM, but this embodiment does not limit the encoder to a specific type.
  • Step 404 Call the encoder to encode the associated information to obtain an associated encoding vector.
  • The human-machine dialog system performs word segmentation on the associated information and obtains a plurality of words arranged in order. A word vector corresponding to each word is queried, and the word vector corresponding to each word is arranged to obtain a word vector of the associated information.
  • When the associated information includes opinion information, the human-machine dialog system performs word segmentation on the opinion information and obtains a plurality of words arranged in order. A word vector corresponding to each word is queried, and the word vector corresponding to each word is arranged to obtain a word vector of the opinion information. The encoder encodes the word vector of the opinion information to obtain the opinion encoding vector H.
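  • The opinion encoding vector (also known as a hidden state sequence) can be written as $H^H = \{h_1^H, h_2^H, \ldots, h_i^H, \ldots, h_L^H\}$.
  • H represents opinion information
  • h represents a hidden state
  • L represents a total of L hidden states
  • i is an integer not greater than L.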
  • When the associated information includes factual information, the human-machine dialog system performs word segmentation on the factual information and obtains a plurality of words arranged in order. A word vector corresponding to each word is queried, and the word vector corresponding to each word is arranged to obtain a word vector of the factual information. The encoder encodes the word vector of the factual information to obtain the fact encoding vector F.
  • A single-layer Bi-LSTM is used to encode the word vector of the factual information to obtain the fact encoding vector (also known as a hidden state sequence) $H^F = \{h_1^F, h_2^F, \ldots, h_i^F, \ldots, h_L^F\}$.
  • F represents factual information
  • h represents a hidden state
  • L represents a total of L hidden states
  • i is an integer not greater than L.
  • the encoder connects the input encoding vector end to end as an initial state input of the decoder.
  • the encoder connects an initial hidden state of the opinion encoding vector and a final hidden state of the fact encoding vector end to end as an initial state input of the decoder.
  • the encoder connects an initial hidden state of the fact encoding vector and a final hidden state of the opinion encoding vector end to end as an initial state input of the decoder.
  • the opinion encoding vector and the fact encoding vector are separately used as an initial state input of the decoder.
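One of the options above, connecting an initial hidden state of one encoding vector and a final hidden state of another end to end, can be sketched as a simple concatenation. Reading "end to end" as vector concatenation is an assumption, as are the function name and the toy hidden states.

```python
def initial_decoder_state(first_sequence, second_sequence):
    """Concatenate the initial hidden state of one hidden state sequence
    with the final hidden state of another."""
    return list(first_sequence[0]) + list(second_sequence[-1])


# Toy hidden state sequences (each inner list is one hidden state).
opinion_states = [[0.1, 0.2], [0.3, 0.4]]
fact_states = [[0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]

# Initial hidden state of the opinion encoding vector, then the
# final hidden state of the fact encoding vector.
state = initial_decoder_state(opinion_states, fact_states)
```

Swapping the two arguments gives the other ordering described above, with the initial hidden state of the fact encoding vector first.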
  • Step 405 Determine, at a current decoding instant of the copy generation network, an input attention score of each word in a preset dictionary according to the input encoding vector.
  • Step 406 Determine, at the current decoding instant of the copy generation network, an associated attention score of each word in an extended dictionary according to the associated encoding vector.
  • the copy generation network correspondingly includes: the preset dictionary and the extended dictionary.
  • the preset dictionary is a dictionary with a fixed quantity of words and content; and the extended dictionary is a dictionary constructed based on words in factual information and/or opinion information.
  • the extended dictionary includes a first extended dictionary and/or a second extended dictionary. The first extended dictionary is constructed based on words in the factual information, and the second extended dictionary is constructed based on words in the opinion information.
  • the copy generation network has three modes: a generation mode, an H copy mode and an F copy mode.
  • F copy mode: obtain a probability distribution of each word on a first extended vocabulary corresponding to the factual information.
  • H copy mode: obtain a probability distribution of each word on a second extended vocabulary corresponding to the opinion information.
  • The copy generation network dynamically adopts one of the foregoing modes, and performs decoding to obtain an output word at the current decoding instant. This process is performed based on an attention score of each word, and the probability distribution of each mode is determined based on the attention score.
  • the decoder determines an input attention score of each word in a preset dictionary according to the input encoding vector; and determines an associated attention score of each word in an extended dictionary according to the associated encoding vector.
  • the fact attention score of each word in the first extended dictionary is determined according to the fact encoding vector.
  • $t$ refers to the $t$-th decoding instant
  • $v_H^F$, $W_h^F$, $W_r^F$ and $b^F$ are learnable network parameters
  • $h_j^F$ is the $j$-th hidden state in the fact encoding vector, $j$ being an integer not greater than $L$.
  • $F$ is factual information
  • $\alpha_{tj}^F$ is the attention score of the $j$-th word in the fact encoding vector at the decoding instant $t$.
  • the opinion attention score of each word in the second extended dictionary is determined according to the opinion encoding vector.
  • $t$ is the $t$-th decoding instant
  • $v_H^F$, $W_h^F$, $W_r^F$ and $b^F$ are learnable network parameters
  • $h_i^H$ is the $i$-th hidden state in the opinion encoding vector, $i$ being an integer not greater than $L$.
  • $H$ is opinion information
  • $\alpha_{ti}^H$ is the attention score of the $i$-th word in the opinion encoding vector at the decoding instant $t$.
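The fact attention score and the opinion attention score can both be illustrated with one sketch of an additive attention layer. The exact scoring formula is not given in the text above, so the form used here (a tanh of a linear map of the hidden state plus a linear map of the decoder state plus a bias, projected by a vector and normalized with a softmax) is an assumption; `decoder_state` is a hypothetical stand-in for whatever decoding information is available at instant t.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Normalize raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(hidden_states, decoder_state, W_h, W_r, b, v):
    """Assumed additive attention: one normalized score per hidden state."""
    projected_state = matvec(W_r, decoder_state)
    energies = []
    for h in hidden_states:
        pre = [a + c + d for a, c, d in zip(matvec(W_h, h), projected_state, b)]
        energies.append(sum(vi * math.tanh(p) for vi, p in zip(v, pre)))
    return softmax(energies)


# Toy parameters: identity matrices, zero bias, all-ones projection vector.
identity = [[1.0, 0.0], [0.0, 1.0]]
scores = attention_scores(
    hidden_states=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    decoder_state=[0.5, -0.5],
    W_h=identity, W_r=identity, b=[0.0, 0.0], v=[1.0, 1.0],
)
```

The same function would be called once with the fact hidden states and once with the opinion hidden states, each with its own parameter set.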
  • Step 407 Determine a weight according to the decoding information of the previous instant, and perform a weighted summation on the input attention score and the associated attention score according to the weight, to obtain a total attention score of each word.
  • the human-machine dialog system determines a weight according to the decoding information of the previous instant, and performs a weighted summation on the input attention score and the associated attention score according to the weight, to obtain a total attention score of each word.
  • The human-machine dialog system determines, according to the decoding information of the previous instant, a first weight corresponding to the input attention score, a second weight corresponding to the fact attention score, and a third weight corresponding to the opinion attention score; and adds a product of the input attention score and the first weight, a product of the fact attention score and the second weight, and a product of the opinion attention score and the third weight, to obtain the total attention score of each word.
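  • The total attention score of a word $w$ can be written as $P(w \mid t, H, F) = \sum_{m} \Pr_m(m \mid t, H, F)\,\Pr_m(w \mid t, H, F)$.
  • $H$ is an opinion encoding vector
  • $F$ is a fact encoding vector
  • $t$ is a decoding instant
  • $m$ is an index of an attention score corresponding to the three modes (the generation mode, H copy mode, and F copy mode).
  • $\Pr_m(w \mid t, H, F)$ is an attention score corresponding to the index $m$ at the decoding instant $t$; and $\Pr_m(m \mid t, H, F)$ is the weight corresponding to the index $m$.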
  • the weight is calculated from the decoding information of the previous instant. In an embodiment, the weight is related to the quantity of occurrences of the corresponding word in the opinion information or factual information, the quantity of occurrences being determined based on the decoding information of the previous instant.
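The weighted summation in step 407 can be sketched with plain dictionaries that map words to scores. The mode names, the toy scores, and the fixed weights are assumptions for illustration; in the method described above the weights are computed from the decoding information of the previous instant rather than set by hand.

```python
def total_attention(mode_scores, mode_weights):
    """Weighted sum of per-mode attention scores over the union of all words."""
    total = {}
    for mode, scores in mode_scores.items():
        weight = mode_weights[mode]
        for word, score in scores.items():
            total[word] = total.get(word, 0.0) + weight * score
    return total


# Toy per-mode distributions (each sums to 1).
mode_scores = {
    "generate": {"the": 0.5, "team": 0.3, "won": 0.2},   # preset dictionary
    "copy_F": {"Raptors": 0.7, "won": 0.3},              # first extended dictionary
    "copy_H": {"great": 0.6, "team": 0.4},               # second extended dictionary
}
# Hypothetical fixed weights that sum to 1.
mode_weights = {"generate": 0.5, "copy_F": 0.3, "copy_H": 0.2}

total = total_attention(mode_scores, mode_weights)
```

A word that appears in more than one dictionary, such as "team" here, accumulates a contribution from each mode, so the result remains a probability distribution when the weights sum to 1.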
  • Step 408 Determine a word with the highest total attention score as an output word of the current decoding instant.
  • the word with the highest total attention score is extracted from the preset dictionary as the output word of the current decoding instant when the word with the highest total attention score belongs to the preset dictionary; and the word with the highest total attention score is copied from the extended dictionary as the output word of the current decoding instant when the word with the highest total attention score belongs to the extended dictionary.
  • When the word with the highest total attention score belongs to the first extended dictionary, the word is copied from the first extended dictionary as the output word of the current decoding instant; when the word with the highest total attention score belongs to the second extended dictionary, the word is copied from the second extended dictionary as the output word of the current decoding instant.
  • Step 409 Repeat the foregoing steps to obtain output words at each decoding instant, and connect the output words at each decoding instant in sequence to obtain an output text sequence.
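Steps 408 and 409 can be sketched as a greedy loop over per-instant total attention scores. The scores are supplied here as a precomputed list, which is a simplification: an actual decoder would compute the scores of each instant from the output of the previous instant. The end token `<eos>` and all function names are assumptions.

```python
def pick_output_word(total_scores, preset_dictionary):
    """Return the highest-scoring word and the dictionary it comes from."""
    word = max(total_scores, key=total_scores.get)
    source = "preset" if word in preset_dictionary else "extended"
    return word, source


def decode_greedy(score_steps, preset_dictionary, end_token="<eos>"):
    """Pick one word per decoding instant and connect the words in sequence."""
    words = []
    for total_scores in score_steps:
        word, _ = pick_output_word(total_scores, preset_dictionary)
        if word == end_token:
            break
        words.append(word)
    return " ".join(words)


# Toy total attention scores for four decoding instants.
preset_dictionary = {"the", "won", "<eos>"}
score_steps = [
    {"the": 0.6, "Raptors": 0.3, "<eos>": 0.1},
    {"Raptors": 0.7, "won": 0.2, "<eos>": 0.1},
    {"won": 0.8, "<eos>": 0.2},
    {"<eos>": 0.9, "won": 0.1},
]

reply = decode_greedy(score_steps, preset_dictionary)
second_word = pick_output_word(score_steps[1], preset_dictionary)
```

At the second instant the selected word is not in the preset dictionary, which corresponds to the copy case described in step 408.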
  • Step 410 Transmit the output dialog sequence to the dialog client.
  • the dialog server transmits the output dialog sequence in text form to the dialog client, or transmits the output dialog sequence in speech form to the dialog client after converting the output dialog sequence in text form into the output dialog sequence in speech form.
  • an encoder is called to encode the input dialog sequence to obtain an input encoding vector; the encoder is called to encode the associated information to obtain an associated encoding vector; and a decoder is called to decode the input encoding vector and the associated encoding vector to obtain an output dialog sequence.
  • Because the output dialog sequence is dynamically generated based on the preset dictionary and the extended dictionary, the output dialog sequence includes an out-of-vocabulary word belonging to the associated information.
  • When the associated information includes factual information and/or opinion information related to the input dialog sequence, the dialog system can automatically generate an answer with factual information and/or opinion information, thereby achieving a good dialog effect.
  • an attention mechanism is used to determine an attention score of each word in an extended dictionary, and a dynamic weighting method is used to comprehensively calculate a total attention score of each word.
  • When a total attention score of a word belonging to the extended dictionary is high, the word can be copied to the output dialog sequence.
  • an attention probability distribution p 1 of each word in the factual information is calculated according to the fact encoding vector
  • an attention probability distribution p 2 of each word in the opinion information is calculated according to the opinion encoding vector.
  • the first extended dictionary and the second extended dictionary include out-of-vocabulary words.
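One way to picture the attention probability distribution over the words of the factual (or opinion) information is sketched below. The source does not specify the scoring function, so dot-product attention followed by a softmax is an assumption here, as are the names `word_attention_distribution`, `encodings`, and `decoder_state`. Attention mass for repeated tokens is summed so that each distinct word in the extended dictionary gets one probability.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_attention_distribution(
    tokens: Sequence[str],
    encodings: Sequence[Sequence[float]],
    decoder_state: Sequence[float],
) -> Dict[str, float]:
    """Dot-product attention over token encodings, summed per distinct word."""
    scores = [sum(h * s for h, s in zip(enc, decoder_state)) for enc in encodings]
    weights = softmax(scores)
    dist: Dict[str, float] = defaultdict(float)
    for token, weight in zip(tokens, weights):
        dist[token] += weight  # repeated tokens pool their attention mass
    return dict(dist)
```

The resulting dictionary sums to 1 and is keyed by the words of the extended dictionary, including any out-of-vocabulary words.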
  • a probability distribution p 3 in a default dictionary may further be determined based on the input dialog sequence of the user.
  • a weighted summation is performed on the three probability distributions to obtain a final probability distribution. Therefore, at each decoding instant t, the word with the highest total attention score is outputted as the output word of the current decoding instant.
  • the output words at each decoding instant are sequentially connected to obtain the output text sequence. If the output word at the current decoding instant is an out-of-vocabulary word belonging to H or F, the out-of-vocabulary word is copied to the output dialog sequence, to generate a reply sentence with facts and/or an opinion.
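The weighted summation of the three distributions can be sketched as follows. This is an illustration under assumptions: the distributions are plain dictionaries, the weights are given as fixed numbers rather than computed by the network, and the function names are hypothetical.

```python
from typing import Dict


def mix_distributions(
    p_fact: Dict[str, float],     # p1: words of the factual information
    p_opinion: Dict[str, float],  # p2: words of the opinion information
    p_vocab: Dict[str, float],    # p3: words of the default dictionary
    w_fact: float,
    w_opinion: float,
    w_vocab: float,
) -> Dict[str, float]:
    """Weighted sum of three word distributions over the union of their words."""
    total: Dict[str, float] = {}
    for dist, weight in ((p_fact, w_fact), (p_opinion, w_opinion), (p_vocab, w_vocab)):
        for word, prob in dist.items():
            total[word] = total.get(word, 0.0) + weight * prob
    return total


def pick_output_word(total: Dict[str, float]) -> str:
    """Return the word with the highest total score."""
    return max(total, key=total.get)


# Toy values chosen for illustration only.
p1 = {"Shenzhen": 0.8, "is": 0.2}
p2 = {"great": 0.6, "is": 0.4}
p3 = {"is": 0.5, "the": 0.5}
final = mix_distributions(p1, p2, p3, w_fact=0.5, w_opinion=0.2, w_vocab=0.3)
```

With these toy numbers the word "Shenzhen", which appears only in the factual information, receives the highest total score and would be copied into the reply.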
  • NIST the machine translation evaluation indicator proposed by Doddington in 2002
  • BLEU Papineni et al., 2002
  • Meteor
  • DIV-1, DIV-2 also known as distinct-1 and distinct-2
  • Entropy 1-4 Zhang et al., 2018
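The diversity indicators can be computed directly from tokenized responses. The sketch below uses the commonly cited definitions, which are assumptions here because the source only names the metrics: distinct-n is the number of unique n-grams divided by the total number of n-grams, and Entropy-n is the Shannon entropy (natural logarithm) of the n-gram frequency distribution.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(responses: Sequence[Sequence[str]], n: int) -> Counter:
    """Pooled n-gram counts over all responses."""
    return Counter(g for response in responses for g in ngrams(response, n))


def distinct_n(responses: Sequence[Sequence[str]], n: int) -> float:
    """Unique n-grams divided by total n-grams (DIV-n)."""
    counts = ngram_counts(responses, n)
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


def entropy_n(responses: Sequence[Sequence[str]], n: int) -> float:
    """Shannon entropy of the n-gram frequency distribution."""
    counts = ngram_counts(responses, n)
    total = sum(counts.values())
    if not total:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


responses = [["i", "like", "tea"], ["i", "like", "coffee"]]
```

For the two toy responses, 4 of the 6 unigrams and 3 of the 4 bigrams are unique.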
  • the system has achieved the best result on main indicators of NIST-4, BLEU-4, and Meteor.
  • the use of K-means beam search can effectively improve the performance of almost all major algorithms and all diversity indicators.
  • our system produces a longer response than the seq2seq baseline.
  • the system using K-means beam search has a longer response time.
  • human response time is longer than the response time of our system, and the response time of Team G which is generated by using 22 tokens on average is even longer.
  • In terms of the ability to output out-of-vocabulary (OOV) words not covered by the first 100k vocabulary, our system generates 97 and 57 unique OOV words in the submitted test responses by using K-means beam search and traditional beam search, respectively. Compared with traditional beam search, K-means beam search can copy more OOV words.
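Counting unique OOV words in a set of responses is a simple set operation. The sketch below assumes whitespace-tokenized responses and a vocabulary held as a Python set; the function name and the toy data are hypothetical and unrelated to the reported figures.

```python
from typing import Iterable, Sequence, Set


def unique_oov_words(responses: Iterable[Sequence[str]], vocabulary: Set[str]) -> Set[str]:
    """Distinct tokens that appear in the responses but not in the vocabulary."""
    return {token for response in responses for token in response if token not in vocabulary}


# Toy data for illustration only.
vocabulary = {"the", "team", "plays", "in", "is", "great"}
responses = [
    ["the", "team", "plays", "in", "Shenzhen"],
    ["Shenzhen", "is", "great"],
    ["the", "Tencent", "team"],
]
oov = unique_oov_words(responses, vocabulary)
```

Here `oov` contains "Shenzhen" and "Tencent"; a word repeated across responses is counted once.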
  • OOV out-of-vocabulary
  • FIG. 6 is a block diagram of a dialog generation apparatus according to one embodiment of this application.
  • the dialog generation apparatus can be implemented as all or a part of the human-machine dialog system through software, hardware or a combination of the two.
  • the apparatus includes: an interface module 620, an obtaining module 640, an encoding module 660, and a decoding module 680.
  • the interface module 620 is configured to obtain an input dialog sequence from a dialog client.
  • the obtaining module 640 is configured to obtain associated information related to the input dialog sequence.
  • the encoding module 660 is configured to encode the input dialog sequence to obtain an input encoding vector.
  • the encoding module 660 is further configured to encode the associated information to obtain an associated encoding vector.
  • the decoding module 680 is configured to decode the input encoding vector and the associated encoding vector to obtain an output dialog sequence, the output dialog sequence including an out-of-vocabulary word belonging to the associated information.
  • the interface module 620 is configured to transmit the output dialog sequence to the dialog client.
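The division of labor among the four modules can be sketched as a small pipeline. This is a structural illustration only: the class name, the callable signatures, and the toy stubs are assumptions, and real encoders and decoders would be neural networks rather than lambdas.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DialogGenerationApparatus:
    obtain: Callable[[str], List[str]]                       # obtaining module 640
    encode: Callable[[str], List[float]]                     # encoding module 660
    decode: Callable[[List[float], List[List[float]]], str]  # decoding module 680

    def handle(self, input_dialog: str) -> str:
        """Plays the role of interface module 620: receive input, return output."""
        associated = self.obtain(input_dialog)                    # associated information
        input_vector = self.encode(input_dialog)                  # input encoding vector
        associated_vectors = [self.encode(a) for a in associated]  # associated encoding vectors
        return self.decode(input_vector, associated_vectors)


# Toy stubs so the pipeline can be exercised end to end.
apparatus = DialogGenerationApparatus(
    obtain=lambda text: [text.upper()],
    encode=lambda text: [float(len(text))],
    decode=lambda vec, assoc: f"len={vec[0]:.0f}, facts={len(assoc)}",
)
reply = apparatus.handle("hello")
```

Keeping the modules as injected callables mirrors the statement that the apparatus may be implemented in software, hardware, or a combination of the two.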
  • the decoding module 680 includes a copy generation network.
  • the decoding module 680 is configured to input the input encoding vector, the associated encoding vector, and decoding information of a previous instant to the copy generation network for decoding, to obtain the output dialog sequence.
  • the decoding module 680 is configured to determine, at a current decoding instant of the copy generation network, an input attention score of each word in a preset dictionary according to the input encoding vector; determine, at the current decoding instant of the copy generation network, an associated attention score of each word in an extended dictionary according to the associated encoding vector; determine a weight according to the decoding information of the previous instant, and perform a weighted summation on the input attention score and the associated attention score according to the weight, to obtain a total attention score of each word; and determine the word with the highest total attention score as an output word of the current decoding instant.
  • the extended dictionary is a dictionary constructed based on words in the associated information.
  • the associated information includes factual information and/or opinion information
  • the associated encoding vector includes: a fact encoding vector and an opinion encoding vector.
  • the decoding module 680 is further configured to determine a fact attention score of each word in a first extended dictionary according to the fact encoding vector.
  • the decoding module 680 is further configured to determine an opinion attention score of each word in a second extended dictionary according to the opinion encoding vector.
  • the first extended dictionary is a dictionary constructed based on words in the factual information and the second extended dictionary is a dictionary constructed based on words in the opinion information.
  • the decoding module 680 is configured to determine, according to the decoding information of the previous instant, a first weight corresponding to the input attention score, a second weight corresponding to the fact attention score, and a third weight corresponding to the opinion attention score; and add a product of the input attention score and the first weight, a product of the fact attention score and the second weight, and a product of the opinion attention score and the third weight, to obtain the total attention score of each word.
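The source states only that the three weights are determined from the decoding information of the previous instant. One plausible realization, shown here strictly as an assumption, is a linear projection of the previous decoder state followed by a softmax, so that the weights are positive and sum to 1. The names `gate_weights`, `gate_matrix`, and `total_attention_score` are hypothetical.

```python
import math
from typing import List, Sequence


def gate_weights(prev_state: Sequence[float], gate_matrix: Sequence[Sequence[float]]) -> List[float]:
    """Softmax over one logit per source (input, fact, opinion); an assumed gating form."""
    logits = [sum(w * s for w, s in zip(row, prev_state)) for row in gate_matrix]
    m = max(logits)
    exps = [math.exp(logit - m) for logit in logits]
    z = sum(exps)
    return [e / z for e in exps]


def total_attention_score(
    input_score: float,
    fact_score: float,
    opinion_score: float,
    weights: Sequence[float],
) -> float:
    """Sum of each score multiplied by its weight, as in the three-weight combination."""
    w1, w2, w3 = weights
    return w1 * input_score + w2 * fact_score + w3 * opinion_score


# Example: a previous state that favors the first (input) source.
weights = gate_weights([1.0, 0.0], [[2.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
```

Because the weights are recomputed at every decoding instant, the balance between generating from the preset dictionary and copying from the extended dictionaries can shift from word to word.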
  • the decoding module 680 is configured to extract the word with the highest total attention score from the preset dictionary as the output word of the current decoding instant when the word with the highest total attention score belongs to the preset dictionary; and copy the word with the highest total attention score from the extended dictionary as the output word of the current decoding instant when the word with the highest total attention score belongs to the extended dictionary.
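The choice between extracting a word from the preset dictionary and copying it from the extended dictionary can be sketched as below. The function name and the returned labels are hypothetical, and checking the preset dictionary first when a word is in both dictionaries is an assumption the source does not address.

```python
from typing import Dict, Set, Tuple


def select_output_word(
    total_scores: Dict[str, float],
    preset_dictionary: Set[str],
    extended_dictionary: Set[str],
) -> Tuple[str, str]:
    """Return the top-scoring word and whether it was generated or copied."""
    word = max(total_scores, key=total_scores.get)
    if word in preset_dictionary:
        return word, "generated"  # extracted from the preset dictionary
    if word in extended_dictionary:
        return word, "copied"     # copied from the extended dictionary
    raise ValueError(f"word {word!r} is in neither dictionary")


result = select_output_word(
    {"the": 0.2, "Shenzhen": 0.5, "is": 0.3},
    preset_dictionary={"the", "is"},
    extended_dictionary={"Shenzhen"},
)
```

In this toy case the top-scoring word is out of vocabulary for the preset dictionary, so it is marked as copied.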
  • FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of this application.
  • the computer device may be a dialog server, and the dialog server is configured to execute the dialog generation method provided in the foregoing embodiments.
  • the computer device 700 includes a central processing unit (CPU) 701, a system memory 704 including a random access memory (RAM) 702 and a read-only memory (ROM) 703, and a system bus 705 connecting the system memory 704 and the CPU 701.
  • the computer device 700 further includes a basic input/output (I/O) system 706 assisting in transmitting information between components in the computer, and a mass storage device 707 configured to store an operating system 713, an application program 714, and another program module 715.
  • I/O basic input/output
  • the basic I/O system 706 includes a display 708 configured to display information, and an input device 709 configured to allow a user to enter information, for example, a mouse or a keyboard.
  • the display 708 and the input device 709 are both connected to the central processing unit 701 by using the system bus 705 connected to an input/output controller 710 .
  • the basic I/O system 706 may further include the I/O controller 710, to receive and process input from multiple other devices such as a keyboard, a mouse, and an electronic stylus.
  • the I/O controller 710 further provides an output to a display, a printer or another type of output device.
  • the mass storage device 707 is connected to the CPU 701 by using a mass storage controller (not shown) connected to the system bus 705 .
  • the mass storage device 707 and a computer-readable medium associated with the mass storage device provide non-volatile storage to the computer device 700. That is, the mass storage device 707 may include the computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • the computer-readable medium may include a computer storage medium and a communication medium.
  • the computer storage medium includes volatile and non-volatile, removable and non-removable media that store information such as computer-readable instructions, data structures, program modules, or other data and that are implemented by using any method or technology.
  • the computer storage medium includes a RAM, a ROM, an EPROM, an EEPROM, a flash memory or other solid storage technologies; a CD-ROM, a DVD or other optical storages; and a cassette, a magnetic tape, a disk storage or other magnetic storage devices.
  • the system memory 704 and the mass storage device 707 may be collectively referred to as a memory.
  • the computer device 700 may further operate while connected, through a network such as the Internet, to a remote computer on the network. That is, the computer device 700 may be connected to a network 712 by using a network interface unit 711 connected to the system bus 705, or may be connected to another type of network or a remote computer system (not shown) by using the network interface unit 711.
  • the memory further includes one or more computer-readable instructions.
  • the one or more computer-readable instructions are stored in the memory and configured to be executed by one or more processors.
  • the one or more computer-readable instructions are configured to implement the dialog generation method.
  • This application further provides a computer-readable storage medium, the storage medium storing at least one computer-readable instruction, at least one program, a code set, or a computer-readable instruction set, the at least one computer-readable instruction, the at least one program, the code set, or the computer-readable instruction set being loaded and executed by a processor to implement the dialog generation method according to the foregoing method embodiments.
  • this application further provides a computer program product including computer readable instructions.
  • the product when running on an electronic device, causes the electronic device to execute the dialog generation method according to the foregoing method embodiments.
  • unit and other similar terms such as subunit, module, submodule, etc., in this disclosure may refer to a software unit, a hardware unit, or a combination thereof.
  • a software unit e.g., computer program
  • a hardware unit may be implemented using processing circuitry and/or memory.
  • processors or processors and memory
  • a processor or processors and memory
  • each unit can be part of an overall unit that includes the functionalities of the unit.
  • Although the steps are displayed sequentially according to the arrows in the flowcharts of the embodiments, these steps are not necessarily performed in the sequence indicated by the arrows. Unless otherwise explicitly specified in this specification, execution of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in each embodiment may include a plurality of sub-steps or a plurality of stages. The sub-steps or stages are not necessarily performed at the same instant but may be performed at different instants. The sub-steps or stages are not necessarily performed sequentially, but may be performed alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • the program may be stored in a computer-readable storage medium.
  • the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
US17/346,197 2019-01-24 2021-06-11 Dialog generation method and apparatus, device, and storage medium Active 2041-04-21 US12056167B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910069718.9 2019-01-24
CN201910069718.9A CN110147435B (zh) 2019-01-24 2019-01-24 Dialog generation method, apparatus, device, and storage medium
PCT/CN2020/073383 WO2020151689A1 (zh) 2019-01-24 2020-01-21 Dialog generation method, apparatus, device, and storage medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/073383 Continuation WO2020151689A1 (zh) 2019-01-24 2020-01-21 Dialog generation method, apparatus, device, and storage medium

Publications (2)

Publication Number Publication Date
US20210303606A1 (en) 2021-09-30
US12056167B2 US12056167B2 (en) 2024-08-06

Family

ID=67589572

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/346,197 Active 2041-04-21 US12056167B2 (en) 2019-01-24 2021-06-11 Dialog generation method and apparatus, device, and storage medium

Country Status (4)

Country Link
US (1) US12056167B2 (zh)
JP (1) JP7194270B2 (zh)
CN (1) CN110147435B (zh)
WO (1) WO2020151689A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220036011A1 (en) * 2020-07-30 2022-02-03 InfoAuthN AI Inc. Systems and Methods for Explainable Fake News Detection
CN114548092A (zh) * 2022-02-24 2022-05-27 广州华多网络科技有限公司 Customer service session scheduling method and apparatus, device, medium, and product thereof
US20220207244A1 (en) * 2020-12-30 2022-06-30 Yandex Europe Ag Method and server for training a machine learning algorithm for executing translation

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
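CN110147435B (zh) * 2019-01-24 2023-08-22 腾讯科技(深圳)有限公司 Dialog generation method, apparatus, device, and storage medium
CN110728356B (zh) * 2019-09-17 2023-08-04 创新先进技术有限公司 Dialog method, system, and electronic device based on a recurrent neural network
CN110851575B (zh) * 2019-09-23 2022-09-16 深思考人工智能科技(上海)有限公司 Dialog generation system and dialog implementation method
CN110990697A (zh) * 2019-11-28 2020-04-10 腾讯科技(深圳)有限公司 Content recommendation method, apparatus, device, and storage medium
CN111428015B (zh) * 2020-03-20 2023-03-14 腾讯科技(深圳)有限公司 Information generation method, apparatus, device, and storage medium
CN115169549B (zh) 2022-06-24 2023-08-22 北京百度网讯科技有限公司 Artificial intelligence model updating method, apparatus, electronic device, and storage medium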

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8645479B2 (en) * 2007-06-28 2014-02-04 Tencent Technology (Shenzhen) Company Limited Chatting system, method and apparatus for virtual pet
US20140316764A1 (en) * 2013-04-19 2014-10-23 Sri International Clarifying natural language input using targeted questions
US9368114B2 (en) * 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US20180137109A1 (en) * 2016-11-11 2018-05-17 The Charles Stark Draper Laboratory, Inc. Methodology for automatic multilingual speech recognition
US20180217979A1 (en) * 2016-02-18 2018-08-02 Tencent Technology (Shenzhen) Company Limited Text information processing method and apparatus
US20180233143A1 (en) * 2017-02-13 2018-08-16 Kabushiki Kaisha Toshiba Dialogue system, a dialogue method and a method of adapting a dialogue system
US20180285348A1 (en) * 2016-07-19 2018-10-04 Tencent Technology (Shenzhen) Company Limited Dialog generation method, apparatus, and device, and storage medium
US20180300400A1 (en) * 2017-04-14 2018-10-18 Salesforce.Com, Inc. Deep Reinforced Model for Abstractive Summarization
US10152970B1 (en) * 2018-02-08 2018-12-11 Capital One Services, Llc Adversarial learning and generation of dialogue responses

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007328283A (ja) * 2006-06-09 2007-12-20 Kenwood Corp Dialog device, program, and dialog method
WO2014130132A2 (en) * 2012-12-06 2014-08-28 Raytheon Bbn Technologies Corp. Active error detection and resolution for linguistic translation
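KR20180001889A (ko) * 2016-06-28 2018-01-05 삼성전자주식회사 Language processing method and apparatus
CN106257441B (zh) * 2016-06-30 2019-03-15 电子科技大学 Training method for a word-frequency-based skip language model
JP6333329B2 (ja) * 2016-09-15 2018-05-30 ヤフー株式会社 Information processing device, information processing method, and program
CN107967262B (zh) * 2017-11-02 2018-10-30 内蒙古工业大学 Neural network Mongolian-Chinese machine translation method
CN108038105B (zh) * 2017-12-22 2020-06-05 中科鼎富(北京)科技发展有限公司 Method and apparatus for generating simulated word vectors for out-of-vocabulary words
JP6884722B2 (ja) * 2018-03-16 2021-06-09 ヤフー株式会社 Information processing device, information processing method, and program
CN108829670A (zh) * 2018-06-01 2018-11-16 北京玄科技有限公司 Single-semantics-based out-of-vocabulary word processing method, intelligent question answering method, and apparatus
CN109063174B (zh) * 2018-08-21 2022-06-07 腾讯科技(深圳)有限公司 Method and apparatus for generating query answers, computer storage medium, and electronic device
CN110147435B (zh) * 2019-01-24 2023-08-22 腾讯科技(深圳)有限公司 Dialog generation method, apparatus, device, and storage medium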

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220036011A1 (en) * 2020-07-30 2022-02-03 InfoAuthN AI Inc. Systems and Methods for Explainable Fake News Detection
US20220207244A1 (en) * 2020-12-30 2022-06-30 Yandex Europe Ag Method and server for training a machine learning algorithm for executing translation
US11989528B2 (en) * 2020-12-30 2024-05-21 Direct Cursus Technology L.L.C Method and server for training a machine learning algorithm for executing translation
CN114548092A (zh) * 2022-02-24 2022-05-27 广州华多网络科技有限公司 Customer service session scheduling method and apparatus, device, medium, and product thereof

Also Published As

Publication number Publication date
CN110147435A (zh) 2019-08-20
JP7194270B2 (ja) 2022-12-21
CN110147435B (zh) 2023-08-22
US12056167B2 (en) 2024-08-06
WO2020151689A1 (zh) 2020-07-30
JP2022503838A (ja) 2022-01-12

Similar Documents

Publication Publication Date Title
US12056167B2 (en) Dialog generation method and apparatus, device, and storage medium
Amin et al. Will affective computing emerge from foundation models and general artificial intelligence? A first evaluation of ChatGPT
US10061769B2 (en) Machine translation method for performing translation between languages
Bala et al. Chat-bot for college management system using AI
Lowe et al. On the evaluation of dialogue systems with next utterance classification
CN109582956B (zh) Text representation method and apparatus applied to sentence embedding
US8204751B1 (en) Relevance recognition for a human machine dialog system contextual question answering based on a normalization of the length of the user input
Naous et al. Empathy-driven Arabic conversational chatbot
RU2692427C1 (ru) Interest determination system, interest determination method, and storage medium
US20210117458A1 (en) Response selecting apparatus, response selecting method, and response selecting program
CN111382573A (zh) Method, apparatus, device, and storage medium for answer quality assessment
US20200265327A1 (en) Selecting answer spans from electronic documents using neural networks
CN112307168A (zh) Artificial-intelligence-based medical inquiry session processing method, apparatus, and computer device
US11995523B2 (en) Systems and methods for determining training parameters for dialog generation
US11790894B2 (en) Machine learning based models for automatic conversations in online systems
Qu et al. Weakly-supervised open-retrieval conversational question answering
Huang et al. Personalized dialogue generation with persona-adaptive attention
CN117494815A (zh) Archive-oriented trusted large language model training and inference method and apparatus
Zalake et al. Generative chat bot implementation using deep recurrent neural networks and natural language understanding
CN109918484B (zh) Dialog generation method and apparatus
US11941365B2 (en) Response selecting apparatus, model learning apparatus, response selecting method, model learning method, and program
Wang et al. Knowledge grounded pre-trained model for dialogue response generation
Hirano et al. Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training
Young et al. Top-down versus bottom-up analyses of interlanguage data: A reply to Saito
Chen et al. Llama-lora neural prompt engineering: A deep tuning framework for automatically generating chinese text logical reasoning thinking chains

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TAN, YIZHANG;DING, JIACHEN;MIAO, CHANGYU;SIGNING DATES FROM 20210412 TO 20210413;REEL/FRAME:056522/0838

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

ZAAA Notice of allowance and fees due

Free format text: ORIGINAL CODE: NOA

ZAAB Notice of allowance mailed

Free format text: ORIGINAL CODE: MN/=.

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE