CN113239157A - Method, device, equipment and storage medium for training conversation model - Google Patents

Method, device, equipment and storage medium for training conversation model

Info

Publication number
CN113239157A
Authority
CN
China
Prior art keywords
knowledge
probability
reply
model
dialogue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110348055.1A
Other languages
Chinese (zh)
Other versions
CN113239157B (en)
Inventor
黄信娴
鲍思琪
何煌
王凡
吴华
何径舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110348055.1A priority Critical patent/CN113239157B/en
Publication of CN113239157A publication Critical patent/CN113239157A/en
Application granted granted Critical
Publication of CN113239157B publication Critical patent/CN113239157B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present disclosure discloses a training method, an apparatus, a device and a storage medium for a dialogue model, relating to the field of computer technology and in particular to natural language processing, human-machine dialogue, and the like. The dialogue model comprises a knowledge selection model and a reply generation model, and the training method comprises the following steps: processing a dialogue sample and a knowledge base with the knowledge selection model to determine knowledge matched with the dialogue sample and a first probability, wherein the first probability is the probability that the knowledge is selected; processing the dialogue sample and the knowledge with the reply generation model to determine a second probability corresponding to a predicted reply, wherein the second probability is the probability that the predicted reply is the reply sample; and determining a loss function based on the first probability and the second probability, and training the knowledge selection model and the reply generation model based on the loss function. The method can introduce relevant knowledge into a dialogue system, and the training approach is highly general.

Description

Method, device, equipment and storage medium for training conversation model
Technical Field
The present disclosure relates to the field of computer technology, in particular to natural language processing, human-computer dialogue, and the like, and more particularly to a method, an apparatus, a device, and a storage medium for training a dialogue model.
Background
To improve the relevance of the replies generated by a dialogue system, knowledge is typically introduced into the system. For a dialogue system that introduces knowledge, the dialogue model employed by the system includes a knowledge selection model.
In the related art, knowledge can be labeled manually and the knowledge selection model trained in a supervised manner; alternatively, unsupervised training is performed with Bag of Words (BoW) and KL divergence (Kullback-Leibler divergence) as optimization targets.
Disclosure of Invention
The disclosure provides a training method, a device, equipment and a storage medium of a dialogue model.
According to an aspect of the present disclosure, there is provided a method of training a dialogue model, the dialogue model including a knowledge selection model and a reply generation model, the method including: processing a conversation sample and a knowledge base by adopting the knowledge selection model to determine knowledge matched with the conversation sample and determine a first probability corresponding to the knowledge, wherein the first probability is the probability that the knowledge is selected; processing the dialog sample and the knowledge by adopting the reply generation model to determine a second probability corresponding to a predicted reply, wherein the second probability is the probability that the predicted reply is a reply sample; determining a loss function based on the first probability and the second probability, and training the knowledge selection model and the reply generation model based on the loss function.
According to another aspect of the present disclosure, there is provided a training apparatus of a dialogue model including a knowledge selection model and a reply generation model, the apparatus including: a knowledge selection module, configured to process a dialogue sample and a knowledge base by using the knowledge selection model to determine knowledge matched with the dialogue sample, and determine a first probability corresponding to the knowledge, where the first probability is a probability that the knowledge is selected; a reply generation module, configured to process the dialog sample and the knowledge by using the reply generation model to determine a second probability corresponding to a predicted reply, where the second probability is a probability that the predicted reply is a reply sample; a training module to determine a loss function based on the first probability and the second probability, and train the knowledge selection model and the reply generation model based on the loss function.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the above aspects.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any of the above aspects.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of the above aspects.
According to the technical solution of the present disclosure, relevant knowledge can be introduced into a dialogue system, and the training approach is highly general.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 5 is a schematic diagram according to a fifth embodiment of the present disclosure;
FIG. 6 is a schematic diagram according to a sixth embodiment of the present disclosure;
FIG. 7 is a schematic diagram of an electronic device for implementing the training method of a dialogue model according to any embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In a dialogue system, it is often desirable to introduce knowledge related to the dialogue, so that the system can discuss that knowledge with the user in its replies, improving the relevance, informativeness, and interestingness of the replies.
Introducing knowledge into a dialogue system means selecting reasonable, relevant knowledge from a knowledge base according to the dialogue information and combining it with the system's reply generation. At present there are two main schemes: a supervised training scheme and an unsupervised training scheme based on posterior probability. The former labels knowledge and requires the knowledge selection model, during training, to output the knowledge labeled as correct as far as possible. The latter needs no manual labeling: it mainly exploits the posterior information contained in the current reply, takes the dialogue text as prior information, and generates a prior probability and a posterior probability of knowledge selection; a Bag of Words (BoW) objective is generally used to establish the relation between the knowledge selected according to the posterior probability and the posterior information (the reply), improving the accuracy of the posterior probability, after which the prior and posterior distributions are drawn together by means such as KL divergence.
However, both schemes have certain problems. The former is costly: owing to manpower and time constraints, manual labeling cannot exhaust all reasonable knowledge, so knowledge that relies solely on manual labeling is one-sided. The latter is difficult to train and often requires considerable experience and skill to converge well; for example, the training time cost is high, and the training approach is not general enough and reproduces poorly, because certain word-frequency characteristics of the data set need to be cleaned, the BoW model needs to be pre-trained with the posterior information on the data set, and so on.
To alleviate at least one of the above problems to some extent, embodiments of the present disclosure provide a training method for a dialogue model. The training method is an unsupervised training method, so it avoids the heavy manual annotation required by the supervised training method.
Fig. 1 is a schematic diagram according to a first embodiment of the present disclosure, which provides a training method of a dialogue model including a knowledge selection model and a reply generation model. As shown in fig. 1, the method includes:
101. and processing the dialogue sample and the knowledge base by adopting the knowledge selection model to determine knowledge matched with the dialogue sample and determine a first probability corresponding to the knowledge, wherein the first probability is the probability of the knowledge being selected.
102. And processing the conversation sample and the knowledge by adopting the reply generation model to determine a second probability corresponding to the predicted reply, wherein the second probability is the probability that the predicted reply is selected as the reply sample.
103. Determining a loss function based on the first probability and the second probability, and training the knowledge selection model and the reply generation model based on the loss function.
A dialogue process generally proceeds as follows: the dialogue system obtains dialogue information (context), processes it with the dialogue model, and generates a reply (response). The dialogue information, which may also be called the context or the preceding text, refers to the information generated during the dialogue, for example the query currently input by the user; since a dialogue generally spans multiple turns, the dialogue information may also include previously occurring dialogue content.
As shown in fig. 2, the dialogue model may include a knowledge selection model 201 and a reply generation model 202. During a dialogue, the input of the knowledge selection model 201 includes the dialogue information (c) and a knowledge base used to store knowledge, and its output is the single (top-1) piece of knowledge (k) that best matches the input dialogue information; the input of the reply generation model 202 is this knowledge together with the dialogue information, and its output is the reply (r).
Knowledge refers to information that is valuable for generating replies. It may be stored in a knowledge base and may cover a variety of domains, such as weather, entertainment, intelligent customer service, and traffic navigation. In the entertainment domain, for example, knowledge may include information about the actors, director, and ratings of a certain movie.
To distinguish them from the dialogue information (context) and replies (responses) of the dialogue process, the corresponding items in the training phase are referred to as the dialogue sample (context sample) and the reply sample (response sample); in addition, a reply generated from the dialogue sample and the knowledge during training may be referred to as a predicted reply.
In the training phase, one input of the knowledge selection model is the dialogue sample and the other is a pre-configured knowledge base containing multiple pieces of knowledge. Through the processing of the knowledge selection model, the top-k pieces of knowledge matched with the dialogue sample can be output, that is, k pieces of knowledge are selected in descending order of matching value, where k is a configurable value. The knowledge selection model may include an encoding model (encoder) with trainable parameters, for example the encoder of a Transformer model, which encodes the inputs (the dialogue sample and the knowledge) into corresponding vectors so that knowledge can be selected based on these vectors. In addition, a first probability corresponding to each selected knowledge may be calculated from its matching value, for example as a normalized value obtained by normalizing the matching value.
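The following is a minimal sketch, for illustration only, of how such a dual-encoder selection step could be realized; it assumes PyTorch-style encoders that map a text to a fixed-size vector, and all names (select_knowledge, context_encoder, knowledge_encoder) are hypothetical rather than taken from the patent:

```python
import torch
import torch.nn.functional as F

def select_knowledge(context_encoder, knowledge_encoder, dialogue, knowledge_base, top_k=4):
    """Encode the dialogue sample and every knowledge entry, score each entry by
    inner product, keep the top-k entries, and turn their matching values into
    selection probabilities (the first probability p(k_i | c))."""
    rep_c = context_encoder(dialogue)                                    # (dim,)
    rep_k = torch.stack([knowledge_encoder(k) for k in knowledge_base])  # (N, dim)
    scores = rep_k @ rep_c                                               # inner products, shape (N,)
    top_scores, top_idx = scores.topk(top_k)                             # k best matches
    first_prob = F.softmax(top_scores, dim=-1)                           # normalized matching values
    return top_idx, first_prob
```

Here the normalization is taken over the k retained scores; the patent leaves open whether the softmax runs over the selected knowledge only or over the whole knowledge base.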
After the k pieces of knowledge are obtained, the dialogue sample and the k pieces of knowledge may be used as input of the reply generation model, which outputs probabilities corresponding to the predicted reply, including the probability that the predicted reply is the reply sample; this probability may be referred to as the second probability. It should be noted that in the training stage, subsequent processing is performed directly on the second probability, which does not need to be mapped to a concrete reply; in the dialogue process, i.e., in the application stage, the text with the highest probability may be selected as the reply after the reply generation model has produced the probabilities of the candidate replies. The reply generation model may be a deep neural network model with trainable parameters, such as a Transformer model.
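As one plausible realization (an assumption, not the patent's exact implementation), the second probability can be computed under teacher forcing by summing the per-token log-probabilities that the model assigns to the reply sample; reply_log_prob and generation_model are hypothetical names:

```python
import torch.nn.functional as F

def reply_log_prob(generation_model, dialogue, knowledge, reply_ids):
    """Log of the second probability, log p(r | c, k): the likelihood the reply
    generation model assigns to the reply sample, summed over its tokens.
    Assumes the model returns, for each reply position, logits over the vocabulary."""
    logits = generation_model(dialogue, knowledge, reply_ids)             # (T, vocab_size)
    log_probs = F.log_softmax(logits, dim=-1)
    token_log_probs = log_probs.gather(-1, reply_ids.unsqueeze(-1)).squeeze(-1)
    return token_log_probs.sum()
```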
When the first probability and the second probability are obtained, a loss function may be determined based on the first probability and the second probability. The loss function may be a marginal loss function, formulated as:

$$\mathcal{L} = -\log \sum_{i=1}^{k} p(k_i \mid c)\, p(r \mid c, k_i)$$

wherein p(k_i | c) is the first probability, p(r | c, k_i) is the second probability, r is the reply sample, c is the dialogue sample, and k_i is the i-th knowledge.
After the loss function is obtained, the dialogue model may be trained based on it. Since the dialogue model includes the knowledge selection model and the reply generation model, the two models may be trained jointly: the parameters of both models are adjusted until an optimization goal determined based on the loss function is reached, rather than defining a separate loss function for the knowledge selection model and training it on its own.
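A compact sketch of this joint objective, computed in log space for numerical stability, could look as follows; it reuses the hypothetical helpers above, and marginal_loss is likewise an assumed name:

```python
import torch

def marginal_loss(log_p_k, log_p_r):
    """Negative log of sum_i p(k_i | c) * p(r | c, k_i).
    log_p_k: (top_k,) log first probabilities from the knowledge selection model.
    log_p_r: (top_k,) log second probabilities from the reply generation model."""
    return -torch.logsumexp(log_p_k + log_p_r, dim=-1)
```

Because both probabilities enter the same scalar loss, a single backward pass propagates gradients into the knowledge selection model and the reply generation model at the same time.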
In this embodiment, knowledge is introduced into the dialogue system through the processing of the knowledge selection model. When knowledge is introduced, the loss function is determined based on the first probability, produced by the knowledge selection model, and the second probability, produced by the reply generation model, so that the two models can be trained jointly. This avoids the heavy manual annotation required when the knowledge selection model is trained alone; moreover, because the second probability is tied to the reply, the reply itself serves as the optimization target, which makes the training approach more general and more reproducible than approaches that use BoW and KL divergence as optimization targets.
Fig. 3 is a schematic diagram according to a third embodiment of the present disclosure, which provides a training method of a dialogue model including a knowledge selection model and a reply generation model. As shown in fig. 3, the method includes:
301. and constructing a training corpus.
Wherein, the corpus can be collected from historical conversations, and each set of corpus can be expressed as < conversation sample, reply sample >.
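Purely as an illustration of this structure (the placeholder strings below are invented, not data from the patent), the corpus could be represented as a list of pairs:

```python
# Each entry is one <dialogue sample, reply sample> pair collected from historical conversations.
corpus = [
    ("<dialogue sample: the conversation history plus the current user query>",
     "<reply sample: the reply actually given in that conversation>"),
]
```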
302. Encode the dialogue sample into a dialogue vector using the encoding model in the knowledge selection model, and encode each knowledge in the knowledge base into a knowledge vector.
303. Determine the inner product value of the dialogue vector and each knowledge vector.
304. Select a preset number of knowledge entries in descending order of inner product value, and determine them as the knowledge matched with the dialogue sample.
305. Normalize the inner product value corresponding to each selected knowledge to obtain a normalized value, and determine the normalized value as the first probability corresponding to that knowledge.
As shown in fig. 4, which is a schematic structural diagram of the knowledge selection model, the knowledge selection model may include an encoding model 401. The input of the encoding model 401 includes the dialogue sample and each knowledge in the knowledge base, and through its processing these inputs are converted into corresponding vectors, which may be called the dialogue vector (Rep-c) and the knowledge vectors (Rep-k). The inner product value of the dialogue vector and each knowledge vector can then be computed, and a preset number k of knowledge entries are selected, in descending order of inner product value, as the knowledge matched with the dialogue sample.
For the k matched knowledge entries, the first probability corresponding to each one may be the normalized value of its inner product value. For example, assume the i-th knowledge is k_i and the dialogue sample is represented by c; after the inner product value of k_i and c is computed, softmax processing can be performed on it to obtain a value between [0, 1], which is the first probability p(k_i | c) corresponding to the knowledge k_i.
By selecting in descending order of inner product value, the knowledge matching the dialogue sample can be acquired, and the first probability can be obtained simply by normalizing the inner product value.
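Written out in the standard softmax form implied by the description above, the normalization of step 305 is:

$$p(k_i \mid c) = \frac{\exp\big(\langle \text{Rep-}c,\ \text{Rep-}k_i\rangle\big)}{\sum_{j} \exp\big(\langle \text{Rep-}c,\ \text{Rep-}k_j\rangle\big)}$$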
306. Process the dialogue sample and the knowledge using the input layer of the reply generation model to obtain an input vector.
307. Process the input vector using the hidden layer of the reply generation model to obtain a state vector.
308. Process the state vector using the output layer of the reply generation model to determine a second probability corresponding to the predicted reply, where the second probability is the probability that the predicted reply is the reply sample.
As shown in fig. 5, which is a schematic structural diagram of the reply generation model, the reply generation model may include an input layer 501, a hidden layer 502, and an output layer 503. The input layer 501 converts the input text, denoted x, into an input vector; in this embodiment, the input text includes the dialogue sample, the knowledge matched with the dialogue sample, and the reply generated so far. The hidden layer 502 processes the input vector and outputs a state vector, denoted h. The input of the output layer 503 is the state vector, and its output in the training stage is the probability of the predicted reply for each candidate text, including the reply sample; the output therefore includes the probability that the predicted reply is the reply sample, i.e., the second probability, denoted p(r | c, k_i).
Further, the input layer may include a type embedding layer, whose inputs include a dialogue information type identifier, a knowledge type identifier, and a reply type identifier that differ from one another. For example, the reply type identifier (type id) is 0, the dialogue information type identifier is 1, and the knowledge type identifier is 2. It can be understood that the input layer may also include other common layers, such as a position embedding layer and a token embedding layer.
By introducing a type embedding layer and using different type identifiers to mark the dialogue information, the knowledge, and the reply respectively, the model can better distinguish and use the knowledge.
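A minimal sketch of such an input layer, assuming the type ids given above (reply = 0, dialogue information = 1, knowledge = 2) and otherwise hypothetical names, might be:

```python
import torch
import torch.nn as nn

class DialogueInputLayer(nn.Module):
    """Input layer summing token, position and type embeddings."""
    def __init__(self, vocab_size, max_len, hidden_dim):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, hidden_dim)
        self.position_embedding = nn.Embedding(max_len, hidden_dim)
        self.type_embedding = nn.Embedding(3, hidden_dim)   # 0: reply, 1: dialogue info, 2: knowledge

    def forward(self, token_ids, type_ids):
        positions = torch.arange(token_ids.size(-1), device=token_ids.device)
        return (self.token_embedding(token_ids)
                + self.position_embedding(positions)
                + self.type_embedding(type_ids))
```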
The backbone of the hidden layer and the output layer can be a Transformer model; for example, the hidden layer includes the encoder of a Transformer model, and the output layer includes the decoder of a Transformer model. As an example, the hidden layer in fig. 5 includes L Transformer blocks.
Further, the hidden layer includes a self-attention model comprising a first part and a second part: the first part corresponds to the dialogue sample and the knowledge, the second part corresponds to the generated reply, the first part uses a bidirectional self-attention mechanism, and the second part uses a unidirectional self-attention mechanism.
As shown in fig. 5, the self-attention over the parts corresponding to the dialogue sample and the knowledge is bidirectional (indicated by solid lines), while the self-attention over the part corresponding to the reply is unidirectional (indicated by dotted lines).
By using a bidirectional self-attention mechanism for the first part of the self-attention layer, the information in the dialogue sample and the knowledge can be extracted more fully, and using bidirectional attention for one part and unidirectional attention for the other adds flexibility compared with making everything unidirectional or everything bidirectional.
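One conventional way to realize this mixed attention pattern is a single mask over the concatenated input, bidirectional over the dialogue-plus-knowledge prefix and causal over the reply. The sketch below is an assumption about how this could be implemented, not code from the patent:

```python
import torch

def build_attention_mask(prefix_len, reply_len):
    """True means attention is allowed. Positions [0, prefix_len) hold the dialogue
    sample and knowledge (bidirectional); the remaining positions hold the reply
    being generated (unidirectional, i.e. causal)."""
    total = prefix_len + reply_len
    mask = torch.zeros(total, total, dtype=torch.bool)
    mask[:prefix_len, :prefix_len] = True                        # prefix sees the whole prefix
    mask[prefix_len:, :prefix_len] = True                        # reply tokens see the prefix
    mask[prefix_len:, prefix_len:] = torch.tril(                 # reply tokens see earlier reply tokens
        torch.ones(reply_len, reply_len, dtype=torch.bool))
    return mask
```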
Through the input layer, the hidden layer, and the output layer, the second probability related to the reply sample can be determined, so that the reply can be used as the optimization target, making training more stable and more general.
309. Determining a loss function based on the first probability and the second probability.
When the first probability and the second probability are obtained, a loss function may be determined based on the first probability and the second probability. The loss function may be a marginal loss function, formulated as:

$$\mathcal{L} = -\log \sum_{i=1}^{k} p(k_i \mid c)\, p(r \mid c, k_i)$$

wherein p(k_i | c) is the first probability, p(r | c, k_i) is the second probability, r is the reply sample, c is the dialogue sample, and k_i is the i-th knowledge.
310. Jointly train the knowledge selection model and the reply generation model based on the loss function.
The training objective may be to minimize the marginal loss described above. That is, after the loss function is obtained, the parameters of the knowledge selection model and the reply generation model may be adjusted until this training objective is achieved.
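Putting the pieces together, a joint training step might look like the following sketch; it reuses the hypothetical helpers from the earlier sketches and assumes that context_encoder, knowledge_encoder, generation_model, knowledge_base and corpus are already defined:

```python
import torch

params = (list(context_encoder.parameters())
          + list(knowledge_encoder.parameters())
          + list(generation_model.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-4)           # one optimizer over both sub-models

for dialogue, reply_ids in corpus:                      # each <dialogue sample, reply sample> pair
    top_idx, first_prob = select_knowledge(
        context_encoder, knowledge_encoder, dialogue, knowledge_base, top_k=4)
    log_p_k = first_prob.log()                          # log p(k_i | c)
    log_p_r = torch.stack([
        reply_log_prob(generation_model, dialogue, knowledge_base[i], reply_ids)
        for i in top_idx.tolist()])                     # log p(r | c, k_i)
    loss = marginal_loss(log_p_k, log_p_r)
    optimizer.zero_grad()
    loss.backward()                                     # gradients reach both models jointly
    optimizer.step()
```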
Fig. 6 is a schematic diagram according to a sixth embodiment of the present disclosure. As shown in fig. 6, this embodiment provides a training apparatus of a dialogue model. The dialogue model comprises a knowledge selection model and a reply generation model, and the apparatus 600 comprises: a knowledge selection module 601, a reply generation module 602, and a training module 603. The knowledge selection module 601 is configured to process the dialog sample and the knowledge base by using the knowledge selection model to determine knowledge matched with the dialog sample, and determine a first probability corresponding to the knowledge, where the first probability is a probability that the knowledge is selected; the reply generation module 602 is configured to process the dialog sample and the knowledge by using the reply generation model to determine a second probability corresponding to a predicted reply, where the second probability is a probability that the predicted reply is a reply sample; the training module 603 is configured to determine a loss function based on the first probability and the second probability, and train the knowledge selection model and the reply generation model based on the loss function.
In some embodiments, the reply generation model includes an input layer, a hidden layer, and an output layer, and the reply generation module 602 is specifically configured to: process the dialogue sample and the knowledge using the input layer to obtain an input vector; process the input vector using the hidden layer to obtain a state vector; and process the state vector using the output layer to determine the second probability corresponding to the predicted reply.
In some embodiments, the input layer comprises a type embedding layer, and the inputs of the type embedding layer include a dialogue information type identifier, a knowledge type identifier, and a reply type identifier that are different from each other.
In some embodiments, the hidden layer comprises: a self-attention model comprising a first portion and a second portion, the first portion being a portion corresponding to the dialogue sample and knowledge, the second portion being a portion corresponding to the generated reply, the first portion employing a bi-directional self-attention mechanism, the second portion employing a unidirectional self-attention mechanism.
In some embodiments, the knowledge selection model comprises an encoding model, the matched knowledge is determined from the knowledge base, the knowledge base comprises at least one knowledge, and the knowledge selection module 601 is specifically configured to: encode the dialogue sample into a dialogue vector using the encoding model, and encode each knowledge in the knowledge base into a knowledge vector; determine the inner product value of the dialogue vector and each knowledge vector; and select a preset number of knowledge entries in descending order of inner product value, determining them as the knowledge matched with the dialogue sample.
In some embodiments, the knowledge selection module 601 is specifically configured to: normalize the inner product value corresponding to the knowledge to obtain a normalized value, and determine the normalized value as the first probability corresponding to the knowledge.
In this embodiment, knowledge is introduced into the dialogue system through the processing of the knowledge selection model. When knowledge is introduced, the loss function is determined based on the first probability, produced by the knowledge selection model, and the second probability, produced by the reply generation model, so that the two models can be trained jointly. This avoids the heavy manual annotation required when the knowledge selection model is trained alone; moreover, because the second probability is tied to the reply, the reply itself serves as the optimization target, which makes the training approach more general and more reproducible than approaches that use BoW and KL divergence as optimization targets.
It is to be understood that in the disclosed embodiments, the same or similar elements in different embodiments may be referenced.
It is to be understood that "first", "second", and the like in the embodiments of the present disclosure are used for distinction only, and do not indicate the degree of importance, the order of timing, and the like.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the electronic device 700 includes a computing unit 701, which may perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the electronic device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
A number of components in the electronic device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the electronic device 700 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 701 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the methods and processes described above, such as the training method of the dialogue model. In some embodiments, the training method of the dialogue model may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the training method of the dialogue model described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured by any other suitable means (e.g., by means of firmware) to perform the training method of the dialogue model.
Various implementations of the systems and techniques described herein may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the drawbacks of difficult management and weak service scalability in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A method of training a dialogue model, the dialogue model comprising a knowledge selection model and a reply generation model, the method comprising:
processing a conversation sample and a knowledge base by adopting the knowledge selection model to determine knowledge matched with the conversation sample and determine a first probability corresponding to the knowledge, wherein the first probability is the probability that the knowledge is selected;
processing the dialog sample and the knowledge by adopting the reply generation model to determine a second probability corresponding to a predicted reply, wherein the second probability is the probability that the predicted reply is a reply sample;
determining a loss function based on the first probability and the second probability, and training the knowledge selection model and the reply generation model based on the loss function.
2. The method of claim 1, wherein the reply generation model comprises an input layer, a hidden layer, and an output layer, and wherein processing the dialogue sample and the knowledge by using the reply generation model to determine the second probability corresponding to the predicted reply comprises:
processing the dialogue sample and the knowledge information by adopting the input layer to obtain an input vector;
processing the input vector by adopting the hidden layer to obtain a state vector;
processing the state vector with the output layer to determine a second probability corresponding to a prediction reply.
3. The method of claim 2, wherein the input layer comprises: a type embedding layer, inputs of which include a dialogue information type identifier, a knowledge type identifier, and a reply type identifier that are different from each other.
4. The method of claim 2, wherein the hidden layer comprises: a self-attention model comprising a first portion and a second portion, the first portion being a portion corresponding to the dialogue sample and knowledge, the second portion being a portion corresponding to the generated reply, the first portion employing a bi-directional self-attention mechanism, the second portion employing a unidirectional self-attention mechanism.
5. The method of any of claims 1-4, wherein the knowledge selection model comprises a coding model, the matching knowledge is determined from the knowledge base, the knowledge base comprises at least one knowledge, and processing the dialogue sample using the knowledge selection model to determine the knowledge that matches the dialogue sample comprises:
coding the dialogue sample into dialogue vectors by adopting the coding model, and coding each knowledge in the knowledge base into knowledge vectors respectively;
determining an inner product value of the dialogue vector and the knowledge vector;
and selecting a preset number of knowledge according to the sequence of the inner product values from large to small, and determining the knowledge as the knowledge matched with the dialogue sample.
6. The method of claim 5, wherein determining the first probability corresponding to the knowledge comprises:
and normalizing the inner product value corresponding to the knowledge to obtain a normalized value, and determining the normalized value as a first probability corresponding to the knowledge.
7. An apparatus for training a dialogue model, the dialogue model including a knowledge selection model and a reply generation model, the apparatus comprising:
a knowledge selection module, configured to process a dialogue sample and a knowledge base by using the knowledge selection model to determine knowledge matched with the dialogue sample, and determine a first probability corresponding to the knowledge, where the first probability is a probability that the knowledge is selected;
a reply generation module, configured to process the dialog sample and the knowledge by using the reply generation model to determine a second probability corresponding to a predicted reply, where the second probability is a probability that the predicted reply is a reply sample;
a training module to determine a loss function based on the first probability and the second probability, and train the knowledge selection model and the reply generation model based on the loss function.
8. The apparatus of claim 7, wherein the reply generation model comprises an input layer, a hidden layer, and an output layer, and the reply generation module is specifically configured to:
processing the dialogue sample and the knowledge information by adopting the input layer to obtain an input vector;
processing the input vector by adopting the hidden layer to obtain a state vector;
processing the state vector with the output layer to determine a second probability corresponding to a prediction reply.
9. The apparatus of claim 8, wherein the input layer comprises: a type embedding layer, inputs of which include a dialogue information type identifier, a knowledge type identifier, and a reply type identifier that are different from each other.
10. The apparatus of claim 8, wherein the hidden layer comprises: a self-attention model comprising a first portion and a second portion, the first portion being a portion corresponding to the dialogue sample and knowledge, the second portion being a portion corresponding to the generated reply, the first portion employing a bi-directional self-attention mechanism, the second portion employing a unidirectional self-attention mechanism.
11. The apparatus of any of claims 7-10, wherein the knowledge selection model comprises a coding model, the matching knowledge is determined from a knowledge base, the knowledge base comprises at least one knowledge, and the knowledge selection module is specifically configured to:
coding the dialogue sample into dialogue vectors by adopting the coding model, and coding each knowledge in the knowledge base into knowledge vectors respectively;
determining an inner product value of the dialogue vector and the knowledge vector;
and selecting a preset number of knowledge according to the sequence of the inner product values from large to small, and determining the knowledge as the knowledge matched with the dialogue sample.
12. The apparatus of claim 11, wherein the knowledge selection module is specifically configured to:
and normalizing the inner product value corresponding to the knowledge to obtain a normalized value, and determining the normalized value as a first probability corresponding to the knowledge.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
CN202110348055.1A 2021-03-31 2021-03-31 Method, device, equipment and storage medium for training conversation model Active CN113239157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110348055.1A CN113239157B (en) 2021-03-31 2021-03-31 Method, device, equipment and storage medium for training conversation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110348055.1A CN113239157B (en) 2021-03-31 2021-03-31 Method, device, equipment and storage medium for training conversation model

Publications (2)

Publication Number Publication Date
CN113239157A true CN113239157A (en) 2021-08-10
CN113239157B CN113239157B (en) 2022-02-25

Family

ID=77130700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110348055.1A Active CN113239157B (en) 2021-03-31 2021-03-31 Method, device, equipment and storage medium for training conversation model

Country Status (1)

Country Link
CN (1) CN113239157B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130110804A1 (en) * 2011-10-31 2013-05-02 Elwha LLC, a limited liability company of the State of Delaware Context-sensitive query enrichment
CN106997375A (en) * 2017-02-28 2017-08-01 浙江大学 Recommendation method is replied in customer service based on deep learning
CN109933785A (en) * 2019-02-03 2019-06-25 北京百度网讯科技有限公司 Method, apparatus, equipment and medium for entity associated
CN110188182A (en) * 2019-05-31 2019-08-30 中国科学院深圳先进技术研究院 Model training method, dialogue generation method, device, equipment and medium
CN110297887A (en) * 2019-06-26 2019-10-01 山东大学 Service robot personalization conversational system and method based on cloud platform
CN111523328A (en) * 2020-04-13 2020-08-11 中博信息技术研究院有限公司 Intelligent customer service semantic processing method
CN111897941A (en) * 2020-08-14 2020-11-06 腾讯科技(深圳)有限公司 Dialog generation method, network training method, device, storage medium and equipment
CN112541060A (en) * 2020-11-19 2021-03-23 中国科学院深圳先进技术研究院 End-to-end task type dialogue learning framework and method based on confrontation training
CN112559706A (en) * 2020-12-11 2021-03-26 中国科学院深圳先进技术研究院 Training method of dialogue generating model, dialogue method, device and storage medium

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
JUNKI OHMURA等: "Context-Aware Dialog Re-Ranking for Task-Oriented Dialog Systems", 《2018 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT)》 *
LIQIANG NIE等: "Multimodal Dialog System: Generating Responses via Adaptive Decoders", 《MM "19: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA》 *
LIZI LIAO等: "Knowledge-aware Multimodal Dialogue Systems", 《MM "18: PROCEEDINGS OF THE 26TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA》 *
安波等: "融合知识表示的知识库问答系统" *
庄传志等: "基于深度学习的关系抽取研究综述", 《中文信息学报》 *
张衍坤等: "面向社区问答匹配的混合神经网络模型", 《小型微型计算机系统》 *
徐聪: "基于深度学习和强化学习的对话模型研究", 《中国优秀博硕士学位论文全文数据库(博士)信息科技辑》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114416943A (en) * 2021-12-29 2022-04-29 北京百度网讯科技有限公司 Training method and device for dialogue model, electronic equipment and storage medium
CN114819183A (en) * 2022-04-15 2022-07-29 支付宝(杭州)信息技术有限公司 Model gradient confirmation method, device, equipment and medium based on federal learning
CN114610861A (en) * 2022-05-11 2022-06-10 之江实验室 End-to-end dialogue method for integrating knowledge and emotion based on variational self-encoder

Also Published As

Publication number Publication date
CN113239157B (en) 2022-02-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant