CN113436752B - Semi-supervised multi-round medical dialogue reply generation method and system - Google Patents
Semi-supervised multi-round medical dialogue reply generation method and system
- Publication number
- CN113436752B (application CN202110577272.8A)
- Authority
- CN
- China
- Prior art date: 2021-05-26
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G16H80/00 — ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
- G06F16/3329 — Natural language query formulation or dialogue systems
- G06F16/367 — Ontology
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/088 — Non-supervised learning, e.g. competitive learning
- G06N5/04 — Inference or reasoning models
Abstract
The invention belongs to the field of conversational information processing and provides a semi-supervised multi-round medical dialogue reply generation method and system. The method comprises: inputting the patient's question in the first round of dialogue into a semi-supervised medical dialogue model to obtain the reply of the first round of dialogue; and, in the second and subsequent rounds of dialogue, inputting the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question. The semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator: the context encoder encodes the received information and feeds it to the prior state tracker and the prior policy network, the prior state tracker continuously tracks the user's physical state, the prior policy network generates the doctor's action, and the reply generator generates the corresponding reply according to the physical state and the doctor's action.
Description
Technical Field
The invention belongs to the field of conversational information processing, and particularly relates to a semi-supervised multi-round medical conversation reply generation method and system.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Conversational paradigms connect people with information, addressing information needs in open domains as well as professional needs in highly vertical domains. Existing dialogue systems fall into two main categories: task-oriented and open-domain dialogue systems. Task-oriented dialogue systems aim to help people accomplish specific tasks, such as scheduling, booking restaurants or querying the weather. Open-domain dialogue systems mainly chat with people to meet their needs for information and entertainment. Unlike medical question answering, conversations in real medical scenarios usually involve multiple rounds of interaction, because the patient needs to express his or her symptoms, current medication and medical history over the course of the conversation. This characteristic makes explicit state tracking indispensable, as it provides more indicative and interpretable information than a hidden state representation. Given the particularity of medical dialogue, medical reasoning capabilities (e.g., whether to prescribe a drug, which drug to prescribe for a disease, which symptoms to ask about) are also an indispensable feature in medical diagnosis.
Existing medical dialogue methods are built on the task-oriented dialogue paradigm, following the pattern in which the patient expresses symptoms and the dialogue system returns a diagnostic result (i.e., determines what disease the patient suffers from). They achieve good results. However, these methods focus on diagnosis in a single domain, cannot meet the varied requirements of patients in practical applications, and require a large number of manually annotated states and actions. When the dialogue data is highly confidential or very large, such annotation is infeasible; these works are limited by the size of the training data, and some cannot even generate replies with a generative method, composing them only from templates. Some task-oriented dialogue methods can be applied to state tracking in medical dialogue, but they still cannot cope with insufficient annotation data. To alleviate the data-annotation requirement of task-oriented dialogue systems, Jin, Zhang and others use semi-supervised learning for state tracking, but they neglect the reasoning ability of the dialogue agent, i.e., they do not model the doctor's actions. Liang et al. propose a method for training specific modules of a task-oriented dialogue system with incompletely labeled data, but it cannot infer the missing labels at training time, resulting in limited improvement for medical dialogue systems that lack both state and action labels. The inventors have found that none of these methods can retrieve from large-scale medical knowledge, none can generate knowledge-rich replies, and all perform poorly in medical conversations, which place strong demands on reasoning.
Disclosure of Invention
In order to solve the technical problems in the background art, the invention provides a semi-supervised multi-round medical dialogue reply generation method and system, which simultaneously consider the patient's state and the doctor's actions, so that the dialogue system has the ability to model the user's physical state and to perform medical reasoning.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a first aspect of the present invention provides a semi-supervised multi-round medical session reply generation method.
A semi-supervised multi-round medical session reply generation method, comprising:
inputting the patient's question in the first round of dialogue into a semi-supervised medical dialogue model to obtain the reply of the first round of dialogue;

in the second and subsequent rounds of dialogue, inputting the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question;

wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator; the context encoder is used for encoding the received information and feeding it to the prior state tracker and the prior policy network, the prior state tracker is used for continuously tracking the user's physical state, the prior policy network is used for generating the doctor's corresponding actions, and the reply generator is used for generating the corresponding reply according to the physical state and the doctor's actions;

the inference state tracker is used for inferring the user's physical state, and the inference policy network is used for inferring the doctor's actions; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model.
A second aspect of the invention provides a semi-supervised multi-round medical session reply generation system.
A semi-supervised multi-round medical session reply generation system, comprising:
the first-round dialogue reply generation module is used for inputting the patient's question in the first round of dialogue into the semi-supervised medical dialogue model to obtain the reply of the first round of dialogue;

the second-and-subsequent-round dialogue reply generation module is used for inputting, in the second and subsequent rounds of dialogue, the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question;

wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator; the context encoder is used for encoding the received information and feeding it to the prior state tracker and the prior policy network, the prior state tracker is used for continuously tracking the user's physical state, the prior policy network is used for generating the doctor's corresponding actions, and the reply generator is used for generating the corresponding reply according to the physical state and the doctor's actions;

the inference state tracker is used for inferring the user's physical state, and the inference policy network is used for inferring the doctor's actions; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model.
A third aspect of the present invention provides a computer-readable storage medium.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps in a semi-supervised multi-round medical session reply generation method as described above.
A fourth aspect of the invention provides a computer device.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in a semi-supervised multi-round medical session reply generation method as described above when the program is executed.
Compared with the prior art, the invention has the beneficial effects that:
(1) In the second and subsequent rounds of dialogue, the invention inputs the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question. It explicitly models the user's physical state and the doctor's actions, expressing both as text spans, which improves the model's ability to model the patient's physiological state and to perform medical reasoning.
(2) At the model level, the invention treats the user's physical state and the doctor's actions as hidden variables and provides training methods for the model both when intermediate labels exist (i.e., supervised) and when they do not (i.e., unsupervised). This greatly reduces the dialogue model's dependence on annotated data.
(3) During policy network learning, the invention uses the tracked patient state to retrieve from a large-scale medical knowledge graph; the explicit states, actions and reasoning paths in the medical knowledge graph improve the interpretability of the responses generated by the dialogue system.
(4) For model training, the invention provides a two-stage stacked reasoning training method, which improves stability when there is little supervised training data.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1(a) shows training with supervised data according to an embodiment of the present invention;
FIG. 1(b) shows training with unsupervised data according to an embodiment of the present invention;
FIG. 1(c) shows the modules used in the testing phase according to an embodiment of the present invention;
FIG. 2 is a diagram of a method for implementing a medical dialogue system according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a model during training of an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Term interpretation:
an Encoder-Decoder (Encoder-Decoder) is a neural network structure, which is used to encode a word sequence, then decode it, and transform it into another word sequence, and is mainly used for machine translation and dialogue system.
Coding (encoding): the word sequence is represented as a continuous vector.
Decoding (decoding): a continuous vector is represented as a target sequence.
Expectation: the sum, over all possible outcomes, of each outcome multiplied by its probability; denoted by E in the present invention.
KL Divergence: an asymmetric measure of the difference between two probability distributions. In the present invention it is written in the form KL(q || p) and calculated as:

KL(q || p) = Σ_i q(i) · log( q(i) / p(i) ),

where q and p are two discrete distributions, and q(i) and p(i) denote the i-th probability value of q and p respectively.
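As an illustration only (not part of the original disclosure), this formula can be evaluated directly for two discrete distributions given as probability lists; the function name and the example values below are assumptions:

    import math

    def kl_divergence(q, p, eps=1e-12):
        # KL(q || p) = sum_i q(i) * log(q(i) / p(i)) for two discrete distributions.
        # eps guards against log(0) and division by zero; it is an implementation choice.
        return sum(qi * math.log((qi + eps) / (pi + eps)) for qi, pi in zip(q, p))

    # Example: two distributions over three outcomes.
    q = [0.7, 0.2, 0.1]
    p = [0.5, 0.3, 0.2]
    print(kl_divergence(q, p))   # a small positive value; KL is 0 only when q == p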
Hidden variable (latent variable): a random variable that cannot be observed statistically, as opposed to an observed variable.
Training phase (Train): the training phase of the neural network model receives training data as input and continuously adjusts parameters in the neural network model through the training samples.
Test phase (Test): after the neural network model is trained, information such as labels corresponding to input data is output through the trained bible network model parameters in a test stage. Hereafter we will also refer to as deployment phase.
Example 1
The embodiment provides a semi-supervised multi-round medical dialogue reply generation method, which comprises the following steps:
inputting the patient's question in the first round of dialogue into a semi-supervised medical dialogue model to obtain the reply of the first round of dialogue;

in the second and subsequent rounds of dialogue, inputting the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question;

wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator; the context encoder is used for encoding the received information and feeding it to the prior state tracker and the prior policy network, the prior state tracker is used for continuously tracking the user's physical state, the prior policy network is used for generating the doctor's corresponding actions, and the reply generator is used for generating the corresponding reply according to the physical state and the doctor's actions;

the inference state tracker is used for inferring the user's physical state, and the inference policy network is used for inferring the doctor's actions; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model.
The context encoder is used for encoding the received information: in the first round of dialogue, the patient's question is encoded directly; in the second and subsequent rounds, the patient's question and the corresponding reply of the previous round are encoded together to form the context information, which is fed into the five modules: the prior state tracker, the inference state tracker, the prior policy network, the inference policy network and the reply generator.
The input signal of the prior state tracker is: a state instance S_{t-1} sampled from the probability distribution q(S_{t-1}) output by the inference state tracker in the previous round of dialogue; its output is the prior probability distribution p(S_t).

The input signals of the inference state tracker are: the state instance S_{t-1} sampled from the probability distribution q(S_{t-1}) output by the inference state tracker in the previous round of dialogue, and the doctor's reply R_t of the current round; its output is the probability distribution q(S_t).

The input signals of the prior policy network are: a state instance S_t sampled from the probability distribution q(S_t) output by the inference state tracker in the current round of dialogue, and the external medical knowledge graph G; its output is the prior probability distribution p(A_t).

The input signals of the inference policy network are: the state instance S_t sampled from the probability distribution q(S_t) output by the inference state tracker in the current round of dialogue, and the doctor's reply R_t of the current round; its output is the probability distribution q(A_t).

The input of the reply generator is divided into two cases, the training phase and the testing phase (i.e., deployment): in the training phase it receives the state S_t and the action A_t sampled from the inference distributions q(S_t) and q(A_t) as input; in the testing phase it receives the state S_t and the action A_t sampled from the prior distributions p(S_t) and p(A_t) as input; it outputs the dialogue reply R_t.
In the actual deployment phase, given the patient's utterance in each dialogue round, the medical dialogue system continuously tracks the user's physical state with the prior state tracker, generates the doctor's corresponding actions with the prior policy network, and finally generates the corresponding reply by combining the states and actions sampled from the prior state tracker and the prior policy network, corresponding to the process of FIG. 1(c). The session continues until the patient has no new question to input, i.e., the patient actively ends the current session.
The medical dialogue system has two key features: the patient's state (symptoms, medication, etc.) and the doctor's actions (treatment, diagnosis, etc.). These two features make the medical dialogue system more complex than other knowledge-intensive dialogue scenarios. Similar to a task-oriented dialogue system, the medical dialogue generation process is split into three phases:
(1) Patient state tracking: for a given dialogue history, the dialogue system tracks the physical state (state) of the patient;
(2) Physician policy learning: given the patient status and the dialogue history, the dialogue system gives the current physician's actions (actions);
(3) Medical reply generation: given the dialog history, the tracked states and the predicted actions, a fluent and accurate natural language reply is given.
For the scenario with labeled data, at the t-th round of dialogue the patient asks a question or describes his or her own symptoms U_t; the medical dialogue system receives the reply R_{t-1} of the previous round, the current-round question U_t and the state S_{t-1} tracked in the previous round, and then outputs the state S_t of the current round; it then uses R_{t-1}, U_t and S_t to output the action A_t to be taken by the doctor in the current round; finally, it generates a reply R_t in natural-language form and feeds it back to the patient. In medical dialogue systems, however, there are many cases in which the patient's physiological state and the doctor's actions are not annotated. We therefore regard both states and actions as hidden variables and let the state run through the entire dialogue process, represented by a sequence of words; the same holds for the doctor's actions, i.e., the doctor's action may comprise multiple keywords. In actual operation, the lengths of the state and the action are set to the fixed lengths |S| and |A| respectively, and the state has the initial value "<pad> <pad> ... <pad>", where "<pad>" denotes a filler word. The details of the state and action design are as follows:
state design: state is used to record information of the physical state of the user acquired by the dialog system throughout the course of the dialog, which is expressed using one sequence of words, for example, "cold fever cough night sweat.+ -. And which is initialized to" < pad > < pad > ".
Design of the action: the action is used to represent a summary of the doctor's reply, and likewise uses a word-sequence representation, such as "999 common cold granules, acute bronchitis syrup".
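As an illustrative sketch (not part of the original disclosure), a state or action text span can be kept at the fixed lengths |S| and |A| with the "<pad>" filler word as follows; the helper name and the example span length of 8 are assumptions:

    def to_fixed_span(words, span_len, pad_token="<pad>"):
        # Truncate or right-pad a list of words to the fixed span length.
        words = words[:span_len]
        return words + [pad_token] * (span_len - len(words))

    # Initial state before the first round: all filler words.
    initial_state = to_fixed_span([], 8)   # ['<pad>', '<pad>', ..., '<pad>']
    # After some rounds the tracker may have filled in symptom words.
    state = to_fixed_span(["cold", "fever", "cough", "night", "sweats"], 8)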
The semi-supervised medical dialogue model includes six modules: a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator (response generator). An entire medical session often involves multiple interactions, and the following process is repeated over multiple rounds until the session ends.
The prior state tracker and the inference state tracker are used for patient state tracking, the inference state tracker being executed only during the training phase; the prior policy network and the inference policy network are used for the doctor's policy learning, the inference policy network being executed only during the training phase; and the reply generator is used for medical reply generation. From the unsupervised point of view, i.e., using the unsupervised data D_u, the input and output of each module are described below with reference to FIG. 1(b).
At round t, the context encoder is a GRU-based encoder (an LSTM, Transformer or BERT encoder may also be used) that receives the reply R_{t-1} of the previous round and the patient's question U_t of the current round as input, and outputs a continuous-space vector, denoted c_t below, to represent the dialogue context.

Concretely, given R_{t-1} and U_t as input, the context encoder first embeds the words of both sequences and encodes them with a bidirectional GRU encoder to obtain the word-level representations H_t = {h_{t,1}, h_{t,2}, ..., h_{t,M+N}}, where M and N are the sequence lengths of R_{t-1} and U_t respectively. The bidirectional GRU encoder is initialized with the context representation c_{t-1} of the previous round, and the context vector c_t is obtained from H_t by the attention operation Attn [17].
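A minimal sketch of such a context encoder in PyTorch is given below; the layer sizes, the single-query attention form and the class interface are illustrative assumptions rather than the exact configuration of the invention:

    import torch
    import torch.nn as nn

    class ContextEncoder(nn.Module):
        # Bidirectional GRU over [R_{t-1}; U_t] followed by attention pooling
        # that produces the context vector c_t.
        def __init__(self, vocab_size, emb_dim=128, hid_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.bigru = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
            self.attn_query = nn.Linear(2 * hid_dim, 1)

        def forward(self, token_ids, h0=None):
            # token_ids: (batch, M+N) word ids of the previous reply and the current question.
            H, _ = self.bigru(self.embed(token_ids), h0)        # (batch, M+N, 2*hid_dim)
            scores = torch.softmax(self.attn_query(H), dim=1)    # attention weights over positions
            c_t = (scores * H).sum(dim=1)                        # (batch, 2*hid_dim) context vector
            return H, c_t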
The prior state tracker receives the context encoder output c_t and the state S_{t-1} of the previous round as input, and then uses a GRU-based decoder to output a sequence of words, namely the state S_t. The inference state tracker adopts a structure similar to that of the prior state tracker, but additionally accepts the reply R_t of the current round as input and likewise outputs a word sequence S_t. The probability distributions generated by the prior state tracker and the inference state tracker are abbreviated p(S_t) and q(S_t) respectively.

Both the prior state tracker and the inference state tracker are encoder-decoder structures. In the unsupervised case the states of all dialogue rounds are unknown, and the state of each round depends on the state of the previous round as input, so S_{t-1} is sampled from q(S_{t-1}) and fed into both the prior state tracker and the inference state tracker.

The prior state tracker first encodes the sampled S_{t-1} with a GRU encoder, and a linear transformation (whose matrix is a training parameter) of this encoding concatenated with c_t is used to initialize the decoder of the prior state tracker. At the i-th decoding step the decoder produces a hidden state, and the distribution of S_t is obtained by decoding the sequence:

p(S_t | S_{t-1}, R_{t-1}, U_t) = Π_{i=1}^{|S|} p(S_{t,i} | S_{t,<i}, S_{t-1}, R_{t-1}, U_t),

where each factor is a softmax over an MLP (Multi-Layer Perceptron) applied to the decoder hidden state at step i, and |S| is the length of the state text span.

The inference state tracker is similar in structure to the prior state tracker: it also encodes S_{t-1} with a GRU encoder, additionally encodes R_t, and a linear transformation (again a training parameter) of these encodings concatenated with c_t initializes its decoder. At the i-th decoding step it produces a hidden state, and the approximate posterior distribution of S_t is obtained in the same way:

q(S_t | S_{t-1}, R_{t-1}, U_t, R_t) = Π_{i=1}^{|S|} q(S_{t,i} | S_{t,<i}, S_{t-1}, R_{t-1}, U_t, R_t).
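A minimal sketch of the shared decoder structure of the two state trackers is given below (greedy decoding of a fixed-length span with a softmax over an MLP of the GRU hidden state); the initialization vector is assumed to be prepared by the caller, and all names and sizes are illustrative assumptions:

    import torch
    import torch.nn as nn

    class StateTrackerDecoder(nn.Module):
        # GRU decoder that emits a fixed-length state span S_t token by token.
        def __init__(self, vocab_size, emb_dim=128, hid_dim=256, span_len=8):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.gru_cell = nn.GRUCell(emb_dim, hid_dim)
            self.mlp = nn.Sequential(nn.Linear(hid_dim, hid_dim), nn.Tanh(),
                                     nn.Linear(hid_dim, vocab_size))
            self.span_len = span_len

        def forward(self, init_hidden, bos_id):
            # init_hidden: (batch, hid_dim), a linear projection of the concatenated inputs
            # (c_t and the encodings of S_{t-1}, and of R_t for the inference tracker).
            batch = init_hidden.size(0)
            token = torch.full((batch,), bos_id, dtype=torch.long, device=init_hidden.device)
            hidden, logits = init_hidden, []
            for _ in range(self.span_len):              # |S| decoding steps
                hidden = self.gru_cell(self.embed(token), hidden)
                step_logits = self.mlp(hidden)          # distribution over the vocabulary
                logits.append(step_logits)
                token = step_logits.argmax(dim=-1)      # greedy choice of the next state word
            return torch.stack(logits, dim=1)           # (batch, |S|, vocab)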
The prior policy network receives the output of the context encoder, the state S_t of the current round and the external medical knowledge graph G as input, and then outputs a sequence of words, namely the action A_t, using a GRU-based decoder. The inference policy network is similar in structure: it receives the context encoder output and S_t, additionally receives the reply R_t of the current round as input, and then outputs a word sequence A_t. The probability distributions generated by the prior policy network and the inference policy network are abbreviated p(A_t) and q(A_t) respectively.

The prior policy network and the inference policy network are also encoder-decoder structures. Both networks take as input a state instance S_t sampled from q(S_t).

Before introducing the two policy networks, we first introduce a knowledge-graph retrieval operation, qsub, and a knowledge-graph encoding operation, RGAT [15]. qsub uses the tracked state to retrieve a subgraph G_n from the medical knowledge graph G: it extracts all nodes and edges reachable within n hops starting from the entities in the state, and connects all nodes appearing in the state so that G_n is fully connected. RGAT is a graph encoding method that takes the types of edges into account and, after several rounds of propagation, obtains the embedding representation of each node, i.e., a vector representation in a continuous space. We use g_j to denote the encoded representation of the j-th node of G_n, where |G_n| is the number of nodes in G_n.
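A minimal sketch of the n-hop part of the qsub retrieval operation over a dictionary-based graph is given below; the dict-of-adjacency-lists data layout is an assumption, and the extra step of linking all state entities so that G_n stays connected is omitted:

    from collections import deque

    def qsub(graph, state_entities, n_hops=2):
        # graph: dict mapping a node to a list of (relation, neighbor) pairs.
        # Returns the nodes and typed edges reachable within n_hops of any entity
        # appearing in the tracked state.
        frontier = deque((e, 0) for e in state_entities if e in graph)
        visited = {e for e, _ in frontier}
        edges = []
        while frontier:
            node, depth = frontier.popleft()
            if depth == n_hops:
                continue
            for relation, neighbor in graph.get(node, []):
                edges.append((node, relation, neighbor))
                if neighbor not in visited:
                    visited.add(neighbor)
                    frontier.append((neighbor, depth + 1))
        return visited, edges  # nodes and edges of the subgraph G_n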
The prior policy network uses a GRU encoder to encode the sampled S_t, and its decoder is initialized with a linear transformation (whose matrix is a training parameter) of this encoding, the context vector c_t and the RGAT-encoded representation of G_n. At the i-th decoding step the decoder produces a hidden state. The decoding process comprises two parts: one part generates a word from the vocabulary, and the other part copies a node from the retrieved knowledge subgraph G_n. Here e_j denotes the j-th node of G_n, g_j denotes the embedding representation of the j-th node, Z_A is the normalization term of the generate-and-copy distribution, and I(e_j, A_{t,i}) = 1 in the case that e_j = A_{t,i}, otherwise I(e_j, A_{t,i}) = 0. The prior distribution of A_t can then be expressed as:

p(A_t | S_t, R_{t-1}, U_t, G) = Π_{i=1}^{|A|} [ P_gen(A_{t,i}) + Σ_j I(e_j, A_{t,i}) · P_copy(e_j) ],

where P_gen(A_{t,i}) is the probability of generating the word A_{t,i} from the vocabulary (a softmax over an MLP of the decoder hidden state), P_copy(e_j) is the probability of copying node e_j, computed from the match between the decoder hidden state and g_j and normalized by Z_A, and |A| is the length of the action.

The inference policy network uses a GRU encoder to encode S_t and additionally encodes R_t; its decoder is initialized with a linear transformation (again a training parameter) of these encodings and c_t, and produces a hidden state at each decoding step. To strengthen the effect of R_t on the result, the approximate posterior distribution of A_t considers only the direct generation probability:

q(A_t | S_t, R_{t-1}, U_t, R_t) = Π_{i=1}^{|A|} P_gen(A_{t,i}).
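A minimal sketch of the generate-and-copy step for a single decoding position is given below; the dot-product copy score and the joint softmax normalization (playing the role of Z_A) are standard copy-mechanism assumptions rather than the exact formulation of the invention:

    import torch

    def generate_or_copy_distribution(step_hidden, vocab_logits, node_reprs,
                                      node_token_ids, vocab_size):
        # Mix a vocabulary "generate" distribution with a "copy" distribution
        # over the nodes of the retrieved subgraph G_n.
        # step_hidden: (hid,)  decoder hidden state at this step
        # vocab_logits: (vocab,)  generation scores over the word list
        # node_reprs: (num_nodes, hid)  embedding representations g_j of the nodes
        # node_token_ids: (num_nodes,)  vocabulary id of each node e_j
        copy_scores = node_reprs @ step_hidden              # match decoder state to each node
        all_scores = torch.cat([vocab_logits, copy_scores], dim=0)
        probs = torch.softmax(all_scores, dim=0)            # joint normalization (Z_A role)
        dist = probs[:vocab_size].clone()                   # P_gen part
        # Scatter the copy mass onto the vocabulary ids of the copied nodes,
        # which realizes the indicator I(e_j, A_{t,i}).
        dist.index_add_(0, node_token_ids, probs[vocab_size:])
        return dist                                          # final distribution over the vocabulary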
The reply generator is a GRU-based decoder that receives the context encoder output c_t, the state S_t and the action A_t as input, and then outputs the medical reply R_t. The distribution of the reply generator is abbreviated p(R_t | S_t, A_t, R_{t-1}, U_t).

During the unsupervised training phase the reply generator uses only the outputs of the inference state tracker and the inference policy network: S_t and A_t are sampled from q(S_t) and q(A_t), encoded by GRU encoders, and the resulting encodings, together with c_t, are used to initialize the decoder of the reply generator. At the i-th decoding step the decoder produces an output, and the output probability of R_t is:

p(R_t | S_t, A_t, R_{t-1}, U_t) = Π_{i=1}^{|R|} [ P_gen(R_{t,i}) + P_copy(R_{t,i}) ],

where P_gen denotes the probability of generating the word from the vocabulary, P_copy denotes the probability of copying the word from S_t, A_t, R_{t-1} and U_t, and |R| is the length of the reply.
The training loss functions for supervised and unsupervised data are L_sup and L_un respectively, where L_un takes the form of a negative evidence lower bound:

L_un = E_{S_t ~ q(S_t), A_t ~ q(A_t)} [ -log p(R_t | S_t, A_t, R_{t-1}, U_t) ] + KL( q(S_t) || p(S_t) ) + KL( q(A_t) || p(A_t) ),

where E denotes the expected value and KL(·||·) denotes the KL divergence (Kullback–Leibler divergence).
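As an illustrative sketch under the assumption that per-step logits of the state and action spans are available, L_un can be estimated for one sampled (S_t, A_t) pair as follows; the tensor layout is an assumption:

    import torch

    def unsupervised_loss(recon_logprob, q_state_logits, p_state_logits,
                          q_action_logits, p_action_logits):
        # L_un = -E[log p(R_t | S_t, A_t, R_{t-1}, U_t)]
        #        + KL(q(S_t) || p(S_t)) + KL(q(A_t) || p(A_t)).
        # Logits have shape (span_len, vocab): one row per decoding step.
        q_s = torch.log_softmax(q_state_logits, dim=-1)
        p_s = torch.log_softmax(p_state_logits, dim=-1)
        q_a = torch.log_softmax(q_action_logits, dim=-1)
        p_a = torch.log_softmax(p_action_logits, dim=-1)
        kl_state = (q_s.exp() * (q_s - p_s)).sum(dim=-1).sum()    # sum over the |S| positions
        kl_action = (q_a.exp() * (q_a - p_a)).sum(dim=-1).sum()   # sum over the |A| positions
        return -recon_logprob + kl_state + kl_action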
Training with a small proportion of supervised data is unstable: the prior policy network is vulnerable to erroneous states sampled from the prior state tracker. The invention therefore provides a two-stage stacked reasoning training method, in which L_un is split into two training objectives, L_s and L_a: L_s contains the terms of L_un related to state tracking, and L_a contains the terms related to the doctor's actions. Since the policy network depends on the output of the state tracker, the state-tracking part is optimized first, and the remaining modules are then optimized simultaneously, which improves stability during training.

In the first training stage, L_s is minimized to improve the state-tracking performance of the model; in the second stage, L_s + L_a is minimized to maintain the state-tracking effect while training the model's policy-learning ability. We name this the two-stage stacked reasoning training method.
FIG. 3 is a schematic diagram of a model during training, global_step being an integer for recording the number of training passes.
In a semi-supervised scenario, the dialogue data used for model training has two parts, supervised data and unsupervised data. The training methods for the supervised data D_a and the unsupervised data D_u are described below.
(a) For the supervised data D_a:

Training samples are drawn from D_a to form the mini-batches required for training, yielding the data R_{t-1}, U_t, S_{t-1}, S_t, A_t, R_t. The corresponding input data are fed to the six modules described above, corresponding to FIG. 1(a), and the negative log-likelihood (NLL) loss is used for training. The actual training loss function is

L_sup = -log p(S_t) - log q(S_t) - log p(A_t) - log q(A_t) - log p(R_t | S_t, A_t, R_{t-1}, U_t),

i.e., the sum of the negative log-likelihoods of the annotated state S_t under the prior and inference state trackers, of the annotated action A_t under the prior and inference policy networks, and of the reply R_t under the reply generator.
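A minimal sketch of this supervised loss is given below, assuming each module exposes per-step logits over the vocabulary; treating all five decoders with a plain cross-entropy term is an assumption of the sketch:

    import torch.nn.functional as F

    def supervised_loss(prior_state_logits, infer_state_logits,
                        prior_action_logits, infer_action_logits,
                        reply_logits, state_ids, action_ids, reply_ids):
        # Sum of negative log-likelihood terms: the annotated state span under both
        # state trackers, the annotated action span under both policy networks,
        # and the reply under the reply generator.
        def nll(logits, ids):
            return F.cross_entropy(logits.reshape(-1, logits.size(-1)), ids.reshape(-1))
        return (nll(prior_state_logits, state_ids) + nll(infer_state_logits, state_ids)
                + nll(prior_action_logits, action_ids) + nll(infer_action_logits, action_ids)
                + nll(reply_logits, reply_ids))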
(b) For the unsupervised data D_u:

Training samples are drawn from D_u to form the mini-batches required for training, yielding the data R_{t-1}, U_t, R_t; the intermediate annotations S_{t-1}, S_t and A_t are absent. S_{t-1} is sampled from q(S_{t-1}) and fed into the prior state tracker and the inference state tracker. Then S_t is sampled from q(S_t) and fed as input to the prior policy network and the inference policy network, and A_t is sampled from q(A_t). Finally, S_t and A_t are combined with R_{t-1} and U_t to generate the reply R_t. The above process corresponds to FIG. 1(b). The training loss is L_un (optionally, L_s + L_a may be used as the training loss to improve training stability).
For the whole training dataset D = {D_a, D_u}, the specific training steps are as follows:

Step 1: assume that the proportion of the supervised data D_a in the training data D is α (0 ≤ α ≤ 1). Draw a random number between 0 and 1; if it is smaller than α, go to Step 2, otherwise go to Step 3.

Step 2: train the model with the supervised data in mode (a), with training loss L_sup; update the parameters by gradient descent and go to Step 4.

Step 3: train the model with the unsupervised data in mode (b), with training loss L_un; update the parameters by gradient descent and go to Step 4.

Step 4: judge whether the model has converged; if so, go to Step 5, otherwise go to Step 1.

Step 5: save the model weights and end training, as shown in FIG. 3.
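A minimal sketch of Steps 1-5 is given below; the model methods (supervised_loss, unsupervised_loss, converged, save_weights) and the batch iterators are hypothetical names used only for illustration:

    import random

    def train(model, supervised_batches, unsupervised_batches, alpha,
              optimizer, max_steps=100000):
        # Draw a random number against the supervised proportion alpha, train on a
        # supervised or an unsupervised mini-batch accordingly, and repeat until convergence.
        for step in range(max_steps):
            if random.random() < alpha:
                loss = model.supervised_loss(next(supervised_batches))      # mode (a), L_sup
            else:
                loss = model.unsupervised_loss(next(unsupervised_batches))  # mode (b), L_un
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if model.converged(step):   # Step 4: convergence check
                break
        model.save_weights()            # Step 5: save the model weights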
The semi-supervised medical dialogue model is trained using medical dialogue datasets published by industry and academia: the sampled supervised data and unsupervised data are fed into the model, the corresponding loss function is calculated, and the model parameters are optimized by gradient descent.
After model training is completed, all model parameters are fixed and the inference state tracker and the inference policy network can be discarded. At this point the model can be applied to real dialogue scenarios. As shown in FIG. 2, given the patient's question as input to the model, the context encoder, the prior state tracker, the prior policy network and the reply generator work in sequence (the reply generator now uses only the state S_t output by the prior state tracker and the action A_t output by the prior policy network as input), and finally a reply is generated and returned to the user. The dialogue system interacts with the patient continuously; in each dialogue round the prior state tracker takes the state of the previous round as input and updates the tracked physical state of the patient; if no new question is received from the patient after waiting for a period of time, the current session is ended.
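A minimal sketch of this deployment-phase loop is given below; the method names on the model object and the console-based I/O are illustrative assumptions:

    def run_session(model):
        # Deployment phase: only the context encoder, prior state tracker,
        # prior policy network and reply generator are used.
        state = model.initial_state()            # "<pad> <pad> ... <pad>"
        prev_reply = ""
        while True:
            question = input("Patient: ").strip()
            if not question:                     # no new question -> end the current session
                break
            context = model.encode_context(prev_reply, question)
            state = model.prior_state_tracker(context, state)   # update the tracked physical state
            action = model.prior_policy(context, state)         # doctor action, using the knowledge graph
            prev_reply = model.generate_reply(context, state, action)
            print("Doctor:", prev_reply)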
Example two
A semi-supervised multi-round medical session reply generation system, comprising:
the first-round dialogue reply generation module is used for inputting the patient's question in the first round of dialogue into the semi-supervised medical dialogue model to obtain the reply of the first round of dialogue;

the second-and-subsequent-round dialogue reply generation module is used for inputting, in the second and subsequent rounds of dialogue, the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question;

wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator; the context encoder is used for encoding the received information and feeding it to the prior state tracker and the prior policy network, the prior state tracker is used for continuously tracking the user's physical state, the prior policy network is used for generating the doctor's corresponding actions, and the reply generator is used for generating the corresponding reply according to the physical state and the doctor's actions;

the inference state tracker is used for inferring the user's physical state, and the inference policy network is used for inferring the doctor's actions; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model.
The modules in this embodiment are in one-to-one correspondence with the steps in the first embodiment, and the implementation process is the same, which is not described here again.
Example III
The present embodiment provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps in a semi-supervised multi-round medical session reply generation method as described above.
Example IV
The present embodiment provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the semi-supervised multi-round medical dialogue reply generation method as described above when executing the program.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored on a computer readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (8)
1. A method of generating a semi-supervised multi-round medical session reply, comprising:
inputting the patient's question in the first round of dialogue into a semi-supervised medical dialogue model to obtain the reply of the first round of dialogue;

in the second and subsequent rounds of dialogue, inputting the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question;

wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator; the context encoder is used for encoding the received information and feeding it to the prior state tracker and the prior policy network; the prior state tracker is used for continuously tracking the user's physical state; and the input signals of the prior policy network are: a state instance S_t sampled from the output probability distribution q(S_t) of the inference state tracker in the current round of dialogue, and an external medical knowledge graph G;

the decoding process of the prior policy network comprises two parts, one of which generates a word from the vocabulary and the other of which copies a node from the retrieved knowledge subgraph G_n:

p(A_{t,i}) = P_gen(A_{t,i}) + Σ_j I(e_j, A_{t,i}) · P_copy(e_j),

wherein c_t represents the dialogue context as a continuous-space vector; the prior policy network uses a GRU encoder to encode the sampled state instance S_t, S_t being the state instance sampled from q(S_t); MLP represents a multi-layer perceptron; d_{t,i} represents the output of the prior policy network at the i-th decoding step, and P_gen(·) is a softmax over MLP(d_{t,i}); e_j represents the j-th node of G_n and g_j represents the embedding representation of the j-th node; Z_A is the normalization term of the generate-and-copy distribution; and I(e_j, A_{t,i}) = 1 in the case that e_j = A_{t,i}, otherwise I(e_j, A_{t,i}) = 0;

the prior policy network is used for generating the doctor's actions, and outputs the probability distribution

p(A_t | S_t, R_{t-1}, U_t, G) = Π_{i=1}^{|A|} p(A_{t,i}),

wherein |A| represents the length of the action;

the reply generator is used for generating the corresponding reply according to the physical state and the doctor's actions;

the inference state tracker is used for inferring the user's physical state, and the inference policy network is used for inferring the doctor's actions; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model;

during unsupervised training, S_t and A_t are sampled from q(S_t) and q(A_t), encoded by GRU encoders, and the resulting encodings are used to initialize the decoder of the reply generator; at the i-th decoding step the decoder produces an output, and the output probability of R_t is:

p(R_t | S_t, A_t, R_{t-1}, U_t) = Π_{i=1}^{|R|} [ P_gen(R_{t,i}) + P_copy(R_{t,i}) ],

wherein P_gen represents the probability of generating the word from the vocabulary, P_copy represents the probability of copying the word from S_t, A_t, R_{t-1} and U_t, and |R| is the length of the reply; S_t represents an instance sampled from the probability distribution q(S_t); R_t represents the doctor's reply of the current round; R_{t-1} represents the doctor's reply of the previous round; U_t represents the question of the current round; q(S_t) represents the output probability distribution of the inference state tracker in the current round of dialogue; and q(A_t) represents the output probability distribution of the inference policy network in the current round of dialogue;

according to the two-stage stacked reasoning training method, the training loss function L_un of the unsupervised data is split into two training objectives, L_s and L_a; since the policy network depends on the output of the state tracker, the inference state tracker and the inference policy network are optimized first, and the remaining modules are then optimized simultaneously;

wherein L_un = E_{S_t ~ q(S_t), A_t ~ q(A_t)} [ -log p(R_t | S_t, A_t, R_{t-1}, U_t) ] + KL( q(S_t) || p(S_t) ) + KL( q(A_t) || p(A_t) ), L_s contains the terms of L_un related to state tracking, and L_a contains the terms related to the doctor's actions;

wherein E represents the expected value, and KL(·||·) represents the KL divergence (Kullback–Leibler divergence); A_t represents the action that the doctor should take in the current round; S_t represents the output state of the current round; S_{t-1} represents the state tracked in the previous round; q(S_{t-1}) represents the output probability distribution of the inference state tracker in the previous round of dialogue; q(S_t) represents the output probability distribution of the inference state tracker in the current round of dialogue; p(A_t) represents the prior distribution of A_t; and p(R_t | S_t, A_t, R_{t-1}, U_t) represents the reply generator;

in the first stage, L_s is minimized to improve the state-tracking performance of the model; in the second stage, L_s + L_a is minimized to maintain the state-tracking effect and the policy-learning ability of the trained model.
2. The semi-supervised multi-round medical session reply generation method of claim 1, wherein the inference state tracker and the inference policy network are both encoder-decoder structures.
3. The semi-supervised multi-round medical session reply generation method of claim 1, wherein the a priori state tracker and the a priori policy network are both encoder-decoder structures.
4. The semi-supervised, multi-round medical session reply generation method of claim 1, wherein the reply generator is a GRU-based decoder.
5. A semi-supervised multi-round medical session reply generation system, comprising:
the first-round dialogue reply generation module is used for inputting the patient's question in the first round of dialogue into the semi-supervised medical dialogue model to obtain the reply of the first round of dialogue;

the second-and-subsequent-round dialogue reply generation module is used for inputting, in the second and subsequent rounds of dialogue, the patient's question of the current round and the reply of the previous round into the semi-supervised medical dialogue model to obtain the reply of the corresponding round, until the patient enters no new question;

wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network and a reply generator; the context encoder is used for encoding the received information and feeding it to the prior state tracker and the prior policy network; the prior state tracker is used for continuously tracking the user's physical state; and the input signals of the prior policy network are: a state instance S_t sampled from the output probability distribution q(S_t) of the inference state tracker in the current round of dialogue, and an external medical knowledge graph G;

the decoding process of the prior policy network comprises two parts, one of which generates a word from the vocabulary and the other of which copies a node from the retrieved knowledge subgraph G_n:

p(A_{t,i}) = P_gen(A_{t,i}) + Σ_j I(e_j, A_{t,i}) · P_copy(e_j),

wherein c_t represents the dialogue context as a continuous-space vector; the prior policy network uses a GRU encoder to encode the sampled state instance S_t, S_t being the state instance sampled from q(S_t); MLP represents a multi-layer perceptron; d_{t,i} represents the output of the prior policy network at the i-th decoding step, and P_gen(·) is a softmax over MLP(d_{t,i}); e_j represents the j-th node of G_n and g_j represents the embedding representation of the j-th node; Z_A is the normalization term of the generate-and-copy distribution; and I(e_j, A_{t,i}) = 1 in the case that e_j = A_{t,i}, otherwise I(e_j, A_{t,i}) = 0;

the prior policy network is used for generating the doctor's actions, and outputs the probability distribution

p(A_t | S_t, R_{t-1}, U_t, G) = Π_{i=1}^{|A|} p(A_{t,i}),

wherein |A| represents the length of the action;

the reply generator is used for generating the corresponding reply according to the physical state and the doctor's actions;

the inference state tracker is used for inferring the user's physical state, and the inference policy network is used for inferring the doctor's actions; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model;

during unsupervised training, S_t and A_t are sampled from q(S_t) and q(A_t), encoded by GRU encoders, and the resulting encodings are used to initialize the decoder of the reply generator; at the i-th decoding step the decoder produces an output, and the output probability of R_t is:

p(R_t | S_t, A_t, R_{t-1}, U_t) = Π_{i=1}^{|R|} [ P_gen(R_{t,i}) + P_copy(R_{t,i}) ],

wherein P_gen represents the probability of generating the word from the vocabulary, P_copy represents the probability of copying the word from S_t, A_t, R_{t-1} and U_t, and |R| is the length of the reply; S_t represents an instance sampled from the probability distribution q(S_t); R_t represents the doctor's reply of the current round; R_{t-1} represents the doctor's reply of the previous round; U_t represents the question of the current round; q(S_t) represents the output probability distribution of the inference state tracker in the current round of dialogue; and q(A_t) represents the output probability distribution of the inference policy network in the current round of dialogue;

according to the two-stage stacked reasoning training method, the training loss function L_un of the unsupervised data is split into two training objectives, L_s and L_a; since the policy network depends on the output of the state tracker, the inference state tracker and the inference policy network are optimized first, and the remaining modules are then optimized simultaneously;

wherein L_un = E_{S_t ~ q(S_t), A_t ~ q(A_t)} [ -log p(R_t | S_t, A_t, R_{t-1}, U_t) ] + KL( q(S_t) || p(S_t) ) + KL( q(A_t) || p(A_t) ), L_s contains the terms of L_un related to state tracking, and L_a contains the terms related to the doctor's actions;

wherein E represents the expected value, and KL(·||·) represents the KL divergence (Kullback–Leibler divergence); A_t represents the action that the doctor should take in the current round; S_t represents the output state of the current round; S_{t-1} represents the state tracked in the previous round; q(S_{t-1}) represents the output probability distribution of the inference state tracker in the previous round of dialogue; q(S_t) represents the output probability distribution of the inference state tracker in the current round of dialogue; p(A_t) represents the prior distribution of A_t; and p(R_t | S_t, A_t, R_{t-1}, U_t) represents the reply generator;

in the first stage, L_s is minimized to improve the state-tracking performance of the model; in the second stage, L_s + L_a is minimized to maintain the state-tracking effect and the policy-learning ability of the trained model.
6. The semi-supervised multi-round medical session reply generation system of claim 5, wherein the inference state tracker and the inference policy network are both encoder-decoder structures; both the a priori state tracker and the a priori policy network are encoder-decoder structures.
7. A computer readable storage medium having stored thereon a computer program, which when executed by a processor performs the steps in the semi-supervised multi-round medical session reply generation method of any of claims 1-4.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the semi-supervised multi-round medical session reply generation method of any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110577272.8A CN113436752B (en) | 2021-05-26 | 2021-05-26 | Semi-supervised multi-round medical dialogue reply generation method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110577272.8A CN113436752B (en) | 2021-05-26 | 2021-05-26 | Semi-supervised multi-round medical dialogue reply generation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113436752A CN113436752A (en) | 2021-09-24 |
CN113436752B true CN113436752B (en) | 2023-04-28 |
Family
ID=77802906
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110577272.8A Active CN113436752B (en) | 2021-05-26 | 2021-05-26 | Semi-supervised multi-round medical dialogue reply generation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113436752B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111710150A (en) * | 2020-05-14 | 2020-09-25 | 国网江苏省电力有限公司南京供电分公司 | Abnormal electricity consumption data detection method based on countermeasure self-coding network |
CN111797220A (en) * | 2020-07-30 | 2020-10-20 | 腾讯科技(深圳)有限公司 | Dialog generation method and device, computer equipment and storage medium |
CN111897941A (en) * | 2020-08-14 | 2020-11-06 | 腾讯科技(深圳)有限公司 | Dialog generation method, network training method, device, storage medium and equipment |
CN112464645A (en) * | 2020-10-30 | 2021-03-09 | 中国电力科学研究院有限公司 | Semi-supervised learning method, system, equipment, storage medium and semantic analysis method |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110309275B (en) * | 2018-03-15 | 2024-06-14 | 北京京东尚科信息技术有限公司 | Dialog generation method and device |
CN109582767B (en) * | 2018-11-21 | 2024-05-17 | 北京京东尚科信息技术有限公司 | Dialogue system processing method, device, equipment and readable storage medium |
CN109977212B (en) * | 2019-03-28 | 2020-11-24 | 清华大学深圳研究生院 | Reply content generation method of conversation robot and terminal equipment |
CN109992657B (en) * | 2019-04-03 | 2021-03-30 | 浙江大学 | Dialogue type problem generation method based on enhanced dynamic reasoning |
CN109933661B (en) * | 2019-04-03 | 2020-12-18 | 上海乐言信息科技有限公司 | Semi-supervised question-answer pair induction method and system based on deep generation model |
CN110297895B (en) * | 2019-05-24 | 2021-09-17 | 山东大学 | Dialogue method and system based on free text knowledge |
CN110321417B (en) * | 2019-05-30 | 2021-06-11 | 山东大学 | Dialog generation method, system, readable storage medium and computer equipment |
CN111428483B (en) * | 2020-03-31 | 2022-05-24 | 华为技术有限公司 | Voice interaction method and device and terminal equipment |
CN111767383B (en) * | 2020-07-03 | 2022-07-08 | 思必驰科技股份有限公司 | Conversation state tracking method, system and man-machine conversation method |
CN112164476A (en) * | 2020-09-28 | 2021-01-01 | 华南理工大学 | Medical consultation conversation generation method based on multitask and knowledge guidance |
CN112289467B (en) * | 2020-11-17 | 2022-08-02 | 中山大学 | Low-resource scene migratable medical inquiry dialogue system and method |
- 2021-05-26: CN application CN202110577272.8A — patent CN113436752B (en), status: Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111710150A (en) * | 2020-05-14 | 2020-09-25 | 国网江苏省电力有限公司南京供电分公司 | Abnormal electricity consumption data detection method based on countermeasure self-coding network |
CN111797220A (en) * | 2020-07-30 | 2020-10-20 | 腾讯科技(深圳)有限公司 | Dialog generation method and device, computer equipment and storage medium |
CN111897941A (en) * | 2020-08-14 | 2020-11-06 | 腾讯科技(深圳)有限公司 | Dialog generation method, network training method, device, storage medium and equipment |
CN112464645A (en) * | 2020-10-30 | 2021-03-09 | 中国电力科学研究院有限公司 | Semi-supervised learning method, system, equipment, storage medium and semantic analysis method |
Also Published As
Publication number | Publication date |
---|---|
CN113436752A (en) | 2021-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110321417B (en) | Dialog generation method, system, readable storage medium and computer equipment | |
US11494647B2 (en) | Slot filling with contextual information | |
US11734519B2 (en) | Systems and methods for slot relation extraction for machine learning task-oriented dialogue systems | |
CN109858044B (en) | Language processing method and device, and training method and device of language processing system | |
KR102215286B1 (en) | Method and apparatus for providing sentence generation chatbot service based on document understanding | |
Chi et al. | Speaker role contextual modeling for language understanding and dialogue policy learning | |
CN114528898A (en) | Scene graph modification based on natural language commands | |
Gulyaev et al. | Goal-oriented multi-task bert-based dialogue state tracker | |
JP2019079088A (en) | Learning device, program parameter and learning method | |
US20240037335A1 (en) | Methods, systems, and media for bi-modal generation of natural languages and neural architectures | |
JP2020027609A (en) | Response inference method and apparatus | |
Khan et al. | Timestamp-supervised action segmentation with graph convolutional networks | |
CN111723194B (en) | Digest generation method, device and equipment | |
CN112463935B (en) | Open domain dialogue generation method and system with generalized knowledge selection | |
CN116863920B (en) | Voice recognition method, device, equipment and medium based on double-flow self-supervision network | |
CN113436752B (en) | Semi-supervised multi-round medical dialogue reply generation method and system | |
JP2019021218A (en) | Learning device, program parameter, learning method and model | |
CN116911306A (en) | Natural language understanding method and device, server and storage medium | |
CN115470327A (en) | Medical question-answering method based on knowledge graph and related equipment | |
US11238236B2 (en) | Summarization of group chat threads | |
Khatri et al. | SkillBot: Towards Data Augmentation using Transformer language model and linguistic evaluation | |
JP2019079087A (en) | Learning device, program parameter and learning method | |
Kreyssig | Deep learning for user simulation in a dialogue system | |
Poosarala et al. | Survey of transfer learning and a case study of emotion recognition using inductive approach | |
CN114638238A (en) | Training method and device of neural network model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||