CN113436752B - Semi-supervised multi-round medical dialogue reply generation method and system - Google Patents
Semi-supervised multi-round medical dialogue reply generation method and system
- Publication number
- CN113436752B CN202110577272.8A
- Authority
- CN
- China
- Prior art keywords
- dialogue
- round
- state
- inference
- semi
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H80/00—ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Abstract
Description
Technical Field
The present invention belongs to the field of conversational information processing, and in particular relates to a semi-supervised multi-round medical dialogue response generation method and system.
Background Art
The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.
The conversational paradigm is used to connect people with information, serving both open-domain information needs and the professional needs of highly specialized vertical domains. Existing dialogue systems fall into two categories: task-oriented and open-domain. Task-oriented dialogue systems help people complete specific tasks such as scheduling, booking a restaurant, or checking the weather. Open-domain dialogue systems mainly chat with people to satisfy their needs for information and entertainment. Unlike medical question answering, dialogues in real medical scenarios are more likely to span multiple rounds of interaction, because patients rely on the dialogue context to convey their symptoms, the medications they are taking, and their medical history. This makes explicit state tracking indispensable, since it provides more indicative and interpretable information than hidden state representations. Given the particular nature of medical dialogue, medical reasoning ability (for example, whether to prescribe medication, which medication to prescribe, and which symptoms to ask about) is likewise indispensable in medical diagnosis.
Existing medical dialogue methods are built on the task-oriented dialogue paradigm, in which the patient describes symptoms and the dialogue system returns a diagnosis (that is, determines which disease the patient has). These methods have achieved good results, but they focus only on diagnosis and cannot meet the varied needs of patients in practice. They also require large numbers of manually annotated states and actions, which is infeasible when the dialogue data is highly confidential or very large. Constrained by the scale of the training data, these works cannot even use generative methods and must assemble replies from templates. Some task-oriented dialogue methods can be applied to state tracking in medical dialogue, but they still cannot handle settings without sufficient annotated data. To reduce the annotation requirements of task-oriented dialogue systems, Jin et al. and Zhang et al. both used semi-supervised learning for state tracking, but they ignored the reasoning ability of the dialogue agent; that is, they did not model the physician's actions. Liang et al. proposed a method that uses incompletely annotated data to train specific modules of a task-oriented dialogue system, but it cannot infer the missing labels at training time, so its gains are limited in medical dialogue systems where both state and action annotations are absent.
The inventors found that none of these methods considers retrieval from large-scale medical knowledge, so they fail to generate knowledge-rich replies and perform poorly in medical dialogue, a setting that places strong demands on reasoning ability.
Summary of the Invention
To solve the technical problems described in the background above, the present invention provides a semi-supervised multi-round medical dialogue response generation method and system that considers both the patient's state and the physician's actions, so that the dialogue system can model the user's physical state and perform medical reasoning at the same time.
To achieve the above object, the present invention adopts the following technical solutions:
A first aspect of the present invention provides a semi-supervised multi-round medical dialogue response generation method.
A semi-supervised multi-round medical dialogue response generation method, comprising:
inputting the patient's question in the first round of dialogue into a semi-supervised medical dialogue model to obtain the response for the first round;
in the second and subsequent rounds of dialogue, inputting the patient's question in the current round and the response from the previous round into the semi-supervised medical dialogue model to obtain the response for the corresponding round, until the patient inputs no new question;
wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network, and a response generator. The context encoder encodes the received information and feeds it to the prior state tracker and the prior policy network; the prior state tracker continuously tracks the user's physical state; the prior policy network generates the corresponding physician action; and the response generator generates the corresponding response from the physical state and the physician action;
the inference state tracker infers the user's physical state, and the inference policy network infers the physician action; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model.
A second aspect of the present invention provides a semi-supervised multi-round medical dialogue response generation system.
A semi-supervised multi-round medical dialogue response generation system, comprising:
a first-round dialogue response generation module, configured to input the patient's question in the first round of dialogue into a semi-supervised medical dialogue model to obtain the response for the first round;
a second-and-subsequent-round dialogue response generation module, configured to, in the second and subsequent rounds of dialogue, input the patient's question in the current round and the response from the previous round into the semi-supervised medical dialogue model to obtain the response for the corresponding round, until the patient inputs no new question;
wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network, and a response generator. The context encoder encodes the received information and feeds it to the prior state tracker and the prior policy network; the prior state tracker continuously tracks the user's physical state; the prior policy network generates the corresponding physician action; and the response generator generates the corresponding response from the physical state and the physician action;
the inference state tracker infers the user's physical state, and the inference policy network infers the physician action; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model.
A third aspect of the present invention provides a computer-readable storage medium.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the semi-supervised multi-round medical dialogue response generation method described above.
A fourth aspect of the present invention provides a computer device.
A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the semi-supervised multi-round medical dialogue response generation method described above.
Compared with the prior art, the present invention has the following beneficial effects:
(1) In the second and subsequent rounds of dialogue, the present invention inputs the patient's question in the current round and the response from the previous round into the semi-supervised medical dialogue model to obtain the response for the corresponding round, until the patient inputs no new question. The user's physical state and the physician's actions are modeled explicitly and represented as text spans, which improves the model's ability to model the patient's physiological state and to perform medical reasoning.
(2) At the model level, the present invention treats the user's physical state and the physician's actions as latent variables, and proposes training methods for both the case where intermediate annotations exist (supervised) and the case where they do not (unsupervised). This greatly reduces the dialogue model's dependence on annotated data.
(3) During policy network learning, the present invention uses the tracked patient state to retrieve from a large-scale medical knowledge graph. The explicit state, the explicit action, and the reasoning path in the medical knowledge graph improve the interpretability of the responses generated by the dialogue system.
(4) For model training, the present invention proposes a two-stage stacked inference method, which improves stability when little supervised training data is available.
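Item (3) above describes using the tracked patient state to retrieve from a medical knowledge graph. The following is a minimal sketch of that idea, not the patented retrieval procedure: the toy graph, the relation names, and the function `retrieve_subgraph` are all hypothetical, and the hop-limited expansion is an assumed retrieval strategy.

```python
# Hypothetical toy knowledge graph: entity -> list of (relation, neighbour) edges.
TOY_GRAPH = {
    "cold": [("has_symptom", "fever"), ("has_symptom", "cough"),
             ("treated_by", "cold granules")],
    "cough": [("treated_by", "cough syrup")],
    "fever": [],
}


def retrieve_subgraph(state_tokens, graph, max_hops=1):
    """Collect (head, relation, tail) triples reachable from the state tokens.

    Tokens that are not graph entities (for example "<pad>") are ignored.
    """
    frontier = [tok for tok in state_tokens if tok in graph]
    seen = set(frontier)
    triples = []
    for _ in range(max_hops):
        next_frontier = []
        for head in frontier:
            for relation, tail in graph.get(head, []):
                triples.append((head, relation, tail))
                if tail not in seen:
                    seen.add(tail)
                    next_frontier.append(tail)
        frontier = next_frontier
    return triples


if __name__ == "__main__":
    # Retrieval seeded by a tracked state containing one known entity.
    print(retrieve_subgraph(["<pad>", "cough"], TOY_GRAPH))
```

The returned triples double as an explicit reasoning path, which is the interpretability benefit the item refers to.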
Advantages of additional aspects of the present invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the present invention.
Brief Description of the Drawings
The accompanying drawings, which form a part of the present invention, are provided for further understanding of the present invention. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it.
FIG. 1(a) shows training on supervised data according to an embodiment of the present invention;
FIG. 1(b) shows training on unsupervised data according to an embodiment of the present invention;
FIG. 1(c) shows the modules used in the test phase according to an embodiment of the present invention;
FIG. 2 shows a specific implementation of the medical dialogue system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the model during training according to an embodiment of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawings and embodiments.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present invention. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs.
It should also be noted that the terminology used herein is for describing specific embodiments only and is not intended to limit the exemplary embodiments of the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.
Explanation of terms:
Encoder-Decoder: a neural network architecture that encodes one word sequence and then decodes it into another word sequence; mainly used in machine translation, dialogue systems, and similar tasks.
Encoding: representing a word sequence as a continuous vector.
Decoding: converting a continuous vector into a target sequence.
Expectation: the sum, over all possible outcomes of an experiment, of each outcome multiplied by its probability; denoted E· in the present invention.
KL divergence: an asymmetric measure of the difference between two probability distributions, denoted KL(·||·) in the present invention and computed as follows:
$$\mathrm{KL}(q \,\|\, p) = \sum_{i} q(i) \log \frac{q(i)}{p(i)}$$
where q and p are two discrete distributions, and q(i) and p(i) are the probabilities of the i-th outcome under q and p, respectively.
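The KL divergence between two discrete distributions q and p, the sum over i of q(i)·log(q(i)/p(i)), can be computed directly. The sketch below is illustrative only; the function name `kl_divergence` and the handling of zero probabilities (the usual conventions) are assumptions, not part of the patent text.

```python
import math


def kl_divergence(q, p):
    """Return KL(q || p) for two discrete distributions of equal length."""
    if len(q) != len(p):
        raise ValueError("q and p must have the same number of outcomes")
    total = 0.0
    for q_i, p_i in zip(q, p):
        if q_i == 0.0:
            continue  # a term with q(i) = 0 contributes 0 by convention
        if p_i == 0.0:
            return math.inf  # q puts mass where p has none
        total += q_i * math.log(q_i / p_i)
    return total


if __name__ == "__main__":
    q = [0.9, 0.1]
    p = [0.5, 0.5]
    # The two directions differ, which is the asymmetry noted in the definition.
    print(kl_divergence(q, p), kl_divergence(p, q))
```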
Latent variable: also called a hidden variable; in statistics, an unobservable random variable, as opposed to an observed variable.
Training phase (Train): in the training phase, the neural network model receives training data as input, and its parameters are adjusted continually using the training samples.
Test phase (Test): after training, the neural network model uses its trained parameters to output labels and other information for the input data. This phase is also referred to below as the deployment phase.
Embodiment 1
This embodiment provides a semi-supervised multi-round medical dialogue response generation method, comprising:
inputting the patient's question in the first round of dialogue into a semi-supervised medical dialogue model to obtain the response for the first round;
in the second and subsequent rounds of dialogue, inputting the patient's question in the current round and the response from the previous round into the semi-supervised medical dialogue model to obtain the response for the corresponding round, until the patient inputs no new question;
wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network, and a response generator. The context encoder encodes the received information and feeds it to the prior state tracker and the prior policy network; the prior state tracker continuously tracks the user's physical state; the prior policy network generates the corresponding physician action; and the response generator generates the corresponding response from the physical state and the physician action;
the inference state tracker infers the user's physical state, and the inference policy network infers the physician action; the inference state tracker and the inference policy network are executed only during the training phase of the semi-supervised medical dialogue model.
The context encoder encodes the received information. The patient's question in the first round of dialogue is encoded directly. For the second and subsequent rounds, the patient's question and the response from the previous round are encoded into context information, which is fed to five modules: the prior state tracker, the inference state tracker, the prior policy network, the inference policy network, and the response generator.
The prior state tracker takes as input a state instance sampled from the output probability distribution of the inference state tracker in the previous round of dialogue, and outputs a probability distribution over the current state.
The inference state tracker takes as input a state instance sampled from the output probability distribution of the inference state tracker in the previous round of dialogue, together with the physician's response $R_t$ in the current round, and outputs a probability distribution over the current state.
The prior policy network takes as input a state instance sampled from the output probability distribution of the inference state tracker in the current round of dialogue, together with the external medical knowledge graph $G$, and outputs a probability distribution over the physician action.
The inference policy network takes as input a state instance sampled from the output probability distribution of the inference state tracker in the current round of dialogue, together with the physician's response $R_t$ in the current round, and outputs a probability distribution over the physician action.
The input of the response generator differs between the training phase and the test phase (that is, deployment). In the training phase it receives a state instance together with an action instance sampled from a probability distribution; in the test phase it likewise receives a state instance and a sampled action instance, which at deployment come from the prior state tracker and the prior policy network. In both cases it outputs the dialogue response $R_t$.
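Several of the modules above consume a state or action instance sampled from a probability distribution. The sketch below shows one simple way to draw such an instance. It assumes, for illustration only, that each position of the span has its own categorical distribution over tokens; the actual decoders generate words sequentially, and the name `sample_span` is hypothetical.

```python
import random


def sample_span(position_distributions, rng):
    """Draw one token per position from that position's categorical distribution.

    position_distributions: list of dicts mapping token -> probability.
    rng: a random.Random instance, passed in so that sampling is reproducible.
    """
    span = []
    for dist in position_distributions:
        tokens = list(dist.keys())
        weights = list(dist.values())
        span.append(rng.choices(tokens, weights=weights, k=1)[0])
    return span


if __name__ == "__main__":
    rng = random.Random(0)
    distributions = [
        {"cold": 0.7, "fever": 0.3},
        {"cough": 0.6, "<pad>": 0.4},
    ]
    print(sample_span(distributions, rng))
```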
In the deployment phase, in each round of dialogue, given the patient's utterance, the medical dialogue system uses the prior state tracker to continuously track the user's physical state and uses the prior policy network to generate the corresponding physician action. The response generator then combines the state and the action sampled from the prior state tracker and the prior policy network to generate the corresponding response, as shown in FIG. 1(c). The dialogue continues until the patient inputs no new question, that is, until the patient ends the dialogue.
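The deployment-time loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: `run_dialogue` and the four `toy_*` functions are hypothetical stand-ins for the context encoder, prior state tracker, prior policy network, and response generator, which in the described system are neural networks; the knowledge graph input to the policy network is omitted for brevity.

```python
def run_dialogue(patient_turns, encode, track_state, choose_action,
                 generate_reply, state_len=4):
    """Produce one response per patient utterance, carrying state across rounds."""
    state = ["<pad>"] * state_len  # initial state consists only of padding tokens
    previous_reply = ""            # there is no response before the first round
    replies = []
    for utterance in patient_turns:  # the loop ends when the patient stops asking
        context = encode(previous_reply, utterance)
        state = track_state(context, state)      # role of the prior state tracker
        action = choose_action(context, state)   # role of the prior policy network
        previous_reply = generate_reply(context, state, action)
        replies.append(previous_reply)
    return replies


# Trivial rule-based stand-ins so that the sketch runs end to end.
def toy_encode(previous_reply, utterance):
    return (previous_reply + " " + utterance).split()


def toy_track_state(context, previous_state):
    merged = [w for w in previous_state if w != "<pad>"]
    for word in context:
        if word in {"fever", "cough"} and word not in merged:
            merged.append(word)
    merged = merged[:len(previous_state)]
    return merged + ["<pad>"] * (len(previous_state) - len(merged))


def toy_choose_action(context, state):
    return ["rest"] if "fever" in state else ["ask_symptom"]


def toy_generate_reply(context, state, action):
    return "action: " + " ".join(action)


if __name__ == "__main__":
    print(run_dialogue(["hello", "I have a fever"], toy_encode,
                       toy_track_state, toy_choose_action, toy_generate_reply))
```

Passing the modules in as functions keeps the loop itself independent of how each module is implemented.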
A medical dialogue system has two key features: the patient state (symptoms, medications, and so on) and the physician action (treatment, diagnosis, and so on). These two features make medical dialogue more complex than other knowledge-intensive dialogue scenarios. As in task-oriented dialogue systems, the medical dialogue generation process is divided into the following three stages:
(1) Patient state tracking: given the dialogue history, the dialogue system tracks the patient's physical state (state);
(2) Physician policy learning: given the patient state and the dialogue history, the dialogue system outputs the current physician action (action);
(3) Medical response generation: given the dialogue history, the tracked state, and the predicted action, the system produces a fluent and accurate natural-language response.
For scenarios with annotated data, in round $t$ of the dialogue the patient asks a question or describes symptoms, $U_t$. The medical dialogue system receives the previous round's response $R_{t-1}$, the current question $U_t$, and the state $S_{t-1}$ tracked in the previous round, and outputs the current state $S_t$. It then uses $R_{t-1}$, $U_t$, and $S_t$ to output the action $A_t$ that the physician should take in the current round, and finally generates a natural-language response $R_t$ for the patient. In medical dialogue systems, however, the patient's physiological state and the physician's actions are often not annotated. Both state and action are therefore treated as latent variables. Because the state persists throughout the dialogue, it is represented as a sequence of words; the same holds for the physician action, since a physician's response may contain several keywords. In practice, the state and the action are given fixed lengths $|S|$ and $|A|$, respectively. The state has an initial value "<pad><pad>...<pad>", where "<pad>" is a padding token. The design of state and action is as follows:
State design: the state records the information about the user's physical state that the dialogue system has gathered over the whole dialogue. It is represented as a sequence of words, for example "cold fever cough night sweats ...", and is initialized to "<pad><pad>...<pad>".
Action design: the action summarizes the physician's response and is likewise represented as a sequence of words, for example "999 Ganmao Ling Granules Jizhi Syrup ...".
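The fixed-length word-sequence representation of state and action can be illustrated with a small helper. This is a sketch under assumptions: the names `to_fixed_span` and `initial_state` are hypothetical, and truncating over-long sequences is an assumed policy that the text does not specify.

```python
PAD = "<pad>"


def to_fixed_span(tokens, length):
    """Truncate or right-pad a token list so that it has exactly `length` tokens."""
    clipped = list(tokens)[:length]
    return clipped + [PAD] * (length - len(clipped))


def initial_state(length):
    """Return the initial state, which consists only of padding tokens."""
    return [PAD] * length


if __name__ == "__main__":
    print(initial_state(4))
    print(to_fixed_span(["cold", "fever", "cough"], 4))
```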
半监督医疗对话模型包含了六个模块,分别为上下文编码器(context encoder),先验状态追踪器(prior state tracker),推理状态追踪器(inference state tracker),先验策略网络(prior policy network),推理策略网络(inference state tracker)和回复生成器(response generator)。在一整个医疗对话中,往往包含多次交互,以下过程经历多轮直至对话结束。The semi-supervised medical dialogue model consists of six modules: context encoder, prior state tracker, inference state tracker, prior policy network, inference state tracker, and response generator. In a whole medical dialogue, there are often multiple interactions, and the following process goes through multiple rounds until the dialogue ends.
The prior state tracker and the inference state tracker perform patient state tracking; the inference state tracker runs only during training. The prior policy network and the inference policy network perform physician policy learning; the inference policy network runs only during training. The response generator produces the medical response. The inputs and outputs of each module are described below mainly from the unsupervised perspective, that is, using the unsupervised data $D_u$, which corresponds to Figure 1(b).
In round $t$, the context encoder is a GRU-based encoder (an LSTM-, Transformer-, or BERT-based encoder may be used instead). It takes the previous round's response $R_{t-1}$ and the current patient question $U_t$ as input and outputs a continuous vector that represents the dialogue context.
Specifically, in round $t$, given $R_{t-1}$ and $U_t$ as input, the context encoder first applies a bidirectional GRU encoder to obtain word-level representations $H_t = \{h_{t,1}, h_{t,2}, \ldots, h_{t,M+N}\}$ and then outputs a single vector that represents the dialogue context. Here $M$ and $N$ are the sequence lengths of $R_{t-1}$ and $U_t$, respectively.
In the encoder, the input at position $i$ is the word embedding of the $i$-th word of $R_{t-1}$, the BiGRU encoder is initialized with the context representation from the previous round, and attn [17] denotes the attention operation.
The prior state tracker takes the context encoder output and the previous state $S_{t-1}$ as input and uses a GRU-based decoder to output a sequence of words, namely the state $S_t$. The inference state tracker has a similar structure but additionally takes the current response $R_t$ as input; it also outputs a word sequence for the state. Each tracker defines a generation probability distribution over $S_t$, referred to below as the prior distribution and the approximate posterior distribution, respectively.
Both the prior state tracker and the inference state tracker are encoder-decoder structures. Without supervision, the states of all dialogue rounds are unknown, and the state of each round depends on the state of the previous round as input. We therefore sample the previous round's state from its estimated distribution and feed the sample into both the prior state tracker and the inference state tracker.
The prior state tracker first encodes the sampled previous state into a vector representation and initializes its decoder from that representation through a mapping with trainable parameters. At the $i$-th decoding step the decoder produces one output, and decoding the sequence step by step yields the prior distribution of $S_t$.
In this distribution, MLP denotes a multi-layer perceptron and $|S|$ is the length of the state text span.
The inference state tracker is structured like the prior state tracker. It also uses a GRU encoder to encode the sampled previous state, and it additionally encodes $R_t$. Its decoder is initialized from these encodings, again through a mapping with trainable parameters. At the $i$-th decoding step the decoder produces one output, and decoding the sequence step by step yields the approximate posterior distribution of $S_t$.
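The two state distributions can be written in a generic autoregressive form. The block below is a sketch under that assumption, not the patent's exact formulas: the names $p_\theta$ and $q_\phi$ for the prior and inference trackers are introduced here for illustration only. In the prior tracker, each per-step factor would be a softmax over an MLP of the decoder output, in line with the MLP mentioned for the prior distribution.

```latex
\begin{aligned}
p_\theta(S_t \mid S_{t-1}, R_{t-1}, U_t)
  &= \prod_{i=1}^{|S|} p_\theta\big(S_{t,i} \mid S_{t,<i},\, S_{t-1},\, R_{t-1},\, U_t\big), \\
q_\phi(S_t \mid S_{t-1}, R_{t-1}, U_t, R_t)
  &= \prod_{i=1}^{|S|} q_\phi\big(S_{t,i} \mid S_{t,<i},\, S_{t-1},\, R_{t-1},\, U_t,\, R_t\big).
\end{aligned}
```

The only structural difference between the two lines is the extra conditioning on $R_t$, which is why the inference tracker is available during training but not at test time.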
The prior policy network takes the context encoder output, the current state $S_t$, and the external medical knowledge $G$ as input, and uses a GRU-based decoder to output a sequence of words, namely the action $A_t$. The inference policy network has a similar structure: it takes $S_t$ and additionally the current response $R_t$ as input, and outputs a sequence of words for the action. Each network defines a distribution over $A_t$: the prior distribution for the prior policy network and the approximate posterior distribution for the inference policy network.
The prior policy network and the inference policy network are also encoder-decoder structures. The prior policy network takes a state sampled from the prior state tracker, and the inference policy network takes a state sampled from the inference state tracker.
Before describing the two policy networks, we introduce a knowledge graph retrieval operation, qsub, and a knowledge graph encoding operation, RGAT [15]. Given the tracked state, qsub retrieves a subgraph $G_n$ from the medical knowledge graph $G$: starting from the nodes in the state, it extracts all nodes and edges reachable within $n$ hops, and it links all nodes that appear in the state so that $G_n$ forms a single connected graph. RGAT is a graph encoding method that takes edge types into account and, after several rounds of propagation, produces an embedding for each node, that is, a vector in a continuous space. Encoding $G_n$ yields one representation per node, where $|G_n|$ is the number of nodes in $G_n$.
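The retrieval step qsub can be sketched as a breadth-first expansion. This is a minimal illustration, assuming the knowledge graph is stored as an adjacency dictionary in which every node is a key; the function signature, the `"co_occurs"` relation used to link state nodes, and the example entities are all hypothetical and not taken from the patent.

```python
from collections import deque


def qsub(graph, state_tokens, n_hops):
    """Return (nodes, edges) of the subgraph reachable within n_hops of the state.

    graph: dict mapping node -> list of (relation, neighbor) pairs.
    state_tokens: tracked state span; tokens that are not graph nodes are ignored.
    """
    seeds = list(dict.fromkeys(t for t in state_tokens if t in graph))
    depth = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    edges = set()
    while queue:
        node = queue.popleft()
        if depth[node] == n_hops:
            continue  # do not expand beyond n hops
        for relation, neighbor in graph[node]:
            edges.add((node, relation, neighbor))
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    # Chain the state nodes together so the subgraph is a single component.
    for left, right in zip(seeds, seeds[1:]):
        edges.add((left, "co_occurs", right))
    return set(depth), edges


toy_graph = {
    "cold": [("has_symptom", "fever"), ("treated_by", "ganmaoling")],
    "fever": [("symptom_of", "flu")],
    "flu": [("treated_by", "oseltamivir")],
    "cough": [("treated_by", "jizhi_syrup")],
    "ganmaoling": [],
    "oseltamivir": [],
    "jizhi_syrup": [],
}
nodes, edges = qsub(toy_graph, ["cold", "cough", "<pad>"], n_hops=1)
```

Linking consecutive state nodes is one simple way to obtain a connected subgraph; other linking schemes would serve the same purpose.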
The prior policy network uses a GRU encoder to encode the sampled state and then initializes its decoder from the resulting representations. At the $i$-th decoding step the decoder produces one output. Decoding combines two modes: generating a word from the vocabulary, and copying a node from the retrieved knowledge graph $G_n$.
Here $e_j$ denotes the $j$-th node in $G_n$, $g_j$ denotes the embedding of the $j$-th node, and $Z_A$ is the normalization term shared by generation and copying. The indicator $I(e_j, A_{t,i})$ equals 1 if $e_j = A_{t,i}$ and 0 otherwise.
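The combination of generation and copying under a shared normalizer can be illustrated as follows. This is a sketch assuming a CopyNet-style formulation in which generation and copy scores are exponentiated and divided by one common term (playing the role of $Z_A$); the scoring inputs and the function name are hypothetical, since the exact scoring functions are not reproduced in the text.

```python
import math


def generate_copy_distribution(vocab_scores, node_scores, node_tokens):
    """Mix vocabulary generation and graph-node copying into one distribution.

    vocab_scores: dict token -> unnormalized generation score.
    node_scores:  list of unnormalized copy scores, one per node of G_n.
    node_tokens:  list of tokens e_j, one per node of G_n.
    """
    # Shared normalizer over both modes (the role played by Z_A).
    z = sum(math.exp(s) for s in vocab_scores.values())
    z += sum(math.exp(s) for s in node_scores)
    probs = {tok: math.exp(s) / z for tok, s in vocab_scores.items()}
    # A node contributes copy mass to the token it equals, i.e. I(e_j, token) = 1.
    for tok, score in zip(node_tokens, node_scores):
        probs[tok] = probs.get(tok, 0.0) + math.exp(score) / z
    return probs


probs = generate_copy_distribution(
    vocab_scores={"rest": 0.0, "fever": 0.0},
    node_scores=[0.0, 0.0],
    node_tokens=["fever", "ganmaoling"],
)
```

In this toy case "fever" receives mass from both modes, while "ganmaoling" can only be produced by copying from the graph.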
The prior distribution of $A_t$ then follows from these per-step generation and copy probabilities.
The inference policy network uses a GRU encoder to encode the sampled state and also encodes $R_t$, then initializes its decoder from these encodings. At the $i$-th decoding step the decoder produces one output. To strengthen the influence of $R_t$ on the result, only the direct generation probability is considered for the approximate posterior distribution of $A_t$.
The response generator is a GRU-based decoder that takes the context encoder output, $S_t$, and $A_t$ as input and outputs the medical response $R_t$.
During unsupervised training, the response generator uses only the outputs of the inference state tracker and the inference policy network. We sample a state and an action from their respective distributions, encode both samples, and use the encodings to initialize the decoder of the response generator. At the $i$-th decoding step the decoder produces one output, which gives the output probability of $R_t$.
This probability combines the probability of generating a word from the vocabulary with the probability of copying a word from $R_{t-1}$ and $U_t$; $|R|$ is the length of the response.
The training loss functions for supervised data and unsupervised data are $L_{sup}$ and $L_{un}$, respectively.
In $L_{un}$, $E[\cdot]$ denotes expectation and $KL(\cdot \,\|\, \cdot)$ denotes the Kullback-Leibler (KL) divergence.
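Because the state and the action are both latent variables with a prior module and an inference module each, $L_{un}$ plausibly takes the form of a negative evidence lower bound. The following is a sketch under that assumption rather than the patent's exact formula; $p_\theta$ and $q_\phi$ are illustrative names for the prior and inference modules, and the conditioning on $R_{t-1}$, $U_t$, and $G$ is left implicit.

```latex
\begin{aligned}
L_{un} ={}& -\,\mathbb{E}_{q_\phi(S_t)\, q_\phi(A_t \mid S_t)}
            \big[ \log p_\theta(R_t \mid S_t, A_t) \big] \\
          & +\, \mathbb{E}_{q_\phi(S_{t-1})}
            \big[ \mathrm{KL}\big( q_\phi(S_t \mid S_{t-1}, R_t) \,\|\, p_\theta(S_t \mid S_{t-1}) \big) \big] \\
          & +\, \mathbb{E}_{q_\phi(S_t)}
            \big[ \mathrm{KL}\big( q_\phi(A_t \mid S_t, R_t) \,\|\, p_\theta(A_t \mid S_t) \big) \big]
\end{aligned}
```

Under this reading, the first term rewards responses that can be reconstructed from the inferred state and action, and the two KL terms pull the prior state tracker and prior policy network toward their inference counterparts.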
When the proportion of supervised data is small, training can be unstable: the prior policy network is easily misled by incorrect states sampled from the prior state tracker. The present invention therefore proposes a two-stage cascaded inference training method that splits $L_{un}$ into separate training parts. Because the policy networks depend on the output of the state trackers, the state trackers are optimized first and the remaining modules are then optimized jointly, which improves stability during training. $L_{un}$ is split into two training objectives, $L_s$ and $L_a$.
In the first training stage, $L_s$ is minimized to improve the model's state tracking performance. In the second stage, $L_s + L_a$ is minimized to preserve the state tracking performance while training the model's policy learning ability. We call this the two-stage cascaded inference training method.
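The two-stage schedule amounts to switching the objective once the first stage is over. The sketch below assumes the switch is triggered by a step threshold; the parameter `stage_one_steps` is hypothetical, since the text does not state how the end of the first stage is determined.

```python
def two_stage_loss(loss_s, loss_a, global_step, stage_one_steps):
    """Select the training objective for the current step.

    Stage 1 (global_step < stage_one_steps): minimize L_s only.
    Stage 2 (afterwards): minimize L_s + L_a.
    """
    if global_step < stage_one_steps:
        return loss_s
    return loss_s + loss_a


early = two_stage_loss(loss_s=2.0, loss_a=3.0, global_step=0, stage_one_steps=10)
late = two_stage_loss(loss_s=2.0, loss_a=3.0, global_step=10, stage_one_steps=10)
```

Keeping $L_s$ in the second-stage objective reflects the stated goal of maintaining state tracking quality while the policy networks are trained.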
Figure 3 is a schematic diagram of the model during training; global_step is an integer that records the number of training iterations completed.
In the semi-supervised setting, the dialogue data used for model training has a supervised part and an unsupervised part. The training procedures for the supervised data $D_a$ and the unsupervised data $D_u$ are described in turn below.
(a) For the supervised data $D_a$
Training samples are drawn from $D_a$ to form a mini-batch, giving the data $R_{t-1}$, $U_t$, $S_{t-1}$, $S_t$, $A_t$, and $R_t$. The corresponding inputs are fed into the six modules described above, as in Figure 1(a). Training uses the negative log-likelihood (NLL) loss, which defines the supervised loss $L_{sup}$.
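For a single annotated sequence, the NLL loss is the negated sum of the log-probabilities that the model assigns to the reference tokens. The helper below is a minimal sketch with a hypothetical name and input format; which modules' sequence losses are summed into $L_{sup}$ is not restated here.

```python
import math


def sequence_nll(step_distributions, target_tokens):
    """Negative log-likelihood of a target sequence.

    step_distributions: one dict (token -> probability) per decoding step.
    target_tokens: the annotated token at each step.
    """
    return -sum(
        math.log(dist[token])
        for dist, token in zip(step_distributions, target_tokens)
    )


loss = sequence_nll(
    [{"fever": 0.5, "cough": 0.5}, {"fever": 0.25, "cough": 0.75}],
    ["fever", "fever"],
)
```

The same helper could be applied to the annotated state, action, and response sequences of a supervised sample.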
(b) For the unsupervised data $D_u$
Training samples are drawn from $D_u$ to form a mini-batch, giving the data $R_{t-1}$, $U_t$, and $R_t$. The intermediate annotations $S_{t-1}$, $S_t$, and $A_t$ are missing because the data is unlabeled. We first sample the previous state and feed it into the prior state tracker and the inference state tracker. Next, a state is sampled from each of the two resulting distributions, and the two samples serve as the inputs of the prior policy network and the inference policy network, respectively. An action is then sampled, and finally the response $R_t$ is generated in combination with $R_{t-1}$ and $U_t$. This process corresponds to Figure 1(b). The training loss is $L_{un}$; alternatively, $L_s + L_a$ can be used as the training loss to improve training stability.
For the entire training dataset $D = \{D_a, D_u\}$, the training steps are as follows:
Step 1: Let α (0 ≤ α ≤ 1) be the proportion of the supervised data $D_a$ in the whole training data $D$. Draw a random number between 0 and 1; if it is less than α, go to Step 2, otherwise go to Step 3.
Step 2: Train the model on supervised data, following procedure (a), with training loss $L_{sup}$. Update the parameters by gradient descent, then go to Step 4.
Step 3: Train the model on unsupervised data, following procedure (b), with training loss $L_{un}$. Update the parameters by gradient descent, then go to Step 4.
Step 4: Check whether the model has converged. If it has, go to Step 5; otherwise go to Step 1.
Step 5: Save the model weights and end training, as shown in Figure 3.
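Steps 1 to 5 can be summarized as a short control loop. This is a sketch only: the callables `supervised_step`, `unsupervised_step`, `has_converged`, and `save_weights` are hypothetical placeholders for the model-specific parts, and `max_steps` is a safety cap added here that is not part of the described procedure.

```python
import random


def train(alpha, supervised_step, unsupervised_step, has_converged,
          save_weights, rng=None, max_steps=10_000):
    """Alternate between supervised and unsupervised updates with ratio alpha."""
    if rng is None:
        rng = random.Random()
    global_step = 0
    while global_step < max_steps:
        # Step 1: draw a number in [0, 1) and compare it with alpha.
        if rng.random() < alpha:
            supervised_step(global_step)    # Step 2: loss L_sup, procedure (a)
        else:
            unsupervised_step(global_step)  # Step 3: loss L_un, procedure (b)
        global_step += 1
        # Step 4: stop once the convergence check passes.
        if has_converged(global_step):
            break
    save_weights()                          # Step 5
    return global_step
```

Drawing the update type at random with probability α keeps the expected mix of supervised and unsupervised batches equal to their share of the training data.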
The semi-supervised medical dialogue model is trained on medical dialogue datasets that are publicly available from industry and academia. The sampled supervised and unsupervised data are fed into the model, the corresponding loss function is computed, and the model parameters are optimized by gradient descent.
Once training is complete, all model parameters are fixed, and the inference state tracker and the inference policy network can be discarded. The model can then be applied in real dialogue scenarios. As shown in Figure 2, given a patient question as input, the context encoder, prior state tracker, prior policy network, and response generator run in sequence (at this point the response generator takes only the outputs of the prior state tracker and the prior policy network as input), and the generated response is returned to the user. The dialogue system keeps interacting with the patient: in each round, the prior state tracker takes the state from the previous round as input and then updates the tracked physical state of the patient. If no new question is received from the patient after a waiting period, the current session ends.
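The test-time flow can be sketched as a loop that carries the tracked state and the previous response from round to round. The four callables stand in for the trained context encoder, prior state tracker, prior policy network, and response generator; their names and signatures are hypothetical. Exhausting the `questions` iterable models the session ending when no new question arrives.

```python
PAD = "<pad>"


def run_session(questions, encode_context, track_state, choose_action,
                generate_response, state_len):
    """Run one multi-round session using only the prior modules."""
    state = [PAD] * state_len          # initial state: all padding tokens
    prev_response = []                 # no previous response in the first round
    responses = []
    for question in questions:
        context = encode_context(prev_response, question)
        state = track_state(context, state)      # update the tracked state
        action = choose_action(context, state)   # physician action for this round
        response = generate_response(context, state, action)
        responses.append(response)
        prev_response = response
    return responses, state
```

Only the state and the previous response persist between rounds, which matches the description of the prior state tracker reusing the state from the previous round.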
Embodiment 2
A semi-supervised multi-round medical dialogue response generation system, comprising:
a first-round dialogue response generation module, configured to input the patient's question in the first round of dialogue into the semi-supervised medical dialogue model to obtain the response for the first round;
a second-and-subsequent-round dialogue response generation module, configured to input, in the second and each subsequent round, the patient's question in the current round and the response from the previous round into the semi-supervised medical dialogue model to obtain the response for that round, until the patient enters no new question;
wherein the semi-supervised medical dialogue model comprises a context encoder, a prior state tracker, an inference state tracker, a prior policy network, an inference policy network, and a response generator; the context encoder encodes the received information and passes it to the prior state tracker and the prior policy network; the prior state tracker continuously tracks the user's physical state; the prior policy network generates the corresponding physician action; and the response generator generates the corresponding response from the physical state and the physician action;
the inference state tracker infers the user's physical state, and the inference policy network infers the physician action; the inference state tracker and the inference policy network run only during the training stage of the semi-supervised medical dialogue model.
The modules in this embodiment correspond one-to-one to the steps in Embodiment 1, and their specific implementation is the same, so it is not repeated here.
Embodiment 3
This embodiment provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the semi-supervised multi-round medical dialogue response generation method described above.
Embodiment 4
This embodiment provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, it implements the steps of the semi-supervised multi-round medical dialogue response generation method described above.
Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may be subject to various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110577272.8A CN113436752B (en) | 2021-05-26 | 2021-05-26 | Semi-supervised multi-round medical dialogue reply generation method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110577272.8A CN113436752B (en) | 2021-05-26 | 2021-05-26 | Semi-supervised multi-round medical dialogue reply generation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113436752A CN113436752A (en) | 2021-09-24 |
CN113436752B true CN113436752B (en) | 2023-04-28 |
Family
ID=77802906
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110577272.8A Active CN113436752B (en) | 2021-05-26 | 2021-05-26 | Semi-supervised multi-round medical dialogue reply generation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113436752B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111710150A (en) * | 2020-05-14 | 2020-09-25 | 国网江苏省电力有限公司南京供电分公司 | A method for detecting abnormal electricity consumption data based on adversarial self-encoding network |
CN111797220A (en) * | 2020-07-30 | 2020-10-20 | 腾讯科技(深圳)有限公司 | Dialog generation method and device, computer equipment and storage medium |
CN111897941A (en) * | 2020-08-14 | 2020-11-06 | 腾讯科技(深圳)有限公司 | Dialog generation method, network training method, device, storage medium and equipment |
CN112464645A (en) * | 2020-10-30 | 2021-03-09 | 中国电力科学研究院有限公司 | Semi-supervised learning method, system, equipment, storage medium and semantic analysis method |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110309275B (en) * | 2018-03-15 | 2024-06-14 | 北京京东尚科信息技术有限公司 | Dialog generation method and device |
CN109582767B (en) * | 2018-11-21 | 2024-05-17 | 北京京东尚科信息技术有限公司 | Dialogue system processing method, device, equipment and readable storage medium |
CN109977212B (en) * | 2019-03-28 | 2020-11-24 | 清华大学深圳研究生院 | Reply content generation method of conversation robot and terminal equipment |
CN109992657B (en) * | 2019-04-03 | 2021-03-30 | 浙江大学 | A Conversational Question Generation Method Based on Enhanced Dynamic Reasoning |
CN109933661B (en) * | 2019-04-03 | 2020-12-18 | 上海乐言信息科技有限公司 | Semi-supervised question-answer pair induction method and system based on deep generation model |
CN110297895B (en) * | 2019-05-24 | 2021-09-17 | 山东大学 | Dialogue method and system based on free text knowledge |
CN110321417B (en) * | 2019-05-30 | 2021-06-11 | 山东大学 | Dialog generation method, system, readable storage medium and computer equipment |
CN111428483B (en) * | 2020-03-31 | 2022-05-24 | 华为技术有限公司 | Voice interaction method, device and terminal device |
CN111767383B (en) * | 2020-07-03 | 2022-07-08 | 思必驰科技股份有限公司 | Conversation state tracking method, system and man-machine conversation method |
CN112164476A (en) * | 2020-09-28 | 2021-01-01 | 华南理工大学 | A method for generating medical consultation dialogue based on multitasking and knowledge guidance |
CN112289467B (en) * | 2020-11-17 | 2022-08-02 | 中山大学 | Low-resource scene migratable medical inquiry dialogue system and method |
Also Published As
Publication number | Publication date |
---|---|
CN113436752A (en) | 2021-09-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||