CN116127056A - Medical dialogue abstracting method with multi-level characteristic enhancement

Medical dialogue abstracting method with multi-level characteristic enhancement

Info

Publication number
CN116127056A
Authority
CN
China
Prior art keywords
medical
word
abstract
dialogue
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211692317.7A
Other languages
Chinese (zh)
Inventor
张天宝
冯时
杨振飞
王大玲
张一飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN202211692317.7A
Publication of CN116127056A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G06F40/35Discourse or dialogue representation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention provides a medical dialogue summarization method with multi-level feature enhancement, relating to the technical field of medical dialogue summarization. First, medical dialogue summary data are acquired and preprocessed so that they meet the unified requirements of the model; an automatic medical dialogue summarization model is then built. The model uses a pointer generation network as its basic framework and adapts to the medical dialogue scenario with a multi-level enhanced input feature representation that integrates intra-attention, speaker embeddings and utterance semantics. Finally, the constructed automatic medical dialogue summarization model is trained and tested. The method effectively improves the performance of the medical dialogue summarization model and enhances the accuracy of the generated summaries.

Description

Medical dialogue abstracting method with multi-level characteristic enhancement
Technical Field
The invention relates to the technical field of medical dialogue abstracts, in particular to a medical dialogue abstracting method with multi-level characteristic enhancement.
Background
The safety and convenience of online medical treatment are increasingly prominent. Through text communication, patients can describe their condition, and doctors can then provide diagnoses and advice. After each session, some online platforms require doctors to compose summaries of the critical information, including disease diagnosis and treatment advice. These summaries not only provide important medical advice for the current patient, but also serve as a valuable reference for subsequent treatment and for patients with similar diseases. However, because medical dialogues are lengthy and highly specialized, summarization is a repetitive and burdensome task for doctors that greatly reduces the efficiency of online medical services. To relieve doctors' heavy workload, automatic medical dialogue summarization has emerged, which automatically summarizes the doctor's disease diagnosis and treatment advice from the medical dialogue.
In 2020, the ACL paper 2020.acl-main.703 proposed BART, which generates text summaries abstractively using a pretrain-then-fine-tune approach. BART is a denoising autoencoder that maps corrupted text back to the original text from which it was derived. It is implemented as a sequence-to-sequence model with a bidirectional encoder over the corrupted text and a left-to-right autoregressive decoder.
BART pre-trains a model that combines bidirectional and autoregressive Transformers. Pre-training has two stages: text is first corrupted with an arbitrary noising function, and a sequence-to-sequence model is then learned to reconstruct the original text. BART is thus trained by corrupting text and optimizing the reconstruction loss, i.e., the cross-entropy loss between the decoder output and the original text. Because BART has an autoregressive decoder, it can be fine-tuned directly for sequence generation tasks such as abstractive summarization, where copying information from the input is closely related to the denoising pre-training objective. The encoder receives the input sequence and the decoder generates the output autoregressively. Medical dialogue summaries are typically copied from the doctor's original words, so the generated summary should prioritize faithfulness over creativity, and valuable medical terms should not be paraphrased but kept in full. BART, however, is a purely abstractive summarization method: it cannot produce the summary by copying from the original text, so key diagnosis and treatment information is easily lost, causing factual errors.
The hierarchical encoder-tagger model HET, proposed in 2020.coling-main.63, generates a medical dialogue summary by identifying and extracting important utterances: each utterance in the dialogue is labeled with an importance tag, and these tags are treated as a silver standard on which an extractive summarization model can be trained. That work sets a threshold for judging a sentence's importance to the summary: if a sentence's ROUGE-1 score against the summary exceeds the threshold, the sentence is considered important to the summary.
The hierarchical encoder-tagger HET consists of three parts: a word-level encoder, a memory module, and an utterance-level encoder. The word-level encoder uses BERT, takes the output representation of the [CLS] token as the representation of an utterance, and feeds it into the memory module. The memory module is an end-to-end memory network whose goal is to enhance the representation of the current utterance with information from related utterances in the dialogue context, thereby better capturing contextual information and enabling better tagging. An LSTM word-level encoder encodes every utterance in the dialogue separately to obtain a vector representation of each utterance, which serves as the values of the memory network. Then, based on the similarity between the current utterance and the other utterances, the corresponding values are weighted; the weighted sum is concatenated to the BERT representation of the current utterance, and the resulting vector is fed to the utterance-level encoder. The utterance-level encoder is an LSTM whose output, after a linear transformation, is passed to a softmax or conditional-random-field tagger that labels the importance of each utterance.
Although the medical dialogue summary overlaps the original medical dialogue to a high degree, the extractive method HET is not fully applicable, since extracted utterances may contain redundant and non-critical information, such as "okay" or "all right then", which carries no substantive medical meaning and can make the summary redundant and hard to read.
Compared with text summarization and other dialogue summarization tasks, medical dialogue summarization has unique characteristics and challenges. First, the summary typically copies the doctor's original words, so it should prioritize faithfulness over creativity. Meanwhile, valuable medical terms should not be casually paraphrased but retained in full.
Although the summary overlaps the original dialogue to a high degree, extractive methods are not entirely suitable, as the extracted utterances may contain redundant and non-critical information. Second, both the patient and the doctor may produce several utterances before the other speaker responds, so the model should distinguish whether an utterance was spoken by the patient or by the doctor. Finally, the key information is scattered across the dialogue, and not every doctor utterance is valuable for generating the summary; the meaningless parts include questions and answers unrelated to diagnosis and treatment advice, and polite expressions such as greetings and thanks. The model should therefore be able to recognize the semantics of a particular utterance in order to focus on valuable information.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a medical dialogue summarization method with multi-level feature enhancement, which generates accurate and concise summaries of doctors' diagnoses and suggestions.
In order to solve the technical problems, the invention adopts the following technical scheme: a medical dialogue abstracting method with multi-level characteristic enhancement comprises the following steps:
step 1: obtaining medical dialogue abstract data;
step 1.1: medical dialogue acquisition; the raw medical dialogue data are crawled from the "classic Q&A" section of an online medical platform; in a medical session, the patient consults an online doctor about some health problem, and the doctor helps the patient determine the nature of the problem, provides treatment advice, or refers the patient to another medical institution for further treatment; these data are complete conversations between patient and doctor covering the whole process; in addition to the dialogue content, each medical dialogue includes additional information about the doctor and the patient;
step 1.2: summary acquisition; the summary is appended after the medical dialogue and comprises two parts, "problem description" and "analysis and advice"; the "problem description" part states the patient's medical problem, and the "analysis and advice" part outlines the doctor's diagnosis or treatment advice;
step 2: preprocessing medical dialogue abstract data; the medical dialogue and abstract data obtained in the step 1 are preprocessed respectively, so that the data meet the unified model requirement, and the specific method is as follows:
step 2.1: preprocessing overall data;
first, the medical dialogue summary data are cleaned, removing examples that are missing the utterances of either the doctor or the patient, or that are missing the summary;
step 2.2: preprocessing a medical dialogue;
first, the doctor and patient utterances are concatenated in the original dialogue order to form a medical dialogue passage; the jieba word segmentation tool is used to segment words, special symbols and stop words are removed, word frequencies are counted, and a medical dialogue vocabulary is constructed;
step 2.3: preprocessing the abstract;
the summary is segmented with the jieba word segmentation tool, and special symbols and stop words are removed;
step 2.4: dividing the data set;
the preprocessed medical dialogue summary data are randomly shuffled and divided into three parts, a training set, a validation set and a test set; the average number of words in the medical dialogues and in the summaries is counted for all three sets (a preprocessing sketch follows);
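Steps 2.1 to 2.4 can be illustrated with a short preprocessing sketch. The sketch below assumes jieba for Chinese word segmentation; the stop-word file, the record field names, and the 80/10/10 split ratio (consistent with the 32631/4079/4079 division reported later) are illustrative assumptions, not prescriptions of the invention.

```python
# Preprocessing sketch for steps 2.2-2.4: splice turns in dialogue order,
# segment with jieba, drop stop words, build a vocabulary, shuffle and split.
# Field names and the stop-word file are assumptions for illustration.
import random
from collections import Counter

import jieba

def preprocess(records, stopword_path="stopwords.txt"):
    stopwords = set(open(stopword_path, encoding="utf-8").read().split())
    counter, processed = Counter(), []
    for rec in records:  # rec: {"turns": [(speaker, text), ...], "summary": str}
        dialogue = []
        for _speaker, text in rec["turns"]:
            words = [w for w in jieba.lcut(text) if w.strip() and w not in stopwords]
            dialogue.extend(words)
        counter.update(dialogue)
        summary = [w for w in jieba.lcut(rec["summary"])
                   if w.strip() and w not in stopwords]
        processed.append({"dialogue": dialogue, "summary": summary})
    random.shuffle(processed)  # step 2.4: random shuffle before splitting
    n = len(processed)
    train = processed[: int(0.8 * n)]
    valid = processed[int(0.8 * n): int(0.9 * n)]
    test = processed[int(0.9 * n):]
    return train, valid, test, counter
```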
step 3: constructing an automatic medical dialogue summarization model; based on the preprocessing results of the medical dialogues and summaries, a model capable of automatically summarizing medical dialogues is constructed;
step 3.1: using the pointer generation network as a basic architecture of a medical dialogue summary model;
step 3.1.1: an encoder and a decoder for setting a medical dialogue abstract model; the medical dialogue summary model uses a bi-directional LSTM as an encoder and a uni-directional LSTM as a decoder;
step 3.1.2: use intra-attention in place of the pointer generation network's coverage loss to reduce the generation of repeated words;

define $e_{ti}$ as the attention score of the encoder hidden state $h^e_i$ at decoding time step $t$; the model penalizes input words that already received high attention scores in previous decoding steps, and defines a new encoder attention score $e'_{ti}$, computed as formula (1):

$$e'_{ti}=\begin{cases}\exp(e_{ti}), & t=1\\ \dfrac{\exp(e_{ti})}{\sum_{j=1}^{t-1}\exp(e_{ji})}, & t>1\end{cases}\qquad(1)$$

the encoder attention scores are then normalized and used to obtain the encoder context vector $c^e_t$:

$$\alpha^e_{ti}=\frac{e'_{ti}}{\sum_j e'_{tj}},\qquad c^e_t=\sum_i \alpha^e_{ti}\,h^e_i$$

for each decoding step $t$, the model also calculates a new decoder attention score $e^d_{tt'}$ to reduce the re-generation of previously generated words, and uses it to calculate the decoder context vector $c^d_t$; the specific calculation formulas are as follows:

$$e^d_{tt'}=\left(h^d_t\right)^{\top}W^d_{attn}\,h^d_{t'}\qquad(2)$$

$$\alpha^d_{tt'}=\frac{\exp\left(e^d_{tt'}\right)}{\sum_{j=1}^{t-1}\exp\left(e^d_{tj}\right)}\qquad(3)$$

$$c^d_t=\sum_{t'=1}^{t-1}\alpha^d_{tt'}\,h^d_{t'}\qquad(4)$$

where $h^d_t$ is the decoder hidden state vector at time $t$, $W^d_{attn}$ is a weight matrix, $h^d_{t'}$ is the decoder hidden state vector at time $t'$, and $\alpha^d_{tt'}$ is the normalized decoder attention score; a brief sketch of the intra-temporal normalization in formula (1) follows;
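As a concrete reading of formula (1), the following PyTorch sketch keeps a running sum of past exponentiated scores and divides by it from the second decoding step onward. The interface is an illustrative assumption, not the invention's exact implementation.

```python
# Sketch of intra-temporal encoder attention (formula (1)) and its
# normalization: positions attended to in earlier steps are penalized.
import torch

def intra_temporal_attention(e_t, past_exp_sum=None):
    """e_t: (B, L) raw attention scores at the current decoding step.
    past_exp_sum: (B, L) running sum of exp(e) over previous steps, or None at t=1."""
    exp_e = torch.exp(e_t)
    e_prime = exp_e if past_exp_sum is None else exp_e / past_exp_sum  # formula (1)
    alpha_e = e_prime / e_prime.sum(dim=1, keepdim=True)  # normalized score
    new_sum = exp_e if past_exp_sum is None else past_exp_sum + exp_e
    return alpha_e, new_sum
```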
the summary generation probability distribution of the generation layer is calculated with the softmax function:

$$P_{gen}(w)=\mathrm{softmax}\left(W_{gen}\left[h^d_t;\,c^e_t;\,c^d_t\right]+b_{gen}\right)\qquad(5)$$

where $P_{gen}(w)$ is the generation probability distribution of the summary generation layer, $W_{gen}$ is the weight matrix of the generation layer, and $b_{gen}$ is the bias vector of the generation layer;

at the same time, the pointer mechanism uses the normalized encoder attention score $\alpha^e_{ti}$ as the probability of copying the original input word $w_i$:

$$P_{copy}(w_i)=\alpha^e_{ti}\qquad(6)$$

where $P_{copy}(w_i)$ is the copy probability of the original input word $w_i$;

the probability of using the copy mechanism at decoding step $t$ is calculated as follows:

$$p_{copy}=\sigma\left(W_{copy}\left[h^d_t;\,c^e_t;\,c^d_t\right]+b_{copy}\right)\qquad(7)$$

where $p_{copy}$ is the probability of using the copy mechanism at decoding step $t$, $W_{copy}$ is the weight matrix of the copy mechanism, and $b_{copy}$ is the bias vector of the copy mechanism;

the final probability distribution of the output word is obtained as a weighted sum of the copy (attention) distribution over the original dialogue and the generation distribution, calculated as follows (a code sketch of formulas (5) to (8) follows):

$$P(w)=p_{copy}\sum_{i:\,w_i=w}\alpha^e_{ti}+\left(1-p_{copy}\right)P_{gen}(w)\qquad(8)$$
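A minimal PyTorch sketch of formulas (5) to (8) follows. The concatenated feature [h_t; c_e; c_d] fed to W_gen and W_copy, and all tensor shapes, are assumptions consistent with pointer generation networks rather than details fixed by the text.

```python
# Sketch of formulas (5)-(8): generation distribution, copy distribution,
# copy gate, and the final mixture. B = batch, H = hidden, L = source length,
# V = vocabulary size.
import torch
import torch.nn.functional as F

def output_distribution(h_t, c_e, c_d, alpha_e, src_ids,
                        W_gen, b_gen, W_copy, b_copy, vocab_size):
    feat = torch.cat([h_t, c_e, c_d], dim=-1)             # (B, 3H)
    p_gen = F.softmax(feat @ W_gen + b_gen, dim=-1)       # formula (5): (B, V)
    p_copy = torch.sigmoid(feat @ W_copy + b_copy)        # formula (7): (B, 1)
    copy_dist = torch.zeros(alpha_e.size(0), vocab_size,
                            device=alpha_e.device)        # formula (6): scatter the
    copy_dist.scatter_add_(1, src_ids, alpha_e)           # attention onto the vocab
    return p_copy * copy_dist + (1.0 - p_copy) * p_gen    # formula (8)
```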
step 3.2: enhance the feature representation of the input words by adding speaker-level feature embeddings; establish a trainable speaker embedding vector, where the speakers comprise the two roles of doctor and patient; the speaker embedding vector is added to the word embedding vectors of that speaker's utterance to obtain the final encoder input embedding (a code sketch follows), calculated as:

$$E_{input}=E_{speaker}+E_{token}\qquad(9)$$

where $E_{input}$ is the final input embedding vector, $E_{speaker}$ is the speaker embedding vector, and $E_{token}$ is the word embedding vector;
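The speaker-level embedding of formula (9) is a one-line addition in code; the module below is a minimal sketch with illustrative sizes.

```python
# Sketch of formula (9): E_input = E_speaker + E_token.
import torch.nn as nn

class SpeakerAwareEmbedding(nn.Module):
    def __init__(self, vocab_size, emb_dim):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, emb_dim)
        self.speaker_emb = nn.Embedding(2, emb_dim)  # two roles: doctor, patient

    def forward(self, token_ids, speaker_ids):
        # token_ids, speaker_ids: (B, L); output: (B, L, emb_dim)
        return self.token_emb(token_ids) + self.speaker_emb(speaker_ids)
```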
step 3.3: introduce RoBERTa semantic representations; utterance semantics are introduced to enhance the feature representation of the input words from the utterance level; each utterance is fed separately into the Chinese pre-trained language model RoBERTa, with the classification symbol [CLS] inserted in front of it; the output corresponding to [CLS] is taken as the semantic representation of the utterance, calculated as:

$$r_i=\mathrm{RoBERTa}\left([CLS],\,w_{i1},\,w_{i2},\,\ldots,\,w_{il}\right)\qquad(10)$$

where $r_i$ is the semantic representation of the $i$-th utterance and $w_{i1},w_{i2},\ldots,w_{il}$ are the words of the $i$-th utterance;

then each word is paired with the semantic vector of the utterance containing it, and the encoder attention score of each input word is calculated using this utterance semantic vector, as follows:

$$e^t_{il}=v^{\top}\tanh\left(W_e\,h^e_{il}+W_d\,h^d_t+W_r\,r_i\right)\qquad(11)$$

where $e^t_{il}$ is the attention score of word $w_{il}$ at time $t$, $v^{\top}$ is the inner-product vector used to calculate the attention score, and $W_e$, $W_d$ and $W_r$ are the weight matrices corresponding to the encoder hidden state $h^e_{il}$, the decoder hidden state $h^d_t$ and the utterance semantic vector $r_i$, respectively (a sketch of the utterance-encoding step follows);
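Extracting the utterance representations $r_i$ of formula (10) with the Hugging Face transformers library might look as follows. The specific checkpoint hfl/chinese-roberta-wwm-ext is an assumption (the patent names only a "Chinese pre-training language model RoBERTa"); that family of checkpoints is loaded with the BERT classes.

```python
# Sketch of formula (10): take the [CLS] output of a Chinese RoBERTa as the
# semantic representation r_i of each utterance. Checkpoint name is assumed.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
model = BertModel.from_pretrained("hfl/chinese-roberta-wwm-ext")
model.eval()

def utterance_semantics(utterances):
    reps = []
    for utt in utterances:
        inputs = tokenizer(utt, return_tensors="pt")  # tokenizer prepends [CLS]
        with torch.no_grad():
            out = model(**inputs)
        reps.append(out.last_hidden_state[:, 0])      # [CLS] position -> r_i
    return torch.cat(reps, dim=0)                     # (num_utterances, hidden)
```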
step 4: training the constructed automatic medical dialogue abstract model and testing;
step 4.1: initialize the encoder input; the words of the segmented medical dialogues are randomly initialized as word embedding vectors, with values drawn from the normal distribution N(0, 1), using the previously constructed medical dialogue vocabulary; the speaker embedding vectors are randomly initialized in the same way, with values drawn from N(0, 1), using a vocabulary containing only the two tokens "doctor" and "patient"; the two embeddings are set to the same dimension, and the word embedding and the speaker embedding are added together as the encoder input;
step 4.2: initialize the decoder input; the words of the segmented summary are mapped to One-Hot vectors using the previously constructed medical dialogue vocabulary, and the special start symbol <SOS> is added at the beginning as the start marker of the decoder input;
step 4.3: construct the reference decoder output; the words of the segmented summary are mapped to One-Hot vectors using the previously constructed medical dialogue vocabulary and used as the reference output of the decoder;
step 4.4: training a model;
step 4.4.1: set the loss function; the loss function is the cross-entropy loss, calculated as follows:

$$loss=-\frac{1}{T}\sum_{t=1}^{T}\log P\left(w^{*}_{t}\right)\qquad(12)$$

where $loss$ is the loss function, $T$ is the total number of decoding time steps, and $P(w^{*}_{t})$ is the generation probability of the target word $w^{*}_{t}$;
step 4.4.2: set the training mode; the model is trained with Teacher Forcing;
step 4.4.3: mini-batch gradient descent; set the batch size, feed the divided mini-batches into the iteration loop, compute the average loss, and train the model by gradient descent (a minimal training-step sketch follows);
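A minimal teacher-forcing training step combining steps 4.4.1 to 4.4.3 might look like this; the model signature and padding handling are illustrative assumptions.

```python
# Sketch of one mini-batch training step with Teacher Forcing and the
# cross-entropy loss of formula (12). The model interface is assumed.
import torch.nn.functional as F

def train_step(model, optimizer, src_ids, speaker_ids, dec_in, dec_target, pad_id=0):
    # dec_in is the reference summary shifted right with <SOS>; Teacher Forcing
    # feeds the gold token at each step instead of the model's own prediction.
    logits = model(src_ids, speaker_ids, dec_in)  # (B, T, V)
    loss = F.cross_entropy(logits.transpose(1, 2), dec_target,
                           ignore_index=pad_id)   # mean over the batch, cf. (12)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```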
step 4.5: test the model's effectiveness.
The beneficial effects of the above technical scheme are as follows: to cope with the high copy rate, the multi-level feature-enhanced medical dialogue summarization method provided by the invention first adopts a pointer generation network as its basic framework, which can selectively copy the original text while retaining the ability to generate abstractively. To prevent the repeated generation of the same words, intra-attention is introduced into the pointer network in place of the original coverage-loss mechanism.
Second, the proposed medical dialogue summarization model assigns a speaker embedding vector to the patient and doctor roles in order to distinguish the speaker of each utterance. The model adds the speaker embedding vector directly to the token embedding vector and feeds the resulting embedding to the encoder.
Third, the pre-trained language model RoBERTa is used to recognize valuable utterances that contain key information. Each utterance is fed separately into RoBERTa, and the output at the [CLS] position is taken as the semantic representation of that utterance. Each token is then paired with the semantic representation vector of its utterance, which participates in the attention computation. Because the attention computation considers both the hidden states and the utterance semantics, the model can focus on key information.
The medical dialogue summarization model of the invention uses a pointer generation network as its basic framework and adapts to the medical dialogue scenario with a multi-level enhanced input feature representation that integrates intra-attention, speaker embeddings and utterance semantics. According to the automatic evaluation metrics, the model outperforms all baselines, and every module contributes to its performance.
Drawings
FIG. 1 is a flowchart of a medical dialogue summarization method with multi-level feature enhancement provided by an embodiment of the invention;
fig. 2 is a block diagram of a medical dialogue summary model according to an embodiment of the present invention.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
In this embodiment, a medical dialogue summarization method with multi-level feature enhancement, as shown in fig. 1, includes the following steps:
step 1: obtaining medical dialogue abstract data resources;
this embodiment uses a medical dialogue summary dataset from the Chunyu Doctor platform; data acquisition can be divided into two parts, obtaining the medical dialogues and the corresponding summaries, as follows:
step 1.1: medical dialogue acquisition; in this embodiment, the raw medical dialogue data are crawled from the "classic Q&A" section of the online medical platform Chunyu Doctor. In these dialogues, the patient consults an online doctor about some health problem, and the doctor helps determine the nature of the problem, provides treatment advice, or refers the patient to another medical institution for further help. These data are complete conversations between patient and doctor covering the whole process. In addition to the dialogue content, each conversation contains additional information such as the type of illness, the corresponding hospital department, and the speaker of each utterance;
step 1.2: summary acquisition; many conversations include a summary that the doctor appends after the conversation, comprising the two parts "problem description" and "analysis and advice". The "problem description" part states the patient's medical problem; the "analysis and advice" part outlines the doctor's diagnosis or treatment advice. This embodiment uses the "analysis and advice" part as the reference summary.
This embodiment crawls 44900 dialogues from 23 hospital departments, covering 923 diseases; these constitute the original corpus of this embodiment. The dataset retains only the medical dialogues and their corresponding summaries; an example is shown in Table 1.
Table 1 Chinese medical dialogue summary dataset example
Step 2: preprocessing medical dialogue abstract data resources; the medical dialogue and abstract data resources obtained in the step 1 are preprocessed respectively, so that the data resources meet the unified model requirements, and the specific method is as follows:
step 2.1: preprocessing overall data;
first, the medical dialogue summary dataset is cleaned, removing examples missing the utterances of either the doctor or the patient, or missing the summary, leaving a total of 40789 complete examples;
step 2.2: preprocessing a medical dialogue;
first, the doctor and patient utterances are concatenated in the original dialogue order to form a medical dialogue passage; the jieba word segmentation tool is used to segment words, special symbols and stop words are removed, word frequencies are counted, and a medical dialogue vocabulary is constructed;
step 2.3: preprocessing the abstract;
the summary is segmented with the jieba word segmentation tool, and special symbols and stop words are removed;
step 2.4: dividing the data set;
the preprocessed medical dialogue summary dataset is randomly shuffled and divided into a training set, a validation set and a test set; the average number of tokens in the medical dialogues and in the summaries is counted for each set to ensure that differences between the sets do not affect the experimental conclusions. The dataset division and statistics are shown in Table 2.
Table 2 Dataset division and statistics
Statistic | Training set | Validation set | Test set
Number of examples | 32631 | 4079 | 4079
Average token count of medical dialogues | 293.0 | 291.9 | 288.1
Average token count of summaries | 95.4 | 94.2 | 93.6
step 3: constructing an automatic medical dialogue summarization model; based on the preprocessing results of the medical dialogues and summaries, a model capable of automatically generating summaries from medical dialogues is constructed; the model architecture is shown in fig. 2, and the specific method is as follows:
step 3.1: use the pointer generation network as the basic architecture of the medical dialogue summarization model; this embodiment uses a pointer generation network as the base model, which uses a generation probability to arbitrate between copying and abstractive generation;
step 3.1.1: an encoder and a decoder for setting a medical dialogue abstract model; the medical dialogue summary model uses a bi-directional LSTM as an encoder and a uni-directional LSTM as a decoder;
step 3.1.2: replace the coverage loss with intra-attention; the medical dialogue summarization model uses intra-attention in place of the pointer generation network's coverage loss, which better reduces the generation of repeated words;

this embodiment defines $e_{ti}$ as the attention score of the encoder hidden state $h^e_i$ at decoding time step $t$; the model penalizes input words that already received high attention scores in previous decoding steps, and defines a new encoder attention score $e'_{ti}$, computed as formula (1):

$$e'_{ti}=\begin{cases}\exp(e_{ti}), & t=1\\ \dfrac{\exp(e_{ti})}{\sum_{j=1}^{t-1}\exp(e_{ji})}, & t>1\end{cases}\qquad(1)$$

the encoder attention scores are then normalized and used to obtain the encoder context vector $c^e_t$:

$$\alpha^e_{ti}=\frac{e'_{ti}}{\sum_j e'_{tj}},\qquad c^e_t=\sum_i \alpha^e_{ti}\,h^e_i$$

for each decoding step $t$, the model also calculates a new decoder attention score $e^d_{tt'}$ to reduce the re-generation of previously generated words, and uses it to calculate the decoder context vector $c^d_t$; the specific calculation formulas are as follows:

$$e^d_{tt'}=\left(h^d_t\right)^{\top}W^d_{attn}\,h^d_{t'}\qquad(2)$$

$$\alpha^d_{tt'}=\frac{\exp\left(e^d_{tt'}\right)}{\sum_{j=1}^{t-1}\exp\left(e^d_{tj}\right)}\qquad(3)$$

$$c^d_t=\sum_{t'=1}^{t-1}\alpha^d_{tt'}\,h^d_{t'}\qquad(4)$$

where $h^d_t$ is the decoder hidden state vector at time $t$, $W^d_{attn}$ is a weight matrix, $h^d_{t'}$ is the decoder hidden state vector at time $t'$, and $\alpha^d_{tt'}$ is the normalized decoder attention score;
The abstract generation probability distribution of the abstract generation layer is calculated by using the softmax function, and the calculation formula is as follows:
Figure SMS_41
wherein ,
Figure SMS_42
generating probability distribution for abstraction of abstraction generation layer, W gen Weight matrix for abstract generation layer, b gen A bias vector that is an abstract generation layer;
at the same time, the pointer mechanism uses encoder attention scores
Figure SMS_43
As duplicate original input word w i The calculation formula is as follows:
Figure SMS_44
wherein ,
Figure SMS_45
to the original input word w i Copy probability of (2);
the probability of using the replication mechanism for the decoding step t is calculated as follows:
Figure SMS_46
wherein ,
Figure SMS_47
to use the probability of the replication mechanism for decoding step t, W copy Weight matrix for replication mechanism, b copy A bias vector that is a replication mechanism;
the final probability distribution of the output word is obtained using a weighted sum of the attention probability distribution of the original dialog and the abstract generation probability distribution, calculated as follows:
Figure SMS_48
step 3.2: enhance the feature representation of the input words by adding speaker-level feature embeddings; in order to adapt to the multi-speaker scenario of medical dialogue, speaker-level embeddings are introduced to distinguish the speakers; a trainable speaker embedding vector is established, where the speakers comprise the two roles of doctor and patient; the speaker embedding vector is added to the word embedding vectors of that speaker's utterance to obtain the final encoder input embedding, calculated as:

$$E_{input}=E_{speaker}+E_{token}\qquad(9)$$

where $E_{input}$ is the final input embedding vector, $E_{speaker}$ is the speaker embedding vector, and $E_{token}$ is the word embedding vector;
step 3.3: introduce RoBERTa semantic representations; since word-level embeddings alone are not sufficient to locate key words, utterance semantics are introduced to enhance the feature representation of the input words from the utterance level; to obtain the semantic representation of each utterance, the Chinese pre-trained language model RoBERTa is introduced; each utterance is fed separately into RoBERTa, with the classification symbol [CLS] inserted in front of it; the output corresponding to [CLS] is taken as the semantic representation of the utterance, calculated as:

$$r_i=\mathrm{RoBERTa}\left([CLS],\,w_{i1},\,w_{i2},\,\ldots,\,w_{il}\right)\qquad(10)$$

where $r_i$ is the semantic representation of the $i$-th utterance and $w_{i1},w_{i2},\ldots,w_{il}$ are the words of the $i$-th utterance;

then each word is paired with the semantic vector of the utterance containing it, and the encoder attention score of each input word is calculated using this utterance semantic vector, as follows:

$$e^t_{il}=v^{\top}\tanh\left(W_e\,h^e_{il}+W_d\,h^d_t+W_r\,r_i\right)\qquad(11)$$

where $e^t_{il}$ is the attention score of word $w_{il}$ at time $t$, $v^{\top}$ is the inner-product vector used to calculate the attention score, and $W_e$, $W_d$ and $W_r$ are the weight matrices corresponding to the encoder hidden state $h^e_{il}$, the decoder hidden state $h^d_t$ and the utterance semantic vector $r_i$, respectively;
step 4: training the constructed automatic medical dialogue abstract model and testing; the specific method comprises the following steps:
step 4.1: initialize the encoder input; the words of the segmented medical dialogues are randomly initialized as word embedding vectors, with values drawn from the normal distribution N(0, 1), using the previously constructed medical dialogue vocabulary; the speaker embedding vectors are randomly initialized in the same way, with values drawn from N(0, 1), using a vocabulary containing only the two tokens "doctor" and "patient"; the two embeddings are set to the same dimension, and the word embedding and the speaker embedding are added together as the encoder input;
step 4.2: initialize the decoder input; the words of the segmented summary are mapped to One-Hot vectors using the previously constructed medical dialogue vocabulary, and the special start symbol <SOS> is added at the beginning as the start marker of the decoder input;
step 4.3: construct the reference decoder output; the words of the segmented summary are mapped to One-Hot vectors using the previously constructed medical dialogue vocabulary and used as the reference output of the decoder;
step 4.4: training a model; the specific method comprises the following steps:
step 4.4.1: set the loss function; the loss function is the cross-entropy loss, calculated as follows:

$$loss=-\frac{1}{T}\sum_{t=1}^{T}\log P\left(w^{*}_{t}\right)\qquad(12)$$

where $loss$ is the loss function, $T$ is the total number of decoding time steps, and $P(w^{*}_{t})$ is the generation probability of the target word $w^{*}_{t}$;
step 4.4.2: set the training mode; the model is trained with Teacher Forcing: during training, instead of feeding the model's own output at the previous time step as the input at the next step, the corresponding previous token of the summary's reference answer (ground truth) is fed directly as the next input;
step 4.4.3: mini-batch gradient descent; set the batch size, feed the divided mini-batches into the iteration loop, compute the average loss, and train the model by gradient descent;
step 4.5: the model effect is tested by the following specific method:
step 4.5.1: decoding uses the Beam Search algorithm at test time; the beam width is set and the summary is generated. Beam search does not greedily take the single most probable token at each step; instead, it keeps all possible extensions of each partial hypothesis and selects the best token sequences by log-probability. The beam size is set to 4 (a compact sketch follows);
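Step 4.5.1 can be sketched as a compact log-probability beam search; step_fn (a callable returning next-token log-probabilities for a given prefix) and the token ids are illustrative assumptions.

```python
# Compact beam search sketch (beam size = 4): keep the best `beam_size`
# partial sequences by cumulative log-probability at every step.
import torch

def beam_search(step_fn, sos_id, eos_id, beam_size=4, max_len=100):
    beams = [([sos_id], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos_id:          # finished hypotheses carry over
                candidates.append((seq, score))
                continue
            log_probs = step_fn(seq)       # (V,) next-token log-probabilities
            topv, topi = torch.topk(log_probs, beam_size)
            for lp, idx in zip(topv.tolist(), topi.tolist()):
                candidates.append((seq + [idx], score + lp))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
        if all(seq[-1] == eos_id for seq, _ in beams):
            break
    return beams[0][0]  # best-scoring sequence
```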
step 4.5.2: use the ROUGE score as the evaluation metric; the standard automatic evaluation metric for text summarization, the ROUGE score (n-gram overlap ratio), is used to judge model performance. This embodiment uses the ROUGE-1, ROUGE-2 and ROUGE-L scores as automatic evaluation metrics, which measure the accuracy of unigrams, bigrams and the longest common subsequence, respectively. A commonly accepted third-party library computes the ROUGE scores, and the resulting averages are taken as the final results (a sketch follows).
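With the open-source rouge package (one possible "commonly accepted third-party library"; the patent does not name the library it used), computing the averaged ROUGE-1/2/L F-scores looks like this, assuming word-segmented, space-joined hypothesis and reference strings:

```python
# Sketch of ROUGE evaluation: average ROUGE-1/2/L F-scores over the test set.
from rouge import Rouge

def evaluate(hypotheses, references):
    # hypotheses/references: lists of space-joined segmented summaries
    scores = Rouge().get_scores(hypotheses, references, avg=True)
    return (scores["rouge-1"]["f"], scores["rouge-2"]["f"], scores["rouge-l"]["f"])
```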
Step 4.5.3: selecting a plurality of strong base lines to be compared with the result indexes of the medical dialogue abstract model, and proving the effectiveness of the model;
to verify the effectiveness of the medical dialogue summarization model, this embodiment selects a variety of strong baselines from previous studies: Lead-3, which extracts the doctor's first three utterances as the summary; a Random extraction method, which randomly extracts three doctor utterances as the summary; Extractive Oracle, which extracts the three utterances with the highest ROUGE-1 scores against the reference summary; the ranking-based extraction method TextRank; the pointer generation network PGNet; ML+RL, which combines maximum-likelihood training and reinforcement learning; the hierarchical extraction model HET, designed specifically for medical dialogue summary extraction and using the pre-trained language model ZEN; the Transformer-based model Longformer, which focuses on processing long input text; and the pre-trained language model BART. The test results are shown in Table 3:
table 3 model comparison experiment results
As shown in Table 3, the medical dialogue summarization model of the invention outperforms all baseline models on all ROUGE metrics. The similar results of Lead-3 and Random indicate that the position of the doctor's diagnosis and advice is not fixed. The high performance of Extractive Oracle indicates that the summary and the doctor's original utterances overlap heavily. Longformer and BART perform poorly on all metrics, showing that direct abstractive generation does not work here. PGNet's moderate performance indicates that the copy mechanism is effective, but that traditional text summarization models need to be adapted to the medical dialogue setting. The HET results indicate that the extractive method also takes redundant words into the summary when extracting valuable utterances. ML+RL uses a copy mechanism together with various optimization techniques, so it performs well on all metrics.
This embodiment also carries out an ablation experiment to verify the effectiveness of the medical dialogue summarization model. As shown in Table 4, after removing intra-attention, speaker embedding and utterance semantics respectively, the ROUGE-1 score drops by 2.33, 4.04 and 2.88 points. The results show that every module effectively improves the performance of the medical dialogue summarization model.
Table 4 Ablation experiment results
Model | ROUGE-1 | ROUGE-2 | ROUGE-L
no intra-attention | 87.19 | 80.12 | 86.75
no speaker embedding | 85.48 | 78.05 | 84.49
no utterance semantics | 86.64 | 78.82 | 85.76
Full model (ours) | 89.52 | 82.86 | 88.79
It should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions, which are defined by the scope of the appended claims.

Claims (9)

1. A medical dialogue abstracting method with multi-level characteristic enhancement is characterized in that: the method comprises the following steps:
step 1: obtaining medical dialogue abstract data; the medical dialogue abstract data comprises two parts of medical dialogue and abstract;
step 2: preprocessing medical dialogue abstract data; preprocessing the acquired medical dialogue and abstract data respectively to enable the data to meet the unified model requirement;
step 3: constructing an automatic medical dialogue abstract model; based on the pretreatment results of the medical dialogue and the abstracts, constructing a model capable of automatically summarizing abstracts from the medical dialogue;
step 4: training the constructed automatic medical dialogue abstract model and testing.
2. The method for medical dialogue summarization with multi-level feature enhancement according to claim 1, wherein: the raw medical dialogue data are crawled from the "classic Q&A" section of an online medical platform; in a medical session, the patient consults an online doctor about some health problem, and the doctor helps the patient determine the nature of the problem, provides treatment advice, or refers the patient to another medical institution for further treatment; these data are complete conversations between patient and doctor covering the whole process; in addition to the dialogue content, each medical dialogue includes additional information about the doctor and the patient;
the summary is appended after the medical dialogue and comprises two parts, "problem description" and "analysis and advice"; the "problem description" part states the patient's medical problem, and the "analysis and advice" part outlines the doctor's diagnosis or treatment advice.
3. The method for medical dialogue summarization with multi-level feature enhancement according to claim 1, wherein: the specific method of the step 2 is as follows:
step 2.1: preprocessing overall data;
first, the medical dialogue summary data are cleaned, removing examples that are missing the utterances of either the doctor or the patient, or that are missing the summary;
step 2.2: preprocessing a medical dialogue;
first, the doctor and patient utterances are concatenated in the original dialogue order to form a medical dialogue passage; the jieba word segmentation tool is used to segment words, special symbols and stop words are removed, word frequencies are counted, and a medical dialogue vocabulary is constructed;
step 2.3: preprocessing the abstract;
the summary is segmented with the jieba word segmentation tool, and special symbols and stop words are removed;
step 2.4: dividing the data set;
the preprocessed medical dialogue summary data are randomly shuffled and divided into three parts, a training set, a validation set and a test set, and the average number of words in the medical dialogues and in the summaries is counted for all three sets.
4. The method for medical dialogue summarization with multi-level feature enhancement according to claim 1, wherein: the specific method of the step 3 is as follows:
step 3.1: using the pointer generation network as a basic architecture of a medical dialogue summary model;
step 3.2: enhancing the feature representation of the input word by adding speaker-level feature embedding;
step 3.3: introducing a RoBERTa semantic representation; introducing utterance semantics enhances the feature representation of the input word from the utterance level.
5. The method for multi-level feature enhanced medical session summarization of claim 4, wherein: the specific method of the step 3.1 is as follows:
step 3.1.1: an encoder and a decoder for setting a medical dialogue abstract model; the medical dialogue summary model uses a bi-directional LSTM as an encoder and a uni-directional LSTM as a decoder;
step 3.1.2: use intra-attention in place of the pointer generation network's coverage loss to reduce the generation of repeated words;

define $e_{ti}$ as the attention score of the encoder hidden state $h^e_i$ at decoding time step $t$; the model penalizes input words that already received high attention scores in previous decoding steps, and defines a new encoder attention score $e'_{ti}$, computed as formula (1):

$$e'_{ti}=\begin{cases}\exp(e_{ti}), & t=1\\ \dfrac{\exp(e_{ti})}{\sum_{j=1}^{t-1}\exp(e_{ji})}, & t>1\end{cases}\qquad(1)$$

the encoder attention scores are then normalized and used to obtain the encoder context vector $c^e_t$:

$$\alpha^e_{ti}=\frac{e'_{ti}}{\sum_j e'_{tj}},\qquad c^e_t=\sum_i \alpha^e_{ti}\,h^e_i$$

for each decoding step $t$, the model also calculates a new decoder attention score $e^d_{tt'}$ to reduce the re-generation of previously generated words, and uses it to calculate the decoder context vector $c^d_t$; the specific calculation formulas are as follows:

$$e^d_{tt'}=\left(h^d_t\right)^{\top}W^d_{attn}\,h^d_{t'}\qquad(2)$$

$$\alpha^d_{tt'}=\frac{\exp\left(e^d_{tt'}\right)}{\sum_{j=1}^{t-1}\exp\left(e^d_{tj}\right)}\qquad(3)$$

$$c^d_t=\sum_{t'=1}^{t-1}\alpha^d_{tt'}\,h^d_{t'}\qquad(4)$$

where $h^d_t$ is the decoder hidden state vector at time $t$, $W^d_{attn}$ is a weight matrix, $h^d_{t'}$ is the decoder hidden state vector at time $t'$, and $\alpha^d_{tt'}$ is the normalized decoder attention score;
the summary generation probability distribution of the generation layer is calculated with the softmax function:

$$P_{gen}(w)=\mathrm{softmax}\left(W_{gen}\left[h^d_t;\,c^e_t;\,c^d_t\right]+b_{gen}\right)\qquad(5)$$

where $P_{gen}(w)$ is the generation probability distribution of the summary generation layer, $W_{gen}$ is the weight matrix of the generation layer, and $b_{gen}$ is the bias vector of the generation layer;

at the same time, the pointer mechanism uses the normalized encoder attention score $\alpha^e_{ti}$ as the probability of copying the original input word $w_i$:

$$P_{copy}(w_i)=\alpha^e_{ti}\qquad(6)$$

where $P_{copy}(w_i)$ is the copy probability of the original input word $w_i$;

the probability of using the copy mechanism at decoding step $t$ is calculated as follows:

$$p_{copy}=\sigma\left(W_{copy}\left[h^d_t;\,c^e_t;\,c^d_t\right]+b_{copy}\right)\qquad(7)$$

where $p_{copy}$ is the probability of using the copy mechanism at decoding step $t$, $W_{copy}$ is the weight matrix of the copy mechanism, and $b_{copy}$ is the bias vector of the copy mechanism;

the final probability distribution of the output word is obtained as a weighted sum of the copy (attention) distribution over the original dialogue and the generation distribution, calculated as follows:

$$P(w)=p_{copy}\sum_{i:\,w_i=w}\alpha^e_{ti}+\left(1-p_{copy}\right)P_{gen}(w)\qquad(8)$$
6. the method for multi-level feature enhanced medical session summarization of claim 5, wherein: the specific method of the step 3.2 is as follows:
a trainable speaker embedding vector is established, where the speakers comprise the two roles of doctor and patient; the speaker embedding vector is added to the word embedding vectors of that speaker's utterance to obtain the final encoder input embedding, calculated as:

$$E_{input}=E_{speaker}+E_{token}\qquad(9)$$

where $E_{input}$ is the final input embedding vector, $E_{speaker}$ is the speaker embedding vector, and $E_{token}$ is the word embedding vector.
7. The method for multi-level feature enhanced medical session summarization of claim 6 wherein: the specific method of the step 3.3 is as follows:
each utterance is fed separately into the Chinese pre-trained language model RoBERTa, with the classification symbol [CLS] inserted in front of it; the output corresponding to [CLS] is taken as the semantic representation of the utterance, calculated as:

$$r_i=\mathrm{RoBERTa}\left([CLS],\,w_{i1},\,w_{i2},\,\ldots,\,w_{il}\right)\qquad(10)$$

where $r_i$ is the semantic representation of the $i$-th utterance and $w_{i1},w_{i2},\ldots,w_{il}$ are the words of the $i$-th utterance;

then each word is paired with the semantic vector of the utterance containing it, and the encoder attention score of each input word is calculated using this utterance semantic vector, as follows:

$$e^t_{il}=v^{\top}\tanh\left(W_e\,h^e_{il}+W_d\,h^d_t+W_r\,r_i\right)\qquad(11)$$

where $e^t_{il}$ is the attention score of word $w_{il}$ at time $t$, $v^{\top}$ is the inner-product vector used to calculate the attention score, and $W_e$, $W_d$ and $W_r$ are the weight matrices corresponding to the encoder hidden state $h^e_{il}$, the decoder hidden state $h^d_t$ and the utterance semantic vector $r_i$, respectively.
8. The method for multi-level feature enhanced medical session summarization of claim 7 wherein: the specific method of the step 4 is as follows:
step 4.1: initialize the encoder input; the words of the segmented medical dialogues are randomly initialized as word embedding vectors, with values drawn from the normal distribution N(0, 1), using the previously constructed medical dialogue vocabulary; the speaker embedding vectors are randomly initialized in the same way, with values drawn from N(0, 1), using a vocabulary containing only the two tokens "doctor" and "patient"; the two embeddings are set to the same dimension, and the word embedding and the speaker embedding are added together as the encoder input;
step 4.2: initialize the decoder input; the words of the segmented summary are mapped to One-Hot vectors using the previously constructed medical dialogue vocabulary, and the special start symbol <SOS> is added at the beginning as the start marker of the decoder input;
step 4.3: construct the reference decoder output; the words of the segmented summary are mapped to One-Hot vectors using the previously constructed medical dialogue vocabulary and used as the reference output of the decoder;
step 4.4: training a model;
step 4.5: test the model's effectiveness.
9. The method for multi-level feature enhanced medical session summarization of claim 8, wherein: the specific method of the step 4.4 is as follows:
step 4.4.1: set the loss function; the loss function is the cross-entropy loss, calculated as follows:

$$loss=-\frac{1}{T}\sum_{t=1}^{T}\log P\left(w^{*}_{t}\right)\qquad(12)$$

where $loss$ is the loss function, $T$ is the total number of decoding time steps, and $P(w^{*}_{t})$ is the generation probability of the target word $w^{*}_{t}$;
step 4.4.2: set the training mode; the model is trained with Teacher Forcing;
step 4.4.3: mini-batch gradient descent; set the batch size, feed the divided mini-batches into the iteration loop, compute the average loss, and train the model by gradient descent.
CN202211692317.7A 2022-12-28 2022-12-28 Medical dialogue abstracting method with multi-level characteristic enhancement Pending CN116127056A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211692317.7A CN116127056A (en) 2022-12-28 2022-12-28 Medical dialogue abstracting method with multi-level characteristic enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211692317.7A CN116127056A (en) 2022-12-28 2022-12-28 Medical dialogue abstracting method with multi-level characteristic enhancement

Publications (1)

Publication Number Publication Date
CN116127056A true CN116127056A (en) 2023-05-16

Family

ID=86309397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211692317.7A Pending CN116127056A (en) 2022-12-28 2022-12-28 Medical dialogue abstracting method with multi-level characteristic enhancement

Country Status (1)

Country Link
CN (1) CN116127056A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116541505A (en) * 2023-07-05 2023-08-04 华东交通大学 Dialogue abstract generation method based on self-adaptive dialogue segmentation
CN116541505B (en) * 2023-07-05 2023-09-19 华东交通大学 Dialogue abstract generation method based on self-adaptive dialogue segmentation
CN116759077A (en) * 2023-08-18 2023-09-15 北方健康医疗大数据科技有限公司 Medical dialogue intention recognition method based on intelligent agent
CN117009501A (en) * 2023-10-07 2023-11-07 腾讯科技(深圳)有限公司 Method and related device for generating abstract information
CN117009501B (en) * 2023-10-07 2024-01-30 腾讯科技(深圳)有限公司 Method and related device for generating abstract information
CN117370535A (en) * 2023-12-05 2024-01-09 粤港澳大湾区数字经济研究院(福田) Training method of medical dialogue model, medical query method, device and equipment
CN117370535B (en) * 2023-12-05 2024-04-16 粤港澳大湾区数字经济研究院(福田) Training method of medical dialogue model, medical query method, device and equipment

Similar Documents

Publication Publication Date Title
CN109670179B (en) Medical record text named entity identification method based on iterative expansion convolutional neural network
CN109697285B (en) Hierarchical BilSt Chinese electronic medical record disease coding and labeling method for enhancing semantic representation
CN116127056A (en) Medical dialogue abstracting method with multi-level characteristic enhancement
CN109635280A (en) A kind of event extraction method based on mark
CN109522546A (en) Entity recognition method is named based on context-sensitive medicine
CN110427486B (en) Body condition text classification method, device and equipment
CN109977199A (en) A kind of reading understanding method based on attention pond mechanism
Hu et al. PLANET: Dynamic content planning in autoregressive transformers for long-form text generation
CN113408430B (en) Image Chinese description system and method based on multi-level strategy and deep reinforcement learning framework
CN113128203A (en) Attention mechanism-based relationship extraction method, system, equipment and storage medium
Zhang et al. Improving Stack Overflow question title generation with copying enhanced CodeBERT model and bi-modal information
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
CN111241397A (en) Content recommendation method and device and computing equipment
CN111339772B (en) Russian text emotion analysis method, electronic device and storage medium
CN111125333A (en) Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
Kim et al. Automatic classification of the Korean triage acuity scale in simulated emergency rooms using speech recognition and natural language processing: a proof of concept study
CN112613322A (en) Text processing method, device, equipment and storage medium
CN116992007A (en) Limiting question-answering system based on question intention understanding
US20220189333A1 (en) Method of generating book database for reading evaluation
CN113705207A (en) Grammar error recognition method and device
Mahajan et al. IBMResearch at MEDIQA 2021: toward improving factual correctness of radiology report abstractive summarization
CN115964475A (en) Dialogue abstract generation method for medical inquiry
CN115062603A (en) Alignment enhancement semantic parsing method, alignment enhancement semantic parsing device and computer program product
KR102418260B1 (en) Method for analyzing customer consultation record
CN113012685B (en) Audio recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination