CN117009501B - Method and related device for generating abstract information - Google Patents

Method and related device for generating abstract information

Info

Publication number
CN117009501B
CN117009501B (application CN202311284994.XA)
Authority
CN
China
Prior art keywords
dialogue
information
class
sentences
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311284994.XA
Other languages
Chinese (zh)
Other versions
CN117009501A (en)
Inventor
江旺杰
黄予
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202311284994.XA priority Critical patent/CN117009501B/en
Publication of CN117009501A publication Critical patent/CN117009501A/en
Application granted granted Critical
Publication of CN117009501B publication Critical patent/CN117009501B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a summary information generation method and a related device, which can be applied to the technical field of natural language processing. The method comprises the following steps: acquiring role dialogue information; extracting features of the role dialogue information to generate dialogue feature vectors; processing the dialogue feature vectors through M category information labeling modules in a multi-branch sequence labeling model to obtain M category information prediction probability distribution vectors; determining M groups of category abstract sentences from the N dialogue sentences according to the N category information prediction probability values corresponding to each category information prediction probability distribution vector; and splicing each group of category abstract sentences in the M groups of category abstract sentences to generate M pieces of category abstract information. The method provided by the embodiments of the application improves the accuracy of abstract information generation, avoids the loss of important information when extracting each category's abstract information from the role dialogue information, and is applicable to medical dialogue abstract generation scenarios.

Description

Method and related device for generating abstract information
Technical Field
The present disclosure relates to the field of natural language processing technologies, and in particular, to a method and an apparatus for generating abstract information.
Background
A conversation abstract condenses an entire conversation into a brief overview, making it possible to grasp the main content of the conversation more efficiently. It plays an important role in many application scenarios, such as conferences involving multiple participants and doctor-patient dialogues.
In medical conversation summarization, the focus of the task is to distill symptom summaries and diagnosis summaries from the conversation between doctor and patient. The distilled information helps provide medical services to patients with similar problems, and also helps doctors make appropriate diagnoses and provide reasonable treatment.
At present, dialogue abstract generation methods are mainly generative methods based on Seq2Seq neural network models. Such a method usually generates the abstract from scratch autoregressively, so decoding is slow and the input text cannot be effectively reused; as a result, the generated symptom abstracts and diagnosis abstracts are of poor quality, and key information in the medical dialogue is easily missed.
Disclosure of Invention
The embodiment of the application provides a summary information generation method and a related device, which solve the problem of poor quality of summary information generated according to dialogue information in the prior art.
One aspect of the present application provides a summary information generating method, including:
acquiring role dialogue information, wherein the role dialogue information comprises N dialogue sentences, each dialogue sentence in the N dialogue sentences carries a role identifier, and N is an integer greater than 1;
extracting features of the role dialogue information to generate dialogue feature vectors, wherein the dialogue feature vectors comprise N dialogue sentence feature values, and the N dialogue sentence feature values correspond to the N dialogue sentences;
the dialogue feature vector is used as input of a multi-branch sequence labeling model, M category information labeling modules in the multi-branch sequence labeling model are used for processing the dialogue feature vector to obtain M category information prediction probability distribution vectors, wherein each category information prediction probability distribution vector comprises N category information prediction probability values, and each category information prediction probability value is used for representing the possibility that a dialogue sentence belongs to category information;
determining M groups of category abstract sentences from the N dialogue sentences according to the N category information prediction probability values corresponding to each category information prediction probability distribution vector in the M category information prediction probability distribution vectors, wherein the M groups of category abstract sentences correspond to the M category information;
and splicing each group of class abstract sentences in the M groups of class abstract sentences to generate M class abstract information.
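The five claimed steps can be sketched end to end in a few lines of plain Python. This is an illustrative sketch only, not the patented implementation: `model_predict` is a hypothetical stand-in for the multi-branch sequence labeling model, and `thresholds` stands in for the per-category probability thresholds.

```python
def generate_summaries(dialogue, model_predict, thresholds):
    # dialogue: list of (role_id, sentence) pairs, i.e. role dialogue information.
    # model_predict: hypothetical stand-in for the multi-branch sequence
    # labeling model; maps the N sentences to M probability vectors of length N.
    sentences = [text for _role, text in dialogue]
    prob_vectors = model_predict(sentences)  # M x N prediction probabilities
    summaries = []
    for probs, threshold in zip(prob_vectors, thresholds):
        # Keep the sentences whose predicted probability exceeds the
        # category's threshold, then splice them into one category abstract.
        picked = [s for s, p in zip(sentences, probs) if p > threshold]
        summaries.append(" ".join(picked))
    return summaries
```

With M = 2 categories (e.g. symptoms and diagnosis), the function returns two spliced abstract strings, one per category.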
Another aspect of the present application provides a summary information generating apparatus, including: the system comprises a role dialogue information acquisition module, a feature extraction module, a probability distribution vector prediction module, a category abstract statement determination module and a category abstract information generation module; specific:
the role dialogue information acquisition module is used for acquiring role dialogue information, wherein the role dialogue information comprises N dialogue sentences, each dialogue sentence in the N dialogue sentences carries a role identifier, and N is an integer greater than 1;
the feature extraction module is used for carrying out feature extraction on the role dialogue information to generate dialogue feature vectors, wherein the dialogue feature vectors comprise N dialogue sentence feature values, and the N dialogue sentence feature values correspond to the N dialogue sentences;
the probability distribution vector prediction module is used for taking the dialogue feature vector as the input of the multi-branch sequence labeling model, and processing the dialogue feature vector through M category information labeling modules in the multi-branch sequence labeling model to obtain M category information prediction probability distribution vectors, wherein each category information prediction probability distribution vector comprises N category information prediction probability values, each category information prediction probability value is used for representing the possibility that a dialogue sentence belongs to category information, and M is an integer greater than or equal to 1;
The class abstract sentence determining module is used for determining M groups of class abstract sentences from the N dialogue sentences according to the N class information prediction probability values corresponding to each class information prediction probability distribution vector in the M class information prediction probability distribution vectors, wherein the M groups of class abstract sentences correspond to the M class information;
the class abstract information generation module is used for splicing each group of class abstract sentences in the M groups of class abstract sentences to generate M class abstract information.
In another implementation manner of the embodiment of the present application, the probability distribution vector prediction module is further configured to:
inputting the dialogue feature vector into the fully connected layer of each category information labeling module in the M category information labeling modules, and performing fully connected processing on the dialogue feature vector through the fully connected layers of the M category information labeling modules to generate M fully connected feature vectors, wherein each fully connected feature vector comprises N fully connected feature values;
inputting the M fully connected feature vectors into the compression function layer of each category information labeling module in the M category information labeling modules, and compressing the M fully connected feature vectors through the compression function layers of the M category information labeling modules to generate M category information prediction probability distribution vectors.
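A minimal numeric sketch of one category information labeling branch. The choice of sigmoid as the "compression function" is an assumption for illustration (the patent does not name the function): the fully connected layer maps each sentence feature vector to a scalar logit, and the compression function squashes it into a probability in (0, 1).

```python
import math

def sigmoid(x):
    # Compression function layer (assumed sigmoid): squash a logit into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def branch_predict(sentence_features, weights, bias):
    # Fully connected layer of one category labeling module: one scalar
    # logit per dialogue sentence, then compressed into a probability.
    logits = [sum(w * f for w, f in zip(weights, feat)) + bias
              for feat in sentence_features]
    return [sigmoid(z) for z in logits]

def multi_branch_predict(sentence_features, branches):
    # branches: M (weights, bias) pairs, one per category labeling module.
    # Result: M category information prediction probability distribution vectors.
    return [branch_predict(sentence_features, w, b) for w, b in branches]
```

Each branch runs over the same N sentence features, so the output is an M x N grid of prediction probability values.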
In another implementation manner of the embodiment of the present application, the category summary statement determination module is further configured to:
obtaining M probability thresholds corresponding to M category information;
and determining M groups of class abstract sentences from the N dialogue sentences according to the M class information prediction probability values corresponding to each dialogue sentence in the N dialogue sentences and the M probability thresholds, wherein the class information prediction probability value of each class abstract sentence is greater than the probability threshold corresponding to its class information.
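The threshold comparison above can be written directly. Sentence selection for one category is independent of the others, so the M groups may overlap (a sentence can appear in several category abstracts). Function and variable names are illustrative.

```python
def select_category_sentences(sentences, prob_vectors, thresholds):
    # For each of the M categories, keep the dialogue sentences whose
    # prediction probability exceeds that category's probability threshold.
    groups = []
    for probs, threshold in zip(prob_vectors, thresholds):
        groups.append([s for s, p in zip(sentences, probs) if p > threshold])
    return groups
```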
In another implementation manner of the embodiment of the present application, the role dialogue information obtaining module is further configured to:
acquiring original dialogue information and at least one role information, wherein the original dialogue information comprises N original sentences;
and matching the N original sentences with at least one role information to generate role dialogue information, wherein the role dialogue information comprises the N original sentences carrying the role identification.
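One simple way to match original sentences to role information is by speaker prefix. The `"Doctor:"`/`"Patient:"` style prefixes and the colon delimiter below are assumptions for illustration; the patent does not specify the matching rule.

```python
def attach_role_ids(original_sentences, role_ids):
    # Match each original sentence to a role by its speaker prefix, producing
    # role dialogue information: (role identifier, sentence text) pairs.
    tagged = []
    for sentence in original_sentences:
        speaker, _, text = sentence.partition(":")
        role = speaker.strip() if speaker.strip() in role_ids else "unknown"
        tagged.append((role, text.strip()))
    return tagged
```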
In another implementation manner of the embodiment of the present application, the feature extraction module is further configured to:
taking the role dialogue information as the input of a text abstract layer in a feature extraction model, and extracting hidden semantic features of the role dialogue information through the text abstract layer to generate hidden semantic vectors;
taking the role dialogue information as the input of a feature embedding layer in the feature extraction model, and performing feature mapping on the role dialogue information through the feature embedding layer to generate role embedding feature vectors;
and carrying out feature fusion on the hidden semantic vector and the role embedding feature vector to generate a dialogue feature vector.
In another implementation manner of the embodiment of the present application, the feature extraction module is further configured to:
splicing the hidden semantic vector and the role embedding feature vector to generate a feature splicing vector;
and taking the feature splicing vector as the input of a self-attention mechanism layer in the feature extraction model to perform feature fusion, generating a dialogue feature vector.
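A toy version of this fusion step, kept dependency-free by using identity query/key/value projections; a real model would use learned projection matrices. All names here are illustrative.

```python
import math

def self_attention(vectors):
    # Scaled dot-product self-attention with identity projections: every
    # output row is an attention-weighted mix of all input rows.
    dim = len(vectors[0])
    fused = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]  # stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]
        fused.append([sum(w * value[i] for w, value in zip(weights, vectors))
                      for i in range(dim)])
    return fused

def fuse_features(hidden_vectors, role_embeddings):
    # Splice (concatenate) the hidden semantic vector and the role embedding
    # feature vector per sentence, then fuse across sentences with attention.
    spliced = [h + r for h, r in zip(hidden_vectors, role_embeddings)]
    return self_attention(spliced)
```

Because the attention weights are a convex combination, each fused component stays within the range of the corresponding spliced components.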
In another implementation manner of the embodiment of the present application, the category summary information generating module is further configured to:
acquiring N original sequence numbers corresponding to the N dialogue sentences in the role dialogue information, wherein the original sequence numbers are used for representing the position of each dialogue sentence in the role dialogue information;
determining the splicing sequence number of each dialogue sentence in each group of class abstract sentences according to N original sequence numbers corresponding to the N dialogue sentences;
and sequentially splicing each group of class abstract sentences in the M groups of class abstract sentences according to the splicing sequence number of each dialogue sentence in each group of class abstract sentences to generate M classes of abstract information.
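Splicing by original sequence number is a sort-then-join; the sketch below assumes each selected sentence is paired with its original index, which is a representational choice for illustration.

```python
def splice_summary(indexed_sentences):
    # indexed_sentences: (original sequence number, sentence) pairs for one
    # group of class abstract sentences. Restore dialogue order, then join
    # the sentences into one piece of class abstract information.
    ordered = sorted(indexed_sentences, key=lambda pair: pair[0])
    return " ".join(text for _idx, text in ordered)
```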
In another implementation manner of the embodiment of the present application, the probability distribution vector prediction module is further configured to use a dialogue feature vector as an input of a dual-branch sequence labeling model, process the dialogue feature vector through a first class information labeling module in the dual-branch sequence labeling model to obtain a first class information prediction probability distribution vector, and process the dialogue feature vector through a second class information labeling module in the dual-branch sequence labeling model to obtain a second class information prediction probability distribution vector, where the first class information prediction probability distribution vector includes N first class information prediction probability values, the first class information prediction probability values are used to characterize a possibility that a dialogue sentence belongs to first class information, and the second class information prediction probability distribution vector includes N second class information prediction probability values, and the second class information prediction probability values are used to characterize a possibility that the dialogue sentence belongs to second class information;
the class abstract sentence determining module is further used for determining a first class abstract sentence from the N dialogue sentences according to the N first class information prediction probability values and determining a second class abstract sentence from the N dialogue sentences according to the N second class information prediction probability values;
The category abstract information generation module is also used for splicing the first category abstract sentences to generate first category abstract information, and splicing the second category abstract sentences to generate second category abstract information.
In another implementation manner of the embodiment of the present application, the category summary statement determination module is further configured to:
acquiring a first category information probability threshold;
taking the dialogue sentences, among the N dialogue sentences, whose first category information prediction probability values are greater than or equal to the first category information probability threshold, as first category abstract sentences;
acquiring a second category information probability threshold;
and taking the dialogue sentences, among the N dialogue sentences, whose second category information prediction probability values are greater than or equal to the second category information probability threshold, as second category abstract sentences.
In another implementation manner of the embodiment of the present application, the summary information generating apparatus further includes: the system comprises a training data acquisition module, a training data feature extraction module, a training probability distribution vector prediction module and a loss function generation module; specific:
the training data acquisition module is used for acquiring training role dialogue information samples, wherein the training role dialogue information samples comprise N training dialogue sentences, each training dialogue sentence in the N training dialogue sentences carries a role identifier, each training role dialogue information sample carries M category information tag sequences, each category information tag sequence comprises N category information tags, each category information tag is used for representing the corresponding relation between the training dialogue sentence and the category information, and N is an integer larger than 1;
The training data feature extraction module is used for carrying out feature extraction on the training character dialogue information sample to generate a training dialogue feature vector, wherein the training dialogue feature vector comprises N training dialogue sentence feature values, and the N training dialogue sentence feature values correspond to the N training dialogue sentences;
the training probability distribution vector prediction module is used for taking training dialogue feature vectors as the input of the multi-branch sequence labeling model, and processing the training dialogue feature vectors through M category information labeling modules in the multi-branch sequence labeling model to obtain M training category information prediction probability distribution vectors, wherein each training category information prediction probability distribution vector comprises N training category information prediction probability values, and each training category information prediction probability value is used for representing the possibility that training dialogue sentences belong to category information;
and the loss function generation module is used for generating a total loss function according to the M training class information prediction probability distribution vectors and the M class information tag sequences, and carrying out parameter optimization on the multi-branch sequence annotation model through the total loss function.
In another implementation manner of the embodiment of the present application, the loss function generating module is further configured to:
generating M category loss functions according to the N training category information prediction probability values in each training category information prediction probability distribution vector and the N category information labels in the category information label sequence corresponding to the same category information;
and generating a total loss function from the M category loss functions.
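Assuming each branch is trained with binary cross-entropy against its 0/1 label sequence (the patent names neither the per-branch loss nor the combination rule; both are assumptions here), the total loss can simply be the sum of the M category losses:

```python
import math

def bce_loss(probs, labels):
    # Per-category loss: mean binary cross-entropy over the N sentences.
    eps = 1e-12  # guard against log(0)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(probs, labels)) / len(probs)

def total_loss(pred_vectors, label_sequences):
    # Sum the M category loss functions into the total loss used to
    # optimize the multi-branch sequence labeling model's parameters.
    return sum(bce_loss(p, y) for p, y in zip(pred_vectors, label_sequences))
```

A confident correct prediction yields a near-zero loss, while a confident wrong prediction is heavily penalized, which is the gradient signal each branch needs.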
Another aspect of the present application provides a computer device comprising:
memory, transceiver, processor, and bus system;
wherein the memory is used for storing programs;
the processor is used for executing the program in the memory to perform the methods of the above aspects;
the bus system is used to connect the memory and the processor to communicate the memory and the processor.
Another aspect of the present application provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the methods of the above aspects.
Another aspect of the present application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the above aspects.
From the above technical solutions, the embodiments of the application have the following advantages:
the application provides a summary information generation method and a related device, wherein the method comprises the following steps: acquiring role dialogue information, wherein the role dialogue information comprises N dialogue sentences, each dialogue sentence in the N dialogue sentences carries a role identifier, and N is an integer greater than 1; extracting features of the role dialogue information to generate dialogue feature vectors, wherein the dialogue feature vectors comprise N dialogue sentence feature values, and the N dialogue sentence feature values correspond to the N dialogue sentences; taking the dialogue feature vector as the input of a multi-branch sequence labeling model, and processing the dialogue feature vector through M category information labeling modules in the multi-branch sequence labeling model to obtain M category information prediction probability distribution vectors, wherein each category information prediction probability distribution vector comprises N category information prediction probability values, and each category information prediction probability value is used for representing the possibility that a dialogue sentence belongs to category information; determining M groups of category abstract sentences from the N dialogue sentences according to the N category information prediction probability values corresponding to each category information prediction probability distribution vector in the M category information prediction probability distribution vectors, wherein the M groups of category abstract sentences correspond to the M category information; and splicing each group of category abstract sentences in the M groups of category abstract sentences to generate M category abstract information.
According to the method provided by the embodiments of the application, multi-category probability prediction is performed on the role dialogue information carrying role identifiers through the multi-branch sequence labeling model, so as to obtain, for each dialogue sentence in the role dialogue information, the prediction probability value of the sentence belonging to each category. The dialogue sentences belonging to each category are then obtained from the prediction probability values, and the dialogue sentences of each category are spliced to obtain the abstract information of that category. This improves the accuracy of abstract information generation, avoids the loss of important information when extracting each category's abstract information from the role dialogue information, and is applicable to medical dialogue abstract generation scenarios.
Drawings
FIG. 1 is a schematic diagram of a summary information generation system according to an embodiment of the present application;
FIG. 2 is a flowchart of a summary information generation method according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of a feature extraction model according to an embodiment of the present application;
FIG. 4 is a block diagram of a multi-branch sequence annotation model according to an embodiment of the present application;
FIG. 5 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 6 is a block diagram of a dual-branch sequence annotation model according to an embodiment of the present application;
FIG. 7 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 8 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 9 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 10 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 11 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 12 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 13 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 14 is a flowchart of a summary information generation method according to another embodiment of the present application;
FIG. 15 is an example of a medical session abstract provided in an embodiment of the application;
FIG. 16 is a flow chart of a summary generation method for medical dialogue based on sentence-level text editing provided in an embodiment of the present application;
FIG. 17 is a block diagram of a sentence-level editing operation prediction model provided in accordance with one embodiment of the present application;
fig. 18 is a schematic structural diagram of a summary information generating apparatus according to an embodiment of the present application;
fig. 19 is a schematic structural diagram of a summary information generating apparatus according to another embodiment of the present application;
fig. 20 is a schematic structural diagram of a server according to an embodiment of the present application.
Description of the embodiments
The embodiments of the application provide a summary information generation method, which performs multi-category probability prediction on role dialogue information carrying role identifiers through a multi-branch sequence labeling model, so as to obtain, for each dialogue sentence in the role dialogue information, the prediction probability value of the sentence belonging to each category. The dialogue sentences belonging to each category are then obtained from the prediction probability values, and the dialogue sentences of each category are spliced to obtain the abstract information of that category. This improves the accuracy of abstract information generation, avoids the loss of important information when extracting each category's abstract information from the role dialogue information, and is applicable to medical dialogue abstract generation scenarios.
The terms "first," "second," "third," "fourth" and the like in the description, in the claims of this application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, so that the embodiments of the present application described herein may, for example, be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "includes," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is thus the study of the design principles and implementation methods of various intelligent machines, enabling machines to perceive, reason and make decisions.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Natural language processing (Natural Language Processing, NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field therefore involves natural language, i.e., the language people use daily, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graph techniques, and the like.
It should be noted that, in the application of the present application, the relevant data collection process should strictly obtain the informed consent or the individual consent of the personal information body according to the requirements of the relevant national laws and regulations, and develop the subsequent data use and processing actions within the authorized range of the laws and regulations and the personal information body.
In order to facilitate understanding of the technical methods provided in the embodiments of the present application, some key terms used in the embodiments of the present application are explained here:
Text summarization: a technology for highly condensing textual information by generating a short, concise text. By summarization mode, it can be divided into: extractive summarization, abstractive (generative) summarization, and compressive summarization.
Extractive summarization: this mode mainly uses an algorithm to extract ready-made sentences from the source document as summary sentences. Fluency is generally better than with abstractive summarization, but too much redundant information may be introduced, and the result may not read like a purpose-written summary.
Abstractive summarization: based on natural language generation (NLG) technology, an algorithm model generates a natural-language description from the content of the source document rather than extracting sentences from the original text. Using deep learning techniques (mainly sequence-to-sequence models), the original document is interpreted and shortened much as a human would do. Because abstractive algorithms can generate new phrases and sentences representing the most important information in the source text, they can avoid the grammatical awkwardness of extraction-based techniques. Recently, following the advent of large pre-trained models such as BERT, much effort has focused on using autoregressive pre-trained models for NLG tasks, including abstractive summarization. In addition, because annotated summary data is often lacking in real environments, much work has focused on unsupervised abstractive summarization using autoencoders or related ideas. Although abstractive summarization performs better on text summaries, developing such algorithms requires complex deep learning techniques and sophisticated language modeling, so extractive text summarization methods remain widely popular.
Compressive summarization: somewhat similar in pattern to abstractive summarization, but with a different purpose. The main aim of compressive summarization is to filter out redundant information in the source document, compressing the original text to obtain the corresponding summary content.
Medical dialogue summarization (Medical Dialogue Summarization): a subtask of the text summarization task in the field of natural language processing, which aims to condense and summarize the content of doctor-patient dialogue into a brief summary.
Text Editing (Text Editing): a newer text generation method, characterized by first predicting the editing operations corresponding to the input text and then applying those operations to obtain the output.
Seq2Seq: sequence to Sequence, a model architecture for sequence-to-sequence learning.
The focus of the medical dialogue summary task is to summarize the dialogue between doctor and patient along two main dimensions, patient symptoms and doctor diagnosis, corresponding to two target outputs: a symptom abstract and a diagnosis abstract. Related-art methods fall into two types: generative methods based on Seq2Seq, and extractive methods based on sentence labeling.
Seq2Seq-based generative methods usually generate the summary from scratch autoregressively, so decoding is slow and the input text cannot be effectively reused. Extractive methods based on sentence labeling are also coarse in their design of multi-turn dialogue feature extraction, making it difficult to obtain effective features for each turn of dialogue. In addition, because sentence labeling assigns a single label to each turn, such methods cannot handle the case where a dialogue sentence belongs to both the symptom abstract and the diagnosis abstract.
The application provides a summary information generation method, which comprises the following steps: acquiring role dialogue information, the role dialogue information comprising N dialogue sentences carrying role identifiers; performing feature extraction on the role dialogue information to generate a dialogue feature vector, the dialogue feature vector comprising N dialogue sentence feature values corresponding to the N dialogue sentences; using the dialogue feature vector as input to a multi-branch sequence labeling model, and processing it through M category information labeling modules in the model to obtain M category information prediction probability distribution vectors, each comprising N category information prediction probability values, each probability value representing the likelihood that a dialogue sentence belongs to the corresponding category information; determining M groups of category abstract sentences from the N dialogue sentences according to the N prediction probability values in each of the M prediction probability distribution vectors, the M groups of category abstract sentences corresponding to the M category information; and splicing each of the M groups of category abstract sentences to generate M pieces of category abstract information.
According to the method provided by the embodiment of the application, multi-category probability prediction is performed on the role dialogue information carrying role identifiers by the multi-branch sequence labeling model, yielding, for each dialogue sentence in the role dialogue information, a prediction probability value for each category. The dialogue sentences belonging to each category are then selected by these prediction probability values and spliced to obtain that category's abstract information. This improves the accuracy of abstract information generation, avoids the omission of important information when each category's abstract information is extracted from the role dialogue information, and is applicable to medical dialogue abstract generation scenarios.
For ease of understanding, referring to fig. 1, fig. 1 is an application environment diagram of a summary information generating method in an embodiment of the present application. As shown in fig. 1, the method is applied to a summary information generation system comprising a server and a terminal device. The server may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), big data, and artificial intelligence platforms. The terminal may be, but is not limited to, a smartphone, tablet computer, notebook computer, desktop computer, smart speaker, smart watch, etc. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited herein.
Firstly, the server acquires role dialogue information, the role dialogue information comprising N dialogue sentences, each of which carries a role identifier, N being an integer greater than 1;
secondly, the server performs feature extraction on the role dialogue information to generate a dialogue feature vector comprising N dialogue sentence feature values corresponding to the N dialogue sentences;
then, the server uses the dialogue feature vector as input to a multi-branch sequence labeling model, processing it through M category information labeling modules in the model to obtain M category information prediction probability distribution vectors, each comprising N category information prediction probability values, each probability value representing the likelihood that a dialogue sentence belongs to the corresponding category information;
next, the server determines M groups of category abstract sentences from the N dialogue sentences according to the N prediction probability values in each of the M prediction probability distribution vectors, the M groups corresponding to the M category information;
and finally, the server splices each of the M groups of category abstract sentences to generate M pieces of category abstract information.
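The five server-side steps above can be sketched end to end as follows. This is a hedged illustration only: the feature extractor (S120) and the M labeling modules (S130) are stood in for by random projections, and the sentences, dimensions, thresholds, and category count are illustrative assumptions, not values from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(42)
sentences = ["Patient: hello doctor",
             "Patient: what medicine for asthma?",
             "Doctor: try an oral anti-allergy medicine",
             "Patient: thank you"]                     # S110: role dialogue information
N, d, M = len(sentences), 8, 2

F = rng.normal(size=(N, d))                            # S120: dialogue feature vector (stub)
heads = [rng.normal(size=(d,)) for _ in range(M)]      # S130: M labeling modules (stubs)
prob_vectors = [sigmoid(F @ w) for w in heads]         # M prediction probability vectors
thresholds = [0.5] * M

summaries = []                                         # S140 + S150
for probs, threshold in zip(prob_vectors, thresholds):
    kept = [s for s, p in zip(sentences, probs) if p >= threshold]
    summaries.append(" ".join(kept))                   # one abstract per category
```

Each entry of `summaries` is one category's spliced abstract information; with a trained extractor and labeling heads the kept sentences would be meaningful rather than random.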
The summary information generating method in the present application will be described from the perspective of the server. Referring to fig. 2, the summary information generating method provided in the embodiment of the present application includes steps S110 to S150. Specifically:
S110, acquiring role dialogue information.
The role dialogue information comprises N dialogue sentences, each dialogue sentence carries a role identifier, and N is an integer greater than 1.
It will be appreciated that the role dialogue information is formed from at least one dialogue sentence carrying a role identifier; that is, it includes multiple turns of dialogue sentences, each turn being a dialogue message sent or output by a role object. The role dialogue information may be obtained directly as dialogue sentences already carrying role identifiers, or original sentences without role identifiers may be obtained first, with a role identifier then attached to each original sentence to produce the dialogue sentences.
Taking a medical dialogue as an example, in one embodiment, dialogue sentences carrying role identifiers may be directly acquired:
patient: the doctor gets his own.
Patient: variability what drugs asthma takes, each time a cold will cause.
The doctor: generally, the medicine can be orally taken for sensitization of cis Ning Jiangdi and self-provided atomization.
Patient: is the drug?
Patient: what is the present symptom cough and nasal discharge?
The doctor: if there is infection, drink the anti-inflammatory and cough-relieving medicine for atomization.
Patient: the doctor is thanks to the metabolism.
The doctor: do not feel comfortable and congratulate you for recovery in the early days.
In another embodiment, the dialogue sentences carrying role identifiers are obtained by acquiring original sentences that do not carry role identifiers together with the role information of the dialogue, and then attaching a role identifier to each original sentence.
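This second approach can be sketched as follows; the function name and the role labels are illustrative assumptions, not from the patent.

```python
# Hypothetical sketch of the role prompt step: prepend a role identifier
# to each original sentence that lacks one.
def attach_role_identifiers(original_sentences, roles):
    """Prefix each original sentence with its speaker's role identifier."""
    assert len(original_sentences) == len(roles)
    return [f"{role}: {sentence}" for role, sentence in zip(roles, original_sentences)]

dialogue_sentences = attach_role_identifiers(
    ["What medicine should I take?", "Try an oral anti-allergy medicine."],
    ["Patient", "Doctor"],
)
```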
And S120, extracting features of the role dialogue information to generate dialogue feature vectors.
The dialogue feature vector comprises N dialogue statement feature values, and the N dialogue statement feature values correspond to the N dialogue statements.
It can be understood that feature extraction is implemented on the role dialogue information through a feature extraction model, so as to generate dialogue feature vectors, where the feature extraction model includes: a text abstract layer, a feature embedding layer and a feature fusion layer.
Referring to fig. 3, fig. 3 is a block diagram of a feature extraction model according to an embodiment of the present application. Firstly, character dialogue information is used as input of a text abstract layer in a feature extraction model, hidden semantic feature extraction is carried out on the character dialogue information through the text abstract layer, and hidden semantic vectors are generated; then, character dialogue information is used as the input of a feature embedding layer in a feature extraction model, and feature mapping is carried out on the character dialogue information through the feature embedding layer, so that character embedding feature vectors are generated; and finally, taking the hidden semantic vector and the character embedded feature vector as the input of a feature fusion layer, and carrying out feature fusion on the hidden semantic vector and the character embedded feature vector through the feature fusion layer to generate a dialogue feature vector.
The text abstract layer in the feature extraction model may be implemented by a BERT model or a BERTSUM model, and in this embodiment, the BERTSUM model is taken as an example for illustration.
For ease of understanding, the above feature extraction process is expressed in a formalized manner as follows:
Let the original sentences of a dialogue with N turns be X = {x_1, x_2, ..., x_N}, and the dialogue role of each turn be R = {r_1, r_2, ..., r_N}. Using the role prompt strategy, a role identifier is added to each original sentence to obtain the dialogue sentences S = {s_1, s_2, ..., s_N}, where s_i = r_i ⊕ x_i denotes the dialogue sentence with its role identifier, r_i the role identifier, and x_i the original sentence.
First, the role dialogue information S is used as the input of the BERTSUM model to extract hidden semantic features, generating the hidden semantic vectors H = {h_1, h_2, ..., h_N}. Then, the role identifiers contained in the role dialogue information are feature-mapped using a role embedding strategy to generate the role embedding feature vectors E = {e_1, e_2, ..., e_N}. Finally, the hidden semantic vectors and the role embedding feature vectors are fused through the feature fusion layer to generate the dialogue feature vectors F = {f_1, f_2, ..., f_N}, where H and E are both of size N × d (d being the feature dimension), and F is the final dialogue feature vector of the multi-turn dialogue sentences.
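The fusion step above can be illustrated with a small numerical sketch. All shapes and values here are stand-ins (a real H would come from BERTSUM), and element-wise addition is assumed as the fusion operator since the patent does not fix it.

```python
import numpy as np

# Illustrative sketch: hidden semantic vectors H (one row per dialogue
# sentence) are fused with role embedding vectors E to give the dialogue
# feature vectors F. Addition as the fusion operator is an assumption.
rng = np.random.default_rng(0)
N, d = 4, 8                            # 4 dialogue turns, feature dimension 8
role_ids = [0, 0, 1, 0]                # e.g. 0 = patient, 1 = doctor

H = rng.normal(size=(N, d))            # hidden semantic vectors (stub for BERTSUM output)
role_table = rng.normal(size=(2, d))   # learnable role-embedding table (2 roles)
E = role_table[role_ids]               # role embedding feature vector per sentence

F = H + E                              # fused dialogue feature vector, shape (N, d)
```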
S130, using the dialogue feature vector as input of a multi-branch sequence labeling model, and processing the dialogue feature vector through M category information labeling modules in the multi-branch sequence labeling model to obtain M category information prediction probability distribution vectors.
Each category information prediction probability distribution vector comprises N category information prediction probability values, each category information prediction probability value is used for representing the possibility that a dialogue sentence belongs to category information, and M is an integer greater than 1.
It can be understood that the multi-branch sequence labeling model adopts a multi-branch structure, i.e., it is composed of several parallel category information labeling modules. Each module judges whether each input dialogue sentence belongs to its category, applying a "keep" editing operation to sentences belonging to the category and a "delete" editing operation to sentences not belonging to it; the process can thus be regarded as a sentence-level sequence labeling process with 2 labels. The parallel category information labeling modules share the same network structure, each formed by a fully connected layer (Fully Connected Layer, FC layer) combined with a Sigmoid layer. Preferably, a CRF (Conditional Random Field) layer may be added to further model the dependencies between dialogue sentences of adjacent turns. M and N are not numerically related.
Referring to fig. 4, fig. 4 is a block diagram of a multi-branch sequence labeling model according to an embodiment of the present application. Firstly, inputting dialogue feature vectors into the full-connection layers of each of M category information labeling modules, and performing full-connection processing on the dialogue feature vectors through the full-connection layers of the M category information labeling modules to generate M full-connection feature vectors, wherein each full-connection feature vector comprises N full-connection feature values; and then, inputting the M full-connection feature vectors into a compression function layer of each category information labeling module in the M category information labeling modules, and performing compression processing on the M full-connection feature vectors through the compression function layers of the M category information labeling modules to generate M category information prediction probability distribution vectors.
For ease of understanding, a dual-branch sequence labeling model composed of two category information labeling modules is taken as an example. In the dual-branch sequence labeling model provided by the embodiment of the application, the model consists of two parallel labeling sub-modules, Tagger-P (the labeling module for category information P) and Tagger-D (the labeling module for category information D), which have the same model structure, each formed by a fully connected layer combined with a compression function layer.
The dialogue feature vector F is passed through Tagger-P to obtain the probability that each of the N dialogue sentences belongs to category information P, i.e., the prediction probability distribution vector of category information P: P_P = Sigmoid(F · W_P), where W_P ∈ R^(d×2) is a learnable fully connected layer parameter, and 2 is the number of editing operations, namely "keep" and "delete".
The dialogue feature vector F is likewise passed through Tagger-D to obtain the probability that each of the N dialogue sentences belongs to category information D, i.e., the prediction probability distribution vector of category information D: P_D = Sigmoid(F · W_D), where W_D ∈ R^(d×2) is a learnable fully connected layer parameter, and 2 is the number of editing operations, namely "keep" and "delete".
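The two parallel taggers can be sketched in numpy as follows. The weights are randomly initialized here purely for illustration; in the patent they are learned parameters, and the dimensions are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Sketch of Tagger-P and Tagger-D: each is a fully connected layer
# followed by a sigmoid layer, producing for every dialogue sentence a
# probability over the two editing operations ("keep"/"delete").
rng = np.random.default_rng(1)
N, d = 4, 8
F = rng.normal(size=(N, d))            # dialogue feature vector from S120 (stub)

W_p = rng.normal(size=(d, 2))          # learnable FC parameters of Tagger-P
W_d = rng.normal(size=(d, 2))          # learnable FC parameters of Tagger-D

probs_P = sigmoid(F @ W_p)             # category-P prediction probability distribution, (N, 2)
probs_D = sigmoid(F @ W_d)             # category-D prediction probability distribution, (N, 2)
keep_P = probs_P[:, 0]                 # per-sentence "keep" probability for category P
```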
S140, determining M groups of category abstract sentences from the N dialogue sentences according to the N category information prediction probability values in each of the M category information prediction probability distribution vectors.
Wherein the M groups of class abstract sentences correspond to the M class information.
It will be appreciated that, when each of the M category information prediction probability distribution vectors is obtained, a dialogue sentence belonging to the category needs to be determined from each category information prediction probability distribution vector.
Firstly, M probability thresholds corresponding to the M category information are obtained; then, M groups of category abstract sentences are determined from the N dialogue sentences according to the M category information prediction probability values corresponding to each of the N dialogue sentences and the M probability thresholds, wherein the category information prediction probability value of each category abstract sentence is greater than the probability threshold corresponding to that category information.
For example, first, the probability threshold θ_P corresponding to category information P and the probability threshold θ_D corresponding to category information D are obtained. Then, according to the prediction probability distribution vector of category information P and the threshold θ_P, the dialogue sentences whose prediction probability value is greater than or equal to θ_P are determined from the N dialogue sentences as the category abstract sentences corresponding to category information P. Likewise, according to the prediction probability distribution vector of category information D and the threshold θ_D, the dialogue sentences whose prediction probability value is greater than or equal to θ_D are determined from the N dialogue sentences as the category abstract sentences corresponding to category information D.
The above steps are illustrated by way of example for better understanding. Suppose the role dialogue information includes 5 dialogue sentences: a, b, c, d, e. The category information prediction probability value of each dialogue sentence with respect to categories P and D is calculated by the multi-branch sequence labeling model, as shown in Table 1.
TABLE 1
According to the category information prediction probability values of each dialogue sentence with respect to the two categories P and D, the category abstract sentence corresponding to category information P is determined to be dialogue sentence b (its prediction probability value 0.9 for category information P is greater than the probability threshold 0.8 of category information P), and the category abstract sentences corresponding to category information D are determined to be dialogue sentences c and d (sentence c's prediction probability value 0.9 for category information D is greater than the probability threshold 0.75 of category information D, and sentence d's prediction probability value 0.8 is likewise greater than 0.75).
In another case, no category abstract sentence may exist for a given category information. For example, suppose the role dialogue information includes 5 dialogue sentences: a, b, c, d, e, and the prediction probability value of each dialogue sentence with respect to categories P and D is calculated by the multi-branch sequence labeling model, as shown in Table 2.
TABLE 2
Based on the category information prediction probability values of each dialogue sentence with respect to the two categories P and D, the category abstract sentence corresponding to category information P is determined to be dialogue sentence b (its prediction probability value 0.9 for category information P is greater than the probability threshold 0.8 of category information P), while category information D has no category abstract sentence (because no dialogue sentence's prediction probability value for category information D exceeds the probability threshold 0.75 of category information D).
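The selection rule used in the first example above can be sketched as follows. Only the probability values actually quoted in the text (0.9 for sentence b under P; 0.9 and 0.8 for sentences c and d under D) are real; the remaining values are assumed to lie below the thresholds.

```python
# Sketch of the S140 threshold comparison.
def select_category_sentences(sentences, probs, threshold):
    """Keep the sentences whose prediction probability meets the threshold."""
    return [s for s, p in zip(sentences, probs) if p >= threshold]

sentences = ["a", "b", "c", "d", "e"]
probs_P = [0.1, 0.9, 0.3, 0.2, 0.1]   # only b's 0.9 is quoted in the text
probs_D = [0.1, 0.2, 0.9, 0.8, 0.1]   # c's 0.9 and d's 0.8 are quoted

summary_P = select_category_sentences(sentences, probs_P, 0.8)    # ["b"]
summary_D = select_category_sentences(sentences, probs_D, 0.75)   # ["c", "d"]
```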
And S150, splicing each group of class abstract sentences in the M groups of class abstract sentences to generate M class abstract information.
It can be understood that each group of category abstract sentences is spliced to obtain the category abstract information corresponding to that category information. Preferably, the sentences in each group are spliced in the order in which the corresponding dialogue sentences appear among the N dialogue sentences.
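The order-preserving splice can be sketched as follows; the function name and the space separator are illustrative assumptions.

```python
# Minimal sketch of S150: splice a group of category abstract sentences
# in the order of the original N dialogue sentences, then join them into
# one category abstract.
def splice_summary(dialogue_sentences, selected):
    """Concatenate the selected sentences, preserving dialogue order."""
    order = {s: i for i, s in enumerate(dialogue_sentences)}
    return " ".join(sorted(selected, key=lambda s: order[s]))

dialogue = ["a", "b", "c", "d", "e"]
summary_D = splice_summary(dialogue, ["d", "c"])   # "c d"
```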
According to the method provided by the embodiment of the application, multi-category probability prediction is performed on the role dialogue information carrying role identifiers by the multi-branch sequence labeling model, yielding, for each dialogue sentence in the role dialogue information, a prediction probability value for each category. The dialogue sentences belonging to each category are then selected by these prediction probability values and spliced to obtain that category's abstract information. This improves the accuracy of abstract information generation, avoids the omission of important information when each category's abstract information is extracted from the role dialogue information, and is applicable to medical dialogue abstract generation scenarios.
In an alternative embodiment of the summary information generating method provided in the embodiment corresponding to fig. 2 of the present application, referring to fig. 5, step S130 includes sub-steps S131 to S132. Specifically:
s131, inputting the dialogue feature vectors into the full-connection layers of each of the M category information labeling modules, and performing full-connection processing on the dialogue feature vectors through the full-connection layers of the M category information labeling modules to generate M full-connection feature vectors.
Wherein each full connection feature vector comprises N full connection feature values.
It will be appreciated that the dialog feature vector is mapped to the sample label space by the fully connected layer (Fully Connected Layer, FC layer) such that the dimension of the feature vector is equal to the dimension corresponding to the category information labeling module.
S132, inputting the M full-connection feature vectors into a compression function layer of each category information labeling module in the M category information labeling modules, and performing compression processing on the M full-connection feature vectors through the compression function layers of the M category information labeling modules to generate M category information prediction probability distribution vectors.
It is understood that the compression function layer may be implemented by a Sigmoid network layer.
The purpose of the category information labeling module is to predict the editing operations of the N dialogue sentences in the role dialogue information with respect to the M categories, i.e., to judge whether each of the N dialogue sentences should receive the "keep" editing operation or the "delete" editing operation for each category. This process can thus be regarded as a sentence-level sequence labeling process with 2 labels.
For easy understanding, a dual-branch sequence labeling model composed of two category information labeling modules is taken as an example for explanation. Referring to fig. 6, fig. 6 is a block diagram of a dual-branch sequence labeling model according to an embodiment of the present application. In the embodiment of the application, the compression function layer is realized through the Sigmoid network layer.
In the dual-branch sequence labeling model provided by the embodiment of the application, the model consists of two parallel labeling sub-modules, Tagger-P (the labeling module for category information P) and Tagger-D (the labeling module for category information D), which have the same model structure, each formed by an FC layer combined with a Sigmoid layer.
The dialogue feature vector F is passed through Tagger-P to obtain the probability that each of the N dialogue sentences belongs to category information P, i.e., the prediction probability distribution vector of category information P: P_P = Sigmoid(F · W_P), where W_P ∈ R^(d×2) is a learnable fully connected layer parameter, and 2 is the number of editing operations, namely "keep" and "delete".
The dialogue feature vector F is likewise passed through Tagger-D to obtain the probability that each of the N dialogue sentences belongs to category information D, i.e., the prediction probability distribution vector of category information D: P_D = Sigmoid(F · W_D), where W_D ∈ R^(d×2) is a learnable fully connected layer parameter, and 2 is the number of editing operations, namely "keep" and "delete".
According to the method provided by the embodiment of the application, the category information labeling module is formed by a fully connected layer and a compression function layer. The fully connected layer adjusts the dimension of the input dialogue feature vector to match the model dimension, and the compression function layer, implemented by a Sigmoid network layer, computes the probability for each feature value of the dialogue feature vector, yielding the category information prediction probability distribution vector. This vector fully reflects the degree of association between dialogue sentences and categories, with higher confidence, laying a foundation for selecting category abstract sentences from it and improving the accuracy of abstract information generation.
In an alternative embodiment of the summary information generating method provided in the embodiment corresponding to fig. 2 of the present application, referring to fig. 7, step S140 further includes sub-steps S141 to S142. Specifically:
s141, M probability thresholds corresponding to the M category information are obtained.
It is understood that different category information corresponds to different probability thresholds, which are set empirically.
S142, according to the M category information prediction probability values corresponding to each of the N dialogue sentences and the M probability thresholds, M groups of category abstract sentences are determined from the N dialogue sentences.
Wherein, the prediction probability value of the category information of the category abstract statement is larger than the probability threshold value corresponding to the category information.
It may be understood that, after the M category information prediction probability distribution vectors are obtained, the dialogue sentences belonging to each category need to be determined according to the corresponding category information prediction probability distribution vector; that is, for each of the M categories, the dialogue sentences whose prediction probability value is greater than or equal to the probability threshold of that category are determined from the N dialogue sentences as the category abstract sentences.
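The per-category thresholding of steps S141–S142 can be sketched as follows (a minimal illustration; the function name and example values are assumptions, not the patent's code):

```python
def select_category_abstract_sentences(sentences, probs, threshold):
    """Keep the dialogue sentences whose category information prediction
    probability value is greater than or equal to the probability threshold
    of the category; these become the category abstract sentences."""
    return [s for s, p in zip(sentences, probs) if p >= threshold]
```

Each of the M categories applies this selection with its own empirically set threshold.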
For example, the character dialogue information is:
Patient: the doctor gets his own.
Patient: variability what drugs asthma takes, each time a cold will cause.
The doctor: generally, the medicine can be orally taken for sensitization of cis Ning Jiangdi and self-provided atomization.
Patient: is the drug?
Patient: what is the present symptom cough and nasal discharge?
The doctor: if there is infection, drink the anti-inflammatory and cough-relieving medicine for atomization.
Patient: the doctor is thanks to the metabolism.
The doctor: do not feel comfortable and congratulate you for recovery in the early days.
For the role dialogue information containing the above 8 dialogue sentences, the prediction probability distribution vector of category information P is obtained in step S130, and the probability threshold of category information P obtained in step S141 is 0.85. According to the information prediction probability value corresponding to each of the 8 dialogue sentences and the probability threshold, the dialogue sentences whose information prediction probability value is greater than or equal to 0.85 are determined from the N dialogue sentences as category abstract sentences; that is, the dialogue sentence with information prediction probability value 0.89 is taken as the category abstract sentence of category information P: "Variability what drugs asthma takes, each time a cold will cause."
Similarly, for the role dialogue information containing the above 8 dialogue sentences, the prediction probability distribution vector of category information D is obtained in step S130, and the probability threshold of category information D obtained in step S141 is 0.84. According to the information prediction probability value corresponding to each of the 8 dialogue sentences and the probability threshold, the dialogue sentences whose information prediction probability value is greater than or equal to 0.84 are determined from the N dialogue sentences as category abstract sentences; that is, the dialogue sentence with information prediction probability value 0.88 is taken as a category abstract sentence of category information D: "Generally, the medicine can be orally taken for sensitization of cis Ning Jiangdi and self-provided atomization.", and the dialogue sentence with information prediction probability value 0.84 is taken as a category abstract sentence of category information D: "If there is infection, drink the anti-inflammatory and cough-relieving medicine for atomization."
According to the method provided in this embodiment of the present application, the category information prediction probability value corresponding to each category information is compared with the probability threshold, and the dialogue sentences whose prediction probability value is greater than or equal to the probability threshold are determined as category abstract sentences, thereby improving the accuracy of abstract information generation.
In an alternative embodiment of the summary information generating method provided in the corresponding embodiment of fig. 2 of the present application, referring to fig. 8, step S110 includes sub-steps S111 to S112. Specific:
S111, original dialogue information and at least one piece of role information are acquired.
Wherein the original dialogue information includes N original sentences.
It will be appreciated that the original sentences do not carry role identifiers, i.e. the original dialogue information is dialogue text without any role identifier. The role information is information about the roles that have a correspondence with the original dialogue information.
And S112, matching the N original sentences with at least one role information to generate role dialogue information.
The role dialogue information comprises N original sentences carrying role identifications.
It can be understood that, based on the role prompt policy, the N original sentences are matched with the role information to generate N original sentences carrying role identifiers, and these N original sentences carrying role identifiers form the role dialogue information. Role prompt means that the corresponding dialogue role and a colon are added before each round of dialogue sentences, so that the dialogue role is directly prompted and the corresponding dialogue role semantic features can be obtained after encoding by the pre-trained language model.
For ease of understanding, the above-described matching process is formally represented as follows:
Let the original sentences of a dialogue with N rounds be {x_1, x_2, ..., x_N}, and let the dialogue role corresponding to each round be {r_1, r_2, ..., r_N}. Using the role prompt strategy, a role identifier is added to each original sentence to obtain the dialogue sentences {u_1, u_2, ..., u_N}, where u_i = "r_i: x_i" denotes a dialogue sentence carrying a role identifier, r_i denotes the role identifier, and x_i denotes the original sentence.
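The role prompt strategy above can be sketched as follows (a minimal illustration; the function name is an assumption, not from the patent):

```python
def add_role_prompts(original_sentences, roles):
    """Role prompt: prefix each original sentence x_i with its dialogue role
    r_i and a colon, yielding the dialogue sentence u_i carrying a role
    identifier, so the pre-trained language model sees the role directly."""
    return [f"{role}: {sentence}"
            for role, sentence in zip(roles, original_sentences)]
```

The resulting sentences carrying role identifiers together form the role dialogue information.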
For example, the obtained original dialogue information is:
the doctor gets his own.
Variability what drugs asthma takes, each time a cold will cause.
Generally, the medicine can be orally taken for sensitization of cis Ning Jiangdi and self-provided atomization.
Is the drug?
What is the present symptom cough and nasal discharge?
If there is infection, drink the anti-inflammatory and cough-relieving medicine for atomization.
The doctor is thanks to the metabolism.
Do not feel comfortable and congratulate you for recovery in the early days.
The obtained role information includes: patient and doctor. Using the role prompt strategy, the corresponding dialogue role and a colon, i.e. "doctor:" or "patient:", are added before each round of dialogue sentences, so that the dialogue role is directly prompted and the corresponding dialogue role semantic features can be obtained after encoding by the pre-trained language model. The generated role dialogue information is:
patient: the doctor gets his own.
Patient: variability what drugs asthma takes, each time a cold will cause.
The doctor: generally, the medicine can be orally taken for sensitization of cis Ning Jiangdi and self-provided atomization.
Patient: is the drug?
Patient: what is the present symptom cough and nasal discharge?
The doctor: if there is infection, drink the anti-inflammatory and cough-relieving medicine for atomization.
Patient: the doctor is thanks to the metabolism.
The doctor: do not feel comfortable and congratulate you for recovery in the early days.
Through observation and statistics, it is found that the abstract content of different categories is strongly related to the original sentences of different identity information; for example, most symptom abstract content derives from the patient's dialogue sentences, while most diagnosis abstract content derives from the doctor's dialogue sentences. Role dialogue information therefore has important reference significance for obtaining the final abstract.
In an alternative embodiment of the summary information generating method provided in the corresponding embodiment of fig. 2 of the present application, referring to fig. 9, step S120 further includes sub-steps S121 to S123. Specific:
S121, character dialogue information is used as input of a text abstract layer in a feature extraction model, hidden semantic feature extraction is carried out on the character dialogue information through the text abstract layer, and hidden semantic vectors are generated.
It may be understood that the text abstract layer in the feature extraction model may be implemented by a BERT model or a BERTSUM model; the method provided in this embodiment of the present application uses BERTSUM as the text abstract layer. Compared with the BERT model, the BERTSUM model inserts the special marks "[CLS]" and "[SEP]" before and after each round of dialogue sentences, and sets the segment embeddings of odd-round and even-round sentences to "0" and "1" respectively for distinction. Finally, the output vectors of the model at the "[CLS]" marks preceding the N dialogue sentences are taken as the hidden-layer vector representations of the N dialogue sentences in the role dialogue information.
S122, character dialogue information is used as input of a feature embedding layer in the feature extraction model, and feature mapping is carried out on the character dialogue information through the feature embedding layer to generate character embedding feature vectors.
S123, feature fusion is carried out on the hidden semantic vector and the character embedded feature vector, and a dialogue feature vector is generated.
It can be understood that the hidden semantic vector and the role embedded feature vector are taken as input of the feature fusion layer, and feature fusion is performed on them through the feature fusion layer to generate the dialogue feature vector. Specifically: first, the hidden semantic vector and the role embedded feature vector are spliced to generate a feature splicing vector; then, the feature splicing vector is used as input to the self-attention mechanism layer in the feature extraction model for feature fusion, generating the dialogue feature vector.
For ease of understanding, the above feature extraction process is expressed in a formalized manner as follows:
Let the original sentences of a dialogue with N rounds be {x_1, x_2, ..., x_N}, and let the dialogue role corresponding to each round be {r_1, r_2, ..., r_N}. Using the role prompt strategy, a role identifier is added to each original sentence to obtain the dialogue sentences {u_1, u_2, ..., u_N}, where u_i = "r_i: x_i" denotes a dialogue sentence carrying a role identifier, r_i denotes the role identifier, and x_i denotes the original sentence.
First, the role dialogue information D = {u_1, u_2, ..., u_N} is taken as input to the BERTSUM model to obtain the hidden semantic vector H = BERTSUM(D); then, a role embedding strategy is used to perform feature mapping on the role identifiers contained in the role dialogue information, generating the role embedding feature vector E; finally, the feature fusion layer performs feature fusion on the hidden semantic vector and the role embedding feature vector to generate the dialogue feature vector H' = TransformerLayer(H + E), where H and E both have dimension N × d, and H' is the final dialogue feature vector of the multi-round dialogue sentences.
According to the method provided by the embodiment of the application, the N dialogue sentences in the role dialogue information are subjected to feature extraction through the feature extraction model formed by the text abstract layer, the feature embedding layer and the feature fusion layer, so that dialogue feature vectors formed by N dialogue sentence feature values are obtained, and a foundation is laid for subsequently improving the accuracy of calculating the class information prediction probability distribution vectors according to the dialogue feature vectors.
In an alternative embodiment of the summary information generating method provided in the corresponding embodiment of fig. 9 of the present application, referring to fig. 10, the substep S123 further includes substeps S1231 to S1232. Specific:
S1231, splicing the hidden semantic vector and the character embedded feature vector to generate a feature splicing vector.
S1232, feature fusion is carried out by taking the feature splicing vector as input to the self-attention mechanism layer in the feature extraction model, and a dialogue feature vector is generated.
It can be understood that the feature fusion layer performs feature fusion on the hidden semantic vector and the role embedded feature vector: the two are first added and the sum is then passed through a single TransformerLayer to calculate the dialogue feature vector H' = TransformerLayer(H + E), where H and E both have dimension N × d, and H' is the final dialogue feature vector of the multi-round dialogue sentences.
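The add-then-fuse step can be sketched as below: an element-wise addition of H and E per sentence. The subsequent TransformerLayer self-attention is omitted, and all names are illustrative assumptions.

```python
def add_role_embeddings(hidden_vectors, role_embeddings):
    """Element-wise addition H + E of the per-sentence hidden semantic
    vectors and role embedding feature vectors; the sums would then pass
    through a single TransformerLayer to give the final dialogue feature
    vector of the multi-round dialogue sentences."""
    return [[h + e for h, e in zip(hv, ev)]
            for hv, ev in zip(hidden_vectors, role_embeddings)]
```

Both inputs are lists of N vectors of dimension d, matching the N × d shape stated above.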
According to the method provided by the embodiment of the application, the hidden semantic vector and the character embedded feature vector are spliced through the feature fusion layer, so that a foundation is laid for subsequently improving the accuracy of calculating the category information prediction probability distribution vector according to the dialogue feature vector.
In an alternative embodiment of the summary information generating method provided in the corresponding embodiment of fig. 2 of the present application, referring to fig. 11, step S150 further includes sub-steps S151 to S153. Specific:
S151, N original serial numbers corresponding to N dialogue sentences in the role dialogue information are obtained.
Wherein, the original sequence number is used to represent the position of the dialogue sentence in the order of the role dialogue information.
It can be understood that the original sequence number corresponding to each dialogue sentence is determined according to the order in which each dialogue sentence appears in the role dialogue information. As shown in Table 3, the role dialogue information is numbered with serial numbers.
Table 3
S152, determining the splicing sequence number of each dialogue sentence in each group of class abstract sentences according to N original sequence numbers corresponding to the N dialogue sentences.
S153, sequentially splicing each group of class abstract sentences in the M groups of class abstract sentences according to the splicing sequence number of each dialogue sentence in each group of class abstract sentences to generate M classes of abstract information.
It can be understood that each dialogue sentence in each group of class abstract sentences is arranged in an ascending order according to the original sequence number, so as to obtain the splicing sequence number of each dialogue sentence in each group of class abstract sentences.
When a group of category abstract sentences contains only 1 dialogue sentence, that dialogue sentence is directly used as the category abstract information of the category, without sequential splicing. For example: for the role dialogue information containing the above 8 dialogue sentences, the prediction probability distribution vector of category information P is obtained in step S130, and the probability threshold of category information P obtained in step S141 is 0.85. According to the information prediction probability value corresponding to each of the 8 dialogue sentences and the probability threshold, the dialogue sentence whose information prediction probability value is greater than or equal to 0.85, i.e. the dialogue sentence with information prediction probability value 0.89, is determined from the N dialogue sentences as the category abstract sentence of category information P: "Variability what drugs asthma takes, each time a cold will cause." As shown in Table 4, the category abstract information of category information P is: "Variability what drugs asthma takes, each time a cold will cause."
Table 4
When a group of category abstract sentences contains more than 1 dialogue sentence, the dialogue sentences are spliced sequentially according to their splicing sequence numbers. For example: for the role dialogue information containing the above 8 dialogue sentences, the prediction probability distribution vector of category information D is obtained in step S130, and the probability threshold of category information D obtained in step S141 is 0.84. According to the information prediction probability value corresponding to each of the 8 dialogue sentences and the probability threshold, two dialogue sentences whose information prediction probability values are greater than or equal to 0.84 are determined from the N dialogue sentences as category abstract sentences of category information D: the dialogue sentence with information prediction probability value 0.88, "Generally, the medicine can be orally taken for sensitization of cis Ning Jiangdi and self-provided atomization.", whose original sequence number is 3, and the dialogue sentence with information prediction probability value 0.84, "If there is infection, drink the anti-inflammatory and cough-relieving medicine for atomization.", whose original sequence number is 6.
Because the category abstract sentences of this category comprise two dialogue sentences whose original sequence numbers are 3 and 6 respectively, the dialogue sentence with original sequence number 3 is given splicing sequence number 1 and the dialogue sentence with original sequence number 6 is given splicing sequence number 2. As shown in Table 5, the category abstract sentences are spliced sequentially according to the splicing sequence numbers of the dialogue sentences, and the obtained category abstract information is: "Generally, the medicine can be orally taken for sensitization of cis Ning Jiangdi and self-provided atomization; if there is infection, drink the anti-inflammatory and cough-relieving medicine for atomization."
TABLE 5
According to the method provided by the embodiment of the application, the splicing sequence number of the class abstract statement is determined according to the original sequence number of the dialogue statement in the role dialogue information, and then the class abstract statement is spliced according to the splicing sequence number, so that complete class abstract information is generated, and the logic and the integrity of the class abstract information are improved.
In an alternative embodiment of the summary information generating method provided in the corresponding embodiment of fig. 2 of the present application, referring to fig. 12, another summary information generating method is provided in the embodiment of the present application, including: step S210 to step S250. The summary information generating method aims at extracting symptom summaries and diagnosis summaries from role dialogue messages in medical consultation scenes. Specific:
S210, acquiring role dialogue information.
The role dialogue information comprises N dialogue sentences, each dialogue sentence carries a role identifier, and N is an integer greater than 1.
It can be appreciated that, in the method provided in this embodiment of the present application, the role dialogue information is medical dialogue information, the roles include patient and doctor, and the role dialogue information may be obtained from an online inquiry scenario.
Through observation and statistics, it is found in this embodiment of the present application that symptom abstract content mostly derives from the patient's dialogue sentences, while diagnosis abstract content mostly derives from the doctor's dialogue sentences; medical dialogue role information therefore has important reference significance for obtaining the final abstract. Accordingly, the method provided in this embodiment of the present application introduces dialogue role information into the feature extraction process and specifically provides two strategies, namely role prompt and role embedding.
Role prompt refers to attaching the corresponding dialogue role and a colon, i.e. "doctor:" or "patient:", before each round of dialogue sentences, thereby directly prompting the dialogue role so that the corresponding dialogue role semantic features can be obtained after encoding by the pre-trained language model. Specifically: first, original dialogue information (including N original sentences) not carrying role identifiers and the role information are obtained, and then the N original sentences are matched with the role information to generate the role dialogue information.
Let the original sentences of a dialogue with N rounds be {x_1, x_2, ..., x_N}, and let the dialogue role corresponding to each round be {r_1, r_2, ..., r_N}. Using the role prompt strategy, a role identifier is added to each original sentence to obtain the dialogue sentences {u_1, u_2, ..., u_N}, where u_i = "r_i: x_i" denotes a dialogue sentence carrying a role identifier, r_i denotes the role identifier, and x_i denotes the original sentence.
S220, feature extraction is carried out on the dialogue information to generate dialogue feature vectors.
The dialogue feature vector comprises N dialogue statement feature values, and the N dialogue statement feature values correspond to the N dialogue statements.
It can be understood that feature extraction is implemented on the role dialogue information through a feature extraction model, so as to generate dialogue feature vectors, where the feature extraction model includes: a text abstract layer, a feature embedding layer and a feature fusion layer.
First, the role dialogue information is taken as input to the BERTSUM layer in the feature extraction model, and hidden semantic feature extraction is performed on it through the BERTSUM layer to generate the hidden semantic vector. BERTSUM inserts the special marks "[CLS]" and "[SEP]" before and after each of the N dialogue sentences, and sets the segment embeddings of odd-round and even-round sentences to "0" and "1" respectively for distinction. Furthermore, to better adapt to the particular field of medicine, the method provided in this embodiment of the present application uses MedBERT, a pre-trained language model of the medical field, to initialize BERTSUM. Finally, the output vectors of the model at the "[CLS]" marks preceding each round of dialogue sentences are taken as the N dialogue sentence feature values corresponding to the N dialogue sentences, and these N feature values form the dialogue feature vector.
The role dialogue information D = {u_1, u_2, ..., u_N} is taken as input to the BERTSUM model to obtain the hidden semantic vector H = BERTSUM(D).
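A simplified sketch of the BERTSUM-style input construction described above follows. Whitespace tokenization stands in for the real subword tokenizer, so this only illustrates the [CLS]/[SEP] insertion and alternating 0/1 segment embeddings.

```python
def build_bertsum_input(dialogue_sentences):
    """Insert "[CLS]" and "[SEP]" around each round of dialogue and assign
    segment id 0 to odd rounds and 1 to even rounds; the positions of the
    "[CLS]" tokens are where the per-sentence hidden vectors are read out."""
    tokens, segment_ids, cls_positions = [], [], []
    for i, sentence in enumerate(dialogue_sentences):
        cls_positions.append(len(tokens))
        turn = ["[CLS]"] + sentence.split() + ["[SEP]"]
        tokens.extend(turn)
        segment_ids.extend([i % 2] * len(turn))
    return tokens, segment_ids, cls_positions
```

The model's output vectors at `cls_positions` would serve as the N dialogue sentence feature values.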
Then, the role dialogue information is taken as input to the feature embedding layer in the feature extraction model, and feature mapping is performed on it through the feature embedding layer to generate the role embedding feature vector. Role embedding refers to encoding the two medical dialogue roles, mapping them to a high-dimensional space to obtain high-dimensional representation vectors, which are fused with the semantic representations obtained from the pre-trained language model.
A role embedding strategy is used to perform feature mapping on the role identifiers contained in the role dialogue information, generating the role embedding feature vector E.
And finally, taking the hidden semantic vector and the character embedded feature vector as the input of a feature fusion layer, and carrying out feature fusion on the hidden semantic vector and the character embedded feature vector through the feature fusion layer to generate a dialogue feature vector.
Feature fusion is carried out on the hidden semantic vector and the role embedding feature vector through the feature fusion layer to generate the dialogue feature vector H' = TransformerLayer(H + E), where H and E both have dimension N × d, and H' is the final dialogue feature vector of the multi-round dialogue sentences.
S230, taking the dialogue feature vector as the input of a double-branch sequence labeling model, processing the dialogue feature vector through a first class information labeling module in the double-branch sequence labeling model to obtain a first class information prediction probability distribution vector, and processing the dialogue feature vector through a second class information labeling module in the double-branch sequence labeling model to obtain a second class information prediction probability distribution vector.
The first class information prediction probability distribution vector comprises N first class information prediction probability values, the first class information prediction probability values are used for representing the possibility that the dialogue sentence belongs to the first class information, the second class information prediction probability distribution vector comprises N second class information prediction probability values, and the second class information prediction probability values are used for representing the possibility that the dialogue sentence belongs to the second class information.
It will be appreciated that the purpose of the dual-branch sequence labeling model is to predict two groups of sentence-level editing operations (corresponding to the symptom summary and the diagnosis summary respectively), i.e. to determine whether each round of dialogue sentence in the multi-round dialogue should be "reserved" or "deleted". This process can thus be regarded as a sentence-level sequence labeling process with 2 labels. Structurally, the sequence labeling model therefore adopts a dual-branch structure, i.e. it is composed of two parallel labeling sub-modules, Tagger-P (corresponding to the symptom summary) and Tagger-D (corresponding to the diagnosis summary), which have the same model structure, each formed by combining a fully connected layer and a Sigmoid layer.
The dialogue feature vector H is processed by Tagger-P to obtain the probability distribution of each of the N dialogue sentences belonging to category information P, namely the prediction probability distribution vector of category information P: P^P = Sigmoid(H · W_P), where W_P ∈ R^{d×2} is a learnable fully connected layer parameter and 2 is the number of kinds of editing operations, namely "reserved" and "deleted".
Likewise, the dialogue feature vector H is processed by Tagger-D to obtain the probability distribution of each of the N dialogue sentences belonging to category information D, namely the prediction probability distribution vector of category information D: P^D = Sigmoid(H · W_D), where W_D ∈ R^{d×2} is a learnable fully connected layer parameter and 2 is the number of kinds of editing operations, namely "reserved" and "deleted".
S240, determining a first class abstract sentence from N dialogue sentences according to the predicted probability values of the N first class information, and determining a second class abstract sentence from the N dialogue sentences according to the predicted probability values of the N second class information.
It can be understood that the probability threshold of the first category information is acquired, and the dialogue sentences among the N dialogue sentences whose first category information prediction probability values are greater than or equal to that threshold are taken as first category abstract sentences; likewise, the probability threshold of the second category information is acquired, and the dialogue sentences among the N dialogue sentences whose second category information prediction probability values are greater than or equal to that threshold are taken as second category abstract sentences.
S250, splicing the first class abstract sentences to generate first class abstract information, and splicing the second class abstract sentences to generate second class abstract information.
It can be understood that the category abstract information is obtained by splicing the category abstract sentences.
According to the method provided in this embodiment of the present application, multi-category probability prediction is performed on the role dialogue information carrying role identifiers through the dual-branch sequence labeling model, so that the prediction probability value of each dialogue sentence belonging to each category is obtained; the dialogue sentences belonging to each category are then determined from the prediction probability values, and the dialogue sentences of each category are spliced to obtain the abstract information of each category. Because the prediction of the dual-branch sequence labeling model is non-autoregressive and parallel, the method provided in this embodiment of the present application has the advantage of high inference speed. The feature extraction model formed by the text abstract layer, the feature embedding layer and the feature fusion layer can effectively extract the features of each dialogue sentence, improving the accuracy of abstract information generation. Moreover, this intuitive, reasonable and flexible sentence-level text editing method adapts well to the medical text abstract generation task and avoids the omission of important information when extracting abstract information of each category from role dialogue information.
In an alternative embodiment of the summary information generating method provided in the corresponding embodiment of fig. 2 of the present application, referring to fig. 13 and 14, another summary information generating method is provided in the embodiment of the present application, including: step S310 to step S340. The purpose of the summary information generating method is to train a summary information generating model. Specific:
S310, acquiring a training role dialogue information sample.
The training role dialogue information sample comprises N training dialogue sentences, each training dialogue sentence in the N training dialogue sentences carries a role identifier, the training role dialogue information sample carries M category information tag sequences, each category information tag sequence comprises N category information tags, the category information tags are used for representing the corresponding relation between the training dialogue sentences and the category information, and N is an integer larger than 1.
It can be understood that the training role dialogue information sample may be N directly obtained training dialogue sentences carrying role identifiers; alternatively, N training original sentences not carrying role identifiers may be obtained first, and a role identifier is then added to each training original sentence to obtain the training dialogue sentences carrying role identifiers.
Taking a medical dialogue as an example, in one embodiment, N training dialogue sentences carrying character identifiers can be directly obtained, and each of the N training dialogue sentences carries a class information tag corresponding to the training dialogue sentence, where the class information tag is used to characterize a corresponding relationship between the training dialogue sentence and the class, that is, when the class information tag is 1, the training dialogue sentence is represented to have a corresponding relationship with the class; when the category information label is 0, the training dialogue statement and the category do not have a corresponding relation; for ease of understanding, please refer to table 6.
TABLE 6
In another embodiment, N training original sentences not carrying role identifiers, together with the role information corresponding to the original dialogue information, are obtained. First, using the role prompt strategy, the corresponding dialogue role and a colon, namely "Doctor:" or "Patient:", are attached before each round of training dialogue sentence. This directly prompts the dialogue role, so that the corresponding dialogue role semantic features can be obtained after encoding by the pre-trained language model.
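For illustration only (not part of the claimed implementation), the role prompt strategy described above can be sketched in a few lines of Python; the role names and sentences are hypothetical:

```python
def add_role_prompt(sentences, roles):
    """Role prompt strategy: attach the dialogue role and a colon
    before each round's sentence, e.g. "Doctor: ..." / "Patient: ..."."""
    return [f"{role}: {sentence}" for role, sentence in zip(roles, sentences)]

rounds = ["I have had a cough for three days.", "Do you have a fever?"]
roles = ["Patient", "Doctor"]
prompted = add_role_prompt(rounds, roles)
# prompted now carries the dialogue role before each round's sentence.
```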
Each of the N training dialogue sentences carries the category information tag corresponding to that training dialogue sentence, and the category information tag is used to characterize the correspondence between the training dialogue sentence and the category: when the category information tag is 1, the training dialogue sentence corresponds to the category; when the category information tag is 0, the training dialogue sentence does not correspond to the category. For ease of understanding, please refer to table 7.
TABLE 7
S320, feature extraction is carried out on the training character dialogue information sample, and training dialogue feature vectors are generated.
The training dialogue feature vector comprises N training dialogue sentence feature values, and the N training dialogue sentence feature values correspond to the N training dialogue sentences.
It can be understood that firstly, training character dialogue information is used as input of a text abstract layer in a feature extraction model, hidden semantic feature extraction is carried out on the training character dialogue information through the text abstract layer, and training hidden semantic vectors are generated; then, taking training character dialogue information as the input of a feature embedding layer in a feature extraction model, and performing feature mapping on the training character dialogue information through the feature embedding layer to generate a training character embedded feature vector; and finally, taking the training hidden semantic vector and the training character embedded feature vector as the input of a feature fusion layer, and carrying out feature fusion on the training hidden semantic vector and the training character embedded feature vector through the feature fusion layer to generate a training dialogue feature vector. The text abstract layer in the feature extraction model can be a BERT model or a BERTSUM model.
S330, the training dialogue feature vector is used as input of a multi-branch sequence labeling model, and M training class information prediction probability distribution vectors are obtained by processing the training dialogue feature vector through M class information labeling modules in the multi-branch sequence labeling model.
Each training category information prediction probability distribution vector comprises N training category information prediction probability values, and each training category information prediction probability value is used for representing the possibility that a training dialogue sentence belongs to category information.
It can be understood that the multi-branch sequence labeling model adopts a multi-branch structure, i.e. it is composed of a plurality of parallel category information labeling modules. Each category information labeling module judges whether each input training dialogue sentence belongs to its category, carries out a "keep" editing operation on the dialogue sentences belonging to the category, and carries out a "delete" editing operation on the dialogue sentences not belonging to the category; this process can thus be regarded as a sentence-level sequence labeling process with a tag number of 2. The parallel category information labeling modules have the same network structure and are each formed by combining a fully connected layer (Fully Connected Layer, FC layer) and a Sigmoid layer.
Specifically, the training dialogue feature vectors are input into the full-connection layers of each of the M category information labeling modules, full-connection processing is carried out on the training dialogue feature vectors through the full-connection layers of the M category information labeling modules, and M training full-connection feature vectors are generated, wherein each training full-connection feature vector comprises N training full-connection feature values; and then, inputting the M training full-connection feature vectors into a compression function layer of each class information labeling module in the M class information labeling modules, and performing compression processing on the M training full-connection feature vectors through the compression function layers of the M class information labeling modules to generate M training class information prediction probability distribution vectors.
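As an illustrative sketch only (not the patent's implementation), a category information labeling module of this form, a fully connected layer followed by a sigmoid compression layer, can be expressed in plain Python; the dimensions and random parameters are hypothetical stand-ins for the learned weights:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_tagger(features, weight):
    """One category information labeling module: a fully connected layer
    (features x weight) followed by a sigmoid compression layer.
    features: N x d (one row per dialogue sentence); weight: d x 2,
    the two columns corresponding to the "keep" and "delete" operations.
    Returns the N x 2 prediction probability distribution vector."""
    probs = []
    for row in features:
        logits = [sum(f * weight[j][k] for j, f in enumerate(row)) for k in range(2)]
        probs.append([sigmoid(z) for z in logits])
    return probs

# Hypothetical dimensions: N = 4 sentences, d = 8 features, M = 2 branches.
random.seed(0)
N, d = 4, 8
H = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
W_P = [[random.gauss(0, 1) for _ in range(2)] for _ in range(d)]  # Tagger-P weights
W_D = [[random.gauss(0, 1) for _ in range(2)] for _ in range(d)]  # Tagger-D weights
probs_P = category_tagger(H, W_P)
probs_D = category_tagger(H, W_D)
```

Each branch shares this structure and only its weights differ, which is why the M modules can run in parallel over the same dialogue feature vector.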
For ease of understanding, a dual-branch sequence labeling model composed of two category information labeling modules is taken as an example. In the dual-branch sequence labeling model provided in the embodiment of the present application, the model consists of two parallel labeling sub-modules, namely Tagger-P (the category information labeling module whose category information is P) and Tagger-D (the category information labeling module whose category information is D), which have the same model structure and are each formed by combining a fully connected layer and a compression function layer.
The dialogue feature vector $H$ is passed through Tagger-P to obtain the probability distribution of each of the N dialogue sentences belonging to category information P, i.e. the prediction probability distribution vector of category information P: $\hat{Y}^{P}=\mathrm{Sigmoid}(HW_{P})$, where $W_{P}\in\mathbb{R}^{d\times 2}$ is a learnable fully connected layer parameter, and 2 represents the two kinds of editing operation, namely "keep" and "delete".
The dialogue feature vector $H$ is likewise passed through Tagger-D to obtain the probability distribution of each of the N dialogue sentences belonging to category information D, i.e. the prediction probability distribution vector of category information D: $\hat{Y}^{D}=\mathrm{Sigmoid}(HW_{D})$, where $W_{D}\in\mathbb{R}^{d\times 2}$ is a learnable fully connected layer parameter, and 2 represents the two kinds of editing operation, namely "keep" and "delete".
S340, generating a total loss function according to the M training class information prediction probability distribution vectors and the M class information tag sequences, and carrying out parameter optimization on the multi-branch sequence labeling model through the total loss function.
Preferably, step S340 includes sub-steps S341 to S342. Specifically:
S341, generating M category loss functions according to the N training category information prediction probability values in the training category information prediction probability distribution vector and the N category information tags in the category information tag sequence that correspond to the same category information.
S342, generating a total loss function according to M category loss functions.
It can be understood that, taking the dual-category case as an example, let the category information tag sequence with category information P corresponding to the N training dialogue sentences in the training role dialogue information sample be $Y^{P}=(y_{1}^{P},y_{2}^{P},\ldots,y_{N}^{P})$, and let the category information tag sequence with category information D be $Y^{D}=(y_{1}^{D},y_{2}^{D},\ldots,y_{N}^{D})$. Let the category information sequence predicted by the multi-branch sequence labeling model for category information P be $\hat{Y}^{P}=(\hat{y}_{1}^{P},\hat{y}_{2}^{P},\ldots,\hat{y}_{N}^{P})$, and for category information D be $\hat{Y}^{D}=(\hat{y}_{1}^{D},\hat{y}_{2}^{D},\ldots,\hat{y}_{N}^{D})$.
The training target corresponding to Tagger-P is to make the predicted sentence-level editing operation sequence $\hat{Y}^{P}$ consistent with the editing operation sequence tag $Y^{P}$. The corresponding learning process is to minimize the following loss function: $\mathcal{L}_{P}=-\sum_{i=1}^{N}\left[y_{i}^{P}\log\hat{y}_{i}^{P}+(1-y_{i}^{P})\log(1-\hat{y}_{i}^{P})\right]$, where $y_{i}^{P}$ and $\hat{y}_{i}^{P}$ respectively represent the elements of $Y^{P}$ and $\hat{Y}^{P}$ at position $i$.
The training target corresponding to Tagger-D is to make the predicted sentence-level editing operation sequence $\hat{Y}^{D}$ consistent with the editing operation sequence tag $Y^{D}$. The corresponding learning process is to minimize the following loss function: $\mathcal{L}_{D}=-\sum_{i=1}^{N}\left[y_{i}^{D}\log\hat{y}_{i}^{D}+(1-y_{i}^{D})\log(1-\hat{y}_{i}^{D})\right]$, where $y_{i}^{D}$ and $\hat{y}_{i}^{D}$ respectively represent the elements of $Y^{D}$ and $\hat{Y}^{D}$ at position $i$.
Finally, a multi-task learning framework is adopted to jointly train the prediction tasks of the two editing operations, and the total training target is: $\mathcal{L}=\lambda_{P}\mathcal{L}_{P}+\lambda_{D}\mathcal{L}_{D}$, where $\lambda_{P}$ and $\lambda_{D}$ are hyperparameters that respectively represent the weights of the loss functions corresponding to the two training targets.
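As an illustrative sketch, under the assumption that each branch's loss takes a binary cross-entropy form over the "keep" probabilities, the joint multi-task objective can be written as follows; the predictions, labels, and weights are hypothetical toy values:

```python
import math

def branch_loss(pred_keep, tags):
    """Per-branch loss: binary cross-entropy between the predicted "keep"
    probabilities and the 0/1 editing-operation sequence tags (assumed form)."""
    eps = 1e-12  # numerical guard against log(0)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(pred_keep, tags)) / len(tags)

def total_loss(pred_P, tags_P, pred_D, tags_D, lam_P=1.0, lam_D=1.0):
    """Multi-task objective: weighted sum of the Tagger-P and Tagger-D losses;
    lam_P / lam_D are the hyperparameter weights of the two training targets."""
    return lam_P * branch_loss(pred_P, tags_P) + lam_D * branch_loss(pred_D, tags_D)

pred_P, tags_P = [0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0]  # toy 4-sentence example
pred_D, tags_D = [0.2, 0.7, 0.3, 0.6], [0, 1, 0, 1]
loss = total_loss(pred_P, tags_P, pred_D, tags_D)
```

Better-calibrated predictions drive each branch's loss, and hence the weighted total, toward zero.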
According to the method provided by the embodiment of the present application, through sufficient training on a large amount of supervised data, a sentence-level summary generation model with excellent performance can finally be obtained. The model can predict the category information prediction probability distribution vectors corresponding to a plurality of categories from the dialogue information, and can further be used to perform the corresponding summary generation on the dialogue information, finally obtaining multiple kinds of target summary information.
For ease of understanding, a summary generating method applied to medical dialogues is described below. The method provided in the embodiments of the present application is applicable to medical consultation products, and can automatically generate a corresponding summary from the dialogue between doctor and patient, thereby improving the doctor's working efficiency and providing a reference for other patients in similar situations.
The focus of the medical dialogue summary task is to summarize the dialogue between doctor and patient along two main dimensions, patient symptoms and doctor diagnosis, corresponding to two target outputs: the symptom digest and the diagnostic digest. Symptom digests and diagnostic digests can typically be obtained by combining certain rounds of dialogue sentences in a medical dialogue. Therefore, the method provided by the embodiment of the present application provides a sentence labeling model for judging whether each round's dialogue sentence in the medical dialogue belongs to the symptom digest, the diagnostic digest, or neither. The final symptom digest is obtained by concatenating all dialogue sentences judged to belong to the symptom digest, and the final diagnostic digest is obtained by concatenating all dialogue sentences judged to belong to the diagnostic digest.
By exploring existing medical dialogue summary data sets, it is found that symptom digests and diagnostic digests can almost always be obtained by combining the dialogue sentences of certain rounds in a multi-round medical dialogue: as shown in the example illustrated in fig. 15, the symptom digest is a combination of some rounds of dialogue sentences in the multi-round dialogue, whereas the diagnostic digest is a combination of other rounds. For this feature, the embodiment of the present application proposes the concept of "sentence-level text editing" and designs two sentence-level text editing operations, "keep" and "delete", which respectively denote keeping or deleting the dialogue sentence of the current round. The symptom digest can then be obtained by "deleting" the rounds that do not belong to it while "keeping" the rounds that do, and the diagnostic digest can be obtained in the same way. Applying the idea of sentence-level text editing to the medical dialogue summary task makes full use of the characteristics of the task, can improve the quality of the generated medical dialogue summaries, and gives the generation process better interpretability and higher efficiency.
Based on the above idea, the embodiment of the present application provides a summary generating method based on sentence-level text editing and applied to medical dialogue. The basic flow is shown in fig. 16: the training phase includes feature extraction on the medical dialogue, editing operation prediction, and loss computation against the editing operation labels, while the prediction phase includes feature extraction, editing operation prediction, and editing implementation, so as to obtain the symptom digest and the diagnosis digest.
The training stage aims to train a sentence-level editing operation prediction model. The model takes a multi-round medical dialogue as input: first, a feature extraction module obtains the hidden layer vector representation of each round of dialogue sentences; then, an editing operation prediction module obtains the probability distributions of the editing operations corresponding to each round of dialogue sentences (two distributions, corresponding to the symptom digest and the diagnosis digest respectively); finally, the loss between the probability distributions and the editing operation labels is calculated, and the model is optimized accordingly.
In the prediction stage, the sentence-level editing operation prediction model first predicts two corresponding groups of sentence-level editing operations from the medical dialogue (one group each for the symptom digest and the diagnosis digest); then the two predicted groups of editing operations are implemented, i.e., the corresponding sentence-level editing operations are executed on the original medical dialogue, so as to obtain the final symptom digest and diagnosis digest.
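The predict-then-edit flow of the prediction stage can be sketched as follows; this is an illustrative example only, and the dialogue, branch probabilities, and the 0.5 threshold are hypothetical:

```python
def implement_edits(sentences, keep_probs, threshold=0.5):
    """Editing implementation: "keep" a round when its predicted probability
    exceeds the threshold, "delete" it otherwise, then concatenate the
    kept rounds into the digest."""
    return " ".join(s for s, p in zip(sentences, keep_probs) if p > threshold)

dialogue = [
    "Patient: I have a headache.",
    "Doctor: How long has it lasted?",
    "Patient: About two days.",
    "Doctor: It looks like a tension headache.",
]
# Hypothetical outputs of the two prediction branches (Tagger-P / Tagger-D):
symptom_probs = [0.9, 0.1, 0.8, 0.2]
diagnosis_probs = [0.1, 0.2, 0.1, 0.95]
symptom_digest = implement_edits(dialogue, symptom_probs)
diagnosis_digest = implement_edits(dialogue, diagnosis_probs)
```

Because the per-round decisions are independent, the two groups of editing operations can be predicted non-autoregressively and applied in parallel.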
Referring to fig. 17, fig. 17 is a block diagram of a sentence-level editing operation prediction model according to an embodiment of the present application. The model mainly comprises two modules: the feature extraction module is used for introducing dialogue role information and the double-branch sequence labeling module is used for editing based on sentence level texts.
First, the feature extraction module that introduces dialogue role information is described. In order to perform feature extraction on multi-round medical dialogues, the embodiment of the present application first adopts the classical text summarization model BERTSUM as the basic text encoder. Compared with the BERT model, the BERTSUM model inserts the special marks "CLS" and "SEP" before and after each round of dialogue sentence, and sets the segment embeddings of odd and even rounds to "0" and "1" respectively for distinction. Furthermore, to better adapt to the particular field of medicine, the embodiment of the present application uses the medical-domain pre-trained language model MedBERT to initialize BERTSUM. Finally, the output vector at the "CLS" mark before each round of dialogue sentence is taken as the hidden layer vector representation of that round's dialogue sentence.
Through observation and statistics, the content of symptom digests is mostly derived from the patient's dialogue sentences, and the content of diagnostic digests is mostly derived from the doctor's. The dialogue role information of a medical dialogue therefore has important reference significance for obtaining the final digests. Accordingly, the embodiment of the present application introduces dialogue role information in the feature extraction process, and specifically provides two strategies: role prompt and role embedding.
(1) Role prompt refers to attaching the corresponding dialogue role and a colon, namely "Doctor:" or "Patient:", before each round of dialogue sentence. This directly prompts the dialogue role, so that the corresponding dialogue role semantic features can be obtained after encoding by the pre-trained language model.
(2) Role embedding refers to encoding the two medical dialogue roles, mapping them to a high-dimensional space to obtain high-dimensional representation vectors, and fusing these with the semantic representations obtained from the pre-trained language model.
In order to better understand the above process, a formal description will be given below.
Let the original sentences of a dialogue with N rounds be $U=(u_{1},u_{2},\ldots,u_{N})$, and the dialogue roles corresponding to each round be $R=(r_{1},r_{2},\ldots,r_{N})$. Using the role prompt strategy, the input multi-round dialogue becomes $U'=(r_{1}\!:\!u_{1},r_{2}\!:\!u_{2},\ldots,r_{N}\!:\!u_{N})$. The hidden semantic representation is then obtained through BERTSUM: $H^{s}=\mathrm{BERTSUM}(U')$. Using the role embedding strategy, the role embedding of each round of dialogue sentences is first obtained: $H^{r}=\mathrm{Embedding}(R)$.
Next, $H^{s}$ and $H^{r}$ are further fused in a simple manner, by first adding them and then passing the sum through a single-layer Transformer layer: $H=\mathrm{TransformerLayer}(H^{s}+H^{r})$, where $H^{s}$ and $H^{r}$ are both in $\mathbb{R}^{N\times d}$, and $H\in\mathbb{R}^{N\times d}$ is the final dialogue feature vector of the multi-round dialogue sentences.
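As an illustrative sketch (not the patent's implementation), the role embedding lookup and the additive first step of the fusion can be written as follows; the single-layer Transformer that follows the addition is omitted for brevity, and all dimensions and values are hypothetical:

```python
import random

def role_embedding(roles, dim, table):
    """Role embedding strategy: map each dialogue role to a learnable
    high-dimensional vector via a lookup table (randomly initialized here
    as a stand-in for trained parameters)."""
    for r in roles:
        if r not in table:
            table[r] = [random.gauss(0, 1) for _ in range(dim)]
    return [table[r] for r in roles]

def fuse(hidden, role_emb):
    """First fusion step: element-wise addition of the hidden semantic
    representations H^s and the role embeddings H^r. The subsequent
    single-layer Transformer is omitted in this sketch."""
    return [[h + r for h, r in zip(hs, rs)] for hs, rs in zip(hidden, role_emb)]

random.seed(1)
d = 4
H_s = [[0.1 * i] * d for i in range(3)]  # stand-in for the BERTSUM output
H_r = role_embedding(["patient", "doctor", "patient"], d, table={})
H = fuse(H_s, H_r)
```

Rounds spoken by the same role share one embedding vector, which is what lets the model exploit the patient-symptom / doctor-diagnosis correlation.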
The dual-branch sequence labeling module based on sentence-level text editing is described next. The purpose of the dual-branch sequence labeling module is to predict two sets of sentence-level editing operations (corresponding to the symptom digest and the diagnostic digest respectively), i.e., to determine whether each round's dialogue sentence in the multi-round dialogue should be "kept" or "deleted". This process can thus be regarded as a sentence-level sequence labeling process with a tag number of 2. To this end, the module structurally adopts a dual-branch structure, i.e. it is composed of two parallel labeling sub-modules, namely Tagger-P (corresponding to the symptom digest) and Tagger-D (corresponding to the diagnostic digest), which have the same model structure and are each formed by combining a fully connected layer and a Sigmoid layer.
Let the sentence-level editing operation sequence tags corresponding to the symptom digest and the diagnostic digest be $Y^{P}$ and $Y^{D}$ respectively, and let the sentence-level editing operation sequences predicted by the model be $\hat{Y}^{P}$ and $\hat{Y}^{D}$.
The dialogue representation vector $H$ is passed through Tagger-P to obtain the sentence-level editing operation probability distribution corresponding to the symptom digest for each round of dialogue sentence: $\hat{Y}^{P}=\mathrm{Sigmoid}(HW_{P})$, where $W_{P}\in\mathbb{R}^{d\times 2}$ is a learnable fully connected layer parameter, and 2 represents the two kinds of editing operation, namely "keep" and "delete".
The training target corresponding to Tagger-P is to make the predicted sentence-level editing operation sequence $\hat{Y}^{P}$ consistent with the editing operation sequence tag $Y^{P}$. The corresponding learning process is to minimize the following loss function: $\mathcal{L}_{P}=-\sum_{i=1}^{N}\left[y_{i}^{P}\log\hat{y}_{i}^{P}+(1-y_{i}^{P})\log(1-\hat{y}_{i}^{P})\right]$, where $y_{i}^{P}$ and $\hat{y}_{i}^{P}$ respectively represent the elements of $Y^{P}$ and $\hat{Y}^{P}$ at position $i$.
The dialogue representation vector $H$ is passed through Tagger-D to obtain the sentence-level editing operation probability distribution corresponding to the diagnostic digest for each round of dialogue sentence: $\hat{Y}^{D}=\mathrm{Sigmoid}(HW_{D})$, where $W_{D}\in\mathbb{R}^{d\times 2}$ is a learnable fully connected layer parameter, and 2 represents the two kinds of editing operation, namely "keep" and "delete".
The training target corresponding to Tagger-D is to make the predicted sentence-level editing operation sequence $\hat{Y}^{D}$ consistent with the editing operation sequence tag $Y^{D}$. The corresponding learning process is to minimize the following loss function: $\mathcal{L}_{D}=-\sum_{i=1}^{N}\left[y_{i}^{D}\log\hat{y}_{i}^{D}+(1-y_{i}^{D})\log(1-\hat{y}_{i}^{D})\right]$, where $y_{i}^{D}$ and $\hat{y}_{i}^{D}$ respectively represent the elements of $Y^{D}$ and $\hat{Y}^{D}$ at position $i$.
Finally, a multi-task learning framework is adopted to jointly train the prediction tasks of the two editing operations, and the total training target is: $\mathcal{L}=\lambda_{P}\mathcal{L}_{P}+\lambda_{D}\mathcal{L}_{D}$, where $\lambda_{P}$ and $\lambda_{D}$ are hyperparameters that respectively represent the weights of the loss functions corresponding to the two training targets.
Through carrying out sufficient training on a large amount of supervised data, the embodiment of the application can finally obtain a sentence level editing operation prediction model with excellent performance, and the model can predict two groups of corresponding sentence level editing operation sequences according to multiple rounds of medical dialogues, and further can be used for executing text editing of corresponding sentence levels on the multiple rounds of medical dialogues, so that two kinds of target abstracts are finally obtained: symptom abstracts and diagnostic abstracts.
In summary, the embodiment of the present application models the medical dialogue summary problem as a sentence-level text editing problem and provides a medical dialogue summary method based on sentence-level text editing, mainly comprising the following:
1) Modeling the acquisition process of the medical dialogue digest as a sentence-level text editing process: a "keep" or "delete" editing operation is performed for each round of dialogue sentences in the medical dialogue. This modeling approach is more intuitive and has the potential to achieve better results.
2) A feature extraction module for introducing dialogue role information is designed to obtain the semantic features of each round of dialogue sentences, the dialogue role information can be fully introduced, and sufficient and effective features can be provided for subsequent modules.
3) The editing operation prediction module with the double-branch structure is designed, meanwhile, the editing operation corresponding to the symptom abstract and the diagnosis abstract is predicted, the prediction accuracy is high, and the abstract with high quality can be obtained.
The method provided by the embodiment of the present application models the medical dialogue summary task as a two-stage process: first predicting the editing operations, then implementing them. The prediction of the editing operations is non-autoregressive and parallel, so the method has the advantage of high inference speed. Secondly, the method provides a feature extraction module suitable for multi-round dialogues, which can effectively extract the features of each round of dialogue sentences. Thirdly, the method designs an intuitive, reasonable and flexible sentence-level text editing approach that adapts well to the medical text summary task. The embodiment of the present application is applicable to medical dialogue scenes/products, effectively alleviates the problems that medical dialogues are long and key information is difficult to obtain quickly, can improve doctors' working efficiency, helps doctors make better diagnoses, and can provide a reference for users with similar conditions.
The summary information generating apparatus in the present application will be described in detail below, referring to fig. 18. Fig. 18 is a schematic diagram of an embodiment of the summary information generating apparatus 10 in the embodiment of the present application, where the summary information generating apparatus 10 includes: the character dialogue information acquisition module 110, the feature extraction module 120, the probability distribution vector prediction module 130, the category abstract statement determination module 140 and the category abstract information generation module 150; specific:
the role dialogue information acquiring module 110 is configured to acquire role dialogue information.
The role dialogue information comprises N dialogue sentences, each dialogue sentence carries a role identifier, and N is an integer greater than 1.
The feature extraction module 120 is configured to perform feature extraction on the role dialogue information, and generate a dialogue feature vector.
The dialogue feature vector comprises N dialogue statement feature values, and the N dialogue statement feature values correspond to the N dialogue statements.
The probability distribution vector prediction module 130 is configured to take the dialogue feature vector as an input of the multi-branch sequence labeling model, and process the dialogue feature vector through M category information labeling modules in the multi-branch sequence labeling model to obtain M category information prediction probability distribution vectors.
Each category information prediction probability distribution vector comprises N category information prediction probability values, and each category information prediction probability value is used for representing the possibility that the dialogue sentence belongs to the category information.
The class abstract sentence determining module 140 is configured to determine M groups of class abstract sentences from the N dialogue sentences according to N class information prediction probability values corresponding to each of the M class information prediction probability distribution vectors.
Wherein the M groups of class abstract sentences correspond to the M class information.
The class summary information generating module 150 is configured to splice each group of class summary sentences in the M groups of class summary sentences to generate M class summary information.
According to the device provided by the embodiment of the application, the multi-branch sequence labeling model is used for carrying out multi-category probability prediction on the role dialogue information carrying the role identification, so that the prediction probability value of each dialogue statement in the role dialogue information belonging to each category is obtained, further, the dialogue statements belonging to each category are obtained through the prediction probability value, and the dialogue statements of each category are spliced to obtain the abstract information of each category, so that the accuracy of abstract information generation is improved, the problem of important information omission caused by extracting the abstract information of each category from the role dialogue information is avoided, and the device can be applied to scenes of medical dialogue abstract generation.
In an optional embodiment of the summary information generating apparatus provided in the embodiment corresponding to fig. 18 of the present application, the probability distribution vector prediction module 130 is further configured to:
inputting the dialogue feature vectors into the full-connection layers of each of the M category information labeling modules, and performing full-connection processing on the dialogue feature vectors through the full-connection layers of the M category information labeling modules to generate M full-connection feature vectors.
Wherein each full connection feature vector comprises N full connection feature values.
Inputting the M full-connection feature vectors into the compression function layers of each category information labeling module in the M category information labeling modules, and compressing the M full-connection feature vectors through the compression function layers of the M category information labeling modules to generate M category information prediction probability distribution vectors.
According to the device provided by the embodiment of the application, the class information labeling module is formed through the full connection layer and the compression function layer, the dimension of the input dialogue feature vector is adjusted through the full connection layer, so that the dimension of the dialogue feature vector is matched with the model dimension, then the compression function layer is realized through the Sigmoid network layer, the probability corresponding to each feature value in the dialogue feature vector is calculated, the class information prediction probability distribution vector is obtained, the relevance of dialogue sentences and classes can be fully reflected by the class information prediction probability distribution vector, the confidence is high, a foundation is laid for selecting class abstract sentences through the class information prediction probability distribution vector, and the accuracy of abstract information generation is improved.
In an optional embodiment of the summary information generating apparatus provided in the embodiment corresponding to fig. 18 of the present application, the category summary statement determining module 140 is further configured to:
and obtaining M probability thresholds corresponding to the M category information.
And predicting probability values and M probability thresholds according to M category information corresponding to each dialogue sentence in the N dialogue sentences, and determining M groups of category abstract sentences from the N dialogue sentences.
Wherein, the prediction probability value of the category information of the category abstract statement is larger than the probability threshold value corresponding to the category information.
According to the device provided by the embodiment of the present application, by comparing the category information prediction probability value corresponding to each kind of category information with the corresponding probability threshold, the dialogue sentences whose category information prediction probability value is greater than or equal to the probability threshold are determined as category abstract sentences, thereby improving the accuracy of summary information generation.
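A minimal sketch of this per-category threshold comparison, with hypothetical sentences, probabilities, and thresholds (illustrative only, not the claimed device):

```python
def select_class_summary_sentences(sentences, prob_vectors, thresholds):
    """For each of the M categories, compare each sentence's predicted
    probability with that category's threshold and keep the sentences
    whose probability is greater than or equal to it."""
    return [
        [s for s, p in zip(sentences, probs) if p >= th]
        for probs, th in zip(prob_vectors, thresholds)
    ]

sentences = ["s1", "s2", "s3"]
prob_vectors = [[0.9, 0.3, 0.7],   # category 1 (e.g. symptoms)
                [0.1, 0.8, 0.4]]   # category 2 (e.g. diagnosis)
groups = select_class_summary_sentences(sentences, prob_vectors, [0.5, 0.5])
```

Using a separate threshold per category allows each branch's precision/recall trade-off to be tuned independently.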
In an optional embodiment of the summary information generating apparatus provided in the embodiment corresponding to fig. 18 of the present application, the role dialogue information obtaining module 110 is further configured to:
and acquiring the original dialogue information and at least one role information.
Wherein the original dialogue information includes N original sentences.
And matching the N original sentences with at least one role information to generate role dialogue information.
The role dialogue information comprises N original sentences carrying role identifications.
According to the device provided by the embodiment of the application, the character identification is given to the original dialogue information which does not carry the character identification, and the feature extraction and the category information labeling of different character information can be carried out subsequently through the distinction of the character identification, so that the calculated amount is reduced, the calculation time is shortened, and the model processing efficiency is improved.
In an alternative embodiment of the summary information generating apparatus provided in the embodiment corresponding to fig. 18 of the present application, the feature extraction module 120 is further configured to:
and taking the role dialogue information as the input of a text abstract layer in the feature extraction model, and extracting hidden semantic features of the role dialogue information through the text abstract layer to generate hidden semantic vectors.
Character dialogue information is used as input of a feature embedding layer in a feature extraction model, and feature mapping is carried out on the character dialogue information through the feature embedding layer to generate character embedding feature vectors.
And carrying out feature fusion on the hidden semantic vector and the character embedded feature vector to generate a dialogue feature vector.
According to the device provided by the embodiment of the application, the N dialogue sentences in the role dialogue information are subjected to feature extraction through the feature extraction model formed by the text abstract layer, the feature embedding layer and the feature fusion layer, so that dialogue feature vectors formed by N dialogue sentence feature values are obtained, and a foundation is laid for subsequently improving the accuracy of calculating the class information prediction probability distribution vectors according to the dialogue feature vectors.
In an alternative embodiment of the summary information generating apparatus provided in the embodiment corresponding to fig. 18 of the present application, the feature extraction module 120 is further configured to:
The hidden semantic vector and the role embedding feature vector are spliced to generate a feature splicing vector.
The feature splicing vector is taken as the input of a self-attention mechanism layer in the feature extraction model for feature fusion, generating the dialogue feature vector.
With the apparatus provided in this embodiment of the application, the hidden semantic vector and the role embedding feature vector are spliced through the feature fusion layer, laying a foundation for subsequently improving the accuracy of computing the category information prediction probability distribution vectors from the dialogue feature vector.
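The splice-then-attend fusion can be sketched numerically as follows, assuming the hidden semantic vectors and role embedding vectors have already been computed per sentence. The single unparameterised attention pass below stands in for the real self-attention mechanism layer, which would carry learned query/key/value weights:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_features(hidden, role_emb):
    """Splice the hidden semantic vectors (N, d1) with the role embedding
    vectors (N, d2) and fuse the spliced vectors with one self-attention
    pass, yielding one dialogue feature vector per sentence: (N, d1+d2)."""
    x = np.concatenate([hidden, role_emb], axis=-1)   # feature splicing vector
    scores = x @ x.T / np.sqrt(x.shape[-1])           # pairwise attention scores
    return softmax(scores, axis=-1) @ x               # attention-weighted fusion
```

Each output row mixes information from every sentence in the dialogue, which is what lets the later labeling branches score sentences in context.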
In an optional embodiment of the summary information generating apparatus provided in the embodiment corresponding to fig. 18 of the present application, the category summary information generating module 150 is further configured to:
and acquiring N original serial numbers corresponding to the N dialogue sentences in the role dialogue information.
Wherein the original sequence number is used for representing the sequence number of the dialogue sentence in the order role dialogue information.
And determining the splicing sequence number of each dialogue sentence in each group of class abstract sentences according to N original sequence numbers corresponding to the N dialogue sentences.
And sequentially splicing each group of class abstract sentences in the M groups of class abstract sentences according to the splicing sequence number of each dialogue sentence in each group of class abstract sentences to generate M classes of abstract information.
With the apparatus provided in this embodiment of the application, the splicing sequence numbers of the class abstract sentences are determined from the original sequence numbers of the dialogue sentences in the role dialogue information, and the class abstract sentences are then spliced according to those sequence numbers to generate complete class abstract information, improving the logical coherence and completeness of the class abstract information.
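The sequence-number bookkeeping above amounts to sorting each group of class abstract sentences by original position before joining them. A minimal sketch, assuming sentences are unique strings and a space joins them:

```python
def splice_class_summaries(sentences, class_summary_groups):
    """Splice each group of class abstract sentences into one class
    abstract, ordering each group by the sentences' original sequence
    numbers in the role dialogue information."""
    original_number = {s: i for i, s in enumerate(sentences)}
    summaries = []
    for group in class_summary_groups:
        # Splicing sequence numbers follow the original dialogue order.
        ordered = sorted(group, key=lambda s: original_number[s])
        summaries.append(" ".join(ordered))
    return summaries
```

Keeping the original order is what preserves the logical coherence of each class abstract.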
In an optional embodiment of the summary information generating apparatus provided in the embodiment corresponding to fig. 18 of the present application, the probability distribution vector prediction module 130 is further configured to: the dialogue feature vector is used as input of a double-branch sequence labeling model, the dialogue feature vector is processed through a first class information labeling module in the double-branch sequence labeling model to obtain a first class information prediction probability distribution vector, and the dialogue feature vector is processed through a second class information labeling module in the double-branch sequence labeling model to obtain a second class information prediction probability distribution vector.
The first class information prediction probability distribution vector comprises N first class information prediction probability values, the first class information prediction probability values are used for representing the possibility that the dialogue sentence belongs to the first class information, the second class information prediction probability distribution vector comprises N second class information prediction probability values, and the second class information prediction probability values are used for representing the possibility that the dialogue sentence belongs to the second class information.
The category digest sentence determination module 140 is further configured to determine a first category digest sentence from the N dialogue sentences according to the N first category information prediction probability values, and determine a second category digest sentence from the N dialogue sentences according to the N second category information prediction probability values.
The category digest information generation module 150 is further configured to splice the first category digest sentence to generate first category digest information, and splice the second category digest sentence to generate second category digest information.
With the apparatus provided in this embodiment of the application, multi-category probability prediction is performed on the role dialogue information carrying role identifiers through the dual-branch sequence labeling model, yielding, for each dialogue sentence, a prediction probability value for each category. The dialogue sentences belonging to each category are then obtained from these prediction probability values, and the sentences of each category are spliced to obtain the abstract information of that category. Because the prediction of the dual-branch sequence labeling model is non-autoregressive and parallel, the method provided in this embodiment of the application has the advantage of high inference speed. The feature extraction model composed of the text abstract layer, the feature embedding layer, and the feature fusion layer can effectively extract the features of each dialogue sentence, improving the accuracy of abstract information generation. This intuitive, reasonable, and flexible sentence-level text editing method adapts well to the medical text abstract generation task and avoids the omission of important information when extracting abstract information of each category from the role dialogue information.
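A dual-branch prediction of this kind can be sketched as two independent scoring heads over the same N dialogue sentence feature values. The scalar feature values and per-branch weights below are hypothetical stand-ins for the learned fully connected layers; the point is that both branches score the whole sequence in parallel, with no autoregressive dependence between sentences:

```python
import math

def sigmoid(z):
    # Compress a raw score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def dual_branch_predict(feature_values, w1, b1, w2, b2):
    """Score every sentence feature value with two independent branches
    (the first and second category information labeling modules),
    returning one prediction probability vector per category."""
    first = [sigmoid(w1 * f + b1) for f in feature_values]
    second = [sigmoid(w2 * f + b2) for f in feature_values]
    return first, second
```

Adding further branches with their own weights extends this directly to the M-branch case.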
In another implementation manner of the embodiment of the present application, the category digest sentence determination module 140 is further configured to:
A first category information probability threshold is acquired.
Among the N dialogue sentences, the dialogue sentences whose first category information prediction probability values are greater than or equal to the first category information probability threshold are taken as first category abstract sentences.
A second category information probability threshold is acquired.
Among the N dialogue sentences, the dialogue sentences whose second category information prediction probability values are greater than or equal to the second category information probability threshold are taken as second category abstract sentences.
In another implementation manner of the embodiment of the present application, referring to fig. 19, the summary information generating apparatus 10 further includes: a training data acquisition module 310, a training data feature extraction module 320, a training probability distribution vector prediction module 330, and a loss function generation module 340. Specifically:
the training data acquisition module 310 is configured to acquire a training character dialogue information sample.
The training role dialogue information sample comprises N training dialogue sentences, each training dialogue sentence in the N training dialogue sentences carries a role identifier, the training role dialogue information sample carries M category information tag sequences, each category information tag sequence comprises N category information tags, the category information tags are used for representing the corresponding relation between the training dialogue sentences and the category information, and N is an integer larger than 1.
The training data feature extraction module 320 is configured to perform feature extraction on the training character dialogue information sample, and generate a training dialogue feature vector.
The training dialogue feature vector comprises N training dialogue sentence feature values, and the N training dialogue sentence feature values correspond to the N training dialogue sentences.
The training probability distribution vector prediction module 330 is configured to take the training dialogue feature vector as an input of the multi-branch sequence labeling model, and process the training dialogue feature vector through M class information labeling modules in the multi-branch sequence labeling model to obtain M training class information prediction probability distribution vectors.
Each training category information prediction probability distribution vector comprises N training category information prediction probability values, and each training category information prediction probability value is used for representing the possibility that a training dialogue sentence belongs to category information.
The loss function generating module 340 is configured to generate a total loss function according to the M training class information prediction probability distribution vectors and the M class information tag sequences, and perform parameter optimization on the multi-branch sequence labeling model through the total loss function.
With the apparatus provided in this embodiment of the application, sufficient training on a large amount of supervised data finally yields a sentence-level abstract generation model with excellent performance. The model can predict the category information prediction probability distribution vectors of multiple categories from the dialogue information, and can therefore be used to perform the corresponding abstract generation on the dialogue information, finally obtaining the various kinds of target abstract information.
In another implementation manner of the embodiment of the present application, referring to fig. 19, the loss function generating module 340 is further configured to:
For each class information, a class loss function is generated according to the N training class information prediction probability values in the corresponding training class information prediction probability distribution vector and the N class information labels in the class information label sequence corresponding to the same class information, yielding M class loss functions.
A total loss function is generated from the M category loss functions.
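One plausible instantiation of this loss, assuming each branch is trained as sentence-level binary classification: a binary cross-entropy per class over its N predictions, with the M class losses summed into the total loss. The BCE choice and the unweighted sum are assumptions; the embodiment only states that the total loss is generated from the M class losses:

```python
import math

def binary_cross_entropy(p, y, eps=1e-9):
    # Loss for one sentence: predicted probability p against a 0/1 label y.
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def total_loss(pred_vectors, label_sequences):
    """Average a BCE loss over the N predictions of each class branch,
    then sum the M class losses into the total loss."""
    class_losses = [
        sum(binary_cross_entropy(p, y) for p, y in zip(preds, labels)) / len(preds)
        for preds, labels in zip(pred_vectors, label_sequences)
    ]
    return sum(class_losses)
```

Weighting the class losses instead of summing them equally would be a natural variant when categories are imbalanced.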
Fig. 20 is a schematic diagram of a server structure provided in an embodiment of the present application. The server 300 may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 322 (e.g., one or more processors), memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing applications 342 or data 344. The memory 332 and the storage medium 330 may be transitory or persistent. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the central processor 322 may be configured to communicate with the storage medium 330 and execute, on the server 300, the series of instruction operations in the storage medium 330.
The server 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 20.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a logical functional division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the method of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are merely illustrative of the technical solutions of the present application and are not limiting; although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (20)

1. A digest information generation method, comprising:
acquiring role dialogue information, wherein the role dialogue information comprises N dialogue sentences, each dialogue sentence in the N dialogue sentences carries a role identifier, and N is an integer greater than 1;
taking the role dialogue information as input of a text abstract layer in a feature extraction model, and extracting hidden semantic features of the role dialogue information through the text abstract layer to generate hidden semantic vectors;
taking the role dialogue information as the input of a feature embedding layer in a feature extraction model, and performing feature mapping on the role dialogue information through the feature embedding layer to generate a role embedding feature vector;
splicing the hidden semantic vector and the role embedding feature vector to generate a feature splicing vector;
feature fusion is carried out by taking the feature splicing vector as a self-attention mechanism layer in a feature extraction model, and a dialogue feature vector is generated, wherein the dialogue feature vector comprises N dialogue sentence feature values, and the N dialogue sentence feature values correspond to the N dialogue sentences;
taking the dialogue feature vector as the input of a multi-branch sequence labeling model, and processing the dialogue feature vector through M category information labeling modules in the multi-branch sequence labeling model to obtain M category information prediction probability distribution vectors, wherein each category information prediction probability distribution vector comprises N category information prediction probability values, each category information prediction probability value is used for representing the possibility that the dialogue sentence belongs to the category information, and M is an integer greater than or equal to 1;
determining M groups of class abstract sentences from the N dialogue sentences according to N class information prediction probability values corresponding to each class information prediction probability distribution vector in the M class information prediction probability distribution vectors, wherein the M groups of class abstract sentences correspond to the M class information;
and splicing each group of class abstract sentences in the M groups of class abstract sentences to generate M classes of abstract information.
2. The summary information generating method as claimed in claim 1, wherein said processing the dialogue feature vector by M class information labeling modules in the multi-branch sequence labeling model using the dialogue feature vector as an input of the multi-branch sequence labeling model to obtain M class information prediction probability distribution vectors comprises:
inputting the dialogue feature vectors into the full-connection layers of each of the M category information labeling modules, and performing full-connection processing on the dialogue feature vectors through the full-connection layers of the M category information labeling modules to generate M full-connection feature vectors, wherein each full-connection feature vector comprises N full-connection feature values;
inputting the M full-connection feature vectors into compression function layers of each category information labeling module in the M category information labeling modules, and compressing the M full-connection feature vectors through the compression function layers of the M category information labeling modules to generate M category information prediction probability distribution vectors.
3. The digest information generation method of claim 1, wherein said determining M groups of class abstract sentences from said N dialogue sentences according to the N class information prediction probability values corresponding to each class information prediction probability distribution vector in said M class information prediction probability distribution vectors comprises:
obtaining M probability thresholds corresponding to M category information;
and determining M groups of class abstract sentences from the N dialogue sentences according to M class information prediction probability values corresponding to each dialogue sentence in the N dialogue sentences and the M probability thresholds, wherein the class information prediction probability values of the class abstract sentences are larger than the probability thresholds corresponding to the class information.
4. The summary information generating method as claimed in claim 1, wherein said acquiring character dialogue information comprises:
acquiring original dialogue information and at least one role information, wherein the original dialogue information comprises N original sentences;
and matching the N original sentences with the at least one role information to generate role dialogue information, wherein the role dialogue information comprises N original sentences carrying role identifications.
5. The summary information generating method as claimed in claim 1, wherein said concatenating each of said M groups of class summary sentences to generate M class summary information comprises:
acquiring N original sequence numbers corresponding to the N dialogue sentences in the role dialogue information, wherein the original sequence numbers represent the positions of the dialogue sentences in the order of the role dialogue information;
determining the splicing sequence number of each dialogue sentence in each group of class abstract sentences according to N original sequence numbers corresponding to the N dialogue sentences;
and sequentially splicing each group of class abstract sentences in the M groups of class abstract sentences according to the splicing sequence number of each dialogue sentence in each group of class abstract sentences to generate M classes of abstract information.
6. The summary information generation method as claimed in claim 1, wherein after said generating a dialogue feature vector, further comprising:
the dialogue feature vector is used as input of a double-branch sequence labeling model, a first class information labeling module in the double-branch sequence labeling model is used for processing the dialogue feature vector to obtain a first class information prediction probability distribution vector, and a second class information labeling module in the double-branch sequence labeling model is used for processing the dialogue feature vector to obtain a second class information prediction probability distribution vector, wherein the first class information prediction probability distribution vector comprises N first class information prediction probability values, the first class information prediction probability values are used for representing the possibility that the dialogue sentence belongs to the first class information, and the second class information prediction probability distribution vector comprises N second class information prediction probability values, and the second class information prediction probability values are used for representing the possibility that the dialogue sentence belongs to the second class information;
determining a first class abstract sentence from the N dialogue sentences according to the N first class information prediction probability values, and determining a second class abstract sentence from the N dialogue sentences according to the N second class information prediction probability values;
splicing the first class abstract sentences to generate first class abstract information, and splicing the second class abstract sentences to generate second class abstract information.
7. The summary information generation method of claim 6, wherein said determining a first category summary sentence from said N dialogue sentences based on said N first category information prediction probability values comprises:
acquiring a first class information probability threshold;
taking, among the N dialogue sentences, the dialogue sentences whose first class information prediction probability values are greater than or equal to the first class information probability threshold as the first class abstract sentences;
the determining, according to the N second category information prediction probability values, a second category abstract sentence from the N dialogue sentences includes:
acquiring a probability threshold value of the second category information;
and taking, among the N dialogue sentences, the dialogue sentences whose second class information prediction probability values are greater than or equal to the second class information probability threshold as the second class abstract sentences.
8. The digest information generation method of claim 1, wherein the method further comprises:
acquiring training role dialogue information samples, wherein the training role dialogue information samples comprise N training dialogue sentences, each training dialogue sentence in the N training dialogue sentences carries a role identifier, each training role dialogue information sample carries M category information tag sequences, each category information tag sequence comprises N category information tags, each category information tag is used for representing the corresponding relation between the training dialogue sentence and category information, and N is an integer greater than 1;
extracting features of the training character dialogue information sample to generate training dialogue feature vectors, wherein the training dialogue feature vectors comprise N training dialogue sentence feature values, and the N training dialogue sentence feature values correspond to the N training dialogue sentences;
the training dialogue feature vector is used as input of a multi-branch sequence labeling model, M category information labeling modules in the multi-branch sequence labeling model are used for processing the training dialogue feature vector to obtain M training category information prediction probability distribution vectors, wherein each training category information prediction probability distribution vector comprises N training category information prediction probability values, and each training category information prediction probability value is used for representing the possibility that the training dialogue sentence belongs to the category information;
and generating a total loss function according to the M training class information prediction probability distribution vectors and the M class information tag sequences, and performing parameter optimization on the multi-branch sequence labeling model through the total loss function.
9. The summary information generation method of claim 8, wherein said generating a total loss function from said M training class information predictive probability distribution vectors and said M class information tag sequences comprises:
generating M class loss functions, wherein for each class information, a class loss function is generated according to the N training class information prediction probability values in the training class information prediction probability distribution vector and the N class information labels in the class information label sequence that correspond to the same class information;
and generating a total loss function according to the M category loss functions.
10. A digest information generation apparatus, comprising:
the role dialogue information acquisition module is used for acquiring role dialogue information, wherein the role dialogue information comprises N dialogue sentences, each dialogue sentence in the N dialogue sentences carries a role identifier, and N is an integer greater than 1;
the feature extraction module is used for carrying out feature extraction on the role dialogue information to generate dialogue feature vectors, wherein the dialogue feature vectors comprise N dialogue sentence feature values, and the N dialogue sentence feature values correspond to the N dialogue sentences;
The probability distribution vector prediction module is used for taking the dialogue feature vector as the input of a multi-branch sequence labeling model, and processing the dialogue feature vector through M category information labeling modules in the multi-branch sequence labeling model to obtain M category information prediction probability distribution vectors, wherein each category information prediction probability distribution vector comprises N category information prediction probability values, each category information prediction probability value is used for representing the possibility that the dialogue sentence belongs to the category information, and M is an integer greater than or equal to 1;
a class abstract sentence determining module, configured to determine M groups of class abstract sentences from the N dialogue sentences according to N class information prediction probability values corresponding to each class information prediction probability distribution vector in the M class information prediction probability distribution vectors, where the M groups of class abstract sentences correspond to the M class information;
the class abstract information generation module is used for splicing each group of class abstract sentences in the M groups of class abstract sentences to generate M class abstract information;
the feature extraction module is further configured to:
taking the role dialogue information as the input of a text abstract layer in a feature extraction model, and extracting hidden semantic features of the role dialogue information through the text abstract layer to generate hidden semantic vectors;
taking the role dialogue information as the input of a feature embedding layer in the feature extraction model, and performing feature mapping on the role dialogue information through the feature embedding layer to generate a role embedding feature vector;
performing feature fusion on the hidden semantic vector and the role embedding feature vector to generate the dialogue feature vector;
the feature extraction module is further configured to:
splicing the hidden semantic vector and the role embedding feature vector to generate a feature splicing vector;
and taking the feature splicing vector as the input of a self-attention mechanism layer in the feature extraction model for feature fusion, generating the dialogue feature vector.
11. The apparatus of claim 10, wherein the probability distribution vector prediction module is further configured to:
inputting the dialogue feature vectors into the full-connection layers of each of the M category information labeling modules, and performing full-connection processing on the dialogue feature vectors through the full-connection layers of the M category information labeling modules to generate M full-connection feature vectors; wherein each full-connection feature vector comprises N full-connection feature values;
inputting the M full-connection feature vectors into the compression function layers of each category information labeling module in the M category information labeling modules, and compressing the M full-connection feature vectors through the compression function layers of the M category information labeling modules to generate M category information prediction probability distribution vectors.
12. The apparatus of claim 10, wherein the category summary statement determination module is further configured to:
obtaining M probability thresholds corresponding to M category information;
determining M groups of category abstract sentences from the N dialogue sentences according to the M category information prediction probability values corresponding to each dialogue sentence in the N dialogue sentences and the M probability thresholds; wherein the category information prediction probability value of a category abstract sentence is greater than the probability threshold corresponding to the category information.
13. The apparatus of claim 10, wherein the character conversation information acquisition module is further configured to:
acquiring original dialogue information and at least one role information; the original dialogue information comprises N original sentences;
matching the N original sentences with at least one role information to generate role dialogue information; the role dialogue information comprises N original sentences carrying role identifications.
14. The apparatus of claim 10, wherein the category summary information generation module is further configured to:
acquiring N original sequence numbers corresponding to the N dialogue sentences in the role dialogue information; wherein the original sequence numbers represent the positions of the dialogue sentences in the order of the role dialogue information;
Determining the splicing sequence number of each dialogue sentence in each group of class abstract sentences according to N original sequence numbers corresponding to the N dialogue sentences;
and sequentially splicing each group of class abstract sentences in the M groups of class abstract sentences according to the splicing sequence number of each dialogue sentence in each group of class abstract sentences to generate M classes of abstract information.
15. The apparatus of claim 10, wherein the probability distribution vector prediction module is further configured to: take the dialogue feature vector as input to a dual-branch sequence labeling model, process the dialogue feature vector through a first category information labeling module in the dual-branch sequence labeling model to obtain a first category information prediction probability distribution vector, and process the dialogue feature vector through a second category information labeling module in the dual-branch sequence labeling model to obtain a second category information prediction probability distribution vector;
wherein the first category information prediction probability distribution vector comprises N first category information prediction probability values, each representing the possibility that a dialogue sentence belongs to the first category information, and the second category information prediction probability distribution vector comprises N second category information prediction probability values, each representing the possibility that a dialogue sentence belongs to the second category information;
the class abstract sentence determining module is further configured to determine first class abstract sentences from the N dialogue sentences according to the N first category information prediction probability values, and determine second class abstract sentences from the N dialogue sentences according to the N second category information prediction probability values;
the category abstract information generation module is further configured to splice the first class abstract sentences to generate first category abstract information, and splice the second class abstract sentences to generate second category abstract information.
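The dual-branch structure of claim 15 can be sketched as follows: one shared dialogue feature vector feeds two independent labeling branches, each producing one probability per sentence. The tiny linear-plus-sigmoid branches and their weights are assumptions for illustration, not the patented model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def branch(features, weight, bias):
    # One category information labeling module: score each dialogue
    # sentence feature value and squash the score into a probability.
    return [sigmoid(weight * f + bias) for f in features]

def dual_branch_predict(features, first_params, second_params):
    # Both branches read the same shared dialogue feature vector.
    probs_1 = branch(features, *first_params)   # first category branch
    probs_2 = branch(features, *second_params)  # second category branch
    return probs_1, probs_2

features = [0.2, -1.0, 1.5]        # N = 3 dialogue sentence feature values
p1, p2 = dual_branch_predict(features, (2.0, 0.0), (-2.0, 0.0))
```

Because the branches are independent, a sentence can score high in both categories, one, or neither.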
16. The apparatus of claim 15, wherein the category summary statement determination module is further configured to:
acquiring a first category information probability threshold;
taking, as the first class abstract sentences, those of the N dialogue sentences whose first category information prediction probability values are greater than or equal to the first category information probability threshold;
acquiring a second category information probability threshold;
and taking, as the second class abstract sentences, those of the N dialogue sentences whose second category information prediction probability values are greater than or equal to the second category information probability threshold.
17. The apparatus of claim 10, wherein the abstract information generating apparatus further comprises: a training data acquisition module, a training data feature extraction module, a training probability distribution vector prediction module, and a loss function generation module; specifically:
the training data acquisition module is configured to acquire a training role dialogue information sample; wherein the training role dialogue information sample comprises N training dialogue sentences, each of the N training dialogue sentences carries a role identifier, the training role dialogue information sample carries M category information label sequences, each category information label sequence comprises N category information labels, the category information labels represent the correspondence between the training dialogue sentences and the category information, and N is an integer greater than 1;
the training data feature extraction module is configured to perform feature extraction on the training role dialogue information sample to generate a training dialogue feature vector; wherein the training dialogue feature vector comprises N training dialogue sentence feature values corresponding to the N training dialogue sentences;
the training probability distribution vector prediction module is configured to take the training dialogue feature vector as input to the multi-branch sequence labeling model, and process the training dialogue feature vector through the M category information labeling modules in the multi-branch sequence labeling model to obtain M training category information prediction probability distribution vectors; wherein each training category information prediction probability distribution vector comprises N training category information prediction probability values, each representing the possibility that a training dialogue sentence belongs to the corresponding category information;
the loss function generation module is configured to generate a total loss function according to the M training category information prediction probability distribution vectors and the M category information label sequences, and to optimize the parameters of the multi-branch sequence labeling model through the total loss function.
18. The apparatus of claim 17, wherein the loss function generation module is further configured to:
generating M category loss functions, each according to the N training category information prediction probability values in the training category information prediction probability distribution vector and the N category information labels in the category information label sequence that correspond to the same category information; and
generating the total loss function from the M category loss functions.
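The loss construction of claims 17-18 can be sketched as below. Binary cross-entropy is an assumed choice for the per-category loss, and simple summation an assumed choice for combining the M losses; the claims only require some per-category loss aggregated into a total loss.

```python
import math

def category_loss(pred_probs, labels, eps=1e-9):
    # Binary cross-entropy between one branch's N predicted probabilities
    # and the N category information labels of the same category.
    return -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(pred_probs, labels)
    ) / len(labels)

def total_loss(all_pred_probs, all_labels):
    # Sum the M category losses into the total loss used for training.
    return sum(category_loss(p, y) for p, y in zip(all_pred_probs, all_labels))

preds = [[0.9, 0.1], [0.2, 0.8]]   # M = 2 branches, N = 2 sentences each
labels = [[1, 0], [0, 1]]          # M category information label sequences
loss = total_loss(preds, labels)
```

In a real training loop this scalar would drive gradient-based parameter optimization of the multi-branch model.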
19. A computer device, comprising: memory, transceiver, processor, and bus system;
wherein the memory is used for storing programs;
the processor is configured to execute the program in the memory, including performing the abstract information generation method according to any one of claims 1 to 9;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
20. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the abstract information generation method of any one of claims 1 to 9.
CN202311284994.XA 2023-10-07 2023-10-07 Method and related device for generating abstract information Active CN117009501B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311284994.XA CN117009501B (en) 2023-10-07 2023-10-07 Method and related device for generating abstract information


Publications (2)

Publication Number Publication Date
CN117009501A CN117009501A (en) 2023-11-07
CN117009501B true CN117009501B (en) 2024-01-30

Family

ID=88573058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311284994.XA Active CN117009501B (en) 2023-10-07 2023-10-07 Method and related device for generating abstract information

Country Status (1)

Country Link
CN (1) CN117009501B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507696A (en) * 2021-02-04 2021-03-16 湖南大学 Human-computer interaction diagnosis guiding method and system based on global attention intention recognition
CN115129843A (en) * 2022-06-29 2022-09-30 北京捷通华声科技股份有限公司 Dialog text abstract extraction method and device
CN115145928A (en) * 2022-08-01 2022-10-04 支付宝(杭州)信息技术有限公司 Model training method and device and structured abstract acquisition method and device
CN116127056A (en) * 2022-12-28 2023-05-16 东北大学 Medical dialogue abstracting method with multi-level characteristic enhancement
CN116483991A (en) * 2023-04-25 2023-07-25 上海大学 Dialogue abstract generation method and system
CN116682579A (en) * 2023-06-16 2023-09-01 平安科技(深圳)有限公司 Information recommendation method, device, equipment and storage medium based on inquiry intention

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111862977B (en) * 2020-07-27 2021-08-10 北京嘀嘀无限科技发展有限公司 Voice conversation processing method and system

Also Published As

Publication number Publication date
CN117009501A (en) 2023-11-07

Similar Documents

Publication Publication Date Title
Fan et al. Adverse drug event detection and extraction from open data: A deep learning approach
WO2021233112A1 (en) Multimodal machine learning-based translation method, device, equipment, and storage medium
CN112131366B (en) Method, device and storage medium for training text classification model and text classification
CN110427461B (en) Intelligent question and answer information processing method, electronic equipment and computer readable storage medium
CN110364251B (en) Intelligent interactive diagnosis guide consultation system based on machine reading understanding
JP2021108096A (en) Method and device for outputting information
CN107315737A (en) A kind of semantic logic processing method and system
CN112668671A (en) Method and device for acquiring pre-training model
CN112052318A (en) Semantic recognition method and device, computer equipment and storage medium
CN113707299A (en) Auxiliary diagnosis method and device based on inquiry session and computer equipment
CN113657105A (en) Medical entity extraction method, device, equipment and medium based on vocabulary enhancement
Li et al. MIA-Net: Multi-modal interactive attention network for multi-modal affective analysis
CN117033568A (en) Medical data index interpretation method, device, storage medium and equipment
Ando et al. On the use of modality-specific large-scale pre-trained encoders for multimodal sentiment analysis
Luo et al. Knowledge grounded conversational symptom detection with graph memory networks
CN111339252B (en) Searching method, searching device and storage medium
CN113705207A (en) Grammar error recognition method and device
CN116992879A (en) Entity identification method, device, equipment and medium based on artificial intelligence
CN116843995A (en) Method and device for constructing cytographic pre-training model
CN115936014B (en) Medical entity code matching method, system, computer equipment and storage medium
CN117009501B (en) Method and related device for generating abstract information
CN116719840A (en) Medical information pushing method based on post-medical-record structured processing
CN116011450A (en) Word segmentation model training method, system, equipment, storage medium and word segmentation method
Qin Research on the application of intelligent speech recognition technology in medical big data fog computing system
CN115964475A (en) Dialogue abstract generation method for medical inquiry

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant