CN113673244A - Medical text processing method and device, computer equipment and storage medium - Google Patents

Medical text processing method and device, computer equipment and storage medium

Info

Publication number
CN113673244A
CN113673244A (application CN202110001747.9A; granted as CN113673244B)
Authority
CN
China
Prior art keywords
entity
sequence
graph
medical
medical text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110001747.9A
Other languages
Chinese (zh)
Other versions
CN113673244B (en)
Inventor
张先礼
管冲
陈曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110001747.9A priority Critical patent/CN113673244B/en
Publication of CN113673244A publication Critical patent/CN113673244A/en
Application granted granted Critical
Publication of CN113673244B publication Critical patent/CN113673244B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The application relates to a medical text processing method and apparatus, a computer device, and a storage medium. The method draws on the natural language processing technology of artificial intelligence and comprises the following steps: acquiring an entity sequence from a medical text; querying a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph; performing a graph convolution operation on each subgraph in the subgraph sequence to obtain a subgraph sequence representation; fusing the distributed vector representation of each entity in the entity sequence with the distributed vector representations of that entity's neighbor nodes in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence; and determining, based on the subgraph sequence representation and the entity sequence representation, the probability that the medical text belongs to a predetermined category. By adopting the method, the accuracy of determining the probability that the medical text belongs to the predetermined category can be improved.

Description

Medical text processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a medical text processing method and apparatus, a computer device, and a storage medium.
Background
With the construction of medical informatization systems, a large amount of clinical data has accumulated. Building medical prediction models on such data with machine learning algorithms, to assist medical decision making or to help users monitor their own health, has gradually become a hot research direction, and medical text processing is one of its most valuable tasks.
Traditional medical text processing methods mine only the information present in the medical text itself for prediction, without guidance from medical domain knowledge. In practice, however, the available medical text data is usually noisy, and recording errors or omissions can leave the data incomplete and insufficient, so the probability that a medical text belongs to a predetermined category is determined with low accuracy.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a medical text processing method, an apparatus, a computer device, and a storage medium capable of improving the accuracy of determining the probability that a medical text belongs to a predetermined category.
A medical text processing method, the method comprising:
acquiring an entity sequence in a medical text;
querying a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph;
performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation;
fusing distributed vector representations corresponding to the entities in the entity sequence with distributed vector representations corresponding to neighbor nodes of the entities in the medical knowledge graph to obtain entity sequence representations corresponding to the entity sequence;
determining a probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation.
In one embodiment, the acquiring a sequence of entities in medical text includes:
acquiring a plurality of medical texts;
extracting a medically relevant entity sequence from each medical text;
and ordering the entity sequences according to the times at which the medical texts were generated, to obtain a plurality of groups of entity sequences corresponding to the medical texts.
In one embodiment, the querying a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph includes:
for entities in the entity sequence, determining corresponding entity nodes in a medical knowledge graph;
determining neighbor nodes of the entity nodes from the medical knowledge graph;
determining a path from the entity node to a target node from the medical knowledge-graph;
obtaining a subgraph corresponding to the entity according to the neighbor nodes and the nodes on the path;
and obtaining a subgraph sequence according to the subgraph corresponding to each entity in the entity sequence.
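For illustration only, the sub-graph lookup described above can be sketched as follows. The adjacency-list graph, the toy medical triples, and the BFS path routine are all hypothetical; the patent does not specify how the path to the target node is found.

```python
from collections import deque

def build_subgraph(kg, entity, target):
    """Collect an entity's direct neighbors plus the nodes on a
    BFS shortest path from the entity to the target node."""
    prev = {entity: None}
    queue = deque([entity])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nbr in kg.get(node, []):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    # Walk the predecessor chain back from the target (if reached)
    path = []
    node = target if target in prev else None
    while node is not None:
        path.append(node)
        node = prev[node]
    # Subgraph node set: the entity, its neighbors, and the path nodes
    return set(kg.get(entity, [])) | set(path) | {entity}

# Toy knowledge graph (hypothetical medical relations)
kg = {
    "cough": ["pneumonia", "cold"],
    "pneumonia": ["fever", "antibiotics"],
    "cold": ["fever"],
}
sub = build_subgraph(kg, "cough", "fever")
```

One such subgraph is built per entity in the sequence, and the resulting list of subgraphs forms the subgraph sequence.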
In one embodiment, the performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation corresponding to the entity sequence includes:
inputting each sub-graph into a graph convolution network to obtain the graph convolution features corresponding to each sub-graph;
performing a pooling operation on the graph convolution features to obtain a vector representation corresponding to each sub-graph;
and fusing the vector representations corresponding to the sub-graphs to obtain the sub-graph sequence representation corresponding to the entity sequence.
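A minimal sketch of the graph convolution and pooling steps above, assuming a single GCN layer with symmetric normalization and mean pooling; the number of layers, the weights, and the pooling operator are illustrative choices, not fixed by the text:

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph convolution layer: D^-1/2 (A+I) D^-1/2 X W, then ReLU."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(deg ** -0.5)
    norm = d_inv_sqrt @ a_hat @ d_inv_sqrt       # symmetric normalization
    return np.maximum(norm @ feats @ weight, 0)  # ReLU

def subgraph_vector(adj, feats, weight):
    """Mean-pool the node features into one vector per subgraph."""
    return gcn_layer(adj, feats, weight).mean(axis=0)

adj = np.array([[0.0, 1.0], [1.0, 0.0]])  # two connected nodes
feats = np.eye(2)                          # one-hot node features
weight = np.ones((2, 3))                   # toy 2 -> 3 projection
vec = subgraph_vector(adj, feats, weight)
```

Applying `subgraph_vector` to every subgraph in the sequence and stacking the results gives the subgraph sequence representation.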
In one embodiment, the method further comprises:
for entities in the entity sequence, determining corresponding entity nodes in a medical knowledge graph;
determining neighbor nodes of the entity nodes from the medical knowledge graph;
determining edge relationships between the entity nodes and the neighbor nodes from the medical knowledge graph;
determining distributed vector representations corresponding to the entity node, the edge relation and the neighbor node;
wherein the sum of the distributed vector representations of the entity node and the edge relation is associated with the distributed vector representation of the neighbor node.
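The association stated above matches a TransE-style constraint, in which the embedding of a neighbor (tail) node should lie close to the sum of the entity (head) and edge-relation embeddings. A toy sketch with made-up embedding values:

```python
import numpy as np

def transe_score(head, rel, tail):
    """TransE-style distance: ||h + r - t|| should be small for true
    triples and large for corrupted ones."""
    return np.linalg.norm(head + rel - tail)

# Toy embeddings for a hypothetical triple (pneumonia, has_symptom, fever)
h = np.array([0.2, 0.5])
r = np.array([0.3, -0.1])
t_true = np.array([0.5, 0.4])    # consistent with h + r
t_fake = np.array([-1.0, 2.0])   # an unrelated node

assert transe_score(h, r, t_true) < transe_score(h, r, t_fake)
```

In training such embeddings, true triples from the knowledge graph are pushed toward zero distance while corrupted triples are pushed apart.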
In one embodiment, the fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence includes:
for entities in the entity sequence, determining corresponding entity nodes in the medical knowledge graph;
determining neighbor nodes of the entity nodes from the medical knowledge graph;
determining an attention weight of each neighbor node to the entity node;
according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain an entity representation corresponding to the entity node;
and obtaining entity sequence representations corresponding to the entity sequences according to the entity representations corresponding to the entities in the entity sequences.
In one embodiment, the determining the attention weight of each neighbor node to the entity node comprises:
determining the attention score of each neighbor node with respect to the entity node according to the distributed vector representation of the neighbor node, the distributed vector representation of the edge relation between the neighbor node and the entity node, and the distributed vector representation of the entity node;
and obtaining the attention weight corresponding to each neighbor node according to the attention score of each neighbor node to the entity node.
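A hedged sketch of the attention computation above: the exact scoring function is not given beyond its inputs, so a simple dot-product score over the neighbor and relation vectors against the entity vector is assumed here, followed by a softmax:

```python
import numpy as np

def attention_weights(entity, rels, neighbors):
    """Score each neighbor (plus its edge relation) against the entity
    with dot products, then normalize the scores with a softmax."""
    scores = np.array([n @ entity + r @ entity
                       for n, r in zip(neighbors, rels)])
    exp = np.exp(scores - scores.max())   # numerically stable softmax
    return exp / exp.sum()

def fuse_entity(entity, rels, neighbors):
    """Add the attention-weighted sum of neighbor vectors to the entity."""
    w = attention_weights(entity, rels, neighbors)
    return entity + (w[:, None] * np.asarray(neighbors)).sum(axis=0)

entity = np.ones(2)
neighbors = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
rels = [np.zeros(2), np.zeros(2)]
fused = fuse_entity(entity, rels, neighbors)
```

Repeating `fuse_entity` for every entity in the sequence yields the entity sequence representation.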
In one embodiment, the determining the probability that the medical text belongs to the predetermined category based on the sub-graph sequence representation and the entity sequence representation comprises:
obtaining a subgraph hidden state of the sub-graph sequence representation through a first convolutional network;
obtaining an entity hidden state of the entity sequence representation through a second convolutional network;
fusing the subgraph hidden state and the entity hidden state to obtain a medical text representation;
determining, by a classifier, a probability that the medical text belongs to a predetermined category from the medical text representation.
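The final fusion and classification step might look like the following sketch, assuming concatenation as the fusion operator and a sigmoid (binary) classifier; both choices are assumptions, as is the toy weight vector:

```python
import numpy as np

def classify(subgraph_hidden, entity_hidden, w, b):
    """Concatenate the two hidden states into a medical text
    representation and apply a sigmoid classifier head."""
    rep = np.concatenate([subgraph_hidden, entity_hidden])
    return 1.0 / (1.0 + np.exp(-(rep @ w + b)))

sub_h = np.array([0.2, -0.1])   # hypothetical subgraph hidden state
ent_h = np.array([0.4, 0.3])    # hypothetical entity hidden state
w = np.zeros(4)
b = 0.0
p = classify(sub_h, ent_h, w, b)  # zero weights give an uninformed 0.5
```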
In one embodiment, the fusing the subgraph hidden state and the entity hidden state to obtain the medical text representation includes:
fusing a plurality of subgraph hidden states corresponding to a plurality of medical texts to obtain a subgraph hidden vector;
fusing a plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector;
and concatenating the subgraph hidden vector with the entity hidden vector to obtain a medical text representation.
In one embodiment, the method is implemented by a medical text processing model, the training step of the medical text processing model comprising:
acquiring a sample medical text and marking information corresponding to the sample medical text;
inputting the sample medical text into a medical text processing model, obtaining an entity sequence in the sample medical text through the medical text processing model, inquiring a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, carrying out graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fusing a distributed vector representation corresponding to each entity in the entity sequence and a distributed vector representation corresponding to a neighbor node of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determining the prediction probability that the sample medical text belongs to a preset category based on the sub-graph sequence representation and the entity sequence representation;
constructing cross entropy loss by the prediction probability and the labeling information;
and after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training until a training stopping condition is met, and obtaining the medical text processing model for determining the probability that the medical text belongs to the preset category.
A method of training a medical text processing model, the method comprising:
acquiring a sample medical text and marking information corresponding to the sample medical text;
inputting the sample medical text into a medical text processing model, obtaining an entity sequence in the sample medical text through the medical text processing model, inquiring a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, carrying out graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fusing a distributed vector representation corresponding to each entity in the entity sequence and a distributed vector representation corresponding to a neighbor node of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determining the prediction probability that the sample medical text belongs to a preset category based on the sub-graph sequence representation and the entity sequence representation;
constructing cross entropy loss by the prediction probability and the labeling information;
and after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training until a training stopping condition is met, and obtaining the medical text processing model for determining the probability that the medical text belongs to the preset category.
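A minimal sketch of one training step under the cross-entropy loss described above. Only the final sigmoid layer is shown, with the feature extraction pipeline abstracted into a fixed feature vector; the learning rate and shapes are illustrative:

```python
import numpy as np

def cross_entropy(pred, label, eps=1e-12):
    """Binary cross-entropy between a predicted probability and a label."""
    pred = np.clip(pred, eps, 1 - eps)
    return -(label * np.log(pred) + (1 - label) * np.log(1 - pred))

def train_step(w, b, feats, label, lr=0.5):
    """One gradient step on a single sample for the classifier head."""
    pred = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    grad = pred - label   # d(BCE)/d(logit) for a sigmoid output
    return w - lr * grad * feats, b - lr * grad, cross_entropy(pred, label)

feats = np.array([1.0, 2.0])    # stand-in for the fused representation
w, b = np.zeros(2), 0.0
losses = []
for _ in range(5):              # the loop stands in for "return and continue"
    w, b, loss = train_step(w, b, feats, label=1.0)
    losses.append(loss)
```

The loop mirrors the claimed procedure: compute the prediction probability, construct the cross-entropy loss, update the parameters, and repeat until a stop condition is met.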
A medical text processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring an entity sequence in the medical text;
the query module is used for querying the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph;
the sub-graph sequence representation module is used for carrying out graph convolution operation on each sub-graph in the sub-graph sequence to obtain sub-graph sequence representation;
the entity sequence representation module is used for fusing distributed vector representations corresponding to the entities in the entity sequence and distributed vector representations corresponding to neighbor nodes of the entities in the medical knowledge graph to obtain entity sequence representations corresponding to the entity sequence;
a determination module for determining a probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation.
A training apparatus for a medical text processing model, the apparatus comprising:
the acquisition module is used for acquiring a sample medical text and marking information corresponding to the sample medical text;
a determining module, configured to input the sample medical text into a medical text processing model, obtain an entity sequence in the sample medical text through the medical text processing model, query a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, perform graph convolution operation on each sub-graph in the sub-graph sequence, obtain a sub-graph sequence representation, fuse a distributed vector representation corresponding to each entity in the entity sequence and a distributed vector representation corresponding to a neighbor node of each entity in the medical knowledge graph, obtain an entity sequence representation corresponding to the entity sequence, and determine, based on the sub-graph sequence representation and the entity sequence representation, a prediction probability that the sample medical text belongs to a predetermined category;
the loss construction module is used for constructing cross entropy loss by the prediction probability and the marking information;
and the model updating module is used for, after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training until a training stopping condition is met, and obtaining the medical text processing model used for determining the probability that a medical text belongs to a predetermined category.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the above medical text processing method and/or training method of a medical text processing model when executing the computer program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned medical text processing method and/or training method of a medical text processing model.
A computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the steps of the above medical text processing method and/or training method of a medical text processing model.
According to the medical text processing method and apparatus, the computer device, and the storage medium, domain knowledge in the medical knowledge graph is queried to construct the sub-graph sequence corresponding to the entity sequence of the medical text, and the sub-graph sequence representation is obtained by a graph convolution operation; that is, new features are constructed for the entity sequence, which not only enriches the input information and mines the rich semantic information in the medical knowledge graph, but also captures graph structure information. In addition, meaningful distributed vector representations of each entity and its neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representation of the entity, so that domain knowledge in the medical knowledge graph is embedded more fully into the entity sequence representation. In this way, the accuracy of the probability, determined from the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category can be improved.
According to the training method and apparatus for the medical text processing model, the computer device, and the storage medium, domain knowledge is mined from the medical knowledge graph during training, which enriches the semantic information of the entity sequence corresponding to the sample medical text and improves model accuracy. By constructing a sub-graph sequence corresponding to the entity sequence of the sample medical text and obtaining the sub-graph sequence representation through a graph convolution operation, new features are constructed for the entity sequence. This not only enriches the input information and mines the rich semantic information in the medical knowledge graph, but also reduces the adverse effect of deviations in the medical text data (such as missing data) on model training. Meanwhile, the graph convolution operation captures graph structure information in the medical knowledge graph, from which more discriminative associations between entities and the target predetermined category can be obtained, further improving model accuracy. Furthermore, embedding the distributed vector representations of the neighbor nodes into the vector representation of the entity allows domain knowledge in the medical knowledge graph to be embedded more fully into the entity sequence representation. As a result, the accuracy of the probability, determined by the model from the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category is markedly improved.
Drawings
FIG. 1 is a diagram of an application environment of a medical text processing method in one embodiment;
FIG. 2 is a block diagram of an overall architecture of a medical text processing method in one embodiment;
FIG. 3 is a flow diagram that illustrates a method for medical text processing, according to one embodiment;
FIG. 4 is a schematic flow chart illustrating query of sub-graph sequences based on a medical knowledge-graph in one embodiment;
FIG. 5 is a schematic flow diagram for constructing a sub-graph corresponding to an entity from a medical knowledge-graph in one embodiment;
FIG. 6 is a schematic flow chart illustrating obtaining a sub-graph sequence representation corresponding to an entity sequence in one embodiment;
FIG. 7 is a flow diagram illustrating obtaining a representation of an entity sequence corresponding to the entity sequence, in accordance with an embodiment;
FIG. 8 is a flow diagram that illustrates the determination of a probability that a medical text belongs to a predetermined category based on a sub-graph sequence representation and an entity sequence representation, under an embodiment;
FIG. 9 is a flow diagram of a method for medical text processing in an exemplary embodiment;
FIG. 10 is a flowchart illustrating a method for training a medical text processing model according to an embodiment;
FIG. 11 is a diagram of a training framework for a medical text processing model in accordance with an embodiment;
FIG. 12 is a block diagram showing the construction of a medical text processing apparatus according to an embodiment;
FIG. 13 is a block diagram showing a configuration of a training apparatus for a medical text processing model according to an embodiment;
FIG. 14 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The present application provides a medical text processing method and a training method for a medical text processing model that relate to Artificial Intelligence (AI) technology. AI is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, so that machines can perceive, reason, and make decisions.
Artificial intelligence is a comprehensive discipline that spans a wide range of fields, covering both hardware-level and software-level technologies. Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
The embodiments of the application provide a training method for a medical text processing model and a medical text processing method that mainly relate to the Natural Language Processing (NLP) technology of artificial intelligence. NLP is an important direction in the fields of computer science and artificial intelligence: it studies the theories and methods that enable effective communication between humans and computers in natural language. It is a science that integrates linguistics, computer science, and mathematics; research in this field involves natural language, the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
Intelligent medical care is a major application and industrial field of artificial intelligence. Medical AI includes image-based AI, such as for fundus images, CT images, MRI images, and skin images, as well as AI for converting speech into electronic medical records, AI for auxiliary examination, guide robots, drug dosing, drug research and development, telemedicine, and the like.
The medical text processing method provided by the embodiments of the application can be applied in the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a network. The terminal 102 obtains a medical text and sends it to the server 104. The server 104 extracts the entity sequence in the medical text, queries the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph, performs a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fuses the distributed vector representation of each entity in the entity sequence with the distributed vector representations of that entity's neighbor nodes in the medical knowledge graph to obtain an entity sequence representation, and determines the probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation. The server 104 may then return the determined probability to the terminal 102.
The training method of the medical text processing model provided by the embodiments of the application can also be applied in the application environment shown in fig. 1. The server 104 acquires a sample medical text and the labeling information corresponding to the sample medical text, and inputs the sample medical text into a medical text processing model. Through the model, the server obtains the entity sequence in the sample medical text, queries the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph, performs a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fuses the distributed vector representation of each entity in the entity sequence with the distributed vector representations of that entity's neighbor nodes in the medical knowledge graph to obtain an entity sequence representation, and determines, based on the sub-graph sequence representation and the entity sequence representation, the prediction probability that the sample medical text belongs to a predetermined category. A cross-entropy loss is constructed from the prediction probability and the labeling information; after the model parameters of the medical text processing model are updated according to the cross-entropy loss, the process returns to the step of acquiring the sample medical text and its labeling information to continue training until a training stop condition is met, yielding the medical text processing model for determining the probability that a medical text belongs to the predetermined category. The terminal 102 may then set up an initial medical text processing model locally and import the model parameters trained by the server 104, thereby obtaining the medical text processing model.
The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.
According to the medical text processing method provided by the embodiments of the application, domain knowledge in the medical knowledge graph is queried to construct the sub-graph sequence corresponding to the entity sequence of the medical text, and the sub-graph sequence representation is obtained by a graph convolution operation; that is, new features are constructed for the entity sequence, which not only enriches the input information and mines the rich semantic information in the medical knowledge graph, but also captures graph structure information. In addition, meaningful distributed vector representations of each entity and its neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representation of the entity, so that domain knowledge in the medical knowledge graph is embedded more fully into the entity sequence representation. In this way, the accuracy of the probability, determined from the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category can be improved.
The medical text processing method and the training method of the medical text processing model provided by the embodiments of the application can be applied to an artificial intelligence medical auxiliary decision-making system. In such an auxiliary decision-making system, medical text processing is an important function: the input of the system is a medical text, and the output is the probability that the medical text belongs to a predetermined category. In one application scenario, the system may provide clinical decision assistance support for a physician. In another application scenario, a user may input medical text related to himself or herself into the auxiliary decision-making system, and the output probability that the medical text belongs to a predetermined category may prompt the user to pay more attention to his or her health condition.
Fig. 2 is a schematic diagram of the overall framework of the medical text processing method in one embodiment. Referring to fig. 2, the method mainly comprises four steps: sub-graph sequence construction, sub-graph sequence representation, entity sequence representation, and time-series relationship mining with output of the probability of belonging to a predetermined category. Firstly, entities are extracted from the medical text to obtain an entity sequence. If the user inputs a plurality of medical texts, each including a plurality of entities, the entity sequences of the medical texts may be arranged according to the time order in which the texts were generated, to serve as the input entity sequences. Next, a sub-graph sequence corresponding to the entity sequence of the medical text is constructed by querying the medical knowledge graph, and the sub-graph sequence representation is obtained using a graph convolution network. Then, by querying the medical knowledge graph, the distributed vector representations of the neighbor nodes of each entity in the entity sequence are embedded into the representation of that entity, and the entity sequence representation is obtained using a graph attention mechanism. Finally, the hidden states of the sub-graph sequence representation and the entity sequence representation are respectively mined through two time-series relationship mining networks, the two hidden states are fused, and the fused hidden state is input into a classifier for prediction to obtain the probability that the medical text belongs to a predetermined category. The time-series relationship mining network may adopt a Long Short-Term Memory network (LSTM), a Recurrent Neural Network (RNN), a Gated Recurrent Unit (GRU), a Convolutional Neural Network (CNN), or the like.
In one embodiment, as shown in fig. 3, a medical text processing method is provided, which is described by taking as an example the application of the method to the computer device (the terminal 102 or the server 104) in fig. 1, and includes the following steps:
step 302, an entity sequence in the medical text is obtained.
The medical text is text data related to the physical condition of a user; for example, it may be a series of physical examination data obtained after a physical examination of the user, or physical condition monitoring data recorded by the user. The entity sequence is a sequence constructed from entities extracted from the medical text. The computer device may use natural language processing techniques (e.g., named entity recognition) to parse structured discrete features from the medical text, including symptoms, diseases, drugs, personal information of the user, past medical history, and the like, which may yield features such as "barking cough", "hoarseness", "dyspnea", and "having diabetes". The medical text of the user may include a plurality of entities, which constitute the entity sequence corresponding to the medical text. The computer device can also obtain an electronic medical record of the user and extract the entities from the electronic medical record using named entity recognition to obtain the entity sequence.
In one embodiment, acquiring a sequence of entities in medical text comprises: acquiring a plurality of medical texts; extracting entity sequences related to medical treatment from each medical treatment text; and sequencing the entity sequences according to the time sequence generated by each medical text to obtain a plurality of groups of entity sequences corresponding to a plurality of medical texts.
In this embodiment, a user may input a plurality of medical texts, each medical text includes a plurality of entities, and the computer device may sort the entity sequences extracted from the medical texts according to a time sequence generated by the medical texts, to obtain a plurality of groups of entity sequences corresponding to the plurality of medical texts, so as to facilitate mining of potential time-series relationships between the groups of entity sequences.
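As a minimal sketch of the ordering step above, assuming each medical text carries a creation date and that a trained NER component is available (the `toy_ner` vocabulary matcher below is only a hypothetical stand-in for a real named entity recognition model):

```python
from datetime import date

def build_entity_sequences(medical_texts, extract_entities):
    """Sort texts by creation time, then extract one entity sequence per text.

    `medical_texts`: list of dicts with hypothetical keys "created" and "content".
    `extract_entities`: a stand-in for a trained NER component.
    """
    ordered = sorted(medical_texts, key=lambda t: t["created"])
    return [extract_entities(t["content"]) for t in ordered]

# Toy "NER": match against a fixed vocabulary (a real system would use a trained model).
VOCAB = ["barking cough", "hoarseness", "dyspnea"]

def toy_ner(text):
    return [e for e in VOCAB if e in text]

texts = [
    {"created": date(2021, 3, 2), "content": "presents dyspnea and hoarseness"},
    {"created": date(2021, 3, 1), "content": "barking cough reported"},
]
sequences = build_entity_sequences(texts, toy_ner)
```

The earlier text contributes the first group of entities, giving the groups of entity sequences in generation order.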
And step 304, inquiring a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph.
Here, a knowledge graph is a formal description framework for semantic knowledge; the data in a knowledge graph is usually represented by triples, i.e., (entity, relationship, entity). The medical knowledge graph combines professional knowledge and logical reasoning patterns in the medical field, formed through years of medical study and clinical work. It comprises structured medical knowledge points (information and data) and logical associations consistent with medical knowledge, for example "entity 1 causes entity 2" and "entity 1 relieves entity 2". The structured data and professional logical associations determine the flow direction and order in which the artificial intelligence processes condition data; the computer program operates according to this flow direction and order, thereby simulating a physician's cognition and reasoning about condition data and realizing medical artificial intelligence. The medical knowledge graph may be a knowledge graph of a clinical specialty, such as a cardiovascular disease knowledge graph, a lung disease knowledge graph, or a critical illness knowledge graph.
The inventor realized that, in a practical scenario, the obtained entity sequence data is usually noisy, and recording errors or omissions make the data incomplete and insufficient. After the entity sequence corresponding to the medical text is obtained, if only the information in the entity sequence itself is mined for prediction, then without the guidance of domain knowledge the predicted probability of the predetermined category is affected by data noise, causing prediction deviation. Therefore, in this embodiment, the computer device may query, from the medical knowledge graph, a sub-graph corresponding to each entity in the entity sequence to obtain the sub-graph sequence. Since each sub-graph includes not only other entities associated with the entity in the medical knowledge graph but also the relationships with those entities, domain knowledge can be integrated into the representation of the entity, which enriches the semantic information and reduces the prediction deviation caused by data recording errors or missing data.
In one embodiment, a computer device may query a medical knowledge-graph, determine from the medical knowledge-graph an entity node corresponding to an entity in a sequence of entities, and treat as a subgraph corresponding to the entity a subgraph composed of the entity node, neighbor nodes of the entity node, and edge relationships with the neighbor nodes. And acquiring a sub-graph sequence according to the sub-graph corresponding to each entity in the entity sequence, thereby constructing a new feature for the entity sequence, wherein the new feature comprises the neighbor nodes of the entity in the knowledge graph, and the semantic information of the entity can be enriched.
The neighbor node may be a first-order neighbor node or a second-order neighbor node.
In one embodiment, the computer device may further query a medical knowledge-graph, determine entity nodes corresponding to entities in the sequence of entities from the medical knowledge-graph, determine target nodes corresponding to the predetermined categories, and use a sub-graph formed by nodes and edge relationships involved in a path from the entity node to the target nodes as the sub-graph corresponding to the entities. The target node may be a node corresponding to an entity in the medical knowledge graph that is closely related to the predetermined category, for example, the entity may be at least one of a target disease, a drug required to treat the target disease, a symptom of the target disease, a diagnostic conclusion in the medical knowledge field, and the like. According to the subgraph corresponding to each entity in the entity sequence, the subgraph sequence is obtained, so that a new feature is constructed for the entity sequence, the new feature comprises all nodes on the path from the entity to the predetermined category, and equivalently, more data are introduced to participate in the representation of the entity, so that the prediction deviation caused by data recording errors or missing data can be reduced.
In one embodiment, the computer device may further consider all nodes of an entity node that are neighbors of the entity node in the medical knowledge-graph and on the path of the entity node to the target node as subgraphs of the entity by querying the medical knowledge-graph. As shown in fig. 4, querying a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph includes:
for entities in the entity sequence, corresponding entity nodes in the medical knowledge-graph are determined, step 402.
Step 404, determining neighbor nodes of the entity nodes from the medical knowledge graph.
Step 406, determining a path from the entity node to the target node from the medical knowledge-graph.
And step 408, acquiring a subgraph corresponding to the entity according to the neighbor nodes and the nodes on the path.
And step 410, obtaining a subgraph sequence according to the subgraphs corresponding to the entities in the entity sequence.
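The sub-graph construction of steps 402 to 410 can be sketched over a toy adjacency-set graph; the node names and the tiny knowledge-graph fragment below are hypothetical, and a real medical knowledge graph would of course be far larger:

```python
from collections import deque

def shortest_path(adj, src, dst):
    """BFS shortest path in an undirected graph given as {node: set(neighbors)}."""
    prev = {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:  # walk back through predecessors to rebuild the path
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    return []

def build_subgraph(adj, entity, target):
    """Steps 402-408: entity node + its neighbors + all nodes on a path to the target."""
    nodes = {entity} | set(adj[entity]) | set(shortest_path(adj, entity, target))
    # Keep every knowledge-graph edge whose two endpoints both fall inside the sub-graph.
    edges = {(u, v) for u in nodes for v in adj[u] if v in nodes and u < v}
    return nodes, edges

# Hypothetical fragment of a medical knowledge graph.
kg = {
    "cough": {"croup", "asthma"},
    "croup": {"cough", "dyspnea"},
    "dyspnea": {"croup", "asthma"},
    "asthma": {"cough", "dyspnea"},
}
nodes, edges = build_subgraph(kg, "cough", "dyspnea")
```

Repeating `build_subgraph` for every entity in the entity sequence yields the sub-graph sequence of step 410.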
FIG. 5 is a diagram illustrating the construction of a sub-graph corresponding to an entity from a medical knowledge-graph in one embodiment. Referring to fig. 5, white nodes are schematic of partial nodes of the medical knowledge graph, black nodes represent target nodes, gray nodes represent entity nodes corresponding to a certain entity in an entity sequence, the entity nodes, neighbor nodes of the entity nodes, and all nodes on a path from the entity nodes to the target nodes are used as subgraphs corresponding to the entity, and as shown in fig. 5, nodes and edges surrounded by a dashed frame are subgraphs corresponding to the entity nodes.
In this embodiment, the constructed subgraph is equivalent to constructing a new feature for an entity sequence, the new feature includes neighbor nodes of the entity in a knowledge graph, and can enrich semantic information of the entity, and also includes nodes and edge relations related to a path from the entity node to a target node, so that prediction deviation caused by data recording errors or missing data can be reduced.
And step 306, after graph convolution operation is carried out on each subgraph in the subgraph sequence, a subgraph sequence representation is obtained.
The graph convolution operation extracts features from graph data: performing the graph convolution operation on graph data yields the features of that graph data. The inventor realized that simply using the knowledge graph to determine the probability that the medical text belongs to the predetermined category, without deeply mining the medical knowledge it contains, results in low determination accuracy. Because the graph convolution operation can process not only the nodes in the sub-graph but also the edge relationships among those nodes, and extracts the features of the sub-graph from the nodes and edge relationships together, the obtained sub-graph representation can mine both the semantic information in the medical knowledge graph and the graph structure information, thereby making fuller use of the domain knowledge.
In an embodiment, as shown in fig. 6, after performing graph convolution operation on each sub-graph, obtaining a sub-graph sequence representation corresponding to an entity sequence includes:
step 602, inputting each subgraph into a graph convolution network, and obtaining graph convolution characteristics corresponding to each subgraph.
The graph convolution network is a neural network structure for processing graph data, and is different from a traditional network which can only be used for grid-based data, such as a convolutional neural network, and the graph convolution network can process data with a generalized topology structure. The graph convolution network may employ ChebNet (chebyshev network) or GCN (graph convolution neural network), among others.
And step 604, performing pooling operation on the graph convolution characteristics to obtain vector representation corresponding to the subgraph.
The Pooling operation (Pooling) is a data processing operation in a neural network, and is used for performing dimensionality reduction on data by simulating a human visual system, and is also called Subsampling (Subsampling) or Downsampling (Downsampling). The pooling operation herein may be global maximum pooling, global average pooling, random pooling, etc., and may be set according to actual needs.
And 606, fusing the vector representations corresponding to the sub-graphs to obtain sub-graph sequence representations corresponding to the entity sequence.
Specifically, the computer device may fuse vector representations corresponding to respective subgraphs in the entity sequence to obtain a subgraph sequence representation corresponding to the entity sequence.
In one embodiment, a sub-graph sequence representation corresponding to an entity sequence can be obtained by the following formula.
$$v_i = \mathrm{GlobalMaxPooling}(\mathrm{GCN}(sg_i))$$

where $\mathrm{GCN}()$ is the graph convolution operation, $\mathrm{GlobalMaxPooling}()$ is the global max pooling operation, $sg_i$ is the sub-graph corresponding to the ith entity in the entity sequence (also called the ith sub-graph), and $v_i$ is the vector representation of the ith sub-graph. When a plurality of groups of entity sequences are extracted from the medical text, the vector representations of the sub-graphs corresponding to the entities in the jth entity sequence are fused into one vector representation, namely the sub-graph sequence representation corresponding to that entity sequence:

$$V^j = W_v[v_1^j, v_2^j, \dots, v_n^j] + b_v$$

where $V^j$ denotes the sub-graph sequence representation corresponding to the jth entity sequence, and $W_v$ and $b_v$ are trained model parameters.
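A minimal sketch of the per-sub-graph representation step: one randomly initialized GCN propagation layer followed by global max pooling over the node dimension. The layer sizes, the 3-node adjacency matrix, and the random weights below are illustrative, not the patent's actual configuration:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)         # symmetric normalization
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W)

def subgraph_vector(A, X, W):
    """v_i = GlobalMaxPooling(GCN(sg_i)): max over the node dimension."""
    return gcn_layer(A, X, W).max(axis=0)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)  # toy 3-node sub-graph
X = rng.normal(size=(3, 4))   # node features
W = rng.normal(size=(4, 8))   # GCN layer weights (untrained stand-ins)
v = subgraph_vector(A, X, W)
```

Applying `subgraph_vector` to every sub-graph in the sequence and stacking the results gives the inputs to the fusion formula above.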
And 308, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence.
Specifically, to obtain meaningful vector representations of the entities and edges in the medical knowledge graph, the computer device may obtain distributed vector representations of the nodes and edges in the medical knowledge graph by using knowledge graph representation learning during the training of the medical text processing model. Then, after the model is trained, the entity node corresponding to an entity can be directly queried in the medical knowledge graph to obtain the distributed vector representation corresponding to that entity node, and the distributed vector representation corresponding to each entity in the entity sequence is fused with the distributed vector representations corresponding to its neighbor nodes to obtain the entity sequence representation corresponding to the entity sequence.
During the training of the medical text processing model, the computer device can use a Trans model, such as the TransE model, to obtain the distributed vector representations of the entities and edge relations in the medical knowledge graph. The idea of TransE is that, during training, the sum of the distributed vector corresponding to the head node in an entity-relation triple and the distributed vector corresponding to the edge relation is made as close as possible to the distributed vector corresponding to the tail node. In some alternative embodiments, the computer device may also employ other Trans models to characterize the distributed vectors of entity and edge relations in the medical knowledge graph as needed, such as the TransH, TransR, and TransD models.
In one embodiment, a TransE model is used to obtain a vector representation of node and edge relationships in a medical knowledge graph, as shown in the following equation:
$$e_{h_i} + e_{r_i} \approx e_{t_i}$$

where $h_i$, $r_i$ and $t_i$ are respectively the head entity, the edge relation and the tail entity corresponding to the ith edge in the medical knowledge graph, and $e_{h_i}$, $e_{r_i}$ and $e_{t_i}$ respectively denote the vectors corresponding to the head entity, the edge relation and the tail entity of the ith edge. The features corresponding to the head entity, the edge relation and the tail entity are obtained through the TransE model, and the vector representations corresponding to these features are obtained using a multilayer perceptron (MLP).
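The TransE idea described above, that the head vector plus the relation vector should land near the tail vector, can be illustrated by scoring triples with the norm of $h + r - t$ (lower means more plausible); the embeddings below are random stand-ins, not trained representations:

```python
import numpy as np

def transE_score(e_h, e_r, e_t):
    """TransE plausibility score for a triple: ||h + r - t||; lower is better."""
    return np.linalg.norm(e_h + e_r - e_t)

rng = np.random.default_rng(1)
dim = 16
e_h = rng.normal(size=dim)                            # head entity embedding
e_r = rng.normal(size=dim)                            # edge relation embedding
e_t_good = e_h + e_r + 0.01 * rng.normal(size=dim)    # tail close to h + r
e_t_bad = rng.normal(size=dim)                        # unrelated tail

good = transE_score(e_h, e_r, e_t_good)
bad = transE_score(e_h, e_r, e_t_bad)
```

Training drives the score of observed knowledge-graph triples down relative to corrupted ones, which is what makes the resulting distributed vectors "meaningful".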
In one embodiment, the method further comprises: for entities in the entity sequence, determining corresponding entity nodes in the medical knowledge graph; determining neighbor nodes of the entity nodes from the medical knowledge graph; determining an edge relation between an entity node and a neighbor node from a medical knowledge graph; determining distributed vector representations corresponding to the entity node, the edge relation and the neighbor node; wherein, the sum of the distributed vector representations between the entity node and the edge relation and the distributed vector representation of the neighbor node have correlation.
Specifically, for each entity in the entity sequence, the computer device queries the medical knowledge graph for the entity node corresponding to the entity and for the neighbor nodes of that entity node. After determining the neighbor nodes, it determines, according to the entity-edge relation-entity triples, the distributed vector representations corresponding to each triple related to the entity node; in the distributed vector representations of each triple, the sum of the distributed vector representations corresponding to the entity node and the edge relation is very close to the distributed vector representation of the neighbor node. After determining the distributed vector representations corresponding to each triple related to the entity node, the computer device fuses the distributed vector representations of the neighbor nodes with the distributed vector representation of the entity node, and takes the fused result as the knowledge representation of the entity node in the medical knowledge graph, thereby obtaining the entity sequence representation corresponding to the whole entity sequence.
In this embodiment, since the medical knowledge graph includes structured medical knowledge points and the logical associations between them, embedding the distributed vector representations of the neighbor nodes into the entity representation yields richer semantic information for the entity, thereby improving the prediction accuracy.
In one embodiment, as shown in fig. 7, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, includes:
step 702, for entities in the entity sequence, corresponding entity nodes in the medical knowledge-graph are determined.
Step 704, determine the neighbor nodes of the entity nodes from the medical knowledge graph.
At step 706, the attention weight of each neighbor node to the physical node is determined.
In this embodiment, in order to extract, from the plurality of neighbor nodes of an entity node, the neighbor node vector representations that have more influence on that entity node, the computer device obtains the attention weight of each neighbor node for the entity node by using graph attention. That is, the attention weight represents the influence of a neighbor node on the entity node: the larger the attention weight, the greater the influence of that neighbor node on the entity node, and the more critical it is for outputting the knowledge representation of the entity node.
In one embodiment, determining the attention weight of each neighbor node to the physical node comprises: determining the attention score of the neighbor node to the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relation between the neighbor node and the distributed vector representation corresponding to the entity node; and obtaining the attention weight corresponding to each neighbor node according to the attention score of each neighbor node to the entity node.
Specifically, in order to enable the computer device to pay different degrees of attention to different neighbor nodes of the entity node in the medical knowledge graph when determining the representation of the entity node, an attention score of each neighbor node for the entity node is obtained based on the graph attention mechanism. The computer device may input the distributed vector representations corresponding to the triples formed by the entity node and its neighbor nodes into the graph attention model, calculate the attention score of each neighbor node for the entity node, and normalize the attention scores to obtain the corresponding attention weights.
In one embodiment, the attention weight corresponding to a neighbor node may be represented by the following formula:
$$q_{ij} = e_{r_{ij}} \tanh(e_{h_i} + e_{t_{ij}})$$

$$\alpha_{ij} = \frac{\exp(q_{ij})}{\sum_k \exp(q_{ik})}$$

where $q_{ij}$ denotes the attention score of the jth neighbor node for the ith entity node; $e_{h_i}$, $e_{r_{ij}}$ and $e_{t_{ij}}$ respectively denote the vectors corresponding to the head entity, the edge relation and the tail entity in the triple formed by the ith entity node in the entity sequence and its jth neighbor node; tanh is an activation function used to introduce nonlinearity; $\alpha_{ij}$ denotes the attention weight of the jth neighbor node to the ith entity node in the entity sequence; and $\exp()$ denotes the exponential function with the natural constant e as base.
For example, for the ith entity node a in the entity sequence, there are 3 neighbor nodes in the medical knowledge graph, which are node B, node C, and node D, respectively. For node B, the edge relationship formed is AB, for node C, AC, and for node D, AD. Then the node B corresponds to an attention score of:
$$q_{AB} = e_{AB} \tanh(e_A + e_B)$$

similarly, the attention scores corresponding to the nodes C and D are:

$$q_{AC} = e_{AC} \tanh(e_A + e_C), \qquad q_{AD} = e_{AD} \tanh(e_A + e_D)$$
The attention weight corresponding to the node B is:

$$\alpha_{AB} = \frac{\exp(q_{AB})}{\exp(q_{AB}) + \exp(q_{AC}) + \exp(q_{AD})}$$

similarly, the attention weights corresponding to the nodes C and D are:

$$\alpha_{AC} = \frac{\exp(q_{AC})}{\exp(q_{AB}) + \exp(q_{AC}) + \exp(q_{AD})}, \qquad \alpha_{AD} = \frac{\exp(q_{AD})}{\exp(q_{AB}) + \exp(q_{AC}) + \exp(q_{AD})}$$
step 708, according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain an entity representation corresponding to the entity node.
Specifically, after obtaining the attention weight of each neighbor node to the entity node, the computer device performs weighted summation on the distributed vector representation of each neighbor node according to the attention weight to obtain the entity representation corresponding to the entity node, and the entity representation embeds semantic information of the neighbor node of the entity node.
In one embodiment, the entity representation corresponding to the ith entity in the entity sequence can be expressed by the following formula:

$$g_i = \sum_j \alpha_{ij} e_{t_{ij}}$$

where $g_i$ denotes the entity representation corresponding to the ith entity in the entity sequence. Following the above example, for the node A, the corresponding entity representation is:

$$g_A = \alpha_{AB} e_B + \alpha_{AC} e_C + \alpha_{AD} e_D$$
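The worked example for node A and its neighbors B, C, D can be sketched as follows. The patent does not spell out how a score such as $e_{AB}\tanh(e_A + e_B)$ collapses to a scalar, so reading it as a dot product is our assumption here, as are the random embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8
e = {n: rng.normal(size=dim) for n in "ABCD"}   # node embeddings e_A..e_D
r = {n: rng.normal(size=dim) for n in "BCD"}    # edge embeddings e_AB, e_AC, e_AD

# Attention score q_Aj = e_Aj . tanh(e_A + e_j), taken as a dot product (scalar).
q = {n: r[n] @ np.tanh(e["A"] + e[n]) for n in "BCD"}

# Softmax normalization over the three neighbors gives the attention weights.
z = sum(np.exp(s) for s in q.values())
alpha = {n: np.exp(q[n]) / z for n in "BCD"}

# g_A: attention-weighted sum of the neighbors' distributed vector representations.
g_A = sum(alpha[n] * e[n] for n in "BCD")
```

Neighbors whose triple scores are higher dominate the weighted sum, so their semantics contribute more to the knowledge representation of node A.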
step 710, obtaining an entity sequence representation corresponding to the entity sequence according to the entity representation corresponding to each entity in the entity sequence.
Specifically, the computer device may obtain, as the entity sequence representation corresponding to the entity sequence, the entity representation corresponding to each entity in the entity sequence in the manner described above. When the medical text comprises entity sequences corresponding to a plurality of medical texts, a plurality of groups of entity sequence representations corresponding to the medical texts can be obtained.
In one embodiment, a plurality of entity sequences are extracted from the medical text, and entity representations corresponding to entities in the jth entity sequence can be fused into a vector representation according to the following formula:
$$K^j = W_k[g_1^j, g_2^j, \dots, g_n^j] + b_k$$

where $K^j$ denotes the entity sequence representation corresponding to the jth entity sequence, and $W_k$ and $b_k$ are trained model parameters.
Step 310, determining the probability that the medical text belongs to the predetermined category based on the sub-image sequence representation and the entity sequence representation.
The text may be used to reflect attributes of the associated user, and the medical text may reflect the physical condition of the user. For example, when the medical text is medical diagnosis data of the user, it may reflect the user's diagnosis; when the medical text is body monitoring data of the user, it may reflect the user's physical condition. Based on this, the predetermined category is a category related to the user that is reflected by the text. For the first example, two predetermined categories may be set, namely normal and lesion, or three predetermined categories may be set, namely normal, mild lesion, and severe lesion. For the second example, three predetermined categories may be set, namely good health, general health, and sub-health.
In some embodiments, when there are only two predetermined categories, i.e., in binary classification, the computer device may determine, based only on the sub-graph sequence representation and the entity sequence representation, the probability that the medical text belongs to one of the predetermined categories; the probability that the medical text belongs to the other predetermined category can then also be determined.
Specifically, after obtaining the sub-graph sequence representation and the entity sequence representation corresponding to each entity sequence, the computer device may obtain a final representation of the medical text based on the sub-graph sequence representation and the entity sequence representation, and determine the probability that the medical text belongs to the predetermined category using the final representation. The value of the probability is between 0 and 1; the greater the probability, the greater the possibility that the medical text belongs to the predetermined category.
In an alternative embodiment, considering that the medical text may correspond to one or more categories, the predetermined category may also be one or more categories, and the computer device may input the final representation of the medical text into a classifier corresponding to the different predetermined categories to obtain probabilities that the medical text belongs to the different predetermined categories.
In one embodiment, as shown in fig. 8, determining the probability that the medical text belongs to the predetermined category based on the sub-graph sequence representation and the entity sequence representation comprises:
Step 802, obtaining the sub-graph hidden state of the sub-graph sequence representation through a first time-series relationship mining network.

Step 804, obtaining the entity hidden state of the entity sequence representation through a second time-series relationship mining network.
Specifically, the sub-graph hidden state represents a time sequence relationship between sub-graph representations in the sub-graph sequence representation, and the entity hidden state represents a time sequence relationship between entity representations in the entity sequence. In one embodiment, the time-series relationship inside the sub-graph sequence representation and the entity sequence representation can be mined separately by the following formulas:
$$H_v^j = \mathrm{LSTM}_v(V^j), \qquad H_k^j = \mathrm{LSTM}_k(K^j)$$

where $V^j$ denotes the sub-graph sequence representation corresponding to the jth entity sequence and $H_v^j$ denotes the corresponding sub-graph hidden state; $K^j$ denotes the entity sequence representation corresponding to the jth entity sequence and $H_k^j$ denotes the corresponding entity hidden state; $\mathrm{LSTM}_v()$ and $\mathrm{LSTM}_k()$ are two different long short-term memory networks.
In alternative embodiments, the long and short term memory network may be replaced with other networks, such as a recurrent neural network, a gated recurrent unit, and a convolutional neural network, among others.
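A minimal stand-in for the two time-series relationship mining networks, using a hand-rolled vanilla RNN cell in place of the LSTMs (the embodiment above explicitly allows RNN/GRU/CNN as alternatives); the dimensions and random weights below are illustrative, not trained parameters:

```python
import numpy as np

def rnn_last_hidden(seq, W_x, W_h, b):
    """Minimal recurrent network: return the last hidden state of a sequence.

    Stands in for LSTM_v / LSTM_k when mining the time-series relationship
    inside a sequence representation.
    """
    h = np.zeros(W_h.shape[0])
    for x in seq:
        h = np.tanh(W_x @ x + W_h @ h + b)
    return h

rng = np.random.default_rng(3)
d_in, d_hid, steps = 8, 6, 4
V = rng.normal(size=(steps, d_in))  # sub-graph sequence representation V^j
K = rng.normal(size=(steps, d_in))  # entity sequence representation K^j

# Two separate parameter sets, mirroring the two separate networks.
params_v = (rng.normal(size=(d_hid, d_in)), rng.normal(size=(d_hid, d_hid)), np.zeros(d_hid))
params_k = (rng.normal(size=(d_hid, d_in)), rng.normal(size=(d_hid, d_hid)), np.zeros(d_hid))

H_v = rnn_last_hidden(V, *params_v)  # sub-graph hidden state H_v^j
H_k = rnn_last_hidden(K, *params_k)  # entity hidden state H_k^j
```

Each hidden state summarizes the order-dependent information of its sequence; the two are then fused downstream.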
And 806, fusing the hidden state of the subgraph and the hidden state of the entity to obtain medical text representation.
In one embodiment, fusing the sub-graph hidden state and the entity hidden state to obtain the medical text representation comprises: fusing a plurality of sub-graph hidden states corresponding to the plurality of medical texts to obtain a sub-graph hidden vector; fusing a plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector; and concatenating the sub-graph hidden vector with the entity hidden vector to obtain the medical text representation.
In one embodiment, the sub-graph hidden states corresponding to the plurality of entity sequences may be converted into a sub-graph hidden vector in a unified vector space according to the following formula:

$$C_v = W_{cv}[H_v^1, H_v^2, \dots, H_v^m] + b_{cv}$$

In one embodiment, the entity hidden states corresponding to the plurality of entity sequences may be converted into an entity hidden vector in a unified vector space according to the following formula:

$$C_k = W_{ck}[H_k^1, H_k^2, \dots, H_k^m] + b_{ck}$$
In one embodiment, the two resulting vectors may be fused into the final medical text representation via a fully connected layer:

$$C_p = W_p[C_v, C_k] + b_p$$
Step 808, determining, by the classifier, the probability that the medical text belongs to the predetermined category from the medical text representation.
In this embodiment, a linear classifier may be employed to map the medical text representation to a probability value:

y = sigmoid(W_f C_p + b_f)
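The fusion of the two hidden vectors (C_p = W_p[C_v, C_k] + b_p) followed by the linear classifier can be sketched as below. This is a minimal NumPy illustration: the dimension d and the random weights are chosen only for demonstration and are not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8                      # hidden dimension (assumed for illustration)
C_v = rng.normal(size=d)   # subgraph hidden vector
C_k = rng.normal(size=d)   # entity hidden vector

# Fully connected fusion layer: C_p = W_p [C_v, C_k] + b_p
W_p = rng.normal(size=(d, 2 * d)) * 0.1
b_p = np.zeros(d)
C_p = W_p @ np.concatenate([C_v, C_k]) + b_p

# Linear classifier mapping the text representation to a probability
W_f = rng.normal(size=d) * 0.1
b_f = 0.0
y = 1.0 / (1.0 + np.exp(-(W_f @ C_p + b_f)))  # sigmoid
```

The sigmoid guarantees the output lies in (0, 1), so it can be read directly as the probability that the medical text belongs to the predetermined category.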
According to the medical text processing method, domain knowledge in the medical knowledge graph is queried to construct the sub-graph sequence corresponding to the entity sequence of the medical text, and the sub-graph sequence representation is obtained by a graph convolution operation. This amounts to constructing new features for the entity sequence, which not only enriches the input information and mines the rich semantic information in the medical knowledge graph, but also captures graph structure information. In addition, meaningful distributed vector representations of the entities and their neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representations of the entities, so that domain knowledge in the medical knowledge graph is more fully embedded into the entity sequence representation. In this way, the accuracy of the probability, determined based on the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category can be improved.
As shown in fig. 9, in a specific embodiment, the medical text processing method includes the steps of:
step 902, an entity sequence in a medical text is obtained.
Step 904, for entities in the entity sequence, determines corresponding entity nodes in the medical knowledge graph.
Step 906, determining neighbor nodes of the entity nodes from the medical knowledge graph.
Step 908, determine paths from the entity nodes to the target nodes from the medical knowledge-graph.
Step 910, obtaining a subgraph corresponding to the entity according to the neighboring nodes and the nodes on the path.
Step 912, obtaining a subgraph sequence according to the subgraph corresponding to each entity in the entity sequence.
Step 914, inputting each subgraph into a graph convolutional network to obtain the graph convolution feature corresponding to each subgraph.
Step 916, performing pooling operation on the graph convolution characteristics to obtain vector representation corresponding to the subgraph.
Step 918, fusing the vector representations corresponding to the sub-graphs to obtain sub-graph sequence representations corresponding to the entity sequence.
Step 920, for the entities in the entity sequence, the corresponding entity nodes in the medical knowledge graph are determined.
And step 922, determining neighbor nodes of the entity nodes from the medical knowledge graph.
Step 924, determining the attention score of each neighbor node with respect to the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relation between the neighbor node and the entity node, and the distributed vector representation corresponding to the entity node.
In step 926, according to the attention score of each neighboring node to the entity node, the attention weight corresponding to each neighboring node is obtained.
Step 928, according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain the entity representation corresponding to the entity node.
Step 930, obtaining an entity sequence representation corresponding to the entity sequence according to the entity representation corresponding to each entity in the entity sequence.
Step 932, obtaining a subgraph hidden state represented by the subgraph sequence through the first convolution network.
Step 934, obtaining the entity hidden state represented by the entity sequence through a second convolutional network.
Step 936, fusing a plurality of subgraph hidden states corresponding to the plurality of medical texts to obtain a subgraph hidden vector.
Step 938, fusing a plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector.
Step 940, concatenating the subgraph hidden vector with the entity hidden vector to obtain the medical text representation.
At step 942, a probability that the medical text belongs to the predetermined category is determined from the medical text representation by the classifier.
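Steps 904-910 above (building the subgraph of an entity from its neighbor nodes plus all nodes on the path from the entity node to the target node) can be sketched with a plain adjacency map and breadth-first search. The toy knowledge graph and its node names below are hypothetical, chosen only to make the construction concrete.

```python
from collections import deque

# Toy medical knowledge graph as an adjacency map (hypothetical node names).
graph = {
    "cough":       ["bronchitis", "cold"],
    "bronchitis":  ["cough", "antibiotics", "respiratory disease"],
    "cold":        ["cough", "respiratory disease"],
    "antibiotics": ["bronchitis"],
    "respiratory disease": ["bronchitis", "cold", "disease"],
    "disease":     ["respiratory disease"],
}

def shortest_path(adj, start, goal):
    """BFS shortest path from start to goal; returns a node list or []."""
    prev, seen, q = {}, {start}, deque([start])
    while q:
        u = q.popleft()
        if u == goal:
            path = [u]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                prev[v] = u
                q.append(v)
    return []

def entity_subgraph(adj, entity, target):
    """Subgraph node set: the entity, its neighbor nodes, and all nodes on
    the path from the entity node to the target node (steps 906-910)."""
    nodes = {entity} | set(adj.get(entity, []))
    nodes |= set(shortest_path(adj, entity, target))
    return nodes

sub = entity_subgraph(graph, "cough", "disease")
```

Repeating this for every entity in the entity sequence yields the subgraph sequence of step 912.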
It should be understood that, although the steps in the flowcharts of fig. 3 to 9 are shown sequentially as indicated by the arrows, they are not necessarily performed in the order indicated. Unless explicitly stated otherwise herein, there is no strict restriction on the order in which these steps are performed, and they may be performed in other orders. Moreover, at least some of the steps in fig. 3 to 9 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 10, a training method for a medical text processing model is provided, which is described by taking the method as an example applied to a computer device (terminal 102 or server 104) in fig. 1, and includes the following steps:
step 1002, a sample medical text and marking information corresponding to the sample medical text are obtained.
The labeling information corresponding to the sample medical text indicates whether the sample medical text belongs to a predetermined category; for example, the labeling information is 1 if the sample medical text belongs to the predetermined category and 0 if it does not. When there are multiple predetermined categories, the labeling information corresponding to the sample medical text is a multi-dimensional vector, and the value of each element in the vector is 0 or 1. For example, to predict the probability that the medical text belongs to each of N predetermined categories, the labeling information of the medical text is an N-dimensional vector, each element of which indicates whether the medical text belongs to the corresponding predetermined category.
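The N-dimensional labeling vector described above can be sketched as follows. The label space and category names are hypothetical, and the encoding follows the usual convention that 1 marks membership in a category.

```python
# Hypothetical label space of N = 4 predetermined categories.
categories = ["diabetes", "hypertension", "asthma", "hepatitis"]

def make_label(text_categories):
    """N-dimensional 0/1 labeling vector: element i is 1 iff the sample
    medical text belongs to categories[i]."""
    return [1 if c in text_categories else 0 for c in categories]

# A sample text annotated as belonging to two of the four categories.
label = make_label({"hypertension", "asthma"})
```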
Step 1004, inputting the sample medical text into a medical text processing model, obtaining an entity sequence in the sample medical text through the medical text processing model, querying a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph, performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determining the probability that the sample medical text belongs to the predetermined category based on the sub-graph sequence representation and the entity sequence representation.
For the implementation of step 1004, reference may be made to the embodiments of the medical text processing method described hereinbefore.
In one embodiment, by querying the medical knowledge graph, the computer device may treat the neighbor nodes of an entity node in the medical knowledge graph, together with all nodes on the path from the entity node to the target node, as the subgraph of that entity.
In one embodiment, the sub-graph sequence representation corresponding to an entity sequence can be obtained by the following formula:

g_i^v = GlobalMaxPooling(GCN(G_i))

where GCN() is the graph convolution operation, GlobalMaxPooling() is the global max pooling operation, G_i is the sub-graph corresponding to the i-th entity in the entity sequence (also called the i-th sub-graph), and g_i^v is the vector representation of the i-th sub-graph. A plurality of entity sequences are extracted from the sample medical text, and the vector representations of the sub-graphs corresponding to the entities in the j-th entity sequence are fused into a single vector representation, namely the sub-graph sequence representation corresponding to that entity sequence:

C_j^v = W_v [g_1^v, g_2^v, …, g_n^v] + b_v

where C_j^v denotes the sub-graph sequence representation corresponding to the j-th entity sequence, and W_v and b_v are trained model parameters.
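The per-subgraph vector can be sketched as below, assuming a single graph-convolution layer with symmetric normalization (one common GCN formulation; the patent does not fix a specific variant). All sizes and weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def gcn_layer(A, X, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # D^-1/2 on the diagonal
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

def subgraph_vector(A, X, W):
    """Graph convolution followed by global max pooling (steps 914-916):
    one fixed-size vector per subgraph, regardless of node count."""
    return gcn_layer(A, X, W).max(axis=0)

# A 3-node toy subgraph with 5-dim node features, mapped to a 4-dim vector.
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float)
X = rng.normal(size=(3, 5))
W = rng.normal(size=(5, 4))
g = subgraph_vector(A, X, W)
```

Global max pooling is what lets subgraphs of different sizes all produce vectors of the same dimension, so they can later be fused into one sub-graph sequence representation.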
In one embodiment, the computer device employs a TransE model to obtain the vector representations of the nodes and edge relations in the medical knowledge graph, which satisfy the following relation:

h_i + r_i ≈ t_i

where h_i, r_i and t_i are respectively the vector representations of the head entity, the edge relation and the tail entity corresponding to the i-th edge in the medical knowledge graph, obtained through the TransE model; a multilayer perceptron (MLP) is then used to obtain the vector representation corresponding to each feature.
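The TransE constraint h + r ≈ t can be illustrated with a toy scoring function: triples whose embeddings nearly satisfy the constraint score lower (better) than those that do not. The vectors below are made up for demonstration.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE plausibility: a smaller ||h + r - t|| means the triple
    (head entity, edge relation, tail entity) is more likely to hold."""
    return np.linalg.norm(h + r - t)

h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
t_good = np.array([1.0, 1.0])    # satisfies h + r = t exactly
t_bad = np.array([-1.0, 0.0])    # violates the constraint

good, bad = transe_score(h, r, t_good), transe_score(h, r, t_bad)
```

Training the embeddings so that true triples score low is what makes the sum of the entity and edge-relation vectors correlated with the neighbor-node vector, as described earlier.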
In one embodiment, the attention weight corresponding to a neighbor node may be obtained by the following formulas:

q_ij = e_i^T tanh(W_h h_ij + W_r r_ij + W_t t_ij)

α_ij = exp(q_ij) / Σ_l exp(q_il)

where q_ij denotes the attention score of the j-th neighbor node of the i-th entity node in the entity sequence; h_ij, r_ij and t_ij respectively denote the vectors corresponding to the head entity, the edge relation and the tail entity in the triple formed by the i-th entity node and its j-th neighbor node; tanh is an activation function used to introduce non-linearity; α_ij denotes the attention weight of the j-th neighbor node with respect to the i-th entity node in the entity sequence; and exp() denotes the exponential function with the natural constant e as its base.
In one embodiment, the entity representation corresponding to the i-th entity in the entity sequence can be obtained by the following formula:

g_i = e_i + Σ_j α_ij t_ij

where g_i denotes the entity representation corresponding to the i-th entity in the entity sequence, e_i is the distributed vector representation of that entity node, and t_ij is the distributed vector representation of its j-th neighbor node.
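The graph-attention fusion of steps 924-928 can be sketched as below. The exact score function is not fully recoverable from the text, so the form here (entity vector dotted with a tanh of linearly transformed triple vectors, followed by a softmax and a weighted sum added to the entity vector) is only one plausible instantiation; all weights are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4  # embedding dimension (assumed)

def softmax(x):
    e = np.exp(x - x.max())  # shift for numerical stability
    return e / e.sum()

def entity_representation(e_i, triples, W_h, W_r, W_t):
    """Score each neighbor triple (h, r, t) against the entity vector,
    softmax the scores into attention weights, and fuse the weighted
    neighbor (tail) vectors into the entity vector."""
    q = np.array([e_i @ np.tanh(W_h @ h + W_r @ r + W_t @ t)
                  for (h, r, t) in triples])
    alpha = softmax(q)
    g_i = e_i + sum(a * t for a, (_, _, t) in zip(alpha, triples))
    return g_i, alpha

e_i = rng.normal(size=d)
triples = [(rng.normal(size=d), rng.normal(size=d), rng.normal(size=d))
           for _ in range(3)]
W_h, W_r, W_t = (rng.normal(size=(d, d)) for _ in range(3))
g_i, alpha = entity_representation(e_i, triples, W_h, W_r, W_t)
```

The softmax guarantees the attention weights are non-negative and sum to one, so the fused representation stays on the same scale as the entity vector.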
In one embodiment, a plurality of entity sequences are extracted from the sample medical text, and entity representations corresponding to the entities in the jth entity sequence can be fused into a vector representation according to the following formula:
Figure BDA0002881628720000242
In one embodiment, the time-series relations inside the sub-graph sequence representation and the entity sequence representation can be mined separately by the following formulas:

h_j^v = LSTM_v(C_j^v)

h_j^k = LSTM_k(C_j^k)

wherein C_j^v denotes the sub-graph sequence representation corresponding to the j-th entity sequence and h_j^v denotes the sub-graph hidden state corresponding to the j-th entity sequence; C_j^k denotes the entity sequence representation corresponding to the j-th entity sequence and h_j^k denotes the entity hidden state corresponding to the j-th entity sequence; LSTM_v() and LSTM_k() are two separate long short-term memory networks.
In one embodiment, the subgraph hidden states corresponding to the plurality of entity sequences may be converted into a subgraph hidden vector in a unified vector space, for example by a linear mapping:

C_v = W_c^v [h_1^v, h_2^v, …, h_M^v] + b_c^v

In one embodiment, the entity hidden states corresponding to the plurality of entity sequences may similarly be converted into an entity hidden vector in the unified vector space:

C_k = W_c^k [h_1^k, h_2^k, …, h_M^k] + b_c^k

where M is the number of entity sequences, and W_c^v, b_c^v, W_c^k and b_c^k are trained model parameters.
In one embodiment, the resulting two vectors may be fused into the final medical text representation via a fully connected layer:
Cp=Wp[Cv,Ck]+bp
In one embodiment, the prediction probability that the sample medical text belongs to the predetermined category is determined from the medical text representation by the classifier and can be obtained by the following formula:

ŷ = sigmoid(W_f C_p + b_f)
Step 1006, constructing a cross entropy loss from the prediction probability and the labeling information.
In one embodiment, the entire medical text processing model may be optimized with a binary cross entropy loss as the objective function:

L = −(y · log ŷ + (1 − y) · log(1 − ŷ))

where ŷ is the prediction probability output by the model and y is the labeling information corresponding to the sample medical text.
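The binary cross entropy objective can be sketched and sanity-checked as follows; the clipping constant is an implementation convenience, not part of the patent.

```python
import numpy as np

def bce_loss(y_hat, y, eps=1e-12):
    """Binary cross entropy between the predicted probability y_hat and
    the 0/1 labeling information y; eps avoids log(0)."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

# A confident correct prediction is penalized less than a confident
# wrong one, and a 0.5 prediction costs exactly log(2).
low = bce_loss(0.9, 1)
high = bce_loss(0.1, 1)
```

For multiple predetermined categories, the per-category losses produced by the individual classifiers are simply accumulated into the overall objective.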
Step 1008, after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training, until a training stop condition is met, to obtain the trained medical text processing model.
In one embodiment, when it is necessary to determine whether the medical text belongs to multiple predetermined categories, multiple classifiers are used to respectively predict the prediction probability corresponding to each predetermined category. A cross entropy loss is constructed for each predetermined category based on the input labeling information for the multiple predetermined categories, an objective function is obtained from the constructed cross entropy losses, and the model parameters of the medical text processing model are then updated according to the objective function.
FIG. 11 is a diagram of a training framework for a medical text processing model in an embodiment. Referring to fig. 11, the medical text processing model includes a sub-graph sequence representation network, an entity sequence representation network, a time-series relation mining network and an output network. The computer device inputs the entity sequences extracted from the sample medical texts into the sub-graph sequence representation network and the entity sequence representation network respectively. The sub-graph sequence representation network obtains the corresponding sub-graph sequences according to the medical knowledge graph and performs graph convolution operations on them to obtain the sub-graph sequence representations; the entity sequence representation network obtains the corresponding neighbor nodes according to the knowledge graph and embeds the distributed vector representations of the neighbor nodes into the entity representations using graph attention to obtain the entity representation sequences. The time-series relation mining network then mines the time-series relations inside the sub-graph representation sequences and the entity representation sequences to obtain the sub-graph hidden vector and the entity hidden vector. Finally, the output network fuses the two hidden vectors into the final medical text representation through a fully connected layer and maps the medical text representation to a probability value using a classifier to obtain the prediction probability; the output prediction probability and the labeling information of the sample medical text are used to construct a cross entropy loss for optimizing the entire medical text processing model.
According to the training method of the medical text processing model, domain knowledge is mined from the medical knowledge graph during training, which enriches the semantic information of the entity sequence corresponding to the sample medical text and improves model accuracy. Constructing the sub-graph sequence corresponding to the entity sequence of the sample medical text and obtaining the sub-graph sequence representation by graph convolution amounts to constructing new features for the entity sequence: it enriches the input information, mines the rich semantic information in the medical knowledge graph, and reduces the adverse effect of medical text data bias (such as missing data) on model training. At the same time, the graph convolution operation captures the graph structure information in the medical knowledge graph, from which the association between highly discriminative entities and the predetermined categories can be obtained, thereby improving model accuracy. Furthermore, embedding the distributed vector representations of the neighbor nodes into the vector representations of the entities allows domain knowledge in the medical knowledge graph to be more fully embedded into the entity sequence representation. As a result, the accuracy of the probability, determined by the model based on the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category is significantly improved.
In one embodiment, as shown in fig. 12, a medical text processing apparatus 1200 is provided, which may be a part of a computer device using software modules or hardware modules, or a combination of the two modules, and specifically includes: an obtaining module 1202, a querying module 1204, a sub-graph sequence representation module 1206, an entity sequence representation module 1208, and a determining module 1210, wherein:
an obtaining module 1202, configured to obtain an entity sequence in a medical text;
the query module 1204 is configured to query a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph;
a subgraph sequence representation module 1206, configured to perform graph convolution operation on each subgraph in the subgraph sequence to obtain subgraph sequence representation;
an entity sequence representation module 1208, configured to fuse distributed vector representations corresponding to the entities in the entity sequence with distributed vector representations corresponding to neighbor nodes of the entities in the medical knowledge graph, so as to obtain an entity sequence representation corresponding to the entity sequence;
a determination module 1210 for determining a probability that the medical text belongs to the predetermined category based on the sub-graph sequence representation and the entity sequence representation.
In one embodiment, the obtaining module 1202 is further configured to obtain a plurality of medical texts; extract a medical entity sequence from each medical text; and sort the entity sequences according to the generation time of each medical text to obtain a plurality of groups of entity sequences corresponding to the plurality of medical texts.
In one embodiment, the query module 1204 is further configured to determine, for entities in the sequence of entities, corresponding entity nodes in the medical knowledge-graph; determining neighbor nodes of the entity nodes from the medical knowledge graph; determining a path from an entity node to a target node from a medical knowledge graph; obtaining a subgraph corresponding to the entity according to the neighbor nodes and the nodes on the path; and obtaining a subgraph sequence according to the subgraphs corresponding to the entities in the entity sequence.
In one embodiment, the sub-graph sequence representation module 1206 is further configured to input each sub-graph to a graph convolution network, so as to obtain a graph convolution feature corresponding to each sub-graph; performing pooling operation on the graph convolution characteristics to obtain vector representation corresponding to the subgraph; and fusing the vector representations corresponding to the sub-graphs to obtain sub-graph sequence representations corresponding to the entity sequences.
In one embodiment, the apparatus further comprises a distributed vector representation module, configured to determine, for an entity in the entity sequence, the corresponding entity node in the medical knowledge graph; determine the neighbor nodes of the entity node from the medical knowledge graph; determine the edge relations between the entity node and the neighbor nodes from the medical knowledge graph; and determine the distributed vector representations corresponding to the entity node, the edge relations and the neighbor nodes; wherein the sum of the distributed vector representation of the entity node and the distributed vector representation of the edge relation is correlated with the distributed vector representation of the neighbor node.
In one embodiment, the entity sequence representation module 1208 is further configured to determine, for entities in the entity sequence, corresponding entity nodes in the medical knowledge-graph; determining neighbor nodes of the entity nodes from the medical knowledge graph; determining the attention weight of each neighbor node to the entity node; according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain an entity representation corresponding to the entity node; and obtaining entity sequence representation corresponding to the entity sequence according to the entity representation corresponding to each entity in the entity sequence.
In one embodiment, the entity sequence representation module 1208 is further configured to determine the attention score of a neighbor node with respect to the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relation between the neighbor node and the entity node, and the distributed vector representation corresponding to the entity node; and obtain the attention weight corresponding to each neighbor node according to the attention score of each neighbor node with respect to the entity node.
In one embodiment, the determining module 1210 is further configured to obtain a hidden state of a subgraph represented by the subgraph sequence through the first convolutional network; obtaining an entity hiding state represented by an entity sequence through a second convolutional network; fusing the hidden state of the subgraph and the hidden state of the entity to obtain medical text representation; determining, by the classifier, a probability that the medical text belongs to the predetermined category from the medical text representation.
In one embodiment, the determining module 1210 is further configured to fuse a plurality of subgraph hidden states corresponding to a plurality of medical texts to obtain a subgraph hidden vector; fuse a plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector; and concatenate the subgraph hidden vector with the entity hidden vector to obtain the medical text representation.
In one embodiment, the apparatus further comprises a training module, configured to: obtain a sample medical text and labeling information corresponding to the sample medical text; input the sample medical text into a medical text processing model, obtain an entity sequence in the sample medical text through the medical text processing model, query a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph, perform a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fuse the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determine the prediction probability that the sample medical text belongs to the predetermined category based on the sub-graph sequence representation and the entity sequence representation; construct a cross entropy loss from the prediction probability and the labeling information; and after updating the model parameters of the medical text processing model according to the cross entropy loss, return to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training, until a training stop condition is met, to obtain the medical text processing model for determining the probability that the medical text belongs to the predetermined category.
The medical text processing apparatus 1200 constructs the sub-graph sequence corresponding to the entity sequence of the medical text by querying the domain knowledge in the medical knowledge graph, and obtains the sub-graph sequence representation by a graph convolution operation, which is equivalent to constructing new features for the entity sequence: this not only enriches the input information and mines the rich semantic information in the medical knowledge graph, but also captures graph structure information. In addition, meaningful distributed vector representations of the entities and their neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representations of the entities, so that domain knowledge in the medical knowledge graph is more fully embedded into the entity sequence representation. In this way, the accuracy of the probability, determined based on the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category can be improved.
For specific limitations of the medical text processing apparatus 1200, the above limitations of the medical text processing method can be referred to, and will not be described herein again. The various modules in the medical text processing apparatus 1200 described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, as shown in fig. 13, there is provided a medical text processing model training apparatus 1300, which may be a part of a computer device using software modules or hardware modules, or a combination of the two, the apparatus specifically comprising: an obtaining module 1302, a determining module 1304, a loss constructing module 1306, and a model updating module 1308, wherein:
an obtaining module 1302, configured to obtain a sample medical text and label information corresponding to the sample medical text;
a determining module 1304, configured to input a sample medical text into a medical text processing model, obtain an entity sequence in the sample medical text through the medical text processing model, query a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, perform graph convolution on each sub-graph in the sub-graph sequence, obtain a sub-graph sequence representation, fuse a distributed vector representation corresponding to each entity in the entity sequence and a distributed vector representation corresponding to a neighbor node of each entity in the medical knowledge graph, obtain an entity sequence representation corresponding to the entity sequence, and determine a probability that the sample medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation;
a loss construction module 1306, configured to construct cross entropy loss from the prediction probability and the labeling information;
the model updating module 1308 is configured to, after updating the model parameters of the medical text processing model according to the cross entropy loss, return to the step of obtaining the sample medical text and the label information corresponding to the sample medical text to continue training until a training stop condition is met, and obtain a medical text processing model for determining a probability that the medical text belongs to a predetermined category.
For specific limitations of the training apparatus 1300 for a medical text processing model, reference may be made to the limitations of the medical text processing method and/or the training method for a medical text processing model, which are not described herein again. The modules in the training apparatus 1300 for medical text processing model described above can be implemented in whole or in part by software, hardware, and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server or a terminal, and its internal structure diagram may be as shown in fig. 14. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of medical text processing and/or training of a medical text processing model.
Those skilled in the art will appreciate that the architecture shown in fig. 14 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical storage, or the like. Volatile Memory can include Random Access Memory (RAM) or external cache Memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; nevertheless, any combination of them should be considered within the scope of this specification as long as it contains no contradiction.
The above embodiments express only several implementations of the present application, and although they are described in relative detail, they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. A medical text processing method, the method comprising:
acquiring an entity sequence in a medical text;
inquiring a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph;
carrying out graph convolution operation on each subgraph in the subgraph sequence to obtain subgraph sequence representation;
fusing distributed vector representations corresponding to the entities in the entity sequence with distributed vector representations corresponding to neighbor nodes of the entities in the medical knowledge graph to obtain entity sequence representations corresponding to the entity sequence;
determining a probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation.
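By way of illustration and not limitation, the overall flow of claim 1 can be sketched numerically. In the toy example below, the embedding dimension, the number of entities, the number of categories, and mean-pooling as the fusion operator are all assumptions made for demonstration; none of the values come from the application itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

d = 8                                        # embedding dimension (assumed)
entity_emb = rng.normal(size=(3, d))         # one vector per entity in the sequence
neighbor_emb = rng.normal(size=(3, 4, d))    # 4 neighbor vectors per entity (assumed)
subgraph_vecs = rng.normal(size=(3, d))      # stand-in for per-sub-graph GCN outputs

# Sub-graph sequence representation: pool the per-sub-graph vectors.
subgraph_repr = subgraph_vecs.mean(axis=0)

# Entity sequence representation: fuse each entity with its neighbors, then pool.
entity_repr = (entity_emb + neighbor_emb.mean(axis=1)).mean(axis=0)

# Classify the concatenated representation into 2 categories (assumed).
W = rng.normal(size=(2, 2 * d))
probs = softmax(W @ np.concatenate([subgraph_repr, entity_repr]))
```

The final `probs` vector plays the role of the probability that the medical text belongs to each predetermined category.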
2. The method of claim 1, wherein the obtaining of the sequence of entities in the medical text comprises:
acquiring a plurality of medical texts;
extracting entity sequences related to medical treatment from each medical treatment text;
and ordering the entity sequences according to the generation time of each medical text to obtain a plurality of groups of entity sequences corresponding to the plurality of medical texts.
3. The method of claim 1, wherein the querying the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph comprises:
for entities in the entity sequence, determining corresponding entity nodes in a medical knowledge graph;
determining neighbor nodes of the entity nodes from the medical knowledge graph;
determining a path from the entity node to a target node from the medical knowledge-graph;
obtaining a subgraph corresponding to the entity according to the neighbor nodes and the nodes on the path;
and obtaining a subgraph sequence according to the subgraph corresponding to each entity in the entity sequence.
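By way of illustration and not limitation, the sub-graph query of claim 3 (neighbor nodes of an entity plus the nodes on a path to a target node) can be sketched over a toy adjacency structure. The knowledge graph, entity names, and target node below are invented for demonstration.

```python
from collections import deque

# Toy medical knowledge graph as an adjacency dict (all names illustrative).
kg = {
    "fever":       ["influenza", "infection"],
    "influenza":   ["fever", "cough", "oseltamivir"],
    "infection":   ["fever", "antibiotic"],
    "cough":       ["influenza"],
    "oseltamivir": ["influenza"],
    "antibiotic":  ["infection"],
}

def shortest_path(graph, start, goal):
    """Breadth-first search; returns a node list from start to goal, or None."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nb in graph.get(path[-1], []):
            if nb not in seen:
                seen.add(nb)
                queue.append(path + [nb])
    return None

def entity_subgraph(graph, entity, target):
    """Sub-graph node set: the entity, its neighbors, and nodes on the path to target."""
    nodes = set(graph.get(entity, [])) | {entity}
    path = shortest_path(graph, entity, target) or []
    return nodes | set(path)

sub = entity_subgraph(kg, "fever", "oseltamivir")
```

Repeating `entity_subgraph` for each entity in the entity sequence yields the sub-graph sequence.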
4. The method of claim 1, wherein performing the graph convolution operation on each sub-graph in the sub-graph sequence to obtain the sub-graph sequence representation comprises:
inputting each sub-graph into a graph convolution network to obtain graph convolution features corresponding to each sub-graph;
performing a pooling operation on the graph convolution features to obtain a vector representation corresponding to each sub-graph;
and fusing the vector representations corresponding to the sub-graphs to obtain the sub-graph sequence representation corresponding to the entity sequence.
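By way of illustration and not limitation, one standard form of the graph convolution plus pooling in claim 4 is the symmetrically normalized layer ReLU(D^{-1/2}(A+I)D^{-1/2} X W) followed by a node-wise max-pool. The adjacency matrix, feature dimensions, and choice of max-pooling below are assumptions for demonstration, not details disclosed by the application.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A+I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1)) # inverse sqrt of node degrees
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # toy 3-node sub-graph
X = rng.normal(size=(3, 8))              # node features
W = rng.normal(size=(8, 8))              # layer weights

H = gcn_layer(A, X, W)                   # graph convolution features, one row per node
subgraph_vec = H.max(axis=0)             # pooling: one vector per sub-graph
```

Stacking `subgraph_vec` for every sub-graph in the sequence gives the sub-graph sequence representation.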
5. The method of claim 1, further comprising:
for entities in the entity sequence, determining corresponding entity nodes in a medical knowledge graph;
determining neighbor nodes of the entity nodes from the medical knowledge graph;
determining edge relationships between the entity nodes and the neighbor nodes from the medical knowledge graph;
determining distributed vector representations corresponding to the entity node, the edge relation and the neighbor node;
wherein there is an association between the sum of the distributed vector representations between the entity node and the edge relation and the distributed vector representation of the neighbor node.
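By way of illustration and not limitation, the association in claim 5 — the sum of the entity-node embedding and the edge-relation embedding lying near the neighbor-node embedding — matches the translation-style constraint of TransE-type knowledge-graph embeddings (h + r ≈ t). The vectors below are invented to show the scoring idea only.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
h = rng.normal(size=d)                   # entity-node embedding
r = rng.normal(size=d)                   # edge-relation embedding
t = h + r + 0.01 * rng.normal(size=d)    # neighbor embedding constructed near h + r

def transe_score(h, r, t):
    """Distance between h + r and t; smaller means a stronger association."""
    return np.linalg.norm(h + r - t)

close = transe_score(h, r, t)            # small: the triple is plausible
far = transe_score(h, r, -t)             # large: an implausible neighbor
```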
6. The method according to claim 1, wherein the fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence comprises:
for entities in the entity sequence, determining corresponding entity nodes in the medical knowledge graph;
determining neighbor nodes of the entity nodes from the medical knowledge graph;
determining an attention weight of each neighbor node to the entity node;
according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain an entity representation corresponding to the entity node;
and obtaining entity sequence representations corresponding to the entity sequences according to the entity representations corresponding to the entities in the entity sequences.
7. The method of claim 6, wherein determining the attention weight of each neighboring node to the entity node comprises:
determining an attention score of each neighbor node with respect to the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relation between the neighbor node and the entity node, and the distributed vector representation corresponding to the entity node;
and obtaining the attention weight corresponding to each neighbor node according to the attention score of each neighbor node to the entity node.
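By way of illustration and not limitation, claim 7's two steps (scoring each neighbor from its embedding, the relation embedding, and the entity embedding, then normalizing the scores into weights) can be sketched with a dot-product score and a softmax. Both choices are plausible instantiations assumed for demonstration, not forms disclosed by the application.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d = 8
entity = rng.normal(size=d)              # entity-node embedding
neighbors = rng.normal(size=(4, d))      # one embedding per neighbor node
relations = rng.normal(size=(4, d))      # one embedding per edge relation

# Attention score: how well neighbor + relation matches the entity (assumed form).
scores = np.array([entity @ (nb + rel) for nb, rel in zip(neighbors, relations)])
weights = softmax(scores)                # attention weight per neighbor, sums to 1

# Fuse per claim 6: entity embedding plus attention-weighted neighbor embeddings.
entity_repr = entity + weights @ neighbors
```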
8. The method of claim 1, wherein determining the probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation comprises:
obtaining a subgraph hidden state from the subgraph sequence representation through a first convolutional network;
obtaining an entity hidden state from the entity sequence representation through a second convolutional network;
fusing the subgraph hidden state and the entity hidden state to obtain a medical text representation;
determining, by a classifier, a probability that the medical text belongs to a predetermined category from the medical text representation.
9. The method of claim 8, wherein the fusing the subgraph hidden state and the entity hidden state to obtain the medical text representation comprises:
fusing a plurality of subgraph hidden states corresponding to a plurality of medical texts to obtain a subgraph hidden vector;
fusing a plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector;
and concatenating the subgraph hidden vector with the entity hidden vector to obtain the medical text representation.
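By way of illustration and not limitation, the fusion in claims 8–9 reduces to pooling per-text hidden states into one vector per branch and concatenating the two branches. Mean-pooling and the dimensions below are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_texts, d = 5, 8
subgraph_states = rng.normal(size=(n_texts, d))   # one sub-graph hidden state per text
entity_states = rng.normal(size=(n_texts, d))     # one entity hidden state per text

# Fuse across texts (mean is one simple choice), then concatenate the branches.
text_repr = np.concatenate([subgraph_states.mean(axis=0),
                            entity_states.mean(axis=0)])
```

A classifier applied to `text_repr` then yields the category probability, as in claim 8.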
10. The method according to any one of claims 1 to 9, wherein the method is implemented by a medical text processing model, and the training step of the medical text processing model comprises:
acquiring a sample medical text and labeling information corresponding to the sample medical text;
inputting the sample medical text into a medical text processing model, obtaining an entity sequence in the sample medical text through the medical text processing model, inquiring a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, carrying out graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fusing a distributed vector representation corresponding to each entity in the entity sequence and a distributed vector representation corresponding to a neighbor node of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determining the prediction probability that the sample medical text belongs to a preset category based on the sub-graph sequence representation and the entity sequence representation;
constructing a cross entropy loss according to the prediction probability and the labeling information;
and after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training until a training stopping condition is met, and obtaining the medical text processing model for determining the probability that the medical text belongs to a predetermined category.
11. A method of training a medical text processing model, the method comprising:
acquiring a sample medical text and labeling information corresponding to the sample medical text;
inputting the sample medical text into a medical text processing model, obtaining an entity sequence in the sample medical text through the medical text processing model, inquiring a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, carrying out graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fusing a distributed vector representation corresponding to each entity in the entity sequence and a distributed vector representation corresponding to a neighbor node of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determining the prediction probability that the sample medical text belongs to a preset category based on the sub-graph sequence representation and the entity sequence representation;
constructing cross entropy loss according to the prediction probability and the labeling information;
and after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training until a training stopping condition is met, and obtaining the medical text processing model for determining the probability that the medical text belongs to the predetermined category.
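By way of illustration and not limitation, the training loop of claims 10–11 (predict, build a cross-entropy loss against the labels, update parameters, repeat until a stopping condition) can be sketched with a toy linear classifier standing in for the full model. The data, dimensions, learning rate, and step count are all invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Row-wise numerically stable softmax.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

X = rng.normal(size=(64, 16))            # stand-in medical text representations
y = (X[:, 0] > 0).astype(int)            # synthetic labels for the toy task
onehot = np.eye(2)[y]
W = np.zeros((16, 2))                    # model parameters to be trained

for step in range(200):                  # stopping condition: fixed step budget
    probs = softmax(X @ W)               # predicted category probabilities
    loss = -np.mean(np.sum(onehot * np.log(probs + 1e-12), axis=1))  # cross entropy
    grad = X.T @ (probs - onehot) / len(X)
    W -= 0.5 * grad                      # parameter update

accuracy = (softmax(X @ W).argmax(axis=1) == y).mean()
```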
12. A medical text processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring an entity sequence in the medical text;
the query module is used for querying the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph;
the sub-graph sequence representation module is used for carrying out graph convolution operation on each sub-graph in the sub-graph sequence to obtain sub-graph sequence representation;
the entity sequence representation module is used for fusing distributed vector representations corresponding to the entities in the entity sequence and distributed vector representations corresponding to neighbor nodes of the entities in the medical knowledge graph to obtain entity sequence representations corresponding to the entity sequence;
a determination module for determining a probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation.
13. An apparatus for training a medical text processing model, the apparatus comprising:
the acquisition module is used for acquiring a sample medical text and labeling information corresponding to the sample medical text;
a determining module, configured to input the sample medical text into a medical text processing model, obtain an entity sequence in the sample medical text through the medical text processing model, query a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, perform graph convolution operation on each sub-graph in the sub-graph sequence, obtain a sub-graph sequence representation, fuse a distributed vector representation corresponding to each entity in the entity sequence and a distributed vector representation corresponding to a neighbor node of each entity in the medical knowledge graph, obtain an entity sequence representation corresponding to the entity sequence, and determine, based on the sub-graph sequence representation and the entity sequence representation, a prediction probability that the sample medical text belongs to a predetermined category;
the loss construction module is used for constructing a cross entropy loss according to the prediction probability and the labeling information;
and the model updating module is used for, after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training until a training stopping condition is met, and obtaining the medical text processing model used for determining the probability that the medical text belongs to the predetermined category.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 11 when executing the computer program.
15. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 11.
CN202110001747.9A 2021-01-04 2021-01-04 Medical text processing method, medical text processing device, computer equipment and storage medium Active CN113673244B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110001747.9A CN113673244B (en) 2021-01-04 2021-01-04 Medical text processing method, medical text processing device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113673244A true CN113673244A (en) 2021-11-19
CN113673244B CN113673244B (en) 2024-05-10

Family

ID=78538008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110001747.9A Active CN113673244B (en) 2021-01-04 2021-01-04 Medical text processing method, medical text processing device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113673244B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020147595A1 (en) * 2019-01-16 2020-07-23 阿里巴巴集团控股有限公司 Method, system and device for obtaining relationship expression between entities, and advertisement recalling system
CN111475658A (en) * 2020-06-12 2020-07-31 北京百度网讯科技有限公司 Knowledge representation learning method, device, equipment and storage medium
CN111931505A (en) * 2020-05-22 2020-11-13 北京理工大学 Cross-language entity alignment method based on subgraph embedding
CN112000689A (en) * 2020-08-17 2020-11-27 吉林大学 Multi-knowledge graph fusion method based on text analysis
CN112084383A (en) * 2020-09-07 2020-12-15 中国平安财产保险股份有限公司 Information recommendation method, device and equipment based on knowledge graph and storage medium
CN112100406A (en) * 2020-11-11 2020-12-18 腾讯科技(深圳)有限公司 Data processing method, device, equipment and medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HARADA SHONOSUKE等: "Dual graph convolutional neural network for predicting chemical networks.", BMC BIOINFORMATICS., vol. 21, no. 3, 31 December 2020 (2020-12-31) *
SUN MENGYING等: "Graph convolutional networks for computational drug development and discovery", BRIEFINGS IN BIOINFORMATICS, vol. 21, no. 03, 31 August 2020 (2020-08-31) *
MA YUDAN; ZHAO YI; JIN JING; WAN HUAIYU: "Relation extraction method combining entity co-occurrence information and sentence semantic features", SCIENTIA SINICA INFORMATIONIS, no. 11, 21 November 2018 (2018-11-21) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116504382A (en) * 2023-05-29 2023-07-28 杭州医锐联科技有限公司 Remote medical monitoring system and method thereof
CN116504382B (en) * 2023-05-29 2023-11-24 北京智胜远景科技有限公司 Remote medical monitoring system and method thereof
CN116842109A (en) * 2023-06-27 2023-10-03 北京大学 Information retrieval knowledge graph embedding method, device and computer equipment
CN117594241A (en) * 2024-01-15 2024-02-23 北京邮电大学 Dialysis hypotension prediction method and device based on time sequence knowledge graph neighborhood reasoning
CN117594241B (en) * 2024-01-15 2024-04-30 北京邮电大学 Dialysis hypotension prediction method and device based on time sequence knowledge graph neighborhood reasoning


Similar Documents

Publication Publication Date Title
Kriegeskorte Deep neural networks: a new framework for modeling biological vision and brain information processing
CN110119775B (en) Medical data processing method, device, system, equipment and storage medium
CN113673244B (en) Medical text processing method, medical text processing device, computer equipment and storage medium
Islam et al. A comprehensive survey on applications of transformers for deep learning tasks
US20210406687A1 (en) Method for predicting attribute of target object based on machine learning and related device
CN113705725B (en) User personality characteristic prediction method and device based on multi-mode information fusion
Halvardsson et al. Interpretation of swedish sign language using convolutional neural networks and transfer learning
EP4094202A1 (en) Encoding and transmission of knowledge, data and rules for explainable ai
CN112216379A (en) Disease diagnosis system based on intelligent joint learning
CN114191665A (en) Method and device for classifying man-machine asynchronous phenomena in mechanical ventilation process
Alotaibi et al. Stroke in-patients' transfer to the ICU using ensemble based model
Ye et al. Self-supervised cross-modal visual retrieval from brain activities
CN113408721A (en) Neural network structure searching method, apparatus, computer device and storage medium
Bagwan et al. Artificially intelligent health chatbot using deep learning
Wang et al. A Brain-inspired Computational Model for Human-like Concept Learning
Rani et al. Skin disease diagnosis using vgg19 algorithm and treatment recommendation system
Buonamente et al. Discriminating and simulating actions with the associative self-organising map
CN113822439A (en) Task prediction method, device, equipment and storage medium
CN113821610A (en) Information matching method, device, equipment and storage medium
CN112035567A (en) Data processing method and device and computer readable storage medium
Ghanem et al. Towards a full emotional system
Titu et al. Acquiring businesss intelligence through data science: A practical approach
Anusha et al. Heart Disease Diagnosis Using Machine Learning
Sivakumar et al. Computer Vision Based Wellness Analysis of Geriatrics
Daniels Explanation-Driven Learning-Based Models for Visual Recognition Tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40055312

Country of ref document: HK

GR01 Patent grant