CN113673244B - Medical text processing method, medical text processing device, computer equipment and storage medium - Google Patents
Medical text processing method, medical text processing device, computer equipment and storage medium
- Publication number
- CN113673244B (application CN202110001747.9A)
- Authority
- CN
- China
- Prior art keywords
- entity
- sequence
- graph
- medical
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H15/00—ICT specially adapted for medical reports, e.g. generation or transmission thereof
Abstract
The application relates to a medical text processing method, a medical text processing device, computer equipment and a storage medium. The method relates to the natural language processing technology of artificial intelligence and comprises the following steps: acquiring an entity sequence in a medical text; querying a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph; performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation; fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence; and determining, based on the sub-graph sequence representation and the entity sequence representation, the probability that the medical text belongs to a predetermined category. By adopting the method, the accuracy of determining the probability that the medical text belongs to the predetermined category can be improved.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a medical text processing method, apparatus, computer device, and storage medium.
Background
With the construction of medical information systems, massive amounts of clinical data have accumulated. Building medical prediction models on such data with machine learning algorithms, in order to assist medical decision-making or to help users monitor their own health, has gradually become a popular research direction. Among these tasks, medical text processing is one of the most valuable.
Traditional medical text processing methods predict only by mining effective information from the medical texts themselves, without guidance from medical domain knowledge. In practice, however, the obtained medical text data are usually noisy, and recording errors or omissions can leave the data incomplete and insufficient, so the accuracy of the obtained probability that a medical text belongs to a predetermined category is low.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a medical text processing method, apparatus, computer device, and storage medium capable of improving the accuracy of determining the probability that a medical text belongs to a predetermined category.
A medical text processing method, the method comprising:
acquiring an entity sequence in a medical text;
querying a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph;
performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation;
fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence;
and determining, based on the sub-graph sequence representation and the entity sequence representation, a probability that the medical text belongs to a predetermined category.
In one embodiment, acquiring the entity sequence in the medical text includes:
acquiring a plurality of medical texts;
extracting an entity sequence related to medicine from each medical text;
and ordering the entity sequences according to the time order in which the medical texts were generated, to obtain a plurality of groups of entity sequences corresponding to the plurality of medical texts.
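The time-ordering step above can be sketched as follows; the record format and function names here are illustrative assumptions, not part of the claims:

```python
# Hypothetical sketch: order per-text entity sequences by the time each
# medical text was generated (earliest first).
from datetime import date

def build_entity_sequences(records):
    """records: list of (timestamp, entity_list) pairs, one per medical text.
    Returns the entity sequences ordered by generation time."""
    ordered = sorted(records, key=lambda r: r[0])
    return [entities for _, entities in ordered]

# Toy example: two medical texts from different visits
records = [
    (date(2021, 3, 1), ["cough", "dyspnea"]),
    (date(2021, 1, 5), ["hoarseness"]),
]
sequences = build_entity_sequences(records)  # earliest visit comes first
```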
In one embodiment, querying the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph includes:
for each entity in the entity sequence, determining the corresponding entity node in the medical knowledge graph;
determining neighbor nodes of the entity node from the medical knowledge graph;
determining a path from the entity node to a target node from the medical knowledge graph;
obtaining a sub-graph corresponding to the entity according to the neighbor nodes and the nodes on the path;
and obtaining the sub-graph sequence according to the sub-graphs corresponding to the entities in the entity sequence.
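A minimal sketch of this sub-graph construction, assuming a toy knowledge graph stored as an adjacency dictionary and a breadth-first path search (both assumptions made for illustration; the patent does not prescribe a path-finding algorithm):

```python
# Collect an entity's neighbours plus the nodes on a path to a target node,
# forming the node set of the entity's sub-graph.
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search returning one shortest path, or [] if none."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return []

def build_subgraph(graph, entity, target):
    nodes = set(graph.get(entity, []))                   # neighbour nodes
    nodes.update(shortest_path(graph, entity, target))   # nodes on the path
    nodes.add(entity)
    return nodes

# Toy medical knowledge graph (illustrative)
kg = {
    "cough": ["lung disease", "symptom"],
    "lung disease": ["dyspnea"],
}
sub = build_subgraph(kg, "cough", "dyspnea")
```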
In one embodiment, performing the graph convolution operation on each sub-graph to obtain the sub-graph sequence representation corresponding to the entity sequence includes:
inputting each sub-graph into a graph convolutional network to obtain graph convolution features corresponding to each sub-graph;
pooling the graph convolution features to obtain a vector representation corresponding to each sub-graph;
and fusing the vector representations corresponding to the sub-graphs to obtain the sub-graph sequence representation corresponding to the entity sequence.
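The graph convolution and pooling steps can be illustrated as follows, assuming the standard symmetric-normalized GCN propagation rule with a ReLU activation and mean pooling over nodes; the patent does not fix a particular graph convolution variant, so this is only a hedged sketch:

```python
# One GCN layer H' = ReLU(D^-1/2 (A + I) D^-1/2 H W) followed by mean
# pooling, producing a single vector representation per sub-graph.
import numpy as np

def gcn_layer(adj, feats, weight):
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(deg ** -0.5)         # symmetric normalisation
    h = d_inv_sqrt @ a_hat @ d_inv_sqrt @ feats @ weight
    return np.maximum(h, 0.0)                 # ReLU activation

def subgraph_representation(adj, feats, weight):
    # mean pooling over the sub-graph's nodes
    return gcn_layer(adj, feats, weight).mean(axis=0)

# Toy two-node sub-graph with identity node features
adj = np.array([[0.0, 1.0], [1.0, 0.0]])
feats = np.eye(2)
weight = np.full((2, 3), 0.5)                 # illustrative layer weights
vec = subgraph_representation(adj, feats, weight)
```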
In one embodiment, the method further includes:
for each entity in the entity sequence, determining the corresponding entity node in the medical knowledge graph;
determining neighbor nodes of the entity node from the medical knowledge graph;
determining the edge relationships between the entity node and the neighbor nodes from the medical knowledge graph;
and determining the distributed vector representations corresponding to the entity node, the edge relationships, and the neighbor nodes;
wherein the sum of the distributed vector representations of the entity node and the edge relationship is associated with the distributed vector representation of the corresponding neighbor node.
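The constraint above, where the sum of the entity vector and the edge-relation vector is associated with the neighbor vector, matches the TransE family of knowledge-graph embeddings (head + relation ≈ tail). A sketch of the corresponding plausibility score, with toy vectors as illustrative assumptions:

```python
# TransE-style score: a triple (head, relation, tail) is plausible when
# head + relation is close to tail, so a lower distance is better.
import numpy as np

def transe_score(head, relation, tail):
    return np.linalg.norm(head + relation - tail)

h = np.array([1.0, 0.0])          # entity node embedding
r = np.array([0.0, 1.0])          # edge relationship embedding
t_good = np.array([1.0, 1.0])     # neighbour satisfying h + r ≈ t
t_bad = np.array([-1.0, 0.0])     # unrelated neighbour

good = transe_score(h, r, t_good)
bad = transe_score(h, r, t_bad)
```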
In one embodiment, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence includes:
for each entity in the entity sequence, determining the corresponding entity node in the medical knowledge graph;
determining neighbor nodes of the entity node from the medical knowledge graph;
determining the attention weight of each neighbor node with respect to the entity node;
fusing, according to the attention weights, the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain the entity representation corresponding to the entity node;
and obtaining the entity sequence representation corresponding to the entity sequence according to the entity representations corresponding to the entities in the entity sequence.
In one embodiment, determining the attention weight of each neighbor node with respect to the entity node includes:
determining the attention score of the neighbor node with respect to the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relationship between the entity node and the neighbor node, and the distributed vector representation corresponding to the entity node;
and obtaining the attention weight corresponding to each neighbor node according to the attention scores of the neighbor nodes with respect to the entity node.
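A hedged sketch of this neighbor attention: each neighbor is scored from the entity, edge, and neighbor representations (a simple dot-product form is used here, which is an assumption; the claim only states that the three representations are combined), and the scores are normalized into weights with a softmax:

```python
# Score each neighbour against the entity node, then softmax the scores
# into attention weights that sum to 1.
import math

def attention_weights(entity_vec, neighbours):
    """neighbours: list of (edge_vec, neighbour_vec) pairs."""
    scores = [
        sum(e * (r + n) for e, r, n in zip(entity_vec, edge, nbr))
        for edge, nbr in neighbours
    ]
    exp = [math.exp(s) for s in scores]     # softmax normalisation
    total = sum(exp)
    return [x / total for x in exp]

entity = [1.0, 0.0]
nbrs = [([0.0, 0.0], [1.0, 0.0]),   # aligned neighbour -> higher weight
        ([0.0, 0.0], [0.0, 1.0])]   # orthogonal neighbour -> lower weight
weights = attention_weights(entity, nbrs)
```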
In one embodiment, determining the probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation includes:
obtaining a sub-graph hidden state of the sub-graph sequence representation through a first convolution network;
obtaining an entity hidden state of the entity sequence representation through a second convolution network;
fusing the sub-graph hidden state and the entity hidden state to obtain a medical text representation;
and determining, through a classifier, the probability that the medical text belongs to the predetermined category according to the medical text representation.
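These steps can be sketched as follows; the averaging encoder is only a stand-in for the first and second convolution networks, and the logistic classifier, weights, and dimensions are illustrative assumptions:

```python
# Encode the two sequence representations, concatenate the results, and map
# the fused medical text representation to a probability with a sigmoid.
import math

def encode(sequence):
    # stand-in for a convolutional encoder: average the sequence vectors
    dim = len(sequence[0])
    return [sum(v[i] for v in sequence) / len(sequence) for i in range(dim)]

def predict(subgraph_seq, entity_seq, weights, bias):
    text_repr = encode(subgraph_seq) + encode(entity_seq)  # concatenation
    logit = sum(w * x for w, x in zip(weights, text_repr)) + bias
    return 1.0 / (1.0 + math.exp(-logit))                  # sigmoid

prob = predict([[0.2, 0.4], [0.6, 0.0]],   # toy sub-graph sequence repr.
               [[1.0, 1.0]],               # toy entity sequence repr.
               weights=[0.5, 0.5, 0.5, 0.5], bias=0.0)
```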
In one embodiment, fusing the sub-graph hidden state and the entity hidden state to obtain the medical text representation includes:
fusing the sub-graph hidden states corresponding to the plurality of medical texts to obtain a sub-graph hidden vector;
fusing the entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector;
and concatenating the sub-graph hidden vector with the entity hidden vector to obtain the medical text representation.
In one embodiment, the method is implemented by a medical text processing model, and the training step of the model includes:
acquiring a sample medical text and labeling information corresponding to the sample medical text;
inputting the sample medical text into the medical text processing model; through the medical text processing model, acquiring an entity sequence in the sample medical text, querying a sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph, performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determining, based on the sub-graph sequence representation and the entity sequence representation, the predicted probability that the sample medical text belongs to a predetermined category;
constructing a cross-entropy loss between the predicted probability and the labeling information;
and after updating the model parameters of the medical text processing model according to the cross-entropy loss, returning to the step of acquiring the sample medical text and the labeling information corresponding to the sample medical text to continue training, until a training stop condition is met, to obtain a medical text processing model for determining the probability that a medical text belongs to a predetermined category.
A method of training a medical text processing model, the method comprising:
acquiring a sample medical text and labeling information corresponding to the sample medical text;
inputting the sample medical text into a medical text processing model; through the medical text processing model, acquiring an entity sequence in the sample medical text, querying a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determining, based on the sub-graph sequence representation and the entity sequence representation, the predicted probability that the sample medical text belongs to a predetermined category;
constructing a cross-entropy loss between the predicted probability and the labeling information;
and after updating the model parameters of the medical text processing model according to the cross-entropy loss, returning to the step of acquiring the sample medical text and the labeling information corresponding to the sample medical text to continue training, until a training stop condition is met, to obtain a medical text processing model for determining the probability that a medical text belongs to a predetermined category.
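The training loop described above (predict, compute the cross-entropy loss against the labeling information, update the parameters, and repeat until a stop condition is met) can be sketched with a toy logistic model standing in for the full medical text processing model; the learning rate, epoch budget, and one-feature inputs are illustrative assumptions:

```python
# Minimal cross-entropy training loop. For a sigmoid output p and label y,
# the binary cross-entropy is -(y*log p + (1-y)*log(1-p)), and its gradient
# with respect to the logit is simply (p - y).
import math

def train(samples, labels, lr=0.5, epochs=200):
    w, b = 0.0, 0.0
    for _ in range(epochs):                  # stop condition: epoch budget
        for x, y in zip(samples, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted probability
            grad = p - y                     # gradient of the cross-entropy
            w -= lr * grad * x               # update model parameters
            b -= lr * grad
    return w, b

w, b = train([0.0, 1.0, 2.0, 3.0], [0, 0, 1, 1])
p_hi = 1.0 / (1.0 + math.exp(-(w * 3.0 + b)))   # positive-class sample
p_lo = 1.0 / (1.0 + math.exp(-(w * 0.0 + b)))   # negative-class sample
```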
A medical text processing device, the device comprising:
an acquisition module, configured to acquire an entity sequence in a medical text;
a query module, configured to query a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph;
a sub-graph sequence representation module, configured to perform a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation;
an entity sequence representation module, configured to fuse the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence;
and a determining module, configured to determine, based on the sub-graph sequence representation and the entity sequence representation, the probability that the medical text belongs to a predetermined category.
A training device for a medical text processing model, the device comprising:
an acquisition module, configured to acquire a sample medical text and labeling information corresponding to the sample medical text;
a determining module, configured to input the sample medical text into a medical text processing model and, through the medical text processing model, acquire an entity sequence in the sample medical text, query a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, perform a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fuse the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determine, based on the sub-graph sequence representation and the entity sequence representation, the predicted probability that the sample medical text belongs to a predetermined category;
a loss construction module, configured to construct a cross-entropy loss between the predicted probability and the labeling information;
and a model updating module, configured to, after updating the model parameters of the medical text processing model according to the cross-entropy loss, return to the step of acquiring the sample medical text and the labeling information corresponding to the sample medical text to continue training, until a training stop condition is met, to obtain a medical text processing model for determining the probability that a medical text belongs to a predetermined category.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the above-described medical text processing method and/or training method of a medical text processing model when the computer program is executed.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above-described medical text processing method and/or training method of a medical text processing model.
A computer program comprising computer instructions stored in a computer readable storage medium, the computer instructions being read from the computer readable storage medium by a processor of a computer device, the computer instructions being executed by the processor to cause the computer device to perform the steps of the above-described medical text processing method and/or training method of a medical text processing model.
According to the medical text processing method, the medical text processing device, the computer equipment, and the storage medium described above, the sub-graph sequence corresponding to the entity sequence of the medical text is constructed by querying domain knowledge in the medical knowledge graph, and the sub-graph sequence representation is obtained by a graph convolution operation. This is equivalent to constructing new features for the entity sequence: it not only enriches the input information and mines the rich semantic information in the medical knowledge graph, but also captures the graph structure information. In addition, meaningful distributed vector representations of the entities and their neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representations of the entities, so that domain knowledge in the medical knowledge graph is more fully embedded into the entity sequence representation. In this way, the accuracy of the probability, determined based on the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category can be improved.
According to the training method, the training device, the computer equipment, and the storage medium for the medical text processing model described above, domain knowledge is mined from the medical knowledge graph during training, which enriches the semantic information of the entity sequence corresponding to the sample medical text and improves model accuracy. Constructing the sub-graph sequence corresponding to the entity sequence of the sample medical text and obtaining the sub-graph sequence representation by a graph convolution operation is equivalent to constructing new features for the entity sequence: it enriches the input information, mines the rich semantic information in the medical knowledge graph, and reduces the adverse effect on model training of deviations in the medical text data (such as missing data). Meanwhile, the graph convolution operation captures the graph structure information in the medical knowledge graph, from which more discriminative associations between entities and the target predetermined categories can be obtained, further improving model accuracy. Furthermore, the distributed vector representations of the neighbor nodes are embedded in the vector representations of the entities, so that domain knowledge in the medical knowledge graph is more fully embedded in the entity sequence representation. As a result, the accuracy of the probability, determined by the model based on the sub-graph sequence representation and the entity sequence representation, that the medical text belongs to the predetermined category is significantly improved.
Drawings
FIG. 1 is a diagram of an application environment for a medical text processing method in one embodiment;
FIG. 2 is a general framework diagram of a medical text processing method in one embodiment;
FIG. 3 is a flow diagram of a method of medical text processing in one embodiment;
FIG. 4 is a flow diagram of querying a sub-graph sequence according to a medical knowledge graph in one embodiment;
FIG. 5 is a flow chart of constructing a sub-graph corresponding to an entity from a medical knowledge graph in one embodiment;
FIG. 6 is a flow diagram of obtaining a sub-graph sequence representation corresponding to an entity sequence in one embodiment;
FIG. 7 is a flowchart of obtaining an entity sequence representation corresponding to an entity sequence in one embodiment;
FIG. 8 is a flow diagram of determining a probability that a medical text belongs to a predetermined category based on a sub-graph sequence representation and an entity sequence representation in one embodiment;
FIG. 9 is a flow chart of a method of medical text processing in a specific embodiment;
FIG. 10 is a flow diagram of a method of training a medical text processing model in one embodiment;
FIG. 11 is a schematic diagram of a training framework for a medical text processing model in one embodiment;
FIG. 12 is a block diagram of a medical text processing device in one embodiment;
FIG. 13 is a block diagram of a training device for medical text processing models in one embodiment;
Fig. 14 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The application provides a medical text processing method and a training method for a medical text processing model, which relate to Artificial Intelligence (AI) technology. Artificial intelligence is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
The embodiments of the present application provide a training method for a medical text processing model and a medical text processing method, which mainly relate to the Natural Language Processing (NLP) technology of artificial intelligence. Natural language processing is an important direction in the fields of computer science and artificial intelligence; it studies the theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, that is, the language people use daily, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic question answering, knowledge graph techniques, and the like.
With the research and advancement of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, drones, robots, smart medical treatment, and smart customer service. It is believed that, with the development of technology, artificial intelligence will be applied in ever more fields and will be of increasing value.
Smart medical treatment is a major application and industrial field of artificial intelligence. Medical artificial intelligence includes imaging artificial intelligence, such as fundus images, CT images, MRI images, and skin images, as well as speech-to-electronic-medical-record artificial intelligence, auxiliary examination artificial intelligence, triage-guiding robots, medication dosing, drug development, telemedicine, and the like.
The medical text processing method provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. The terminal 102 may obtain a medical text and send it to the server 104. The server 104 extracts an entity sequence from the medical text, queries a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, performs a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fuses the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determines, based on the sub-graph sequence representation and the entity sequence representation, the probability that the medical text belongs to a predetermined category. The server 104 may then return the determined probability to the terminal 102.
The training method of the medical text processing model provided by the embodiments of the application can also be applied to the application environment shown in figure 1. The server 104 acquires a sample medical text and its corresponding labeling information; inputs the sample medical text into a medical text processing model, which acquires an entity sequence in the medical text, queries a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, performs a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fuses the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to that entity's neighbor nodes in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determines the predicted probability that the sample medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation; constructs a cross-entropy loss from the predicted probability and the labeling information; and, after updating the model parameters of the medical text processing model according to the cross-entropy loss, returns to the step of obtaining a sample medical text and its labeling information and continues training until a training stop condition is met, thereby obtaining a medical text processing model for determining the probability that a medical text belongs to the predetermined category. After that, the terminal 102 may set up an initial medical text processing model locally and import the model parameters of the medical text processing model trained by the server 104, thereby obtaining the trained medical text processing model.
The terminal 102 may be, but is not limited to, any of various personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices, and the server 104 may be implemented by a stand-alone server or a server cluster composed of a plurality of servers.
According to the medical text processing method provided by the embodiments of the application, the sub-graph sequence corresponding to the entity sequence of the medical text is constructed by querying the domain knowledge in the medical knowledge graph, and the sub-graph sequence representation is obtained by a graph convolution operation, which is equivalent to constructing new features for the entity sequence; this not only enriches the input information, but also mines the rich semantic information in the medical knowledge graph and captures the graph structure information. In addition, meaningful distributed vector representations of the entities and their neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representation of each entity, so that the domain knowledge in the medical knowledge graph is more fully embedded into the entity sequence representation. In this way, the accuracy of the probability that the medical text belongs to the predetermined category, determined based on the sub-graph sequence representation and the entity sequence representation, can be improved.
The medical text processing method and the training method of the medical text processing model provided by the embodiments of the application can be applied to an artificial-intelligence medical auxiliary decision-making system, in which medical text processing is an important function: the input of the auxiliary decision-making system is a medical text, and the output is the probability that the medical text belongs to a predetermined category. In one application scenario, the system may provide clinical decision support to a physician. In another application scenario, a user may input relevant medical text into the auxiliary decision-making system, and the output probability may prompt the user to pay more attention to their health; for example, if the probability that the medical text belongs to the predetermined category is high, the user may go to the hospital for a physical examination or further examination.
As shown in fig. 2, a schematic diagram of the overall framework of the medical text processing method in one embodiment is shown. Referring to fig. 2, the method mainly comprises four steps: sub-graph sequence construction and sub-graph sequence representation, entity sequence representation, time-series relation mining, and outputting the probability of belonging to a predetermined category. First, entities are extracted from the medical text to obtain an entity sequence. If a user inputs a plurality of medical texts, each including a plurality of entities, the entity sequences of the medical texts may be arranged according to the time order in which the texts were generated and then used as the input entity sequences. A sub-graph sequence corresponding to the entity sequence of the medical text is constructed by querying the medical knowledge graph, and the sub-graph sequence representation is obtained using a graph convolution network. The distributed vector representation of the neighbor nodes of each entity in the entity sequence is embedded into the representation of that entity by querying the medical knowledge graph, and the entity sequence representation is obtained using a graph attention mechanism. Finally, hidden states of the sub-graph sequence representation and the entity sequence representation are mined through two separate time-series relation mining networks, the two hidden states are fused, and the fused hidden state is input into a classifier for prediction to obtain the probability that the medical text belongs to the predetermined category. The time-series relation mining network may employ a Long Short-Term Memory network (LSTM), a Recurrent Neural Network (RNN), a Gated Recurrent Unit (GRU), a Convolutional Neural Network (CNN), and the like.
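The following is an illustrative end-to-end sketch of the four-stage flow described above, in plain Python with stub components; every function name, vocabulary entry, and scoring rule here is a hypothetical stand-in, not the patent's actual implementation:

```python
# Illustrative sketch of the four-stage pipeline; all names are hypothetical.

def extract_entities(text, vocabulary):
    """Stand-in for named entity recognition: simple dictionary lookup."""
    return [term for term in vocabulary if term in text]

def build_subgraph_sequence(entities, knowledge_graph):
    """Stage 1: one sub-graph (entity plus its neighbors) per entity."""
    return [{e, *knowledge_graph.get(e, set())} for e in entities]

def represent_subgraphs(subgraphs):
    """Stage 1 output: sub-graph sizes serve as toy 'features' here."""
    return [len(sg) for sg in subgraphs]

def represent_entities(entities, knowledge_graph):
    """Stage 2: entity representation enriched by its neighbor count."""
    return [1 + len(knowledge_graph.get(e, set())) for e in entities]

def predict_probability(subgraph_repr, entity_repr):
    """Stages 3-4 collapsed: fuse both representations, squash to (0, 1)."""
    score = sum(subgraph_repr) + sum(entity_repr)
    return score / (1 + score)

kg = {"cough": {"pneumonia", "cold"}, "fever": {"pneumonia"}}
entities = extract_entities("patient reports cough and fever",
                            ["cough", "fever", "rash"])
sub_reprs = represent_subgraphs(build_subgraph_sequence(entities, kg))
ent_reprs = represent_entities(entities, kg)
prob = predict_probability(sub_reprs, ent_reprs)
```

A real system would replace each stub with the trained components the embodiments describe (a NER model, a graph convolution network, a graph attention mechanism, LSTMs, and a classifier); the sketch only shows how the stages hand data to one another.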
In one embodiment, as shown in fig. 3, a medical text processing method is provided, which is illustrated by taking the computer device (the terminal 102 or the server 104) in fig. 1 as an example, and includes the following steps:
Step 302, a sequence of entities in a medical text is obtained.
The medical text is text data related to the physical condition of a user; for example, it may be a series of physical examination data obtained after a physical examination, or physical condition monitoring data recorded by the user. The entity sequence is a sequence constructed from the entities extracted from the medical text. The computer device may analyze structured discrete features from the medical text using natural language processing techniques (e.g., named entity recognition), including symptoms, diseases, medications, personal information of the user, past medical history, and so on, and may obtain features such as "barking cough", "hoarseness", "dyspnea", and "diabetes". A user's medical text may include a plurality of entities, which form the entity sequence corresponding to that medical text. The computer device may also obtain an electronic medical record of the user and extract each entity from the electronic medical record using named entity recognition to obtain the entity sequence.
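As a minimal stand-in for the named entity recognition step described above, the sketch below matches a fixed dictionary of medical terms against a note and returns the entities in their order of appearance; the dictionary and note are invented for illustration, and a production system would use a trained NER model instead:

```python
import re

# Hypothetical medical term dictionary; a real system would use trained NER.
MEDICAL_TERMS = ["barking cough", "hoarseness", "dyspnea", "diabetes"]

def extract_entity_sequence(text):
    """Dictionary-based extraction, keeping entities in text order."""
    hits = []
    for term in MEDICAL_TERMS:
        for m in re.finditer(re.escape(term), text.lower()):
            hits.append((m.start(), term))
    hits.sort()  # sort by position so the sequence follows the text
    return [term for _, term in hits]

note = "Presents with barking cough and hoarseness; history of diabetes."
seq = extract_entity_sequence(note)
```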
In one embodiment, obtaining the sequence of entities in the medical text includes: acquiring a plurality of medical texts; extracting a medically relevant entity sequence from each medical text; and sorting the entity sequences according to the time order in which the medical texts were generated, to obtain a plurality of groups of entity sequences corresponding to the plurality of medical texts.
In this embodiment, a user may input a plurality of medical texts, where each medical text includes a plurality of entities, and the computer device may sort the entity sequences extracted from the respective medical texts according to a time sequence of the medical text generation, to obtain a plurality of groups of entity sequences corresponding to the plurality of medical texts, so as to facilitate mining of potential timing relationships between the plurality of groups of entity sequences.
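The time ordering of the per-text entity sequences can be sketched as follows; the records and dates are hypothetical placeholders for texts a user might input:

```python
from datetime import date

# Hypothetical records: (generation date of the medical text,
# entity sequence extracted from that text).
records = [
    (date(2020, 3, 1), ["cough", "fever"]),
    (date(2020, 1, 15), ["cough"]),
    (date(2020, 2, 10), ["cough", "hoarseness"]),
]

# Sort the entity sequences by the time each medical text was generated,
# yielding the grouped input sequences described above.
grouped = [seq for _, seq in sorted(records, key=lambda r: r[0])]
```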
Step 304, querying the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph.
The knowledge graph is a formalized description framework for semantic knowledge, and data in a knowledge graph are usually represented as triples, i.e., (entity, relation, entity). The medical knowledge graph is a combination of medical domain expertise and the logical patterns of clinical reasoning formed through years of medical study and clinical work; it comprises structured medical knowledge points (information and data) and logical relations between them that accord with medical knowledge, for example, entity 1 causes entity 2, or entity 1 relieves entity 2. By associating the structured data with this professional logic, the flow direction and order in which the artificial intelligence processes the condition data are determined, and the computer program operates accordingly, thereby simulating a doctor's cognition of and reasoning about the condition data and realizing medical artificial intelligence. The medical knowledge graph may be a knowledge graph of a clinical specialty, such as a cardiovascular disease knowledge graph, a pulmonary disease knowledge graph, or a critical disease knowledge graph.
The inventor realized that, in practical scenarios, the obtained entity sequence data are usually noisy, and recording errors or omissions can lead to incomplete data. After the entity sequence corresponding to the medical text is obtained, if a prediction is made only by mining effective information from the entity sequence, without guidance from domain knowledge, the predicted probability of the predetermined category can be biased by data noise. Therefore, in this embodiment, the computer device may query, according to the medical knowledge graph, the sub-graph corresponding to each entity in the entity sequence to obtain the sub-graph sequence. Since each sub-graph includes not only the other entities associated with the entity in the medical knowledge graph but also the relations between them, the domain knowledge can be integrated into the representation of the entity, which enriches the semantic information and reduces the prediction bias caused by data recording errors or missing data.
In one embodiment, the computer device may query a medical knowledge graph, determine an entity node corresponding to an entity in the sequence of entities from the medical knowledge graph, and use a subgraph formed by the entity node, a neighbor node of the entity node, and a side relationship of the neighbor node as the subgraph corresponding to the entity. And obtaining a sub-image sequence according to the sub-image corresponding to each entity in the entity sequence, so that a new feature is constructed for the entity sequence, wherein the new feature comprises neighbor nodes of the entity in the knowledge graph, and semantic information of the entity can be enriched.
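The neighbor-based sub-graph construction just described can be sketched over a toy triple store; the triples below are invented examples, and a real medical knowledge graph would be queried through a graph database:

```python
# Toy medical knowledge graph as (head, relation, tail) triples.
TRIPLES = [
    ("cough", "symptom_of", "pneumonia"),
    ("fever", "symptom_of", "pneumonia"),
    ("pneumonia", "treated_by", "antibiotics"),
]

def entity_subgraph(entity):
    """Sub-graph: the entity node, its neighbor nodes, and the edge
    relations that connect them."""
    edges = [t for t in TRIPLES if entity in (t[0], t[2])]
    nodes = {entity}
    for h, _, t in edges:
        nodes.update((h, t))
    return nodes, edges

nodes, edges = entity_subgraph("pneumonia")
```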
The neighbor node may be a first-order neighbor node or a second-order neighbor node.
In one embodiment, the computer device may further query a medical knowledge graph, determine an entity node corresponding to an entity in the entity sequence from the medical knowledge graph, determine a target node corresponding to a predetermined category, and use a subgraph formed by a node and a side relationship related to a path from the entity node to the target node as the subgraph corresponding to the entity. The target node may be a node corresponding to an entity closely related to the predetermined category in the medical knowledge graph, for example, the entity may be at least one of a target disease, a drug required for treating the target disease, a symptom of the target disease, a diagnosis conclusion in the medical knowledge field, and the like. According to the sub-image corresponding to each entity in the entity sequence, a sub-image sequence is obtained, so that a new feature is constructed for the entity sequence, the new feature comprises all nodes on the path from the entity to the preset category, which is equivalent to introducing more data to participate in the representation of the entity, and the prediction deviation caused by data recording errors or missing data can be reduced.
In one embodiment, by querying the medical knowledge graph, the computer device may also take the neighbor nodes of the entity node, together with all nodes on the path from the entity node to the target node, as the sub-graph of the entity. As shown in fig. 4, querying the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph includes:
step 402, for an entity in the entity sequence, determining a corresponding entity node in the medical knowledge-graph.
Step 404, determining neighbor nodes of the entity node from the medical knowledge graph.
Step 406, determining a path from the entity node to the target node from the medical knowledge-graph.
And step 408, obtaining a sub-graph corresponding to the entity according to the neighbor nodes and the nodes on the path.
Step 410, obtaining a sub-graph sequence according to sub-graphs corresponding to each entity in the entity sequence.
Fig. 5 is a schematic diagram of a sub-graph corresponding to an entity, constructed from the medical knowledge graph, in one embodiment. Referring to fig. 5, the white nodes schematically represent part of the nodes of the medical knowledge graph, the black node represents the target node, and the gray node represents the entity node corresponding to a certain entity in the entity sequence. The neighbor nodes of the entity node, together with all nodes on the path from the entity node to the target node, constitute the sub-graph corresponding to the entity; as shown in fig. 5, the nodes and edges enclosed by the dashed box form this sub-graph.
In this embodiment, the constructed subgraph is equivalent to constructing a new feature for the entity sequence, where the new feature includes neighboring nodes of the entity in the knowledge graph, so that semantic information of the entity can be enriched, and includes nodes and side relations related to a path from the entity node to the target node, so that prediction bias caused by data recording errors or missing data can be reduced.
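Steps 402 to 410 can be sketched with a breadth-first search over a toy adjacency structure; the graph below and the choice of BFS for the entity-to-target path are illustrative assumptions, since the patent does not fix a path-finding algorithm:

```python
from collections import deque

# Undirected adjacency for a toy knowledge graph (hypothetical data).
ADJ = {
    "cough": {"pneumonia"},
    "pneumonia": {"cough", "fever", "antibiotics"},
    "fever": {"pneumonia"},
    "antibiotics": {"pneumonia"},
}

def shortest_path(start, target):
    """BFS from the entity node to the target node (step 406)."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in ADJ.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return []

def subgraph_nodes(entity, target):
    """Steps 404-408: union of the entity's neighbors and all nodes
    on its path to the target node."""
    return {entity} | set(ADJ.get(entity, set())) | set(shortest_path(entity, target))

nodes = subgraph_nodes("cough", "antibiotics")
```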
Step 306, obtaining the sub-graph sequence representation after performing graph convolution operation on each sub-graph in the sub-graph sequence.
The graph convolution operation is used to extract features from graph data; performing a graph convolution operation on graph data yields the features of that data. The inventor realized that existing methods of determining the probability that a medical text belongs to a predetermined category with a knowledge graph do not sufficiently mine the medical knowledge, so the determined probability is not accurate. Because the graph convolution operation can process both the nodes in a sub-graph and the edge relations between those nodes, extracting the features of the sub-graph jointly from its nodes and edge relations yields a sub-graph representation that not only mines the semantic information in the medical knowledge graph but also captures the graph structure information, thereby making fuller use of the domain knowledge.
In one embodiment, as shown in fig. 6, after performing a graph convolution operation on each sub-graph, obtaining a sub-graph sequence representation corresponding to the entity sequence includes:
step 602, inputting each sub-graph into a graph convolution network to obtain the graph convolution features corresponding to each sub-graph.
The graph convolution network is a neural network structure for processing graph data. Unlike traditional networks such as convolutional neural networks, which operate only on grid-based data, it can process data with a generalized topological structure. The graph convolution network may employ ChebNet (Chebyshev network), GCN (graph convolutional neural network), or the like.
And step 604, performing a pooling operation on the graph convolution features to obtain the vector representation corresponding to each sub-graph.
The pooling operation (Pooling), also called sub-sampling (Subsampling) or down-sampling (Downsampling), is a data processing operation in neural networks used to reduce the dimensionality of data, inspired by the human visual system. When constructing a neural network, a pooling operation is often applied after a convolutional layer: it reduces the feature dimensionality of the convolutional layer's output, effectively reduces the number of network parameters, and helps prevent overfitting. The pooling operation here may be global max pooling, global average pooling, random pooling, and so on, and may be chosen according to actual requirements.
And step 606, fusing the vector representations corresponding to the sub-graphs to obtain the sub-graph sequence representation corresponding to the entity sequence.
Specifically, the computer device may fuse the vector representations of the sub-graphs corresponding to the entities in the entity sequence to obtain the sub-graph sequence representation corresponding to that entity sequence.
In one embodiment, the sub-graph sequence representation corresponding to the entity sequence may be obtained by the following formulas:

v_i = GlobalMaxPooling(GCN(sg_i))

where GCN() is the graph convolution operation, GlobalMaxPooling() is the global max pooling operation, sg_i is the sub-graph corresponding to the ith entity in the entity sequence (which may also be called the ith sub-graph), and v_i is the vector representation of the ith sub-graph. When a plurality of entity sequences are extracted from the medical text, the vector representations of the sub-graphs corresponding to the entities in the jth entity sequence are merged into one vector representation, i.e., the sub-graph sequence representation corresponding to that entity sequence:

s_j^v = W_v · [v_1; v_2; …; v_n] + b_v

where s_j^v denotes the sub-graph sequence representation corresponding to the jth entity sequence, [v_1; v_2; …; v_n] denotes the combination of the sub-graph vector representations, and W_v and b_v are trained model parameters.
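The graph convolution plus global max pooling step can be illustrated with a deliberately tiny, dependency-free sketch; the one-dimensional node features, the averaging propagation rule, and the weight value are all simplifying assumptions, not the patent's trained GCN:

```python
import math

# Minimal one-layer graph convolution followed by global max pooling,
# i.e., a toy version of v_i = GlobalMaxPooling(GCN(sg_i)).

def gcn_layer(adj, feats, weight):
    """One propagation step: average each node with its neighbors
    (plus a self-loop), scale by a weight, apply tanh."""
    n = len(feats)
    out = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]] + [i]  # add self-loop
        agg = sum(feats[j] for j in nbrs) / len(nbrs)
        out.append(math.tanh(weight * agg))
    return out

def global_max_pooling(node_feats):
    """Collapse the node features into one sub-graph value."""
    return max(node_feats)

# 3-node sub-graph: node 0 is linked to nodes 1 and 2.
adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
feats = [1.0, 0.5, -0.5]
v = global_max_pooling(gcn_layer(adj, feats, weight=0.8))
```

A practical implementation would use a graph learning library with multi-dimensional features and learned weight matrices; the sketch only makes the aggregate-then-pool mechanics concrete.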
Step 308, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence.
In particular, to obtain meaningful vector representations of entities and edges in a medical knowledge-graph, a computer device may obtain distributed vector representations of nodes and edges in the medical knowledge-graph by utilizing knowledge-graph representation learning (Knowledge Graph Representation Learning) during training of a medical text-processing model. After the model is trained, the entity node corresponding to the entity can be directly queried in the medical knowledge graph to obtain the distributed vector representation corresponding to the entity node, and the distributed vector representation corresponding to each entity in the entity sequence is fused with the distributed vector representation corresponding to the neighbor node to obtain the entity sequence representation corresponding to the entity sequence.
In the training process of the medical text processing model, the computer device can use a Trans model, such as the TransE model, to learn the distributed vector representations of the entities and edge relations in the medical knowledge graph. The idea of the TransE model is that, during training, the sum of the distributed vector corresponding to the head node and the distributed vector corresponding to the edge relation in each entity-relation triple should approximate, as closely as possible, the distributed vector corresponding to the tail node. In some alternative embodiments, the computer device may also employ other Trans models to learn the distributed vectors of the entity and edge relations in the medical knowledge graph as needed, such as the TransH, TransR, and TransD models.
In one embodiment, the TransE model is used to obtain the vector representations of the node and edge relations in the medical knowledge graph, as shown in the following formula:

h_i + r_i ≈ t_i

where h_i, r_i, and t_i are respectively the head entity, edge relation, and tail entity corresponding to the ith edge in the medical knowledge graph. The features corresponding to the head entity, edge relation, and tail entity are obtained through the TransE model, and a Multi-Layer Perceptron (MLP) is then used to obtain the vector representations corresponding to these features.
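The TransE constraint h + r ≈ t can be illustrated with a plausibility score; the 3-dimensional embeddings below are invented values, and a real system would learn them by minimizing this distance over the knowledge graph's triples:

```python
# Sketch of the TransE idea: the head-entity vector plus the relation
# vector should lie close to the tail-entity vector (h + r ≈ t).

def transe_distance(h, r, t):
    """L2 distance between (h + r) and t; smaller = more plausible triple."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

# Hypothetical 3-dimensional embeddings.
h = [0.2, 0.1, 0.0]
r = [0.3, 0.4, 0.5]
good_t = [0.5, 0.5, 0.5]   # exactly h + r: a perfectly fit triple
bad_t = [-1.0, 2.0, 0.0]   # far from h + r: an implausible triple

plausible = transe_distance(h, r, good_t)
implausible = transe_distance(h, r, bad_t)
```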
In one embodiment, the method further comprises: for an entity in the entity sequence, determining the corresponding entity node in the medical knowledge graph; determining the neighbor nodes of the entity node from the medical knowledge graph; determining the edge relations between the entity node and the neighbor nodes from the medical knowledge graph; and determining the distributed vector representations corresponding to the entity node, the edge relations, and the neighbor nodes; wherein the sum of the distributed vector representations of the entity node and an edge relation approximates the distributed vector representation of the corresponding neighbor node.
Specifically, for each entity in the entity sequence, the computer device queries the medical knowledge graph for the entity node corresponding to the entity and for the neighbor nodes of that entity node. After determining the neighbor nodes, it determines the distributed vector representations corresponding to each (entity, edge relation, entity) triple related to the entity node; in each such triple, the sum of the distributed vector representations corresponding to the entity node and the edge relation is very close to the distributed vector representation of the neighbor node. After determining the distributed vector representations corresponding to each triple related to the entity node, the computer device fuses the distributed vector representations of the neighbor nodes with the distributed vector representation of the entity node, and the fused result serves as the knowledge representation of the entity node in the medical knowledge graph, from which the entity sequence representation corresponding to the whole entity sequence is obtained.
In this embodiment, since the medical knowledge graph includes structured medical knowledge points and the logical associations between them that accord with medical knowledge, embedding the distributed vector representations of the neighbor nodes into the entity representation yields richer semantic information for the entity, thereby improving the prediction accuracy.
In one embodiment, as shown in fig. 7, fusing a distributed vector representation corresponding to each entity in an entity sequence with a distributed vector representation corresponding to a neighboring node of each entity in a medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, including:
step 702, for an entity in the entity sequence, determining a corresponding entity node in the medical knowledge-graph.
In step 704, neighboring nodes of the entity node are determined from the medical knowledge-graph.
At step 706, the attention weight of each neighbor node to the entity node is determined.
In this embodiment, in order to extract, from the plurality of neighbor nodes of an entity node, the neighbor-node vector representations that have a greater impact on the entity node, the computer device uses a graph attention mechanism (Graph Attention) to obtain the attention weight of each neighbor node with respect to the entity node. That is, the attention weight represents a neighbor node's influence on the entity node: the greater the attention weight, the greater its influence on the entity node, and the more critical it is to the output knowledge representation of the entity node.
In one embodiment, determining the attention weight of each neighbor node with respect to the entity node includes: determining the attention score of a neighbor node for the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relation between the entity node and the neighbor node, and the distributed vector representation corresponding to the entity node; and obtaining the attention weight corresponding to each neighbor node according to the attention scores of the neighbor nodes for the entity node.
Specifically, in order to express different degrees of attention to the different neighbor nodes of an entity node when determining that entity node's representation, the computer device obtains the attention score of each neighbor node for the entity node. The computer device can input the distributed vector representations corresponding to the triples formed by the entity node and its neighbor nodes into a graph attention model, calculate the attention score of each neighbor node for the entity node, and normalize the attention scores to obtain the corresponding attention weights.
In one embodiment, the attention weight corresponding to a neighbor node may be expressed by the following formulas:

q_ij = e_r^ij · tanh(e_h^i + e_t^j)

α_ij = exp(q_ij) / Σ_m exp(q_im)

where q_ij denotes the attention score of the jth neighbor node for the ith entity node in the entity sequence; e_h^i, e_r^ij, and e_t^j are respectively the vector representations corresponding to the head entity, edge relation, and tail entity in the triple formed by the ith entity node and its jth neighbor node; tanh is an activation function used to introduce nonlinearity; α_ij denotes the attention weight of the jth neighbor node for the ith entity node in the entity sequence; and exp() denotes the exponential function with the natural constant e as its base.
For example, for the ith entity node A in the entity sequence, there are 3 neighbor nodes in the medical knowledge graph, namely node B, node C, and node D. With node B it forms the edge relation AB, with node C the edge relation AC, and with node D the edge relation AD. Then the attention score corresponding to node B is:

q_AB = e_AB · tanh(e_A + e_B);

similarly, the attention scores corresponding to nodes C and D are:

q_AC = e_AC · tanh(e_A + e_C);

q_AD = e_AD · tanh(e_A + e_D);

the attention weight corresponding to node B is:

α_AB = exp(q_AB) / (exp(q_AB) + exp(q_AC) + exp(q_AD));

similarly, the attention weights corresponding to nodes C and D are:

α_AC = exp(q_AC) / (exp(q_AB) + exp(q_AC) + exp(q_AD));

α_AD = exp(q_AD) / (exp(q_AB) + exp(q_AC) + exp(q_AD)).
Step 708, according to the attention weight, the distributed vector representation corresponding to the entity node is fused with the distributed vector representation of each neighboring node, and the entity representation corresponding to the entity node is obtained.
Specifically, after the computer device obtains the attention weight of each neighbor node to the entity node, the distributed vector representation of each neighbor node is weighted and summed according to the attention weight to obtain the entity representation corresponding to the entity node, and the entity representation is embedded with the semantic information of the neighbor node of the entity node.
In one embodiment, the entity representation corresponding to the ith entity in the entity sequence may be expressed by the following formula:

g_i = Σ_j α_ij · e_t^j

where g_i denotes the entity representation corresponding to the ith entity in the entity sequence.
According to the example above, for node A, its corresponding entity representation is:

g_A = α_AB · e_B + α_AC · e_C + α_AD · e_D.
Step 710, obtaining the entity sequence representation corresponding to the entity sequence according to the entity representations corresponding to the entities in the entity sequence.
Specifically, the computer device may obtain the entity representation corresponding to each entity in the entity sequence in the above manner, and take them together as the entity sequence representation corresponding to the entity sequence. When a plurality of entity sequences correspond to the medical texts, a plurality of groups of entity sequence representations can be obtained.
In one embodiment, when a plurality of entity sequences are extracted from the medical text, the entity representations corresponding to the entities in the jth entity sequence can be merged into one vector representation according to the following formula:

s_j^k = W_k · [g_1; g_2; …; g_n] + b_k

where s_j^k denotes the entity sequence representation corresponding to the jth entity sequence, [g_1; g_2; …; g_n] denotes the combination of the entity representations, and W_k and b_k are trained model parameters.
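The A/B/C/D worked example above can be checked in code; to keep the sketch readable, the node and edge embeddings are scalars rather than vectors, and the numeric values are invented:

```python
import math

# Score each neighbor with q = e_edge * tanh(e_head + e_tail),
# softmax-normalize into attention weights, then form the entity
# representation as the attention-weighted sum of neighbor embeddings.

e = {"A": 0.5, "B": 1.0, "C": -0.2, "D": 0.3}   # node embeddings (scalars)
edge = {"AB": 0.8, "AC": 0.4, "AD": 0.6}        # edge embeddings (scalars)

def attention_weights(center, neighbors):
    scores = {n: edge[center + n] * math.tanh(e[center] + e[n])
              for n in neighbors}
    z = sum(math.exp(s) for s in scores.values())
    return {n: math.exp(s) / z for n, s in scores.items()}

alpha = attention_weights("A", ["B", "C", "D"])
g_A = sum(alpha[n] * e[n] for n in ["B", "C", "D"])  # entity representation
```

Note how the neighbor whose score q is largest (here B, whose edge and node embeddings are largest) receives the largest weight, matching the stated intent that more influential neighbors dominate the entity representation.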
At step 310, a probability that the medical text belongs to a predetermined category is determined based on the sub-image sequence representation and the entity sequence representation.
Here, a text can reflect attributes of the user associated with it, and a medical text can reflect the user's physical condition. For example, when the medical text is the user's medical diagnosis data, it can reflect the user's diagnosis; when the medical text is body monitoring data recorded by the user, it can reflect the user's physical condition. Accordingly, the predetermined category is a user-related category reflected by the text. For the first example, two predetermined categories may be set, namely normal and lesion, or three predetermined categories may be set, namely normal, mild lesion, and severe lesion. For the second example, three predetermined categories may be set, namely good health, general health, and sub-health.
In some embodiments, when there are only two categories, the computer device may determine, based on the sub-graph sequence representation and the entity sequence representation, only the probability that the medical text belongs to one of the predetermined categories; the probability that the medical text belongs to the other predetermined category can then also be determined.
Specifically, after obtaining the sub-graph sequence representation and the entity sequence representation corresponding to each entity sequence, the computer device may obtain a final representation of the medical text based on the sub-graph sequence representation and the entity sequence representation, and determine the probability that the medical text belongs to the predetermined category using the final representation. The probability is a value between 0 and 1; the greater the probability, the greater the likelihood that the medical text belongs to the predetermined category.
In an alternative embodiment, where the medical text may correspond to one or more categories, the predetermined categories herein may also be one or more, and the computer device may input a final representation of the medical text into a classifier corresponding to a different predetermined category to obtain a probability that the medical text belongs to the different predetermined category.
In one embodiment, as shown in fig. 8, determining the probability that the medical text belongs to the predetermined category based on the sub-graph sequence representation and the entity sequence representation includes:
Step 802, obtaining the sub-graph hidden state of the sub-graph sequence representation through a first recurrent network.
Step 804, obtaining the entity hidden state of the entity sequence representation through a second recurrent network.
Specifically, the sub-graph hidden state captures the temporal relationships between the sub-graph representations within the sub-graph sequence representation, and the entity hidden state captures the temporal relationships between the entity representations within the entity sequence representation. In one embodiment, the temporal relationships inside the sub-graph sequence representation and the entity sequence representation may be mined separately by the following formulas:

H_v^j = LSTM_v(V_j), H_k^j = LSTM_k(K_j)

where V_j denotes the sub-graph sequence representation corresponding to the j-th entity sequence and H_v^j denotes the sub-graph hidden state corresponding to the j-th entity sequence; K_j denotes the entity sequence representation corresponding to the j-th entity sequence and H_k^j denotes the entity hidden state corresponding to the j-th entity sequence; LSTM_v() and LSTM_k() are two different long short-term memory networks.
In alternative embodiments, the long short-term memory network may be replaced with other networks, such as recurrent neural networks, gated recurrent units, convolutional neural networks, and so forth.
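To make the LSTM-based temporal mining concrete, here is a self-contained sketch of a single-layer LSTM applied over a sequence of representation vectors; the gate layout (input, forget, candidate, output stacked row-wise in W and U) is a common convention and an assumption, not the patent's specification:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM step; W, U, b stack the four gates row-wise."""
    z = W @ x + U @ h_prev + b
    hdim = h_prev.shape[0]
    i = sigmoid(z[0 * hdim:1 * hdim])   # input gate
    f = sigmoid(z[1 * hdim:2 * hdim])   # forget gate
    g = np.tanh(z[2 * hdim:3 * hdim])   # candidate cell state
    o = sigmoid(z[3 * hdim:4 * hdim])   # output gate
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

def run_lstm(sequence, W, U, b, hdim):
    """Run the cell over a sequence of vectors (e.g. a sub-graph sequence
    representation); the final h serves as the hidden state."""
    h = np.zeros(hdim)
    c = np.zeros(hdim)
    for x in sequence:
        h, c = lstm_cell(x, h, c, W, U, b)
    return h
```

Two independently parameterised copies of `run_lstm` would play the roles of LSTM_v() and LSTM_k() for the sub-graph and entity sequence representations respectively.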
Step 806, fusing the hidden state of the subgraph and the hidden state of the entity to obtain the medical text representation.
In one embodiment, fusing the sub-graph hidden state and the entity hidden state to obtain the medical text representation includes: fusing the plurality of sub-graph hidden states corresponding to the plurality of medical texts to obtain a sub-graph hidden vector; fusing the plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector; and concatenating the sub-graph hidden vector with the entity hidden vector to obtain the medical text representation.
In one embodiment, the sub-graph hidden states corresponding to the plurality of entity sequences may be converted into a sub-graph hidden vector C_v in a unified vector space.
In one embodiment, the entity hidden states corresponding to the plurality of entity sequences may be converted into an entity hidden vector C_k in the unified vector space.
In one embodiment, the resulting two vectors may be fused through a fully connected layer into the final medical text representation:

C_p = W_p[C_v, C_k] + b_p
At step 808, a probability that the medical text belongs to a predetermined category is determined from the medical text representation by the classifier.
In this embodiment, a linear classifier may be employed to map the medical text representation to a probability value:

y = sigmoid(W_f C_p + b_f)
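The fusion and classification steps (hidden states → C_v, C_k → C_p → probability) can be sketched end to end as follows. Mean pooling is used here to fuse the per-text hidden states into C_v and C_k; this operator is an assumption, since the patent only says "fusing":

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_and_classify(subgraph_hiddens, entity_hiddens, Wp, bp, Wf, bf):
    """Fuse hidden states across medical texts, concatenate, project to the
    medical text representation C_p = W_p [C_v, C_k] + b_p, then classify."""
    Cv = np.mean(subgraph_hiddens, axis=0)      # sub-graph hidden vector
    Ck = np.mean(entity_hiddens, axis=0)        # entity hidden vector
    Cp = Wp @ np.concatenate([Cv, Ck]) + bp     # fully connected fusion
    return sigmoid(Wf @ Cp + bf)                # probability in (0, 1)
```

The single sigmoid output corresponds to one predetermined category; in the multi-category case, one (Wf, bf) pair per category would be applied to the same C_p.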
According to the medical text processing method, the sub-graph sequence corresponding to the entity sequence of the medical text is constructed by querying the domain knowledge in the medical knowledge graph, and the sub-graph sequence representation is obtained using the graph convolution operation. This is equivalent to constructing new features for the entity sequence, which not only enriches the input information but also mines the rich semantic information in the medical knowledge graph and captures the graph structure information. In addition, meaningful distributed vector representations of each entity and its neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representation of the entity, so that domain knowledge in the medical knowledge graph is more fully embedded into the entity sequence representation. In this way, the accuracy of the probability that the medical text belongs to the predetermined category, determined based on the sub-graph sequence representation and the entity sequence representation, may be improved.
As shown in fig. 9, in a specific embodiment, the medical text processing method includes the steps of:
step 902, a sequence of entities in a medical text is obtained.
Step 904, for the entities in the entity sequence, determining corresponding entity nodes in the medical knowledge-graph.
Step 906, determining neighbor nodes of the entity node from the medical knowledge graph.
Step 908, determining a path from the entity node to the target node from the medical knowledge-graph.
And step 910, obtaining a sub-graph corresponding to the entity according to the neighbor nodes and the nodes on the path.
Step 912, obtaining a sub-graph sequence according to sub-graphs corresponding to the entities in the entity sequence.
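Steps 904 to 912 build each entity's sub-graph from its neighbors plus the nodes on a path to the target node. A minimal sketch of that construction over a dictionary-based knowledge graph is below; the breadth-first shortest-path search stands in for the patent's unspecified path-finding method, and the toy node names are illustrative:

```python
from collections import deque

def build_subgraph(kg_edges, entity, target):
    """kg_edges: adjacency dict {node: set(neighbors)} for an undirected KG.
    The sub-graph of `entity` = the entity node, its neighbor nodes, and all
    nodes on a (shortest) path from the entity node to the target node."""
    neighbors = set(kg_edges.get(entity, ()))
    # BFS for one shortest path entity -> target.
    parent = {entity: None}
    seen = {entity}
    queue = deque([entity])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt in kg_edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                parent[nxt] = node
                queue.append(nxt)
    path = set()
    if target in parent:            # target reachable: collect path nodes
        node = target
        while node is not None:
            path.add(node)
            node = parent[node]
    return {entity} | neighbors | path

# Hypothetical toy KG: querying the sub-graph for "cough" toward "pneumonia".
kg = {"cough": {"fever", "flu"}, "flu": {"cough", "pneumonia"},
      "pneumonia": {"flu"}, "fever": {"cough"}}
sub = build_subgraph(kg, "cough", "pneumonia")
```

If the target is unreachable, only the entity node and its neighbors remain, which matches the step order: neighbors first (step 906), path nodes added when a path exists (step 908).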
Step 914, inputting each sub-graph into a graph convolution network to obtain the graph convolution features corresponding to each sub-graph.
Step 916, performing a pooling operation on the graph convolution features to obtain the vector representation corresponding to each sub-graph.
Step 918, fusing the vector representations corresponding to the sub-graphs to obtain the sub-graph sequence representation corresponding to the entity sequence.
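Steps 914 and 916 can be sketched as one graph-convolution layer followed by global max pooling over nodes. The symmetric normalization with self-loops used here is the standard GCN propagation rule, assumed rather than mandated by the patent:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: ReLU(D^{-1/2} (A+I) D^{-1/2} X W),
    where A is the adjacency matrix and X the node feature matrix."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W)

def subgraph_vector(A, X, W):
    """Graph convolution features, then global max pooling over the node
    axis, yielding one fixed-size vector per sub-graph (steps 914-916)."""
    return gcn_layer(A, X, W).max(axis=0)
```

The per-sub-graph vectors produced this way would then be fused (step 918), e.g. by concatenation and a learned linear map, into the sub-graph sequence representation.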
Step 920, for the entities in the entity sequence, determining corresponding entity nodes in the medical knowledge-graph.
In step 922, neighboring nodes of the entity node are determined from the medical knowledge-graph.
Step 924, determining the attention score of each neighbor node for the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relation between the entity node and the neighbor node, and the distributed vector representation corresponding to the entity node.
In step 926, according to the attention score of each neighboring node to the entity node, the attention weight corresponding to each neighboring node is obtained.
And step 928, according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain the entity representation corresponding to the entity node.
Step 930, obtaining the entity sequence representation corresponding to the entity sequence according to the entity representations corresponding to the entities in the entity sequence.
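Steps 924 to 928 score each neighbor from the (entity, relation, neighbor) triple, normalize the scores into attention weights, and fuse the weighted neighbor representations into the entity representation. The sketch below assumes a particular scoring parameterisation, q_ij = w · tanh(W_q [h; r; t]), and additive fusion; the patent only states that the score depends on the three representations through a tanh nonlinearity and that fusion is weighted by attention:

```python
import numpy as np

def neighbor_attention(h, triples, w, Wq):
    """h: entity-node vector; triples: list of (r_ij, t_ij) vectors, one per
    neighbor. Returns the fused entity representation and the attention
    weights (softmax over the per-neighbor scores, steps 924-926)."""
    scores = np.array([w @ np.tanh(Wq @ np.concatenate([h, r, t]))
                       for r, t in triples])
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    # Step 928: fuse the entity vector with weighted neighbor vectors.
    fused = h + sum(a * t for a, (_, t) in zip(weights, triples))
    return fused, weights
```

Applying this per entity and stacking the fused vectors in order yields the entity sequence representation of step 930.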
Step 932, obtaining the sub-graph hidden state of the sub-graph sequence representation through the first recurrent network.
Step 934, obtaining the entity hidden state of the entity sequence representation through the second recurrent network.
Step 936, fusing the plurality of sub-graph hidden states corresponding to the plurality of medical texts to obtain the sub-graph hidden vector.
Step 938, fusing the plurality of entity hidden states corresponding to the plurality of medical texts to obtain the entity hidden vector.
Step 940, concatenating the sub-graph hidden vector with the entity hidden vector to obtain the medical text representation.
At step 942, a probability that the medical text belongs to a predetermined category is determined from the medical text representation by the classifier.
It should be understood that, although the steps in the flowcharts of fig. 3 to 9 are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be executed in other orders. Moreover, at least some of the steps in fig. 3 to 9 may include sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 10, a training method of a medical text processing model is provided. The method is described, by way of illustration, as applied to the computer device (the terminal 102 or the server 104) in fig. 1, and includes the following steps:
step 1002, a sample medical text and labeling information corresponding to the sample medical text are obtained.
The labeling information corresponding to the sample medical text indicates whether the sample medical text belongs to the predetermined category; for example, the labeling information is 1 if the sample medical text belongs to the predetermined category and 0 if it does not. When there are a plurality of predetermined categories, the labeling information corresponding to the sample medical text is a multidimensional vector, and the value of each element in the multidimensional vector is 0 or 1. For example, to predict the probability that a medical text belongs to each of N predetermined categories, the labeling information of the medical text is an N-dimensional vector, each element of which indicates whether the medical text belongs to the corresponding predetermined category.
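Building the N-dimensional 0/1 annotation vector described above is a one-liner; this sketch assumes the common convention that 1 means "belongs to the category" (the translated text's 0/1 assignment reads as inverted), and the category names are hypothetical:

```python
def make_label_vector(categories, positive):
    """N-dimensional 0/1 annotation: element i is 1 iff the sample medical
    text belongs to predetermined category i (convention assumed here)."""
    return [1 if c in positive else 0 for c in categories]

# Hypothetical usage with the three-category lesion example.
labels = make_label_vector(["normal", "mild lesion", "severe lesion"],
                           {"mild lesion"})
```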
Step 1004, inputting the sample medical text into a medical text processing model; obtaining, through the medical text processing model, an entity sequence in the sample medical text; querying the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph; performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation; fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence; and determining the probability that the sample medical text belongs to the predetermined category based on the sub-graph sequence representation and the entity sequence representation.
For the implementation of step 1004, reference may be made to the previously described embodiments of the medical text processing method.
In one embodiment, by querying the medical knowledge graph, the computer device may take the neighbor nodes of the entity node together with all nodes on the path from the entity node to the target node as the sub-graph of the entity.
In one embodiment, the sub-graph sequence representation corresponding to the entity sequence may be obtained by the following formula:

v_i = GlobalMaxPooling(GCN(G_i))

where GCN() is a graph convolution operation, GlobalMaxPooling() is a global max pooling operation, G_i is the sub-graph corresponding to the i-th entity in the entity sequence (also called the i-th sub-graph), and v_i is the vector representation of the i-th sub-graph. Multiple entity sequences are extracted from the sample medical text, and the vector representations of the sub-graphs corresponding to the entities in the j-th entity sequence are merged into one vector representation, i.e., the sub-graph sequence representation corresponding to that entity sequence:

V_j = W_v[v_1, ..., v_n] + b_v

where V_j denotes the sub-graph sequence representation corresponding to the j-th entity sequence, and W_v and b_v are trained model parameters.
In one embodiment, the computer device uses the TransE model to obtain vector representations of the nodes and edge relations in the medical knowledge graph, based on the translation constraint:

h_i + r_i ≈ t_i

where h_i, r_i and t_i are respectively the head entity, edge relation and tail entity corresponding to the i-th edge in the medical knowledge graph. The features corresponding to the head entity, edge relation and tail entity are obtained through the TransE model, and the corresponding vector representations are then obtained using a Multi-Layer Perceptron (MLP).
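The TransE constraint h + r ≈ t can be turned into a plausibility score and a training signal; the sketch below shows the standard scoring function and margin ranking loss used with TransE (the margin value is illustrative):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE plausibility: a small ||h + r - t|| means the triple
    (head, relation, tail) likely holds in the knowledge graph."""
    return np.linalg.norm(h + r - t)

def transe_margin_loss(pos, neg, gamma=1.0):
    """Margin ranking loss over a positive triple and a corrupted triple;
    pos and neg are (h, r, t) tuples of embedding vectors."""
    return max(0.0, gamma + transe_score(*pos) - transe_score(*neg))
```

A perfectly translated triple scores 0, and a corrupted triple with a much larger score contributes no loss, which is what drives neighbor embeddings toward the h + r ≈ t association used later by the attention mechanism.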
In one embodiment, the attention weight corresponding to a neighbor node may be expressed by the following formulas:

q_ij = w^T tanh(W_q [h_i; r_ij; t_ij])
a_ij = exp(q_ij) / Σ_k exp(q_ik)

where q_ij denotes the attention score of the j-th neighbor node for the i-th entity node in the entity sequence; h_i, r_ij and t_ij are respectively the vector representations corresponding to the head entity, edge relation and tail entity of the triple formed by the i-th entity node and its j-th neighbor node; tanh is an activation function used to introduce nonlinearity; a_ij denotes the attention weight of the j-th neighbor node for the i-th entity node in the entity sequence; and exp() denotes the exponential function with natural base e.
In one embodiment, the entity representation corresponding to the i-th entity in the entity sequence may be expressed by the following formula:

g_i = h_i + Σ_j a_ij t_ij

where g_i denotes the entity representation corresponding to the i-th entity in the entity sequence, obtained by fusing the distributed vector representation of the entity node with the attention-weighted distributed vector representations of its neighbor nodes.
In one embodiment, a plurality of entity sequences are extracted from the sample medical text, and the entity representations corresponding to the entities in the j-th entity sequence may be fused into one vector representation, the entity sequence representation, analogously to the fusion of the sub-graph vector representations.
In one embodiment, the temporal relationships inside the sub-graph sequence representation and the entity sequence representation may be mined separately by the following formulas:

H_v^j = LSTM_v(V_j), H_k^j = LSTM_k(K_j)

where V_j denotes the sub-graph sequence representation corresponding to the j-th entity sequence and H_v^j denotes the corresponding sub-graph hidden state; K_j denotes the entity sequence representation corresponding to the j-th entity sequence and H_k^j denotes the corresponding entity hidden state; LSTM_v() and LSTM_k() are two different long short-term memory networks.
In one embodiment, the sub-graph hidden states corresponding to the plurality of entity sequences may be converted into a sub-graph hidden vector C_v in a unified vector space.
In one embodiment, the entity hidden states corresponding to the plurality of entity sequences may be converted into an entity hidden vector C_k in the unified vector space.
In one embodiment, the resulting two vectors may be fused through a fully connected layer into the final medical text representation:

C_p = W_p[C_v, C_k] + b_p
In one embodiment, the prediction probability that the sample medical text belongs to the predetermined category is determined from the medical text representation by the classifier, the prediction probability being obtained by the following formula:

ŷ = sigmoid(W_f C_p + b_f)
Step 1006, constructing the cross entropy loss from the prediction probability and the labeling information.
In one embodiment, the entire medical text processing model may be optimized with the binary cross entropy loss as the objective function:

L = -(y log ŷ + (1 - y) log(1 - ŷ))

where ŷ is the prediction probability output by the model and y is the labeling information corresponding to the sample medical text.
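The binary cross entropy objective is straightforward to implement; this sketch averages over categories for the multi-label case and clips the prediction for numerical stability (the clipping epsilon is an implementation detail, not part of the patent):

```python
import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-12):
    """L = -(y log(yhat) + (1 - y) log(1 - yhat)), averaged over the
    elements of the labeling vector; y_hat is clipped away from 0 and 1
    so the logarithms stay finite."""
    y = np.asarray(y, dtype=float)
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))))
```

The model parameters would then be updated by gradient descent on this loss (step 1008), repeating until the training stop condition is met.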
Step 1008, after updating the model parameters of the medical text processing model according to the cross entropy loss, return to the step of obtaining the sample medical text and the corresponding labeling information to continue training until the training stop condition is met, thereby obtaining the trained medical text processing model.
In one embodiment, when the medical text may belong to a plurality of predetermined categories, a plurality of classifiers are required to predict the prediction probability corresponding to each predetermined category. A cross entropy loss is constructed for each predetermined category based on the labeling information input for the plurality of predetermined categories; the objective function is obtained from the constructed cross entropy losses, and the model parameters of the medical text processing model are then updated according to the objective function.
FIG. 11 is a schematic diagram of a training framework for a medical text processing model in one embodiment. Referring to fig. 11, the medical text processing model includes a sub-graph sequence representation network, an entity sequence representation network, a temporal relationship mining network, and an output network. The computer device inputs the entity sequence extracted from the sample medical text into the sub-graph sequence representation network and the entity sequence representation network respectively. The sub-graph sequence representation network acquires the corresponding sub-graph sequence according to the medical knowledge graph and performs a graph convolution operation on the sub-graph sequence to obtain the sub-graph sequence representation. The entity sequence representation network acquires the corresponding neighbor nodes according to the knowledge graph and embeds the distributed vector representations of the neighbor nodes into the entity representations using graph attention to obtain the entity representation sequence. The temporal relationship mining network separately mines the temporal relationships inside the sub-graph representation sequence and the entity representation sequence to obtain the sub-graph hidden vector and the entity hidden vector. Finally, the output network fuses the two hidden vectors into the final medical text representation through a fully connected layer, maps the medical text representation to a probability value using a classifier to obtain the prediction probability, and constructs the cross entropy loss from the output prediction probability and the labeling information of the sample medical text to optimize the whole medical text processing model.
According to this training method of the medical text processing model, domain knowledge is mined from the medical knowledge graph during training, enriching the semantic information of the entity sequence corresponding to the sample medical text and improving the accuracy of the model. Constructing the sub-graph sequence corresponding to the entity sequence of the sample medical text and obtaining the sub-graph sequence representation using the graph convolution operation is equivalent to constructing new features for the entity sequence: it enriches the input information, mines the rich semantic information in the medical knowledge graph, and reduces the adverse effect on model training of deviations in the medical text data (such as missing data). Meanwhile, the graph convolution operation can capture the graph structure information in the medical knowledge graph, and this information yields more discriminative associations between entities and the predetermined categories, thereby improving model accuracy. Furthermore, the distributed vector representations of the neighbor nodes are embedded into the vector representation of each entity, so that domain knowledge in the medical knowledge graph is more fully embedded into the entity sequence representation. As a result, the accuracy of the probability that the medical text belongs to the predetermined category, determined by the model based on the sub-graph sequence representation and the entity sequence representation, is significantly improved.
In one embodiment, as shown in fig. 12, a medical text processing apparatus 1200 is provided, which may employ software modules or hardware modules, or a combination of both, as part of a computer device, the apparatus specifically comprising: an acquisition module 1202, a query module 1204, a sub-graph sequence representation module 1206, an entity sequence representation module 1208, and a determination module 1210, wherein:
An acquisition module 1202 for acquiring an entity sequence in a medical text;
the query module 1204 is configured to query the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph;
a sub-graph sequence representation module 1206, configured to obtain a sub-graph sequence representation after performing a graph convolution operation on each sub-graph in the sub-graph sequence;
The entity sequence representation module 1208 is configured to fuse the distributed vector representations corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighboring nodes of each entity in the medical knowledge graph, so as to obtain an entity sequence representation corresponding to the entity sequence;
a determining module 1210 is configured to determine a probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation.
In one embodiment, the acquisition module 1202 is further configured to acquire a plurality of medical texts; extract the medical entity sequence from each medical text; and sort the entity sequences according to the generation time order of the medical texts to obtain multiple groups of entity sequences corresponding to the plurality of medical texts.
In one embodiment, the query module 1204 is further configured to determine, for an entity in the sequence of entities, a corresponding entity node in the medical knowledge graph; determining neighbor nodes of the entity node from the medical knowledge graph; determining a path from the entity node to the target node from the medical knowledge graph; obtaining a sub-graph corresponding to the entity according to the neighbor nodes and the nodes on the path; and obtaining a sub-graph sequence according to the sub-graphs corresponding to the entities in the entity sequence.
In one embodiment, the sub-graph sequence representation module 1206 is further configured to input each sub-graph into a graph convolution network to obtain the graph convolution features corresponding to each sub-graph; perform a pooling operation on the graph convolution features to obtain the vector representation corresponding to each sub-graph; and fuse the vector representations corresponding to the sub-graphs to obtain the sub-graph sequence representation corresponding to the entity sequence.
In one embodiment, the apparatus further includes a distributed vector representation module configured to determine, for each entity in the entity sequence, the corresponding entity node in the medical knowledge graph; determine the neighbor nodes of the entity node from the medical knowledge graph; determine the edge relations between the entity node and the neighbor nodes from the medical knowledge graph; and determine the distributed vector representations corresponding to the entity node, the edge relations and the neighbor nodes, wherein the sum of the distributed vector representations of the entity node and the edge relation is associated with the distributed vector representation of the corresponding neighbor node.
In one embodiment, the entity sequence representation module 1208 is further configured to determine, for the entities in the entity sequence, corresponding entity nodes in the medical knowledge-graph; determining neighbor nodes of the entity node from the medical knowledge graph; determining the attention weight of each neighbor node to the entity node; according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain the entity representation corresponding to the entity node; and obtaining the entity sequence representation corresponding to the entity sequence according to the entity representation corresponding to each entity in the entity sequence.
In one embodiment, the entity sequence representation module 1208 is further configured to determine the attention score of each neighbor node for the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relation between the entity node and the neighbor node, and the distributed vector representation corresponding to the entity node; and obtain the attention weight corresponding to each neighbor node according to the attention score of each neighbor node for the entity node.
In one embodiment, the determining module 1210 is further configured to obtain, through the first recurrent network, the sub-graph hidden state of the sub-graph sequence representation; obtain, through the second recurrent network, the entity hidden state of the entity sequence representation; fuse the sub-graph hidden state and the entity hidden state to obtain the medical text representation; and determine, by the classifier, the probability that the medical text belongs to the predetermined category from the medical text representation.
In one embodiment, the determining module 1210 is further configured to fuse the plurality of sub-graph hidden states corresponding to the plurality of medical texts to obtain a sub-graph hidden vector; fuse the plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector; and concatenate the sub-graph hidden vector with the entity hidden vector to obtain the medical text representation.
In one embodiment, the apparatus further comprises a training module configured to acquire a sample medical text and the labeling information corresponding to the sample medical text; input the sample medical text into a medical text processing model; obtain, through the medical text processing model, an entity sequence in the sample medical text; query the sub-graph sequence corresponding to the entity sequence according to the medical knowledge graph; perform a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation; fuse the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence; obtain the prediction probability of the predetermined category based on the sub-graph sequence representation and the entity sequence representation; construct the cross entropy loss from the prediction probability and the labeling information; and, after updating the model parameters of the medical text processing model according to the cross entropy loss, return to the step of obtaining the sample medical text and the corresponding labeling information to continue training until the training stop condition is met, obtaining the medical text processing model used to determine the probability that the medical text belongs to the predetermined category.
The medical text processing device 1200 constructs the sub-graph sequence corresponding to the entity sequence of the medical text by querying the domain knowledge in the medical knowledge graph, and obtains the sub-graph sequence representation using the graph convolution operation. This is equivalent to constructing new features for the entity sequence, which not only enriches the input information but also mines the rich semantic information in the medical knowledge graph and captures the graph structure information. In addition, meaningful distributed vector representations of each entity and its neighbor nodes in the medical knowledge graph are obtained, and the distributed vector representations of the neighbor nodes are embedded into the vector representation of the entity, so that domain knowledge in the medical knowledge graph is more fully embedded into the entity sequence representation. In this way, the accuracy of the probability that the medical text belongs to the predetermined category, determined based on the sub-graph sequence representation and the entity sequence representation, may be improved.
For specific limitations of the medical text processing apparatus 1200, reference may be made to the above limitations of the medical text processing method, which are not repeated here. The various modules in the medical text processing device 1200 described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, as shown in fig. 13, a training apparatus 1300 of a medical text processing model is provided, which may employ software modules or hardware modules, or a combination of both, as part of a computer device, the apparatus specifically comprising: an acquisition module 1302, a determination module 1304, a loss construction module 1306, and a model update module 1308, wherein:
The obtaining module 1302 is configured to obtain a sample medical text and labeling information corresponding to the sample medical text;
A determining module 1304, configured to input a sample medical text into a medical text processing model, obtain an entity sequence in the sample medical text through the medical text processing model, query a sub-graph sequence corresponding to the entity sequence according to a medical knowledge graph, perform graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation, fuse a distributed vector representation corresponding to each entity in the entity sequence with a distributed vector representation corresponding to a neighboring node of each entity in the medical knowledge graph to obtain an entity sequence representation corresponding to the entity sequence, and determine a probability that the sample medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation;
A loss construction module 1306, configured to construct a cross entropy loss of the prediction probability and the labeling information;
the model updating module 1308 is configured to, after updating the model parameters of the medical text processing model according to the cross entropy loss, return to the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text, and continue training until the training stopping condition is met, obtain the medical text processing model for determining the probability that the medical text belongs to the predetermined category.
For specific limitations regarding the training apparatus 1300 of the medical text processing model, reference may be made to the above limitations of the medical text processing method and/or the training method of the medical text processing model, which are not repeated here. The various modules in the training apparatus 1300 of the medical text processing model described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server or a terminal, and whose internal structure may be as shown in fig. 14. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program is executed by the processor to implement a medical text processing method and/or a training method of a medical text processing model.
It will be appreciated by those skilled in the art that the structure shown in fig. 14 is merely a block diagram of part of the structure relevant to the present solution and does not limit the computer device to which the present solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the steps in the above-described method embodiments.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of a computer program stored on a non-transitory computer-readable storage medium which, when executed, may carry out the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination of them that involves no contradiction should be considered within the scope of this description.
The above examples illustrate only a few embodiments of the application and are described in detail, but they are not to be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within its scope. Accordingly, the scope of protection of the present application is to be determined by the appended claims.
Claims (21)
1. A medical text processing method, the method comprising:
acquiring an entity sequence in a medical text;
For an entity in the entity sequence, determining a corresponding entity node in a medical knowledge graph, determining neighbor nodes of the entity node from the medical knowledge graph, determining a path from the entity node to a target node from the medical knowledge graph, obtaining a sub-graph corresponding to the entity according to the neighbor nodes and the nodes on the path, and obtaining a sub-graph sequence according to the sub-graph corresponding to each entity in the entity sequence;
Carrying out a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation;
Fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph to obtain the entity sequence representation corresponding to the entity sequence;
Based on the sub-graph sequence representation and the entity sequence representation, determining a probability that the medical text belongs to a predetermined category.
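As an illustrative sketch only (not part of the claims), the sub-graph construction in claim 1 might proceed as below. The toy knowledge graph, the entity names and the breadth-first path search are all hypothetical assumptions for demonstration.

```python
# Hypothetical sketch of claim 1's sub-graph assembly: for each entity,
# collect its neighbors in the knowledge graph plus the nodes on a path
# to a target node. Graph contents and the BFS are illustrative only.
from collections import deque

# Toy knowledge graph as an adjacency dict (node -> set of neighbor nodes).
KG = {
    "cough": {"bronchitis", "cold"},
    "fever": {"cold", "flu"},
    "cold": {"cough", "fever", "rest"},
    "bronchitis": {"cough", "antibiotics"},
    "flu": {"fever", "antivirals"},
    "rest": {"cold"},
    "antibiotics": {"bronchitis"},
    "antivirals": {"flu"},
}

def shortest_path(graph, start, target):
    """Breadth-first search for one path from start to target."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return []

def build_subgraph(graph, entity, target):
    """Sub-graph nodes = the entity, its neighbors, and nodes on a path to target."""
    nodes = {entity} | graph.get(entity, set()) | set(shortest_path(graph, entity, target))
    # Keep only edges whose two endpoints both fall inside the node set.
    edges = {(u, v) for u in nodes for v in graph.get(u, ()) if v in nodes}
    return nodes, edges

entity_sequence = ["cough", "fever"]
subgraph_sequence = [build_subgraph(KG, e, target="rest") for e in entity_sequence]
```

Each entity in the sequence thus yields one sub-graph built from its neighbors plus the nodes on a path to the target node, and the per-entity sub-graphs in order form the sub-graph sequence.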
2. The method of claim 1, wherein the obtaining the sequence of entities in the medical text comprises:
Acquiring a plurality of medical texts;
extracting an entity sequence related to medicine from each medical text;
and ordering the entity sequences according to the chronological order in which the medical texts were generated, to obtain a plurality of groups of entity sequences corresponding to the medical texts.
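Claim 2 in miniature might look as follows; the texts, the dictionary of medical terms and the split-based extraction are illustrative assumptions, not the patent's actual extraction method.

```python
# Hypothetical sketch of claim 2: extract an entity sequence from each
# medical text, then order the sequences by the time the texts were
# generated. Texts and term dictionary are toy examples.
from datetime import date

texts = [
    {"generated": date(2021, 3, 2), "content": "fever and cough noted"},
    {"generated": date(2021, 1, 5), "content": "cough at night"},
]
MEDICAL_TERMS = {"fever", "cough"}   # stand-in for a real entity recognizer

def extract_entities(content):
    """Keep only words found in the (toy) medical-term dictionary."""
    return [w for w in content.split() if w in MEDICAL_TERMS]

ordered = sorted(texts, key=lambda t: t["generated"])   # chronological order
entity_sequences = [extract_entities(t["content"]) for t in ordered]
```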
3. The method according to claim 1, wherein the performing a graph convolution operation on each sub-graph to obtain the sub-graph sequence representation corresponding to the entity sequence comprises:
inputting each sub-graph into a graph convolutional network to obtain graph convolution features corresponding to each sub-graph;
Pooling the graph convolution features to obtain a vector representation corresponding to each sub-graph;
and fusing vector representations corresponding to the sub-graphs to obtain sub-graph sequence representations corresponding to the entity sequences.
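A minimal numerical sketch of claim 3's pipeline is given below. The single-layer GCN with symmetric normalization, the mean pooling, the random weights and the stacking used as "fusion" are all assumptions for demonstration, not the patent's actual network.

```python
# Illustrative graph convolution + pooling over a sub-graph sequence
# (claim 3). Architecture and weights are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(A, X, W):
    """One GCN layer: self-loops, symmetric normalization, ReLU."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

def subgraph_vector(A, X, W):
    """Graph convolution followed by mean pooling -> one vector per sub-graph."""
    return gcn_layer(A, X, W).mean(axis=0)

d_in, d_out = 4, 8
W = rng.normal(size=(d_in, d_out))
# Two toy sub-graphs: (adjacency matrix, node-feature matrix).
subgraphs = [
    (np.array([[0, 1], [1, 0]], float), rng.normal(size=(2, d_in))),
    (np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float), rng.normal(size=(3, d_in))),
]
vectors = [subgraph_vector(A, X, W) for A, X in subgraphs]
# Fuse the per-sub-graph vectors into the sub-graph sequence representation.
sequence_repr = np.stack(vectors)            # shape: (num_subgraphs, d_out)
```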
4. The method according to claim 1, wherein the method further comprises:
For the entities in the entity sequence, determining corresponding entity nodes in a medical knowledge graph;
Determining neighbor nodes of the entity node from the medical knowledge graph;
determining an edge relationship between the entity node and the neighbor node from the medical knowledge graph;
Determining distributed vector representations corresponding to the entity node, the edge relationship, and the neighbor node;
Wherein the sum of the distributed vector representation of the entity node and the distributed vector representation of the edge relationship is associated with the distributed vector representation of the neighbor node.
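The constraint in claim 4, that the entity-node vector plus the edge-relation vector should lie close to the neighbor-node vector, resembles the well-known TransE scoring idea. The toy embeddings below are chosen to satisfy it exactly; the patent does not specify these values.

```python
# TransE-style association (claim 4): ||h + r - t|| is small when the
# triple (entity, relation, neighbor) holds. Vectors are toy examples.
import numpy as np

h = np.array([0.2, 0.5, -0.1])   # entity node ("cough", hypothetical)
r = np.array([0.1, -0.2, 0.3])   # edge relation ("symptom_of", hypothetical)
t = np.array([0.3, 0.3, 0.2])    # neighbor node ("bronchitis", hypothetical)

def transe_score(h, r, t):
    """Lower score = stronger association between (h, r) and t."""
    return float(np.linalg.norm(h + r - t))

score = transe_score(h, r, t)    # near 0 here because h + r equals t
```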
5. The method according to claim 1, wherein the fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighboring node of each entity in the medical knowledge-graph to obtain the entity sequence representation corresponding to the entity sequence comprises:
For the entities in the entity sequence, determining corresponding entity nodes in the medical knowledge graph;
Determining neighbor nodes of the entity node from the medical knowledge graph;
Determining the attention weight of each neighbor node to the entity node;
According to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain the entity representation corresponding to the entity node;
And obtaining the entity sequence representation corresponding to the entity sequence according to the entity representation corresponding to each entity in the entity sequence.
6. The method of claim 5, wherein the determining the attention weight of each neighbor node to the entity node comprises:
Determining an attention score of each neighbor node for the entity node according to the distributed vector representation corresponding to the neighbor node, the distributed vector representation corresponding to the edge relationship between the neighbor node and the entity node, and the distributed vector representation corresponding to the entity node;
and obtaining the attention weight corresponding to each neighbor node according to the attention score of each neighbor node to the entity node.
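One way to realize claims 5-6 is sketched below: score each neighbor from its own vector, the edge-relation vector and the entity vector, turn the scores into attention weights with a softmax, and fuse. The dot-product scoring and additive fusion are assumptions; the patent does not fix a particular scoring function.

```python
# Hypothetical attention fusion over neighbor nodes (claims 5-6).
import numpy as np

def attention_fuse(entity_vec, neighbor_vecs, relation_vecs):
    # Score each neighbor: entity . (neighbor + relation) -- an assumed choice.
    scores = np.array([entity_vec @ (n + r)
                       for n, r in zip(neighbor_vecs, relation_vecs)])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax -> attention weights
    # Fuse: entity vector plus the attention-weighted sum of neighbor vectors.
    fused = entity_vec + (weights[:, None] * neighbor_vecs).sum(axis=0)
    return fused, weights

rng = np.random.default_rng(1)
entity = rng.normal(size=4)
neighbors = rng.normal(size=(3, 4))          # 3 neighbor-node vectors
relations = rng.normal(size=(3, 4))          # matching edge-relation vectors
fused, weights = attention_fuse(entity, neighbors, relations)
```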
7. The method of claim 1, wherein the determining a probability that the medical text belongs to a predetermined category based on the sub-graph sequence representation and the entity sequence representation comprises:
Obtaining a sub-graph hidden state of the sub-graph sequence representation through a first convolutional network;
obtaining an entity hidden state of the entity sequence representation through a second convolutional network;
Fusing the sub-graph hidden state and the entity hidden state to obtain a medical text representation;
and determining the probability that the medical text belongs to a preset category according to the medical text representation through a classifier.
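The final stage of claim 7 can be sketched as follows: take the two hidden states, concatenate them into a medical text representation, and push that through a logistic classifier. The dimensions, random weights and sigmoid classifier are placeholders, not the patent's actual model.

```python
# Hypothetical fusion + classification stage (claim 7).
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

subgraph_hidden = rng.normal(size=8)     # from the first convolutional network
entity_hidden = rng.normal(size=8)       # from the second convolutional network

# Fusion by concatenation -> medical text representation.
text_repr = np.concatenate([subgraph_hidden, entity_hidden])
w, b = rng.normal(size=16), 0.0          # classifier parameters (placeholder)
probability = float(sigmoid(text_repr @ w + b))
```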
8. The method of claim 7, wherein the fusing the sub-graph hidden state and the entity hidden state to obtain the medical text representation comprises:
Fusing a plurality of sub-graph hidden states corresponding to a plurality of medical texts to obtain a sub-graph hidden vector;
Fusing a plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector;
And concatenating the sub-graph hidden vector with the entity hidden vector to obtain the medical text representation.
9. The method according to any one of claims 1 to 8, wherein the method is implemented by a medical text processing model, the training step of the medical text processing model comprising:
Acquiring a sample medical text and labeling information corresponding to the sample medical text;
Inputting the sample medical text into a medical text processing model, acquiring an entity sequence in the sample medical text through the medical text processing model, determining a corresponding entity node in a medical knowledge graph for an entity in the entity sequence, determining a neighbor node of the entity node from the medical knowledge graph, determining a path from the entity node to a target node from the medical knowledge graph, acquiring a sub-graph corresponding to the entity according to the neighbor node and the node on the path, acquiring a sub-graph sequence according to the sub-graph corresponding to each entity in the entity sequence, performing graph convolution operation on each sub-graph in the sub-graph sequence to acquire a sub-graph sequence representation, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph, acquiring the entity sequence representation corresponding to the entity sequence, and determining the prediction probability that the sample medical text belongs to a preset category based on the sub-graph sequence representation and the entity sequence representation;
constructing cross entropy loss between the prediction probability and the labeling information;
And after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of acquiring the sample medical text and the labeling information corresponding to the sample medical text to continue training, until a training stop condition is met, to obtain the medical text processing model for determining the probability that a medical text belongs to the predetermined category.
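The training loop of claims 9-10, predict, build a cross-entropy loss against the labels, update parameters, repeat until a stop condition, can be sketched minimally as below. The logistic model, synthetic data and fixed epoch count stand in for the patent's actual network and stopping criterion.

```python
# Minimal cross-entropy training loop sketch (claims 9-10), with a
# logistic model on synthetic data standing in for the real network.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(32, 6))             # stand-in for text representations
y = (X[:, 0] > 0).astype(float)          # synthetic labeling information
w = np.zeros(6)                          # model parameters

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(p, y):
    eps = 1e-12
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

losses = []
for epoch in range(200):                 # fixed count stands in for the stop condition
    p = sigmoid(X @ w)                   # predicted probability per sample
    losses.append(cross_entropy(p, y))   # loss between prediction and labels
    grad = X.T @ (p - y) / len(y)        # gradient of the cross-entropy loss
    w -= 0.5 * grad                      # update the model parameters
```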
10. A method of training a medical text processing model, the method comprising:
Acquiring a sample medical text and labeling information corresponding to the sample medical text;
Inputting the sample medical text into a medical text processing model, acquiring an entity sequence in the sample medical text through the medical text processing model, determining a corresponding entity node in a medical knowledge graph for an entity in the entity sequence, determining a neighbor node of the entity node from the medical knowledge graph, determining a path from the entity node to a target node from the medical knowledge graph, acquiring a sub-graph corresponding to the entity according to the neighbor node and the node on the path, acquiring a sub-graph sequence according to the sub-graph corresponding to each entity in the entity sequence, performing graph convolution operation on each sub-graph in the sub-graph sequence to acquire a sub-graph sequence representation, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph, acquiring the entity sequence representation corresponding to the entity sequence, and determining the prediction probability that the sample medical text belongs to a preset category based on the sub-graph sequence representation and the entity sequence representation;
Constructing cross entropy loss according to the prediction probability and the labeling information;
And after updating the model parameters of the medical text processing model according to the cross entropy loss, returning to the step of acquiring the sample medical text and the labeling information corresponding to the sample medical text to continue training, until a training stop condition is met, to obtain the medical text processing model for determining the probability that a medical text belongs to the predetermined category.
11. A medical text processing device, the device comprising:
The acquisition module is used for acquiring the entity sequence in the medical text;
The query module is used for determining entity nodes corresponding to the entity in the entity sequence, determining neighbor nodes of the entity nodes from the medical knowledge graph, determining paths from the entity nodes to target nodes from the medical knowledge graph, obtaining subgraphs corresponding to the entity according to the neighbor nodes and the nodes on the paths, and obtaining subgraph sequences according to subgraphs corresponding to each entity in the entity sequence;
the sub-graph sequence representation module is used for performing a graph convolution operation on each sub-graph in the sub-graph sequence to obtain a sub-graph sequence representation;
The entity sequence representation module is used for fusing the distributed vector representations corresponding to the entities in the entity sequence with the distributed vector representations corresponding to the neighbor nodes of the entities in the medical knowledge graph to obtain entity sequence representations corresponding to the entity sequence;
And the determining module is used for determining the probability that the medical text belongs to a preset category based on the sub-graph sequence representation and the entity sequence representation.
12. The apparatus of claim 11, wherein the acquisition module is further configured to acquire a plurality of medical texts; extract an entity sequence related to medicine from each medical text; and order the entity sequences according to the chronological order in which the medical texts were generated, to obtain a plurality of groups of entity sequences corresponding to the medical texts.
13. The apparatus of claim 11, wherein the sub-graph sequence representation module is further configured to input each sub-graph into a graph convolutional network to obtain graph convolution features corresponding to each sub-graph; pool the graph convolution features to obtain a vector representation corresponding to each sub-graph; and fuse the vector representations corresponding to the sub-graphs to obtain the sub-graph sequence representation corresponding to the entity sequence.
14. The apparatus of claim 11, wherein the apparatus further comprises:
The distributed vector representation module is used for determining, for the entities in the entity sequence, corresponding entity nodes in the medical knowledge graph; determining neighbor nodes of the entity node from the medical knowledge graph; determining an edge relationship between the entity node and the neighbor node from the medical knowledge graph; and determining distributed vector representations corresponding to the entity node, the edge relationship, and the neighbor node; wherein the sum of the distributed vector representation of the entity node and the distributed vector representation of the edge relationship is associated with the distributed vector representation of the neighbor node.
15. The apparatus of claim 11, wherein the entity sequence representation module is further configured to determine, for an entity in the entity sequence, a corresponding entity node in the medical knowledge-graph; determining neighbor nodes of the entity node from the medical knowledge graph; determining the attention weight of each neighbor node to the entity node; according to the attention weight, fusing the distributed vector representation corresponding to the entity node with the distributed vector representation of each neighbor node to obtain the entity representation corresponding to the entity node; and obtaining the entity sequence representation corresponding to the entity sequence according to the entity representation corresponding to each entity in the entity sequence.
16. The apparatus of claim 15, wherein the entity sequence representation module is further configured to determine an attention score of the neighboring node to the entity node based on the distributed vector representation corresponding to the neighboring node, the distributed vector representation corresponding to the edge relationship between the neighboring nodes, and the distributed vector representation corresponding to the entity node; and obtaining the attention weight corresponding to each neighbor node according to the attention score of each neighbor node to the entity node.
17. The apparatus of claim 11, wherein the determining module is further configured to obtain, via a first convolutional network, a sub-graph hidden state of the sub-graph sequence representation; obtain, via a second convolutional network, an entity hidden state of the entity sequence representation; fuse the sub-graph hidden state and the entity hidden state to obtain a medical text representation; and determine, through a classifier, the probability that the medical text belongs to a preset category according to the medical text representation.
18. The apparatus of claim 17, wherein the determining module is further configured to fuse a plurality of sub-graph hidden states corresponding to a plurality of medical texts to obtain a sub-graph hidden vector; fuse a plurality of entity hidden states corresponding to the plurality of medical texts to obtain an entity hidden vector; and concatenate the sub-graph hidden vector with the entity hidden vector to obtain the medical text representation.
19. A training device for a medical text processing model, the device comprising:
The acquisition module is used for acquiring a sample medical text and labeling information corresponding to the sample medical text;
The determining module is used for inputting the sample medical text into a medical text processing model, acquiring an entity sequence in the sample medical text through the medical text processing model, determining a corresponding entity node in a medical knowledge graph for an entity in the entity sequence, determining a neighbor node of the entity node from the medical knowledge graph, determining a path from the entity node to a target node from the medical knowledge graph, acquiring a sub-graph corresponding to the entity according to the neighbor node and the node on the path, acquiring a sub-graph sequence according to the sub-graph corresponding to each entity in the entity sequence, performing graph convolution operation on each sub-graph in the sub-graph sequence to acquire a sub-graph sequence representation, fusing the distributed vector representation corresponding to each entity in the entity sequence with the distributed vector representation corresponding to the neighbor node of each entity in the medical knowledge graph, acquiring an entity sequence representation corresponding to the entity sequence, and determining the prediction probability of the sample medical text belonging to a preset category based on the sub-graph sequence representation and the entity sequence representation;
the loss construction module is used for constructing cross entropy loss between the prediction probability and the labeling information;
And the model updating module is used for returning the step of obtaining the sample medical text and the labeling information corresponding to the sample medical text to continue training after updating the model parameters of the medical text processing model according to the cross entropy loss until the training stopping condition is met, and obtaining the medical text processing model for determining the probability that the medical text belongs to the preset category.
20. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 10 when the computer program is executed.
21. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method of any one of claims 1 to 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110001747.9A CN113673244B (en) | 2021-01-04 | 2021-01-04 | Medical text processing method, medical text processing device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113673244A CN113673244A (en) | 2021-11-19 |
CN113673244B true CN113673244B (en) | 2024-05-10 |
Family
ID=78538008
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110001747.9A Active CN113673244B (en) | 2021-01-04 | 2021-01-04 | Medical text processing method, medical text processing device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113673244B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114399048B (en) * | 2022-01-14 | 2024-09-13 | 河南大学 | Education field combined knowledge point prediction method and system based on graph convolution neural network and type embedding |
CN116504382B (en) * | 2023-05-29 | 2023-11-24 | 北京智胜远景科技有限公司 | Remote medical monitoring system and method thereof |
CN116842109B (en) * | 2023-06-27 | 2024-09-13 | 北京大学 | Information retrieval knowledge graph embedding method, device and computer equipment |
CN117594241B (en) * | 2024-01-15 | 2024-04-30 | 北京邮电大学 | Dialysis hypotension prediction method and device based on time sequence knowledge graph neighborhood reasoning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020147595A1 (en) * | 2019-01-16 | 2020-07-23 | 阿里巴巴集团控股有限公司 | Method, system and device for obtaining relationship expression between entities, and advertisement recalling system |
CN111475658A (en) * | 2020-06-12 | 2020-07-31 | 北京百度网讯科技有限公司 | Knowledge representation learning method, device, equipment and storage medium |
CN111931505A (en) * | 2020-05-22 | 2020-11-13 | 北京理工大学 | Cross-language entity alignment method based on subgraph embedding |
CN112000689A (en) * | 2020-08-17 | 2020-11-27 | 吉林大学 | Multi-knowledge graph fusion method based on text analysis |
CN112084383A (en) * | 2020-09-07 | 2020-12-15 | 中国平安财产保险股份有限公司 | Information recommendation method, device and equipment based on knowledge graph and storage medium |
CN112100406A (en) * | 2020-11-11 | 2020-12-18 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and medium |
Non-Patent Citations (3)
Title |
---|
Dual graph convolutional neural network for predicting chemical networks; Harada Shonosuke et al.; BMC Bioinformatics; 2020-12-31; Vol. 21, No. S3; full text *
Graph convolutional networks for computational drug development and discovery; Sun Mengying et al.; Briefings in Bioinformatics; 2020-08-31; Vol. 21, No. 03; full text *
Relation extraction method combining entity co-occurrence information and sentence semantic features; Ma Yudan; Zhao Yi; Jin Jing; Wan Huaiyu; Scientia Sinica Informationis; 2018-11-21 (11); full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113673244B (en) | Medical text processing method, medical text processing device, computer equipment and storage medium | |
CN107516110B (en) | Medical question-answer semantic clustering method based on integrated convolutional coding | |
Dalvi et al. | A survey of ai-based facial emotion recognition: Features, ml & dl techniques, age-wise datasets and future directions | |
Ambekar et al. | Disease risk prediction by using convolutional neural network | |
Chen et al. | Automatic social signal analysis: Facial expression recognition using difference convolution neural network | |
Qi et al. | Embedding deep networks into visual explanations | |
WO2020224433A1 (en) | Target object attribute prediction method based on machine learning and related device | |
Jarraya et al. | A comparative study of Autistic Children Emotion recognition based on Spatio-Temporal and Deep analysis of facial expressions features during a Meltdown Crisis | |
EP4094202A1 (en) | Encoding and transmission of knowledge, data and rules for explainable ai | |
Liao et al. | FERGCN: facial expression recognition based on graph convolution network | |
Dhawan et al. | Deep Learning Based Sugarcane Downy Mildew Disease Detection Using CNN-LSTM Ensemble Model for Severity Level Classification | |
CN110889505A (en) | Cross-media comprehensive reasoning method and system for matching image-text sequences | |
Martinez-Seras et al. | A novel out-of-distribution detection approach for spiking neural networks: design, fusion, performance evaluation and explainability | |
Kenger et al. | Fuzzy min–max neural networks: a bibliometric and social network analysis | |
Prusty et al. | Comparative analysis and prediction of coronary heart disease | |
CN114191665A (en) | Method and device for classifying man-machine asynchronous phenomena in mechanical ventilation process | |
CN113408721A (en) | Neural network structure searching method, apparatus, computer device and storage medium | |
Sonawane et al. | A design and implementation of heart disease prediction model using data and ECG signal through hybrid clustering | |
Liu et al. | Deep Fuzzy Multi-Teacher Distillation Network for Medical Visual Question Answering | |
Wu et al. | Question-driven multiple attention (dqma) model for visual question answer | |
Feng et al. | Learning twofold heterogeneous multi-task by sharing similar convolution kernel pairs | |
Xiong et al. | Adaptive graph-based feature normalization for facial expression recognition | |
Wang et al. | A Brain-inspired Computational Model for Human-like Concept Learning | |
CN114022698A (en) | Multi-tag behavior identification method and device based on binary tree structure | |
Hao | Human activity recognition based on WaveNet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code |
Ref country code: HK Ref legal event code: DE Ref document number: 40055312 Country of ref document: HK |
|
GR01 | Patent grant | ||