CN114639483A - Electronic medical record retrieval method and device based on graph neural network - Google Patents

Electronic medical record retrieval method and device based on graph neural network

Info

Publication number
CN114639483A
Authority
CN
China
Prior art keywords
medical
patient
vector representation
entity
entities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210291079.2A
Other languages
Chinese (zh)
Inventor
吕旭东
李梦阳
段会龙
蔡海领
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202210291079.2A
Publication of CN114639483A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Abstract

The invention discloses an electronic medical record retrieval method based on a graph neural network, comprising the following steps: acquiring a co-occurrence matrix of the medical entities in the electronic medical record; adding co-occurrence information of the medical entities and their ancestor medical entities to the matrix to obtain an enhanced medical entity co-occurrence matrix; extracting a vector representation of each medical entity and a vector representation of each patient using a GloVe model; constructing an electronic medical record heterogeneous graph comprising medical entity nodes, patient nodes, real link relationships between medical entities, and real link relationships between patients and medical entities; inputting the heterogeneous graph into a graph neural network to obtain the patient node output vector representation and the medical entity node output vector representation, and from them the probability of a link relationship between a patient and a medical entity and the probability of a link relationship between medical entities; and training the graph neural network with a total loss function and updating its parameters to obtain the final graph neural network. The method can be used to predict the probability of association between a patient and a medical entity.

Description

Electronic medical record retrieval method and device based on graph neural network
Technical Field
The invention relates to the technical field of medical information data processing, in particular to an electronic medical record retrieval method and device based on a graph neural network.
Background
Medical practice depends on large amounts of data and requires constant access to patient information for analysis and decision making. Electronic medical records, currently one of the main information sources, contain abundant information that is valuable for clinical decision support, clinical research, and clinical trials, and this work requires effective querying of electronic medical record data. Medical staff usually carry out query tasks without support from information technology personnel, so they must rely on their own knowledge to formulate queries. The query process is therefore challenging: finding the target information takes a great deal of browsing and exploration, which sharply reduces working efficiency and increases the workload of medical professionals. An automated method is needed to reduce the time cost for clinical staff.
Semantic association can effectively improve query performance in this process. In current practice, the medical entities in an electronic medical record are associated through medical ontology knowledge, and the target query entity is expanded along these relationships at query time. However, this approach depends too heavily on a general medical ontology and easily overlooks related information present in the electronic medical record itself; in addition, entities in the electronic medical record that do not appear in the medical ontology cannot be expanded, which limits the method's range of application.
The electronic medical record contains rich association information that can help optimize the query task. Following this idea, link relationships between electronic medical record data are established from different kinds of information, and the query task is then improved through these associations. With the development of machine learning, deep learning has proven effective in many fields, so modeling the electronic medical record with a neural network is an effective approach. A graph neural network can represent a complex topological structure, and the electronic medical record can be regarded as a complex heterogeneous graph, so a graph neural network can represent its structure effectively.
The graph neural network is a neural network structure developed from convolutional neural networks and graph representation learning. Unlike the data types that conventional neural networks target, it can extract and represent features of graph-structured data; it is efficient, easy to extend, and powerful for learning on graph data. Compared with traditional deep learning methods, the constructed graph model reflects both the entities and the connections between them. A graph neural network first initializes a description of each node, then repeatedly updates the node states until they encode neighbor node information and the network topology, and finally outputs the nodes through a specific method to obtain the required results, which can be used in subsequent tasks. It is therefore well suited to modeling heterogeneous electronic medical records.
Because of its specialization and complexity, the medical field has a great deal of medical ontology knowledge, such as ICD and SNOMED-CT. This knowledge can be used to relate different medical entities and to establish association information that does not exist in the electronic medical record, enriching the topological information in the network.
Disclosure of Invention
The invention discloses an electronic medical record retrieval method based on a graph neural network. The method expands the range of relationships between medical entities and between medical entities and patients, so that the probability of association between a patient and a medical entity can be predicted.
An electronic medical record retrieval method based on a graph neural network comprises the following steps:
(1) acquiring a co-occurrence matrix of medical entities in the electronic medical record, traversing ICD codes of medical ontology knowledge to acquire a plurality of ancestor medical entities corresponding to the medical entities, adding co-occurrence information of the medical entities and the ancestor medical entities into the medical entity co-occurrence matrix to acquire an enhanced medical entity co-occurrence matrix, extracting vector representation of each medical entity by adopting a GloVe model based on the enhanced medical entity co-occurrence matrix, and taking an aggregation result of a plurality of medical entity vector representations associated with a patient as the patient vector representation;
(2) constructing an electronic medical record heterogeneous graph, wherein the electronic medical record heterogeneous graph comprises medical entity nodes, patient nodes, real linking relations among medical entities and real linking relations between patients and medical entities;
taking each medical entity vector representation as the initial attribute of the corresponding medical entity node, taking each patient vector representation as the initial attribute of the corresponding patient node, connecting related medical entities to obtain the real link relationships between medical entities, and connecting related medical entities and patients to obtain the real link relationships between patients and medical entities;
(3) inputting the electronic medical record heterogeneous graph into a GraphSAGE graph neural network to obtain the output vector representation of the patient node and the output vector representation of the medical entity node; based on the patient node output vector representation and the medical entity node output vector representation, obtaining the probability of a link relationship between the patient and the medical entity by applying an activation function; based on the medical entity node output vector representations, obtaining the probability of a link relationship between medical entities by applying an activation function;
(4) constructing a total loss function, wherein the total loss function comprises a first loss function, a second loss function and a multitask weighted loss function;
constructing a first loss function as the cross entropy between the real link relationship between the patient and the medical entity and the corresponding link relationship probability;
constructing a second loss function as the cross entropy between the real link relationship between medical entities and the corresponding link relationship probability;
constructing a multitask weighted loss function through the loss value of the first loss function and the loss value of the second loss function;
(5) training a GraphSAGE graph neural network by using a total loss function, and updating parameters to obtain a final GraphSAGE graph neural network;
(6) when the method is applied, the medical entity vector representation and the patient vector representation are input into the final GraphSAGE graph neural network to predict the association probability of the medical entity and the patient.
Obtaining a co-occurrence matrix of medical entities in an electronic medical record comprises:
the frequency product of every two medical entities in every visit record of every patient is used as the co-occurrence information of every two medical entities, a co-occurrence matrix of every visit record is constructed based on the co-occurrence information of every two medical entities, the co-occurrence matrices of the multiple times of visit records of every patient are added to obtain an electronic medical record co-occurrence matrix of every patient, and the co-occurrence matrices of the electronic medical records of the multiple patients are added to obtain the co-occurrence matrix of the medical entities in the electronic medical records.
Traversing the ICD codes of the medical ontology knowledge to obtain a plurality of ancestor medical entities corresponding to the medical entities, wherein the method comprises the following steps:
and taking each medical entity as a leaf node, obtaining a plurality of ancestor nodes corresponding to the leaf node by traversing the ICD codes of the medical ontology from bottom to top, extracting the medical entities corresponding to the ancestor nodes to obtain ancestor medical entities, obtaining the co-occurrence information of each medical entity and the ancestor medical entities in the ICD codes of the medical ontology, and adding the co-occurrence information into the medical entity co-occurrence matrix to expand the medical entity co-occurrence matrix.
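As an illustration of the bottom-up traversal described above, the following minimal Python sketch walks a child-to-parent map and adds leaf/ancestor pairs to a co-occurrence dictionary. The `PARENT` map, the function names, and the choice of adding a count of 1 per leaf/ancestor pair are assumptions made for illustration; the source does not state how the ICD hierarchy is stored or how the ancestor co-occurrence value is computed.

```python
from collections import defaultdict

# Hypothetical child -> parent map for a tiny ICD-like hierarchy (illustrative only).
PARENT = {
    "I10": "I10-I15",
    "I10-I15": "IX",
    "E11.9": "E11",
    "E11": "E10-E14",
    "E10-E14": "IV",
}


def ancestors(code, parent=PARENT):
    """Walk bottom-up from a leaf code and collect every ancestor in order."""
    found = []
    while code in parent:
        code = parent[code]
        found.append(code)
    return found


def add_ancestor_cooccurrence(matrix, leaf_codes, parent=PARENT):
    """Return a copy of the co-occurrence counts with leaf/ancestor pairs added.

    Assumption: each leaf/ancestor pair contributes a symmetric count of 1.
    """
    enhanced = defaultdict(float, matrix)
    for leaf in leaf_codes:
        for anc in ancestors(leaf, parent):
            enhanced[(leaf, anc)] += 1.0
            enhanced[(anc, leaf)] += 1.0
    return dict(enhanced)
```

Storing the matrix as a dictionary keyed by entity pairs keeps the sketch independent of any particular matrix library; a sparse matrix would be a natural alternative for a real vocabulary.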
Extracting each medical entity vector representation by adopting a GloVe model based on the enhanced medical entity co-occurrence matrix, wherein the extraction comprises the following steps:
setting an initial vector representation of each medical entity, inputting the initial vector representation to a GloVe model, and training an objective function to obtain a vector representation of each medical entity, wherein the objective function J is as follows:
Figure BDA0003560152490000041
where $M_{ij}$ is the co-occurrence product of the $i$-th and $j$-th medical entities in the enhanced medical entity co-occurrence matrix, $|D|$ is the number of medical entities, $e_j$ is the vector representation of the $j$-th medical entity, $e_i$ is the vector representation of the $i$-th medical entity, $b_i$ is the bias parameter of the $i$-th medical entity, and $b_j$ is the bias parameter of the $j$-th medical entity.
An aggregated result of the plurality of medical entity vector representations associated with the patient is taken as the patient vector representation, the aggregated result comprising a summation, an average, a maximum, or a minimum.
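The four aggregation options named above can be sketched in a few lines of Python. The function name and the list-of-lists vector format are assumptions for illustration only.

```python
def aggregate_patient_vector(entity_vectors, how="mean"):
    """Combine a patient's medical entity vectors element-wise."""
    columns = list(zip(*entity_vectors))  # one tuple per vector dimension
    if how == "sum":
        return [sum(c) for c in columns]
    if how == "mean":
        return [sum(c) / len(c) for c in columns]
    if how == "max":
        return [max(c) for c in columns]
    if how == "min":
        return [min(c) for c in columns]
    raise ValueError(f"unknown aggregation: {how}")
```

For example, a patient associated with entity vectors `[1.0, 2.0]` and `[3.0, 0.0]` would receive `[2.0, 1.0]` under the mean option.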
Inputting the electronic medical record heterogeneous graph into a GraphSAGE graph neural network to obtain the output vector representation of the patient node and the output vector representation of the medical entity node comprises:
applying the Mean aggregator of GraphSAGE to the current-layer neighbor node vector representation and the previous-layer output vector representation of the patient node to obtain the patient node output vector representation, and applying the Mean aggregator of GraphSAGE to the current-layer neighbor node vector representation and the previous-layer output vector representation of the medical entity node to obtain the medical entity node output vector representation;
wherein the current-layer neighbor node vector representation
Figure BDA0003560152490000042
is:
Figure BDA0003560152490000043
where $r$ is a real link relationship between medical entities or between a patient and a medical entity, $R$ is the set of real link relationships, $u$ is a neighbor node, $v$ is the current node, $N^{(r)}(v)$ is the set of neighbor nodes of the current node $v$ under the real link relationship $r$,
Figure BDA00035601524900000410
is the vector representation of the neighbor node at the previous layer, $l$ is the current layer, and AGGREGATE(·) is the aggregation operation that combines the neighbor information of the current node $v$;
the medical entity node output vector representation
Figure BDA0003560152490000044
is:
Figure BDA0003560152490000045
where
Figure BDA0003560152490000046
is the vector representation of medical entity node $d$ at the previous layer, $W_d$ is the weight parameter of medical entity node $d$, MEAN(·) is the averaging function, and $\sigma(\cdot)$ is the activation function;
the patient node output vector representation
Figure BDA0003560152490000047
is:
Figure BDA0003560152490000048
where
Figure BDA0003560152490000049
is the vector representation of patient node $p$ at the previous layer, and $W_p$ is the weight parameter of patient node $p$.
The multitask weighted loss function $L$ is:
Figure BDA0003560152490000051
where $e^{-\eta_m}$ is the exponential term of the $m$-th loss function weight factor $\eta_m$, and
Figure BDA0003560152490000052
is the loss value of the $m$-th loss function.
An electronic medical record prediction device based on a graph neural network comprises a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, wherein the computer memory employs the final GraphSAGE graph neural network model according to any one of claims 1-7;
the computer processor, when executing the computer program, performs the steps of:
and inputting the medical entity vector representation and the patient vector representation into the final GraphSAGE graph neural network to predict the association probability of the medical entity and the patient.
Compared with the prior art, the invention has the beneficial effects that:
(1) The medical entity co-occurrence information in the ICD codes of the medical ontology is introduced into the medical entity co-occurrence matrix of the electronic medical record. This expands the matrix and enriches the relationships among medical entities and between medical entities and patients, so that the trained graph neural network captures the association between patients and medical entities more accurately.
(2) The invention establishes the link relationships between medical entities and between medical entities and patients through the heterogeneous graph, and training with the multitask weighted loss function allows the association between medical entities and patients to be determined accurately.
Drawings
Fig. 1 is a flowchart of an electronic medical record retrieval method based on a graph neural network according to an embodiment of the present invention.
Fig. 2 is a flowchart of optimizing the graph neural network model with the multitask weighted loss function according to an embodiment of the present invention.
Detailed Description
An implementation of the knowledge-fused electronic medical record link prediction method based on a graph neural network is described clearly and completely below with reference to the accompanying drawings.
An electronic medical record prediction method based on a graph neural network is disclosed, as shown in fig. 1, and specifically comprises the following steps:
S1: The frequency product of every two medical entities in each visit record of each patient is used as the co-occurrence information of those two medical entities. A co-occurrence matrix is constructed for each visit record from this co-occurrence information, the co-occurrence matrices of a patient's multiple visit records are added to obtain that patient's electronic medical record co-occurrence matrix, and the electronic medical record co-occurrence matrices of multiple patients are added to obtain the co-occurrence matrix of the medical entities in the electronic medical record.
The co-occurrence information $\text{co-occurrence}(c_i, c_j, p)$ of every two medical entities is:
$\text{co-occurrence}(c_i, c_j, p) = \text{count}(c_i, p) \times \text{count}(c_j, p)$
where $\text{count}(c_i, p)$ is the number of occurrences of the $i$-th medical entity for the $p$-th patient in each visit record, and $\text{count}(c_j, p)$ is the number of occurrences of the $j$-th medical entity for the $p$-th patient in each visit record.
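The construction in S1 can be sketched as follows. This is a minimal illustration, assuming each visit record is a list of medical entity names and that an entity is not paired with itself (the source does not say how the diagonal is handled); the function names and data layout are hypothetical.

```python
from collections import Counter, defaultdict


def visit_cooccurrence(visit_entities):
    """Per-visit co-occurrence: count(c_i) * count(c_j) for every pair i != j."""
    counts = Counter(visit_entities)
    matrix = defaultdict(int)
    for ci, ni in counts.items():
        for cj, nj in counts.items():
            if ci != cj:  # assumption: self-pairs are skipped
                matrix[(ci, cj)] += ni * nj
    return matrix


def corpus_cooccurrence(patients):
    """Sum the per-visit matrices over every visit of every patient."""
    total = defaultdict(int)
    for visits in patients.values():
        for visit in visits:
            for pair, value in visit_cooccurrence(visit).items():
                total[pair] += value
    return dict(total)
```

With one patient whose two visits are `["fever", "cough", "cough"]` and `["fever", "cough"]`, the pair (fever, cough) receives 1 × 2 from the first visit and 1 × 1 from the second, giving a total of 3.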
After the co-occurrence matrix of the medical entities in the electronic medical record is acquired, each medical entity is taken as a leaf node, and the ancestor nodes corresponding to that leaf node are obtained by traversing the ICD codes of the medical ontology from bottom to top. The medical entities corresponding to the ancestor nodes are extracted as ancestor medical entities. The co-occurrence information of each medical entity and its ancestor medical entities in the ICD codes of the medical ontology is then added to the medical entity co-occurrence matrix, which expands it into the enhanced medical entity co-occurrence matrix. An initial vector representation is set for each medical entity and input into a GloVe model, and the vector representation of each medical entity is obtained by training the objective function $J$, which is as follows:
Figure BDA0003560152490000061
where $M_{ij}$ is the co-occurrence product of the $i$-th and $j$-th medical entities in the enhanced medical entity co-occurrence matrix, $|D|$ is the number of medical entities, $e_j$ is the vector representation of the $j$-th medical entity, $e_i$ is the vector representation of the $i$-th medical entity, $b_i$ is the bias parameter of the $i$-th medical entity, and $b_j$ is the bias parameter of the $j$-th medical entity.
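Because the equation image is not reproduced in this text, the sketch below evaluates the standard GloVe objective, $J = \sum_{i,j} f(M_{ij})(e_i^{T} e_j + b_i + b_j - \log M_{ij})^2$, which uses the same symbols as the definitions above. Whether the patent's objective includes the weighting function $f$, and with which constants, is an assumption; `glove_weight`, `x_max`, and `alpha` follow the usual GloVe formulation and do not come from the source.

```python
import math


def glove_weight(x, x_max=100.0, alpha=0.75):
    """Standard GloVe weighting f(x); its use here is an assumption."""
    return (x / x_max) ** alpha if x < x_max else 1.0


def glove_loss(M, E, b):
    """Evaluate the GloVe objective.

    M: dict {(i, j): co-occurrence value}, E: list of entity vectors,
    b: list of per-entity bias parameters.
    """
    total = 0.0
    for (i, j), m_ij in M.items():
        if m_ij <= 0:
            continue  # log is undefined for zero counts; only nonzero entries count
        dot = sum(x * y for x, y in zip(E[i], E[j]))
        err = dot + b[i] + b[j] - math.log(m_ij)
        total += glove_weight(m_ij) * err * err
    return total
```

Training would minimize this value with respect to `E` and `b`, typically by stochastic gradient descent; the optimizer is omitted here.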
The aggregated result of the multiple medical entity vector representations associated with a patient is taken as the patient vector representation; the aggregation is a summation, an average, a maximum, or a minimum.
S2: Each medical entity vector representation is taken as the initial input of the corresponding medical entity node, and each patient vector representation is taken as the initial input of the corresponding patient node. Related medical entities are connected to obtain the real link relationships between medical entities, and related medical entities and patients are connected to obtain the real link relationships between patients and medical entities, thereby constructing the electronic medical record heterogeneous graph.
S3: The electronic medical record heterogeneous graph is input into a GraphSAGE graph neural network. The current-layer neighbor node vector representation is obtained from the real link relationships between medical entities and between patients and medical entities, using the GraphSAGE aggregator and summation, wherein the current-layer neighbor node vector representation
Figure BDA0003560152490000062
is:
Figure BDA0003560152490000071
where $r$ is a real link relationship between medical entities or between a patient and a medical entity, $R$ is the set of real link relationships, $u$ is a neighbor node, $v$ is the current node, $N^{(r)}(v)$ is the set of neighbor nodes of the current node $v$ under the real link relationship $r$,
Figure BDA0003560152490000072
is the vector representation of the neighbor node at the previous layer, $l$ is the current layer, and AGGREGATE(·) is the aggregation operation that combines the neighbor information of the current node $v$.
The Mean aggregator of GraphSAGE is applied to the current-layer neighbor node vector representation and the previous-layer output vector representation of the patient node to obtain the patient node output vector representation, and to the current-layer neighbor node vector representation and the previous-layer output vector representation of the medical entity node to obtain the medical entity node output vector representation.
The medical entity node output vector representation
Figure BDA0003560152490000073
is:
Figure BDA0003560152490000074
where
Figure BDA0003560152490000075
is the vector representation of medical entity node $d$ at the previous layer, $W_d$ is the weight parameter of medical entity node $d$, MEAN(·) is the averaging function, and $\sigma(\cdot)$ is the activation function.
The patient node output vector representation
Figure BDA0003560152490000076
is:
Figure BDA0003560152490000077
where
Figure BDA0003560152490000078
is the vector representation of patient node $p$ at the previous layer, and $W_p$ is the weight parameter of patient node $p$.
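The update described in S3 can be sketched in plain Python under stated assumptions. Since the equation images are not reproduced, the exact form is assumed: neighbors are mean-aggregated per link relationship and summed over relationships, the result is averaged with the node's own previous-layer vector, multiplied by a weight matrix, and passed through a sigmoid. The function names, the dictionary layout, and the choice of sigmoid for $\sigma$ are illustrative, not taken from the source.

```python
import math


def mean_vectors(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def neighbor_representation(node, neighbors_by_relation, h_prev):
    """Mean-aggregate neighbors per relation r, then sum over all relations in R."""
    total = [0.0] * len(h_prev[node])
    for neighbors in neighbors_by_relation.values():
        nbrs = neighbors.get(node, [])
        if nbrs:
            agg = mean_vectors([h_prev[u] for u in nbrs])
            total = [t + a for t, a in zip(total, agg)]
    return total


def sage_update(node, neighbors_by_relation, h_prev, W):
    """One layer: sigma(W . MEAN(h_prev[node], h_N(node))), with sigma = sigmoid."""
    h_n = neighbor_representation(node, neighbors_by_relation, h_prev)
    mixed = mean_vectors([h_prev[node], h_n])
    pre = [sum(w * x for w, x in zip(row, mixed)) for row in W]
    return [1.0 / (1.0 + math.exp(-z)) for z in pre]
```

In this sketch a patient node and a medical entity node go through the same function and differ only in the weight matrix passed as `W`, mirroring the separate parameters $W_p$ and $W_d$.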
Based on the patient node output vector representation and the medical entity node output vector representation, the probability of a link relationship between a medical entity and a patient is obtained using an activation function. This probability
Figure BDA0003560152490000079
is:
Figure BDA00035601524900000710
where $z_d$ is the output vector representation of the medical entity node, $z_p$ is the output vector representation of the patient node, and $\delta(\cdot)$ is an activation function.
Based on the medical entity node output vector representations, the probability of a link relationship between medical entities is obtained using an activation function. This probability
Figure BDA00035601524900000711
is:
Figure BDA00035601524900000712
where $z_{d'}$ is the output vector representation of another medical entity node.
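A common way to turn two node output vectors into a link probability is to apply a sigmoid to their dot product. The source states only that an activation function $\delta$ is applied to the output vectors, so both the dot product and the sigmoid in this sketch are assumptions, as is the function name.

```python
import math


def link_probability(z_a, z_b):
    """Sigmoid of the dot product of two node output vectors (assumed form)."""
    score = sum(x * y for x, y in zip(z_a, z_b))
    return 1.0 / (1.0 + math.exp(-score))
```

The same function would serve both tasks: `link_probability(z_p, z_d)` for a patient and a medical entity, and `link_probability(z_d, z_d_prime)` for two medical entities.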
S4: Constructing the total loss function. As shown in fig. 2, the graph neural network computes the node information of the heterogeneous graph G. A first loss function is constructed as the cross entropy between the real link relationship between patient and medical entity (1 if the link relationship exists, 0 if it does not) and the predicted patient-medical entity link relationship probability, in order to train the patient-medical entity link prediction task $L_1$.
A second loss function is constructed as the cross entropy between the real link relationship between medical entities and the predicted medical entity-medical entity link relationship probability, in order to train the medical entity-medical entity link prediction task $L_2$.
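Both loss functions are cross entropies between 0/1 link labels and predicted link probabilities. A minimal sketch of the binary cross entropy follows; averaging over the examples and clipping with `eps` are implementation choices assumed here, not details given in the source.

```python
import math


def link_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

The first loss would be computed over patient/medical entity pairs and the second over medical entity pairs, each with its own labels and probabilities.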
A multitask weighted loss function is constructed from the loss value of the first loss function and the loss value of the second loss function in order to learn the weight factors $\eta$.
Training with the multitask weighted loss function combines the two loss functions so that they are optimized simultaneously. The multitask weighted loss function $L$ is:
Figure BDA0003560152490000081
where $e^{-\eta_m}$ is the exponential term of the $m$-th loss function weight factor $\eta_m$, and
Figure BDA0003560152490000082
is the loss value of the $m$-th loss function. If the loss has converged, training ends and the weight factors $\eta$ are obtained; if it has not converged, the graph neural network continues to compute the node information of the heterogeneous graph G.
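One form consistent with the symbols defined above is the uncertainty-style weighting $L = \sum_m \left(e^{-\eta_m} L_m + \eta_m\right)$. The $e^{-\eta_m} L_m$ term follows the text; the additive $\eta_m$ regularizer, which keeps the learned weights from collapsing to zero, is an assumption, because the equation image is not reproduced. The function name is hypothetical.

```python
import math


def multitask_loss(losses, etas):
    """Weighted sum of task losses: sum_m exp(-eta_m) * L_m + eta_m.

    The '+ eta_m' term is an assumed regularizer, not confirmed by the source.
    """
    return sum(math.exp(-eta) * loss + eta for loss, eta in zip(losses, etas))
```

With both weight factors at 0 the result is simply the sum of the two task losses; as a weight factor grows, its task contributes less to the total while the regularizer grows.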
Training a GraphSAGE graph neural network by using a total loss function, and updating parameters to obtain a final GraphSAGE graph neural network;
S5: When the method is applied, the medical entity vector representation and the patient vector representation are input into the final GraphSAGE graph neural network to predict the association probability of the medical entity and the patient.
The method expands the range of relationships between medical entities and between medical entities and patients, so that the degree of association between medical entities and patients can be predicted accurately.

Claims (8)

1. An electronic medical record retrieval method based on a graph neural network is characterized by comprising the following steps:
(1) acquiring a co-occurrence matrix of medical entities in the electronic medical record, traversing the ICD codes of the medical ontology knowledge to obtain a plurality of ancestor medical entities corresponding to the medical entities, adding co-occurrence information of the medical entities and the ancestor medical entities into the medical entity co-occurrence matrix to obtain an enhanced medical entity co-occurrence matrix, extracting the vector representation of each medical entity by adopting a GloVe model based on the enhanced medical entity co-occurrence matrix, and taking an aggregation result of a plurality of medical entity vector representations associated with a patient as the patient vector representation;
(2) constructing an electronic medical record heterogeneous graph, wherein the electronic medical record heterogeneous graph comprises medical entity nodes, patient nodes, real linking relations among medical entities and real linking relations between patients and medical entities;
each medical entity vector representation is taken as the initial attribute of the corresponding medical entity node, each patient vector representation is taken as the initial attribute of the corresponding patient node, related medical entities are connected to obtain the real link relationships between medical entities, and related medical entities and patients are connected to obtain the real link relationships between patients and medical entities;
(3) inputting the electronic medical record heterogeneous graph into a GraphSAGE graph neural network to respectively obtain the output vector representation of the patient node and the output vector representation of the medical entity node; obtaining the probability of the link relationship between the patient and the medical entity by adopting an activation function based on the patient node output vector representation and the medical entity node output vector representation; obtaining the probability of the link relationship between the medical entities by adopting an activation function based on the medical entity node output vector representation;
(4) constructing a total loss function, wherein the total loss function comprises a first loss function, a second loss function and a multitask weighted loss function;
constructing a first loss function as the cross entropy between the real link relationship between the patient and the medical entity and the corresponding link relationship probability;
constructing a second loss function as the cross entropy between the real link relationship between medical entities and the corresponding link relationship probability;
constructing a multitask weighted loss function through the loss value of the first loss function and the loss value of the second loss function;
(5) training a GraphSAGE graph neural network by using a total loss function, and updating parameters to obtain a final GraphSAGE graph neural network;
(6) when the method is applied, the medical entity vector representation and the patient vector representation are input into the final GraphSAGE graph neural network to predict the association probability of the medical entity and the patient.
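Steps (3) and (4) can be illustrated with a minimal sketch. The claim only requires an activation function and a cross entropy; the dot-product scoring, the sigmoid activation, and the names `link_probability` and `binary_cross_entropy` are assumptions made here for illustration, and the node vectors are hypothetical.

```python
import numpy as np

def link_probability(h_a, h_b):
    """Sigmoid of the dot product of two node output vectors (assumed scoring)."""
    return 1.0 / (1.0 + np.exp(-np.dot(h_a, h_b)))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean cross entropy between real link labels (0/1) and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)))

# Hypothetical output vectors for one patient node and two medical entity nodes.
h_patient = np.array([0.5, -0.2, 0.1])
h_entities = np.array([[0.4, -0.1, 0.3], [-0.6, 0.2, 0.0]])

# The first entity is really linked to the patient (label 1), the second is not (label 0).
probs = [link_probability(h_patient, h_e) for h_e in h_entities]
first_loss = binary_cross_entropy([1, 0], probs)
```

The second loss function would be computed the same way, with both vectors taken from medical entity nodes.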
2. The method of claim 1, wherein obtaining a co-occurrence matrix of medical entities in the electronic medical record comprises:
taking the product of the frequencies of every two medical entities in each visit record of each patient as the co-occurrence information of the two medical entities; constructing a co-occurrence matrix for each visit record from this co-occurrence information; adding the co-occurrence matrices of the multiple visit records of each patient to obtain the electronic medical record co-occurrence matrix of that patient; and adding the electronic medical record co-occurrence matrices of the multiple patients to obtain the co-occurrence matrix of medical entities in the electronic medical record.
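The construction in claim 2 can be sketched as follows. The data layout (a list of visits per patient, each visit a list of entity mentions), the function name `cooccurrence_matrix`, the example codes, and the choice to leave the diagonal at zero are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations
import numpy as np

def cooccurrence_matrix(patients, vocab):
    """Sum per-visit frequency products over all visits of all patients."""
    index = {entity: k for k, entity in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(vocab)))
    for visits in patients:            # one list of visits per patient
        for visit in visits:           # one list of entity mentions per visit
            freq = Counter(visit)
            for a, b in combinations(sorted(freq), 2):
                product = freq[a] * freq[b]
                matrix[index[a], index[b]] += product
                matrix[index[b], index[a]] += product   # keep the matrix symmetric
    return matrix

# Hypothetical data: two patients, entity mentions per visit.
vocab = ["I10", "E11.9", "metformin"]
patients = [
    [["I10", "E11.9", "E11.9"], ["E11.9", "metformin"]],
    [["I10", "metformin"]],
]
M = cooccurrence_matrix(patients, vocab)
```

In the first visit, "E11.9" appears twice and "I10" once, so that visit contributes 2 x 1 = 2 to the entry for the pair.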
3. The method for retrieving the electronic medical record based on the graph neural network as claimed in claim 1, wherein traversing the ICD code to obtain a plurality of ancestor medical entities corresponding to the medical entity comprises:
taking each medical entity as a leaf node, traversing the ICD codes of the medical ontology from bottom to top to obtain a plurality of ancestor nodes corresponding to the leaf node, extracting the medical entities corresponding to the ancestor nodes to obtain the ancestor medical entities, obtaining the co-occurrence information of each medical entity and its ancestor medical entities in the ICD codes of the medical ontology, and adding the co-occurrence information to the medical entity co-occurrence matrix to expand the medical entity co-occurrence matrix.
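A minimal sketch of the bottom-up traversal in claim 3 follows. The child-to-parent dictionary, the sparse dictionary form of the co-occurrence matrix, and the fixed `weight` added per leaf and ancestor pair are assumptions for illustration; the claim does not state how the ancestor co-occurrence value is computed.

```python
def ancestors(code, parent):
    """Walk a child -> parent map bottom-up and collect every ancestor of `code`."""
    found = []
    while code in parent:
        code = parent[code]
        found.append(code)
    return found

def enhance(cooc, leaves, parent, weight=1.0):
    """Add a (leaf, ancestor) co-occurrence count for every ancestor of every leaf."""
    enhanced = dict(cooc)              # copy so the original matrix is left untouched
    for leaf in leaves:
        for anc in ancestors(leaf, parent):
            for pair in ((leaf, anc), (anc, leaf)):
                enhanced[pair] = enhanced.get(pair, 0.0) + weight
    return enhanced

# Hypothetical fragment of an ICD-10 hierarchy (child -> parent).
parent = {"E11.9": "E11", "E11": "E10-E14", "E10-E14": "E00-E90"}
cooc = {("E11.9", "I10"): 2.0, ("I10", "E11.9"): 2.0}
enhanced = enhance(cooc, ["E11.9"], parent)
```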
4. The electronic medical record retrieval method based on the graph neural network as claimed in claim 1, wherein the extracting of each medical entity vector representation using a GloVe model based on the enhanced medical entity co-occurrence matrix comprises:
setting an initial vector representation of each medical entity, inputting the initial vector representation to a GloVe model, and training an objective function to obtain a vector representation of each medical entity, wherein the objective function J is as follows:
Figure FDA0003560152480000021
wherein M_ij is the co-occurrence value of the i-th and j-th medical entities in the enhanced medical entity co-occurrence matrix, |D| is the number of medical entities, e_j is the vector representation of the j-th medical entity, e_i is the vector representation of the i-th medical entity, b_i is the bias parameter of the i-th medical entity, and b_j is the bias parameter of the j-th medical entity.
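The objective function itself is given only as a figure. For orientation, the sketch below evaluates the standard GloVe weighted least-squares loss using the symbols defined in this claim; it is an assumption that the figure has this form. The weighting function with `x_max` and `alpha` comes from the original GloVe formulation and is not mentioned in the claim, and the small matrix is hypothetical.

```python
import numpy as np

def glove_objective(M, E, b, x_max=100.0, alpha=0.75):
    """Weighted least-squares GloVe loss over the nonzero entries of M."""
    i, j = np.nonzero(M)
    m = M[i, j]
    weight = np.minimum(1.0, (m / x_max) ** alpha)        # GloVe weighting f(M_ij)
    residual = np.sum(E[i] * E[j], axis=1) + b[i] + b[j] - np.log(m)
    return float(np.sum(weight * residual ** 2))

rng = np.random.default_rng(0)
M = np.array([[0.0, 2.0, 1.0], [2.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
E = rng.normal(scale=0.1, size=(3, 4))   # initial vector representations e_i
b = np.zeros(3)                          # bias parameters b_i
loss = glove_objective(M, E, b)
```

Training would minimize this value with respect to `E` and `b`, for example by gradient descent; the rows of `E` are then the medical entity vector representations.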
5. The method of claim 1, wherein the aggregation result of the plurality of medical entity vector representations associated with the patient is taken as the patient vector representation, and the aggregation operation comprises summing, averaging, taking the maximum, or taking the minimum.
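The four aggregation options can be sketched element-wise as below. The function name `patient_vector`, the `how` parameter, and the two-dimensional example vectors are hypothetical.

```python
import numpy as np

AGGREGATORS = {"sum": np.sum, "mean": np.mean, "max": np.max, "min": np.min}

def patient_vector(entity_vectors, how="mean"):
    """Aggregate the vectors of a patient's medical entities element-wise."""
    return AGGREGATORS[how](np.asarray(entity_vectors, dtype=float), axis=0)

# Hypothetical vectors of two medical entities associated with one patient.
vectors = [[1.0, 4.0], [3.0, 0.0]]
summed = patient_vector(vectors, "sum")
averaged = patient_vector(vectors, "mean")
```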
6. The method for retrieving the electronic medical record based on the graph neural network as claimed in claim 1, wherein the step of inputting the electronic medical record heterogeneous graph into the GraphSAGE graph neural network to respectively obtain the output vector representation of the patient node and the output vector representation of the medical entity node comprises:
applying the Mean aggregator of GraphSAGE to the current-layer neighbor node vector representation and the previous-layer output vector representation of the patient node to obtain the patient node output vector representation, and applying the Mean aggregator of GraphSAGE to the current-layer neighbor node vector representation and the previous-layer output vector representation of the medical entity node to obtain the medical entity node output vector representation;
wherein the current-layer neighbor node vector representation
Figure FDA0003560152480000031
is:
Figure FDA0003560152480000032
wherein r is a real link relation between medical entities or between a patient and a medical entity, R is the set of real link relations, u is a neighbor node, v is the current node, N^(r)(v) is the set of neighbor nodes of the current node v under the real link relation r,
Figure FDA0003560152480000033
is the vector representation of neighbor node u at the previous layer, l is the current layer, and AGGREGATE(·) is the aggregation operation that combines the neighbor information of the current node v;
the medical entity node output vector representation
Figure FDA0003560152480000034
is:
Figure FDA0003560152480000035
wherein
Figure FDA0003560152480000036
is the vector representation of medical entity node d at the previous layer, W_d is the weight parameter of medical entity node d, MEAN(·) is the averaging function, and σ(·) is the activation function;
the patient node output vector representation
Figure FDA0003560152480000037
is:
Figure FDA0003560152480000038
wherein
Figure FDA0003560152480000039
is the vector representation of patient node p at the previous layer, and W_p is the weight parameter of patient node p.
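The update formulas in this claim are given only as figures. The sketch below shows one common form of the GraphSAGE mean aggregation, in which the previous-layer vector of the node is averaged together with the previous-layer vectors of its neighbors across all relation types and then passed through a weight matrix and an activation. Treating the figures as having this form is an assumption, as are the ReLU activation, the identity weights, and the small example graph.

```python
import numpy as np

def sage_mean_layer(h_prev, neighbors, W, v):
    """One mean-aggregator update for node v over all relation types."""
    # Gather previous-layer vectors of v's neighbors across every relation r in R.
    neigh = [h_prev[u] for rel in neighbors.get(v, {}).values() for u in rel]
    stacked = np.vstack([h_prev[v]] + neigh)            # previous vector of v plus neighbors
    return np.maximum(0.0, W @ stacked.mean(axis=0))    # ReLU as the assumed activation

# Hypothetical graph: patient p1 linked to entities d1 and d2; d1 linked to d2.
h_prev = {
    "p1": np.array([1.0, 0.0]),
    "d1": np.array([0.0, 1.0]),
    "d2": np.array([1.0, 1.0]),
}
neighbors = {
    "p1": {"patient-entity": ["d1", "d2"]},
    "d1": {"patient-entity": ["p1"], "entity-entity": ["d2"]},
}
W_p = np.eye(2)   # weight parameter for patient nodes
W_d = np.eye(2)   # weight parameter for medical entity nodes
h_p1 = sage_mean_layer(h_prev, neighbors, W_p, "p1")
h_d1 = sage_mean_layer(h_prev, neighbors, W_d, "d1")
```

Using separate weights `W_p` and `W_d` mirrors the claim, which defines one weight parameter for patient nodes and another for medical entity nodes.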
7. The method for retrieving the electronic medical record based on the graph neural network as claimed in claim 1, wherein the multitask weighted loss function L is:
Figure FDA00035601524800000310
wherein
Figure FDA00035601524800000312
is the exponential of the weight factor η_m of the m-th loss function, and
Figure FDA00035601524800000311
is the loss value of the m-th loss function.
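The formula for L is given only as a figure. One widely used weighting that matches the description (an exponential of a learnable factor η_m multiplying each loss value) is the uncertainty-style form sketched below. Both the negative sign in the exponent and the added η_m regularization term are assumptions and may differ from the figure; the loss values are hypothetical.

```python
import math

def multitask_loss(losses, etas):
    """Sum of exp(-eta_m) * L_m + eta_m over tasks (assumed form)."""
    return sum(math.exp(-eta) * loss + eta for loss, eta in zip(losses, etas))

# Hypothetical loss values of the first and second loss functions, with both
# weight factors initialized to zero (each loss then has weight exp(0) = 1).
total = multitask_loss([0.7, 0.4], [0.0, 0.0])
```

In this form the η_m term keeps the learned factors from growing without bound, which would otherwise drive every weighted loss toward zero.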
8. An electronic medical record retrieval device based on a graph neural network, comprising a computer memory, a computer processor and a computer program stored in the computer memory and executable on the computer processor, wherein the computer memory holds the final GraphSAGE graph neural network obtained by the method of any one of claims 1 to 7;
the computer processor, when executing the computer program, performs the following step:
inputting the medical entity vector representation and the patient vector representation into the final GraphSAGE graph neural network to predict the association probability of the medical entity and the patient.
CN202210291079.2A 2022-03-23 2022-03-23 Electronic medical record retrieval method and device based on graph neural network Pending CN114639483A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210291079.2A CN114639483A (en) 2022-03-23 2022-03-23 Electronic medical record retrieval method and device based on graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210291079.2A CN114639483A (en) 2022-03-23 2022-03-23 Electronic medical record retrieval method and device based on graph neural network

Publications (1)

Publication Number Publication Date
CN114639483A true CN114639483A (en) 2022-06-17

Family

ID=81949527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210291079.2A Pending CN114639483A (en) 2022-03-23 2022-03-23 Electronic medical record retrieval method and device based on graph neural network

Country Status (1)

Country Link
CN (1) CN114639483A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114943314A (en) * 2022-07-26 2022-08-26 Oxford University (Suzhou) Science and Technology Co., Ltd. ICD (International Classification of Diseases) diagnosis code-based object partitioning method, storage medium and electronic medical record system
CN115083616A (en) * 2022-08-16 2022-09-20 Zhejiang Lab Chronic nephropathy subtype mining system based on self-supervision graph clustering
CN115083616B (en) * 2022-08-16 2022-11-08 Zhejiang Lab Chronic nephropathy subtype mining system based on self-supervision graph clustering
JP7404581B1 (en) 2022-08-16 2023-12-25 Zhejiang Lab Chronic nephropathy subtype mining system based on self-supervised graph clustering
CN116564535A (en) * 2023-05-11 2023-08-08 Zhejiang Lab Central disease prediction method and device based on local graph information exchange under privacy protection
CN116564535B (en) * 2023-05-11 2024-02-20 Zhejiang Lab Central disease prediction method and device based on local graph information exchange under privacy protection
CN116821375A (en) * 2023-08-29 2023-09-29 Zhejiang Lab Cross-institution medical knowledge graph representation learning method and system
CN116821375B (en) * 2023-08-29 2023-12-22 Zhejiang Lab Cross-institution medical knowledge graph representation learning method and system
CN116936108A (en) * 2023-09-19 2023-10-24 Zhejiang Lab Unbalanced data-oriented disease prediction system
CN116936108B (en) * 2023-09-19 2024-01-02 Zhejiang Lab Unbalanced data-oriented disease prediction system

Similar Documents

Publication Publication Date Title
CN114639483A (en) Electronic medical record retrieval method and device based on graph neural network
WO2023000574A1 (en) Model training method, apparatus and device, and readable storage medium
CN110347932B (en) Cross-network user alignment method based on deep learning
CN111079931A (en) State space probabilistic multi-time-series prediction method based on graph neural network
Poornima et al. A survey of predictive analytics using big data with data mining
CN112613602A (en) Recommendation method and system based on knowledge-aware hypergraph neural network
CN112364976A (en) User preference prediction method based on session recommendation system
CN112308115A (en) Multi-label image deep learning classification method and equipment
CN113254716B (en) Video clip retrieval method and device, electronic equipment and readable storage medium
Biswas et al. Hybrid expert system using case based reasoning and neural network for classification
CN115270007B (en) POI recommendation method and system based on mixed graph neural network
Shan et al. The data-driven fuzzy cognitive map model and its application to prediction of time series
WO2024067373A1 (en) Data processing method and related apparatus
CN112925857A (en) Digital information driven system and method for predicting associations based on predicate type
CN114463596A (en) Small sample image identification method, device and equipment of hypergraph neural network
CN110993121A (en) Drug association prediction method based on double-cooperation linear manifold
CN115631008B (en) Commodity recommendation method, device, equipment and medium
CN115544307A (en) Directed graph data feature extraction and expression method and system based on incidence matrix
CN115344794A (en) Scenic spot recommendation method based on knowledge map semantic embedding
Liang et al. The graph embedded topic model
CN115019342A (en) Endangered animal target detection method based on class relation reasoning
CN114625969A (en) Recommendation method based on interactive neighbor session
CN114625886A (en) Entity query method and system based on knowledge graph small sample relation learning model
CN115238075B (en) Text sentiment classification method based on hypergraph pooling
Ghorbani et al. Feature engineering with gray wolf algorithm and fuzzy methods for Friend recommender system in social networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination