CN115083616A - Chronic nephropathy subtype mining system based on self-supervision graph clustering - Google Patents
- Publication number
- CN115083616A (application CN202210980822.5A)
- Authority
- CN
- China
- Prior art keywords
- node
- kidney disease
- clustering
- chronic kidney
- visit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Public Health (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- Molecular Biology (AREA)
- Medical Informatics (AREA)
- General Engineering & Computer Science (AREA)
- Epidemiology (AREA)
- Pathology (AREA)
- Primary Health Care (AREA)
- Databases & Information Systems (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
The invention discloses a chronic kidney disease subtype mining system based on self-supervision graph clustering, which comprises: a data acquisition module, used for collecting the structured data in chronic kidney disease diagnosis and treatment records; a data extraction and preprocessing module, used for extracting and preprocessing the structured data to obtain an entity set and a visit set; a chronic kidney disease subtype mining module, used for constructing a chronic kidney disease subtype mining model from the entity set and the visit set; a chronic kidney disease phenotype subtype evaluation module, used for evaluating the chronic kidney disease subtype mining model; and a chronic kidney disease subtype prediction module, used for performing subtype prediction on the structured data of a patient. The invention solves the problem that process mining methods cannot handle the coexistence of multi-granularity information in longitudinal electronic medical record data, such as event information within a single visit and event information across multiple visits.
Description
Technical Field
The invention relates to the technical field of medical health information, in particular to a chronic kidney disease subtype mining system based on self-supervision graph clustering.
Background
According to clinical guidelines, chronic kidney disease is graded based on the patient's estimated glomerular filtration rate (eGFR) and urinary albumin-creatinine ratio (UACR). While eGFR and UACR can be used for screening and monitoring chronic kidney disease, they cannot by themselves characterize the phenotypic differences between individuals with the disease. Chronic kidney disease is highly heterogeneous and closely related to systemic diseases and conditions such as diabetes, hypertension, autoimmune diseases, genetic predisposition, or congenital abnormalities. Individuals with chronic kidney disease differ significantly, and these differences can be described by disease phenotypes such as laboratory tests, medical history, medication history, and social factors. Differences in the initial phenotype of chronic kidney disease patients also lead to differences in their diagnosis and treatment processes and complications. A rational phenotypic classification of chronic kidney disease should distinguish different patient subpopulations and reveal the disease characteristics and underlying pathology of each, thereby helping to better understand the different mechanisms of disease onset and progression.
Existing methods for classifying chronic kidney disease subtypes are mainly based on cluster analysis of patients' initial static phenotype data. These methods use multidimensional data collected at the start of a study, such as patient demographics, biomarkers, and clinical characteristics, and mine phenotype classifications of chronic kidney disease patients with common clustering algorithms such as hierarchical clustering and consensus clustering. However, chronic kidney disease has a long course and many complications, so the diagnosis and treatment process differs greatly among patients. Clinical process data may therefore carry important information for distinguishing different phenotypes of chronic kidney disease patients. From the diagnosis and treatment process data collected and stored in an electronic medical record system, event information such as surgeries, examinations, laboratory tests, and medications for a specific patient can be extracted, together with the time at which each event occurred. Clustering patients on their diagnosis and treatment process data and studying their disease phenotype patterns is of great significance for identifying and characterizing different patient subgroups. Commonly used methods for mining disease diagnosis and treatment process data include:
(1) Process mining methods: information is extracted from the event log generated during a patient's diagnosis and treatment and arranged in chronological order to form a sequence of diagnosis and treatment events. Different patterns in these event sequences are then mined as different clinical paths of the disease, thereby classifying the patient's disease phenotype. These methods have difficulty using the co-occurrence information among events and cannot handle the event association relations and sequence relations in the multi-visit data of longitudinal electronic medical records. The mined diagnosis and treatment processes are complex, with poor representativeness and coverage.
(2) Tensor decomposition-based methods: the three dimensions of patient, time, and phenotype are combined into a third-order tensor, which is then decomposed to mine the latent phenotype classification of patients. These methods only consider disease phenotype transitions between consecutive visits and cannot handle phenotype evolution information over a long-range diagnosis and treatment process.
Therefore, we propose a chronic kidney disease subtype mining system based on the self-supervision graph clustering to solve the above technical problem.
Disclosure of Invention
In order to solve the technical problems, the invention provides a chronic kidney disease subtype mining system based on self-supervision graph clustering.
The technical scheme adopted by the invention is as follows:
a chronic kidney disease subtype mining system based on self-supervision graph clustering comprises:
a data acquisition module: used for collecting the structured data in chronic kidney disease diagnosis and treatment records;
a data extraction and preprocessing module: used for extracting and preprocessing the structured data to obtain an entity set and a visit set;
a chronic kidney disease subtype mining module: used for constructing a chronic kidney disease subtype mining model from the entity set and the visit set;
a chronic kidney disease phenotype subtype evaluation module: used for evaluating the chronic kidney disease subtype mining model;
a chronic kidney disease subtype prediction module: used for performing subtype prediction on the structured data of a patient.
Further, the structured data includes basic patient information, visit records, and the diagnoses, laboratory tests, medical examinations, surgeries, and/or medication data recorded during the observation window.
Further, the data extraction and preprocessing module is specifically configured to extract the structured data from the chronic kidney disease diagnosis and treatment records in the electronic medical record system and to preprocess the extracted data. The structured data includes basic patient information, visit records, and the diagnoses, laboratory tests, medical examinations, surgery data, and medication data recorded during the observation window. For laboratory test data, only abnormal test items, judged against the normal reference range, are considered; the result of each abnormal item is classified as either low or high, and the name of the abnormal item and its abnormality category are retained. Medical examination and surgery data are processed with simple natural language processing techniques, retaining the examined body part, the examination type, and the operation name. For medication data, only six classes of drugs are considered, namely antihyperglycemic drugs, antihypertensive drugs, lipid-regulating drugs, non-steroidal anti-inflammatory drugs, antiplatelet drugs, and steroids; the drugs in the medication data are classified accordingly and the drug class is retained. This yields a diagnosis set, a medication set, an operation set, and a test set, together with the number of diagnosis types, the number of medication types, the number of operation types, the number of test types, and the number of visit records. The diagnosis set, medication set, operation set, and test set are combined to form the entity set, and the patients' visit records are combined to form the visit set.
Further, the chronic kidney disease subtype mining module specifically comprises:
a visit network construction unit: used for constructing a visit network from the visit set and the entity set;
an embedded representation construction unit: used for constructing an entity co-occurrence matrix from the entity set and obtaining, through the entity co-occurrence matrix, an entity node initial embedded representation and a visit node initial embedded representation, which together form the node initial embedded representation;
a clustering network construction unit: used for constructing an adjacency matrix from the relationships among the nodes in the visit network, and training the visit node clustering network model based on self-supervision graph clustering with the adjacency matrix and the node initial embedded representation;
a chronic kidney disease subtype mining model construction unit: used for constructing the chronic kidney disease subtype mining model through the visit node clustering network model based on self-supervision graph clustering.
Further, the visiting network constructing unit specifically includes:
used for forming a node set from the visit set and the entity set;
used for constructing an edge set through the node co-occurrence relationships in the node set;
used for constructing the visit network from the node set and the edge set.
Further, the embedded representation building unit specifically includes:
used for constructing the entity co-occurrence matrix from the entity set;
used for computing the initial embedded representation of each entity node with the GloVe algorithm, based on the entity co-occurrence matrix;
used for obtaining the visit node initial embedded representation by averaging the entity node initial embedded representations of all adjacent entity nodes, the visit node initial embedded representation and the entity node initial embedded representation together forming the node initial embedded representation.
Further, the clustering network constructing unit specifically includes:
used for constructing an adjacency matrix from the relationships among the nodes in the visit network, and inputting the adjacency matrix and the node initial embedded representation into the visit node clustering network model based on self-supervision graph clustering for graph attention training to obtain a node embedded representation, wherein the node embedded representation comprises a visit node embedded representation and an entity node embedded representation;
used for reconstructing the visit network from the node embedded representation and calculating a visit network reconstruction error;
used for inputting the entity node embedded representation into a neural network decoder for training, taking the output of the last layer of the decoder as the entity node reconstructed embedded representation, and calculating an entity node reconstruction error;
used for performing a softmax regression operation on the visit node embedded representation to obtain the probability distribution of the visit nodes, and calculating a clustering loss from that probability distribution;
used for constructing the overall loss function of the visit node clustering network model based on self-supervision graph clustering from the visit network reconstruction error, the entity node reconstruction error, and the clustering loss.
Further, the chronic kidney disease subtype mining model construction unit specifically includes:
used for taking the visit node cluster distribution obtained by the visit node clustering network model based on self-supervision graph clustering as the category distribution of the visit nodes, selecting the category with the highest probability in the category distribution as the category label of each visit node, and arranging all the visit nodes of each patient in chronological order;
used for constructing an event matrix by arranging the visit nodes;
used for searching for frequent events to determine nodes: the frequent events are used as nodes in an event flow, and the remaining events go directly to an end node; each frequent event is used as the starting node of the next round of search, the corresponding event vectors are extracted and combined into a new event matrix, and the same frequent event search is performed after the first column is removed; the nodes obtained in each round of search are connected to the starting node so as to extend the event flow, until the set of frequent events is empty or the event flow reaches the maximum event flow length; the chronic kidney disease subtype mining model is obtained when the loop ends.
Further, the chronic kidney disease subtype prediction module specifically comprises:
used for inputting the preprocessed structured data of a patient into the visit node clustering network model based on self-supervision graph clustering for prediction, to obtain the probability distribution of the patient's visit nodes;
used for determining the cluster category of each visit node from the probability distribution of the visit nodes, and constructing a visit event sequence;
used for inputting the visit event sequence into the chronic kidney disease subtype mining model, fitting the nodes in the model in sequence order to obtain an event flow, and determining from the event flow which chronic kidney disease subtype the patient belongs to.
The invention has the following beneficial effects: the invention provides a chronic kidney disease subtype mining system based on self-supervision graph clustering. First, the longitudinal electronic medical record data of a patient's multiple visits is constructed into a visit network, which contains multi-dimensional diagnosis and treatment event information such as visits, diagnoses, laboratory tests, medical examinations, surgeries, and medications. Second, vector representations of the diagnosis and treatment events are obtained from the co-occurrence information of those events. The visits are then clustered by the visit node clustering network model based on self-supervision graph clustering, and each visit is labeled. Next, at the visit level, the patients' diagnosis and treatment paths are mined to obtain different subtypes of the chronic kidney disease phenotype. Finally, a phenotype subtype evaluation method is provided to assess whether clinically interpretable differences exist among the mined subtypes, using a series of comprehensive indicators including patient demographics, medication, complications, and survival rates.
The event information in each visit, such as diagnoses, laboratory tests, medical examinations, surgeries, and medications, is used to train the visit node clustering network model based on self-supervision graph clustering to obtain a category label for each visit; in this process, low-level, fine-grained information is aggregated into high-level, coarse-grained summary information. The category labels of the visits are then used for diagnosis and treatment path mining, which solves the problem that process mining methods cannot handle multi-granularity information in longitudinal electronic medical record data, such as event information within a single visit and event information across multiple visits.
Event vector representations are obtained from the co-occurrence information and used in the graph model, which effectively solves the problem that process mining methods have difficulty using event co-occurrence information, and enables full feature mining of the disease using cross-sectional and longitudinal electronic medical record data at the same time.
The self-supervision graph clustering algorithm provided by the invention brings the patient's multi-visit information into the visit node clustering network model based on self-supervision graph clustering and trains the embedded representations of the nodes, so it can handle phenotype evolution information over a long-range diagnosis and treatment process. Different nodes and relations in the visit network are then each given their own supervision signal: the node reconstruction error is computed with the L2 norm, based on a decoder that reconstructs the embedded representation of the lower-level nodes; the reconstruction error of the graph relations is computed with cross entropy; and the clustering error of the visit nodes is computed with the KL divergence.
Based on the similarity of the event label distributions of the visit nodes, similar adjacent events are merged. This optimizes the process mining method, simplifies the mined diagnosis and treatment process, and improves its representativeness and coverage.
Drawings
FIG. 1 is a schematic structural diagram of a chronic kidney disease subtype mining system based on self-supervision graph clustering according to the present invention;
FIG. 2 is a functional flow diagram of a chronic kidney disease subtype mining system based on self-supervision graph clustering according to the present invention;
FIG. 3 is a visit network according to an embodiment of the present invention;
FIG. 4 is a co-occurrence matrix of an embodiment of the present invention;
FIG. 5 is a structural diagram of the visit node clustering network model based on self-supervision graph clustering according to an embodiment of the present invention.
Detailed Description
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, a chronic kidney disease subtype mining system based on self-supervision graph clustering comprises:
a data acquisition module: used for collecting the structured data in chronic kidney disease diagnosis and treatment records;
a data extraction and preprocessing module: used for extracting and preprocessing the structured data to obtain an entity set and a visit set;
a chronic kidney disease subtype mining module: used for constructing a chronic kidney disease subtype mining model from the entity set and the visit set;
a chronic kidney disease phenotype subtype evaluation module: used for evaluating the chronic kidney disease subtype mining model;
a chronic kidney disease subtype prediction module: used for performing subtype prediction on the structured data of a patient.
Referring to fig. 2, a functional process of a chronic kidney disease subtype mining system based on self-supervision graph clustering comprises the following steps:
Step S1: collecting, through the data acquisition module, the structured data in chronic kidney disease diagnosis and treatment records to construct a data set; the structured data includes basic patient information, visit records, and the diagnoses, laboratory tests, medical examinations, surgeries, and/or medication data recorded during the observation window;
Step S2: preprocessing the structured data through the data extraction and preprocessing module to obtain a visit set and an entity set. The structured data is extracted from the chronic kidney disease diagnosis and treatment records in the electronic medical record system and includes basic patient information, visit records, and the diagnoses, laboratory tests, medical examinations, surgery data, and medication data recorded during the observation window. The extracted data is then preprocessed. For laboratory test data, only abnormal test items, judged against the normal reference range, are considered; the result of each abnormal item is classified as either low or high, and the name of the abnormal item and its abnormality category are retained. Medical examination and surgery data are processed with simple natural language processing techniques, retaining the examined body part, the examination type, and the operation name. For medication data, only six classes of drugs are considered, namely antihyperglycemic drugs, antihypertensive drugs, lipid-regulating drugs, non-steroidal anti-inflammatory drugs, antiplatelet drugs, and steroids; the drugs in the medication data are classified accordingly and the drug class is retained. This yields a diagnosis set, a medication set, an operation set, and a test set, together with the number of diagnosis types, the number of medication types, the number of operation types, the number of test types, and the number of visit records. The diagnosis set, medication set, operation set, and test set are combined to form the entity set, and the patients' visit records are combined to form the visit set.
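The laboratory and medication rules of step S2 can be illustrated with a minimal sketch. The function names, the `item:low` / `item:high` entity naming, the class labels in `DRUG_CLASSES`, and the reference range in the usage note are all assumptions made for illustration; the patent does not specify them.

```python
# The six medication classes named in the text; label spellings are hypothetical.
DRUG_CLASSES = {
    "antihyperglycemic", "antihypertensive", "lipid-regulating",
    "NSAID", "antiplatelet", "steroid",
}

def lab_entity(item, value, ref_low, ref_high):
    """Return an entity name for an abnormal test result, or None if normal."""
    if value < ref_low:
        return f"{item}:low"
    if value > ref_high:
        return f"{item}:high"
    return None  # results inside the reference range are not kept

def medication_entities(drug_classes):
    """Keep only the drug class, and only for the six classes of interest."""
    return sorted({c for c in drug_classes if c in DRUG_CLASSES})
```

For example, with an illustrative reference range of 0.4 to 4.0, `lab_entity("TSH", 8.2, 0.4, 4.0)` yields the entity `"TSH:high"`, while a value of 2.0 yields `None` and is dropped.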
Step S3: inputting the visit set and the entity set into the chronic kidney disease subtype mining module, and constructing the chronic kidney disease subtype mining model through that module;
Step S31: constructing a visit network from the visit set and the entity set;
Step S311: forming a node set from the visit set and the entity set;
Step S312: constructing an edge set through the node co-occurrence relationships in the node set;
Step S313: constructing the visit network from the node set and the edge set.
Step S32: constructing an entity co-occurrence matrix from the entity set, and obtaining through it an entity node initial embedded representation and a visit node initial embedded representation, which together form the node initial embedded representation;
Step S321: constructing the entity co-occurrence matrix from the entity set;
Step S322: computing the initial embedded representation of each entity node with the GloVe algorithm, based on the entity co-occurrence matrix;
Step S323: obtaining the visit node initial embedded representation by averaging the entity node initial embedded representations of all adjacent entity nodes, the visit node initial embedded representation and the entity node initial embedded representation together forming the node initial embedded representation.
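Steps S321 and S323 can be sketched as follows. This is a minimal illustration: it assumes that co-occurrence is counted once per visit in which two entities appear together, and it treats the GloVe training of step S322 as already done, so `entity_vecs` is a stand-in for the trained entity embeddings. The function names are hypothetical.

```python
from itertools import combinations

def cooccurrence_matrix(visits, entities):
    """Count, for each entity pair, the number of visits containing both."""
    index = {e: i for i, e in enumerate(entities)}
    n = len(entities)
    matrix = [[0] * n for _ in range(n)]
    for visit_entities in visits:
        # each unordered pair of distinct entities in a visit co-occurs once
        for a, b in combinations(sorted(set(visit_entities)), 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return matrix

def visit_embedding(visit_entities, entity_vecs):
    """Average the embeddings of the entity nodes adjacent to a visit node."""
    vecs = [entity_vecs[e] for e in visit_entities]
    dim = len(vecs[0])
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]
```

The resulting symmetric matrix is what a GloVe implementation would take as input; the visit embedding needs no training of its own because it is derived entirely from the entity embeddings.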
Step S33: constructing an adjacency matrix from the relationships among the nodes in the visit network, and training the visit node clustering network model based on self-supervision graph clustering with the adjacency matrix and the node initial embedded representation;
Step S331: constructing the adjacency matrix from the relationships among the nodes in the visit network, and inputting the adjacency matrix and the node initial embedded representation into the visit node clustering network model based on self-supervision graph clustering for graph attention training to obtain a node embedded representation, wherein the node embedded representation comprises a visit node embedded representation and an entity node embedded representation;
Step S332: reconstructing the visit network from the node embedded representation, and calculating a visit network reconstruction error;
Step S333: inputting the entity node embedded representation into a neural network decoder for training, taking the output of the last layer of the decoder as the entity node reconstructed embedded representation, and calculating an entity node reconstruction error;
Step S334: performing a softmax regression operation on the visit node embedded representation to obtain the probability distribution of the visit nodes, and calculating a clustering loss from that probability distribution;
Step S335: constructing the overall loss function of the visit node clustering network model based on self-supervision graph clustering from the visit network reconstruction error, the entity node reconstruction error, and the clustering loss.
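The three loss terms of steps S332 to S335 can be sketched in plain Python. The description names the L2 norm, cross entropy, and KL divergence for these terms; everything else here is an assumption for illustration: the function names, the weights `alpha` and `beta`, the weighted-sum form of the overall loss, and the target distribution `p` passed to `kl_divergence` (this section does not say how the target is obtained).

```python
import math

def softmax(z):
    """Turn a visit node embedding into a probability distribution over clusters."""
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def l2_error(x, x_hat):
    """L2 norm between an entity embedding and its decoder reconstruction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))

def cross_entropy(adj, adj_hat, eps=1e-12):
    """Mean binary cross entropy between adjacency entries and predicted edge probabilities."""
    terms = (t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
             for t, p in zip(adj, adj_hat))
    return -sum(terms) / len(adj)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between a target distribution p and the predicted distribution q."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def total_loss(l_graph, l_node, l_cluster, alpha=1.0, beta=1.0):
    """Weighted sum of the three errors; alpha and beta are hypothetical weights."""
    return l_graph + alpha * l_node + beta * l_cluster
```

In a real model these terms would be computed on tensors inside an automatic differentiation framework; the scalar versions above only show what each term measures.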
Step S34: constructing the chronic kidney disease subtype mining model through the visit node clustering network model based on self-supervision graph clustering.
Step S341: taking the visit node cluster distribution obtained by the visit node clustering network model based on self-supervision graph clustering as the category distribution of the visit nodes, selecting the category with the highest probability in the category distribution as the category label of each visit node, and arranging all the visit nodes of each patient in chronological order;
Step S342: deciding whether to merge consecutive visit nodes that have the same category label or to keep them separate, by calculating the cosine similarity between their category distributions, and constructing an event matrix by arranging the visit nodes;
Step S343: searching for frequent events to determine nodes, and connecting the nodes in sequence to form an event flow. Starting from the first column of the event matrix, the events whose frequency of occurrence in the column is greater than a threshold are selected as frequent events. The frequent events are used as nodes in the event flow, and the remaining events go directly to an end node. Each frequent event is used as the starting node of the next round of search: the corresponding event vectors are extracted and combined into a new event matrix, the first column is removed, and the same frequent event search is performed. The nodes obtained in each round of search are connected to the starting node so as to extend the event flow, until the set of frequent events is empty or the event flow reaches the maximum event flow length. The chronic kidney disease subtype mining model is obtained when the loop ends.
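Steps S342 and S343 can be sketched as below. Several details are assumptions, since the text does not fix them: visits are merged when the cosine similarity is at or above `sim_threshold`, frequency is an absolute count compared against `min_count`, each row of the event matrix is one patient's label sequence, and only flows that run to a leaf are returned (sequences that drop out early at the end node are not listed separately). All names, including the `END` marker, are hypothetical.

```python
import math
from collections import Counter

END = "END"  # hypothetical marker for the end node

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm

def merge_visits(dists, sim_threshold):
    """Label each visit by its most probable category; merge consecutive
    same-label visits whose category distributions are similar enough."""
    labels, prev = [], None
    for d in dists:
        label = max(range(len(d)), key=d.__getitem__)
        same = prev is not None and label == labels[-1]
        if same and cosine(prev, d) >= sim_threshold:
            prev = d  # merged into the previous event
            continue
        labels.append(label)
        prev = d
    return labels

def mine_event_flows(rows, min_count, max_len, prefix=()):
    """Column-wise frequent event search over an event matrix (list of rows)."""
    heads = Counter(r[0] for r in rows if r)
    frequent = sorted(e for e, c in heads.items() if c > min_count)
    if not frequent or len(prefix) >= max_len:
        return [prefix + (END,)] if prefix else []
    flows = []
    for e in frequent:
        # keep rows starting with e, then remove the first column
        rest = [r[1:] for r in rows if r and r[0] == e]
        flows += mine_event_flows(rest, min_count, max_len, prefix + (e,))
    return flows
```

Each returned tuple is one event flow, that is, one candidate diagnosis and treatment path; lowering `min_count` or raising `max_len` produces more and longer flows.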
Step S4: evaluating the chronic kidney disease subtype mining model through the chronic kidney disease phenotype subtype evaluation module;
Step S5: performing subtype prediction on the structured data of a patient through the chronic kidney disease subtype prediction module;
Step S51: preprocessing the structured data of the patient and inputting it into the visit node clustering network model based on self-supervision graph clustering for prediction, to obtain the probability distribution of the patient's visit nodes;
Step S52: determining the cluster category of each visit node from the probability distribution of the visit nodes, and constructing a visit event sequence;
Step S53: inputting the visit event sequence into the chronic kidney disease subtype mining model, fitting the nodes in the model in sequence order to obtain an event flow, and determining from the event flow which chronic kidney disease subtype the patient belongs to.
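Steps S52 and S53 might look like the sketch below. The text does not define how a sequence is fitted to the mined model, so matching by longest shared prefix is an assumption, as are the function names and the idea that each mined flow corresponds to one subtype.

```python
def visit_labels(prob_dists):
    """Step S52: pick the most probable cluster for every visit node."""
    return [max(range(len(d)), key=d.__getitem__) for d in prob_dists]

def match_flow(sequence, flows):
    """Step S53 (assumed rule): index of the flow sharing the longest prefix
    with the patient's visit event sequence, or None if nothing matches."""
    def prefix_len(flow):
        n = 0
        for a, b in zip(sequence, flow):
            if a != b:
                break
            n += 1
        return n
    if not flows:
        return None
    best = max(range(len(flows)), key=lambda i: prefix_len(flows[i]))
    return best if prefix_len(flows[best]) > 0 else None
```

A patient whose visits are labeled `[1, 0, 2]` would be matched to the flow `(1, 0)` rather than `(0, 1, 2)`, because the first two events agree with it.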
Example (b):
a chronic kidney disease subtype mining system based on self-supervision picture clustering comprises:
a data acquisition module: the system is used for acquiring structured data in the diagnosis and treatment record of chronic kidney disease to construct a data set; the structured data includes basic information of the patient, medical records, diagnoses during viewing windows, laboratory tests, medical examinations, surgery, and/or medication data;
the data extraction and pretreatment module: the system is used for extracting and preprocessing the structured data to obtain a doctor seeing set and an entity set; the data extraction and preprocessing module is specifically used for preprocessing the structured data, extracting the structured data in the chronic kidney disease diagnosis and treatment records in the electronic medical record system, wherein the structured data comprises basic information of a patient, a diagnosis record, diagnosis during an observation window, laboratory test, medical examination, operation data and medication data, preprocessing the extracted structured data, only paying attention to an abnormal test item according to a normal reference range, dividing the result of the abnormal test item into a lower type and a higher type, and keeping the name and the type of the abnormal test item; medical examination and operation data are processed by a simple natural language processing technology, and the examined part, the examined type and the operation name are reserved; the medication data only pay attention to the use of six types of medicines, namely antihyperglycemic medicines, antihypertensive medicines, lipid regulating medicines, non-steroidal anti-inflammatory medicines, antiplatelet medicines and steroids, the six types of medicines in the medication data are classified, and the medicine categories are reserved; obtaining a diagnosis set, a medication set, an operation set, a test set, the number of diagnosis types, the number of medication types, the number of operation types, the number of test types and the number of treatment records, combining the diagnosis set, the medication set, the operation set and the test set to form an entity set, and combining the treatment records of patients to form a treatment set.
Chronic kidney disease subtype mining module: used for receiving the visit set and the entity set and constructing a chronic kidney disease subtype mining model from them;
a visit network construction unit: used for constructing a visit network from the visit set and the entity set;
used for forming a node set from the visit set and the entity set;
the visit set is $V=\{v_1,v_2,\dots,v_N\}$, where $N$ denotes the number of visits. $D$, $M$, $S$ and $T$ are respectively the diagnosis set, the medication set, the operation set and the test set, with $D=\{d_1,\dots,d_{N_d}\}$, $M=\{m_1,\dots,m_{N_m}\}$, $S=\{s_1,\dots,s_{N_s}\}$ and $T=\{t_1,\dots,t_{N_t}\}$, where $N_d$, $N_m$, $N_s$ and $N_t$ respectively denote the number of diagnosis types, medication types, operation types and test types. $D$, $M$, $S$ and $T$ compose the entity set $E=D\cup M\cup S\cup T=\{e_1,\dots,e_{N_e}\}$, and the number of entity types is $N_e=N_d+N_m+N_s+N_t$.
The edge set is constructed through the node co-occurrence relations in the node set;
for the same visit $v_i$ ($1\le i\le N$), the entities present in it constitute an entity subset $E_i$, where $n_i$ denotes the number of entities in $E_i$ and $E_i\subseteq E$. Each entity subset and its corresponding visit form a visit connected subset $C_i=\{v_i\}\cup E_i$. A visit connected subset comprises one visit node and all entity nodes of that visit; all nodes in one visit connected subset have a co-occurrence relationship and are connected pairwise to form an edge subset $R_i$; all the edge subsets together form the edge set $R=\bigcup_{i=1}^{N}R_i$;
Referring to FIG. 3, in visit $v_1$ the physician makes two diagnoses, goiter ($d_1$) and thyroid nodule ($d_2$), performs a partial thyroidectomy ($s_1$) and prescribes levothyroxine sodium tablets ($m_1$). Then $\{v_1,d_1,d_2,s_1,m_1\}$ forms a visit connected subset, and these 5 nodes are connected pairwise in the visit network. In visit $v_2$, the physician performs a TSH measurement ($t_1$), then makes a diagnosis of hypothyroidism ($d_3$) and prescribes levothyroxine sodium tablets ($m_1$). Then $\{v_2,t_1,d_3,m_1\}$ is also a visit connected subset, and these 4 nodes are connected pairwise in the visit network. Because $m_1$ appears in both $v_1$ and $v_2$, in the visit network $m_1$ is connected to the other nodes of both visit connected subsets.
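The FIG. 3 example can be reproduced with a short sketch that connects every pair of nodes inside each visit connected subset. The node identifiers (prefixes `dx:`, `op:`, `rx:`, `lab:`) and the function name `build_visit_network` are illustrative assumptions, not names from the patent.

```python
from itertools import combinations

def build_visit_network(visits):
    """Build node and edge sets; each visit node is fully connected with its entities."""
    nodes, edges = set(), set()
    for visit_id, entities in visits.items():
        subset = [visit_id, *entities]           # one visit connected subset
        nodes.update(subset)
        # Sorting gives each undirected edge one canonical orientation.
        for a, b in combinations(sorted(set(subset)), 2):
            edges.add((a, b))
    return nodes, edges

visits = {
    "v1": ["dx:goiter", "dx:thyroid nodule", "op:partial thyroidectomy", "rx:levothyroxine"],
    "v2": ["lab:TSH", "dx:hypothyroidism", "rx:levothyroxine"],
}
nodes, edges = build_visit_network(visits)

# The shared medication node links the two subsets.
degree_rx = sum("rx:levothyroxine" in e for e in edges)
print(len(nodes), len(edges), degree_rx)
```

The first subset contributes 10 edges (5 nodes) and the second 6 edges (4 nodes); the levothyroxine node is adjacent to all 7 other nodes because it occurs in both visits.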
An embedded representation construction unit: used for constructing an entity co-occurrence matrix from the entity set, and obtaining the entity node initial embedded representations and the visit node initial embedded representations from the entity co-occurrence matrix; the entity node initial embedded representations and the visit node initial embedded representations together form the node initial embedded representation;
for constructing an entity co-occurrence matrix using the set of entities;
Using the entity set $E$, an entity co-occurrence matrix $X$ is constructed. Referring to FIG. 4, the entity co-occurrence matrix $X$ has dimension $N_e\times N_e$; each row and each column represents one entity of the entity set $E$, and $X_{jk}$ represents the co-occurrence information of entity $e_j$ and entity $e_k$. $X_{jk}$ is calculated as:
$$X_{jk}=\sum_{i=1}^{N}\delta_i(e_j,e_k)$$
wherein, if entity $e_j$ and entity $e_k$ occur together in visit $v_i$, that is $e_j\in E_i$ and $e_k\in E_i$, then $\delta_i(e_j,e_k)$ equals 1; otherwise it is 0. Here $E_i$ is the entity subset formed by all entities present in visit $v_i$. The entity co-occurrence matrix $X$ is symmetric, $X_{jk}$ and $X_{kj}$ are equal, and the co-occurrence information of an entity with itself, on the diagonal, is recorded as 0.
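As a minimal sketch of this counting rule, the following plain-Python function accumulates one count per visit for every pair of distinct entities that occur together. The function name `cooccurrence_matrix` and the example entities are hypothetical.

```python
from itertools import combinations

def cooccurrence_matrix(entity_subsets, entities):
    """Count, for each pair of distinct entities, the visits in which both occur."""
    index = {e: i for i, e in enumerate(entities)}
    n = len(entities)
    X = [[0] * n for _ in range(n)]
    for subset in entity_subsets:                      # one subset per visit
        for a, b in combinations(sorted(set(subset)), 2):
            i, j = index[a], index[b]
            X[i][j] += 1                               # symmetric update,
            X[j][i] += 1                               # diagonal stays 0
    return X

entities = ["goiter", "hypothyroidism", "levothyroxine", "TSH"]
subsets = [
    ["goiter", "levothyroxine"],
    ["TSH", "hypothyroidism", "levothyroxine"],
    ["hypothyroidism", "levothyroxine"],
]
X = cooccurrence_matrix(subsets, entities)
for row in X:
    print(row)
```

Hypothyroidism and levothyroxine co-occur in two visits, so the corresponding entries are 2, while entities that never share a visit keep a 0.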
The initial embedded representation of each entity node is obtained through calculation of a GloVe algorithm based on the entity co-occurrence matrix;
the relationship between the entity node initial embedded representations and the entity co-occurrence matrix is expressed as:
$$w_j^{\top}w_k+b_j+b_k=\log X_{jk}$$
wherein $w_j$ and $w_k$ are respectively the entity node initial embedded representations of entity $e_j$ and entity $e_k$ that ultimately need to be solved; each is randomly initialized as a 128-dimensional vector with values between -0.1 and 0.1; the superscript $\top$ denotes the transposition operation; $b_j$ and $b_k$ are respectively the bias terms of the two entity node initial embedded representations, with initial value 0.
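An objective function is constructed based on the relationship between the entity co-occurrence matrix and the entity node initial embedded representations:
$$J=\sum_{j,k=1}^{N_e}f(X_{jk})\left(w_j^{\top}w_k+b_j+b_k-\log X_{jk}\right)^2,\qquad f(x)=\begin{cases}(x/x_{\max})^{\alpha}, & x<x_{\max}\\ 1, & \text{otherwise}\end{cases}$$
wherein $x_{\max}$ is the co-occurrence information threshold and $\alpha$ is an exponent parameter.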
If two entity nodes never appear together, i.e. $X_{jk}=0$, they do not participate in the calculation of the objective function. The objective function is optimized with the AdaDelta gradient descent algorithm until convergence, which yields, for each entity $e_j$ in the entity set $E$, the corresponding entity node initial embedded representation $w_j$;
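The following sketch evaluates a GloVe-style weighted least-squares objective for a given co-occurrence matrix, assuming the standard GloVe form in which each non-zero pair contributes a weighted squared error between the dot product plus biases and the logarithm of the co-occurrence count. The default values `x_max=100` and `alpha=0.75` are the defaults of the original GloVe paper, not values stated in this document, and the optimizer (AdaDelta in the text) is not implemented here.

```python
import math
import random

def init_embeddings(n, dim=128, seed=0):
    """Random vectors in [-0.1, 0.1] and zero biases, as described in the text."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n)]
    b = [0.0] * n
    return W, b

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting function: damps rare pairs, caps frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_loss(X, W, b, x_max=100.0, alpha=0.75):
    """Weighted squared error over all pairs with non-zero co-occurrence."""
    loss = 0.0
    n = len(X)
    for j in range(n):
        for k in range(n):
            if X[j][k] == 0:          # pairs that never co-occur are skipped
                continue
            dot = sum(wj * wk for wj, wk in zip(W[j], W[k]))
            err = dot + b[j] + b[k] - math.log(X[j][k])
            loss += glove_weight(X[j][k], x_max, alpha) * err ** 2
    return loss

X = [[0, 100], [100, 0]]
W0 = [[0.0, 0.0], [0.0, 0.0]]         # all-zero embeddings for a hand-checkable value
b0 = [0.0, 0.0]
loss = glove_loss(X, W0, b0)
print(loss)
```

With all-zero embeddings the error of each of the two non-zero entries is simply the negative logarithm of the count, which makes the value easy to verify by hand; in practice the embeddings would come from `init_embeddings` and be updated by a gradient-based optimizer.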
The visit node initial embedded representation is obtained by averaging the entity node initial embedded representations of all entity nodes adjacent to the visit node; the visit node initial embedded representations and the entity node initial embedded representations together form the node initial embedded representation;
for visit node $v_i$, the set of all adjacent entity nodes is $E_i$, and its initial embedded representation is:
$$h_{v_i}=\frac{1}{n_i}\sum_{e_j\in E_i}w_j$$
wherein $n_i$ is the number of entity nodes in $E_i$.
The node initial embedded representation is $H^{(0)}=[H_V^{(0)};H_E^{(0)}]$, where $H_V^{(0)}$ is the visit node initial embedded representation and $H_E^{(0)}$ is the entity node initial embedded representation.
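The averaging step can be sketched in a few lines. The two-dimensional vectors below are hypothetical stand-ins for the 128-dimensional entity embeddings, and `visit_initial_embedding` is an illustrative name.

```python
def visit_initial_embedding(entity_ids, entity_embeddings):
    """Average the embeddings of all entity nodes adjacent to one visit node."""
    vectors = [entity_embeddings[e] for e in entity_ids]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

# Hypothetical 2-dimensional entity embeddings for readability.
entity_embeddings = {"goiter": [1.0, 2.0], "levothyroxine": [3.0, 4.0]}
h_v1 = visit_initial_embedding(["goiter", "levothyroxine"], entity_embeddings)
print(h_v1)
```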
A clustering network construction unit: used for constructing an adjacency matrix from the relationships among the nodes in the visit network, and training a visit node clustering network model based on self-supervision graph clustering with the adjacency matrix and the node initial embedded representation; referring to FIG. 5, the visit node clustering network model based on self-supervision graph clustering consists of three parts: graph attention, an autoencoder, and self-supervision.
Used for constructing an adjacency matrix $A$ from the relationships among the nodes in the visit network, and inputting the adjacency matrix $A$ and the node initial embedded representation $H^{(0)}$ into the visit node clustering network model based on self-supervision graph clustering for $L$ layers of graph attention training; the node embedded representation of the $l$-th layer is denoted $H^{(l)}$ and is calculated as follows:
$$H^{(l)}=\mathrm{ReLU}\left(\hat{A}\,H^{(l-1)}\,W^{(l)}\right)$$
wherein $\mathrm{ReLU}$ is the ReLU activation function and $W^{(l)}$ is the graph attention weight of the $l$-th layer. $\tilde{A}=A+I$ and $\hat{A}=\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$, where $\hat{A}$ is the normalized adjacency matrix, $I$ is the identity matrix and $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$. After $L$ layers of graph attention training, the node embedded representation $Z=H^{(L)}$ is obtained. Like the node initial embedded representation $H^{(0)}$, $Z$ consists of the updated visit node embedded representation $Z_V$ and the entity node embedded representation $Z_E$, $Z=[Z_V;Z_E]$.
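A minimal sketch of one propagation layer follows, assuming the layer update is a ReLU applied to the product of the symmetrically normalized adjacency matrix (with self-loops), the previous layer's embeddings, and a weight matrix. This assumption is taken from the terms the text defines (normalized adjacency matrix, identity matrix, layer weights, ReLU); the sketch does not compute per-edge attention coefficients, and all function names are illustrative.

```python
import math

def normalize_adjacency(A):
    """Add self-loops, then scale each entry by the square roots of both node degrees."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(P, Q):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def graph_layer(A_hat, H, W):
    """One layer: propagate over the graph, apply the weights, then ReLU."""
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]

A = [[0.0, 1.0], [1.0, 0.0]]            # two connected nodes
H0 = [[1.0, -1.0], [3.0, 1.0]]          # hypothetical initial embeddings
W1 = [[1.0, 0.0], [0.0, 1.0]]           # identity weights for a hand-checkable result
A_hat = normalize_adjacency(A)
H1 = graph_layer(A_hat, H0, W1)
print(H1)
```

With two connected nodes every entry of the normalized matrix is 0.5, so each node's new embedding is the average of both nodes' embeddings.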
used for reconstructing the visit network from the node embedded representation $Z$ as $A'=\sigma(ZZ^{\top})$, and calculating the visit network reconstruction error $\mathcal{L}_{net}$ between the adjacency matrix $A$ and the reconstruction $A'$;
wherein $Z^{\top}$ is the transpose matrix of $Z$ and $\sigma$ is the sigmoid activation function.
used for inputting the entity node embedded representation $Z_E$ into a decoder consisting of an $L_d$-layer neural network for training; the representation of the nodes in the $l$-th layer of the decoder is denoted $G^{(l)}$ and is obtained with the following formula:
$$G^{(l)}=\phi\left(W_d^{(l)}G^{(l-1)}+b_d^{(l)}\right)$$
wherein $W_d^{(l)}$ is the network weight of the $l$-th decoder layer, $b_d^{(l)}$ is the bias, $\phi$ is the activation function of the layer, and the input of the decoder is $G^{(0)}=Z_E$. The output of the last decoder layer is taken as the entity node reconstructed embedded representation $\hat{H}_E=G^{(L_d)}$, and the entity node reconstruction error $\mathcal{L}_{ent}$ is calculated between $\hat{H}_E$ and the entity node initial embedded representation.
Used for performing a softmax regression operation on the visit node embedded representation $Z_V$ to obtain the visit node probability distribution:
$$Y=\mathrm{softmax}(Z_V)$$
wherein $Y$ has dimension $N\times K$; $K$ is the preset number of cluster centers, that is, the number of visit node categories, which is chosen empirically by trying 3, 5 and 10 and keeping the number of categories that gives the better result. $y_{ik}$ denotes the probability that the $i$-th sample belongs to the $k$-th category.
The clustering loss is calculated from the visit node probability distribution;
for the $i$-th visit sample and the $k$-th cluster, the Student's t-distribution is used to measure the similarity between the data representation $y_i$ and the cluster center $\mu_k$. $y_i$ is the $i$-th row of the visit node probability distribution $Y$, $\mu_k$ is a cluster center initialized by the K-means method on the visit node probability distribution $Y$, $\nu$ is the degree of freedom of the Student's t-distribution, and $K$ is the preset number of cluster centers. $q_{ik}$ is calculated as:
$$q_{ik}=\frac{\left(1+\lVert y_i-\mu_k\rVert^2/\nu\right)^{-\frac{\nu+1}{2}}}{\sum_{k'=1}^{K}\left(1+\lVert y_i-\mu_{k'}\rVert^2/\nu\right)^{-\frac{\nu+1}{2}}}$$
wherein $q_{ik}$ is the probability that the $i$-th sample is assigned to the $k$-th cluster. Let $Q=[q_{ik}]$ be the cluster assignment distribution of all samples. After the cluster distribution $Q$ is obtained, the target distribution $P$ is calculated; the sample assignments in the target distribution $P$ have higher confidence, so the data distribution can be optimized according to $P$ so that the data move closer to the cluster centers. $P$ and $Q$ both have dimension $N\times K$. Each element $p_{ik}$ of the target distribution $P$ is calculated as:
$$p_{ik}=\frac{q_{ik}^2/f_k}{\sum_{k'=1}^{K}q_{ik'}^2/f_{k'}}$$
wherein $f_k=\sum_{i=1}^{N}q_{ik}$. In the target distribution $P$, $q_{ik}$ is squared and renormalized, so $P$ has higher confidence. The clustering loss is calculated as:
$$\mathcal{L}_{clu}=\mathrm{KL}(P\,\Vert\,Q)=\sum_{i=1}^{N}\sum_{k=1}^{K}p_{ik}\log\frac{p_{ik}}{q_{ik}}$$
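The self-supervision step described here matches the soft-assignment scheme known from deep embedded clustering: a Student's t kernel gives soft assignments, squaring and renormalizing them gives a sharper target, and the loss is the KL divergence between the two. The sketch below assumes exactly that form; the sample rows, the fixed cluster centers (which the text obtains from K-means) and the default degree of freedom of 1 are illustrative assumptions.

```python
import math

def soft_assignment(Y, centers, nu=1.0):
    """Student's t similarity between each sample row and each cluster center, row-normalized."""
    Q = []
    for y in Y:
        weights = []
        for mu in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(y, mu))
            weights.append((1.0 + d2 / nu) ** (-(nu + 1.0) / 2.0))
        total = sum(weights)
        Q.append([w / total for w in weights])
    return Q

def target_distribution(Q):
    """Square the assignments, divide by the soft cluster sizes, renormalize each row."""
    f = [sum(col) for col in zip(*Q)]
    P = []
    for q in Q:
        weights = [qk ** 2 / fk for qk, fk in zip(q, f)]
        total = sum(weights)
        P.append([w / total for w in weights])
    return P

def clustering_loss(P, Q):
    """KL divergence from the target distribution P to the assignment distribution Q."""
    return sum(p * math.log(p / q)
               for p_row, q_row in zip(P, Q)
               for p, q in zip(p_row, q_row) if p > 0)

Y = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]      # hypothetical softmax outputs
centers = [[0.85, 0.15], [0.1, 0.9]]          # stand-in for K-means centers
Q = soft_assignment(Y, centers)
P = target_distribution(Q)
loss = clustering_loss(P, Q)
print(Q[0], P[0], loss)
```

In this example the first sample's dominant assignment is larger in the target distribution than in the soft assignment, which illustrates how the target pushes samples toward their nearest cluster center.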
used for constructing the total loss function of the visit node clustering network model based on self-supervision graph clustering from the visit network reconstruction error $\mathcal{L}_{net}$, the entity node reconstruction error $\mathcal{L}_{ent}$ and the clustering loss $\mathcal{L}_{clu}$. The total loss function is a weighted combination of these three loss terms,
wherein the weighting hyperparameter adjusts the relative importance of the different loss terms and is set to 0.1 by default.
The chronic kidney disease subtype mining model construction unit: used for constructing the chronic kidney disease subtype mining model from the visit node clustering network model based on self-supervision graph clustering.
The visit node cluster distribution $Q$ obtained from the visit node clustering network model based on self-supervision graph clustering is used as the category distribution of the visit nodes, and the category with the highest probability in the category distribution is selected as the category label of the visit node; the category label of visit node $v_i$ is $c_i=\arg\max_k q_{ik}$. All visit nodes of each patient are arranged in chronological order, taking the recording time of the first medical record of a single visit as the start time of the visit node and the recording time of the last medical record as its end time.
Whether consecutive visit nodes with the same category label are merged or kept separately is determined by calculating the cosine similarity between their category distributions, and an event matrix is constructed by arranging the resulting visit nodes;
for two consecutive visit nodes $v_a$ and $v_b$ with the same category label, the cosine similarity between their category distributions $q_a$ and $q_b$ is calculated:
$$\cos(q_a,q_b)=\frac{q_a\cdot q_b}{\lVert q_a\rVert\,\lVert q_b\rVert}$$
Two consecutive visit nodes whose cosine similarity is greater than 0.8 are merged into one visit node, and the category distribution of the merged visit node is derived from those of the two original nodes; otherwise the two visit nodes are kept separately. For several consecutive visit nodes with the same category label, the cosine similarity check is applied from front to back in the order of arrangement to decide whether to merge or keep them separately.
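The front-to-back merge rule can be sketched as below. One point is an explicit assumption: the category distribution assigned to a merged node is taken here as the element-wise mean of the two distributions, because the formula for it is not recoverable from the text. The function names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two category distributions."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def merge_visits(distributions, threshold=0.8):
    """Scan one patient's visits in time order and merge similar neighbours with equal labels."""
    merged = []
    for q in distributions:
        label = max(range(len(q)), key=q.__getitem__)    # most probable category
        if merged:
            prev_label, prev_q = merged[-1]
            if prev_label == label and cosine(prev_q, q) > threshold:
                # Assumption: the merged node takes the mean of the two distributions.
                merged[-1] = (label, [(a + b) / 2 for a, b in zip(prev_q, q)])
                continue
        merged.append((label, list(q)))
    return merged

visits = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.55, 0.45]]
merged = merge_visits(visits)
labels = [label for label, _ in merged]
print(labels)
```

The first two visits share a label and are nearly parallel, so they collapse into one node; the last visit has the same label as the first but is not adjacent to it, so it stays separate.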
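The final visit nodes of each patient are arranged into an event vector $u_p=(c_{p,1},c_{p,2},\dots,c_{p,n_{\max}})$, where $n_{\max}$ is the number of visit nodes of the patient with the most visit nodes; the event vectors of patients with fewer than $n_{\max}$ visit nodes are padded with 0. The event vectors of all $N_p$ patients are combined into an event matrix $U$:
$$U=\begin{bmatrix}c_{1,1}&c_{1,2}&\cdots&c_{1,n_{\max}}\\ c_{2,1}&c_{2,2}&\cdots&c_{2,n_{\max}}\\ \vdots&\vdots&\ddots&\vdots\\ c_{N_p,1}&c_{N_p,2}&\cdots&c_{N_p,n_{\max}}\end{bmatrix}$$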
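The zero-padding step can be illustrated in a few lines. The sketch assumes category labels are numbered from 1 so that 0 can be reserved for padding; that numbering and the name `build_event_matrix` are assumptions made for the example.

```python
def build_event_matrix(patient_events):
    """Right-pad every patient's label sequence with 0 to the longest sequence length."""
    width = max(len(seq) for seq in patient_events)
    return [list(seq) + [0] * (width - len(seq)) for seq in patient_events]

# Hypothetical label sequences for three patients (labels start at 1).
event_matrix = build_event_matrix([[1, 2, 3], [2], [3, 1]])
for row in event_matrix:
    print(row)
```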
Used for searching the event matrix for frequent events in order to determine nodes: the frequent events are taken as nodes of the event flow, and the remaining events go directly to an end node; each frequent event is then taken as the starting node of the next search: the event vectors corresponding to it are extracted and combined into a new event matrix, the first column is removed, and the same frequent event search is performed; the nodes found in each search are connected to their starting node, thereby extending the event flow, until no frequent event is found or the event flow length reaches the maximum event flow length; when the loop ends, the chronic kidney disease subtype mining model is obtained.
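The recursive search can be sketched as a small tree builder. Several details are assumptions, since the text does not state them: an event counts as frequent when it starts at least `min_support` of the rows under consideration, the event flow is stored as a nested dictionary, and non-frequent events are simply dropped, which stands in for routing them to the end node.

```python
from collections import Counter

def mine_event_flow(event_matrix, min_support=0.5, max_length=5):
    """Return a nested dict: frequent event -> event flow that follows it."""
    def search(rows, depth):
        node = {}
        if depth >= max_length or not rows:
            return node
        # Count events in the first column; 0 is padding and is ignored.
        firsts = Counter(r[0] for r in rows if r and r[0] != 0)
        frequent = [e for e, n in firsts.items() if n / len(rows) >= min_support]
        for event in sorted(frequent):
            # Keep the rows starting with this event and drop their first column.
            sub_rows = [r[1:] for r in rows if r and r[0] == event]
            node[event] = search(sub_rows, depth + 1)
        return node
    return search(event_matrix, 0)

event_matrix = [[1, 2, 0], [1, 2, 3], [1, 3, 0], [2, 1, 0]]
flow = mine_event_flow(event_matrix)
short_flow = mine_event_flow(event_matrix, max_length=2)
print(flow)
print(short_flow)
```

In the example, event 1 starts three of four sequences and becomes the root; among those, event 2 follows in two of three, and event 3 follows event 2 in one of the two remaining sequences, which just meets the assumed support threshold of 0.5.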
Chronic kidney disease phenotype subtype assessment module: used for evaluating the chronic kidney disease subtype mining model;
the differences between patients of different phenotype subtypes are compared, and the characteristics of the mined subtypes are tested for statistical differences, thereby evaluating whether the disease subtypes obtained by the phenotype subtype mining method are clinically meaningful. The specific evaluation protocol is as follows:
Indexes such as sex, age and glomerular filtration rate are calculated for the patients of each phenotype subtype, and a statistical test is used to judge whether the clinical manifestations of patients of different phenotype subtypes differ.
Important medication data, such as the usage of recombinant human erythropoietin, metformin, candesartan and pravastatin, are compared across the patients of different subtypes and analyzed with a statistical test.
The number of patients with each complication, including heart failure, coronary heart disease, hypertension, diabetes and hyperlipidemia, is counted for each subtype, the proportion of each complication is calculated, and the proportions are tested for differences between subtypes.
The total number of patients and the number of survivors at different time points are counted for each subtype, and the survival rates of patients of different subtypes are compared. The difference in survival over time between patients of different subtypes is analyzed using the log-rank test.
If more than 50 percent of the characteristics differ significantly between the patient groups of different subtypes, the mined subtypes are considered to have good clinical value.
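The final decision rule can be sketched as follows, assuming the statistical tests above have already produced one p-value per characteristic. The significance level of 0.05, the function name and the example p-values are hypothetical and are not given in the text.

```python
def subtypes_clinically_useful(p_values, alpha=0.05, min_fraction=0.5):
    """Return (decision, significant characteristics) for a dict of per-characteristic p-values."""
    significant = [name for name, p in p_values.items() if p < alpha]
    return len(significant) / len(p_values) > min_fraction, significant

# Hypothetical p-values from between-subtype comparisons.
p_values = {"age": 0.01, "glomerular filtration rate": 0.001, "sex": 0.4, "diabetes": 0.03}
useful, significant = subtypes_clinically_useful(p_values)
print(useful, significant)
```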
Chronic kidney disease subtype prediction module: used for performing prediction on the structured data of a patient;
used for inputting the preprocessed structured data of the patient into the visit node clustering network model based on self-supervision graph clustering for prediction, to obtain the probability distribution of the patient's visit nodes;
used for determining the cluster category of each visit node according to the probability distribution of the visit nodes, and constructing a visit event sequence;
and used for inputting the visit event sequence into the chronic kidney disease subtype mining model, fitting the sequence in order to the nodes of the chronic kidney disease subtype mining model to obtain an event flow, and determining from the event flow which chronic kidney disease subtype the patient belongs to.
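A minimal sketch of the prediction path follows, under two assumptions that are not stated in the text: the mined event flow is stored as a nested dictionary (event label mapped to the flow that follows it), and category labels are numbered from 1. The matched path is what a subtype would then be read from; how paths map to subtypes is left open here.

```python
def to_event_sequence(distributions):
    """Turn per-visit probability distributions into 1-based category labels."""
    return [max(range(len(q)), key=q.__getitem__) + 1 for q in distributions]

def fit_event_flow(flow, events):
    """Walk the event flow in order and return the longest matching prefix."""
    path, node = [], flow
    for event in events:
        if event not in node:
            break                     # the sequence leaves the mined flow here
        path.append(event)
        node = node[event]
    return path

flow = {1: {2: {3: {}}}, 2: {}}       # hypothetical mined event flow
events = to_event_sequence([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.6, 0.3, 0.1]])
path = fit_event_flow(flow, events)
print(events, path)
```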
The invention provides a visit node clustering network model based on self-supervision graph clustering, in which a decoder is added to the graph attention training to reconstruct the node embedded representations, and a self-supervision loss is added to train the clustering model. The model aggregates low-level, fine-grained information on chronic kidney disease patients into high-level, coarse-grained general information for diagnosis and treatment process mining, which addresses the inability of process mining to handle multi-granularity information in longitudinal electronic medical record data, such as event information within a single visit and event information across multiple visits. Based on the self-supervision graph clustering method, the multi-dimensional diagnosis and treatment information within a single visit of a patient and the temporal information across multiple visits are fully integrated, so that features of the electronic medical record data are mined along both the cross-sectional and the longitudinal dimension. Based on the similarity of the event label distributions of the visit nodes, similar adjacent events are merged, which optimizes the process mining method, simplifies the mined diagnosis and treatment process, and improves the representativeness and coverage of the diagnosis and treatment process.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (9)
1. A chronic kidney disease subtype mining system based on self-supervision graph clustering is characterized by comprising:
a data acquisition module: used for collecting the structured data in chronic kidney disease diagnosis and treatment records;
a data extraction and preprocessing module: used for extracting and preprocessing the structured data to obtain an entity set and a visit set;
a chronic kidney disease subtype mining module: used for constructing a chronic kidney disease subtype mining model from the entity set and the visit set;
a chronic kidney disease phenotype subtype assessment module: used for evaluating the chronic kidney disease subtype mining model;
a chronic kidney disease subtype prediction module: used for performing prediction on the structured data of a patient.
2. The system of claim 1, wherein the structured data comprises basic patient information, medical records, diagnosis during observation windows, laboratory tests, medical examinations, surgery and/or medication data.
3. The chronic kidney disease subtype mining system based on self-supervision graph clustering as claimed in claim 1, wherein the data extraction and preprocessing module is specifically configured to extract the structured data from the chronic kidney disease diagnosis and treatment records in the electronic medical record system, including basic patient information, visit records, diagnoses during the observation window, laboratory tests, medical examinations, surgery data and medication data, and to preprocess the extracted structured data as follows: for laboratory tests, only abnormal test items are considered, judged against the normal reference range, the result of each abnormal test item is classified as low or high, and the name and category of the abnormal test item are retained; medical examination and surgery data are processed with simple natural language processing techniques, and the examined body part, the examination type and the operation name are retained; for medication data, only six classes of drugs are considered, namely antihyperglycemic drugs, antihypertensive drugs, lipid-regulating drugs, non-steroidal anti-inflammatory drugs, antiplatelet drugs and steroids, the drugs in the medication data are classified into these six classes, and the drug class is retained; a diagnosis set, a medication set, an operation set and a test set are obtained, together with the number of diagnosis types, medication types, operation types, test types and visit records; the diagnosis set, the medication set, the operation set and the test set are combined to form the entity set, and the visit records of the patients are combined to form the visit set.
4. The chronic kidney disease subtype mining system based on self-supervision graph clustering as claimed in claim 1, wherein the chronic kidney disease subtype mining module specifically includes:
a visit network construction unit: used for constructing a visit network from the visit set and the entity set;
an embedded representation construction unit: used for constructing an entity co-occurrence matrix from the entity set, and obtaining the entity node initial embedded representations and the visit node initial embedded representations from the entity co-occurrence matrix, the entity node initial embedded representations and the visit node initial embedded representations together forming the node initial embedded representation;
a clustering network construction unit: used for constructing an adjacency matrix from the relationships among the nodes in the visit network, and training a visit node clustering network model based on self-supervision graph clustering with the adjacency matrix and the node initial embedded representation;
a chronic kidney disease subtype mining model construction unit: used for constructing the chronic kidney disease subtype mining model from the visit node clustering network model based on self-supervision graph clustering.
5. The chronic kidney disease subtype mining system based on self-supervision graph clustering as claimed in claim 4, wherein the visit network construction unit is specifically:
used for forming a node set from the visit set and the entity set;
used for constructing an edge set from the node co-occurrence relations in the node set;
used for constructing the visit network from the node set and the edge set.
6. The chronic kidney disease subtype mining system based on self-supervision graph clustering as claimed in claim 4, wherein the embedded representation construction unit is specifically:
used for constructing an entity co-occurrence matrix from the entity set;
used for calculating the initial embedded representation of each entity node with the GloVe algorithm based on the entity co-occurrence matrix;
used for obtaining the visit node initial embedded representation by averaging the entity node initial embedded representations of all entity nodes adjacent to the visit node, the visit node initial embedded representations and the entity node initial embedded representations together forming the node initial embedded representation.
7. The chronic kidney disease subtype mining system based on self-supervision graph clustering as claimed in claim 4, wherein the clustering network construction unit is specifically:
used for constructing an adjacency matrix from the relationships among the nodes in the visit network, and inputting the adjacency matrix and the node initial embedded representation into the visit node clustering network model based on self-supervision graph clustering for graph attention training to obtain a node embedded representation, the node embedded representation comprising a visit node embedded representation and an entity node embedded representation;
used for reconstructing the visit network from the node embedded representation and calculating the visit network reconstruction error;
used for inputting the entity node embedded representation into the decoder of the neural network for training, taking the output of the last decoder layer as the entity node reconstructed embedded representation, and calculating the entity node reconstruction error;
used for performing a softmax regression operation on the visit node embedded representation to obtain the visit node probability distribution, and calculating the clustering loss from the visit node probability distribution;
and used for constructing the total loss function of the visit node clustering network model based on self-supervision graph clustering from the visit network reconstruction error, the entity node reconstruction error and the clustering loss.
8. The chronic kidney disease subtype mining system based on self-supervision graph clustering as claimed in claim 4, wherein the chronic kidney disease subtype mining model construction unit is specifically:
used for taking the visit node cluster distribution obtained from the visit node clustering network model based on self-supervision graph clustering as the category distribution of the visit nodes, selecting the category with the highest probability in the category distribution as the category label of the visit node, and arranging all visit nodes of each patient in chronological order;
used for determining whether consecutive visit nodes with the same category label are merged or kept separately by calculating the cosine similarity between their category distributions, and constructing an event matrix by arranging the resulting visit nodes;
used for searching the event matrix for frequent events in order to determine nodes, wherein the frequent events are taken as nodes of the event flow and the remaining events go directly to an end node; each frequent event is taken as the starting node of the next search, the event vectors corresponding to it are extracted and combined into a new event matrix, the first column is removed, and the same frequent event search is performed; the nodes found in each search are connected to their starting node, thereby extending the event flow, until no frequent event is found or the event flow length reaches the maximum event flow length; when the loop ends, the chronic kidney disease subtype mining model is obtained.
9. The chronic kidney disease subtype mining system based on self-supervision graph clustering as claimed in claim 1, wherein the chronic kidney disease subtype prediction module is specifically:
used for inputting the preprocessed structured data of the patient into the visit node clustering network model based on self-supervision graph clustering for prediction, to obtain the probability distribution of the patient's visit nodes;
used for determining the cluster category of each visit node according to the probability distribution of the visit nodes, and constructing a visit event sequence;
and used for inputting the visit event sequence into the chronic kidney disease subtype mining model, fitting the sequence in order to the nodes of the chronic kidney disease subtype mining model to obtain an event flow, and determining from the event flow which chronic kidney disease subtype the patient belongs to.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210980822.5A CN115083616B (en) | 2022-08-16 | 2022-08-16 | Chronic nephropathy subtype mining system based on self-supervision graph clustering |
JP2023092731A JP7404581B1 (en) | 2022-08-16 | 2023-06-05 | Chronic nephropathy subtype mining system based on self-supervised graph clustering |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210980822.5A CN115083616B (en) | 2022-08-16 | 2022-08-16 | Chronic nephropathy subtype mining system based on self-supervision graph clustering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115083616A true CN115083616A (en) | 2022-09-20 |
CN115083616B CN115083616B (en) | 2022-11-08 |
Family
ID=83244725
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210980822.5A Active CN115083616B (en) | 2022-08-16 | 2022-08-16 | Chronic nephropathy subtype mining system based on self-supervision graph clustering |
Country Status (2)
Country | Link |
---|---|
JP (1) | JP7404581B1 (en) |
CN (1) | CN115083616B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116364299A (en) * | 2023-03-30 | 2023-06-30 | 之江实验室 | Disease diagnosis and treatment path clustering method and system based on heterogeneous information network |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108231201A (en) * | 2018-01-25 | 2018-06-29 | 华中科技大学 | A kind of construction method, system and the application of disease data analyzing and processing model |
CN108417271A (en) * | 2018-01-11 | 2018-08-17 | 复旦大学 | Mental inhibitor object based on phrenoblabia Subtypes recommends method and system |
CN109830303A (en) * | 2019-02-01 | 2019-05-31 | 上海众恒信息产业股份有限公司 | Clinical data mining analysis and aid decision-making method based on internet integration medical platform |
WO2021096932A1 (en) * | 2019-11-13 | 2021-05-20 | Memorial Sloan Kettering Cancer Center | Classifier models to predict tissue of origin from targeted tumor dna sequencing |
CN112992370A (en) * | 2021-05-06 | 2021-06-18 | 四川大学华西医院 | Unsupervised electronic medical record-based medical behavior compliance assessment method |
CN113161001A (en) * | 2021-05-12 | 2021-07-23 | 东北大学 | Process path mining method based on improved LDA |
CN114049966A (en) * | 2022-01-12 | 2022-02-15 | 中国科学院计算机网络信息中心 | Food-borne disease outbreak identification method and system based on link prediction |
CN114093445A (en) * | 2021-11-18 | 2022-02-25 | 重庆邮电大学 | Patient screening and marking method based on multi-label learning |
CN114242194A (en) * | 2021-12-07 | 2022-03-25 | 深圳市云影医疗科技有限公司 | Natural language processing device and method for medical image diagnosis report based on artificial intelligence |
CN114639483A (en) * | 2022-03-23 | 2022-06-17 | 浙江大学 | Electronic medical record retrieval method and device based on graph neural network |
CN114664463A (en) * | 2022-03-18 | 2022-06-24 | 中南大学湘雅医院 | General practitioner diagnoses auxiliary system |
CN114864107A (en) * | 2021-02-03 | 2022-08-05 | 阿里巴巴集团控股有限公司 | Clinical pathway variation analysis method, equipment and storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109920547A (en) | 2019-03-05 | 2019-06-21 | 北京工业大学 | A kind of diabetes prediction model construction method based on electronic health record data mining |
-
2022
- 2022-08-16 CN CN202210980822.5A patent/CN115083616B/en active Active
-
2023
- 2023-06-05 JP JP2023092731A patent/JP7404581B1/en active Active
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108417271A (en) * | 2018-01-11 | 2018-08-17 | 复旦大学 | Mental inhibitor object based on phrenoblabia Subtypes recommends method and system |
CN108231201A (en) * | 2018-01-25 | 2018-06-29 | 华中科技大学 | A kind of construction method, system and the application of disease data analyzing and processing model |
CN109830303A (en) * | 2019-02-01 | 2019-05-31 | 上海众恒信息产业股份有限公司 | Clinical data mining analysis and aid decision-making method based on internet integration medical platform |
WO2021096932A1 (en) * | 2019-11-13 | 2021-05-20 | Memorial Sloan Kettering Cancer Center | Classifier models to predict tissue of origin from targeted tumor dna sequencing |
CN114864107A (en) * | 2021-02-03 | 2022-08-05 | 阿里巴巴集团控股有限公司 | Clinical pathway variation analysis method, equipment and storage medium |
CN112992370A (en) * | 2021-05-06 | 2021-06-18 | 四川大学华西医院 | Unsupervised electronic medical record-based medical behavior compliance assessment method |
CN113161001A (en) * | 2021-05-12 | 2021-07-23 | 东北大学 | Process path mining method based on improved LDA |
CN114093445A (en) * | 2021-11-18 | 2022-02-25 | 重庆邮电大学 | Patient screening and marking method based on multi-label learning |
CN114242194A (en) * | 2021-12-07 | 2022-03-25 | 深圳市云影医疗科技有限公司 | Natural language processing device and method for medical image diagnosis report based on artificial intelligence |
CN114049966A (en) * | 2022-01-12 | 2022-02-15 | 中国科学院计算机网络信息中心 | Food-borne disease outbreak identification method and system based on link prediction |
CN114664463A (en) * | 2022-03-18 | 2022-06-24 | 中南大学湘雅医院 | General practitioner diagnosis assistance system |
CN114639483A (en) * | 2022-03-23 | 2022-06-17 | 浙江大学 | Electronic medical record retrieval method and device based on graph neural network |
Non-Patent Citations (1)
Title |
---|
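Gong Xue, Cui Lei (宫雪, 崔雷): "Research on link prediction based on medical subject heading co-occurrence networks", Journal of Intelligence (《情报杂志》) * |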
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116364299A (en) * | 2023-03-30 | 2023-06-30 | 之江实验室 | Disease diagnosis and treatment path clustering method and system based on heterogeneous information network |
CN116364299B (en) * | 2023-03-30 | 2024-02-13 | 之江实验室 | Disease diagnosis and treatment path clustering method and system based on heterogeneous information network |
Also Published As
Publication number | Publication date |
---|---|
JP7404581B1 (en) | 2023-12-25 |
CN115083616B (en) | 2022-11-08 |
JP2024027086A (en) | 2024-02-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Esfahani et al. | | Cardiovascular disease detection using a new ensemble classifier |
CN111261282A (en) | | Sepsis early prediction method based on machine learning |
Mattila et al. | | A disease state fingerprint for evaluation of Alzheimer's disease |
CN108648827A (en) | | Cardiovascular and cerebrovascular disease Risk Forecast Method and device |
CN112201330B (en) | | Medical quality monitoring and evaluating method combining DRGs tool and Bayesian model |
CN108742513A (en) | | Patients with cerebral apoplexy rehabilitation prediction technique and system |
CN111081379A (en) | | Disease probability decision method and system |
CN111081381A (en) | | Intelligent screening method for critical indexes of prediction of nosocomial fatal gastrointestinal rebleeding |
Mounika et al. | | Prediction of type-2 diabetes using machine learning algorithms |
CN115083616B (en) | | Chronic nephropathy subtype mining system based on self-supervision graph clustering |
CN113593708A (en) | | Sepsis prognosis prediction method based on integrated learning algorithm |
CN114023441A (en) | | Severe AKI early risk assessment model and device based on interpretable machine learning model and development method thereof |
Razavi et al. | | Predicting metastasis in breast cancer: comparing a decision tree with domain experts |
Samet et al. | | Predicting and staging chronic kidney disease using optimized random forest algorithm |
CN113128654A (en) | | Improved random forest model for coronary heart disease pre-diagnosis and pre-diagnosis system thereof |
Thelagathoti et al. | | A population analysis approach using mobility data and correlation networks for depression episodes detection |
Kalogiannis et al. | | Geriatric group analysis by clustering non-linearly embedded multi-sensor data |
CN116469570A (en) | | Malignant tumor complication analysis method based on electronic medical record |
Conforti et al. | | Kernel-based support vector machine classifiers for early detection of myocardial infarction |
Thelagathoti et al. | | A data-driven approach for the analysis of behavioral disorders with a focus on classification and severity estimation |
Tolentino et al. | | CAREdio: Health screening and heart disease prediction system for rural communities in the Philippines |
Bose et al. | | Female Diabetic Prediction in India Using Different Learning Algorithms |
Ndirangu et al. | | Support vector machine based disease diagnostic assistant |
CN111028953B (en) | | Control method for prompting marking of medical data |
AU2021102832A4 (en) | | System & method for automatic health prediction using fuzzy based machine learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||